The Elements of Statistical Learning

Author: Trevor Hastie
Publisher: Springer Science & Business Media
ISBN: 9780387216065
Release Date: 2013-11-11
Genre: Mathematics

During the past decade there has been an explosion in computation and information technology. With it have come vast amounts of data in a variety of fields such as medicine, biology, finance, and marketing. The challenge of understanding these data has led to the development of new tools in the field of statistics, and spawned new areas such as data mining, machine learning, and bioinformatics. Many of these tools have common underpinnings but are often expressed with different terminology. This book describes the important ideas in these areas in a common conceptual framework. While the approach is statistical, the emphasis is on concepts rather than mathematics. Many examples are given, with a liberal use of color graphics. It should be a valuable resource for statisticians and anyone interested in data mining in science or industry. The book’s coverage is broad, from supervised learning (prediction) to unsupervised learning. The many topics include neural networks, support vector machines, classification trees and boosting, the first comprehensive treatment of this topic in any book. This major new edition features many topics not covered in the original, including graphical models, random forests, ensemble methods, least angle regression and path algorithms for the lasso, non-negative matrix factorization, and spectral clustering. There is also a chapter on methods for “wide” data (p bigger than n), including multiple testing and false discovery rates. Trevor Hastie, Robert Tibshirani, and Jerome Friedman are professors of statistics at Stanford University. They are prominent researchers in this area: Hastie and Tibshirani developed generalized additive models and wrote a popular book of that title. Hastie co-developed much of the statistical modeling software and environment in R/S-PLUS and invented principal curves and surfaces. Tibshirani proposed the lasso and is co-author of the very successful An Introduction to the Bootstrap. Friedman is the co-inventor of many data-mining tools including CART, MARS, projection pursuit and gradient boosting.

Statistical Inference for Ergodic Diffusion Processes

Author: Yury A. Kutoyants
Publisher: Springer Science & Business Media
ISBN: 1852337591
Release Date: 2004
Genre: Mathematics

"An elementary introduction to the field at the start of the book introduces a class of examples, both non-standard and classical, that reappear constantly throughout the book to illustrate the merits and demerits of the procedures as the investigation progresses. The statements of the problems are in the spirit of classical mathematical statistics, and special attention is paid to asymptotically efficient procedures." (Jacket)

Ensemble Machine Learning

Author: Cha Zhang
Publisher: Springer Science & Business Media
ISBN: 9781441993250
Release Date: 2012-02-17
Genre: Computers

It is common wisdom that gathering a variety of views and inputs improves the process of decision making, and, indeed, underpins a democratic society. Dubbed “ensemble learning” by researchers in computational intelligence and machine learning, it is known to improve a decision system’s robustness and accuracy. Now, fresh developments are allowing researchers to unleash the power of ensemble learning in an increasing range of real-world applications. Ensemble learning algorithms such as “boosting” and “random forest” facilitate solutions to key computational issues such as face recognition and are now being applied in areas as diverse as object tracking and bioinformatics. Responding to a shortage of literature dedicated to the topic, this volume offers comprehensive coverage of state-of-the-art ensemble learning techniques, including the random forest skeleton tracking algorithm in the Xbox Kinect sensor, which bypasses the need for game controllers. At once a solid theoretical study and a practical guide, the volume is a windfall for researchers and practitioners alike.

Amstat News

ISBN: UOM:39015055740198
Release Date: 2002
Genre: Statistics

Data Analysis and Data Mining

Author: Adelchi Azzalini
Publisher: Oxford University Press
ISBN: 9780199942718
Release Date: 2012-04-23
Genre: Business & Economics

An introduction to statistical data mining, Data Analysis and Data Mining is both textbook and professional resource. Assuming only a basic knowledge of statistical reasoning, it presents core concepts in data mining and exploratory statistical models to students and professional statisticians (both those working in communications and those working in a technological or scientific capacity) who have a limited knowledge of data mining. This book presents key statistical concepts by way of case studies, giving readers the benefit of learning from real problems and real data. Aided by a diverse range of statistical methods and techniques, readers will move from simple problems to complex problems. Through these case studies, authors Adelchi Azzalini and Bruno Scarpa explain exactly how statistical methods work; rather than relying on the "push the button" philosophy, they demonstrate how to use statistical tools to find the best solution to any given problem. Case studies feature current topics highly relevant to data mining, such as web page traffic; the segmentation of customers; selection of customers for direct mail commercial campaigns; fraud detection; and measurements of customer satisfaction. Appropriate for both advanced undergraduate and graduate students, this much-needed book will fill a gap between higher level books, which emphasize technical explanations, and lower level books, which assume no prior knowledge and do not explain the methodology behind the statistical operations.

Data Analysis Machine Learning and Knowledge Discovery

Author: Myra Spiliopoulou
Publisher: Springer Science & Business Media
ISBN: 9783319015958
Release Date: 2013-11-26
Genre: Computers

Data analysis, machine learning and knowledge discovery are research areas at the intersection of computer science, artificial intelligence, mathematics and statistics. They cover general methods and techniques that can be applied to a vast set of applications such as web and text mining, marketing, medicine, bioinformatics and business intelligence. This volume contains the revised versions of selected papers in the field of data analysis, machine learning and knowledge discovery presented during the 36th annual conference of the German Classification Society (GfKl). The conference was held at the University of Hildesheim (Germany) in August 2012.

Introduction to Empirical Processes and Semiparametric Inference

Author: Michael R. Kosorok
ISBN: UCSD:31822034686212
Release Date: 2008-01
Genre: Mathematics

This book provides a self-contained, linear, and unified introduction to empirical processes and semiparametric inference. These powerful research techniques are surprisingly useful for developing methods of statistical inference for complex models and in understanding the properties of such methods. The targeted audience includes statisticians, biostatisticians, and other researchers with a background in mathematical statistics who have an interest in learning about and doing research in empirical processes and semiparametric inference but who would like to have a friendly and gradual introduction to the area. The book can be used either as a research reference or as a textbook. The level of the book is suitable for a second year graduate course in statistics or biostatistics, provided the students have had a year of graduate level mathematical statistics and a semester of probability. The book consists of three parts. The first part is a concise overview of all of the main concepts covered in the book with a minimum of technicalities. The second and third parts cover the two respective main topics of empirical processes and semiparametric inference in depth. The connections between these two topics are also demonstrated and emphasized throughout the text. Each part has a final chapter with several case studies that use concrete examples to illustrate the concepts developed so far. The last two parts also each include a chapter which covers the needed mathematical preliminaries. Each main idea is introduced with a non-technical motivation, and examples are given throughout to illustrate important concepts. Homework problems are also included at the end of each chapter to help the reader gain additional insights.

Bias and Causation

Author: Dr. Herbert I. Weisberg
Publisher: John Wiley & Sons
ISBN: 1118058208
Release Date: 2011-01-06
Genre: Mathematics

A one-of-a-kind resource on identifying and dealing with bias in statistical research on causal effects. Do cell phones cause cancer? Can a new curriculum increase student achievement? Determining what the real causes of such problems are, and how powerful their effects may be, are central issues in research across various fields of study. Some researchers are highly skeptical of drawing causal conclusions except in tightly controlled randomized experiments, while others discount the threats posed by different sources of bias, even in less rigorous observational studies. Bias and Causation presents a complete treatment of the subject, organizing and clarifying the diverse types of biases into a conceptual framework. The book treats various sources of bias in comparative studies, both randomized and observational, and offers guidance on how they should be addressed by researchers. Utilizing a relatively simple mathematical approach, the author develops a theory of bias that outlines the essential nature of the problem and identifies the various sources of bias that are encountered in modern research. The book begins with an introduction to the study of causal inference and the related concepts and terminology. Next, an overview is provided of the methodological issues at the core of the difficulties posed by bias. Subsequent chapters explain the concepts of selection bias, confounding, intermediate causal factors, and information bias along with the distortion of a causal effect that can result when the exposure and/or the outcome is measured with error. The book concludes with a new classification of twenty general sources of bias and practical advice on how mathematical modeling and expert judgment can be combined to achieve the most credible causal conclusions. Throughout the book, examples from the fields of medicine, public policy, and education are incorporated into the presentation of various topics. In addition, six detailed case studies illustrate concrete examples of the significance of biases in everyday research. Requiring only a basic understanding of statistics and probability theory, Bias and Causation is an excellent supplement for courses on research methods and applied statistics at the upper-undergraduate and graduate level. It is also a valuable reference for practicing researchers and methodologists in various fields of study who work with statistical data. This book was selected as the 2011 Ziegel Prize Winner in Technometrics for the best book reviewed by the journal. It is also the winner of the 2010 PROSE Award for Mathematics from The American Publishers Awards for Professional and Scholarly Excellence.

Smart Engineering System Design

Author: Cihan H. Dagli
Publisher: American Society of Mechanical Engineers
ISBN: CORNELL:31924093877698
Release Date: 2002
Genre: Computers

Proceedings of the Artificial Neural Networks in Engineering Conference, November 2002, St. Louis, Missouri. This annual conference publication presents refereed papers covering the following categories and their applications in the engineering domain: Neural Networks, Complex Systems, Evolutionary Programming, Data Mining, Fuzzy Logic, Adaptive Control, Pattern Recognition and Smart Engineering System Design. These papers are intended to provide a forum for researchers in the field to exchange ideas on smart engineering system design.

Subjective and Objective Bayesian Statistics

Author: S. James Press
ISBN: STANFORD:36105111981200
Release Date: 2003
Genre: Business & Economics

* Shorter, more concise chapters provide flexible coverage of the subject.
* Expanded coverage includes: uncertainty and randomness, prior distributions, predictivism, estimation, analysis of variance, and classification and imaging.
* Includes topics not covered in other books, such as the de Finetti Transform.
* Author S. James Press is the modern guru of Bayesian statistics.

IEEE International Conference on Computer Vision

Author: IEEE Computer Society. Technical Committee on Pattern Analysis and Machine Intelligence
Publisher: IEEE
ISBN: 076952334X
Release Date: 2005-01
Genre: Computer vision

The ICCV 2005 proceedings explore subjects such as active and real-time vision, PDEs in vision, color, illumination and texture, vision-based graphics, early vision and image representation, image databases, indexing and retrieval, and learning in vision. They also cover model acquisition and validation, depth recovery and analysis, tracking and surveillance, object, event, and scene recognition, and segmentation and grouping.

Practical Text Mining with Perl

Author: Roger Bilisoly
Publisher: Wiley
ISBN: 0470176431
Release Date: 2008-08-18
Genre: Computers

Provides readers with the methods, algorithms, and means to perform text mining tasks. This book is devoted to the fundamentals of text mining using Perl, an open-source programming tool that is freely available via the Internet. It covers mining ideas from several perspectives (statistics, data mining, linguistics, and information retrieval) and provides readers with the means to successfully complete text mining tasks on their own. The book begins with an introduction to regular expressions, a text pattern methodology, and quantitative text summaries, all of which are fundamental tools for analyzing text. Then, it builds upon this foundation to explore:
* Probability and texts, including the bag-of-words model
* Information retrieval techniques such as the TF-IDF similarity measure
* Concordance lines and corpus linguistics
* Multivariate techniques such as correlation, principal components analysis, and clustering
* Perl modules, German, and permutation tests
Each chapter is devoted to a single key topic, and the author carefully and thoughtfully introduces mathematical concepts as they arise, allowing readers to learn as they go without having to refer to additional books. The inclusion of numerous exercises and worked-out examples further complements the book's student-friendly format. Practical Text Mining with Perl is ideal as a textbook for undergraduate and graduate courses in text mining and as a reference for a variety of professionals who are interested in extracting information from text documents.

Encyclopedia of Statistical Sciences

Author: Samuel Kotz
ISBN: 0471743747
Release Date: 2006
Genre: Mathematical statistics

Entries cover statistical theory, methods, and applications. Includes the latest topics and advances made in statistical science over the past decade, in areas such as computer-intensive statistical methodology, genetics, medicine, the environment, and other applications.