Author: Trevor Hastie
Publisher: Springer Science & Business Media
Release Date: 2013-11-11
During the past decade there has been an explosion in computation and information technology. With it have come vast amounts of data in a variety of fields such as medicine, biology, finance, and marketing. The challenge of understanding these data has led to the development of new tools in the field of statistics, and spawned new areas such as data mining, machine learning, and bioinformatics. Many of these tools have common underpinnings but are often expressed with different terminology. This book describes the important ideas in these areas in a common conceptual framework. While the approach is statistical, the emphasis is on concepts rather than mathematics. Many examples are given, with a liberal use of color graphics. It should be a valuable resource for statisticians and anyone interested in data mining in science or industry. The book’s coverage is broad, from supervised learning (prediction) to unsupervised learning. The many topics include neural networks, support vector machines, classification trees and boosting, the last of which receives its first comprehensive treatment in any book. This major new edition features many topics not covered in the original, including graphical models, random forests, ensemble methods, least angle regression and path algorithms for the lasso, non-negative matrix factorization, and spectral clustering. There is also a chapter on methods for “wide” data (p bigger than n), including multiple testing and false discovery rates.
Trevor Hastie, Robert Tibshirani, and Jerome Friedman are professors of statistics at Stanford University. They are prominent researchers in this area: Hastie and Tibshirani developed generalized additive models and wrote a popular book of that title. Hastie co-developed much of the statistical modeling software and environment in R/S-PLUS and invented principal curves and surfaces. Tibshirani proposed the lasso and is co-author of the very successful An Introduction to the Bootstrap. Friedman is the co-inventor of many data-mining tools including CART, MARS, projection pursuit and gradient boosting.
Statistical Rethinking: A Bayesian Course with Examples in R and Stan builds readers’ knowledge of and confidence in statistical modeling. Reflecting the need for even minor programming in today’s model-based statistics, the book pushes readers to perform step-by-step calculations that are usually automated. This unique computational approach ensures that readers understand enough of the details to make reasonable choices and interpretations in their own modeling work. The text presents generalized linear multilevel models from a Bayesian perspective, relying on a simple logical interpretation of Bayesian probability and maximum entropy. Its coverage ranges from the basics of regression to multilevel models. The author also discusses measurement error, missing data, and Gaussian process models for spatial and network autocorrelation. By using complete R code examples throughout, this book provides a practical foundation for performing statistical inference. Designed for both PhD students and seasoned professionals in the natural and social sciences, it prepares them for more advanced or specialized statistical modeling. Web Resource: The book is accompanied by an R package (rethinking) that is available on the author’s website and GitHub. The two core functions (map and map2stan) of this package allow a variety of statistical models to be constructed from standard model formulas.
Author: Gerry P. Quinn
Publisher: Cambridge University Press
Release Date: 2002-03-21
An essential textbook for any student or researcher in biology needing to design experiments or sampling programs, or to analyse the resulting data. The text begins with a revision of estimation and hypothesis testing methods, covering both classical and Bayesian philosophies, before advancing to the analysis of linear and generalized linear models. Topics covered include linear and logistic regression, simple and complex ANOVA models (for factorial, nested, block, split-plot and repeated measures and covariance designs), and log-linear models. Multivariate techniques, including classification and ordination, are then introduced. Special emphasis is placed on checking assumptions, exploratory data analysis and presentation of results. The main analyses are illustrated with many examples from published papers and there is an extensive reference list to both the statistical and biological literature. The book is supported by a website that provides all data sets, questions for each chapter and links to software.
Author: Jeffrey D. Long
Release Date: 2011-10-31
Genre: Social Science
This book is unique in its focus on showing students in the behavioral sciences how to analyze longitudinal data using R software. The book focuses on application, making it practical and accessible to students in psychology, education, and related fields, who have a basic foundation in statistics. It provides explicit instructions in R computer programming throughout the book, showing students exactly how a specific analysis is carried out and how output is interpreted.
A powerful tool for analyzing nested designs in a variety of fields, multilevel/hierarchical modeling allows researchers to account for data collected at multiple levels. Multilevel Modeling Using R provides you with a helpful guide to conducting multilevel data modeling using the R software environment. After reviewing standard linear models, the authors present the basics of multilevel models and explain how to fit these models using R. They then show how to employ multilevel modeling with longitudinal data and demonstrate the valuable graphical options in R. The book also describes models for categorical dependent variables in both single level and multilevel data. The book concludes with Bayesian fitting of multilevel models. For those new to R, the appendix provides an introduction to this system that covers basic R knowledge necessary to run the models in the book. Through the R code and detailed explanations provided, this book gives you the tools to launch your own investigations in multilevel modeling and gain insight into your research.
Author: Xing Liu
Publisher: SAGE Publications
Release Date: 2015-09-30
Genre: Social Science
The first book to provide a unified framework for both single-level and multilevel modeling of ordinal categorical data, Applied Ordinal Logistic Regression Using Stata by Xing Liu helps readers learn how to conduct analyses, interpret the results from Stata output, and present those results in scholarly writing. Using step-by-step instructions, this non-technical, applied book leads students, applied researchers, and practitioners to a deeper understanding of statistical concepts by closely connecting the underlying theories of models with the application of real-world data using statistical software.
Author: Jan de Leeuw
Publisher: Springer Science & Business Media
Release Date: 2007-12-26
This book presents the state of the art in multilevel analysis, with an emphasis on more advanced topics. These topics are discussed conceptually, analyzed mathematically, and illustrated by empirical examples. Multilevel analysis is the statistical analysis of hierarchically and non-hierarchically nested data. The simplest example is clustered data, such as a sample of students clustered within schools. Multilevel data are especially prevalent in the social and behavioral sciences and in the biomedical sciences. The chapter authors are all leading experts in the field. Given the omnipresence of multilevel data in the social, behavioral, and biomedical sciences, this book is essential for empirical researchers in these fields.
The contributions in this volume, made by distinguished statisticians in several frontier areas of research in multivariate analysis, cover a broad field and indicate future directions of research. The topics covered include discriminant analysis, multidimensional scaling, categorical data analysis, correspondence analysis and biplots, association analysis, latent variable models, bootstrap distributions, differential geometry applications and others. Most of the papers propose generalizations or new applications of multivariate analysis. This volume will be of great interest to statisticians, probabilists, data analysts and scientists working in disciplines such as biology, biometry, ecology, medicine, econometrics, psychometrics and marketing. It will be a valuable guide to professors, researchers and graduate students seeking new and promising lines of statistical research.
Author: Steven G. Heeringa
Publisher: CRC Press
Release Date: 2010-04-05
Taking a practical approach that draws on the authors’ extensive teaching, consulting, and research experiences, Applied Survey Data Analysis provides an intermediate-level statistical overview of the analysis of complex sample survey data. It emphasizes methods and worked examples using available software procedures while reinforcing the principles and theory that underlie those methods. After introducing a step-by-step process for approaching a survey analysis problem, the book presents the fundamental features of complex sample designs and shows how to integrate design characteristics into the statistical methods and software for survey estimation and inference. The authors then focus on the methods and models used in analyzing continuous, categorical, and count dependent variables; event history; and missing data problems. The techniques discussed include univariate descriptive and simple bivariate analyses, the linear regression model, generalized linear regression modeling methods, the Cox proportional hazards model, discrete time models, and the multiple imputation analysis method. The final chapter covers new developments in survey applications of advanced statistical techniques, including model-based analysis approaches. Designed for readers working in a wide array of disciplines who use survey data in their work, this book also provides a useful framework for integrating more in-depth studies of the theory and methods of survey data analysis. A guide to the applied statistical analysis and interpretation of survey data, it contains many examples and practical exercises based on major real-world survey data sets. Although the authors use Stata for most examples in the text, they offer SAS, SPSS, SUDAAN, R, WesVar, IVEware, and Mplus software code for replicating the examples on the book’s website: http://www.isr.umich.edu/src/smp/asda/
Data Mining: Concepts and Techniques presents the concepts and techniques for processing gathered data or information for use in various applications. Specifically, it explains data mining and the tools used in discovering knowledge from the collected data, a process also referred to as knowledge discovery from data (KDD). It focuses on the feasibility, usefulness, effectiveness, and scalability of techniques for large data sets. After describing data mining, this edition explains the methods of knowing, preprocessing, processing, and warehousing data. It then presents information about data warehouses, online analytical processing (OLAP), and data cube technology. Then, the methods involved in mining frequent patterns, associations, and correlations for large data sets are described. The book details the methods for data classification and introduces the concepts and methods for data clustering. The remaining chapters discuss outlier detection and the trends, applications, and research frontiers in data mining. This book is intended for Computer Science students, application developers, business professionals, and researchers who seek information on data mining. It presents dozens of algorithms and implementation examples, all in pseudo-code and suitable for use in real-world, large-scale data mining projects. It addresses advanced topics such as mining object-relational databases, spatial databases, multimedia databases, time-series databases, text databases, the World Wide Web, and applications in several fields. It provides a comprehensive, practical look at the concepts and techniques you need to get the most out of your data.
Author: Chris Chapman
Release Date: 2015-03-09
Genre: Business & Economics
This book is a complete introduction to the power of R for marketing research practitioners. The text describes statistical models from a conceptual point of view with a minimal amount of mathematics, presuming only an introductory knowledge of statistics. Hands-on chapters accelerate the learning curve by asking readers to interact with R from the beginning. Core topics include the R language, basic statistics, linear modeling, and data visualization, which is presented throughout as an integral part of analysis. Later chapters cover more advanced topics yet are intended to be approachable for all analysts. These sections examine logistic regression, customer segmentation, hierarchical linear modeling, market basket analysis, structural equation modeling, and conjoint analysis in R. The text uniquely presents Bayesian models with a minimally complex approach, demonstrating and explaining Bayesian methods alongside traditional analyses for analysis of variance, linear models, and metric and choice-based conjoint analysis. With its emphasis on data visualization, model assessment, and development of statistical intuition, this book provides guidance for any analyst looking to develop or improve skills in R for marketing applications.
Author: Jason W. Osborne
Publisher: SAGE Publications
Release Date: 2016-03-24
In a conversational tone, Regression & Linear Modeling provides conceptual, user-friendly coverage of the generalized linear model (GLM). Readers will become familiar with applications of ordinary least squares (OLS) regression, binary and multinomial logistic regression, ordinal regression, Poisson regression, and loglinear models. Author Jason W. Osborne returns to certain themes throughout the text, such as testing assumptions, examining data quality, and, where appropriate, nonlinear and non-additive effects modeled within different types of linear models.