As with the bestselling first edition, Computational Statistics Handbook with MATLAB®, Second Edition covers some of the most commonly used contemporary techniques in computational statistics. With a strong, practical focus on implementing the methods, the authors include algorithmic descriptions of the procedures as well as examples that illustrate the use of the algorithms in data analysis. Updated for MATLAB® R2007a and the Statistics Toolbox, Version 6.0, this edition incorporates many additional computational statistics topics.
New to the Second Edition
• New functions for multivariate normal and multivariate t distributions
• Updated information on the new MATLAB functionality for univariate and bivariate histograms, glyphs, and parallel coordinate plots
• New content on independent component analysis, nonlinear dimensionality reduction, and multidimensional scaling
• New topics on linear classifiers, quadratic classifiers, and voting methods, such as bagging, boosting, and random forests
• More methods for unsupervised learning, including model-based clustering and techniques for assessing the results of clustering
• A new chapter on parametric models that covers spline regression models, logistic regression, and generalized linear models
• Expanded information on smoothers, such as bin smoothing, running mean and line smoothers, and smoothing splines
With numerous problems and suggestions for further reading, this accessible text facilitates an understanding of computational statistics concepts and how they are employed in data analysis.
Praise for the Second Edition:
"The authors present an intuitive and easy-to-read book. ... accompanied by many examples, proposed exercises, good references, and comprehensive appendices that initiate the reader unfamiliar with MATLAB." —Adolfo Alvarez Pinto, International Statistical Review
"Practitioners of EDA who use MATLAB will want a copy of this book. ... The authors have done a great service by bringing together so many EDA routines, but their main accomplishment in this dynamic text is providing the understanding and tools to do EDA." —David A Huckaby, MAA Reviews
Exploratory Data Analysis (EDA) is an important part of the data analysis process. The methods presented in this text are ones that should be in the toolkit of every data scientist. As computational sophistication has increased and data sets have grown in size and complexity, EDA has become an even more important process for visualizing and summarizing data before making assumptions to generate hypotheses and models. Exploratory Data Analysis with MATLAB, Third Edition presents EDA methods from a computational perspective and uses numerous examples and applications to show how the methods are used in practice. The authors use MATLAB code, pseudo-code, and algorithm descriptions to illustrate the concepts. The MATLAB code for examples, data sets, and the EDA Toolbox are available for download on the book’s website.
New to the Third Edition
• Random projections and estimating local intrinsic dimensionality
• Deep learning autoencoders and stochastic neighbor embedding
• Minimum spanning tree and additional cluster validity indices
• Kernel density estimation
• Plots for visualizing data distributions, such as beanplots and violin plots
• A chapter on visualizing categorical data
Textual Statistics with R comprehensively covers the main multidimensional methods in textual statistics, supported by a specially written package in R. Methods discussed include correspondence analysis, clustering, and multiple factor analysis for contingency tables. Each method is illuminated by applications. The book is aimed at researchers and students in statistics, social sciences, history, literature, and linguistics. It will be of interest to anyone from practitioners needing to extract information from texts to students in the field of massive data, where the ability to process textual data is becoming essential.
A new and refreshingly different approach to presenting the foundations of statistical algorithms, Foundations of Statistical Algorithms: With References to R Packages reviews the historical development of basic algorithms to illuminate the evolution of today’s more powerful statistical algorithms. It emphasizes recurring themes in all statistical algorithms, including computation, assessment and verification, iteration, intuition, randomness, repetition and parallelization, and scalability. Unique in scope, the book reviews the upcoming challenge of scaling many of the established techniques to very large data sets and delves into systematic verification by demonstrating how to derive general classes of worst case inputs and emphasizing the importance of testing over a large number of different inputs. Broadly accessible, the book offers examples, exercises, and selected solutions in each chapter as well as access to a supplementary website. After working through the material covered in the book, readers should not only understand current algorithms but also gain a deeper understanding of how algorithms are constructed, how to evaluate new algorithms, which recurring principles are used to tackle some of the tough problems statistical programmers face, and how to take an idea for a new method and turn it into something practically useful.
Lucidly Integrates Current Activities
Focusing on both fundamentals and recent advances, Introduction to Machine Learning and Bioinformatics presents an informative and accessible account of the ways in which these two increasingly intertwined areas relate to each other.
Examines Connections between Machine Learning & Bioinformatics
The book begins with a brief historical overview of the technological developments in biology. It then describes the main problems in bioinformatics and the fundamental concepts and algorithms of machine learning. After forming this foundation, the authors explore how machine learning techniques apply to bioinformatics problems, such as electron density map interpretation, biclustering, DNA sequence analysis, and tumor classification. They also include exercises at the end of some chapters and offer supplementary materials on their website.
Explores How Machine Learning Techniques Can Help Solve Bioinformatics Problems
Shedding light on aspects of both machine learning and bioinformatics, this text shows how the innovative tools and techniques of machine learning help extract knowledge from the deluge of information produced by today’s biological experiments.
Bridging the gap between introductory theory and practical knowledge, this second edition reflects the fast-moving field of DNA microarrays by adding new and updated chapters that cover cutting-edge microarray topics. This edition now offers the option of learning elements of MATLAB® in parallel with data analysis. The author also includes Bioconductor tools that are linked to the theoretical concepts discussed in the text. This edition also features more opportunities for readers to practice everything that they have learned from the book. The accompanying CD-ROM provides MATLAB code and tips on how to use the MATLAB Bioinformatics toolbox.
Highlighting modern computational methods, Applied Stochastic Modelling, Second Edition provides students with the practical experience of scientific computing in applied statistics through a range of interesting real-world applications. It also successfully revises standard probability and statistical theory. Along with an updated bibliography and improved figures, this edition offers numerous updates throughout.
New to the Second Edition
• An extended discussion on Bayesian methods
• A large number of new exercises
• A new appendix on computational methods
The book covers both contemporary and classical aspects of statistics, including survival analysis, kernel density estimation, Markov chain Monte Carlo, hypothesis testing, regression, bootstrap, and generalised linear models. Although the book can be used without reference to computational programs, the author provides the option of using powerful computational tools for stochastic modelling. All of the data sets and MATLAB® and R programs found in the text, as well as lecture slides and other ancillary material, are available for download at www.crcpress.com.
Continuing in the bestselling tradition of its predecessor, this textbook remains an excellent resource for teaching students how to fit stochastic models to data.
Author: Simon Rogers
Publisher: CRC Press
Release Date: 2011-10-25
Genre: Business & Economics
A First Course in Machine Learning covers the core mathematical and statistical techniques needed to understand some of the most popular machine learning algorithms. The algorithms presented span the main problem areas within machine learning: classification, clustering and projection. The text gives detailed descriptions and derivations for a small number of algorithms rather than cover many algorithms in less detail. Referenced throughout the text and available on a supporting website (http://bit.ly/firstcourseml), an extensive collection of MATLAB®/Octave scripts enables students to recreate plots that appear in the book and investigate changing model specifications and parameter values. By experimenting with the various algorithms and concepts, students see how an abstract set of equations can be used to solve real problems. Requiring minimal mathematical prerequisites, the classroom-tested material in this text offers a concise, accessible introduction to machine learning. It provides students with the knowledge and confidence to explore the machine learning literature and research specific methods in more detail.
Author: Liang Sun
Publisher: CRC Press
Release Date: 2016-04-19
Genre: Business & Economics
Similar to other data mining and machine learning tasks, multi-label learning suffers from the curse of dimensionality. An effective way to mitigate this problem is through dimensionality reduction, which extracts a small number of features by removing irrelevant, redundant, and noisy information. The data mining and machine learning literature currently lacks a unified treatment of multi-label dimensionality reduction that incorporates both algorithmic developments and applications. Addressing this shortfall, Multi-Label Dimensionality Reduction covers the methodological developments, theoretical properties, computational aspects, and applications of many multi-label dimensionality reduction algorithms. It explores numerous research questions, including:
• How to fully exploit label correlations for effective dimensionality reduction
• How to scale dimensionality reduction algorithms to large-scale problems
• How to effectively combine dimensionality reduction with classification
• How to derive sparse dimensionality reduction algorithms to enhance model interpretability
• How to perform multi-label dimensionality reduction effectively in practical applications
The authors emphasize their extensive work on dimensionality reduction for multi-label learning. Using a case study of Drosophila gene expression pattern image annotation, they demonstrate how to apply multi-label dimensionality reduction algorithms to solve real-world problems. A supplementary website provides a MATLAB® package for implementing popular dimensionality reduction algorithms.