Exploratory Multivariate Analysis by Example Using R
Chapman & Hall/CRC
Computer Science and Data Analysis Series
The interface between the computer and statistical sciences is increasing, as each discipline
seeks to harness the power and resources of the other. This series aims to foster the integration
between the computer sciences and statistical, numerical, and probabilistic methods by
publishing a broad range of reference works, textbooks, and handbooks.
SERIES EDITORS
David Blei, Princeton University
David Madigan, Rutgers University
Marina Meila, University of Washington
Fionn Murtagh, Royal Holloway, University of London
Proposals for the series should be sent directly to one of the series editors above, or submitted to:
Chapman & Hall/CRC
4th Floor, Albert House
1-4 Singer Street
London EC2A 4BQ
UK
Published Titles
Bayesian Artificial Intelligence, Second Edition
Kevin B. Korb and Ann E. Nicholson
Introduction to Machine Learning and Bioinformatics
Sushmita Mitra, Sujay Datta, Theodore Perkins, and George Michailidis
Clustering for Data Mining: A Data Recovery Approach
Boris Mirkin
Computational Statistics Handbook with MATLAB®, Second Edition
Wendy L. Martinez and Angel R. Martinez
Correspondence Analysis and Data Coding with Java and R
Fionn Murtagh
Microarray Image Analysis: An Algorithmic Approach
Karl Fraser, Zidong Wang, and Xiaohui Liu
Pattern Recognition Algorithms for Data Mining
Sankar K. Pal and Pabitra Mitra
R Graphics
Paul Murrell
R Programming for Bioinformatics
Robert Gentleman
Design and Modeling for Computer Experiments
Kai-Tai Fang, Runze Li, and Agus Sudjianto
Exploratory Data Analysis with MATLAB®
Wendy L. Martinez and Angel R. Martinez
Semisupervised Learning for Computational Linguistics
Steven Abney
Exploratory Multivariate Analysis by Example Using R
François Husson, Sébastien Lê, and Jérôme Pagès
Introduction to Data Technologies
Paul Murrell
Statistical Computing with R
Maria L. Rizzo
Exploratory Multivariate Analysis
by Example Using R
François Husson
Sébastien Lê
Jérôme Pagès
CRC Press
Taylor & Francis Group
6000 Broken Sound Parkway NW, Suite 300
Boca Raton, FL 33487-2742
© 2011 by Taylor and Francis Group, LLC
CRC Press is an imprint of Taylor & Francis Group, an Informa business
No claim to original U.S. Government works
Printed in the United States of America on acid-free paper
10 9 8 7 6 5 4 3 2 1
International Standard Book Number: 978-1-4398-3580-7 (Hardback)
This book contains information obtained from authentic and highly regarded sources. Reasonable efforts have been made
to publish reliable data and information, but the author and publisher cannot assume responsibility for the validity of all
materials or the consequences of their use. The authors and publishers have attempted to trace the copyright holders of all
material reproduced in this publication and apologize to copyright holders if permission to publish in this form has not
been obtained. If any copyright material has not been acknowledged please write and let us know so we may rectify in any
future reprint.
Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced, transmitted, or utilized in
any form by any electronic, mechanical, or other means, now known or hereafter invented, including photocopying,
microfilming, and recording, or in any information storage or retrieval system, without written permission from the publishers.
For permission to photocopy or use material electronically from this work, please access www.copyright.com
(http://www.copyright.com/) or contact the Copyright Clearance Center, Inc. (CCC), 222 Rosewood Drive, Danvers, MA 01923,
978-750-8400. CCC is a not-for-profit organization that provides licenses and registration for a variety of users. For organizations that
have been granted a photocopy license by the CCC, a separate system of payment has been arranged.
Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and are used only for
identification and explanation without intent to infringe.
Library of Congress Cataloging‑in‑Publication Data
Husson, François.
Exploratory multivariate analysis by example using R / François Husson, Sébastien Lê, Jérôme
Pagès.
p. cm. -- (Chapman & Hall/CRC computer science & data analysis)
Summary: “An introduction to exploratory techniques for multivariate data analysis, this book
covers the key methodology, including principal components analysis, correspondence analysis,
mixed models, and multiple factor analysis. The authors take a practical approach, with examples
leading the discussion of the methods and many graphics to emphasize visualization. They present
the concepts in the most intuitive way possible, keeping mathematical content to a minimum
or relegating it to the appendices. The book includes examples that use real data from a range of
scientific disciplines and implemented using an R package developed by the authors.”-- Provided
by publisher.
Includes bibliographical references and index.
ISBN 978-1-4398-3580-7 (hardback)
1. Multivariate analysis. 2. R (Computer program language) I. Lê, Sébastien. II. Pagès, Jérôme. III.
Title. IV. Series.
QA278.H87 2010
519.5’3502855133--dc22
2010040339
Visit the Taylor & Francis Web site at
http://www.taylorandfrancis.com
and the CRC Press Web site at
http://www.crcpress.com
Contents
Preface
1 Principal Component Analysis (PCA)
1.1 Data - Notation - Examples
1.2 Objectives
1.2.1 Studying Individuals
1.2.2 Studying Variables
1.2.3 Relationships between the Two Studies
1.3 Studying Individuals
1.3.1 The Cloud of Individuals
1.3.2 Fitting the Cloud of Individuals
1.3.2.1 Best Plane Representation of N_I
1.3.2.2 Sequence of Axes for Representing N_I
1.3.2.3 How Are the Components Obtained?
1.3.2.4 Example
1.3.3 Representation of the Variables as an Aid for Interpreting the Cloud of Individuals
1.4 Studying Variables
1.4.1 The Cloud of Variables
1.4.2 Fitting the Cloud of Variables
1.5 Relationships between the Two Representations N_I and N_K
1.6 Interpreting the Data
1.6.1 Numerical Indicators
1.6.1.1 Percentage of Inertia Associated with a Component
1.6.1.2 Quality of Representation of an Individual or Variable
1.6.1.3 Detecting Outliers
1.6.1.4 Contribution of an Individual or Variable to the Construction of a Component
1.6.2 Supplementary Elements
1.6.2.1 Representing Supplementary Quantitative Variables
1.6.2.2 Representing Supplementary Categorical Variables
1.6.2.3 Representing Supplementary Individuals