Edited by Connie M. Borror

Applied Regression Including Computing and Graphics

Editor's Reviews of New Editions, Collections of Papers, and Other Books
Probability and Statistics for Engineers and Scientists
by Ronald E. Walpole, Raymond H. Myers, and Sharon L. Myers
Statistical Methods for Quality Improvement
by Thomas P. Ryan
Applied Regression Including Computing and Graphics by R. Dennis Cook and Sanford Weisberg. John Wiley & Sons, Inc., New York, NY, 1999. xxvi + 593 pp. $89.95.
Reviewer: James Wisnowski, Department of Mathematical Sciences, USAF Academy, CO 80840.
Regression is one of the most commonly used statistical techniques and is addressed in virtually all statistics texts independent of the intended audience. To motivate simple linear regression (SLR), it is natural to show a scatterplot with a superimposed least squares fit line to describe the relationship between a single predictor variable and a response variable. Unfortunately, there are few graphical displays for more than one predictor variable that can characterize a regression as well as the scatterplot does for SLR. Even texts devoted to regression do not present many graphical methods beyond the SLR scatterplot and the usual residual plots to assess model adequacy. Cook and Weisberg complement the traditional methods of regression modeling in Applied Regression Including Computing and Graphics with a focus on a graphical approach to regression modeling. The authors contribute numerous unique graphical methods that allow the reader to gain a deeper appreciation of and insight into the regression problem. Their objective is to provide as much information in a few plots about a multiple linear regression model as the scatterplot provides for SLR. The theory and motivation for many of these new graphical approaches are detailed in Cook's 1998 text Regression Graphics: Ideas for Studying Regressions Through Graphics. Cook and Weisberg implement many of those ideas in this text with less mathematical rigor.
Clearly, a graphically intensive approach to regression requires software. Most of the plots introduced in the text cannot be accomplished with standard statistical software. Fortunately, the authors have provided a web site to download their Xlisp-Stat program Arc. I had no difficulty installing this software and found it remarkably powerful, yet easy to use. Most graphical and computational analyses can be performed through the intuitively designed menu options. Command line options are available, but not well documented. The best feature of the software is that every graph is "alive." That is, slidebars and pull-down menus directly on the graph allow you to easily perform insightful sensitivity analyses. Although Arc's strength is its graphical capability, it does provide most of the usual regression quantities of interest. The 25-page appendix to the text provides some basic instructions on the Arc software. The appendix and on-line help menu are not sufficient user's manuals; you'll need to read the text.
The 23 chapters in the text are divided into four parts: Introduction, Tools, Graphics, and Other Models. The 5 chapters in Part I immediately signal that this is not just another regression text. Chapter 1 introduces single sample descriptive statistics and graphical summaries (boxplots and histograms) using Arc. In Chapter 2, we are already slicing and brushing scatterplots as an introduction to regression. Chapter 3 extends the slicing idea to the non-parametric smoother lowess (locally weighted scatterplot smoothing). This chapter also provides insight on the use of boxplots to investigate the regression problem. Bivariate distributions and correlation are discussed and graphically explored in Chapter 4. The authors use Chapter 5 to motivate power transformations by changing default graphical settings on scatterplots. You can often see new relationships and the need to transform to normality by simply reorienting a plot.
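To fix ideas, the lowess smoother that Chapter 3 builds on can be sketched in a few lines of Python. This is a minimal, single-pass version with tricube weights and local linear fits, not Arc's implementation (which adds robustness iterations and other refinements):

```python
import numpy as np

def lowess(x, y, frac=0.5):
    """Minimal lowess: at each x[i], fit a weighted straight line to the
    nearest frac*n points using tricube weights and evaluate it at x[i].
    (Single pass; no robustness iterations.)"""
    x, y = np.asarray(x, float), np.asarray(y, float)
    n = len(x)
    k = max(2, int(frac * n))              # neighborhood size
    fitted = np.empty(n)
    for i in range(n):
        d = np.abs(x - x[i])
        idx = np.argsort(d)[:k]            # k nearest neighbors of x[i]
        h = d[idx].max() or 1.0            # neighborhood half-width
        w = (1 - (d[idx] / h) ** 3) ** 3   # tricube weights
        # weighted least squares line through the neighborhood
        coef = np.polyfit(x[idx], y[idx], 1, w=np.sqrt(w))
        fitted[i] = np.polyval(coef, x[i])
    return fitted
```

Varying `frac` is exactly the kind of sensitivity analysis Arc's slidebars make effortless: a small fraction tracks local wiggles, a large one approaches the global least squares line.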
The first two chapters of Part II, Tools, provide a traditional overview of simple and multiple linear regression, respectively, at an intentionally mild mathematical level. Arc is integrated into the discussion, and the complements to each chapter give a more rigorous treatment of many standard regression topics. Chapter 8 is a tutorial on Arc's three-dimensional plot capabilities for regression. The authors address lack of fit and weighted least squares in Chapter 9 from a conventional approach. Chapter 10 details the contributions of the added variable plot and the confidence region graph to understanding the coefficient estimates. Arc's capabilities and some of the theory behind the predictor variable selection problem are presented in Chapter 11, and categorical factors and interactions are briefly addressed in Chapter 12. Chapter 13 offers the inverse fitted value plot as an alternative to a Box-Cox transformation of the response. Chapters 14 and 15 discuss diagnostics to test for model adequacy. I was a little disappointed to see only the standard diagnostic plots of residuals and outlier tests in these chapters from two of the world's leading experts in the field. Admittedly, the Arc diagnostic graphs are better than those the standard software packages provide, and the current level of diagnostic discussion is appropriate for the text's objective. Chapter 16 uses the Ceres (combining conditional expectations and residuals) plot, similar to the partial residuals plot, to determine appropriate transformations of the predictor variables. Chapter 17 discusses the advantages of complementing residual plots with model checking plots that graph the responses, not the residuals, against a function of the predictors.
In Part III, Graphics, the authors present graphical approaches to search for structure in multiple regression without the assumption that a linear model holds. The key to regression graphics in high dimension is determining how many linear combinations of the predictor variables are required to characterize the regression without loss of information. This is the structural dimension of the data. Chapter 18 explains the structural dimension for two or fewer predictor variables. Chapter 19 introduces several new techniques to determine structural dimension for a large number of predictor variables. The material in the first 19 chapters builds to Chapter 20. Here, the goal is to use just a few summary 3D plots of linear combinations (principal Hessian directions) of the predictors to find coefficient estimates that adequately model the response variable.
Part IV extends some of the results to generalized linear models. Chapter 21 explains the logistic regression model using the binomial distribution, and Chapter 22 recommends using some alternative plots available in Arc to help suggest appropriate transformations and diagnose logistic model adequacy. Chapter 23 briefly introduces Poisson and Gamma regression and how to fit these models in Arc.
The text is well written with numerous realistic examples and homework problems at varying levels of difficulty. I would agree with the authors that you need Arc loaded in front of you (all example data are already loaded in the software) to get the most from this text. This could also be viewed as a drawback for teaching a lecture-based course at a university; instruction in a computer lab environment seems mandatory. Although written at a relatively low mathematical level, I believe students outside the discipline will still struggle with many of the concepts of regression as presented if they have only the authors' prerequisite one semester of basic statistics. The authors do provide some remedies by offering a course on how to teach out of this text and also a teacher's manual (not reviewed). For the applied statistics practitioner and quality engineer, I would highly recommend the text with associated software to supplement your standard regression modeling practices; you're certain to discover some useful relationships that you would otherwise have missed.
Design and Analysis of Experiments by Angela Dean and Daniel Voss. Springer, New York, NY, 1999. 740 pp. $79.50.
Reviewer: Karen A. F. Copeland, Statistical Consultant, Boulder, CO 80304.
Extending from their belief that planning, running, and analyzing simple experiments is the best way to learn about design and analysis of experiments, the authors have assembled an aggressive list of topics that are covered in this example-rich text. The table of contents is similar to those of Montgomery (1997) and Kuehl (1994), with the flavor of the text more similar to Kuehl (1994), a general DOE text, than to Montgomery (1997), an industry-focused text. The material is written so that it can be used by upper level undergraduate students, graduate students in non-statistical fields (given that they have some statistical background), and beginning statistics graduate students. A basic statistics course (i.e., knowledge of topics such as hypothesis testing, ANOVA, and basic regression) would be a suitable minimum prerequisite to a course using this text. Note that there is an abundance of material in this text, so one would not even attempt to cover it all in one semester, nor do the authors suggest doing so.
The text has 19 chapters, each concluding with exercises for the reader (no answers given).
In Chapter 2 the authors introduce a "checklist" for planning experiments. Their checklist has nine detailed steps to follow in setting up an experiment, and they employ this checklist throughout the text when real experiments are discussed. One item that I feel is missing from such a detailed checklist is a check of the quality of a measurement. Item (d) on the list is "Specify the measurements to be made, the experimental procedure, and the anticipated difficulties." The discussion of this item mentions significant digits as relating to detecting desired differences; however, I would have liked to see more discussion of the repeatability and reproducibility issues associated with measurements (having encountered designs where useless measurements were used). The text does not include a discussion of gage R&R studies. Chapter 2 concludes with brief overviews of many types of designs and descriptions of three simple experiments (results presented graphically). The discussion of the experiments sets the stage for the use of experiments in the remainder of the text. Each time an experiment is introduced, the checklist is used. Experiments are continued throughout the text as relevant topics are introduced. The experiments are also the basis of the exercises at the end of each chapter. The experiments discussed range from student-run studies, such as one on popcorn yields, to more complicated experiments taken from the literature. The focus on experiments for learning is good; however, it does consume a significant portion of the text.
Chapter 3 is where the main material of the text begins. Each of Chapters 3 through 19 has a "Using SAS Software" section that illustrates how to perform the analysis covered in the chapter using SAS. These sections contain both code and output, making them a valuable resource for learning and using SAS.
The analysis for all designs is based on assumed models and preplanned analysis; no data exploration is considered. The textbook assumption that one knows which interactions are important is often cited. In Chapter 15 it is said that fractional factorial experiments have "the disadvantage that each main-effect and interaction contrast will be confounded with one or more other main-effect and interaction contrasts and cannot be estimated separately." I found that statement to be a bit misleading. One other discussion that I was uneasy with occurs in Chapter 19, prior to the introduction of split-plot designs. The authors are discussing a previous design and are counting the number of times a setting has to be changed in the course of running the experiment based on the randomization of the design. There is no mention that independence is lost when each factor is not reset for each run (see, e.g., Lucas and Hazel (1997) or Ganju and Lucas (1997) for detailed discussions of this topic). Another downside to the text is that there is no mention of multiple response problems. This may seem picky, but in the data-rich environment in which we now live, multiple responses are the norm, not the exception.
For the instructor without a great deal of actual DOE experience who wishes to teach a hands-on, experiment-oriented DOE course, this text offers a great wealth of examples to teach from. Others may find the examples a bit much. The other main reason to consider this text is the SAS sections, which would make it possible for students to learn SAS without losing a great deal of class time to SAS programming lessons.
Ganju, J. and Lucas, J. M. (1997). "Bias in Test Statistics when Restrictions on Randomization are Caused by Factors." Communications in Statistics - Theory and Methods 26, pp. 47-63.
Kuehl, Robert O. (1994). Statistical Principles of Research Design and Analysis. Duxbury-Wadsworth, Belmont, CA.
Lucas, J. M. and Hazel, M. C. (1997). "Running Experiments With Multiple Error Terms: How an Experiment is Run is Important." ASQ Annual Quality Congress Transactions, pp. 283-296.
Montgomery, Douglas C. (1997). Design and Analysis of Experiments, 4th ed. John Wiley & Sons.
Statistical Case Studies for Industrial Improvement by Veronica Czitrom and Patrick D. Spagon. Society for Industrial and Applied Mathematics and American Statistical Association, 1997. xxvii + 514 pp. $52.00 (Members: $41.60).
Reviewer: Lloyd S. Nelson, Statistical Consultant, Londonderry, NH 03053-3647.
Benjamin Franklin wrote, "Experience keeps a dear school, but fools will learn in no other." On the other hand, smart people take full advantage of experience, their own as well as that of other people. The book under review provides a wonderful collection of superbly documented industrial statistical experiences. Although it must be pointed out that the book deals exclusively with problems in the integrated circuit industry, it is important to emphasize that these problems are remarkably like those in virtually everyone else's industry.
The book is divided into seven parts with the following titles: 1. Gauge Studies, 2. Passive Data Collection, 3. Design of Experiments, 4. Statistical Process Control, 5. Equipment Reliability, 6. Comprehensive Case Study, and 7. Appendices. Each part begins with an "Introduction to (Part title)" and contains about five chapters giving the details of an experimental statistical enquiry. It is clear that much effort went into assuring that these contributions were both lucid and correct.
A beginner will appreciate the chapter entitled "Glossary of Selected Statistical Terms," in which sixteen types of graphs are discussed and exemplified, as well as many of the common concepts and tests. Also included is a 9-page introduction to the analysis of variance. The last item is a table that lists forty-two statistical topics with references to the chapters where each can be found. An excellent index is given. Included with the book are two diskettes (for IBM compatibles and Macintosh) that hold most of the data sets given in the book.
This book will inspire experienced researchers. Beginners, in addition, will learn a good deal about statistical techniques for industrial problem solving.
The following brief editor's reviews are of new editions, collections of papers, or other books that may be of interest to some readers.
Connie M. Borror, Industrial and Management Systems Engineering, Arizona State University, Tempe, AZ 85287-5906.
Probability and Statistics for Engineers and Scientists, 6th ed. by Ronald E. Walpole, Raymond H. Myers, and Sharon L. Myers. Prentice-Hall, Inc., Upper Saddle River, NJ 07458. 1998. xii + 793 pp. $96.00.
The first edition of Probability and Statistics for Engineers and Scientists was published in 1972, and the book has been improved and expanded in each subsequent edition. The sixth edition is no exception, now with a third co-author, Dr. Sharon L. Myers. The basic material and its presentation are well written, and the book should be considered for any calculus-based statistics course. The authors have made a number of changes and improvements, outlined in this review.
In the Preface, the authors list many of the enhancements and changes.
One note should be made at this point about the data files at the Prentice-Hall website. As is often the case in the "internet" age, the ftp site given in the Preface of the textbook is outdated. The data files are now available for purchase on the Pearson Prentice-Hall Web site.
Overall, this edition, as with the previous editions, is complete and well written. There are approximately 991 exercises (about 79 more than in the fifth edition) within the seventeen chapters. It is an outstanding textbook either to be used in a statistics course or as a reference. The authors have maintained the excellent presentation found in the previous editions with the many enhancements mentioned above.
Statistical Methods for Quality Improvement, 2nd ed. by Thomas P. Ryan. John Wiley & Sons, Inc., New York, NY. 1999. xxvi + 593 pp. $89.95.
Statistical Methods for Quality Improvement, in its second edition, adds many new features along with enhancements of material from the first edition. The first edition had sixteen chapters; the second edition adds a seventeenth, which discusses statistical process control (SPC) tools in connection with Six-Sigma programs.
The chapters and titles are listed next along with the number of exercises (given in parentheses) for each chapter.
Much material has been reorganized and expanded. For example, exponentially weighted moving average (EWMA) control charts and cumulative sum (CUSUM) control charts, which are covered in extensive detail, are in their own chapter (Chapter 8). Ryan has extensively expanded the material included in Chapter 14. He has also included discussion on short-run control charts, pre-control, nonparametric control charts, and autocorrelated data control charts (all in Chapter 10).
It is important to note that although many topics are included in this text, some topics receive only terse treatment. For example, Ryan has a section (10.4) entitled "Charts for Batch Processes" which consists of only five sentences. He presents a typical batch process setup and then provides some discussion on the use of standard control charts for batch processes. As a second example, Section 10.6, entitled "Nonparametric Control Charts," consists of a one-page discussion of this topic. Again, the author presents the topic and then discusses its pros and cons. The terse treatment of some topics should not be seen as a negative in this book; quite the contrary. Ryan presents topics that have been discussed in detail in the literature and provides references in every section and subsection for the reader to further explore a particular topic. The references that Ryan provides are invaluable to anyone interested in a topic presented in the text, regardless of the treatment provided.
On the other hand, many topics have been extensively enhanced and expanded upon from the first edition. The expansion of material has resulted in (and was a result of) the addition of numerous new references throughout the text. I believe that for the book to be used in various courses (undergraduate courses, short courses), it would have to be supplemented with more exercises and/or case studies, as is often the case for such courses. Overall, the book is well written and would be an excellent textbook in a course or as a reference text.
American Statistical Association, 1429 Duke St., Alexandria, VA 22314-3402, USA; (888) 231-3473; Fax: (703) 684-2037; www.amstat.org
John Wiley & Sons, Inc., 605 Third Ave., New York, NY 10128; (800) 225-5945; www.wiley.com
Prentice-Hall, Inc.,One Lake Street, Upper Saddle River, NJ 07458; (800) 643-5505; Fax (800) 835-5327; www.prenhall.com
Society for Industrial and Applied Mathematics, 3600 University City Science Center, Philadelphia, PA 19104-2688; Phone: (215) 382-9800; Fax: (215) 386-7999; www.siam.org
Springer-Verlag New York, Inc., 175 Fifth Avenue, New York, NY 10010-7858; (212) 460-1579; www.springer-ny.com