Edited by Connie M. Borror

The Mahalanobis-Taguchi Strategy

by Genichi Taguchi and Rajesh Jugulum

Recurrent Events Analysis for Product Repairs, Disease Recurrences, and Other Applications

by Wayne B. Nelson

Statistical Process Control for Health Care

by M. K. Hart and R. F. Hart

Statistical Process Control, The Deming Paradigm and Beyond 2nd ed.

by James R. Thompson and Jacek Koronacki

*Reviewer: Douglas C. Montgomery*, Professor of Engineering and Statistics, Arizona State University, Tempe, AZ 85287-5906.

**The Mahalanobis-Taguchi Strategy** by *Genichi Taguchi and Rajesh Jugulum*. John Wiley & Sons, Inc. New York, NY 2002. xii + 234 pp. $125.00.

The Mahalanobis-Taguchi Strategy (MTS) is a collection of multivariate methods that have been proposed as useful for diagnosis and forecasting. To use the approach, data must be available on a "normal" or "healthy" group of items and an "unhealthy" or "abnormal" group of items. The abnormal items may be further classified into subgroups based on the severity of the type of abnormality. A Mahalanobis distance (MD) measure based on a set of descriptive variables for the items is used to separate normal and abnormal items. If it can be established that such an MD measure exists, the number of variables is reduced (if possible) using Taguchi-type orthogonal arrays (OA) and signal-to-noise (S/N) ratios. The MD scale using these variables is used as the basis for diagnosis and forecasting. The authors of the book cite a number of application areas for MTS, including medical diagnosis, pattern recognition, on-line testing of products, reliability/serviceability problems, and a broad range of quality and process improvement problems. Other references that describe the approach are the book by Taguchi et al. (2000) and the journal articles by Taguchi and Jugulum (1999, 2000a, 2000b) and Woodall et al. (2003).
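For readers who want to see the distance step in concrete terms, the following is a minimal sketch in Python, not the authors' implementation. It assumes the scaled form commonly described for MTS: each item is standardized with the mean and standard deviation of the normal group, and the squared distance based on the normal group's correlation matrix is divided by the number of variables. The function name `scaled_mahalanobis` and the data are hypothetical; the data are synthetic, not the liver disease data. The variable-reduction step with OAs and S/N ratios is not shown.

```python
import numpy as np


def scaled_mahalanobis(reference, items):
    """Scaled squared Mahalanobis distance of each row of `items`
    from the reference ("normal") group.

    Assumption of this sketch: the distance is divided by the number
    of variables k, so the reference group averages close to 1.
    """
    reference = np.asarray(reference, dtype=float)
    items = np.asarray(items, dtype=float)

    # Standardize with the reference group's mean and standard deviation.
    mean = reference.mean(axis=0)
    std = reference.std(axis=0, ddof=1)
    z = (items - mean) / std

    # Correlation matrix of the reference group and its inverse.
    corr_inv = np.linalg.inv(np.corrcoef(reference, rowvar=False))

    # Quadratic form z' C^{-1} z for every row, divided by k.
    k = reference.shape[1]
    return np.einsum("ij,jk,ik->i", z, corr_inv, z) / k


# Synthetic illustration only: 200 "normal" and 17 "abnormal" items, 3 variables.
rng = np.random.default_rng(0)
normal = rng.normal(size=(200, 3))
abnormal = rng.normal(loc=3.0, size=(17, 3))

md_normal = scaled_mahalanobis(normal, normal)
md_abnormal = scaled_mahalanobis(normal, abnormal)
print(md_normal.mean(), md_abnormal.mean())
```

With this scaling the average distance in the normal group is (n − 1)/n by construction, so the question of interest is how far the abnormal items sit above that baseline.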

The book contains 12 chapters:

- Chapter 1 (15 pages) is an introduction, providing an overview of the approach and a brief discussion of related methods.
- In Chapter 2 (20 pages), the authors introduce MTS and MTGS, a variation of MTS in which the MDs are calculated using the Gram-Schmidt orthogonalization process. In this chapter, a set of medical diagnosis data involving liver disease is introduced. The data set has 200 healthy individuals, 17 unhealthy individuals, and 17 covariates for each individual. This data set is used throughout the book to illustrate various aspects of MTS.
- Chapter 3 (16 pages) deals with the advantages and limitations of the MTS/MTGS procedures.
- In Chapter 4 (20 pages), the role of OAs and S/N ratios in multivariate diagnosis is introduced.
- Chapter 5 (11 pages) illustrates the applicability of MTS/MTGS for categorical data.
- In Chapter 6 (4 pages), the authors describe different ways to treat noise variables in MTS.
- In Chapter 7 (11 pages), the role of quality loss functions is described.
- Chapter 8 (4 pages) has a discussion of the standard error of the measurement scale developed using MTS/MTGS methods.
- Chapter 9 (24 pages) contains discussions of advanced topics such as multicollinearity.
- In Chapter 10 (22 pages), MTS/MTGS is compared with classical statistical methods (principal components, discrimination and classification, multiple and stepwise regression, Rao's test, and multivariate control charts) and with artificial neural networks. The liver disease data are used again.
- Chapter 11 (21 pages) contains brief case studies.
- Chapter 12 (5 pages) is a conclusion, with some commentary about the scientific contributions of MTS and suggestions for future research.

Considering the relatively complex nature of the subject matter, the book is very tersely written. I think that the typical *JQT* reader would experience considerable difficulty obtaining anything more than an "overview level" of knowledge about MTS from the book, somewhat similar to the level of knowledge that I have about the engine in my car (I am an engineer by training, so I have a good conceptual understanding of an internal combustion engine, and at one time I even knew a lot about the theory of ICEs; however, I would be pretty uncomfortable if I had to disassemble the engine, rebuild the components, and make the assembly work properly again). Now, if a user is going to implement and use MTS via computer software, that overview level of understanding may be satisfactory. I obtained a much better understanding of MTS from the article by Woodall et al. (2003).

This leads me to a broader issue: should you be interested in learning about and using MTS? For background on this question, note that Taguchi's work in design of experiments is well known. It generated considerable controversy, not because of the problem he attacked (the robust parameter design or RPD problem), but because of the statistical methods he advocated to solve that problem. So far as the statistical community is concerned, Taguchi's statistical methods for design of experiments and his approach to solving the RPD problem have often been shown to be very inefficient and ineffective. Replacing the Taguchi approach to the RPD problem with an approach based on the response surface framework is generally recognized as a simpler and more effective approach. On the other hand, Taguchi deserves enormous credit for recognizing the RPD problem and for motivating the design of experiments community to develop good solutions to the problem.

That is not the case here. MTS resurrects all of the Taguchi-type design of experiments techniques (orthogonal arrays, S/N ratios) and uses them in a new setting. However, the contexts in which the authors apply MTS are not new (as the RPD problem was); rather, they are problems for which there is already a broad range of statistical methods that are, if not perfect, at least well understood and well characterized. The authors have ignored these approaches in the development and explanation of MTS. It is also dismaying to see that the book discusses at some length the advantages of using OAs and S/Ns without mentioning any of the serious criticisms of these techniques.

So one must ask, is there any significant advantage to using MTS as an alternative to the well-established procedures that are already available? Woodall et al. (2003) presented an in-depth review, analysis, and critique of the MTS procedure. The discussants of this paper also have many illuminating and interesting points to add. *I strongly recommend* this as *required reading* for anyone contemplating an excursion into the world of MTS. Woodall et al. (2003) also gave an in-depth analysis and discussion of the liver disease data that are used extensively in this book (and other publications on the MTS procedure). They note that the MTS analysis does not lead to satisfactory results as far as separation of the classes of individuals with abnormalities is concerned. As one of the authors of this paper is a physician with considerable disciplinary knowledge, one must accept this conclusion as authoritative.

Woodall et al. (2003) also concluded that MTS is not well-defined from either an operational or conceptual viewpoint. MTS is not based on any formal probability model. The notion of multivariate normality is not even mentioned, for example. This lack of reliance on a probabilistic framework results in serious shortcomings with MTS in several areas, such as recognizing and dealing appropriately with variability between individuals.

Sadly, Woodall et al. (2003) concluded that MTS will probably become more widely used in industry. I agree with this assessment. Many practitioners know about the importance of the problems the method purports to solve and recognize that the multivariate nature of their data is an important consideration. However, they don't have the background to understand the difficulties with MTS, but they will be sold the procedure anyway. This sounds like a terrible scenario, but I'm absolutely convinced it will occur. The same consultants that are still actively pushing the Taguchi approach to RPD, even though its shortcomings have been well-documented for a decade, are certainly positioning themselves to do exactly the same thing with MTS.

I cannot recommend this book. I most certainly cannot recommend the MTS procedure. Anyone interested in MTS would be much better served by reading the Woodall et al. (2003) paper. However, before an application of MTS is undertaken, please carefully consider other more reliable statistical methods.

Taguchi, G.; Chowdhury, S.; and Wu, Y. (2000). *The Mahalanobis-Taguchi System*. McGraw-Hill, New York, NY.

Taguchi, G. and Jugulum, R. (1999). "Role of S/N Ratios in Multivariate Diagnosis". *Journal of the Japanese Quality Engineering Society* 7, pp. 63–69.

Taguchi, G. and Jugulum, R. (2000a). "Quality Loss Function in Multivariate Diagnosis". *Journal of the Japanese Quality Engineering Society* 8, pp. 47–52.

Taguchi, G. and Jugulum, R. (2000b). "New Trends in Multivariate Diagnosis". *Sankhya* (Series B) 62, Part 2, pp. 233–248.

Woodall, W. H.; Koudelik, R.; Tsui, K.-L.; Kim, S. B.; Stoumbos, Z. G.; and Carvounis, C. P. (2003). "A Review and Analysis of the Mahalanobis Taguchi System". *Technometrics* 45, pp. 1–30.

*Reviewer: Necip Doganaksoy*, General Electric Global Research, Schenectady, NY 12345.

**Recurrent Events Analysis for Product Repairs, Disease Recurrences, and Other Applications** by *Wayne B. Nelson*. ASA-SIAM Series on Statistics and Applied Probability, SIAM, Philadelphia, PA, ASA, Alexandria, VA, xi + 144 pp. $59.50 ASA/SIAM members, $85.00 nonmembers.

Many applications in reliability, medicine, sociology, and other fields involve statistical modeling and analysis of data on time to the occurrence of some event of interest. The following two types of such data must be distinguished:

- Life data on products, patients, etc., consist of a single time for each population unit, usually the end of life.
- Recurrence data consist of times for any number of repeated events on a population unit (for example, repairs of a product or recurrent disease episodes of a patient).

In comparison with the analysis of life data, the analysis of recurrence data is grossly underdeveloped, particularly from a practitioner's perspective. Data on repeated events have been typically modeled with a parametric stochastic counting process (e.g., homogeneous and nonhomogeneous Poisson processes and renewal processes). These models are often limited to a single system, entail assumptions that are unrealistic or difficult to verify, and require knowledge of advanced topics that severely limit their accessibility to novice practitioners. Few statistical packages facilitate fitting of such models to data on recurring events from a sample of units. As indicated by Ascher and Feingold (1984), it is not at all uncommon for the methods appropriate for life data to be incorrectly used to analyze times between system failures. The need for practical models and methods for repairable systems data has gained increased recognition in recent years. The recent book by Rigdon and Basu (2000) provided an excellent exposition of parametric stochastic process models and their applications. Also, some newer statistical reliability books have chapters devoted to repair data, with an emphasis on data analysis (Tobias and Trindade (1995) and Meeker and Escobar (1998)).

This book provides easy-to-use nonparametric and graphical tools to analyze different types of recurrence data that are encountered in practice. It is written mainly for practitioners (statisticians and subject experts in reliability, biomedical, sociological, and other applications). The nonparametric mean cumulative function (MCF) for the number (or cost) of events contains most of the information sought from such data. The MCF is the population mean cumulative cost or number of recurrences up to time *t*. The book provides a thorough presentation of the nonparametric estimator and confidence limits for the MCF for common types of censored recurrence data. The author has been a major contributor to the development of the statistical theory for the nonparametric estimator. The MCF plots are as basic and as informative as probability plots are for life data. Furthermore, the MCF estimator (and plot) applies to cost of recurrences and other "values" of recurrences, an important innovation, whereas most books and literature deal only with counts of recurrences. The book also briefly introduces some regression (Poisson and Cox) and parametric models.
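To give a feel for how simple the nonparametric estimate is, here is a minimal sketch in Python for exact age data. It is an illustration under stated assumptions, not code from the book: at each recurrence age the estimate increases by one over the number of units still under observation, and a recurrence that ties with a censoring age is processed first. The function name `mcf_estimate` and the three unit histories are hypothetical, and confidence limits are not computed.

```python
def mcf_estimate(histories):
    """Nonparametric MCF estimate for exact age data.

    `histories` is a list of (recurrence_ages, censoring_age) pairs,
    one pair per unit. Returns a list of (age, mcf) points, one per
    recurrence, in increasing order of age.
    """
    events = []
    for recurrence_ages, censoring_age in histories:
        for age in recurrence_ages:
            events.append((age, 0))        # 0 = recurrence (sorts first at ties)
        events.append((censoring_age, 1))  # 1 = end of observation for the unit
    events.sort()

    at_risk = len(histories)  # units still under observation
    mcf = 0.0
    points = []
    for age, kind in events:
        if kind == 0:
            mcf += 1.0 / at_risk  # mean number of recurrences added at this age
            points.append((age, mcf))
        else:
            at_risk -= 1
    return points


# Hypothetical histories: (recurrence ages, censoring age) for three units.
histories = [
    ([5, 12], 20),
    ([8], 10),
    ([], 15),
]
points = mcf_estimate(histories)
for age, value in points:
    print(age, round(value, 4))
```

Replacing the increment `1.0 / at_risk` with the cost of the recurrence divided by `at_risk` would give the corresponding estimate for cumulative cost, provided a cost is recorded with each event.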

There are 8 chapters in the book. The material in Chapters 5 and 6 does not seem to have appeared before. An overview of the book's chapters follows:

- In Chapter 1 (Recurrent Events Data and Applications), the author uses real data to explain various types of censored recurrence data and describes the type of information sought from such data. These data include repairs of automobile transmissions, recurrences of bladder tumors, and births of children to statisticians, which are analyzed repeatedly in later chapters. This is one of the most important chapters of the book in that it provides much of the motivation for the remainder of the book. The author also describes important practical issues that must be resolved before application.
- In Chapter 2 (Population Model, MCF, and Basic Concepts), the simple and versatile nonparametric model for recurrent events data is presented.
- In Chapter 3 (MCF Estimates for Exact Age Data), the author presents and illustrates the nonparametric MCF estimator and its interpretation for a sample of units when the recurrence and censoring times are known exactly. This includes analysis of cost data as well as the number of recurrences.
- Chapter 4 (MCF Confidence Limits for Exact Age Data) is an illustration of confidence limits for the MCF estimator of Chapter 3.
- In Chapter 5 (MCF Estimate and Limits for Interval Age Data), the MCF estimator and confidence limits when the censoring and recurrence data are grouped into intervals are presented.
- In Chapter 6 (Analysis of a Mix of Events), methods for analyzing recurrence data with a mix of events (such as repair data containing different failure modes) are described.
- In Chapter 7 (Comparison of Samples), the author presents methods for comparing MCFs from different samples to assess if they differ significantly.
- Chapter 8 (Survey of Related Topics) provides a brief introduction to related topics such as the Poisson and renewal processes and the Cox model.

The book is clearly written, to the point, and easy to read. It is well suited for self-study by practitioners with an introductory-level background in statistics. The flow of the presentation is logical. In Chapter 1, the author uses several real datasets with different types of recurrence data to motivate the methods and analyses in subsequent chapters. The book contains many real data sets, most of them from the author's own applications dealing with industrial products. There are also quite a few other interesting examples from other fields, including customer purchase behavior at amazon.com, childbirths to statisticians, and bladder cancer tumor recurrences. All the applications are accompanied by insightful discussions of practical issues and interpretation of the findings. Most of the methods in the book can be easily implemented in a spreadsheet program. There is also a useful survey of suitable software packages. While the book has an applied orientation (most of the book consists of text, tables, and data plots), it is based on sound statistical theory. The assumptions are well explained. Estimators and confidence limits, based on approximations, are clearly spelled out.

The book could also be used as a supplementary text in a graduate or upper-level undergraduate course on reliability. There are problems and discussion questions at the end of each chapter. The bibliography is up-to-date and contains more than 90 references. The real data sets and practical discussions on the examples will benefit the students. I also expect that this book will motivate new research and development into areas where the existing methods are based on somewhat crude approximations (e.g., confidence limits for the MCF in the case of interval censored recurrence data). A minor criticism of the book concerns its typesetting. Some readers may find the small text font, and the even smaller font in some of the tables, annoying.

I highly recommend this book to anyone interested in analysis of recurrence data.

Ascher, H. and Feingold, H. (1984). *Repairable Systems Reliability: Modeling, Inference, Misconceptions and Their Causes*. Marcel Dekker, New York, NY.

Meeker, W. Q. and Escobar, L. A. (1998). *Statistical Methods for Reliability Data*. John Wiley & Sons, Inc., New York, NY.

Rigdon, S. and Basu, A. P. (2000). *Statistical Methods for the Reliability of Repairable Systems*. John Wiley & Sons, Inc., New York, NY.

Tobias, P. A. and Trindade, D. C. (1995). *Applied Reliability*, 2nd ed. Van Nostrand Reinhold, New York, NY.

*Reviewer: Lloyd S. Nelson*, Statistical Consultant, Londonderry, NH 03053-3647.

**Statistical Process Control for Health Care** by *M. K. Hart and R. F. Hart*. Duxbury, Pacific Grove, CA. 2002, ix + 343 pp. $77.95.

The stated objectives for the reader of this book are: (1) to understand the theory of statistical process control; (2) to analyze the data on the computer; and (3) to apply these techniques to real data. The inside back cover states that data sets are available on the Internet at www.duxbury.com/datasets.htm. There are numerous small data sets given in the text to illustrate procedures. However, readers are enjoined to carry out their computations on a computer. Examples are given on how to carry out analyses using two computer packages (Statit and Minitab).

This is a very nicely designed book that would be suitable as a text for a beginning class on control charts. It is also sufficiently self-contained to be well suited for self-teaching. I was pleased to notice that the authors resisted the temptation to explain why the denominator of the expression for the standard deviation is *n* - 1 rather than *n*. When discussing the arithmetic mean, the old-fashioned descriptive term *central tendency* is used. It would have been helpful if the modern term (meaning the same thing) *location* had been introduced.

Excellent examples of real-life data are used. The authors do not just describe their examples; they take the reader through them. Although the examples deal with hospital problems, anyone with the desire to learn about quality control could appreciate the basics that are introduced.

I have one criticism that I think is very important. The authors, in their discussion of control charts, continually refer to the significance (the Type I error probability) associated with points being out of control. Furthermore, they introduce a parallelism between a point being "out of control" on a Shewhart control chart and a "significant" result arising from the application of the analysis of means. The difficulty is that the analysis of means produces a significance test. The control chart is not a significance test in strict statistical parlance. This has been discussed in some detail by Nelson (1999) and Woodall (2000).

A second, much less important, criticism concerns the statement about halfway down page 3; namely, that common-cause variation is due only to random chance. That this is not so is discussed by Tukey and quoted in Nelson (1999). The point is that what is left after removing the special causes is not random but consists of causes of variation that are too small to be of economic interest. Despite these criticisms, however, I would be happy to recommend this book to any beginner with an urge to learn the wonders of control charting.

Nelson, L. S. (1999). "Notes on the Shewhart Control Chart". *Journal of Quality Technology* 31, pp. 124–126.

Woodall, W. H. (2000). "Controversies and Contradictions in Statistical Process Control". *Journal of Quality Technology* 32, pp. 344–378.

*Reviewer: Lloyd S. Nelson*, Statistical Consultant, Londonderry, NH 03053-3647.

**Statistical Process Control, The Deming Paradigm and Beyond** 2nd edition by *James R. Thompson and Jacek Koronacki*. Chapman & Hall/CRC, Boca Raton, FL. 2002. xxii + 431 pp. $89.95.

Using Deming's ideas, the authors concentrate on illustrating and explaining his philosophy by means of numerical examples taken from real case histories. There are long sections that deal with various aspects of the history of quality that should be of great interest to those in management. Contrariwise, there are sections that are tutorials involving some quite advanced mathematical ideas (for example, "Appendix A: A Brief Introduction to Linear Algebra" and "Appendix B: A Brief Introduction to Stochastics"). I doubt if high-level managers would have any interest in these subjects.

For the engineer deeply involved in statistical process control, this book should be of great interest and considerable help. Many examples of control charts are used and discussed. The authors appear to have a favorite expression for designating an out-of-control point; they repeatedly refer to it as a "Poisson glitch."

It was disappointing to see Evolutionary Operations (EVOP) devalued! The authors state that "Production staffs are entitled to expect that most production time will be spent in an 'in control' situation. To expect production personnel to function almost continuously 'on the edge' is not particularly reasonable or desirable." The implication that EVOP causes a process to go out-of-control is misleading. This excellent procedure does not deserve such criticism. It is true that it is not used as much as it should be. But I believe that this is caused by a lack of understanding rather than a desire not to do something that would seem to reduce productivity. On the other hand, the Nelder-Mead Simplex algorithm is exemplified very nicely and with considerable attention to detail.

Various probability distributions, both continuous and discrete, are discussed in some detail. Other subjects treated are: bootstrapping, laws of large numbers, moment-generating functions, central limit theorem, conditional density functions, random vectors, quadratic forms of normal vectors, Poisson process, and Bayesian statistics. Tables of the common statistical distributions are provided. Following the Indian philosophy of never producing a perfect piece of work (to avoid angering the gods), the authors have misspelled Student's name; it is Gosset, not Gossett.

**CRC Press**, 2000 Corporate Blvd. NW, Boca Raton, FL 33431-9868; (800) 272-7737; http://www.crcpress.com.

**Duxbury Press/ITP**, 511 Forest Lodge Road, Pacific Grove, CA 93950-5098; (800) 423-0563; http://www.duxbury.com.

**John Wiley & Sons**, 605 Third Ave., New York, NY 10128; (800) 352-3566; http://www.wiley.com.

**Society for Industrial and Applied Mathematics**, 3600 University City Science Center, Philadelphia, PA 19104-2688; (215) 382-9800; Fax (215) 386-7999; http://www.siam.org.