 ## 2020

STATISTICS SPOTLIGHT

# But the Penguin Isn’t in a Vacuum

## Using simplified models to unravel complexities and make more-informed decisions

by Matthew Barsalou

I once took a summer semester physics class. The course covered the same material as a regular semester, but met for half a day five times a week to finish in less time than a regular class.

The class was intensive, and the instructor seemed to spend more time discussing hockey than he spent teaching physics. I needed help, so I would meet with two math students who were happy to tutor me. I am not sure how it started, but the examples they always used involved a penguin at the equator in a vacuum.

Now I know penguins don’t live at the equator and they don’t last long in a vacuum. But the students used those assumptions to simplify the problems for training purposes. While I was learning the basics, it wasn't the right time for the additional complexity of air resistance or of gravity that varies with location. Such real-world complications could be added after I had mastered basic penguin ballistics.

A typical example from my physics textbook is: "A football is kicked at an angle θ₀ = 37.0° with a velocity of 20.0 m/s … Calculate (a) the maximum height, (b) the time of travel before the football hits the ground, (c) how far away it hits the ground, (d) the velocity vector at the maximum height and (e) the acceleration vector at maximum height. Assume the ball leaves the foot at ground level."1

Kicking a penguin sounds excessively cruel, so my tutors would begin with, "Suppose we have a penguin at the equator in a vacuum and we impart a velocity of 20.0 m/s to the penguin. To determine the maximum height, we must …"

With the basic understanding I gained based on penguin-related examples, I could later easily convert a textbook problem to my more-comfortable world of penguins and perform the calculations. The same principles apply when going from penguins or textbooks to real-world situations. I just need to start factoring in things such as air resistance.
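As a minimal sketch, the textbook football problem (or its penguin equivalent) can be worked through with the standard constant-acceleration equations. The value g = 9.80 m/s² and the function name `projectile_summary` are assumptions made for illustration. Air resistance is ignored, just as in the vacuum examples.

```python
import math

G = 9.80  # assumed gravitational acceleration in m/s^2 (not given in the problem)


def projectile_summary(speed, angle_deg, g=G):
    """Idealized projectile launched from ground level with no air resistance."""
    angle = math.radians(angle_deg)
    vx = speed * math.cos(angle)   # horizontal velocity, constant during flight
    vy0 = speed * math.sin(angle)  # initial vertical velocity
    max_height = vy0 ** 2 / (2 * g)   # (a) height where vertical velocity is zero
    flight_time = 2 * vy0 / g         # (b) time to return to ground level
    distance = vx * flight_time       # (c) horizontal distance traveled
    return {
        "max_height_m": max_height,
        "flight_time_s": flight_time,
        "range_m": distance,
        "velocity_at_peak_m_s": (vx, 0.0),   # (d) purely horizontal at the peak
        "acceleration_m_s2": (0.0, -g),      # (e) gravity only, at every point
    }


result = projectile_summary(20.0, 37.0)
for name, value in result.items():
    print(name, value)
```

Adding air resistance or a location-dependent g would mean replacing these closed-form equations with a more detailed model, which is exactly the kind of complexity that can wait until the basics are understood.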

### Simple models for complex problems

Using a simplified model can be advantageous when communicating a new and potentially nonintuitive concept, learning new concepts or attempting to understand a complex real-world phenomenon.

Dianna Cowern asks, "What would happen if a rock is dropped from a boat?"2 Would the water level go up or down? Take a moment and think about that before moving to the next paragraph.

Naturally, the water level would go down because the rock displaces its weight in water while it is in the boat, but it displaces only its own volume of water once it is in the water. That simple answer may not be so intuitively grasped by everybody, but it can be easily explained using a simple thought model.

Suppose somebody is in a boat with a 1,000 kg marble that only has a diameter of 2.4 cm. The boat will sit lower in the water due to the 1,000 kg load. After the marble is dropped overboard, the boat will rise because its total load is now 1,000 kg lighter. The marble will sink and displace only a volume of water equal to its actual size, so the water level goes down compared to when the marble was in the boat.

The example with the unrealistic marble is an easier concept to grasp than the more realistic rock example. But anybody who can understand the simplified model should be able to understand the same principles when applied to an actual situation.
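The marble example can also be put into numbers. The sketch below assumes fresh water with a density of 1,000 kg/m³ and a marble that sinks to the bottom. The function names are chosen only for illustration.

```python
import math

WATER_DENSITY = 1000.0  # kg/m^3, assumed value for fresh water


def displaced_volume_floating(mass_kg, fluid_density=WATER_DENSITY):
    """Extra water volume a floating boat displaces to carry a load of mass_kg."""
    return mass_kg / fluid_density


def sphere_volume(diameter_m):
    """Volume of a sphere, which is what a sunken marble displaces."""
    return (4.0 / 3.0) * math.pi * (diameter_m / 2.0) ** 3


in_boat = displaced_volume_floating(1000.0)  # marble carried by the boat
in_water = sphere_volume(0.024)              # marble resting on the bottom

print("Displaced with marble in boat (m^3):", in_boat)
print("Displaced with marble in water (m^3):", in_water)
```

In the boat, the marble accounts for about one cubic meter of displaced water. On the bottom, it displaces only a few cubic centimeters, so the water level must drop.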

Models have a place in statistics, whether as a thought experiment (which is itself a model) for explaining a concept or as a mathematical model of the world.

### Models and explanations

A typical probability example from a statistics textbook is the tossing of a coin, such as "the probability of getting either a head or a tail on a toss is one, and there is a 50% chance of a head and a 50% chance of a tail."3

That book and many others like it are intended for teaching statistics to future engineers, yet they use examples related to coins, cards and dice. This does not imply that a knowledge of gambling or games is necessary for applying statistical concepts. These examples are simplified models intended for teaching the basics of a concept, much like the penguin-related examples used for teaching physics.

Often, statistics textbooks have some variation of "the supervisor tells you a sample of 50 widgets was measured and found to have a mean length of 22.4 mm, and the process is known to have a standard deviation of 0.4 mm. The process mean is known to be 22.5 mm. Use a Z-test to determine if we have evidence the mean has changed using an alpha of 0.05."

The problem is solvable with the given information. In real life, however, I have yet to have a supervisor ever tell me the process standard deviation, the correct test to use or the desired alpha.

In the real world, we must generally determine the process standard deviation on our own or select a statistical test that does not require us to know the process standard deviation. Although the example is not realistic, it still serves as a method for teaching basic concepts. After those are understood, the model can be changed to become more complex and realistic.
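The textbook widget problem above can be worked as a two-sided, one-sample Z-test, because the question is whether the mean "has changed" in either direction. This is a sketch using only the Python standard library. The function name `one_sample_z_test` is an assumption for illustration.

```python
import math
from statistics import NormalDist


def one_sample_z_test(sample_mean, hypothesized_mean, sigma, n):
    """Two-sided Z-test for a mean when the process standard deviation is known."""
    standard_error = sigma / math.sqrt(n)
    z = (sample_mean - hypothesized_mean) / standard_error
    # Two-sided p-value: probability of a result at least this extreme
    # in either tail of the standard normal distribution.
    p_value = 2 * NormalDist().cdf(-abs(z))
    return z, p_value


z, p_value = one_sample_z_test(sample_mean=22.4, hypothesized_mean=22.5,
                               sigma=0.4, n=50)
alpha = 0.05
reject = p_value < alpha

print("z =", round(z, 3))
print("p =", round(p_value, 4))
print("Reject the null hypothesis:", reject)
```

With the numbers from the example, z is about -1.77 and the two-sided p-value is about 0.077, so there is not enough evidence at an alpha of 0.05 to conclude the mean has changed. When the process standard deviation must be estimated from the sample, a t-test would be the usual replacement.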

Simple models are useful not only for textbooks and teaching statistics. A statistician planning a design of experiments (DoE), for example, may need a simple model to explain what a DoE is and how it works when collaborating with subject matter experts (SMEs) who have no knowledge of DoEs.

The flight time of a paper helicopter can be used to illustrate the response variable, and the various dimensions of the helicopter can be used for communicating the concepts of factors and levels. It would be easier to plan the DoE after the SMEs understand what a DoE is based on the simple model.
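To make factors and levels concrete, the following sketch lists every run of a two-level full factorial design for a paper helicopter. The three factor names and their level values are hypothetical and chosen only for illustration. The response, flight time, would be recorded for each run.

```python
from itertools import product

# Hypothetical factors and levels for a paper helicopter (illustrative only).
factors = {
    "wing_length_mm": (60, 90),
    "body_width_mm": (20, 30),
    "paper_clip": ("no", "yes"),
}


def full_factorial(factors):
    """Return one dict per run, covering every combination of factor levels."""
    names = list(factors)
    return [dict(zip(names, combo)) for combo in product(*factors.values())]


runs = full_factorial(factors)
for number, run in enumerate(runs, start=1):
    print(number, run)
```

Three factors at two levels each give 2³ = 8 runs. In practice, the run order would normally be randomized before the helicopters are dropped.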

### Models and the world

Statistics are often used to create a mathematical model of the real world for the purpose of making a decision.

An insurance company, for example, may perform a statistical analysis to determine whether age, gender, type of vehicle and location have an influence on the rate of automobile accidents. An engineer may perform a DoE to identify the factors and their levels to improve the output of a manufacturing process.

The DoE results in a mathematical model that represents the actual process. The model may not match the process exactly due to variation under production conditions. If the DoE was properly carried out, however, the model can be used to identify the settings required for an increased output.
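One simple form such a model can take is a set of main effects estimated from a two-level design. In the sketch below, the responses are synthetic: they are generated from a made-up linear function, not measured, so that the estimated effects can be checked against known values. The function name `main_effects` is likewise an assumption for illustration.

```python
from itertools import product


def main_effects(design, responses):
    """Average response at the high level minus the average at the low level."""
    n_factors = len(design[0])
    effects = []
    for j in range(n_factors):
        high = [y for run, y in zip(design, responses) if run[j] == +1]
        low = [y for run, y in zip(design, responses) if run[j] == -1]
        effects.append(sum(high) / len(high) - sum(low) / len(low))
    return effects


# Coded two-level design for two factors: -1 = low level, +1 = high level.
design = list(product((-1, +1), repeat=2))

# Synthetic responses from a made-up function, NOT measured data.
responses = [10.0 + 2.0 * a - 0.5 * b for a, b in design]

effects = main_effects(design, responses)
print("Main effects (factor A, factor B):", effects)
```

Here the first factor raises the output when set high and the second lowers it, so the model points to the high level of the first factor and the low level of the second. Real process data would add noise, which is why replication and a proper analysis are needed before trusting such settings.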

### Knowing the limitations

Simplified models of a complex world can be useful for explaining nonintuitive concepts in an easier-to-comprehend manner and gaining an understanding of actual phenomena to make more-informed decisions.

When using models, we would do well to remember George E.P. Box and Norman R. Draper’s warning: "Essentially, all models are wrong." Although all models may be wrong, hope is not lost because Box and Draper also reassure us: "Some are useful."4

The trick is to know when the model is useful as well as the limitations of the model.

### References

1. Douglas C. Giancoli, Physics: Principles With Applications, fifth edition, Prentice Hall, 1998.
2. Phil Plait, "A Sunday Morning Brain Teaser for You," Slate, June 26, 2016.
3. John Lawson and John Erjavec, Modern Statistics for Engineering and Quality Improvement, Wadsworth Group, 2001.
4. George E.P. Box and Norman R. Draper, Empirical Model-Building and Response Surfaces, John Wiley & Sons Inc., 1987.

Matthew Barsalou is a statistical problem resolution Master Black Belt (MBB) at BorgWarner Turbo Systems Engineering GmbH in Kirchheimbolanden, Germany. He has a master’s degree in business administration and engineering from Wilhelm Büchner Hochschule in Darmstadt, Germany, and a master’s degree in liberal studies from Fort Hays State University in Hays, KS. Barsalou is an ASQ senior member and holds several certifications.
