ONE GOOD IDEA
Humans Are Not Random
To ensure randomization in sampling, rely on Excel functions
by David Phillips
Humans are really poor at randomization. Ask people for a random number between one and 10 and you won’t get many ones or 10s.
So we’re intrinsically unprepared when a sampling plan tells us to "select 27 random parts," or a survey design requires that we use "random coding" to identify different choices. Fortunately, most of us have an excellent randomizing tool right at hand: Excel spreadsheets.
The basic random number function in Excel is =rand( ), which returns a random nine-place decimal value between zero and one, such as 0.134749358. This might not seem tremendously useful, but it is great for sorting a list into a random order.
If you wanted to audit five items from a list of 20, for example, you could list the items in column A, input =rand( ) in the corresponding rows of column B, and sort both columns by column B. After the list is sorted, the first five items in column A would be a randomly selected sample.
A close relative to =rand( ) is =randbetween(x,y). As you might have guessed, it returns a random result between x and y inclusive. Keep in mind that it will only return integers, even if you enter decimal values for x and y. This is useful for randomly coding samples. To generate random three-digit codes, for examples use =randbetween(100,999).
Taking random coding a step further, what if you prefer letters instead of numbers? All of the characters a computer recognizes are assigned a numerical value in accordance to the American Standard Code for Information Interchange (ASCII).
The letters A to Z have ASCII codes between 65 and 90, so you would start by assigning a random integer using
=randbetween(65,90). Conveniently, Excel has a built-in =char function to convert an ASCII code to the character it represents. To assign a single random letter, the formula would be =char(randbetween(65,90)).
A single-letter code is not useful, but you can easily
tack on extra characters using an ampersand (&). To get a two letter code,
for example, the function would
There are three main pitfalls when using Excel’s randomization functions:
- Duplicate codes. There is no automatic way to prevent duplicates. If you generate a list of two-letter codes, you could end up with two or more of the same code. One way to eliminate duplicates is to sort the list of codes in either ascending or descending order—duplicates will be grouped together and you can easily delete them.
- Changing results. The random functions are recalculated when the spreadsheet changes, so you will find that new, random results are constantly generated. This is especially annoying if you have already checked your results for duplicates. Once your random results are generated, copy and paste them into a new spreadsheet using the Paste Special > Values command.
- Inadvertent spellings. If you are using letter codes, be sure they are not spelling anything you don’t want. There are some obvious four-letter combinations to avoid. However, there are other, less obvious words as well. For example, avoid anything that implies one of the samples is a control, such as "CO" or "CTL." Also avoid any code that may identify the sample, such as by an organization’s name, a person’s initials or a project acronym.
With Excel’s randomization functions, you can ensure that your organization’s quality program follows best practices and avoids any inadvertently systematic human thinking.
David Phillips is the quality engineering manager at Dentsply in Milford, DE. Phillips has a bachelor’s degree in mechanical engineering from Grove City College in Pennsylvania. He is a senior member of ASQ, and an ASQ-certified quality engineer and Six Sigma Black Belt.