MEASURE FOR MEASURE
Understanding proficiency test results
by Christopher L. Grachanen
For many calibration and testing laboratories, routine proficiency testing is a requirement for obtaining and maintaining accreditation status. ISO/IEC Guide 43-1, Proficiency testing by interlaboratory comparisons, defines proficiency testing as a "means used in the determination of laboratory testing and measurement performance" [1].
Stated a little differently, proficiency testing is analogous to a surveillance activity with the purpose of assessing the quality and uniformity of tests and measurements performed by a laboratory. ISO/IEC 17025, Section 5.9, Assuring the quality of test and calibration results, references proficiency testing programs [2]. The definition of "assure" includes the words promise, guarantee, pledge, declare, and give surety and comfort [3]. Given these definitions, proficiency testing is basically an assessment activity used to demonstrate and determine a level of work competence as derived from measuring a test artifact and evaluating the results.
The demonstrated work competence level is assumed to be representative of the relevant work performed by a laboratory. As such, it is deemed a predictor of the work a laboratory may be expected to perform. Consequently, when a laboratory satisfactorily completes a proficiency test, it is assumed that the laboratory will continue to make satisfactory measurements within the scope of that test. Note that interlaboratory comparison programs and measurement assurance programs do not necessarily demonstrate proficiency and are usually not accepted as meeting the proficiency requirements for accreditation unless specifically designed to meet these requirements. See ISO/IEC Guide 43-1:1997 for a better understanding of proficiency test requirements.
Performance is determined by evaluating measurements made on test artifacts. Test artifacts are normally devices that have established performance attributes, are relatively stable over time and exhibit characteristics similar to those of the units tested or calibrated by a laboratory. An example of a test artifact often used in pressure proficiency testing is a pressure transducer, a device that measures pressure and produces an electrical output corresponding to the pressure measured.
For many proficiency tests involving testing laboratories, test material commonly referred to as reference material, standard reference material or certified reference material is the norm. This test material is sufficiently stable and homogeneous regarding one or more of its properties to enable it to be used in assessing measurement processes.
The selection of a test artifact or test material for a particular measurement parameter takes into account a laboratory’s measurement capability in terms of measurement range, measurement accuracy, physical measurement constraints and safety considerations.
Characteristic values for proficiency test artifacts or test material are normally assigned by a reputable, independent laboratory—a laboratory normally not participating in the proficiency test—to avoid the appearance of any bias toward a particular laboratory participating in a proficiency test.
A typical proficiency test scheme has the independent laboratory first establish characteristic values for a test artifact or test material. The test artifact or test material is then sent to participating laboratories to measure. Note: Participating laboratories do not know the values assigned by the independent laboratory.
Accompanying the test artifact or test material are instructions giving guidance on general testing setups and the format of testing results. After completion of the proficiency test, the independent laboratory will often measure the test artifact or test material again to confirm that nothing unexpected happened to the artifact or material during the proficiency test.
After proficiency test results have been submitted by all participating laboratories, an evaluation is performed to determine each laboratory’s performance. This evaluation is normally done by an independent administrator to again avoid the appearance of any bias toward a particular laboratory participating in a proficiency test.
Performance results are provided to each participating laboratory along with the anonymized proficiency test results of the other participating laboratories, so that each laboratory can gauge its performance against other laboratories having similar measurement capabilities.
The three most widely used proficiency test performance statistics are percentage difference, z-scores and En numbers.
The percentage difference performance statistic is simply the difference between a participating laboratory’s test data and the test artifact’s assigned value, divided by the test artifact’s assigned value, multiplied by 100:
Percent difference: [(x - X) / X] * 100
in which x is the participating laboratory’s test data and X is the test artifact’s assigned value.
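As an illustration, the percent difference formula can be sketched in a few lines of Python. The function name and the pressure readings below are hypothetical and chosen only to show the arithmetic; they do not come from any actual proficiency test.

```python
def percent_difference(x: float, assigned: float) -> float:
    """Return [(x - X) / X] * 100 for a lab result x and assigned value X."""
    if assigned == 0:
        # The statistic is undefined when the assigned value is zero.
        raise ValueError("assigned value must be nonzero")
    return (x - assigned) / assigned * 100.0


# Hypothetical example: a lab reports 100.2 against an assigned value of 100.0.
result = percent_difference(100.2, 100.0)
print(f"{result:.3f} %")  # prints "0.200 %"
```

A positive result means the laboratory read higher than the assigned value; a negative result means it read lower.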
The z-score performance statistic divides the difference between a participant’s test data and the test artifact’s assigned value by a measure of variability, such as the standard deviation:
z-score: (x - X) / s
in which s is the measure of variability. Note: The measure of variability used in z-score computations should be based on enough observations to reduce the influence of extreme test results.
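The z-score computation can be sketched the same way. The function name and the numbers are hypothetical, and s is simply passed in; how s is obtained depends on the proficiency test scheme and is not prescribed here.

```python
def z_score(x: float, assigned: float, s: float) -> float:
    """Return (x - X) / s for a lab result x, assigned value X and variability s."""
    if s <= 0:
        # A measure of variability such as a standard deviation must be positive.
        raise ValueError("s must be positive")
    return (x - assigned) / s


# Hypothetical example: lab result 10.5, assigned value 10.0, s = 0.25.
print(z_score(10.5, 10.0, 0.25))  # prints 2.0
```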
The En number performance statistic is derived by dividing the difference between a participating laboratory’s test data and the test artifact’s assigned value by the square root of the sum of the squares (RSS) of the participating laboratory test data uncertainty and the independent laboratory’s test artifact’s assigned value uncertainty:
En: (x - X) / √(U_lab² + U_ref²)
in which U_lab is the uncertainty of the participating laboratory’s test data and U_ref is the uncertainty of the assigned value established by the independent laboratory.
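A minimal Python sketch of the En calculation follows, writing the two uncertainties as u_lab (participating laboratory) and u_ref (independent laboratory's assigned value). The function name and the example values are hypothetical, and the sketch assumes both uncertainties are stated in the same units and on the same basis.

```python
import math


def en_number(x: float, assigned: float, u_lab: float, u_ref: float) -> float:
    """Return (x - X) / sqrt(u_lab**2 + u_ref**2)."""
    # Root-sum-square combination of the two uncertainties.
    combined = math.sqrt(u_lab ** 2 + u_ref ** 2)
    if combined == 0:
        raise ValueError("at least one uncertainty must be nonzero")
    return (x - assigned) / combined


# Hypothetical example: lab result 1002.0, assigned value 1000.0,
# lab uncertainty 3.0, reference uncertainty 4.0 (combined value 5.0).
print(round(en_number(1002.0, 1000.0, 3.0, 4.0), 3))  # prints 0.4
```

Unlike the z-score, this statistic weighs the difference against the uncertainties the laboratories themselves claim, so a laboratory that understates its uncertainty produces a larger En for the same difference.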
So how do you know if proficiency test performance results are satisfactory or not? The following are industry-accepted values for evaluating z-scores and En numbers.
|En| ≤ 1 = satisfactory performance.
|En| > 1 = unsatisfactory performance.
|z| ≤ 2 = satisfactory performance.
2 < |z| < 3 = questionable performance.
|z| ≥ 3 = unsatisfactory performance.
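These acceptance limits can be expressed as two small helper functions. This is only a sketch: the function names are hypothetical, the questionable band is read as 2 < |z| < 3, and En is evaluated by magnitude so that a negative En is treated the same as a positive one.

```python
def classify_z(z: float) -> str:
    """Classify a z-score against the limits |z| <= 2 and |z| >= 3."""
    magnitude = abs(z)
    if magnitude <= 2:
        return "satisfactory"
    if magnitude < 3:
        return "questionable"
    return "unsatisfactory"


def classify_en(en: float) -> str:
    """Classify an En number against the limit |En| <= 1."""
    return "satisfactory" if abs(en) <= 1 else "unsatisfactory"


# Hypothetical values for illustration only.
print(classify_z(-2.5))   # prints "questionable"
print(classify_en(0.4))   # prints "satisfactory"
```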
Additionally, a consensus of the participating laboratories’ measured values may be calculated, typically as a central percentage of the results at the 80, 90 or 95% level, and used to determine satisfactory performance.
Understanding proficiency test performance results is essential for determining whether a laboratory is competent to perform a particular measurement or test. When evaluating a measurement or testing laboratory for a service, it is prudent to ask whether the laboratory participates in periodic proficiency testing and, if it does, whether its performance results were deemed satisfactory.
1. International Organization for Standardization and International Electrotechnical Commission, ISO/IEC Guide 43-1:1997, Proficiency testing by interlaboratory comparisons.
2. International Organization for Standardization and International Electrotechnical Commission, ISO/IEC 17025:2005, Section 5.9, Assuring the quality of test and calibration results.
3. Dictionary.com, "Assure," http://dictionary.reference.com/browse/assure.
Christopher L. Grachanen is a master engineer and operations manager at Hewlett-Packard Co. in Houston. He earned an MBA from Regis University in Denver. Grachanen is a co-author of The Metrology Handbook (ASQ Quality Press), an ASQ fellow, an ASQ-certified calibration technician and the treasurer of the Measurement Quality Division.