One of my graduate statistics professors would tell a story about his wife calling him at work and asking him to pick up a new bathroom scale on his way home. Because he was a scientist, this was not simply a matter of dashing into a store and purchasing the first one he found. Instead, he decided to go to Service Merchandise, a once-thriving chain that no longer has any brick-and-mortar stores. Service Merchandise was unique in that only one of each product was on the shop floor, and that one was displayed out of its box, allowing you to examine it fully. If you wanted to purchase an item, you wrote its product number on a slip of paper and took that slip to a cash register to pay. After you paid, an unopened box containing the product would be pulled from the warehouse upstairs and slid down rollers to where you waited.
For my statistics professor, the benefit of this type of display was that he was able to try out each bathroom scale that the store offered. He first tested each scale’s reliability. He did this by stepping on a scale, checking his weight, stepping off the scale, and then repeating this process several times for each scale. He might have gotten a few sideways looks from fellow shoppers as he stepped on and off half a dozen or so bathroom scales multiple times, but when he finished he had identified three scales that gave him the same answer each time he weighed himself. These three scales were reliable: each one repeatedly gave him the same answer. The three reliable bathroom scales didn’t all agree with one another; one scale showed him a couple of pounds heavier, another a pound or so lighter. But each scale did show him weighing the same as it had every other time he had stepped on that particular scale.
Now that he had identified which scales were reliable, he was ready to tackle validity. He needed to determine which of the three reliable scales was the most accurate. Fortunately for my professor, the store also had one of those big scales with sliding weights, the type you see at the doctor’s office each time the nurse says those unavoidable words, “Step onto the scale, please.” Being a thorough scientist, he weighed himself on the doctor’s scale. He now knew his true weight. One of the three bathroom scales matched that true weight. He had found the bathroom scale that was valid: it accurately measured his weight. With satisfaction, he picked up the reliable and valid scale and took it to the checkout, where the cashier tried to write down the product number and order him an unopened (and untested) version of the scale he held. This wouldn’t do.
“No, I want to buy this scale,” he said, holding out the scale and shaking it slightly. “I tested the reliability and validity of this particular scale, not of whichever one you will give me.” Needless to say, he went home happy with a bathroom scale that was a little shopworn but that told him his real weight every morning.
Morals of the story:
- Reliability is the repeatability of a measurement. A reliable measure gives the same result each and every time it is used on the same thing.
- Validity exists when a test measures what it claims to be measuring.
- Reliability is a necessary prerequisite for validity. Validity cannot exist where reliability does not exist.
- Always send a scientist when choosing a new bathroom scale!
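For readers who like to see the idea in code, here is a minimal sketch of the professor's two checks. Everything in it is an assumption made for illustration: the readings, the reference weight, the tolerances, and the helper names `is_reliable` and `is_valid` are hypothetical and do not come from the story. Reliability is treated as a small spread across repeated readings, and validity as agreement with the reference weight once reliability is established.

```python
from statistics import mean, pstdev

# Hypothetical reading from the doctor's scale, in pounds (assumed value).
REFERENCE_WEIGHT = 180.0

# Hypothetical repeated weigh-ins on three bathroom scales (made-up data).
readings = {
    "scale_a": [180.0, 180.0, 180.0, 180.0],  # consistent and on target
    "scale_b": [182.0, 182.0, 182.0, 182.0],  # consistent but reads high
    "scale_c": [178.5, 181.0, 179.5, 182.5],  # inconsistent
}


def is_reliable(values, tolerance=0.25):
    """Reliable: repeated readings barely vary (spread within tolerance)."""
    return pstdev(values) <= tolerance


def is_valid(values, reference, tolerance=0.5):
    """Valid: reliable, and the average reading matches the reference."""
    return is_reliable(values) and abs(mean(values) - reference) <= tolerance


for name, values in readings.items():
    print(
        name,
        "reliable:", is_reliable(values),
        "valid:", is_valid(values, REFERENCE_WEIGHT),
    )
```

In this made-up data, `scale_c` averages close to the reference weight, yet the sketch still rejects it as invalid because its readings are not repeatable, which mirrors the third moral above.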
Want to know how reliability and validity can serve as critical analytical factors in your organization’s success? Contact us now using our Contact Form.