Sample Size and Confounders

Last updated: Sat, Aug 17, 2024

Technical problems can restrict the reliability of results. Suppose you want to test your idea about exaggeraters (Other Theories of Chronicity) by seeing how well they respond to your program of therapy. Suppose also that you have developed an operational definition of "exaggeraters" that you're satisfied with. A common method for evaluating treatment outcomes is to devise or adopt a questionnaire. That's objective, right? The hypothesis you develop to test your theory might be that "After our therapy program, the exaggeraters will have lower scores on the [fictitious] Overall Well-being Questionnaire (OWQ) than do other patients."

To test this, you need both exaggeraters and normals. (Or you need to know how much exaggerating each of your subjects does.) In the language of experiment, the normals in this hypothetical experiment are the "experimental controls." They are included in the experiment so that you have something to compare your exaggeraters to. Your operational definition of exaggeraters will allow you to select exaggeraters, or to measure how much your subjects exaggerate. You should expect some variability in the OWQ scores of individuals in both groups. Unless the difference between exaggeraters and normals is very large, you'll find that OWQ scores of the exaggeraters and the normals will overlap. If you don't have enough subjects, there is a chance that your experiment will give a misleading result. You'll need enough subjects to be sure that your outcome isn't the result of picking an unusual group of exaggeraters or an unusual group of normals.
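A small simulation can make the overlap problem concrete. The sketch below assumes, purely for illustration, that OWQ scores are normally distributed with a standard deviation of 20, that normals average 60, and that exaggeraters average 50. None of these numbers comes from real data, and the function name is made up. It counts how often an experiment of a given size points the wrong way, with the exaggeraters' average coming out at or above the normals'.

```python
import random
import statistics

def wrong_direction_rate(n_per_group, trials=2000, seed=0,
                         mean_normal=60.0, mean_exagg=50.0, sd=20.0):
    """Fraction of simulated experiments in which the exaggeraters' sample
    mean is at or above the normals', although their true mean is lower."""
    rng = random.Random(seed)
    wrong = 0
    for _ in range(trials):
        # Draw one hypothetical experiment: n subjects in each group.
        normals = [rng.gauss(mean_normal, sd) for _ in range(n_per_group)]
        exaggs = [rng.gauss(mean_exagg, sd) for _ in range(n_per_group)]
        if statistics.fmean(exaggs) >= statistics.fmean(normals):
            wrong += 1
    return wrong / trials

for n in (5, 20, 100):
    print(f"{n:>3} subjects per group: "
          f"{wrong_direction_rate(n):.1%} of experiments point the wrong way")
```

Under these assumptions, the misleading results become rarer as the groups get larger, which is the point of the paragraph above.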

This is a problem that the statistician on your team can help with, and in many cases it can be solved by increasing the number of subjects in the experiment. If the difference you observe between exaggeraters and normals is larger than chance variation in who happened to be selected would plausibly produce, the result is said to be statistically significant. Experiments are commonly designed to be "statistically significant at the 95% level." This means that, if there were really no difference between exaggeraters and normals, a difference as large as the one you observed would turn up by chance in no more than 5% of experiments. Put another way, when there is no real difference, about 5% of experiments run at this level will wrongly conclude that there is one.
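A permutation test is one simple way to see what a significance calculation does: it asks how often chance alone, with the group labels shuffled, produces a difference at least as large as the one observed. The sketch below is an illustration, not necessarily the method your statistician would choose. The scores are made up, and the function name is hypothetical.

```python
import random
import statistics

def permutation_p_value(group_a, group_b, shuffles=10000, seed=0):
    """Two-sided permutation test for a difference in group means."""
    rng = random.Random(seed)
    observed = abs(statistics.fmean(group_a) - statistics.fmean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_large = 0
    for _ in range(shuffles):
        # Reassign the group labels at random and recompute the difference.
        rng.shuffle(pooled)
        diff = abs(statistics.fmean(pooled[:n_a])
                   - statistics.fmean(pooled[n_a:]))
        if diff >= observed:
            at_least_as_large += 1
    # Add 1 to both counts so the estimate is never exactly zero.
    return (at_least_as_large + 1) / (shuffles + 1)

# Made-up OWQ scores, for illustration only.
exaggeraters = [41, 55, 38, 62, 47, 50, 44, 58]
normals = [63, 59, 71, 52, 68, 66, 57, 74]

p = permutation_p_value(exaggeraters, normals)
print(f"p = {p:.4f}; significant at the 95% level: {p < 0.05}")
```

A small p means that shuffled labels rarely reproduce a gap this large, so chance is an unlikely explanation for it.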

You should also be interested in the practical significance of the difference between the two groups. In medical treatment research, the term used is clinical significance. Suppose that a 20-point difference in OWQ scores is considered clinically important, that is, a doctor or a patient would notice the difference and think it important. In this case you might design your experiment to test whether you can be confident that the difference between exaggeraters' OWQ scores and normals' OWQ scores is at least 20 points. It would require more subjects to establish a difference of 20 points than just to establish that the two groups had different scores. But you don't have to worry about the details because your statistician will.
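To get a feel for how much the 20-point requirement costs, here is a rough sketch using a standard normal-approximation formula for comparing two group means. It assumes equal group sizes, a common standard deviation, a one-sided test at the 2.5% level, and 80% power. The real difference of 30 points and the standard deviation of 30 are invented for the example, and the function name is hypothetical. Your statistician may well use a different method.

```python
import math
from statistics import NormalDist

def subjects_per_group(true_diff, margin, sd, alpha=0.025, power=0.80):
    """Approximate subjects needed per group to show that the difference in
    mean scores exceeds `margin`, when the real difference is `true_diff`.

    Normal approximation, one-sided test, equal group sizes, common
    standard deviation `sd`. Every input here is an assumption.
    """
    if true_diff <= margin:
        raise ValueError("true_diff must exceed margin")
    z_alpha = NormalDist().inv_cdf(1 - alpha)
    z_power = NormalDist().inv_cdf(power)
    n = 2 * (sd * (z_alpha + z_power) / (true_diff - margin)) ** 2
    return math.ceil(n)

# Hypothetical numbers: real difference 30 OWQ points, standard deviation 30.
any_difference = subjects_per_group(true_diff=30, margin=0, sd=30)
at_least_20 = subjects_per_group(true_diff=30, margin=20, sd=30)
print(f"to show any difference: {any_difference} per group")
print(f"to show at least 20 points: {at_least_20} per group")
```

The required number grows with the inverse square of the gap between the real difference and the margin you want to establish, which is why the clinically significant version of the experiment is the more expensive one.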

People differ in other ways than in being exaggeraters or non-exaggeraters, and some of the ways in which they vary may affect how they respond to your therapy program. Since you want to compare the response of exaggeraters in general to the response of normals in general, you must take care that the two groups of subjects in your experiment don't differ systematically in these other ways. It wouldn't do, for example, to pick exaggeraters who are all in their 20s and otherwise healthy, and to pick normals who are all in their 80s and are all afflicted with severe arthritis. If you did, you would be testing youthful exaggeraters against old and physically burdened normals. You would have no way to know whether any difference was because of being exaggeraters or because of being young and healthy.

These extra factors are called confounders. There are several strategies to protect against them. One way is to select enough subjects at random from a large representative population. Unfortunately, this is practically difficult to do. Another strategy is to match subjects in the two groups. If half of your exaggeraters are in their 80s with arthritis, make sure that half of your normals fit the same description. Unfortunately, this strategy requires the experimenters to make judgments about which characteristics are important enough that they need to be matched, which opens another door to bias. (See also Selection Bias.)
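The effect of a confounder, and of matching, can be seen in a toy simulation. In the sketch below the OWQ score depends only on age plus random noise; being an exaggerater has no effect at all. The scoring rule, the age ranges, and the function names are all invented for the illustration.

```python
import random
import statistics

def owq_score(age, rng):
    """Hypothetical outcome: depends on age only, not on exaggerating."""
    return 80.0 - 0.5 * age + rng.gauss(0.0, 5.0)

def mean_difference(ages_exaggeraters, ages_normals, seed=0):
    """Exaggeraters' mean score minus normals' mean score."""
    rng = random.Random(seed)
    exagg = [owq_score(age, rng) for age in ages_exaggeraters]
    normal = [owq_score(age, rng) for age in ages_normals]
    return statistics.fmean(exagg) - statistics.fmean(normal)

age_rng = random.Random(1)
young = [age_rng.randint(20, 29) for _ in range(200)]
old = [age_rng.randint(80, 89) for _ in range(200)]

# Unmatched: every exaggerater is young and every normal is old.
unmatched = mean_difference(young, old)

# Matched: each group is half young and half old.
matched = mean_difference(young[:100] + old[:100], young[100:] + old[100:])

print(f"unmatched groups: difference of {unmatched:+.1f} points")
print(f"matched groups:   difference of {matched:+.1f} points")
```

The unmatched comparison shows a large difference that has nothing to do with exaggerating, while the matched comparison shows a difference close to zero. Matching works here only because age is the one thing that matters and the simulation knows it, which is exactly the judgment a real experimenter has to make.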

Problems can also come from the people who conduct the experiment. Consider what might happen if the people who provide the therapy program in your experiment have their own ideas about the patients they treat. (Unlikely but not impossible.) Perhaps they don't like exaggeraters and don't pay much attention to them during therapy, or perhaps they're concerned for them and give them extra help. Such effects would be another confounder in the results of the experiment. Your hypothesis might appear to be confirmed (exaggeraters after all fared less well than normals), not because of something in the exaggeraters, but because of something in the therapists. This type of error is difficult to avoid or to diagnose.

For reasons like this, it is often desirable to “blind” the participants in experiments. Blinding is the technique of concealing from those involved in the experiment which treatment a subject is receiving. The classic example is the trial of a new drug, in which the patients, the medical personnel who administer the drugs, and the physicians treating the patients are all kept from knowing whether a patient is receiving the new drug or something else. In some kinds of research, such as this hypothetical research on exaggeraters, effective blinding may be impractical.

It is often theoretically possible to increase the sample size enough to reliably detect a difference in outcomes, however small that difference may be. It is not usually possible, though, in experiments on human subjects to prove that there are no important differences between the treatment group and the controls.