The story is told that Voltaire was once taken on a tour of a spring in France known for its miraculous healing waters. When he was shown a wall full of crutches left by cripples who had been cured, he nodded and said, "Yes, but where are the crutches of those who were not cured?" Although it may not be immediately obvious, in asking this question Voltaire anticipated several of the key issues in modern clinical trials research by almost two hundred years.
Voltaire was actually asking how representative of the general population (of cripples) the sample of discarded crutches was. Did every cripple who came to the spa leave his crutch? Did one in ten leave his crutch, or one in ten thousand? If one in ten thousand was cured, why were the other 9,999 not cured? Was there anything unusual about the cripples who were "cured" as compared to those who were not? Perhaps they represented only a particular type of cripple. Perhaps only a particular type of cripple came to the spa in the first place. Perhaps the abandoned crutches represented a normal remission rate for the kind of cripples who came to the spa. Basically, Voltaire was saying that in order to determine if anything unusual were going on at the spa, we would have to know a good deal more about what kind of people were apparently cured as well as what kind of people were not cured. And once we knew what kind of patients we had in our sample, we could compare them to similar patients (a matched sample) who didn't get the treatment to see if the waters really did anything at all or if we were looking at something which would happen anyway. And even then we could feel confident in projecting those findings not necessarily to all cripples but only to cripples who were like those in our sample.
Attempting to generalize from samples of patients with unknown qualities represents a major threat to the validity of informal clinical observations in general. The idea is that what happens with a particular patient or two may not be typical and should not necessarily be projected to all patients. In fact, the randomization process in randomized clinical trials is specifically designed to deal with this problem. Today we would speak of the necessity of obtaining an unbiased (projectable) sample of an identified pathological population, and of having a matched "control" group against which test group results can be compared.
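The reasoning above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the visitor count, the 5% spontaneous remission rate, and the function name `simulate_spa` are hypothetical values chosen for illustration, not figures from any actual study. Visitors are randomly assigned to a treated group and a control group, and the waters are assumed to have no effect at all.

```python
import random

def simulate_spa(n_visitors=10_000, remission_rate=0.05,
                 treatment_effect=0.0, seed=0):
    """Simulate a randomized comparison at a hypothetical spa.

    All parameters are illustrative assumptions:
    remission_rate   - chance a visitor recovers with no treatment
    treatment_effect - extra chance of recovery added by the waters
    """
    rng = random.Random(seed)

    # Randomization: shuffle the visitors and split them in half, so that
    # neither the visitors nor the observer chooses who gets the waters.
    visitors = list(range(n_visitors))
    rng.shuffle(visitors)
    half = n_visitors // 2
    treated, control = visitors[:half], visitors[half:]

    # Each visitor recovers (or not) by chance alone.
    cured_treated = sum(
        rng.random() < remission_rate + treatment_effect for _ in treated
    )
    cured_control = sum(rng.random() < remission_rate for _ in control)

    return {
        # What the tour guide shows: only the numerator, the crutches left behind.
        "crutches_on_wall": cured_treated,
        # What Voltaire asked for: cures relative to everyone who came.
        "treated_rate": cured_treated / len(treated),
        # What would have happened anyway, without the waters.
        "control_rate": cured_control / len(control),
    }

result = simulate_spa()
print(result)
```

Under these assumptions the wall still fills with crutches even though the waters do nothing, while the treated and control recovery rates come out nearly the same. Only the comparison between the two groups shows whether the treatment adds anything beyond the normal remission rate.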
The biased sample problem (closely related to what psychologists call the availability heuristic) is particularly insidious because it usually operates on an unconscious level: we are not aware that our sample is biased or that we ourselves may be doing the biasing. It is not just a question of fine-tuning what could be observed anyway: without specific controls against biased sampling, inferences drawn from informal clinical observations are very likely to be completely wrong, not even in the ballpark. Useless remedies and even destructive clinical procedures (e.g. the bleeding of patients) have persisted, sometimes for hundreds of years, because biasing errors in clinical judgement were made consistently and formal evaluation methods were either not yet available or not applied.
Although they are the real reason scientific method has developed over the past two hundred years, naturally occurring errors in the intuitive (informal) judgement process are only now beginning to be studied in detail and understood. Of particular interest is how the universal tendency toward such inferential errors could have developed in the first place. One possibility is that human responses are often reinforced for speed or confidence rather than for accuracy. A confident and quickly made response may, under many circumstances, be more adaptive than a technically more correct but slower or more timid one. And once such errors have been made and reinforced, human beings have invented many ways to keep making them. People are led to search out confirming evidence unconsciously, to interpret mixed evidence in ways that confirm their expectations, and to see meaning in chance phenomena. Human reasoning has apparently evolved as much to protect the ego of the observer as to generate accurate knowledge of the world.
The first study that the Consortium for Chiropractic Research conducted was to determine the characteristics of the population of patients we would be sampling in future studies, i.e. to evaluate the types of patients who were treated at the Consortium's college clinics. The second study compared the college clinic patients to the types of patients treated in the offices of nearby D.C.s in regular practice. We answered the following questions: What kinds of problems present in college clinics? Are they the same as in field doctors' offices? If we do studies using college clinic patients, can we extrapolate the findings to the "real" world? (The answers to these questions appear in three different articles in the Journal of Manipulative and Physiological Therapeutics.)
Studies like these are time-consuming and may seem trivial when viewed relative to the work that lies ahead: preparing the foundation of any structure is generally not romantic or exciting work. But we cannot, as some would say chiropractic has tried to do in the past, begin building at the penthouse and work our way down.