Response to Meta-Analysis Published by Assendelft, et al., in the Annals of Internal Medicine

Anthony Rosner, PhD, LLD [Hon.], LLC

The recent meta-analysis by Assendelft, et al., published in the May 13, 2003 Annals of Internal Medicine,¹ is a troubling example of how clinical data are interpreted and presented for public consumption and policy development. In addition to raising numerous issues as to the clinical applicability of meta-analyses, it may even belie its basic premises, as I will illustrate. In reviewing this document, one must remain vigilant as to how the rowdy qualities of human bias, subjectivity and disagreement extend well into the rarefied atmospheres of randomized clinical trials, meta-analyses and actual clinical guidelines.

The overall conclusion of the Assendelft, et al., report - that there is no evidence that spinal manipulation therapy is superior to standard treatments for patients with either acute or chronic low back pain - can be interpreted in the same breath to indicate that, in terms of the pain or disability outcomes scales evaluated, it is not inferior. Before analyzing the methodological issues of the report itself, it is entirely justified to ask whether the treatments are truly equivalent.

1. Comparative Side-Effects and Relative Safety

For spinal manipulation, the occurrence of major complications, regardless of the region of the spine manipulated, has generally been shown to be less than one per million.²⁻⁵ Even transient, minor side-effects have been estimated to occur at one per 120,000 cervical manipulations.⁶ These figures pale when compared to an extensive body of literature describing as many as 220,000 deaths and other complications in the United States attributable each year to medications in general,⁷⁻¹⁴ or the 10,000-20,000 fatalities and multiple organ systems adversely affected by NSAIDs.¹⁵⁻²³ Even what have been regarded as the relatively more benign COX-2 inhibitors²⁴⁻²⁷ and acetaminophen medications²⁸ have been described as generating serious GI, cardiovascular and hepatic problems at rates that are orders of magnitude greater than those of side-effects attributed to spinal manipulation. The overall picture comparing spinal manipulation to the commonly used treatment alternatives of either direct analgesic ingestion or visits to the general practitioner (80 percent resulting in analgesic use, by the authors' own citation¹,²⁹) should be one of relative clarity to the patient: In one instance, there is an option with a low rate of lasting side-effects; in the other, there is a treatment regimen with severe and sometimes fatal complications inexplicably deemed "acceptable."³⁰

2. Mix of Clinical Judgement With Data From the Literature

The authors strongly imply that this study is intended to be more rigorous than the systematic reviews and meta-analyses that preceded it. However, their admission to the effect that the comparison of spinal manipulative therapy with each different treatment alternative for each outcome for each back pain stratum "was not possible, because the data were sparse," raises one's suspicion that this particular review may not have been as "systematic" as first presumed. These fears are confirmed in the very next sentence, which informs the reader that the clinical judgment of effectiveness benchmarks from members of the Cochrane editorial board was used to fill in the gaps of experimental data, undermining the very process championed in this study.

Indeed, the noted epidemiologist David Sackett applauds the use of clinical expertise and experimental outcomes data to build a truly effective evidence base for optimum patient care,³¹ a sentiment echoed elsewhere.³² However, this undercuts the very process the authors suggest they are undertaking in pursuit of the most definitive experimental data available. In other words, how much adulteration of this "systematic review" has taken place?

3. Inadmissible Criterion of Quality

One of the criteria for methodologic quality of randomized clinical trials (RCTs) by the Cochrane Group - the blinding of the care provider (V3) - is impossible in the administration of manual therapy, particularly high-velocity spinal manipulation. Accordingly, its inclusion by the authors as a determinant of inclusion or rejection of RCTs is without justification. At least 11 studies have reported (erroneously) double-blinding in the chiropractic experimental literature; the nonfeasibility of blinding the practitioner in numerous modalities of alternative medicine has been extensively discussed elsewhere, and needs to be duly noted.³³

4. Guideline Rationale

As part of their rationale for embarking on this investigation, Assendelft, et al., bemoan the disparity of recommendations for spinal manipulative therapy from different countries, citing, in particular, the dissensions expressed in the guidelines from Australia, Israel and The Netherlands. What the authors do not disclose is the preponderance of support for spinal manipulation expressed in eight out of a total of 11 such guidelines, with perhaps an additional half guideline thrown in for The Netherlands (which found sufficient justification for treating acute, but not chronic, back pain by spinal manipulation,³⁴ an oddity, since one of the authors of that study, van Tulder, who is Dutch, decisively supported the chronic over the acute evidence in a recent systematic review³⁵).

Furthermore, in the comparison of guidelines cited by the authors, there was concordance among all 11 nations (United States; United Kingdom; The Netherlands; Israel; New Zealand; Finland; Australia; Switzerland; Germany; Denmark; and Sweden) in six aspects of health care:

diagnostic triage, history taking, and physical examination;
their conclusion that radiographs were not useful for managing nonspecific low back pain;
their recognition of the importance of psychosocial factors;
their discouragement of bed rest;
certain stipulations regarding the prescription of medications; and
their conclusion that the vast majority of low back pain cases should be managed in a primary care setting.

Other areas besides spinal manipulation in which differences arose were:

exercise therapy;
muscle relaxants; and
patient information.

The reason this census of nations regarding guideline and medical practices has been taken is to point out that the reported concordances and discordances do not appear to correlate with the amount, design and quality of randomized clinical trials or systematic literature reviews that have been published. Rather, there appear to be human and cultural values at work here that I maintain have not necessarily been eliminated in the study currently under discussion. This leads directly to our next point of critique.

5. Meta-Analyses Themselves Are Subject to Bias and Omissions

Regarding their clinical relevance, the very basis of meta-analyses, including the report of Assendelft, et al., has to be scrutinized closely. One report has gone so far as to compare meta-analyses to statistical alchemy, due to their intrinsic nature:

"... the removal and destruction of the scientific requirements that have been so carefully developed and established during the 19th and 20th centuries. In the mixtures formed for most statistical meta-analyses, we lose or eliminate the elemental scientific requirements for reproducibility and precision, for suitable extrapolation, and even sometimes for fair comparison."³⁶

Specifically, Feinstein raises the following deficiencies of meta-analyses, most having to do with the sloughing of important clinical information:

Disparate groups of patients of varying homogeneity across different studies are thrown into one analysis, often called a "mixed salad."
The weighting of studies of different quality may be inaccurate or absent altogether.
One needs to know about the real-world effects in the presentation and treatment of patients, in particular:

the severity of the illness;
co-morbidities;
pertinent co-therapies; and
clinically relevant and meaningful outcomes.

Inconsistent statistical techniques pertaining to increments, effect size, correlation coefficients, and relative risk and odds ratios.
Omission of the reference denominator.
The fact that the odds ratio inflates the true value of the relative risk under certain conditions.

In any event, the number of patients who must be treated to observe a true difference between treatment groups (the number needed to treat) should be reported, a practice often overlooked in meta-analyses.³⁶ To make matters worse, a recent report involving four medical areas (cardiovascular disease, infectious disease, pediatrics, and surgery) indicates that individual quality measures were not reliably associated with the strength of the treatment effect in 276 RCTs analyzed in 26 meta-analyses.³⁷
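To make two of Feinstein's statistical points concrete, the following Python sketch compares the odds ratio with the relative risk and computes the number needed to treat. The counts are hypothetical, chosen only for illustration, and are not taken from any trial cited here; the function name is likewise my own. The sketch assumes a nonzero event rate in the control group.

```python
def compare_effect_measures(events_treated, n_treated, events_control, n_control):
    """Summarize a two-arm trial with relative risk, odds ratio and NNT.

    Assumes the control group has at least one event and neither group
    has an event rate of exactly 100 percent.
    """
    risk_treated = events_treated / n_treated
    risk_control = events_control / n_control
    relative_risk = risk_treated / risk_control
    odds_ratio = (risk_treated / (1 - risk_treated)) / (risk_control / (1 - risk_control))
    absolute_risk_reduction = risk_control - risk_treated
    # The number needed to treat is the reciprocal of the absolute risk reduction.
    nnt = 1 / absolute_risk_reduction if absolute_risk_reduction else float("inf")
    return {"relative_risk": relative_risk, "odds_ratio": odds_ratio, "nnt": nnt}


# Hypothetical counts: the relative risk is 0.5 in both scenarios.
rare = compare_effect_measures(10, 1000, 20, 1000)      # outcome in 1% vs. 2%
common = compare_effect_measures(300, 1000, 600, 1000)  # outcome in 30% vs. 60%

for label, result in (("rare outcome", rare), ("common outcome", common)):
    print(label,
          "RR=%.2f" % result["relative_risk"],
          "OR=%.2f" % result["odds_ratio"],
          "NNT=%.1f" % result["nnt"])
```

With the rare outcome, the odds ratio stays close to the relative risk; with the common outcome, it moves much further from 1 and so overstates the apparent effect. The two scenarios also share the same relative risk yet require very different numbers of patients to be treated, which is why the reference denominator matters.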

The fact that arbitrariness and bias can not only creep into, but actually dominate meta-analyses, is demonstrated convincingly and dramatically in a recent study published in the Journal of the American Medical Association. In their efforts to compare two different preparations of heparin for their respective abilities to prevent postoperative thrombosis, Juni and his colleagues revealed that diametrically opposing results can be obtained in different meta-analyses, depending on which of 25 scales is used to distinguish between high- and low-quality RCTs. The root of the problem is evident from the variability of weights given to three prominent features of RCTs (randomization, blinding, and withdrawals) by the 25 studies that have compared the two therapeutic agents.

In one investigation, for example, a third of the total weighting of the quality of the trial is afforded to both randomization and blinding, whereas in another, none of the quality scoring is derived from these features. Widely skewed intermediate values for the three aspects of RCTs under discussion are apparent from the 23 other scales presented. The astute reader will suspect immediately that sharply conflicting conclusions might be drawn from these different studies, and these fears are amply borne out by the forest plot presented in the study.

Here, each of the meta-analyses listed resolves the studies it has reviewed into high- and low-quality strata, based on its own scoring system. It can be seen that 10 of the studies selected show a statistically superior effect of one heparin preparation over the other, but only for the low-quality studies. Seven other studies reveal precisely the opposite effect, in which the high- but not the low-quality studies display a statistically significant superiority of low-molecular-weight heparin. Therefore, depending on which scale one uses, one can either demonstrate or refute the clinical superiority of one treatment over the other. In this manner, all the rigor and labor-intensive elements of the RCT and its interpretation by the meta-analysis are simply reduced to the subjective and undoubtedly capricious human element of value judgment through the arbitrary assignment of numbers in the weighting of experimental quality.³⁸ Reduced to lay terms often used to describe the limits of computer capabilities, one might summarize this undertaking as an apt demonstration of the principle, "Garbage in, garbage out."
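The mechanism can be sketched in a few lines of Python. Everything in the example is hypothetical: the four trials, their effect estimates, the two quality scales and the 0.6 cutoff are invented for illustration and are not the data or the scales analyzed in the JAMA study. The pooling step is a simple fixed-effect, inverse-variance average, which I assume here only for brevity.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Trial:
    name: str
    log_rr: float         # log relative risk (negative favors treatment)
    variance: float       # variance of log_rr
    randomization: float  # each feature scored from 0 (poor) to 1 (good)
    blinding: float
    withdrawals: float


# Hypothetical trials, invented for illustration only.
TRIALS = [
    Trial("A", -0.40, 0.04, 1, 0, 1),
    Trial("B", -0.35, 0.05, 1, 0, 1),
    Trial("C", 0.05, 0.04, 0, 1, 0),
    Trial("D", 0.00, 0.05, 1, 1, 0),
]

# Two hypothetical quality scales that weight the same features differently.
SCALE_X = {"randomization": 0.5, "blinding": 0.0, "withdrawals": 0.5}
SCALE_Y = {"randomization": 0.2, "blinding": 0.8, "withdrawals": 0.0}


def quality_score(trial, weights):
    return (weights["randomization"] * trial.randomization
            + weights["blinding"] * trial.blinding
            + weights["withdrawals"] * trial.withdrawals)


def pool_fixed_effect(trials):
    """Inverse-variance pooled estimate with a 95% confidence interval."""
    if not trials:
        raise ValueError("no trials to pool")
    inverse_variances = [1 / t.variance for t in trials]
    total = sum(inverse_variances)
    estimate = sum(w * t.log_rr for w, t in zip(inverse_variances, trials)) / total
    half_width = 1.96 * math.sqrt(1 / total)
    return estimate, estimate - half_width, estimate + half_width


def high_quality_result(trials, weights, threshold=0.6):
    """Pool only the trials that a given scale rates as high quality."""
    high = [t for t in trials if quality_score(t, weights) >= threshold]
    estimate, low, high_bound = pool_fixed_effect(high)
    return {
        "trials": [t.name for t in high],
        "estimate": estimate,
        "ci": (low, high_bound),
        "significant": low > 0 or high_bound < 0,
    }


result_x = high_quality_result(TRIALS, SCALE_X)
result_y = high_quality_result(TRIALS, SCALE_Y)
print("Scale X:", result_x)
print("Scale Y:", result_y)
```

Under the first scale, trials A and B count as high quality and their pooled effect is statistically significant; under the second, trials C and D take their place and the effect disappears. The trial data never change; only the weights do.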

6. Contradictions in Design

There appear to be contradictions in the design of the authors' comparison of spinal manipulative therapy to seven other therapies (sham; conventional general practitioner; analgesics; physical therapy; exercises; back school; or a collection of therapies judged to be ineffective or even harmful, such as traction, corset, bedrest, home care, topical gel, diathermy, minimal massage, or no treatment). Specifically:

Conventional general practitioner and analgesic use are considered synonymous, based on a single reference that suggests 80 percent of visits to the general practitioner result in a prescription for an analgesic. Why, then, should analgesic use be presented in the report as a discrete intervention?
Physical therapy is stated to include exercise in amounts up to 100 percent. Again, why should exercise then be presented elsewhere as a separate intervention?

7. Contradictions in Evaluating Statistical and Clinical Significance

One especially troubling situation arises with the authors' interpretation of the forest plots comparing spinal manipulative and sham therapies. In one instance (Figure 3 in the Annals article), spinal manipulative therapy is shown to have "clinically important" short-term improvements in pain and disability; however, these differences are deemed to have "failed to reach a conventional level of statistical significance." In comparing spinal manipulative therapy to the group of treatments deemed ineffective, however, we now find a statistically significant advantage for the former intervention. It is perplexing indeed to then find the authors stating, "The clinical significance of this finding is questionable" (emphasis mine).¹ In the simplest of terms, one cannot have it both ways. It would almost seem as if there were a deliberate effort to minimize a treatment effect of potential interest pertaining to spinal manipulation.
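The two yardsticks at issue can be separated in a short Python sketch. The numbers are hypothetical and are not the values reported in the Annals article; the threshold for a clinically important difference (10 points on a 100-point scale) is likewise an assumption made for illustration.

```python
def classify_finding(estimate, ci_low, ci_high, minimal_important_difference):
    """Return (statistically_significant, clinically_important) for a mean difference."""
    # Statistically significant when the 95% confidence interval excludes zero.
    statistically_significant = ci_low > 0 or ci_high < 0
    # Clinically important when the point estimate reaches the chosen threshold.
    clinically_important = abs(estimate) >= minimal_important_difference
    return statistically_significant, clinically_important


# Hypothetical results on a 100-point pain scale, assumed threshold of 10 points.
large_but_imprecise = classify_finding(10, -2, 22, 10)
small_but_precise = classify_finding(4, 1, 7, 10)

print("large but imprecise:", large_but_imprecise)
print("small but precise:", small_but_precise)
```

The first finding meets the clinical threshold but not the statistical one; the second does the reverse. Dismissing one finding on statistical grounds and the other on clinical grounds applies whichever yardstick happens to disfavor the treatment.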

8. Data Are Not Shown in Critical Areas of Interest

Given the aforementioned arbitrary characteristics of meta-analyses, and perhaps of the authors' presentation, one has every reason to wish for the opportunity to examine the data that support the authors' contention that "our sensitivity analyses supported the robustness of our results with respect to the type of manipulative therapy, profession of the manipulator, and the quality of the studies included."¹ However, no data pertaining to these critical areas were presented in the body of the paper. The issue is particularly important with regard to the skill and training of the manipulator, who at times has been misrepresented in the scientific literature.³⁹,⁴⁰ It is questionable how effectively the authors were able to draw comparisons of different chiropractic techniques, as they overlooked the most recent and arguably most comprehensive attempts to do so from the points of view of both clinical effectiveness⁴¹ and a literature review.⁴²

9. Clinical vs. Fastidious Treatments

Some treatments (traction, diathermy, minimal massage) have been deemed by the authors to lack sufficient evidence for their effectiveness as stand-alone applications, and as such, have been rejected from consideration in this investigation. What is not clear, however, is whether they are effective in a synergistic manner as ancillary treatments, and whether they have been excluded as potentially helpful adjuncts to manual therapy. This was alluded to in Feinstein's discussion of meta-analyses presented above (critique #5).³⁶

10. Lack of Long-Term Follow-Up

In this study, follow-ups for back pain outcome assessments are limited to six months. However, numerous studies cite recurrences of low back pain for up to one year.⁴³⁻⁴⁵ This not only makes the definition of an episode problematic,⁴⁶ but demands that follow-up times of at least a year be observed to assess a more durable and perhaps economical treatment effect. Indeed, the longevity of treatment effects of spinal manipulation in managing back pain for 12 months⁴⁷⁻⁴⁹ to three years⁵⁰ has been amply demonstrated. In comparison to medications for the treatment of headaches, spinal manipulation has been shown to be markedly superior.⁵¹,⁵² As in several aforementioned areas of this study, this limitation in comparing treatments might be expected to diminish the actual capacity of spinal manipulation to display its full benefits.

Concluding Remarks

From a variety of perspectives, these meta-analyses appear flawed and appear to have either obscured or overlooked the maximal clinical benefits that might be expected to be conferred upon patients by spinal manipulation, particularly as performed by a chiropractor. The patient response to intervention is far more complex than the dimensions offered by the authors in their discussion. Tonelli points out, for example, that there will always be a region called an epistemological zone⁵³ in which discrete differences between individuals cannot be made explicit and quantified. This degree of sophistication is best summarized by Horwitz, who points out that to assume that the entire range of clinical treatment in any modality has been successfully captured by the precision of existing analytical methods in the scientific literature "would be like saying that a medical librarian who has access to systematic reviews, meta-analyses, Medline, and practice guidelines provides the same quality of healthcare as an experienced physician."⁵⁴ Hopefully, these shortcomings in the current meta-analyses can be appreciated by the public and addressed more meaningfully in future research.

References

Assendelft WJJ, Morton SC, Yu EI, Suttorp MJ, Shekelle PG. Spinal manipulative therapy for low back pain: A meta-analysis of effectiveness relative to other therapies. Annals of Internal Medicine 2003;138:871-881.
Haldeman S, Rubinstein SM. Cauda equina syndrome in patients undergoing manipulation of the lumbar spine. Spine 1992;17(12):1469-1473.
Terrett AGL, Kleynhans AM. Complications from manipulations of the low back. Chiropractic Journal of Australia 1992;22(4):129-140.
Hurwitz EL, Aker PD, Adams AH, Meeker WC, Shekelle PG. Manipulation and mobilization of the cervical spine: A systematic review of the literature. Spine 1996;21(15):1746-1760.
Haldeman S, Carey P, Townsend M, Papadopoulos C. Arterial dissections following cervical manipulation: The chiropractic experience. Canadian Medical Association Journal 2001;165(7):905-906.
Klougart N, Leboeuf-Yde C, Rasmussen LR. Safety in chiropractic practice, Part II: Treatment in the upper neck and the rate of cerebrovascular incidents. Journal of Manipulative and Physiological Therapeutics 1996;19(9):563-569.
Lazarou J, Pomeranz B, Corey P. Incidence of adverse drug reactions in hospitalized patients. Journal of the American Medical Association 1998;279(15):1200-1205.
Phillips D, Christenfeld N, Glynn L. Increase in US medication-error deaths between 1983 and 1993. Lancet 1998;351:643-644.
Schuster M, McGlynn E, Brook R. How good is the quality of health care in the United States? Milbank Quarterly 1998;76:517-563.
Weingart SN, Wilson RM, Gibberd RW, Harrison B. Epidemiology and medical error. British Medical Journal 2000;320:774-777.
Starfield B. Is US health really the best in the world? Journal of the American Medical Association 2000;284(4):483-485.
Report from the Inspector General of the Department of Health and Human Services; Lauran Neergard, Associated Press, December 15, 1999.
Gandhi TK, Weingart SN, Borus J, Seger AC, Peterson J, Burdick E, Seger DL, Shu K, Federico F, Leape LL, Bates DW. Adverse drug events in ambulatory care. New England Journal of Medicine 2003;348(16):1556-1564.
Gurwitz JH, Field TS, Harrold LR, Rothschild J, Debellis K, Seger AC, Cadoret C, Fish LS, Garber L, Kelleher M, Bates DW. Incidence and preventability of adverse drug events among older persons in the ambulatory setting. Journal of the American Medical Association 2003;289(9):1107-1116.
Wolfe MM, Lichtenstein DR, Singh G. Gastrointestinal toxicity of nonsteroidal antiinflammatory drugs. The New England Journal of Medicine 1999;340(24):1888-1899.
Ament PW, Childers RS. Prophylaxis and treatment of NSAID-induced gastropathy. American Family Physician 1997;55(4):1323-1326, 1331-1332.
Simon LS. Osteoarthritis: An overview. Clinical Cornerstone 1999;2(2):26-34.
Fries JF. Assessing and understanding patient risk. Scandinavian Journal of Rheumatology 1992; 92(Suppl): 21-24.
Armstrong CP, Blower AL. Non-steroidal anti-inflammatory drugs and life-threatening complications of peptic ulceration. Gut 1987;28:527-532.
Gabriel SE, Jaakkimainen L, Bombardier C. Risk for serious gastrointestinal complications related to the use of nonsteroidal anti-inflammatory drugs: A meta-analysis. Annals of Internal Medicine 1991;115:787-796.
Carson JL, Willett LR. Toxicity of nonsteroidal anti-inflammatory drugs: An overview of the epidemiological evidence. Drugs 1993;46(Suppl 1):243-248.
Carson JL, Strom BL, Soper KA, West SL, Morse SL. The association of nonsteroidal anti-inflammatory drugs with upper gastrointestinal tract bleeding. Archives of Internal Medicine 1987;147: 85-88.
Page J, Henry D. Consumption of NSAIDS and the development of congestive heart failure in elderly patients. Archives of Internal Medicine 2000;160:777-784.
Bombardier C, Laine L, Reicin A, Shapiro D, Burgos-Vargas R, Davis B, Day R, Ferraz MB, Hawkey CJ, Hochberg MC, Kvien TK, Schnitzer TJ. Comparison of upper-gastrointestinal toxicity of rofecoxib and naproxen in patients with rheumatoid arthritis. VIGOR Study Group. New England Journal of Medicine 2000;343: 1520-1528.
Important drug safety information: Vioxx (Dear Healthcare Professional Letter). Pointe-Claire, Dorval [QC]: Merck Frosst Canada; 2002 April 15. Available: www.hc-sc.gc.ca/hpbdgps/therapeut/zfiles/english/advisory/industry/vioxx_e.html (accessed 2002 May 30).
Important drug safety information: Celebrex (Dear Healthcare Professional Letter). Mississauga (ON): Pharmacia Canada Inc.; 2002 May 13. Available: www.hcsc.gc.ca/hpbdgps/therapeut/zfiles/english/industry/celebrex_e.html (accessed 2002 May 30).
Konstam MA, Weir MR, Reicin A, Shapiro D, Sperling RS, Barr E, Gertz BJ. Cardiovascular thrombotic events in controlled clinical trials of rofecoxib. Circulation 2001;104:2280-2288.
Whitcomb DC, Block GD. Association of acetaminophen hepatotoxicity with fasting and ethanol use. Journal of the American Medical Association 1994;272(23):1845-1850.
Cherkin DW, Dyer S, Browne W, Townsend J, Frank AO. Medication use for low back pain in primary care. Spine 1998;23:607-614.
Rome PL. Perspectives: An overview of comparative considerations of cerebrovascular accidents. Chiropractic Journal of Australia 1999;29(3):87-102.
Sackett DL. Editorial: Evidence-based medicine. Spine 1998;23(10):1085-1086.
Jonas W. The evidence house: How to build an inclusive base for complementary medicine. Western Journal of Medicine 2001;175:79-80.
Caspi O, Millen C, Sechrest L. Integrity and research: Introducing the concept of dual blindness: How blind are double-blind clinical trials in alternative medicine? Alternative Therapies in Health and Medicine 2000;6(6):493-498.
Koes BW, van Tulder MW, Ostelo R, Burton AK, Waddell G. Clinical guidelines for the management of low back pain in primary care. Spine 2001;26(22):2504-2514.
van Tulder MW, Koes BW, Bouter LM. Conservative treatment of acute and chronic nonspecific low back pain: A systematic review of randomized controlled trials of the most common interventions. Spine 1997;22(18):2128-2156.
Feinstein AR. Meta-analysis: Statistical alchemy for the 21st century. Journal of Clinical Epidemiology 1995;48(1):71-79.
Balk EM, Bonis PAL, Moskowitz H, Schmid CH, Ioannidis JPA, Wang C, Lau J. Correlation of quality measures with estimates of treatment effect in meta-analyses of randomized controlled trials. Journal of the American Medical Association 2002;287(22):2973-2982.
Juni P, Witsch A, Bloch R, Egger M. The hazards of scoring the quality of clinical trials for meta-analysis. Journal of the American Medical Association 1999;282(11):1054-1060.
Terrett AGJ. Current concepts in vertebrobasilar complications following spinal manipulation. West Des Moines, IA: NCMIC Group Inc., 2001.
Terrett AGJ. Misuse of the literature by medical authors in discussing spinal manipulative therapy injury. Journal of Manipulative and Physiological Therapeutics 1995;18(4):203-210.
Gatterman MI, Cooperstein R, Lantz C, Perle SM, Schneider MJ. Rating specific chiropractic technique procedures for common low back conditions. Journal of Manipulative and Physiological Therapeutics 2001;24(7):449-456.
Cooperstein RL, Perle SM, Gatterman MI, Lantz C, Schneider MJ. Chiropractic technique procedures for specific low back conditions: Characterizing the literature. Journal of Manipulative and Physiological Therapeutics 2001;24(6): 407-424.
Croft PR, Macfarlane GJ, Papageorgiou AC, Thomas E, Silman AJ. Outcome of low back pain in general practice: A prospective study. British Medical Journal 1998;316:1356-1359.
McGorry RW, Webster BS, Snook SH, Hsiang SM. The relation between pain intensity, disability, and the episodic nature of chronic and recurrent low back pain. Spine 2000;25(7):834-841.
Hestbaek L, Leboeuf-Yde C, Engberg M, Lauritzen T, Bruun NH. LBP: What is the long-term course? European Spine Journal 2003;12(2):149-165.
Smith M, Stano M. Costs and recurrences of chiropractic and medical episodes of low-back care. Journal of Manipulative and Physiological Therapeutics 1997;20(1):5-12.
Koes BW, Bouter LM, van Mameren H, Essers AHM, Verstegen GMJR, Hofhuizen DM, Houben JP, Knipschild PG. Randomised clinical trial of manipulative therapy and physiotherapy for persistent back and neck complaints: Results of one-year follow-up. British Medical Journal 1992;304:601-605.
Skargren EI, Oberg BE, Carlsson PG, Gade M. Cost and effectiveness analysis of chiropractic and physiotherapy treatment for low back and neck pain. Spine 1997;22:2167-2177.
Meade TW, Dyer S, Browne W, Townsend J, Frank AO. Low back pain of mechanical origin: Randomised comparison of chiropractic and hospital outpatient treatment. British Medical Journal 1990;300:1431-1437.
Meade TW, Dyer S, Browne W, Frank AO. Randomised comparison of chiropractic and hospital outpatient management for low back pain: Results from extended follow-up. British Medical Journal 1995;311:349-351.
Boline P, Kassak K, Bronfort G, Nelson C, Anderson AV. Spinal manipulation vs. amitriptyline for the treatment of chronic tension-type headaches: A randomized clinical trial. Journal of Manipulative and Physiological Therapeutics 1995;18(3):148-154.
Nelson CF, Bronfort G, Evans R, Boline P, Goldsmith C, Anderson AV. The efficacy of spinal manipulation, amitriptyline and the combination of both therapies in the prophylaxis of migraine headache. Journal of Manipulative and Physiological Therapeutics 1998;21(8):511-519.
Tonelli MR. The philosophical limits of evidence-based medicine. Academic Medicine 1998;73(12): 1234-1240.
Horwitz RI. The dark side of evidence-based medicine. Cleveland Clinic Journal of Medicine 1996; 63:320-323.

September 2003