Dr. Carey and Dr. Melvin are affiliated with the Cecil G. Sheps Center for Health Services Research, University of North Carolina at Chapel Hill, CB 7590, Chapel Hill, NC 27599 (e-mail: email@example.com). Dr. Williams is with the Department of Medicine, Duke University, Durham, North Carolina. Dr. Oldham is with the Menninger Clinic, Houston, Texas. Dr. Goodman is with the Southern California Evidence-based Practice Center, RAND Corporation, Santa Monica, California. William M. Glazer, M.D., is editor of this column.
The past 20 years have seen substantial treatment advances for the large U.S. population with major mental illness. New classes of medication have been developed for the treatment of major depressive disorders, psychotic disorders, and disabling neuroses. Older medications (such as antiepileptics) have been found to be useful in conditions such as bipolar disorder.
How should physicians and patients choose among multiple medications within a drug class? Numerous influences will make these choices more complicated in the coming years. Insurers may list preferred agents in their tiered formularies or have fail-first measures with a mandated algorithm for step-therapy; hospital formularies dictate decisions among drugs within a class. In the Medicare drug benefit, dozens of drug plans are available to elders in a state, each with its own formulary. For patients with severe illness, such as major depression, bipolar disorder, or schizophrenia, the implications of changing medications within a class may be significant. Will the second agent have the same probability of efficacy as the first agent? Is there a standard to "crosswalk" dosing, ensuring that a patient will receive similar probability of benefit from the second agent? How many head-to-head comparisons of agents are sufficient to ensure patient and provider confidence in the equivalence of two drugs?
Reviews of the diagnosis and treatment of conditions are as old as the practice of medicine. Traditional narrative reviews are generally conducted by a sole author, address broad questions, are not easily reproducible, and may be somewhat idiosyncratic. This type of article is generally easy to read, and its findings are relatively easy to incorporate into everyday practice.
The systematic review has a more focused question, is based on a predefined literature search using explicit inclusion and exclusion criteria, uses methods that allow reproducibility of findings, and usually has several authors. This use of reproducible methods makes the systematic review less prone to bias but also brings some disadvantages. Users often comment that the questions may be so narrowly focused that the findings are not applicable to many of their patients. Also, report length and technical terminology may hinder easy use.
Many issues in mental health can be subjected to systematic reviews that address questions such as the pharmacotherapy of alcohol dependence, screening for suicide risk, and the diagnosis of perinatal depression (1,2,3). Drug class reviews are best considered a subset of the broad category of systematic reviews, addressing the relative efficacy, effectiveness, and harms associated with agents within a particular drug class.
Every hospital pharmacy and therapeutics or payer formulary committee addresses comparative effectiveness issues. The scope of these reviews can be extremely variable and markedly dependent on available resources, local expertise, and time. Issues often associated with these reviews and affecting their objectivity include selective citation of evidence and the potential for undisclosed conflicts of interest. Systematic reviews published without adequate description of the underlying search strategies used to select the component studies or that include articles that demonstrate only positive findings are fundamentally flawed, and their conclusions must be seriously questioned.
Several new initiatives provide additional support to clinicians. For the past three years, a consortium of state Medicaid agencies has funded the Drug Effectiveness Review Project. The reviews are comprehensive, transparent, extensively peer reviewed, and updated annually (www.ohsu.edu/drugeffectiveness/reports/index.cfm). Periodic updates are especially important given that over 10% of guidelines become outdated in three to four years (4).
In response to the Medicare Modernization Act, the federal government initiated a series of drug class reviews to better inform the multiple intermediaries involved in the Medicare Part D drug benefit. The process is similar across reviews:
• developing a specific, clinically relevant question and revising that question through expert review
• developing eligibility criteria for a study to be included in the review (for example, studies conducted outside the United States may not be relevant for certain conditions; editorials and letters to the editor also add little when addressing drug efficacy)
• searching the literature in a reproducible fashion using keywords, manual searches of the bibliographies of recent reviews, and consultation with experts. The proportion of the literature accessed from each source will vary depending on the clinical topic. Not all research is published, and negative studies may be less likely to be published than positive studies, but using accessible trial registries (www.clinicaltrials.gov) may help to reduce this publication bias
• reviewing titles and abstracts for initial eligibility
• having two reviewers abstract key, explicit data components into evidence tables and rate the quality of each article according to explicit criteria
• deciding whether pooling of data in a meta-analysis is appropriate (topics with few trials, or with heterogeneous study methods or results, are not appropriate for data pooling)
• drafting the final report with explicit components regarding strengths, weaknesses, and recommendations for future research
• interpreting information for appropriate end users, including policy makers, formulary committees, practitioners, and the public
Explicit eligibility criteria are particularly important in drug class reviews. Sometimes the conclusions of a review may be dependent on two or three studies. If eligibility criteria are manipulated to exclude even one key study, then misleading results could be generated. Medline searches alone are generally not sufficient for most reviews. Our group routinely searches the EMBASE database, which contains some European literature not contained in Medline, as well as the Cochrane Library and other specialized databases.
Drug class reviews are most helpful in making explicit what has been published but not yet synthesized by research and practice communities. Several recent studies have demonstrated the power of the direct, head-to-head comparison. For example, the Clinical Antipsychotic Trials of Intervention Effectiveness (CATIE) study on antipsychotics has attempted to clarify the use of these agents (5). The recent Sequenced Treatment Alternatives to Relieve Depression (STAR*D) study finally gave providers information regarding use of second agents after first-line treatment with a second-generation antidepressant has failed (6). Even these large trials, however, cannot address every clinical issue regarding use of a class of medications, such as the appropriate choice of therapy for patients with extensive comorbidities, for the very elderly, or for children. These groups may be underrepresented in some studies.
Recent work has found that several characteristics distinguish generalizable effectiveness studies from more narrow efficacy studies. These include patient populations in primary care settings, less stringent eligibility criteria, assessment of health outcomes (as opposed to biologic intermediates), long study duration, assessment of adverse events, adequate sample size to assess a minimally important difference from a patient's perspective, and intention-to-treat analysis (7). However, it is much more common for head-to-head trials to be relatively small, with sample sizes ranging from 30 to 150 patients, rendering them underpowered to detect significant differences between drugs. In these situations, pooling multiple studies into a meta-analysis may be quite useful. More complex, however, is the circumstance in which there are few or no head-to-head trials between two agents within a drug class, making inferences regarding comparative effectiveness of medications more difficult and controversial.
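A rough calculation illustrates why head-to-head trials of 30 to 150 patients are often underpowered. The sketch below uses a standard normal approximation for a two-sided comparison of two response rates. The function name and the response rates of 60% and 50% are hypothetical values chosen for illustration; they are not taken from any trial discussed here.

```python
from math import sqrt
from statistics import NormalDist


def power_two_proportions(p1, p2, n_per_arm, alpha=0.05):
    """Approximate power of a two-sided z-test comparing two response rates.

    p1, p2     : assumed true response rates in each arm (hypothetical inputs)
    n_per_arm  : number of patients randomized to each arm
    alpha      : two-sided significance level
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    p_bar = (p1 + p2) / 2
    # Standard error under the null hypothesis of equal response rates.
    se_null = sqrt(2 * p_bar * (1 - p_bar) / n_per_arm)
    # Standard error under the alternative (the assumed true rates).
    se_alt = sqrt(p1 * (1 - p1) / n_per_arm + p2 * (1 - p2) / n_per_arm)
    z = (abs(p1 - p2) - z_alpha * se_null) / se_alt
    return NormalDist().cdf(z)


# Hypothetical scenario: drug A helps 60% of patients, drug B helps 50%.
for total_patients in (30, 150, 800):
    power = power_two_proportions(0.60, 0.50, total_patients // 2)
    print(f"{total_patients} patients in total: power = {power:.2f}")
```

Under these assumptions, a trial at the upper end of the 30-to-150 range would still miss a ten-percentage-point difference far more often than it would detect it. That is the motivation for pooling several small trials.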
Consider the situation in which we have a modest number of head-to-head studies (say four to six) between two drugs. This occurred with multiple agents within second-generation antidepressants (8). If the studies are conducted in similar populations and measure similar outcomes, pooling of the results in a meta-analysis may be appropriate. However, caution must be taken that the meta-analysis is appropriately conducted and that results are assessed for heterogeneity.
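As a minimal sketch of what pooling and a heterogeneity check involve, the example below applies a fixed-effect, inverse-variance model with Cochran's Q and the I-squared statistic. This is one common approach and not necessarily the method used in the reviews cited here. The function name and the per-study effect estimates (for example, log odds ratios with their standard errors) are hypothetical.

```python
from math import sqrt


def pool_fixed_effect(effects, std_errors):
    """Inverse-variance fixed-effect pooling with simple heterogeneity measures.

    effects    : per-study effect estimates on an additive scale (e.g. log odds ratios)
    std_errors : standard error of each estimate
    Returns (pooled effect, pooled standard error, Cochran's Q, I-squared in percent).
    """
    weights = [1 / se ** 2 for se in std_errors]
    total_weight = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, effects)) / total_weight
    pooled_se = sqrt(1 / total_weight)
    # Cochran's Q: weighted squared deviations of each study from the pooled estimate.
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    # I-squared: share of variation attributed to heterogeneity rather than chance.
    i_squared = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return pooled, pooled_se, q, i_squared


# Hypothetical log odds ratios from five small head-to-head trials.
log_odds_ratios = [0.25, 0.10, 0.40, -0.05, 0.30]
standard_errors = [0.30, 0.35, 0.28, 0.40, 0.33]
pooled, pooled_se, q, i_squared = pool_fixed_effect(log_odds_ratios, standard_errors)
print(f"pooled log OR = {pooled:.3f} (SE {pooled_se:.3f}), Q = {q:.2f}, I^2 = {i_squared:.0f}%")
```

The pooled standard error is smaller than that of any single trial, which is where the gain in precision comes from. A large Q or I-squared value is a signal that the studies may differ too much for a single pooled estimate to be meaningful.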
Analysis of the literature to assess subgroups for whom a treatment may be effective is commonly a section of drug class reviews. Within a drug class, it is uncommon to identify empirical evidence to justify a claim that a medication may work better for patients of a particular age, gender, or ethnic group or with a particular comorbidity. Drug class reviews cannot make up for problems in the component articles of the review. If the constituent research studies examining a class of drugs use poor research methods or inadequate adjunctive care to the main intervention or have excessive loss to follow-up, then statistical manipulation through meta-analysis will continue to lead to potentially misleading conclusions. However, a systematic review can be very useful in this situation through highlighting the methods challenges and assisting researchers and readers toward conducting higher-quality research.
When there are few (or no) head-to-head comparisons of medications, researchers may use indirect methods for comparison (9). An indirect comparison can be simply described as follows: imagine that two placebo-controlled trials have been performed, one of drug A versus placebo and one of drug B versus placebo. The studies are conducted in different cities, by different investigators, and use similar but not identical methods. Can one validly compare the outcome of drug A versus drug B? The key advantage of randomized trials, the random allocation of patients between the treatments being compared, is partially lost in these indirect analyses because patients are not randomly allocated between drug A and drug B. That is, drug A is compared with drug B as if the comparison were a well-conducted observational study. Statistical methods can be used to partially adjust for between-study differences in severity of illness, comorbidity, or decade in which the study was conducted. The use of indirect methods is not a substitute for adequately powered, well-conducted randomized trials.
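One widely used form of indirect comparison, often attributed to Bucher and colleagues, can be sketched as follows. It is offered as an illustration and is not necessarily the method described in the cited reference. The effect of drug A versus drug B is estimated as the difference between the two placebo-controlled log odds ratios, and the variances of the two estimates are added. The function name and all input values are hypothetical.

```python
from math import exp, sqrt
from statistics import NormalDist


def indirect_comparison(log_or_a, se_a, log_or_b, se_b, level=0.95):
    """Adjusted indirect comparison of drug A versus drug B via a common comparator.

    log_or_a, se_a : log odds ratio and standard error for drug A versus placebo
    log_or_b, se_b : log odds ratio and standard error for drug B versus placebo
    Returns (odds ratio for A versus B, lower confidence limit, upper confidence limit).
    """
    diff = log_or_a - log_or_b
    # Variances add, so the indirect estimate is less precise than either trial.
    se = sqrt(se_a ** 2 + se_b ** 2)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return exp(diff), exp(diff - z * se), exp(diff + z * se)


# Hypothetical inputs: each drug has been compared only with placebo.
odds_ratio, lower, upper = indirect_comparison(0.60, 0.25, 0.45, 0.30)
print(f"indirect OR, A versus B = {odds_ratio:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

Because the two variances are summed, the confidence interval for the indirect estimate is always wider than it would be for a direct comparison of similar size. The calculation also does nothing to correct for differences between the trial populations, which is why such estimates call for cautious interpretation.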
Clinicians routinely use information other than efficacy to make treatment decisions, including potential harms, cost, convenience, baseline risk of the disease, and related benefits. For harms, medications can of course have a variety of both common and rare adverse effects. A substantial challenge to clinicians is the often limited data on the prevalence and impact of adverse effects in the peer-reviewed literature (10). If the efficacy of several agents is similar, then cost may be a determining factor. Circumstances may lead to differential interpretations of the term "cost," depending on the perspective (of the patient or of the insurer). Convenience is also a consideration. Once- versus twice-daily dosing may make a difference in patient satisfaction or adherence. In a high-risk condition, clinicians may be willing to use treatments with only modest evidence of greater efficacy over comparison drugs or procedures. As for related benefits, some treatments may provide benefits for conditions other than the primary condition of interest, such as the utility of antidepressants in treating depression with comorbid anxiety disorder (11).
Drug class reviews are here to stay. The range of agents available for many conditions and the increasing pressure from payers to constrain prescription drug expenditures mandate that clinicians be able to use these reviews in a way that leads to improved care for our patients. Drug class reviews can form the basis for practice guidelines but are only one form of evidence that goes into such guidelines. Guidelines may also address issues of cost, local practice patterns, and the efficacy of competing, nondrug treatments. A challenge to researchers and policy makers is to conduct and disseminate these drug class reviews in as timely and accessible a manner as possible to the practice community.
This work was supported in part by a contract from the State of Vermont and the Neurontin Special Committee, a consortium of state attorneys general.
The authors report no competing interests.