Irritability and Its Clinical Utility in Major Depressive Disorder: Prediction of Individual-Level Acute-Phase Outcomes Using Early Changes in Irritability and Depression Severity

Abstract

Objective:

The authors evaluated improvement in irritability with antidepressant treatment and its prognostic utility in treatment-seeking adult outpatients with major depressive disorder.

Methods:

Mixed-model analyses were used to assess changes in irritability (as measured with the five-item irritability domain of the Concise Associated Symptom Tracking [CAST-IRR] scale) from baseline to week 4 after controlling for depression severity (as measured with the 16-item Quick Inventory of Depressive Symptomatology–Clinician Rated [QIDS-C]) in the Combining Medications to Enhance Depression Outcomes (CO-MED) trial (N=664). An interactive calculator for remission (QIDS-C score ≤5) and no meaningful benefit (<30% reduction in QIDS-C score from baseline) at week 8 was developed with logistic regression analyses in the CO-MED trial using participants with complete data (N=431) and independently replicated in the Suicide Assessment and Methodology Study (SAMS) (N=163).

Results:

In the CO-MED trial, irritability was significantly reduced (effect size=1.06) from baseline to week 4, and this reduction remained significant after adjusting for QIDS-C change (adjusted effect size=0.36). A one-standard-deviation greater reduction in CAST-IRR score from baseline to week 4 predicted a 1.73 times higher likelihood of remission and a 0.72 times lower likelihood of no meaningful benefit at week 8, independent of baseline QIDS-C and CAST-IRR scores and reduction in QIDS-C score from baseline to week 4. The model estimates for remission (area under the curve [AUC]=0.79) and no meaningful benefit (AUC=0.76) in the CO-MED trial were used to predict remission (AUC=0.80) and no meaningful benefit (AUC=0.84) in SAMS and to develop an interactive calculator.

Conclusions:

Irritability is an important symptom domain of major depressive disorder that is not fully reflected in depressive symptom severity measures. Early reductions in irritability, when combined with changes in depressive symptom severity, provide a robust estimate of likelihood of remission or no meaningful benefit in outpatients with major depression.

Current diagnostic criteria and symptom severity measures rely on the nine core symptoms of major depressive disorder (1–3), yet impairments as a result of major depression and improvements with treatment are not fully captured by these symptom criteria (4–7). Irritability is unique among major depressive disorder-associated symptom domains, because it is considered to be a core diagnostic symptom among adolescents but not adults (1). Yet, it is widely prevalent in adult patients with major depression, with 40%–50% reporting the presence of irritability for more than half the time in their current depressive episode (8–10). The presence of irritability is associated with greater severity of depressive and anxiety symptoms, earlier age at onset, presence of atypical features, and poorer quality of life (8, 9, 11). Irritability is also associated with poorer clinical course. Patients with major depression who report irritability, compared with those without irritability, are more likely to have a chronic course characterized by a greater number of weeks spent in a depressive episode and greater time spent with residual symptoms in between depressive episodes (10). The presence of anger or hostility before treatment initiation and worsening of irritability after initiation of antidepressant medication are both associated with lower acute-phase remission rates (12, 13). Reductions ≥30% in anger or hostility ratings by week 2 of antidepressant treatment are associated with a doubled likelihood of remission among patients with major depression (14). In patients with treatment-resistant major depression, the presence of irritability may help guide the optimal next-step treatment selection, as evidenced by greater symptom improvement with low-dose brexpiprazole in patients with irritability (15).

Despite the wide prevalence, the prognostic utility, and the treatment selection potential, measurement-based care protocols (16, 17) and treatment guidelines (18, 19) for major depressive disorder do not systematically assess or incorporate irritability in clinical decision making. To be clinically useful, changes in irritability should predict longer-term clinical outcomes and reflect improvement beyond what is reflected in measures of overall depressive symptom severity. Previous studies have not evaluated whether changes in irritability with antidepressant treatment were completely accounted for by changes in depressive symptom severity. Furthermore, the prognostic significance of changes in irritability has not been shown to be independent of reduction in depressive symptom severity. Finally, even if changes in irritability predicted longer-term outcomes, currently there are no easy-to-use methods or recommendations to incorporate them in clinical decision making at the individual patient level.

Our aim in this study was to evaluate the clinical utility of adding irritability to the current paradigm of measuring depressive symptom severity during the course of antidepressant treatment. The specific questions we asked, using two samples of convenience, were as follows:

  1. Does irritability improve from baseline to week 4 of antidepressant treatment even after accounting for change in depressive symptom severity?

  2. Does baseline-to-week-4 change in irritability predict remission and no meaningful benefit (<30% reduction from baseline) at week 8 even after controlling for baseline irritability, baseline depressive symptom severity, and baseline-to-week-4 change in depressive symptom severity?

  3. Can baseline-to-week-4 changes in irritability and depressive symptom severity be used to predict remission and no meaningful benefit at an individual level?

  4. Do these predictions replicate in a separate, unrelated sample of outpatients with major depression?

Using the first sample of participants (N=664) from the Combining Medications to Enhance Depression Outcomes (CO-MED) trial, baseline-to-week-4 change in irritability was assessed after controlling for change in depressive symptom severity. Prediction of acute-phase (week 8) treatment outcomes (remission and no meaningful benefit) by baseline-to-week-4 change in irritability was tested in models incorporating baseline irritability and depressive symptom severity and baseline-to-week-4 change in depressive symptom severity. The estimates obtained from the CO-MED trial were used to predict individual-level probability of remission and no meaningful benefit in a second sample of participants (N=163) from the Suicide Assessment Methodology Study (SAMS). Week 4 was selected to measure change from baseline in irritability and depressive symptom severity because it is the first critical decision point for assessing response to treatment per measurement-based care protocol (16). Because SAMS was an 8-week study, acute-phase treatment outcomes were ascribed at week 8. Although remission (defined as a score ≤5 on the Quick Inventory of Depressive Symptomatology–Clinician-Rated [QIDS-C] [3]) is the preferred goal of acute-phase antidepressant treatment (19), “no meaningful benefit” (defined as a reduction <30% in QIDS-C score from baseline) was included to facilitate clinical decision making (treatment augmentation, switching, or discontinuation).

Methods

Study Overview and Participants

CO-MED trial.

Our analysis included all CO-MED trial (NCT00590863) participants for whom data from the Concise Associated Symptom Tracking (CAST) scale (20) were available at baseline (N=664). The details of the CO-MED trial, including recruiting sites, inclusion and exclusion criteria, and institutional review board approvals, have been described previously (21). From March 2008 through February 2009, participants from six primary care sites and nine psychiatric care sites were enrolled after written informed consent was obtained (21). Inclusion was restricted to treatment-seeking outpatients with major depression who were 18–75 years old and had nonpsychotic chronic (duration of current episode ≥2 years) or recurrent (current episode ≥2 months) depression and a baseline score ≥16 on the 17-item Hamilton Depression Rating Scale (HAM-D) (22). Exclusionary criteria included a lifetime history of a psychotic disorder, current psychotic symptoms, a history of an eating disorder in the past 2 years, a current primary diagnosis of obsessive-compulsive disorder (OCD), current substance dependence requiring inpatient-level care, an unstable general medical condition, a current psychiatric condition necessitating hospitalization, a history of a seizure disorder or narrow-angle glaucoma, inadequately treated hypothyroidism, and use of contraindicated medications for general medical or psychiatric conditions (antipsychotics, anticonvulsants, mood stabilizers, CNS stimulants, antidepressants, or other medications with potential augmentation properties). Participants were randomly assigned at baseline to one of three treatment arms in a 1:1:1 ratio after stratification by clinical site: escitalopram plus placebo (selective serotonin reuptake inhibitor [SSRI] monotherapy), sustained-release bupropion plus escitalopram (bupropion-SSRI combination), and extended-release venlafaxine plus mirtazapine (venlafaxine-mirtazapine combination). 
Postrandomization visits were conducted at weeks 1, 2, 4, 6, 8, 10, and 12 for the acute phase and weeks 16, 20, 24, and 28 for the continuation phase. As previously reported (21), acute-phase and continuation-phase outcomes (remission or response) did not differ among the three treatment arms.

SAMS.

All SAMS (NCT00532103) participants who completed the QIDS-C and CAST assessments at baseline and at week 4 and the QIDS-C at week 8 (N=163) were included. The details of SAMS, including recruiting sites, inclusion and exclusion criteria, and institutional review board approvals, have been described previously (23). From July 2007 through February 2008, a total of 266 participants from six primary care sites and nine psychiatric care sites were enrolled in SAMS after written informed consent was obtained (23). Inclusion was restricted to 18- to 75-year-old treatment-seeking outpatients with nonpsychotic major depressive disorder and a score ≥14 on the HAM-D. Exclusion criteria included failure of two or more courses of SSRIs in the current episode, current substance use disorder, bipolar disorder, schizophrenia, a primary diagnosis of OCD or an eating disorder, use of prohibited medications (antipsychotics and antiepileptics), and an unstable general medical or psychiatric condition necessitating hospitalization.

Measurement-Based Care

The measurement-based care (16) approach was used in both the CO-MED trial and SAMS to make medication dosage adjustments during the 8-week postbaseline period, using the QIDS-C (3) as a measure of depression severity and the Frequency, Intensity, and Burden of Side Effects Rating Scale (24) as a measure of side effects. Study physicians made dosage increases only if the depression severity was not adequately controlled and the side effect burden was tolerable.

Medications

CO-MED trial.

Participants in all three treatment arms received two types of pills in a single-blind fashion. The study personnel were aware of both pill types, but study participants were aware of only the first pill type. In the SSRI monotherapy arm, escitalopram was started at 10 mg/day, with an increase up to 20 mg/day permitted at week 4; pill placebo was added as the second pill type at week 2. In the bupropion-SSRI treatment arm, sustained-release bupropion was initiated at 150 mg/day and increased to 300 mg/day at week 1; escitalopram was started at 10 mg/day as the second pill type at week 2, and dosage increases of bupropion (up to 200 mg twice daily) and escitalopram (up to 20 mg/day) were permitted from weeks 4 to 8. In the venlafaxine-mirtazapine arm, extended-release venlafaxine was initiated at 37.5 mg/day and titrated to 150 mg/day by week 1; mirtazapine at 15 mg/day was added as the second pill type at week 2, and dosage increases of venlafaxine (up to 300 mg/day) and mirtazapine (up to 45 mg/day) were permitted from weeks 4 to 8.

SAMS.

Participants were treated with an SSRI (escitalopram, citalopram, sertraline, paroxetine, controlled-release paroxetine, or fluoxetine) in an open-label fashion, with choice of an antidepressant at the individual participant’s discretion (20). Escitalopram was initiated at 10 mg/day, and an increase up to 20 mg/day was permitted from weeks 4 to 6. Citalopram was initiated at 20 mg/day, and an increase up to 40 mg/day was permitted from weeks 4 to 6. Sertraline was initiated at 50 mg/day and increased to 100 mg/day at week 2; further increases up to 150 mg/day were permitted from weeks 4 to 6. Paroxetine was initiated at 20 mg/day, and an increase up to 40 mg/day was permitted from weeks 4 to 6. Controlled-release paroxetine was initiated at 25 mg/day, and an increase up to 37.5 mg/day was permitted from weeks 4 to 6. Fluoxetine was initiated at 20 mg/day, and an increase up to 40 mg/day was permitted from weeks 4 to 6.

Assessments

QIDS-C.

The 16 items of the QIDS-C are based on the nine symptom criteria domains of major depressive disorder. Each item is scored from 0 to 3, and total score ranges from 0 to 27 (3, 25). The QIDS-C correlates highly with the 17-item HAM-D (r=0.93) and has high inter-item correlations (Cronbach’s alpha=0.85) (25, 26). Remission was defined as a score ≤5 on the QIDS-C at week 8. No meaningful benefit was defined as a reduction <30% from baseline to week 8 on the QIDS-C.
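The two outcome definitions above can be expressed as a short check. The following is an illustrative Python sketch only; the function names are hypothetical and are not taken from the study's analysis code.

```python
def percent_reduction(baseline: float, followup: float) -> float:
    """Percent reduction from baseline: 100 * (baseline - followup) / baseline."""
    if baseline <= 0:
        raise ValueError("baseline score must be positive")
    return 100.0 * (baseline - followup) / baseline


def classify_week8(baseline_qids: float, week8_qids: float) -> dict:
    """Apply the two week-8 outcome definitions stated in the text.

    Remission: QIDS-C score of 5 or lower at week 8.
    No meaningful benefit: less than 30% reduction in QIDS-C score from baseline.
    """
    return {
        "remission": week8_qids <= 5,
        "no_meaningful_benefit": percent_reduction(baseline_qids, week8_qids) < 30.0,
    }


# Hypothetical participant: baseline QIDS-C of 16, week-8 QIDS-C of 12 (25% reduction).
example = classify_week8(16, 12)
```

Note that the two outcomes are not complementary: a participant whose score falls by 30% or more but remains above 5 meets neither definition.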

CAST Self-Report.

The 16 items of the CAST assess symptoms across five domains in which each individual item is rated on a 5-point Likert scale (from 1, “strongly disagree,” to 5, “strongly agree”): anxiety (three items), irritability (five items categorized as the CAST irritability domain subscale [CAST-IRR]), mania (four items), insomnia (two items), and panic (two items) (20). Items included in the CAST-IRR are as follows: “I wish people would just leave me alone”; “I feel very uptight”; “I find myself saying or doing things without thinking”; “Lately everything seems to be annoying me”; and “I find people get on my nerves easily.” The factor structure of the CAST was reported initially by Trivedi et al. (20) and validated by Jha et al. (12) and Trombello et al. (27). The CAST-IRR has been shown to have significant correlations with the Impulsivity Rating Scale (r=0.39), the Beck Anxiety Inventory (r=0.42), and the irritability item of the Clinician-Administered Rating Scale for Mania (r=0.30) (20) as well as small to moderate correlations with comorbid psychiatric disorders (Spearman’s correlation coefficient range, 0.07–0.29) as measured by a self-report psychiatric diagnostic screening questionnaire (12). In the present study, the Pearson correlation coefficient between the QIDS-C and the CAST-IRR was 0.35 (p<0.001) at baseline.

Statistical Analyses

The analytic sample for changes in the CAST-IRR included all CO-MED trial participants with CAST-IRR data available at baseline (N=664). Of these participants, those who had QIDS-C and CAST-IRR data at week 4 and QIDS-C data at week 8 were included in the analytic sample for prediction of acute-phase treatment outcomes (N=431). The analytic sample for replication of these predictions in SAMS included participants with QIDS-C and CAST-IRR data at baseline and week 4 and QIDS-C data at week 8 (N=163). In the CO-MED trial, repeated-measures mixed-model analyses tested baseline-to-week-4 changes in CAST-IRR score, with the visit as the within-subject variable and all other variables as between-subject variables, before and after controlling for QIDS-C score at each visit. Also in the CO-MED trial, two separate logistic regression analyses predicted remission and no meaningful benefit at week 8, with baseline QIDS-C score, baseline CAST-IRR score, percent change in QIDS-C score (100×[baseline QIDS-C score − week-4 QIDS-C score]/baseline QIDS-C score), and percent change in CAST-IRR score (100×[baseline CAST-IRR score − week-4 CAST-IRR score]/baseline CAST-IRR score) as predictor variables using the following equation:

$$\ln\left(\frac{p}{1-p}\right) = b_0 + b_1\,(\text{baseline QIDS-C score}) + b_2\,(\text{baseline CAST-IRR score}) + b_3\,(\text{percent change in QIDS-C score}) + b_4\,(\text{percent change in CAST-IRR score})$$

where p is the probability of the outcome variable (remission or no meaningful benefit at week 8), and b_i is the regression parameter for the ith predictor. The improvement in model performance for predicting remission and no meaningful benefit after adding the CAST-IRR variables was assessed using net reclassification improvement analyses (28). The logistic regression analyses described above were repeated with sex and treatment arm (SSRI monotherapy, bupropion-SSRI, and venlafaxine-mirtazapine) added separately to test whether sex and treatment significantly affected the outcomes. A receiver operating characteristic curve was plotted to obtain the area under the curve (AUC), and calibration plots (29) were used to evaluate the agreement between predicted probabilities and observed outcomes. To generate calibration plots, the data were divided into 10 bins on the basis of predicted probability, the number of participants in each bin who had the outcome was determined, and the event rate was calculated for each bin. These event rates were then plotted against the midpoint of each bin. If a model is well calibrated, the plotted points fall along a 45-degree line; for example, study participants with a 0.50 estimated probability of remission would be expected to be in remission 50% of the time. The model estimates from the CO-MED trial were then used to compute individual-level probabilities in SAMS. A receiver operating characteristic plot was used to compare these predictions in SAMS with those in the CO-MED trial. An interactive calculator was developed using the Shiny package in R (https://shiny.rstudio.com).
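The binning step behind a calibration plot can be sketched in a few lines of Python. This is a minimal illustration that assumes equal-width probability bins; the study used the caret package in R, whose exact binning rules may differ, and the function name here is hypothetical.

```python
def calibration_table(probs, outcomes, n_bins=10):
    """Return (bin midpoint, observed event rate, n) for each non-empty bin.

    probs    -- predicted probabilities in [0, 1]
    outcomes -- observed outcomes (1 or True if the event occurred, else 0 or False)
    Assumption: bins are equal-width intervals of predicted probability.
    """
    if len(probs) != len(outcomes):
        raise ValueError("probs and outcomes must have the same length")
    counts = [0] * n_bins
    events = [0] * n_bins
    for p, y in zip(probs, outcomes):
        # A probability of exactly 1.0 is placed in the last bin.
        idx = min(int(p * n_bins), n_bins - 1)
        counts[idx] += 1
        events[idx] += int(bool(y))
    rows = []
    for i in range(n_bins):
        if counts[i]:
            midpoint = (i + 0.5) / n_bins
            rows.append((midpoint, events[i] / counts[i], counts[i]))
    return rows


# Toy data, not study data: six predictions falling in three bins.
rows = calibration_table([0.05, 0.05, 0.55, 0.55, 0.95, 1.0], [0, 0, 1, 0, 1, 1])
```

Plotting the event rate against the bin midpoint for each row gives the calibration curve; points near the diagonal indicate good calibration.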

The threshold for statistical significance was set at a p value of 0.05. We used SAS, version 9.3 (SAS Institute, Cary, N.C.) for all analyses except the calibration plots, which were plotted in R, version 3.4.3, with the caret package, and the interactive calculator, which was developed with Shiny.

Results

Participants in both the CO-MED trial and SAMS were predominantly female (67.1% and 69.9%), Caucasian (69.2% and 71.2%), and non-Hispanic (84.2% and 90.2%). Participants in these two studies had similar baseline clinical and sociodemographic characteristics, with the exception of higher rates of chronic depression in the CO-MED trial (N=235/431, 54.5%) compared with SAMS (N=37/163, 22.7%) (Table 1). Of the CO-MED trial participants with baseline and week-4 CAST-IRR and QIDS-C scores plus week-8 QIDS-C scores (N=431), 149 (34.6%) attained remission and 93 (21.6%) showed no meaningful benefit at week 8. Of the participants in SAMS with comparable data (N=163), 81 (49.7%) attained remission and 33 (20.3%) showed no meaningful benefit at week 8. Of the total sample in the CO-MED trial (N=665), individuals who were excluded from prediction of acute-phase outcomes (N=234) were younger, had greater depressive symptom severity at baseline, were more likely to be unemployed, were more likely to be African American, and were more likely to report less than a high school level of education (for further details, see Table S1 in the online supplement). Similarly, in SAMS, individuals who were excluded (N=103) were younger (see Table S1 in the online supplement).

TABLE 1. Baseline demographic and clinical characteristics of participants in the Combining Medications to Enhance Depression Outcomes (CO-MED) trial and the Suicide Assessment and Methodology Study (SAMS) for whom complete data were available for depressive symptoms and irritability at baseline and week 4 and outcome data at week 8

Variable                                   CO-MED (N=431)      SAMS (N=163)
                                           N        %          N        %
Sex
 Male                                      142      32.9       49       30.1
 Female                                    289      67.1       114      69.9
Race
 Caucasian                                 298      69.2       116      71.2
 African American                          91       21.1       36       22.1
 Other                                     42       9.7        11       6.7
Educationᵃ
 <12 years                                 50       12.0       21       12.9
 12–15 years                               231      55.5       97       59.5
 ≥15 years                                 135      32.5       45       27.6
Hispanic ethnicity                         68       15.8       16       9.8
Employed at baseline                       227      52.7       99       60.7
Onset of depression before age 18ᵇ         191      44.3       52       32.1
Chronic depression                         235      54.5       37       22.7
Recurrent depressionᶜ                      342      79.4       103      68.2
                                           Mean     SD         Mean     SD
Age (years)                                43.9     12.7       42.6     12.7
Depressive symptom severityᵈ               15.6     3.4        14.6     3.2
Irritability severityᵉ                     17.2     3.8        16.1     4.1

ᵃ Data were missing for 15 participants in the CO-MED trial.

ᵇ Data were missing for one participant in SAMS.

ᶜ Data were missing for 12 participants in SAMS.

ᵈ Symptoms were measured with the Quick Inventory of Depressive Symptomatology–Clinician-Rated.

ᵉ Symptoms were measured with the 5-item irritability domain of the Concise Associated Symptom Tracking scale.

The mean QIDS-C and CAST-IRR scores at week 4 were 9.2 (SD=4.3) and 12.6 (SD=4.6), respectively, in the CO-MED trial (N=431) and 8.6 (SD=4.4) and 12.2 (SD=4.5), respectively, in SAMS (N=163). The mean baseline-to-week-4 reduction in QIDS-C and CAST-IRR scores was 40.2% (SD=26.5) and 25.1% (SD=25.7), respectively, in the CO-MED trial and 40.8% (SD=29.6) and 21.4% (SD=29.2), respectively, in SAMS.

Change in Irritability From Baseline to Week 4

In the CO-MED trial, there was a significant baseline-to-week-4 reduction in CAST-IRR scores (F=271.80, df=3, 1663, p<0.0001; effect size=1.06) (for further details, see Figure S1 in the online supplement). This reduction in CAST-IRR scores remained significant even after controlling for QIDS-C score at each visit (F=26.17, df=3, 1661, p<0.0001; adjusted effect size=0.36). The estimated reduction in CAST-IRR scores from baseline, independent of QIDS-C change, was as follows: −1.21 (SD=0.16; p<0.0001), −1.33 (SD=0.18; p<0.0001), and −1.55 (SD=0.20; p<0.0001) at week 1, week 2, and week 4, respectively.

Change in Irritability at Week 4 as a Predictor of Remission and No Meaningful Benefit at Week 8

In the CO-MED trial, higher baseline-to-week-4 reduction in CAST-IRR scores was independently associated with higher likelihood of attaining remission (χ2=14.60, df=1, p=0.0001) and lower likelihood of no meaningful benefit at week 8 (χ2=4.39, df=1, p=0.036) (Table 2). A one-standard-deviation (25.7%) greater reduction in CAST-IRR score from baseline to week 4 independently predicted a 1.73 times higher likelihood of remission and a 0.72 times lower likelihood of no meaningful benefit. Adding irritability variables to the models significantly improved the reclassification of both remission and no meaningful benefit. The net reclassification improvement for remission and no meaningful benefit was 0.36 (95% CI=0.17, 0.56, p<0.0001) and 0.34 (95% CI=0.12, 0.57, p=0.004), respectively. With the inclusion of irritability variables in the remission model, 13% of remitters were correctly reclassified, and 23% of nonremitters were correctly reclassified. Similarly, with the inclusion of irritability variables in the no meaningful benefit model, 20% of participants with no meaningful benefit were correctly reclassified, whereas 14% with meaningful benefit were correctly reclassified. Sex did not significantly predict either remission (χ2=1.85, df=1, p=0.17) or no meaningful benefit (χ2=0.96, df=1, p=0.33). Similarly, treatment arm did not significantly predict either remission (χ2=1.27, df=2, p=0.53) or no meaningful benefit (χ2=2.10, df=2, p=0.35).
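The per-standard-deviation odds ratios and the net reclassification totals above can be checked with simple arithmetic. The Python sketch below is illustrative: it takes the log-odds coefficients for percent change in CAST-IRR from the individual-level equations reported later in the Results (0.0213 for remission, −0.0130 for no meaningful benefit) and assumes the standard decomposition of the net reclassification improvement into an event component plus a non-event component. Variable and function names are hypothetical.

```python
import math

# Log-odds change per 1% greater baseline-to-week-4 reduction in CAST-IRR,
# taken from the CO-MED model equations reported in the Results.
B_CAST_CHANGE_REMISSION = 0.0213
B_CAST_CHANGE_NO_BENEFIT = -0.0130
SD_CAST_CHANGE = 25.7  # one standard deviation of percent change in CAST-IRR


def odds_ratio_for_difference(beta: float, delta: float) -> float:
    """Odds ratio associated with a difference of `delta` units in a predictor."""
    return math.exp(beta * delta)


or_remission = odds_ratio_for_difference(B_CAST_CHANGE_REMISSION, SD_CAST_CHANGE)
or_no_benefit = odds_ratio_for_difference(B_CAST_CHANGE_NO_BENEFIT, SD_CAST_CHANGE)

# Net reclassification improvement as the sum of its two reported components
# (assumption: overall NRI = event component + non-event component).
nri_remission = 0.13 + 0.23    # remitters + nonremitters correctly reclassified
nri_no_benefit = 0.20 + 0.14   # no meaningful benefit + meaningful benefit

print(round(or_remission, 2), round(or_no_benefit, 2))
print(round(nri_remission, 2), round(nri_no_benefit, 2))
```

Rounded to two decimals, these values agree with the reported odds ratios of 1.73 and 0.72 and the reported net reclassification improvements of 0.36 and 0.34.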

TABLE 2. Prediction of remission and no meaningful benefit in the Combining Medications to Enhance Depression Outcomes (CO-MED) trial among participants for whom complete data were available at baseline and week 4 and outcome data at week 8 (N=431)ᵃ

Variable                                            Remission                  No Meaningful Benefit
                                                    Odds Ratio   95% CI        Odds Ratio   95% CI
Baseline
 QIDS-C                                             0.86         0.80, 0.93    0.99         0.91, 1.08
 CAST-IRR                                           0.95         0.89, 1.02    1.11         1.03, 1.20
One percent greater baseline-to-week-4 reduction
 QIDS-C                                             1.03         1.02, 1.04    0.97         0.96, 0.98
 CAST-IRR                                           1.02         1.01, 1.03    0.987        0.975, 0.999

ᵃ Remission is defined as a score ≤5 on the Quick Inventory of Depressive Symptomatology–Clinician-Rated (QIDS-C) at week 8. No meaningful benefit is defined as a reduction <30% on the QIDS-C from baseline to week 8. CAST-IRR=5-item irritability domain of the Concise Associated Symptom Tracking scale.

Changes in Irritability and Depressive Symptom Severity as Predictors of Remission and No Meaningful Benefit at an Individual Level

In the CO-MED trial, the model containing baseline QIDS-C and CAST-IRR scores and baseline-to-week-4 changes in QIDS-C and CAST-IRR scores had AUC values of 0.79 for remission and 0.76 for no meaningful benefit at week 8 (Figure 1). The individual-level probabilities (p) of remission and no meaningful benefit were calculated from the following equations.

Remission:

$$\frac{p}{1-p} = \exp\bigl[\,0.6554 - 0.1511\,(\text{baseline QIDS-C}) - 0.0520\,(\text{baseline CAST-IRR}) + 0.0301\,(\text{percent change in QIDS-C}) + 0.0213\,(\text{percent change in CAST-IRR})\,\bigr]$$

No meaningful benefit:

$$\frac{p}{1-p} = \exp\bigl[\,-1.6436 - 0.00942\,(\text{baseline QIDS-C}) + 0.1044\,(\text{baseline CAST-IRR}) - 0.0312\,(\text{percent change in QIDS-C}) - 0.0130\,(\text{percent change in CAST-IRR})\,\bigr]$$

Calibration plots (see Figure S2 in the online supplement) showed that the predicted probabilities were well calibrated, aside from the tails.
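To make the two equations above concrete, the following Python sketch converts baseline and week-4 scores into predicted probabilities. It uses only the coefficients reported above and the percent-change definition from the Statistical Analyses section; it is not the authors' Shiny calculator, and the dictionary keys and function names are hypothetical.

```python
import math

# Intercepts and coefficients as reported for the CO-MED models.
REMISSION = {
    "intercept": 0.6554,
    "baseline_qids": -0.1511,
    "baseline_cast": -0.0520,
    "pct_change_qids": 0.0301,
    "pct_change_cast": 0.0213,
}
NO_MEANINGFUL_BENEFIT = {
    "intercept": -1.6436,
    "baseline_qids": -0.00942,
    "baseline_cast": 0.1044,
    "pct_change_qids": -0.0312,
    "pct_change_cast": -0.0130,
}


def pct_change(baseline: float, week4: float) -> float:
    """Percent change as defined in the text: 100 * (baseline - week 4) / baseline."""
    if baseline <= 0:
        raise ValueError("baseline score must be positive")
    return 100.0 * (baseline - week4) / baseline


def predicted_probability(coefs, baseline_qids, week4_qids, baseline_cast, week4_cast):
    """Predicted probability of the week-8 outcome from a logistic model."""
    predictors = {
        "baseline_qids": baseline_qids,
        "baseline_cast": baseline_cast,
        "pct_change_qids": pct_change(baseline_qids, week4_qids),
        "pct_change_cast": pct_change(baseline_cast, week4_cast),
    }
    log_odds = coefs["intercept"] + sum(coefs[k] * v for k, v in predictors.items())
    # Inverse logit: p = 1 / (1 + exp(-log_odds)), equivalent to p/(1-p) = exp(log_odds).
    return 1.0 / (1.0 + math.exp(-log_odds))


# Hypothetical participant: QIDS-C 16 -> 8 and CAST-IRR 17 -> 12 by week 4.
p_remission = predicted_probability(REMISSION, 16, 8, 17, 12)
p_no_benefit = predicted_probability(NO_MEANINGFUL_BENEFIT, 16, 8, 17, 12)
```

Because the two outcomes are modeled separately, the two probabilities need not sum to 1.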

FIGURE 1. Receiver operating characteristic (ROC) curves for remission and no meaningful benefit in the Combining Medications to Enhance Depression Outcomes (CO-MED) trial and the Suicide Assessment and Methodology Study (SAMS)ᵃ

ᵃ Remission is defined as a score ≤5 on the Quick Inventory of Depressive Symptomatology–Clinician-Rated (QIDS-C) at week 8. No meaningful benefit is defined as a reduction <30% on the QIDS-C from baseline to week 8. AUC=area under the curve.

Replication of These Predictions in an Unrelated Sample of Outpatients With Major Depression

In SAMS, individual-level probabilities of remission and no meaningful benefit were obtained by using intercept and beta estimates from the CO-MED trial. In SAMS, the AUC values of remission and no meaningful benefit at week 8 were 0.80 and 0.84, respectively (Figure 1). Using median split (the median baseline-to-week-4 CAST-IRR reduction in the CO-MED trial was 26.1%), participants were grouped by those with baseline-to-week-4 reductions ≥26.1% and <26.1% in CAST-IRR scores in order to visualize the differences in acute-phase treatment outcomes in both the CO-MED trial and SAMS (Figure 2).
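For readers unfamiliar with the AUC, it can be understood as the probability that a randomly chosen participant who had the outcome received a higher predicted probability than a randomly chosen participant who did not. The Python sketch below computes it directly from that definition. The source does not state how the AUC was computed in SAS, so this is an equivalent illustration rather than the study's procedure, and the function name is hypothetical.

```python
def auc_from_pairs(scores, labels):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney form).

    scores -- predicted probabilities (for example, computed in a replication
              sample from coefficients estimated in a separate sample)
    labels -- observed outcomes (1 or True if the event occurred, else 0 or False)
    Ties between an event and a non-event count as half a concordant pair.
    """
    events = [s for s, y in zip(scores, labels) if y]
    non_events = [s for s, y in zip(scores, labels) if not y]
    if not events or not non_events:
        raise ValueError("both outcome classes must be present")
    concordant = 0.0
    for e in events:
        for n in non_events:
            if e > n:
                concordant += 1.0
            elif e == n:
                concordant += 0.5
    return concordant / (len(events) * len(non_events))


# Toy data, not study data: perfectly separated predictions.
toy_auc = auc_from_pairs([0.9, 0.8, 0.3, 0.2], [1, 1, 0, 0])
```

A value of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation of the two groups.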

FIGURE 2. Treatment outcomes in the Combining Medications to Enhance Depression Outcomes (CO-MED) trial (N=431) and the Suicide Assessment and Methodology Study (SAMS) (N=163) using baseline-to-week-4 reduction in irritabilityᵃ

ᵃ Irritability was measured with the 5-item irritability domain of the Concise Associated Symptom Tracking scale (CAST-IRR). Remission is defined as a score ≤5 on the Quick Inventory of Depressive Symptomatology–Clinician-Rated (QIDS-C) at week 8. No meaningful benefit is defined as a reduction <30% on the QIDS-C from baseline to week 8. The threshold of 26.1% was based on the median percent change on the CAST-IRR from baseline to week 4 in the CO-MED trial.

To allow estimation of individual-level probabilities, the intercepts and beta estimates from the remission and no meaningful benefit models in the CO-MED trial were incorporated in an interactive web-based calculator that could be deployed on a server for universal use. Users were able to specify the QIDS-C and CAST-IRR values at baseline and week 4, view where these individual values lay according to the distributions in the CO-MED trial, and obtain estimated probabilities of remission and no meaningful benefit at week 8 (Figure 3).

FIGURE 3. An individual-level calculator of remission and no meaningful benefit using estimates from the Combining Medications to Enhance Depression Outcomes (CO-MED) trialᵃ

ᵃ Remission is defined as a score ≤5 on the Quick Inventory of Depressive Symptomatology–Clinician-Rated (QIDS-C) at week 8. No meaningful benefit is defined as a reduction <30% on the QIDS-C from baseline to week 8. Irritability was measured with the 5-item irritability domain of the Concise Associated Symptom Tracking scale. The shaded areas in the individual panels represent observed values from the CO-MED trial (N=431); the vertical dotted lines represent the values that were entered using the interactive calculator.

Discussion

Irritability improved early with antidepressant treatment and predicted acute-phase treatment outcomes (remission and no meaningful benefit) independently in a large, ecologically valid sample of treatment-seeking outpatients with major depression. Furthermore, baseline-to-week-4 changes in irritability and depressive symptom severity were combined to estimate individual-level outcomes with high accuracy and were replicated in an unrelated sample. Improvement in irritability, seen as early as week 1, was not completely accounted for by reduction in depressive symptoms. Greater baseline-to-week-4 reduction in irritability was associated with higher likelihood of remission and lower likelihood of no meaningful benefit, even after controlling for baseline-to-week-4 change in depressive symptom severity and baseline levels of depressive symptom severity and irritability.

Improvement in irritability in this study is consistent with that in previous reports of reduced anger or hostility with antidepressant treatment (13, 14). These findings also add to previous findings that improvement with antidepressant treatment extends beyond changes in core depressive symptoms (4, 6, 7, 30). These studies, taken together, highlight the limitations of the current criteria for major depressive disorder and argue for expansion of assessments beyond the nine core diagnostic assessments. Higher likelihood of remission with greater reduction in irritability is consistent with a previous report of higher likelihood of remission with early (by week 2) improvement in anger or hostility (14). The findings that participants excluded from the prediction models were more likely to be younger, African American, and unemployed and to have greater symptom severity at treatment initiation and less than a high school level of education are consistent with previous reports of attrition from care in the Sequenced Treatment Alternatives to Relieve Depression study (31, 32).

A clinical implication of these findings is that irritability should be assessed during the course of antidepressant treatment. The five-item self-report measure of irritability can be implemented without significantly burdening patients and providers. Clinicians can easily combine early changes in irritability and depressive symptoms to estimate probabilities of remission and no meaningful benefit with the easy-to-use interactive calculator. The outcomes were chosen by design to be clinically actionable (for high likelihood of remission, treatment should be continued; for high likelihood of no meaningful benefit, treatment should be modified). In patients with persistent irritability at week 4 and a high likelihood of no meaningful benefit, clinicians may consider treatment strategies such as augmentation with brexpiprazole (15).

A major strength of this study is the testing and replication of predictive models in two unrelated samples. The large sample size and the recruitment of treatment-seeking outpatients from community practices, with broad inclusion and minimal exclusion criteria, increase the generalizability of the findings.
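Replication of this kind amounts to freezing the model estimated in one sample and scoring the second sample with it, then measuring discrimination in that second sample. A minimal sketch of the AUC computation, using the pairwise (Mann-Whitney) definition, is shown below; the function name and the toy scores and outcomes are invented for illustration and are not study data.

```python
def auc(scores, outcomes):
    """Area under the ROC curve via pairwise comparison.

    scores   -- predicted probabilities from a model frozen in another sample
    outcomes -- observed binary outcomes (1 = event, 0 = no event)
    """
    positives = [s for s, y in zip(scores, outcomes) if y == 1]
    negatives = [s for s, y in zip(scores, outcomes) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one event and one non-event")
    # Count the pairs in which the event case is ranked above the non-event
    # case; ties contribute half a win.
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


if __name__ == "__main__":
    # Toy replication-sample data (invented).
    toy_scores = [0.9, 0.8, 0.4, 0.3, 0.2]
    toy_outcomes = [1, 0, 1, 0, 0]
    print(f"AUC in toy replication sample: {auc(toy_scores, toy_outcomes):.3f}")
```

Because no coefficient is re-estimated in the second sample, an AUC obtained this way reflects how well the original model transfers rather than how well a model can be fitted to the new data.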

There are several limitations to this secondary analysis. The models for remission and no meaningful benefit were tested and replicated in participants for whom a complete data set (baseline, week 4, and week 8) was available, and thus these models may not generalize to individuals who dropped out of care early. Because attrition was differential and nonrandom, methods to account for missing data (such as multiple imputation) have known pitfalls (33) and were not considered appropriate. The number of predictor variables was limited to four, because the objective of this study was to demonstrate the clinical utility of adding irritability to current practices of measuring depressive symptom severity. Inclusion of other clinical and biological markers may further improve the predictive model. However, it is noteworthy that the strength of the predictive ability of the remission and no-meaningful-benefit models in SAMS (AUC values of 0.80 and 0.84, respectively) is comparable to the AUC values reported in other studies, such as the prediction of development of psychosis (AUC=0.79) in patients receiving secondary mental health care (34) and development of bipolar spectrum disorder (AUC=0.76) in at-risk youths (35). By design, all participants in the CO-MED trial and SAMS received a serotonergic antidepressant. Hence, these findings may not extend to individuals treated with nonserotonergic antidepressants, such as bupropion monotherapy. The individual-level calculator is restricted by the choice of the QIDS-C and the CAST-IRR as measures of depression severity and irritability. Further studies are needed to test the validity of this calculator with other measures of depression severity and to evaluate whether the measurement-based care paradigm that incorporates assessments of irritability along with depression severity results in improved treatment outcomes.

Both models are well calibrated for predicted probabilities ≤0.6, and thus any estimates outside this range should come with the understanding that they may not produce outcomes at the anticipated rate. The poorer calibration at higher probabilities may be a result of the fact that only about 20% of the CO-MED trial participants in the no-meaningful-benefit model were assigned probabilities above 0.6, and only 5% of the CO-MED trial participants in the remission model were assigned probabilities above 0.5. More information (including additional predictors and larger samples) is needed to better gauge calibration accuracy in these groups. Calibration plots also have a major limitation: they represent all binned data as if there were equal amounts of data in each bin, which often is not the case.
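The point about unequal bin sizes can be illustrated with a short sketch that tabulates calibration by bin and reports how many participants fall in each bin, so that sparsely populated bins are visible rather than hidden. The function name, the choice of five equal-width bins, and the toy probabilities and outcomes are assumptions made for illustration; they are not the binning or the data used in this study.

```python
def calibration_table(probs, outcomes, n_bins=5):
    """Group predictions into equal-width probability bins.

    Returns one row per bin with the bin count, mean predicted probability,
    and observed event rate (None for empty bins).
    """
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, outcomes):
        # A probability of exactly 1.0 is placed in the last bin.
        idx = min(int(p * n_bins), n_bins - 1)
        bins[idx].append((p, y))

    rows = []
    for i, members in enumerate(bins):
        n = len(members)
        row = {
            "lower": i / n_bins,
            "upper": (i + 1) / n_bins,
            "n": n,
            "mean_predicted": None,
            "observed_rate": None,
        }
        if n > 0:
            row["mean_predicted"] = sum(p for p, _ in members) / n
            row["observed_rate"] = sum(y for _, y in members) / n
        rows.append(row)
    return rows


if __name__ == "__main__":
    # Toy data (invented): most predictions fall in the lower bins.
    toy_probs = [0.10, 0.15, 0.25, 0.30, 0.35, 0.45, 0.50, 0.70]
    toy_outcomes = [0, 0, 1, 0, 1, 0, 1, 1]
    for row in calibration_table(toy_probs, toy_outcomes):
        print(row)
```

In the toy data, the highest populated bin contains a single observation, so its observed rate carries far less information than the rates in the lower bins, although a calibration plot would draw all of them as equally prominent points.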

In conclusion, irritability improves early with antidepressant treatment independently of depressive symptom severity. This early improvement is independently associated with higher likelihood of remission and lower likelihood of no meaningful benefit. The combination of baseline levels and early changes (up to week 4) in irritability and depressive symptom severity can be used to estimate individual-level probabilities of remission and no meaningful benefit. These findings support inclusion of assessments of irritability in measurement-based-care approaches for the treatment of patients with major depression.

From the Center for Depression Research and Clinical Care, UT Southwestern Medical Center, Dallas (Jha, Minhajuddin, South, Trivedi); the Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York (Jha); Duke-National University of Singapore (Rush); the Department of Psychiatry, Duke Medical School, Durham, N.C. (Rush); and Texas Tech University-Health Sciences Center, Permian Basin, Midland-Odessa (Rush).

Presented in part as an abstract at the 2018 World Congress of the International College of Neuropsychopharmacology, Vienna, June 19, 2018.

Send correspondence to Dr. Trivedi.

Supported by NIMH grant N01 MH-90003 to the University of Texas Southwestern Medical Center at Dallas (principal investigators, Drs. Rush and Trivedi) and in part by the Center for Depression Research and Clinical Care (principal investigator, Dr. Trivedi) and the Hersh Foundation. Forest Pharmaceuticals, GlaxoSmithKline, Organon, and Wyeth Pharmaceuticals provided medications for the Combining Medications to Enhance Depression Outcomes trial at no cost.

ClinicalTrials.gov identifiers: NCT00590863 (CO-MED) and NCT00532103 (SAMS).

Dr. Jha has received contract research funding from Acadia Pharmaceuticals and Janssen Research and Development. Dr. Rush has received consulting fees from Akili, Brain Resource, Compass, Curbstone Consultant LLC, Eli Lilly, Emmes Corp, LivaNova, MindLinc, Sunovion, Takeda USA, and Taj Medical; he has received speaking fees from LivaNova and SingHealth; he receives royalties from Guilford Press and the University of Texas Southwestern Medical Center, Dallas; and he is co-inventor on two patents (number 7,795,033: Methods to Predict the Outcome of Treatment with Antidepressant Medication; number 7,906,283: Methods to Identify Patients at Risk of Developing Adverse Events During Treatment with Antidepressant Medication). Dr. Trivedi has served as a consultant for or on the advisory board of Alkermes, Akili Interactive, Allergan Pharmaceuticals, Acadia Pharmaceuticals, Avanir Pharmaceuticals, Brintellix Global, Bristol-Myers Squibb, Caudex, Cerecor, Forest Pharmaceuticals, Global Medical Education, Health Research Associates, Insys, Johnson & Johnson Pharmaceutical Research and Development, Lilly Research Laboratories, Lundbeck Research USA, Medscape, Merck, Mitsubishi Pharma, MSI Methylation Sciences, Navitor, One Carbon Therapeutics, Otsuka America Pharmaceutical, Pamlab, Pfizer, and Takeda Global Research; he has received royalties from Janssen Research and Development and has author agreements with Janssen Asia Pacific and Oxford University Press; and he has received grant support from the Agency for Healthcare Research and Quality, the Cancer Prevention and Research Institute of Texas, Johnson & Johnson, NIDA, the National Institute of Diabetes and Digestive and Kidney Diseases, the National Center for Advancing Translational Sciences, NIMH, and the Patient-Centered Outcomes Research Institute. The other authors report no financial relationships with commercial interests.

The authors thank the clinical staff at each clinical site for their assistance, all of the study participants, and Eric Nestler, M.D., Ph.D., and Carol A. Tamminga, M.D., for administrative support.

References

1 American Psychiatric Association: Diagnostic and Statistical Manual of Mental Disorders, 5th ed. Washington, DC, American Psychiatric Association, 2013

2 Kroenke K, Spitzer RL, Williams JB: The PHQ-9: validity of a brief depression severity measure. J Gen Intern Med 2001; 16:606–613

3 Rush AJ, Trivedi MH, Ibrahim HM, et al.: The 16-item Quick Inventory of Depressive Symptomatology (QIDS), Clinician Rating (QIDS-C), and Self-Report (QIDS-SR): a psychometric evaluation in patients with chronic major depression. Biol Psychiatry 2003; 54:573–583

4 Cohen RM, Greenberg JM, IsHak WW: Incorporating multidimensional patient-reported outcomes of symptom severity, functioning, and quality of life in the Individual Burden of Illness Index for Depression to measure treatment impact and recovery in MDD. JAMA Psychiatry 2013; 70:343–350

5 Jha MK, Greer TL, Grannemann BD, et al.: Early normalization of quality of life predicts later remission in depression: findings from the CO-MED trial. J Affect Disord 2016; 206:17–22

6 Jha MK, Minhajuddin A, Greer TL, et al.: Early improvement in work productivity predicts future clinical course in depressed outpatients: findings from the CO-MED trial. Am J Psychiatry 2016; 173:1196–1204

7 Jha MK, Minhajuddin A, Greer TL, et al.: Early improvement in psychosocial function predicts longer-term symptomatic remission in depressed patients. PLoS One 2016; 11:e0167901

8 Perlis RH, Fava M, Trivedi MH, et al.: Irritability is associated with anxiety and greater severity, but not bipolar spectrum features, in major depressive disorder. Acta Psychiatr Scand 2009; 119:282–289

9 Perlis RH, Fraguas R, Fava M, et al.: Prevalence and clinical correlates of irritability in major depressive disorder: a preliminary report from the Sequenced Treatment Alternatives to Relieve Depression study. J Clin Psychiatry 2005; 66:159–166

10 Judd LL, Schettler PJ, Coryell W, et al.: Overt irritability/anger in unipolar major depressive episodes: past and current characteristics and implications for long-term course. JAMA Psychiatry 2013; 70:1171–1180

11 Fava M, Hwang I, Rush AJ, et al.: The importance of irritability as a symptom of major depressive disorder: results from the National Comorbidity Survey Replication. Mol Psychiatry 2010; 15:856–867

12 Jha MK, Minhajuddin A, South C, et al.: Worsening anxiety, irritability, insomnia, or panic predicts poorer antidepressant treatment outcomes: clinical utility and validation of the Concise Associated Symptom Tracking (CAST) scale. Int J Neuropsychopharmacol 2018; 21:325–332

13 Fisher LB, Fava M, Doros GD, et al.: The role of anger/hostility in treatment-resistant depression: a secondary analysis from the ADAPT-A Study. J Nerv Ment Dis 2015; 203:762–768

14 Farabaugh A, Sonawalla S, Johnson DP, et al.: Early improvements in anxiety, depression, and anger/hostility symptoms and response to antidepressant treatment. Ann Clin Psychiatry 2010; 22:166–171

15 Fava M, Weiller E, Zhang P, et al.: Efficacy of brexpiprazole as adjunctive treatment in major depressive disorder with irritability: post hoc analysis of 2 pivotal clinical studies. J Clin Psychopharmacol 2017; 37:276–278

16 Trivedi MH, Rush AJ, Wisniewski SR, et al.: Evaluation of outcomes with citalopram for depression using measurement-based care in STAR*D: implications for clinical practice. Am J Psychiatry 2006; 163:28–40

17 Guo T, Xiang YT, Xiao L, et al.: Measurement-based care versus standard care for major depression: a randomized controlled trial with blind raters. Am J Psychiatry 2015; 172:1004–1013

18 Practice Guideline for the Treatment of Patients With Major Depressive Disorder, 3rd ed. Washington, DC, American Psychiatric Association, 2010

19 Rush AJ, Kraemer HC, Sackeim HA, et al.: Report by the ACNP Task Force on response and remission in major depressive disorder. Neuropsychopharmacology 2006; 31:1841–1853

20 Trivedi MH, Wisniewski SR, Morris DW, et al.: Concise Associated Symptoms Tracking Scale: a brief self-report and clinician rating of symptoms associated with suicidality. J Clin Psychiatry 2011; 72:765–774

21 Rush AJ, Trivedi MH, Stewart JW, et al.: Combining medications to enhance depression outcomes (CO-MED): acute and long-term outcomes of a single-blind randomized study. Am J Psychiatry 2011; 168:689–701

22 Hamilton M: A rating scale for depression. J Neurol Neurosurg Psychiatry 1960; 23:56–62

23 Trivedi MH, Wisniewski SR, Morris DW, et al.: Concise Health Risk Tracking scale: a brief self-report and clinician rating of suicidal risk. J Clin Psychiatry 2011; 72:757–764

24 Wisniewski SR, Rush AJ, Balasubramani GK, et al.: Self-rated global measure of the frequency, intensity, and burden of side effects. J Psychiatr Pract 2006; 12:71–79

25 Trivedi MH, Rush AJ, Ibrahim HM, et al.: The Inventory of Depressive Symptomatology, Clinician Rating (IDS-C) and Self-Report (IDS-SR), and the Quick Inventory of Depressive Symptomatology, Clinician Rating (QIDS-C) and Self-Report (QIDS-SR) in public sector patients with mood disorders: a psychometric evaluation. Psychol Med 2004; 34:73–82

26 Rush AJ, Bernstein IH, Trivedi MH, et al.: An evaluation of the Quick Inventory of Depressive Symptomatology and the Hamilton Rating Scale for Depression: a Sequenced Treatment Alternatives to Relieve Depression trial report. Biol Psychiatry 2006; 59:493–501

27 Trombello JM, Killian MO, Liao A, et al.: Psychometrics of the Self-Report Concise Associated Symptoms Tracking Scale (CAST-SR): results from the STRIDE (CTN-0037) study. J Clin Psychiatry 2018; 79:79

28 Pencina MJ, D’Agostino RB Sr, D’Agostino RB Jr, et al.: Evaluating the added predictive ability of a new marker: from area under the ROC curve to reclassification and beyond. Stat Med 2008; 27:157–172, discussion 207–212

29 Kuhn M, Johnson K: Applied Predictive Modeling, New York, Springer, 2013

30 Jha MK, Teer RB, Minhajuddin A, et al.: Daily activity level improvement with antidepressant medications predicts long-term clinical outcomes in outpatients with major depressive disorder. Neuropsychiatr Dis Treat 2017; 13:803–813

31 Warden D, Rush AJ, Wisniewski SR, et al.: What predicts attrition in second step medication treatments for depression? a STAR*D report. Int J Neuropsychopharmacol 2009; 12:459–473

32 Warden D, Trivedi MH, Wisniewski SR, et al.: Predictors of attrition during initial (citalopram) treatment for depression: a STAR*D report. Am J Psychiatry 2007; 164:1189–1197

33 Sterne JAC, White IR, Carlin JB, et al.: Multiple imputation for missing data in epidemiological and clinical research: potential and pitfalls. BMJ 2009; 338:b2393

34 Fusar-Poli P, Rutigliano G, Stahl D, et al.: Development and validation of a clinically based risk calculator for the transdiagnostic prediction of psychosis. JAMA Psychiatry 2017; 74:493–500

35 Hafeman DM, Merranko J, Goldstein TR, et al.: Assessment of a person-level risk calculator to predict new-onset bipolar spectrum disorder in youth at familial risk. JAMA Psychiatry 2017; 74:841–847