Irritability and Its Clinical Utility in Major Depressive Disorder: Prediction of Individual-Level Acute-Phase Outcomes Using Early Changes in Irritability and Depression Severity
Abstract
Objective:
The authors evaluated improvement in irritability with antidepressant treatment and its prognostic utility in treatment-seeking adult outpatients with major depressive disorder.
Methods:
Mixed-model analyses were used to assess changes in irritability (as measured with the five-item irritability domain of the Concise Associated Symptom Tracking [CAST-IRR] scale) from baseline to week 4 after controlling for depression severity (as measured with the 16-item Quick Inventory of Depressive Symptomatology–Clinician Rated [QIDS-C]) in the Combining Medications to Enhance Depression Outcomes (CO-MED) trial (N=664). An interactive calculator for remission (QIDS-C score ≤5) and no meaningful benefit (<30% reduction in QIDS-C score from baseline) at week 8 was developed with logistic regression analyses in the CO-MED trial using participants with complete data (N=431) and independently replicated in the Suicide Assessment and Methodology Study (SAMS) (N=163).
Results:
In the CO-MED trial, irritability was significantly reduced (effect size=1.06) from baseline to week 4, and this reduction remained significant after adjusting for QIDS-C change (adjusted effect size=0.36). A one-standard-deviation greater reduction in CAST-IRR score from baseline to week 4 predicted a 1.73 times higher likelihood of remission and a 0.72 times lower likelihood of no meaningful benefit at week 8, independent of baseline QIDS-C and CAST-IRR scores and reduction in QIDS-C score from baseline to week 4. The model estimates for remission (area under the curve [AUC]=0.79) and no meaningful benefit (AUC=0.76) in the CO-MED trial were used to predict remission (AUC=0.80) and no meaningful benefit (AUC=0.84) in SAMS and to develop an interactive calculator.
Conclusions:
Irritability is an important symptom domain of major depressive disorder that is not fully reflected in depressive symptom severity measures. Early reductions in irritability, when combined with changes in depressive symptom severity, provide a robust estimate of likelihood of remission or no meaningful benefit in outpatients with major depression.
Current diagnostic criteria and symptom severity measures rely on the nine core symptoms of major depressive disorder (1–3), yet impairments as a result of major depression and improvements with treatment are not fully captured by these symptom criteria (4–7). Irritability is unique among major depressive disorder-associated symptom domains, because it is considered to be a core diagnostic symptom among adolescents but not adults (1). Yet, it is widely prevalent in adult patients with major depression, with 40%−50% reporting the presence of irritability for more than half the time in their current depressive episode (8–10). The presence of irritability is associated with greater severity of depressive and anxiety symptoms, earlier age at onset, presence of atypical features, and poorer quality of life (8, 9, 11). Irritability is also associated with poorer clinical course. Patients with major depression who report irritability, compared with those without irritability, are more likely to have a chronic course characterized by a greater number of weeks spent in a depressive episode and greater time spent with residual symptoms in between depressive episodes (10). The presence of anger or hostility before treatment initiation and worsening of irritability after initiation of antidepressant medication are both associated with lower acute-phase remission rates (12, 13). Reductions ≥30% in anger or hostility ratings by week 2 of antidepressant treatment are associated with a doubled likelihood of remission among patients with major depression (14). In patients with treatment-resistant major depression, the presence of irritability may help guide the optimal next-step treatment selection, as evidenced by greater symptom improvement with low-dose brexpiprazole in patients with irritability (15).
Despite the wide prevalence, the prognostic utility, and the treatment selection potential, measurement-based care protocols (16, 17) and treatment guidelines (18, 19) for major depressive disorder do not systematically assess or incorporate irritability in clinical decision making. To be clinically useful, changes in irritability should predict longer-term clinical outcomes and reflect improvement beyond what is reflected in measures of overall depressive symptom severity. Previous studies have not evaluated whether changes in irritability with antidepressant treatment were completely accounted for by changes in depressive symptom severity. Furthermore, the prognostic significance of changes in irritability has not been shown to be independent of reduction in depressive symptom severity. Finally, even if changes in irritability predicted longer-term outcomes, currently there are no easy-to-use methods or recommendations to incorporate them in clinical decision making at the individual patient level.
Our aim in this study was to evaluate the clinical utility of adding irritability to the current paradigm of measuring depressive symptom severity during the course of antidepressant treatment. The specific questions we asked, using two samples of convenience, were as follows:
1. Does irritability improve from baseline to week 4 of antidepressant treatment even after accounting for change in depressive symptom severity?
2. Does baseline-to-week-4 change in irritability predict remission and no meaningful benefit (<30% reduction from baseline) at week 8 even after controlling for baseline irritability, baseline depressive symptom severity, and baseline-to-week-4 change in depressive symptom severity?
3. Can baseline-to-week-4 changes in irritability and depressive symptom severity be used to predict remission and no meaningful benefit at an individual level?
4. Do these predictions replicate in a separate, unrelated sample of outpatients with major depression?
Using the first sample of participants (N=664) from the Combining Medications to Enhance Depression Outcomes (CO-MED) trial, baseline-to-week-4 change in irritability was assessed after controlling for change in depressive symptom severity. Prediction of acute-phase (week 8) treatment outcomes (remission and no meaningful benefit) by baseline-to-week-4 change in irritability was tested in models incorporating baseline irritability and depressive symptom severity and baseline-to-week-4 change in depressive symptom severity. The estimates obtained from the CO-MED trial were used to predict individual-level probability of remission and no meaningful benefit in a second sample of participants (N=163) from the Suicide Assessment Methodology Study (SAMS). Week 4 was selected to measure change from baseline in irritability and depressive symptom severity because it is the first critical decision point for assessing response to treatment per measurement-based care protocol (16). Because SAMS was an 8-week study, acute-phase treatment outcomes were ascribed at week 8. Although remission (defined as a score ≤5 on the Quick Inventory of Depressive Symptomatology–Clinician-Rated [QIDS-C] [3]) is the preferred goal of acute-phase antidepressant treatment (19), “no meaningful benefit” (defined as a reduction <30% in QIDS-C score from baseline) was included to facilitate clinical decision making (treatment augmentation, switching, or discontinuation).
Methods
Study Overview and Participants
CO-MED trial.
Our analysis included all CO-MED trial (NCT00590863) participants for whom data from the Concise Associated Symptom Tracking (CAST) scale (20) were available at baseline (N=664). The details of the CO-MED trial, including recruiting sites, inclusion and exclusion criteria, and institutional review board approvals, have been described previously (21). From March 2008 through February 2009, participants from six primary care sites and nine psychiatric care sites were enrolled after written informed consent was obtained (21). Inclusion was restricted to treatment-seeking outpatients with major depression who were 18–75 years old and had nonpsychotic chronic (duration of current episode ≥2 years) or recurrent (current episode ≥2 months) depression and a baseline score ≥16 on the 17-item Hamilton Depression Rating Scale (HAM-D) (22). Exclusionary criteria included a lifetime history of a psychotic disorder, current psychotic symptoms, a history of an eating disorder in the past 2 years, a current primary diagnosis of obsessive-compulsive disorder (OCD), current substance dependence requiring inpatient-level care, an unstable general medical condition, a current psychiatric condition necessitating hospitalization, a history of a seizure disorder or narrow-angle glaucoma, inadequately treated hypothyroidism, and use of contraindicated medications for general medical or psychiatric conditions (antipsychotics, anticonvulsants, mood stabilizers, CNS stimulants, antidepressants, or other medications with potential augmentation properties). Participants were randomly assigned at baseline to one of three treatment arms in a 1:1:1 ratio after stratification by clinical site: escitalopram plus placebo (selective serotonin reuptake inhibitor [SSRI] monotherapy), sustained-release bupropion plus escitalopram (bupropion-SSRI combination), and extended-release venlafaxine plus mirtazapine (venlafaxine-mirtazapine combination). 
Postrandomization visits were conducted at weeks 1, 2, 4, 6, 8, 10, and 12 for the acute phase and weeks 16, 20, 24, and 28 for the continuation phase. As previously reported (21), acute-phase and continuation-phase outcomes (remission or response) did not differ among the three treatment arms.
SAMS.
All SAMS (NCT00532103) participants who completed the QIDS-C and CAST assessments at baseline and at week 4 and the QIDS-C at week 8 (N=163) were included. The details of SAMS, including recruiting sites, inclusion and exclusion criteria, and institutional review board approvals, have been described previously (23). From July 2007 through February 2008, a total of 266 participants from six primary care sites and nine psychiatric care sites were enrolled in SAMS after written informed consent was obtained (23). Inclusion was restricted to 18- to 75-year-old treatment-seeking outpatients with nonpsychotic major depressive disorder and a score ≥14 on the HAM-D. Exclusion criteria included failure of two or more courses of SSRIs in the current episode, current substance use disorder, bipolar disorder, schizophrenia, a primary diagnosis of OCD or an eating disorder, use of prohibited medications (antipsychotics and antiepileptics), and an unstable general medical or psychiatric condition necessitating hospitalization.
Measurement-Based Care
The measurement-based care (16) approach was used in both the CO-MED trial and SAMS to make medication dosage adjustments during the 8-week postbaseline period, using the QIDS-C (3) as a measure of depression severity and the Frequency, Intensity, and Burden of Side Effects Rating Scale (24) as a measure of side effects. Study physicians made dosage increases only if the depression severity was not adequately controlled and the side effect burden was tolerable.
Medications
CO-MED trial.
Participants in all three treatment arms received two types of pills in a single-blind fashion. The study personnel were aware of both pill types, but study participants were aware of only the first pill type. In the SSRI monotherapy arm, escitalopram was started at 10 mg/day, with an increase up to 20 mg/day permitted at week 4; pill placebo was added as the second pill type at week 2. In the bupropion-SSRI treatment arm, sustained-release bupropion was initiated at 150 mg/day and increased to 300 mg/day at week 1; escitalopram was started at 10 mg/day as the second pill type at week 2, and dosage increases of bupropion (up to 200 mg twice daily) and escitalopram (up to 20 mg/day) were permitted from weeks 4 to 8. In the venlafaxine-mirtazapine arm, extended-release venlafaxine was initiated at 37.5 mg/day and titrated to 150 mg/day by week 1; mirtazapine at 15 mg/day was added as the second pill type at week 2, and dosage increases of venlafaxine (up to 300 mg/day) and mirtazapine (up to 45 mg/day) were permitted from weeks 4 to 8.
SAMS.
Participants were treated with an SSRI (escitalopram, citalopram, sertraline, paroxetine, controlled-release paroxetine, or fluoxetine) in an open-label fashion, with choice of an antidepressant at the individual participant’s discretion (20). Escitalopram was initiated at 10 mg/day, and an increase up to 20 mg/day was permitted from weeks 4 to 6. Citalopram was initiated at 20 mg/day, and an increase up to 40 mg/day was permitted from weeks 4 to 6. Sertraline was initiated at 50 mg/day and increased to 100 mg/day at week 2; further increases up to 150 mg/day were permitted from weeks 4 to 6. Paroxetine was initiated at 20 mg/day, and an increase up to 40 mg/day was permitted from weeks 4 to 6. Controlled-release paroxetine was initiated at 25 mg/day, and an increase up to 37.5 mg/day was permitted from weeks 4 to 6. Fluoxetine was initiated at 20 mg/day, and an increase up to 40 mg/day was permitted from weeks 4 to 6.
Assessments
QIDS-C.
The 16 items of the QIDS-C are based on the nine symptom criteria domains of major depressive disorder. Each item is scored from 0 to 3, and total score ranges from 0 to 27 (3, 25). The QIDS-C correlates highly with the 17-item HAM-D (r=0.93) and has high inter-item correlations (Cronbach’s alpha=0.85) (25, 26). Remission was defined as a score ≤5 on the QIDS-C at week 8. No meaningful benefit was defined as a reduction <30% from baseline to week 8 on the QIDS-C.
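The two outcome definitions above can be written as simple rules. The following is a minimal sketch, not the study's code; the function names are hypothetical, and the percent reduction is assumed to follow the same form used elsewhere in this article, 100×(baseline − follow-up)/baseline.

```python
def percent_reduction(baseline: float, followup: float) -> float:
    """Percent reduction from baseline: 100 * (baseline - followup) / baseline."""
    if baseline <= 0:
        raise ValueError("baseline score must be positive")
    return 100.0 * (baseline - followup) / baseline


def is_remission(qids_week8: float) -> bool:
    """Remission: QIDS-C score of 5 or less at week 8."""
    return qids_week8 <= 5


def is_no_meaningful_benefit(qids_baseline: float, qids_week8: float) -> bool:
    """No meaningful benefit: less than 30% reduction in QIDS-C from baseline."""
    return percent_reduction(qids_baseline, qids_week8) < 30.0
```

For a hypothetical participant with a baseline QIDS-C score of 16 and a week-8 score of 12, the reduction is 25%, so the participant would be classified as showing no meaningful benefit and as not in remission.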
CAST Self-Report.
The 16 items of the CAST assess symptoms across five domains in which each individual item is rated on a 5-point Likert scale (from 1, “strongly disagree,” to 5, “strongly agree”): anxiety (three items), irritability (five items categorized as the CAST irritability domain subscale [CAST-IRR]), mania (four items), insomnia (two items), and panic (two items) (20). Items included in the CAST-IRR are as follows: “I wish people would just leave me alone”; “I feel very uptight”; “I find myself saying or doing things without thinking”; “Lately everything seems to be annoying me”; and “I find people get on my nerves easily.” The factor structure of the CAST was reported initially by Trivedi et al. (20) and validated by Jha et al. (12) and Trombello et al. (27). The CAST-IRR has been shown to have significant correlations with the Impulsivity Rating Scale (r=0.39), the Beck Anxiety Inventory (r=0.42), and the irritability item of the Clinician-Administered Rating Scale for Mania (r=0.30) (20) as well as small to moderate correlations with comorbid psychiatric disorders (Spearman’s correlation coefficient range, 0.07–0.29) as measured by a self-report psychiatric diagnostic screening questionnaire (12). In the present study, the Pearson correlation coefficient between the QIDS-C and the CAST-IRR was 0.35 (p<0.001) at baseline.
Statistical Analyses
The analytic sample for changes in the CAST-IRR included all CO-MED trial participants with CAST-IRR data available at baseline (N=664). Of these participants, those who had QIDS-C and CAST-IRR data at week 4 and QIDS-C data at week 8 were included in the analytic sample for prediction of acute-phase treatment outcomes (N=431). The analytic sample for replication of these predictions in SAMS included participants with QIDS-C and CAST-IRR data at baseline and week 4 and QIDS-C data at week 8 (N=163). In the CO-MED trial, repeated-measures mixed-model analyses tested baseline-to-week-4 changes in CAST-IRR score, with the visit as the within-subject variable and all other variables as between-subject variables, before and after controlling for QIDS-C score at each visit. Also in the CO-MED trial, two separate logistic regression analyses predicted remission and no meaningful benefit at week 8, with baseline QIDS-C score, baseline CAST-IRR score, percent change in QIDS-C score (100×[baseline QIDS-C score − week-4 QIDS-C score]/baseline QIDS-C score), and percent change in CAST-IRR score (100×[baseline CAST-IRR score − week-4 CAST-IRR score]/baseline CAST-IRR score) as predictor variables using the following equation:
$$\frac{p}{1-p} = \exp\left(b_0 + \sum_{i=1}^{4} b_i x_i\right)$$

where p is the probability of the outcome variable (remission or no meaningful benefit at week 8), b_0 is the intercept, x_i is the ith predictor, and b_i is the regression parameter for the ith predictor. Model performance in predicting remission and no meaningful benefit after adding CAST-IRR variables was assessed using net reclassification improvement analyses (28). The logistic regression analyses described above were repeated with sex and treatment arms (SSRI monotherapy, bupropion-SSRI, and venlafaxine-mirtazapine) separately to test whether sex and treatment significantly affected the outcomes. A receiver operating characteristic curve was plotted to obtain the area under the curve (AUC), and calibration plots (29) were used to evaluate the agreement between predicted probabilities and observed outcomes. To generate calibration plots, the predicted probabilities were divided into 10 bins, the number of participants in each bin who experienced the outcome was counted, and the observed event rate was computed for each bin. These event rates were then plotted against the midpoint of each bin. If a model is well calibrated, the predicted probabilities and observed event rates fall on a 45-degree line; for example, study subjects with a 0.50 estimated probability of remission would be expected to be in remission 50% of the time. The model estimates from the CO-MED trial were then used to compute individual-level probabilities in SAMS. A receiver operating characteristic plot was used to compare these predictions in SAMS with those in the CO-MED trial. An interactive calculator was developed using the Shiny package in R (https://shiny.rstudio.com).
The threshold for statistical significance was set at a p value of 0.05. We used SAS, version 9.3 (SAS Institute, Cary, N.C.) for all analyses except the calibration plots, which were plotted in R, version 3.4.3, with the caret package, and the interactive calculator, which was developed with Shiny.
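The binning step behind the calibration plots can be illustrated as follows. This is a hedged sketch in plain Python rather than the R caret routine used in the study; the function name `calibration_bins` and the choice of equal-width bins on the predicted probability are assumptions made for illustration.

```python
def calibration_bins(predicted, observed, n_bins=10):
    """Group predicted probabilities into equal-width bins.

    predicted: probabilities in [0, 1]
    observed:  0/1 outcomes (1 = event occurred)
    Returns one (bin_midpoint, observed_event_rate, count) tuple
    for each non-empty bin.
    """
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    counts = [0] * n_bins
    events = [0] * n_bins
    for p, y in zip(predicted, observed):
        # A probability of exactly 1.0 is placed in the last bin.
        idx = min(int(p * n_bins), n_bins - 1)
        counts[idx] += 1
        events[idx] += y
    rows = []
    for i in range(n_bins):
        if counts[i] > 0:
            midpoint = (i + 0.5) / n_bins
            rows.append((midpoint, events[i] / counts[i], counts[i]))
    return rows
```

Plotting the observed event rate against the bin midpoint gives the calibration curve. The count is returned alongside each rate because a plot of rates alone treats every bin as if it held the same amount of data.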
Results
Participants in both the CO-MED trial and SAMS were predominantly female (67.1% and 69.9%), Caucasian (69.2% and 71.2%), and non-Hispanic (84.2% and 90.2%). Participants in these two studies had similar baseline clinical and sociodemographic characteristics, with the exception of higher rates of chronic depression in the CO-MED trial (N=235/431, 54.5%) compared with SAMS (N=37/163, 22.7%) (Table 1). Of the CO-MED trial participants with baseline and week-4 CAST-IRR and QIDS-C scores plus week-8 QIDS-C scores (N=431), 149 (34.6%) attained remission and 93 (21.6%) showed no meaningful benefit at week 8. Of the participants in SAMS with comparable data (N=163), 81 (49.7%) attained remission and 33 (20.3%) showed no meaningful benefit at week 8. Of the total sample in the CO-MED trial (N=665), individuals who were excluded from prediction of acute-phase outcomes (N=234) were younger, had greater depressive symptom severity at baseline, were more likely to be unemployed, were more likely to be African American, and reported less than a high school level of education (for further details, see Table S1 in the online supplement). Similarly, in SAMS, individuals who were excluded (N=103) were younger (see Table S1 in the online supplement).
| Variable | CO-MED (N=431) | | SAMS (N=163) | |
|---|---|---|---|---|
| | N | % | N | % |
| Sex | | | | |
| Male | 142 | 32.9 | 49 | 30.1 |
| Female | 289 | 67.1 | 114 | 69.9 |
| Race | | | | |
| Caucasian | 298 | 69.2 | 116 | 71.2 |
| African American | 91 | 21.1 | 36 | 22.1 |
| Other | 42 | 9.7 | 11 | 6.7 |
| Education^a | | | | |
| <12 years | 50 | 12.0 | 21 | 12.9 |
| 12–15 years | 231 | 55.5 | 97 | 59.5 |
| ≥15 years | 135 | 32.5 | 45 | 27.6 |
| Hispanic ethnicity | 68 | 15.8 | 16 | 9.8 |
| Employed at baseline | 227 | 52.7 | 99 | 60.7 |
| Onset of depression before age 18^b | 191 | 44.3 | 52 | 32.1 |
| Chronic depression | 235 | 54.5 | 37 | 22.7 |
| Recurrent depression^c | 342 | 79.4 | 103 | 68.2 |
| | Mean | SD | Mean | SD |
| Age (years) | 43.9 | 12.7 | 42.6 | 12.7 |
| Depressive symptom severity^d | 15.6 | 3.4 | 14.6 | 3.2 |
| Irritability severity^e | 17.2 | 3.8 | 16.1 | 4.1 |
Table 1. Baseline demographic and clinical characteristics of participants in the Combining Medications to Enhance Depression Outcomes (CO-MED) trial and the Suicide Assessment and Methodology Study (SAMS) for whom complete data were available for depressive symptoms and irritability at baseline and week 4 and outcome data at week 8
The mean QIDS-C and CAST-IRR scores at week 4 were 9.2 (SD=4.3) and 12.6 (SD=4.6), respectively, in the CO-MED trial (N=431) and 8.6 (SD=4.4) and 12.2 (SD=4.5), respectively, in SAMS (N=163). The mean baseline-to-week-4 reduction in QIDS-C and CAST-IRR scores was 40.2% (SD=26.5) and 25.1% (SD=25.7), respectively, in the CO-MED trial and 40.8% (SD=29.6) and 21.4% (SD=29.2), respectively, in SAMS.
Change in Irritability From Baseline to Week 4
In the CO-MED trial, there was a significant baseline-to-week-4 reduction in CAST-IRR scores (F=271.80, df=3, 1663, p<0.0001; effect size=1.06) (for further details, see Figure S1 in the online supplement). This reduction in CAST-IRR scores remained significant even after controlling for QIDS-C score at each visit (F=26.17, df=3, 1661, p<0.0001; adjusted effect size=0.36). The estimated reduction in CAST-IRR scores from baseline, independent of QIDS-C change, was as follows: −1.21 (SD=0.16; p<0.0001), −1.33 (SD=0.18; p<0.0001), and −1.55 (SD=0.20; p<0.0001) at week 1, week 2, and week 4, respectively.
Change in Irritability at Week 4 as a Predictor of Remission and No Meaningful Benefit at Week 8
In the CO-MED trial, higher baseline-to-week-4 reduction in CAST-IRR scores was independently associated with higher likelihood of attaining remission (χ2=14.60, df=1, p=0.0001) and lower likelihood of no meaningful benefit at week 8 (χ2=4.39, df=1, p=0.036) (Table 2). A one-standard-deviation (25.7%) greater reduction in CAST-IRR score from baseline to week 4 independently predicted a 1.73 times higher likelihood of remission and a 0.72 times lower likelihood of no meaningful benefit. Adding irritability variables to the models significantly improved the reclassification of both remission and no meaningful benefit. The net reclassification improvement for remission and no meaningful benefit was 0.36 (95% CI=0.17, 0.56, p<0.0001) and 0.34 (95% CI=0.12, 0.57, p=0.004), respectively. With the inclusion of irritability variables in the remission model, 13% of remitters were correctly reclassified, and 23% of nonremitters were correctly reclassified. Similarly, with the inclusion of irritability variables in the no meaningful benefit model, 20% of participants with no meaningful benefit were correctly reclassified, whereas 14% with meaningful benefit were correctly reclassified. Sex did not significantly predict either remission (χ2=1.85, df=1, p=0.17) or no meaningful benefit (χ2=0.96, df=1, p=0.33). Similarly, treatment arm did not significantly predict either remission (χ2=1.27, df=2, p=0.53) or no meaningful benefit (χ2=2.10, df=2, p=0.35).
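The per-standard-deviation figures can be recovered from the per-1% regression coefficients. The sketch below assumes the CAST-IRR coefficients reported in this article's model equations (0.0213 for remission and −0.0130 for no meaningful benefit, on the log-odds scale per 1% reduction) and the standard deviation of 25.7%; the function and constant names are hypothetical. Strictly, the resulting values are odds ratios.

```python
import math


def odds_ratio_for_change(beta_per_unit: float, units: float) -> float:
    """Odds ratio for a change of `units` in a predictor whose
    logistic regression coefficient is `beta_per_unit`."""
    return math.exp(beta_per_unit * units)


# Values taken from the text: SD of percent reduction in CAST-IRR and
# the CAST-IRR percent-change coefficients of the two models.
SD_CAST_IRR_REDUCTION = 25.7
BETA_REMISSION = 0.0213
BETA_NO_MEANINGFUL_BENEFIT = -0.0130

or_remission = odds_ratio_for_change(BETA_REMISSION, SD_CAST_IRR_REDUCTION)
or_no_benefit = odds_ratio_for_change(BETA_NO_MEANINGFUL_BENEFIT, SD_CAST_IRR_REDUCTION)

print(round(or_remission, 2), round(or_no_benefit, 2))  # 1.73 0.72
```

Because the model is linear on the log-odds scale, an effect per 1% scales to any other increment by multiplying the coefficient before exponentiating, not by multiplying the odds ratio itself.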
| Variable | Remission: Odds Ratio | Remission: 95% CI | No Meaningful Benefit: Odds Ratio | No Meaningful Benefit: 95% CI |
|---|---|---|---|---|
| Baseline | | | | |
| QIDS-C | 0.86 | 0.80, 0.93 | 0.99 | 0.91, 1.08 |
| CAST-IRR | 0.95 | 0.89, 1.02 | 1.11 | 1.03, 1.20 |
| One percent greater baseline-to-week-4 reduction | | | | |
| QIDS-C | 1.03 | 1.02, 1.04 | 0.97 | 0.96, 0.98 |
| CAST-IRR | 1.02 | 1.01, 1.03 | 0.987 | 0.975, 0.999 |
Changes in Irritability and Depressive Symptom Severity as Predictors of Remission and No Meaningful Benefit at an Individual Level
In the CO-MED trial, the model containing baseline QIDS-C and CAST-IRR scores and baseline-to-week-4 changes in QIDS-C and CAST-IRR scores had AUC values of 0.79 for remission and 0.76 for no meaningful benefit at week 8 (Figure 1). The individual-level probabilities of remission and no meaningful benefit were calculated with the following equations, where p is the probability of the outcome.

Remission:

$$\frac{p}{1-p} = \exp\left[0.6554 - 0.1511\,(\text{baseline QIDS-C}) - 0.0520\,(\text{baseline CAST-IRR}) + 0.0301\,(\text{percent change in QIDS-C}) + 0.0213\,(\text{percent change in CAST-IRR})\right]$$

No meaningful benefit:

$$\frac{p}{1-p} = \exp\left[-1.6436 - 0.00942\,(\text{baseline QIDS-C}) + 0.1044\,(\text{baseline CAST-IRR}) - 0.0312\,(\text{percent change in QIDS-C}) - 0.0130\,(\text{percent change in CAST-IRR})\right]$$

Calibration plots (see Figure S2 in the online supplement) showed that the predicted probabilities were well calibrated, aside from the tails.
Replication of These Predictions in an Unrelated Sample of Outpatients With Major Depression
In SAMS, individual-level probabilities of remission and no meaningful benefit were obtained by using intercept and beta estimates from the CO-MED trial. In SAMS, the AUC values of remission and no meaningful benefit at week 8 were 0.80 and 0.84, respectively (Figure 1). Using median split (the median baseline-to-week-4 CAST-IRR reduction in the CO-MED trial was 26.1%), participants were grouped by those with baseline-to-week-4 reductions ≥26.1% and <26.1% in CAST-IRR scores in order to visualize the differences in acute-phase treatment outcomes in both the CO-MED trial and SAMS (Figure 2).
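For readers unfamiliar with the AUC, it can be read as the probability that a randomly chosen participant who had the outcome was assigned a higher predicted probability than a randomly chosen participant who did not. The following is an illustrative sketch of that pairwise definition in plain Python, not the SAS procedure used in the study; the function name `auc` is an assumption.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    scores: predicted probabilities
    labels: 0/1 outcomes (1 = event occurred)
    Ties between an event and a non-event count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one event and one non-event")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))
```

Under this reading, an AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of participants with and without the outcome.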
To allow estimation of individual-level probabilities, the intercepts and beta estimates from the remission and no meaningful benefit models in the CO-MED trial were incorporated in an interactive web-based calculator that could be deployed on a server for universal use. Users were able to specify the QIDS-C and CAST-IRR values at baseline and week 4, view where these individual values lay according to the distributions in the CO-MED trial, and obtain estimated probabilities of remission and no meaningful benefit at week 8 (Figure 3).
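The core computation of such a calculator can be sketched as follows. This is an illustrative Python version under stated assumptions, not the Shiny application itself: the coefficients are the CO-MED estimates reported in the equations above, whereas the dictionary layout and function names are hypothetical.

```python
import math

# CO-MED model estimates as reported in the text (log-odds scale).
REMISSION_COEF = {
    "intercept": 0.6554, "qids_base": -0.1511, "cast_base": -0.0520,
    "qids_pct": 0.0301, "cast_pct": 0.0213,
}
NO_BENEFIT_COEF = {
    "intercept": -1.6436, "qids_base": -0.00942, "cast_base": 0.1044,
    "qids_pct": -0.0312, "cast_pct": -0.0130,
}


def percent_reduction(baseline: float, week4: float) -> float:
    """100 * (baseline - week 4) / baseline, as defined in the Methods."""
    if baseline <= 0:
        raise ValueError("baseline score must be positive")
    return 100.0 * (baseline - week4) / baseline


def predicted_probability(coef, qids_base, cast_base, qids_wk4, cast_wk4):
    """Convert the model's log-odds into a probability."""
    log_odds = (
        coef["intercept"]
        + coef["qids_base"] * qids_base
        + coef["cast_base"] * cast_base
        + coef["qids_pct"] * percent_reduction(qids_base, qids_wk4)
        + coef["cast_pct"] * percent_reduction(cast_base, cast_wk4)
    )
    return 1.0 / (1.0 + math.exp(-log_odds))


# Hypothetical patient: QIDS-C 16 -> 8 and CAST-IRR 17 -> 12 by week 4.
p_remission = predicted_probability(REMISSION_COEF, 16, 17, 8, 12)
p_no_benefit = predicted_probability(NO_BENEFIT_COEF, 16, 17, 8, 12)
print(f"remission: {p_remission:.2f}, no meaningful benefit: {p_no_benefit:.2f}")
```

For this hypothetical patient, the estimated probabilities are about 0.37 for remission and about 0.12 for no meaningful benefit at week 8.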
Discussion
Irritability improved early with antidepressant treatment and predicted acute-phase treatment outcomes (remission and no meaningful benefit) independently in a large, ecologically valid sample of treatment-seeking outpatients with major depression. Furthermore, baseline-to-week-4 changes in irritability and depressive symptom severity were combined to estimate individual-level outcomes with high accuracy and were replicated in an unrelated sample. Improvement in irritability, seen as early as week 1, was not completely accounted for by reduction in depressive symptoms. Greater baseline-to-week-4 reduction in irritability was associated with higher likelihood of remission and lower likelihood of no meaningful benefit, even after controlling for baseline-to-week-4 change in depressive symptom severity and baseline levels of depressive symptom severity and irritability.
Improvement in irritability in this study is consistent with that in previous reports of reduced anger or hostility with antidepressant treatment (13, 14). These findings also add to previous findings that improvement with antidepressant treatment extends beyond changes in core depressive symptoms (4, 6, 7, 30). These studies, taken together, highlight the limitations of the current criteria for major depressive disorder and argue for expansion of assessments beyond the nine core diagnostic assessments. Higher likelihood of remission with greater reduction in irritability is consistent with a previous report of higher likelihood of remission with early (by week 2) improvement in anger or hostility (14). The findings that participants excluded from the prediction models were more likely to be younger, African American, and unemployed and to have greater symptom severity at treatment initiation and less than a high school level of education are consistent with previous reports of attrition from care in the Sequenced Treatment Alternatives to Relieve Depression study (31, 32).
A clinical implication of these findings is that irritability should be assessed during the course of antidepressant treatment. The five-item self-report measure of irritability can be implemented without significantly burdening patients and providers. Clinicians can easily combine early changes in irritability and depressive symptoms to estimate probabilities of remission and no meaningful benefit with the easy-to-use interactive calculator. The outcomes were chosen by design to be clinically actionable (for high likelihood of remission, treatment should be continued; for high likelihood of no meaningful benefit, treatment should be modified). In patients with persistent irritability at week 4 and a high likelihood of no meaningful benefit, clinicians may consider treatment strategies such as augmentation with brexpiprazole (15).
A major strength of this study is the testing and replication of predictive models in two unrelated samples. The large sample size and the recruitment of treatment-seeking outpatients from community practices, with broad inclusion and minimal exclusion criteria, increase the generalizability of the findings.
There are several limitations to the secondary analysis. The models for remission and no meaningful benefit were tested and replicated in participants for whom a complete data set (baseline, week 4, and week 8) was available, and thus these may not generalize to individuals who dropped out of care early. As a result of the nonrandom pattern of differential attrition, the use of methods to account for missing data (such as multiple imputation) has known pitfalls (33), and thus use of these methods was not considered appropriate. The number of predictor variables was limited to four, because the objective of this study was to demonstrate the clinical utility of adding irritability to current practices of measuring depressive symptom severity. Inclusion of other clinical and biological markers may further improve the predictive model. However, it is noteworthy that the strength of the predictive ability of the remission and no-meaningful-benefit models in SAMS (AUC values of 0.80 and 0.84, respectively) is comparable to the AUC values reported in other studies, such as the prediction of development of psychosis (AUC=0.79) in patients receiving secondary mental health care (34) and development of bipolar spectrum disorder (AUC=0.76) in at-risk youths (35). By design, all participants in the CO-MED trial and SAMS received a serotonergic antidepressant. Hence, these findings may not extend to individuals treated with nonserotonergic antidepressants, such as bupropion monotherapy. The individual-level calculator is restricted by the choice of the QIDS-C and the CAST-IRR as measures of depression severity and irritability. Further studies are needed to test the validity of this calculator with other measures of depression severity and to evaluate whether the measurement-based care paradigm that incorporates assessments of irritability along with depression severity results in improved treatment outcomes. 
Both models are well calibrated for predicted probabilities ≤0.6, and thus any estimates above this range should come with the understanding that they may not produce outcomes at the anticipated rate. This poorer calibration at higher probabilities may be a result of the fact that only about 20% of the CO-MED trial participants in the no-meaningful-benefit model were assigned probabilities above 0.6, and only 5% of the CO-MED trial participants in the remission model were assigned probabilities above 0.5. More information (including additional predictors and larger samples) is needed to better gauge calibration accuracy in these groups. Calibration plots also have a major limitation: they represent all binned data as if there were equal amounts of data in each bin, which often is not the case.
In conclusion, irritability improves early with antidepressant treatment independently of depressive symptom severity. This early improvement is independently associated with higher likelihood of remission and lower likelihood of no meaningful benefit. The combinations of baseline and early changes (up to week 4) in irritability and depressive symptom severity can estimate individual-level probabilities of remission and no meaningful benefit. These findings support inclusion of assessments of irritability in measurement-based-care approaches for the treatment of patients with major depression.
1. Diagnostic and Statistical Manual of Mental Disorders, 5th ed. Washington, DC, American Psychiatric Association, 2013
2. The PHQ-9: validity of a brief depression severity measure. J Gen Intern Med 2001; 16:606–613
3. The 16-item Quick Inventory of Depressive Symptomatology (QIDS), Clinician Rating (QIDS-C), and Self-Report (QIDS-SR): a psychometric evaluation in patients with chronic major depression. Biol Psychiatry 2003; 54:573–583
4. Incorporating multidimensional patient-reported outcomes of symptom severity, functioning, and quality of life in the Individual Burden of Illness Index for Depression to measure treatment impact and recovery in MDD. JAMA Psychiatry 2013; 70:343–350
5. Early normalization of quality of life predicts later remission in depression: findings from the CO-MED trial. J Affect Disord 2016; 206:17–22
6. Early improvement in work productivity predicts future clinical course in depressed outpatients: findings from the CO-MED trial. Am J Psychiatry 2016; 173:1196–1204
7. Early improvement in psychosocial function predicts longer-term symptomatic remission in depressed patients. PLoS One 2016; 11:e0167901
8. Irritability is associated with anxiety and greater severity, but not bipolar spectrum features, in major depressive disorder. Acta Psychiatr Scand 2009; 119:282–289
9. Prevalence and clinical correlates of irritability in major depressive disorder: a preliminary report from the Sequenced Treatment Alternatives to Relieve Depression study. J Clin Psychiatry 2005; 66:159–166
10. Overt irritability/anger in unipolar major depressive episodes: past and current characteristics and implications for long-term course. JAMA Psychiatry 2013; 70:1171–1180
11. The importance of irritability as a symptom of major depressive disorder: results from the National Comorbidity Survey Replication. Mol Psychiatry 2010; 15:856–867
12. Worsening Anxiety, Irritability, Insomnia, or Panic Predicts Poorer Antidepressant Treatment Outcomes: Clinical Utility and Validation of the Concise Associated Symptom Tracking (CAST) scale. Int J Neuropsychopharmacol 2018; 21:325–332
13. The role of anger/hostility in treatment-resistant depression: a secondary analysis from the ADAPT-A Study. J Nerv Ment Dis 2015; 203:762–768
14. Early improvements in anxiety, depression, and anger/hostility symptoms and response to antidepressant treatment. Ann Clin Psychiatry 2010; 22:166–171
15. Efficacy of brexpiprazole as adjunctive treatment in major depressive disorder with irritability: post hoc analysis of 2 pivotal clinical studies. J Clin Psychopharmacol 2017; 37:276–278
16. Evaluation of outcomes with citalopram for depression using measurement-based care in STAR*D: implications for clinical practice. Am J Psychiatry 2006; 163:28–40
17. Measurement-based care versus standard care for major depression: a randomized controlled trial with blind raters. Am J Psychiatry 2015; 172:1004–1013
18. Practice Guideline for the Treatment of Patients With Major Depressive Disorder, 3rd ed. Washington, DC, American Psychiatric Association, 2010
19. Report by the ACNP Task Force on response and remission in major depressive disorder. Neuropsychopharmacology 2006; 31:1841–1853
20. Concise Associated Symptoms Tracking Scale: a brief self-report and clinician rating of symptoms associated with suicidality. J Clin Psychiatry 2011; 72:765–774
21. Combining medications to enhance depression outcomes (CO-MED): acute and long-term outcomes of a single-blind randomized study. Am J Psychiatry 2011; 168:689–701
22. A rating scale for depression. J Neurol Neurosurg Psychiatry 1960; 23:56–62
23. Concise Health Risk Tracking scale: a brief self-report and clinician rating of suicidal risk. J Clin Psychiatry 2011; 72:757–764
24. Self-rated global measure of the frequency, intensity, and burden of side effects. J Psychiatr Pract 2006; 12:71–79
25. The Inventory of Depressive Symptomatology, Clinician Rating (IDS-C) and Self-Report (IDS-SR), and the Quick Inventory of Depressive Symptomatology, Clinician Rating (QIDS-C) and Self-Report (QIDS-SR) in public sector patients with mood disorders: a psychometric evaluation. Psychol Med 2004; 34:73–82
26. An evaluation of the Quick Inventory of Depressive Symptomatology and the Hamilton Rating Scale for Depression: a Sequenced Treatment Alternatives to Relieve Depression trial report. Biol Psychiatry 2006; 59:493–501
27. Psychometrics of the Self-Report Concise Associated Symptoms Tracking Scale (CAST-SR): results from the STRIDE (CTN-0037) study. J Clin Psychiatry 2018; 79:79
28. Evaluating the added predictive ability of a new marker: from area under the ROC curve to reclassification and beyond. Stat Med 2008; 27:157–172, discussion 207–212
29. Applied Predictive Modeling. New York, Springer, 2013
30. Daily activity level improvement with antidepressant medications predicts long-term clinical outcomes in outpatients with major depressive disorder. Neuropsychiatr Dis Treat 2017; 13:803–813
31. What predicts attrition in second step medication treatments for depression? a STAR*D report. Int J Neuropsychopharmacol 2009; 12:459–473
32. Predictors of attrition during initial (citalopram) treatment for depression: a STAR*D report. Am J Psychiatry 2007; 164:1189–1197
33. Multiple imputation for missing data in epidemiological and clinical research: potential and pitfalls. BMJ 2009; 338:b2393
34. Development and validation of a clinically based risk calculator for the transdiagnostic prediction of psychosis. JAMA Psychiatry 2017; 74:493–500
35. Assessment of a person-level risk calculator to predict new-onset bipolar spectrum disorder in youth at familial risk. JAMA Psychiatry 2017; 74:841–847