Continuity of care has often been viewed as a crucial indicator of the quality of care for patients with a serious mental illness (1). However, several recent studies of persons who were enrolled in specialized intensive treatment programs found limited relationships between outpatient continuity-of-care measures and clients' mental health outcomes (2,3,4). In two of these studies, although baseline clinical measures were obtained on entry into a specialized intensive treatment program with follow-up several months after discharge, continuity-of-care measures pertained only to the period of outpatient treatment after discharge. Thus the intensive services received in these programs may have overwhelmed any detectable association between continuity of care and client outcomes in the postdischarge follow-up period.
In the study reported here we examined the relationship between measures of continuity of care and outcomes of persons in a larger population of clients with a variety of diagnoses who were receiving various types of outpatient mental health services from the Veterans Health Administration (VHA) of the Department of Veterans Affairs (VA).
The practical difficulty and cost of collecting outcomes data, whether through standardized surveys or through provider assessments, have hindered research into the correlation between patterns of service delivery and client outcomes in real-world practice. In recognition of both the importance and the potentially high cost of outcomes monitoring, the VHA issued a policy directive in 1999 that required mental health clinicians to record a Global Assessment of Functioning (GAF) score at the conclusion of each episode of inpatient care and required that outpatients be rated with the GAF at least once every 90 days during active treatment (5).
The GAF is a single-item rating with which a treating clinician can evaluate the current global functioning of patients on a scale of 1 to 100 with brief anchors at 10-point intervals; higher scores indicate better functioning. The VHA selected the GAF because it is inexpensive, is practical to administer, and has demonstrated potential to be used reliably (6,7,8). The GAF is well known because it is an integral part of the standard multiaxial diagnostic system described in DSM-IV (9). In addition, Moos and colleagues (10) recently demonstrated that GAF scores collected by VHA clinicians are significantly associated with current symptoms and functioning as measured with standardized instruments, although these scores do not predict future health status or costs.
In an accompanying article in this issue (11), we present evidence of the discriminant validity of the GAF score in the VHA as well as of the usefulness of GAF-derived measures for monitoring changes over time in average facility-level outcomes.
However, substantial concerns about the GAF have been expressed, because the scale uses one item to measure many different functional areas, it excludes physical impairment (12), and it has greater association with psychiatric symptoms than with functional abilities (13,14). In addition, because the GAF is based on subjective professional judgment, it may be biased by a practitioner's knowledge of a patient's diagnoses and treatment setting (inpatient or outpatient).
The difficulty of evaluating the relationship of continuity of care and client outcomes is further complicated by the fact that continuity of care has been used to refer to almost all aspects of mental health service delivery (15). In this study we focused on a narrower definition of continuity of care, represented by three related concepts: regularity of care as indicated by an evenness in the use of the services over time and the absence of a hiatus in care (16,17,18,19); continuity of treatment across organizational boundaries—for example, through the transition from inpatient to outpatient services (20,21,22,23,24); and intensity of treatment—that is, the volume of services received in a specific period. In this study we used national VHA data from fiscal year 2002 to examine the relationship between three types of continuity-of-care measures and change in GAF scores by using multiple regression analysis to adjust for differences in client characteristics. The analyses addressed three separate populations of VHA patients: discharged inpatients, outpatients who were newly entering treatment, and outpatients who were in a continuing episode of care.
GAF ratings were obtained from a national file containing all GAF ratings made by VHA clinicians along with patient identifiers, an indicator of whether the rating was made at the end of an inpatient stay or during an episode of outpatient care, the date the rating was made, and a code documenting the specific facility at which the rating was made. The 0- to 100-point version of the GAF score was used, and ratings were made by primary clinicians who were not systematically trained. The study was reviewed by the institutional review board of the VA Connecticut Health System, and a waiver of informed consent was approved.
GAF ratings were completed as treatment occurred rather than at the beginning of a client's treatment, because many VHA patients have been in and out of treatment for various periods. For outpatients, such an approach has the benefit of preventing clinicians from attempting to "game" the indicator, given that the clinician does not know which particular score will be used as the baseline and which as the follow-up score. Gaming is less preventable for the inpatient measures, because clinicians can identify the baseline assessment, which occurs at discharge.
Data on veterans' sociodemographic and diagnostic characteristics were obtained from the VHA administrative workload files: the Patient Treatment File, the Outpatient Encounter File, and the Outpatient Care File. The Patient Treatment File is a discharge abstract file that contains basic data on all completed episodes of inpatient care. The Outpatient Encounter File and the Outpatient Care File document the date, clinic type, and diagnoses pertaining to each outpatient clinic contact; data from these two files document all VHA outpatient service delivery and were used to construct the continuity-of-care measures described below.
Our analytic sample consisted of three groups of patients: discharged inpatients (those with a GAF rating at the end of an inpatient stay and a subsequent outpatient GAF rating), new outpatients (veterans receiving outpatient services in each fiscal year who did not have any outpatient contacts in the last quarter of the previous fiscal year and thus are assumed to have begun a new episode of outpatient care), and continuing outpatients (those who had at least one outpatient visit in the last quarter of the previous year). For a patient to be included in the sample, the second outpatient GAF rating in each case had to have been made between 90 and 180 days after the initial rating (inpatient or outpatient).
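The distinction between new and continuing outpatients can be sketched as follows. This is a minimal illustration, not the authors' code: the function name and arguments are hypothetical, and it assumes a fiscal year running from October 1 through September 30, so that the last quarter of the previous fiscal year is July 1 through September 30.

```python
from datetime import date


def classify_outpatient(visit_dates, fiscal_year):
    """Label an outpatient as 'new' or 'continuing' for a given fiscal year.

    Assumption: fiscal year N runs from October 1 of year N-1 through
    September 30 of year N, so the last quarter of the previous fiscal
    year is July 1 through September 30 of year N-1.
    """
    q4_start = date(fiscal_year - 1, 7, 1)
    q4_end = date(fiscal_year - 1, 9, 30)
    # Any outpatient contact in that quarter marks a continuing episode of care.
    had_prior_contact = any(q4_start <= d <= q4_end for d in visit_dates)
    return "continuing" if had_prior_contact else "new"
```

For example, a veteran seen on August 15, 2001, and again in January 2002 would be treated as a continuing outpatient for fiscal year 2002, whereas a veteran whose first contact was in November 2001 would be treated as new.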
GAF data that met these conditions were available for 173,435 veterans who received outpatient mental health services in 2002 (50,032 new outpatients and 123,403 continuing outpatients) and 8,350 inpatients. The veterans for whom two GAF ratings were available represent 31 percent of all veterans who had two outpatient mental health contacts between 90 and 180 days apart and 6 percent of the inpatients who had at least one outpatient contact between 90 and 180 days after their inpatient contact. The mean±SD baseline GAF score was 40.7±13.9 for inpatients and 53.6±11.3 for outpatients. The patients in our samples received services at more than 129 different VA medical centers (VAMCs).
GAF change measure. The primary outcome of interest was change in GAF scores, computed as the difference between the initial GAF rating and the last rating that occurred between 90 and 180 days later.
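A minimal sketch of this computation is shown below. The function name and the data layout (a list of date and score pairs for one patient) are hypothetical. The sketch also assumes that the change is taken as the follow-up score minus the baseline score, so that positive values indicate improvement.

```python
from datetime import date


def gaf_change(ratings, min_days=90, max_days=180):
    """Return follow-up GAF minus baseline GAF, or None if no rating qualifies.

    ratings: list of (rating_date, score) tuples for a single patient.
    The earliest rating is taken as the baseline; the follow-up is the
    last rating made between min_days and max_days after the baseline.
    """
    if not ratings:
        return None
    ordered = sorted(ratings)
    base_date, base_score = ordered[0]
    eligible = [
        score
        for rating_date, score in ordered[1:]
        if min_days <= (rating_date - base_date).days <= max_days
    ]
    if not eligible:
        return None  # patient would be excluded from the analytic sample
    return eligible[-1] - base_score
```

A patient rated 45 at baseline and 55 at the last rating inside the 90- to 180-day window would have a change score of 10 under this sign convention.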
Continuity-of-care measures. Regularity of care was measured by the number of months in the six months after the initial assessment in which the veteran had at least one visit (range, zero to six months). Continuity of care across organizational boundaries was examined by a measure indicating whether a veteran discharged from an inpatient psychiatry program received any mental health outpatient treatment during the first 30 days after discharge. Intensity of care was measured as the total number of visits between the initial GAF and the last GAF within 180 days.
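The three measures can be illustrated with the sketch below. All function names are hypothetical. The article does not say how a "month" was delimited, so the sketch assumes consecutive 30-day windows after the initial assessment; calendar months would be an equally plausible reading.

```python
from datetime import date


def regularity_of_care(baseline, visit_dates, months=6, days_per_month=30):
    """Count the 30-day windows after baseline that contain at least one visit."""
    covered = set()
    for visit in visit_dates:
        offset = (visit - baseline).days
        if 0 < offset <= months * days_per_month:
            # Days 1-30 fall in window 0, days 31-60 in window 1, and so on.
            covered.add((offset - 1) // days_per_month)
    return len(covered)


def followup_within_30_days(discharge, visit_dates):
    """True if any outpatient visit occurred in the 30 days after discharge."""
    return any(0 < (visit - discharge).days <= 30 for visit in visit_dates)


def intensity_of_care(baseline, last_gaf_date, visit_dates):
    """Count visits after the initial GAF up to and including the last GAF date."""
    return sum(1 for visit in visit_dates if baseline < visit <= last_gaf_date)
```

Under these assumptions, a veteran with visits on days 9, 19, 63, and 151 after the initial assessment would score 3 on regularity, because the first two visits fall in the same 30-day window.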
Risk adjustment. A major challenge to fair comparison of different levels of continuity of care is that patients are likely to differ on various characteristics that may affect outcomes, such as age, gender, and diagnosis (18). As a result, outcomes must be risk-adjusted for differences in sociodemographic and clinical characteristics. To risk-adjust GAF change measures, we identified as many potentially confounding patient characteristics as data availability allowed. We hypothesized that patients who had more severe disability, as represented by greater degree of service-connected disability, or greater social disadvantages, as indicated by minority status, being unmarried, and having a lower income, would show less improvement. We also hypothesized that because diagnoses represent varying severity of illness, these measures should also be included in the model. Coding was based on clinicians' assessments with respect to diagnostic measures, and VHA administrative records were used for sociodemographic data. There was no formal operationalization of these measures.
Sociodemographic characteristics used as risk adjusters include age, gender, ethnicity, income level, and marital status. Data were also available on the receipt of VA compensation (10 to 49 percent disability, greater than or equal to 50 percent disability, or no VA disability rating). In addition, ICD-9 psychiatric diagnoses were grouped into nine non-mutually exclusive clusters on the basis of inpatient and outpatient diagnostic information from the current fiscal year. Veterans with dual diagnoses were also represented by a dichotomous variable.
We also included in each model the baseline GAF score and a measure of the number of days between the client's first and last GAF to risk-adjust for potential regression to the mean.
The analysis proceeded in three stages. First, a series of analyses of variance (ANOVAs) and chi square tests was conducted to identify significant differences between the three patient groups. Second, new outpatients were divided into three strata based on the number of clinical contacts they received, and dichotomous measures were created to represent whether the veterans received medium- or high-intensity treatment, with veterans who received low-intensity treatment serving as the reference group. A similar approach was used to create dichotomous measures for discharged inpatients and continuing outpatients, but, rather than dividing service use into three levels, we divided it into four, because the range of values was substantially wider.
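The construction of the dichotomous intensity measures can be sketched as follows. The function name is hypothetical, and the cut points in the example are invented for illustration, because the article does not report the thresholds that separated the strata.

```python
from bisect import bisect_left


def intensity_dummies(visit_count, cut_points):
    """Return 0/1 indicators for every stratum above the lowest.

    cut_points: ascending, inclusive upper bounds of all strata except
    the highest. The lowest stratum is the reference group, so it is
    represented by all indicators being zero.
    """
    level = bisect_left(cut_points, visit_count)  # 0 = lowest stratum
    return [int(level == k) for k in range(1, len(cut_points) + 1)]
```

With the illustrative cut points `[3, 8]`, a veteran with 5 visits would be coded `[1, 0]` (medium intensity), and one with 12 visits would be coded `[0, 1]` (high intensity). Passing three cut points produces the four-level version described for discharged inpatients and continuing outpatients.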
Finally, in the principal analysis, we examined the degree to which each continuity-of-care measure was positively associated with GAF change scores, with potentially confounding factors controlled for. Because of their intercorrelation, measures of regularity of care, intensity of treatment, and continuity of treatment were examined in separate models. In these analyses random effects were modeled for site by using an unstructured covariance structure, thereby adjusting standard errors for the potential autocorrelation of observations within sites. This technique is often referred to as hierarchical linear modeling (HLM) (25). The MIXED procedure (PROC MIXED) of the SAS software system was used for these analyses. To examine the proportion of all explained variance for each model, we calculated pseudo R2 statistics (26,27,28).
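One way to write such a model is given below. This is a sketch under stated assumptions rather than the exact specification that was fitted: it assumes a single random intercept for each site, and all symbols are introduced here for illustration. The subscript i indexes veterans and j indexes sites, C is one continuity-of-care measure, and x is the vector of risk adjusters (sociodemographic characteristics, diagnostic clusters, baseline GAF score, and days between ratings).

```latex
\Delta\mathrm{GAF}_{ij} = \beta_0 + \beta_1 C_{ij}
    + \boldsymbol{\gamma}^{\top}\mathbf{x}_{ij} + u_j + \varepsilon_{ij},
\qquad u_j \sim N(0, \tau^2), \quad \varepsilon_{ij} \sim N(0, \sigma^2)
```

In this form, the coefficient on C is the adjusted association between continuity of care and GAF change, and the site-level term u absorbs the correlation among veterans treated at the same facility.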
Consistent with the population served by the VHA, the sample consisted predominantly of white, middle-aged to elderly men, although the inpatient sample had a large percentage of African Americans (Table 1). As would be anticipated, individuals in the inpatient group had more serious psychiatric diagnoses and lower baseline GAF ratings on average. The three groups differed significantly with respect to all measured characteristics, except for the proportion that were male.
With the exception of the intensity of care for discharged inpatients, the unadjusted GAF change scores generally increased with greater levels of continuity of care (Table 2). For example, the average GAF change score for discharged inpatients who had at least one outpatient visit within 30 days was 7.51, compared with 6.97 for those without such a visit.
Results were somewhat different after hierarchical linear models were used to adjust for site-level autocorrelation and clients' sociodemographic status, baseline GAF score, and diagnostic status. For example, among discharged inpatients, those with the highest level of intensity of care showed an average improvement in their GAF change scores that was 1.38 points greater than that of those with the lowest intensity of such care (Table 3). In addition, for each month of treatment over six months, these patients experienced a .69 increase in their GAF change scores, for a total of 4.1 points over the entire six months. Discharged inpatients who had an outpatient visit within 30 days of discharge showed an increase in their GAF change score that was approximately 1 point greater than that of those who did not.
New outpatients who had the highest intensity of care had a gain of only one quarter of a point in their GAF change score over those with the lowest intensity of care. Over a six-month period of treatment, new outpatients experienced an increase of only .3 in their GAF change score for each month they received services, for a total 1.8-point increase, a statistically significant improvement. When data for continuing outpatients were examined, the results were either nonsignificant or in the opposite direction to what was hypothesized, with greater intensity associated with small but statistically significant declines in the GAF change scores.
The following covariates had a significant relationship with the GAF change measure for the three models that used data on discharged inpatients: age, married, divorced or separated, service connected at above 50 percent, diagnosis of schizophrenia, diagnosis of posttraumatic stress disorder, diagnosis of drug dependence or abuse, diagnosis of personality disorder, and baseline GAF score. For the two models that used data for new outpatients, the following covariates were significant: age, black, Hispanic, gender, married, divorced or separated, service connected at below 50 percent, service connected at above 50 percent, diagnosis of schizophrenia, diagnosis of posttraumatic stress disorder, diagnosis of bipolar disorder, diagnosis of anxiety disorder, diagnosis of personality disorder, dual diagnosis, and baseline GAF score. For the two models that used data on continuing outpatients, the following variables were significant: age, black, Hispanic, gender, married, annual income (log), service connected at below 50 percent, service connected at above 50 percent, diagnosis of schizophrenia, diagnosis of posttraumatic stress disorder, diagnosis of drug dependence or abuse, diagnosis of alcohol dependence or abuse, diagnosis of bipolar disorder, diagnosis of personality disorder, and baseline GAF score.
Although the pseudo R2 for these models was small, the R2 is indicative of the predictive power of the models at the individual patient level, and the focus of this study was on clinical improvement at the population level.
In this study we used national client-level administrative data to investigate the degree to which continuity of care was associated with improved mental health status, as measured by GAF change scores. We found that for discharged inpatients and new outpatients, several continuity-of-care measures were associated with greater improvement of GAF change scores. For continuing outpatients, in contrast, high intensity of care was associated with lower GAF change scores, and regularity of care was not significantly associated with the GAF change measure at all. One possible explanation for the findings is that in the process of making the transition from being without care to outpatient care, or from inpatient care to outpatient care, continuity of care is especially important to achieving positive outcomes. Patients may be especially vulnerable during treatment transitions, and thus continuity of care may be particularly important during such periods. In contrast, for continuing outpatients, greater intensity of care may reflect clinical deterioration, and thus the direction of causality is reversed. In the latter case, instead of continuity of care resulting in better outcomes, as we observed in the transitional situations, clinical deterioration may have resulted in more intensive service use among continuing patients.
Several methodologic limitations of this study must be noted. First, as with most administrative data sets, service use measures do not reflect care received outside the VHA. However, data from other studies (29,30) suggest it is likely that a relatively low percentage of the clients in the sample received outpatient care from a non-VHA source.
Second, because this study did not use random assignment, there could have been important differences between veterans with high and low levels of continuity of care that affected the observed outcomes. To address this possibility we used an assortment of measures to control for potentially confounding factors, including measures of diagnostic status, sociodemographic characteristics, and the GAF baseline score, but unmeasured characteristics may have biased our results in unclear ways.
A third limitation was that we had no data on the relationship between clients and their providers (for example, on the providers' skill and training or on the quality of outpatient treatment), factors that may also affect changes in functioning regardless of continuity of care. In general, narrow operational measures of continuity of care may leave out important aspects of the clinical relationship and the supportiveness of the context in which care is provided. Furthermore, the GAF is a single-item measure whose reliability and validity have not been well demonstrated in this real-world practice setting. However, it is notable that Moos and colleagues (10) found significant relationships between GAF ratings extracted from the same data file as the one used in this study and psychometrically sound measures. In addition, in a related study (11) we found that measures derived from the GAF score appeared to demonstrate discriminant validity.
A fourth limitation is that there may have been sampling bias. A large number of clients were not in our sample because they did not have a second GAF, either because it was not recorded or because an outpatient visit did not occur after their inpatient discharge or their first outpatient visit. These clients were generally less severely ill than clients who had two GAFs and who were thus included in our sample (31).
Another limitation is that, because the analyses were based on a large sample and thus had substantial power, some statistically significant findings may not have been clinically meaningful. There is no standard for determining how large a change in the GAF score is clinically meaningful. However, in a secondary analysis of data from a previous clinical trial we found that small differences in the GAF (2.2 points) that favored clozapine over haloperidol paralleled significant differences in other accepted measures, such as the Positive and Negative Syndrome Scale, that were found in a study that compared these two medications (32). In addition, we previously found that after adjustment for sociodemographic and other diagnostic measures, clients with a diagnosis of schizophrenia had a GAF score that was 4.2 points lower than that of other clients at the time of discharge from a mental health inpatient facility (11). Thus small changes in the GAF score may be clinically meaningful. This discussion and that which follows are not meant to imply that a GAF change score of either 2.2 or 4.2 should be considered a hard standard. Rather, they should be viewed as useful reference points in considering clinical significance.
Although the unadjusted GAF change scores for discharged inpatients were mostly above 6 and thus clinically meaningful, about half of the GAF change scores for new outpatients were moderately meaningful in a clinical sense (between 2 and 3), and none of the GAF change scores for continuing outpatients were clinically meaningful (Table 2).
Relationships between continuity of care and GAF change scores that were statistically significant varied in magnitude (Table 3). The adjusted GAF change score associated with the highest levels of regularity of care (that is, when an outpatient mental health visit occurred every month, if considered over the whole six-month period) was clinically meaningful for discharged inpatients (net difference of 4.1 over six months) and had possible clinical significance for new outpatients (net difference of 1.8 over six months). None of the other significant relationships met this standard of clinical meaningfulness.
Finally, our reliance on a VA sample may have limited the generalizability of the findings to other populations or health care systems. For example, veterans are overwhelmingly men and tend to be older than patients served in other health care systems. In addition, because the VHA is an integrated health care system, there may be more support for coordinating care than in most health care systems.
These results differ from those of three previous studies that examined the relationship between continuity of care and client outcomes (2,3,4). Those studies found few positive and statistically significant relationships between continuity-of-care measures and a variety of desirable client outcomes. However, these three studies involved much smaller samples of clients (from 1,600 to 4,200). In addition, clients examined in two of these studies had received care in specialized intensive treatment programs for a specific condition (substance abuse or posttraumatic stress disorder), in contrast with the clients examined in this study, who had a much more diverse set of conditions and who had received a variety of types of care in a large mental health care system. Furthermore, although continuity-of-care measures pertained only to the treatment after discharge from intensive treatment in two of the earlier studies, the baseline outcome measures were obtained at admission. In this study, baseline outcome measures for inpatients were obtained at discharge (and for outpatients at the time of the first outpatient contact).
In contrast with several earlier studies, we found several positive and statistically significant relationships between measures of continuity of care and client outcomes, although these relationships were observed only in transitional treatment situations in which continuity of care may be especially important, and only a few of these could be confidently said to be clinically meaningful. Although continuity-of-care measures are widely used as performance indicators, research to date has not shown that continuity of care either by itself or in interaction with other features of service delivery ultimately improves clients' well-being.
The authors are affiliated with the Department of Veterans Affairs Northeast Program Evaluation Center in West Haven, Connecticut, and with the department of psychiatry of Yale University in New Haven. Address correspondence to Dr. Greenberg at Northeast Program Evaluation Center, VAMC, West Haven, Connecticut 06515 (e-mail, email@example.com). This article is part of a special section on the Global Assessment of Functioning scale.
Table 1. Characteristics of veterans who participated in a study of continuity of care and client outcomes
Table 2. Global Assessment of Functioning (GAF) change scores in a sample of veterans, by continuity of care
Table 3. Relationship between continuity-of-care indicators and changes in Global Assessment of Functioning (GAF) scores in a sample of veterans