Continuity of care has been most frequently defined as "a process involving the orderly, uninterrupted movement of patients among the diverse elements of the service delivery system" (1) and has been touted as critical to positive outcomes for persons with severe and persistent mental illness for more than 50 years. Continuity has recently been identified as an important service principle and performance measure for mental health services (2,3,4) and general health services (5,6).
Despite a theoretically and conceptually rich literature (7) and recognition of the importance of continuity of care, reports of poor and variable continuity persist (7,8,9,10,11). Efforts to improve the continuity of mental health services at the individual, program, and system levels include specific service changes, such as discharge planning, and system changes, such as services integration (4,9). Yet there is little empirical evidence to support the notion that continuity leads to improved outcomes.
Because continuity of care is a complex, multidimensional process that occurs at the interfaces of multiple services in the trajectory of a patient's care according to changing needs, operationalizing the concept is not trivial (12). In fact, examination of the literature reveals that this topic has been repeatedly abandoned by research teams. The paucity of evidence for improved outcomes is probably attributable to both a lack of validated multidimensional measures and a lack of studies designed to address the inherent service complexities (7,12).
Other authors have recommended longitudinal studies that link data on delivery of care with "information about need, not only at a specific time but also about transitions in need in time" (13), as well as "careful baseline measurement of all potentially confounding variables and the use of multivariate methods of analysis" (12). Among the few longitudinal studies that have been done, participants are rarely followed for more than 12 months, a duration too short to assess the impact of continuity over longer periods.
The purpose of the study reported here was to examine the association between continuity of care—measured prospectively and comprehensively with use of a new multidimensional instrument, the Alberta Continuity of Services Scale for Mental Health (ACSS-MH)—and both health outcomes and costs in an 18-month longitudinal cohort of patients with severe mental illness. We report on the associations with health outcomes. A companion paper in this issue of Psychiatric Services outlines relationships with health service costs (14).
The study used an epidemiologic design. In classical prospective cohort studies, participants are grouped according to "exposure status," and the incidence of "disease" at the end of a given period is observed. The relationship between exposure and disease is then tested, with adjustment for other variables that display confounding characteristics in the data (15). In this study the analogue for exposure status is continuity of care over the period, and the analogues for disease status are health outcomes. Although causality can be confirmed only through a cumulative literature of such observational studies, these designs can be very useful at an early stage of knowledge. Such is the case for continuity of care, for which weaker designs, such as cross-sectional studies using secondary data, are still published. It is also prudent to have a fuller understanding of a phenomenon through observational studies before mounting experimental designs, which are associated with substantial feasibility concerns, such as randomizing large numbers of patients or indeed whole systems of care. The study reported here was the second phase of a research program that also included development of measures and a randomized intervention trial.
All patients who presented for care to 70 directly funded inpatient, outpatient, emergency department, and community mental health service sites in three health regions in Alberta, Canada, between March and July 2001 were eligible for participation. Because Canada has a publicly funded universal health care system, a majority of services across the full continuum were provided by a single payer to all patients in each region. Inclusion criteria were a confirmed diagnosis of severe mental illness (psychotic disorders or bipolar or unipolar mood disorders of at least 24 months' duration) on the Mini-International Neuropsychiatric Interview (MINI) (16), age between 18 and 65 years, and not being under guardianship or receiving involuntary or forensic care. Because this was an effectiveness study, individuals were not excluded for reasons of comorbid psychiatric, physical, or substance use disorders.
The ACSS-MH was developed empirically from a pool of attributes of continuity extracted from 305 theoretical and empirical articles and 36 interviews with patients with severe mental illness; the list of attributes was reduced through pilot studies (17). The instrument includes both patient- and observer-rated scales, to ensure incorporation of the patient's perspective, and also has an independent "objective" and non-recall-dependent assessment of continuity. The patient-rated scale has 43 items scored on 5-point response scales in three subscales: system fragmentation (perceived discontinuity across services), relationship base (perceived importance of a consistent and dependable relationship with the primary caregiver and treatment team), and responsive treatment (characterized as the patient's experience of specific service actions in response to needs). Examples of items from the three subscales, respectively, are "I've had to repeat my history every time I need help," "My primary caregiver asks me about more than just my symptoms," and "I am reminded of appointments or called if I miss appointments."
A 37-item version of the patient-rated scale, which has a possible total score of 185, was used for this initial analysis, because six items had numerous "not applicable" responses. The observer-rated component contains 17 indicator items scored on 3- or 4-point scales (total possible score 59) on the basis of prospectively collected service and need information (18,19,20). Subscales have not been developed for the observer scale, but further analysis of subscale associations with outcomes and relationships between subscales and items across both scales is in progress. An example of an item from the observer-rated scale is the number of 30-day periods without services when they were needed.
In our own studies and in an independent study among similar patients, psychometric characteristics of the ACSS-MH have been encouraging (17,20). Good internal consistency (.92, .86, and .78) and split-half reliability (.88, .83, and .77) have been demonstrated for the respective subscales of the patient-rated scale. Good internal consistency (a range of .60 to .95 across items) and reasonable interrater reliability have been found for the observer-rated scale (the average percentage agreement within one response was 85 percent).
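The internal consistency figures cited above are Cronbach's alpha coefficients. As a minimal sketch of that calculation (the score matrix below is illustrative only, not data from the study), alpha can be computed directly from a respondent-by-item matrix:

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for an (n_respondents, n_items) score matrix."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    # Sum of the individual item variances
    item_variances = items.var(axis=0, ddof=1).sum()
    # Variance of each respondent's total score
    total_variance = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1.0 - item_variances / total_variance)

# Illustrative 5-point Likert responses (5 respondents x 4 items)
scores = [
    [4, 5, 4, 5],
    [3, 3, 4, 3],
    [5, 5, 5, 4],
    [2, 2, 3, 2],
    [4, 4, 4, 5],
]
alpha = cronbach_alpha(scores)  # high alpha: the items move together
```

Items that covary strongly, as in this toy matrix, yield an alpha near the upper end of the range reported for the patient-rated subscales.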
Other study measures and follow-up process
Variables measured at baseline included demographic characteristics (age, sex, education, employment, income, disability status, and ethnicity) and clinical characteristics (recruitment location, age at diagnosis, primary diagnosis, previous hospitalizations, suicidality, and comorbid psychiatric, substance use, and medical disorders). Severity of symptoms was also rated by research staff at baseline by using the Brief Psychiatric Rating Scale (BPRS) (21). Baseline ratings of problem severity and need were collected by mail from participants' primary mental health care providers by using the Colorado Client Assessment Record (CCAR) (22). Social support was measured by research staff in the first follow-up contact by using the Social Provisions Scale (23).
Participants were prospectively followed and contacted by telephone (or in person as needed) at two- to three-month intervals for measurement of all health service use events, including inpatient, outpatient, emergency department, crisis, community clinic, residential, vocational, addictions, self-help, and housing services as well as medications received (name, frequency, and dosage). A standardized assisted-recall interview was used. Specially designed cards for prospective recording of service events were provided and were used by some participants to aid recall. Research staff had access to records at all service sites to verify information or locate patients who were lost to contact.
At the end of follow-up, the ACSS-MH patient-rated scale and the outcome measures were administered in a combination of face-to-face and telephone interviews. Outcome measures included the Multnomah Community Ability Scale (MCAS) for community functioning (24), the BPRS for endpoint severity of symptoms (21), the Service Satisfaction Scale-10 (SSS-10) (adapted with the author's permission), the Wisconsin Quality of Life Inventory (WQLI) for disease-specific quality of life (26), and the EQ-5D (both the five-item index score and the 100-point visual analogue scale score) for generic quality of life (27). ACSS-MH items were rated on the basis of participants' service experiences for the full follow-up period. When necessary, the interviewer assisted respondents in recalling service events that were relevant to making ratings for that particular period.
Research staff had systematic training according to authors' training materials (the MCAS and the CCAR) or practice, discussion, and reconciliation (the EQ-5D, the WQLI, the SSS-10, and the ACSS-MH). BPRS training consisted of videoconference-based demonstration interviews conducted by psychiatrist co-investigators with group ratings and reconciliation, followed by co-ratings of the first interviews in each region by psychiatrist-research staff pairs until reasonable concordance was achieved (about seven to 12 interviews). The same research personnel administered baseline and endpoint measures in each region. It was not feasible to collect additional information about the field performance of instruments, with the exception of the ACSS-MH, which underwent a test-retest reliability check. BPRS ratings and service use were the only variables with repeated measures.
The process leading to ACSS-MH observer scale ratings had several steps. First, a standard form for chart reviews was designed that summarized all reported service events (at all locations) and medications (for verification) and key information needed for rating ACSS indicators. Most important, the match between identified need and service responses as they changed over the follow-up period was assessed. Two chart reviewers were trained through co-reviews with the first author of ten to 12 charts in each region, followed by discussion and reconciliation. The addition of chart information resulted in a standard, comprehensive longitudinal care record for each participant. Observer ratings were then applied to each record by a single independent rater (the first author), who was not involved in care provision. Interrater reliability, as reported above, was checked by the first, fourth, and fifth authors on a 10 percent subset. All participants gave informed consent for participation and data access. The study was reviewed and approved by the conjoint health research ethics board at the University of Calgary.
Because associations between continuity of care, health outcomes, and other variables are at such an early stage of explication, we used an exploratory, staged approach to the analysis, beginning with examination of bivariate relationships. First, differences between mean scores on each continuity scale according to categories of demographic and clinical variables were tested, and associations between mean scale scores for each health outcome measure according to quartiles of the continuity scales were examined. Analysis of variance was used for all bivariate testing.
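The bivariate testing described above can be sketched with a one-way ANOVA; the group labels and continuity totals below are hypothetical and serve only to show the mechanics:

```python
from scipy.stats import f_oneway

# Hypothetical patient-rated continuity totals grouped by a
# three-level clinical variable (e.g., a diagnostic category)
group_a = [128, 135, 141, 122, 130]
group_b = [118, 125, 119, 127, 121]
group_c = [140, 138, 145, 136, 142]

# One-way ANOVA: do mean continuity scores differ across groups?
f_stat, p_value = f_oneway(group_a, group_b, group_c)
```

A small p value, as with these clearly separated toy groups, indicates that at least one group mean differs from the others; the study applied the same test across each demographic and clinical category.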
A series of classical stratified analyses (15) was then used to examine the relationship between continuity and health outcome (expressed as an odds ratio for the association of the dichotomized variables) across categories of all potential confounding third variables, summarized as Mantel-Haenszel adjusted odds ratios (28) (data not shown). In most cases, variables did not meet the classical definition of confounding (29), so their inclusion in multivariate modeling could have introduced bias (30).
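The Mantel-Haenszel adjustment pools stratum-specific 2x2 tables into a single odds ratio. A minimal sketch, with made-up counts standing in for the dichotomized continuity-outcome tables:

```python
import numpy as np

def mantel_haenszel_or(tables):
    """Mantel-Haenszel pooled odds ratio over 2x2 tables laid out as
    [[exposed & outcome,   exposed & no outcome],
     [unexposed & outcome, unexposed & no outcome]]."""
    numerator = denominator = 0.0
    for table in tables:
        (a, b), (c, d) = np.asarray(table, dtype=float)
        n = a + b + c + d
        numerator += a * d / n
        denominator += b * c / n
    return numerator / denominator

# Made-up counts: high/low continuity by good/poor outcome,
# stratified by a hypothetical two-level confounder
stratum_1 = [[30, 10], [15, 25]]
stratum_2 = [[22, 18], [12, 28]]
pooled_or = mantel_haenszel_or([stratum_1, stratum_2])
```

Comparing the pooled odds ratio with the crude (unstratified) odds ratio is the classical check for confounding: if the two diverge materially, the stratifying variable confounds the association.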
Finally, multiple linear regression models were developed for the associations between patient- and observer-rated continuity (independent variables) and EQ-5D VAS and EQ-5D index scores (dependent variables), respectively, with adjustment for confounding variables. Separate models were produced for each continuity scale because the scales were only moderately correlated with each other (r=.36, p<.001), so combining them was considered inappropriate. Significance levels for variables in the regression models were set at .05.
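The adjusted models can be illustrated with simulated data; the variable names, effect sizes, and distributions below are assumptions for demonstration, not the study's estimates:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200

# Simulated predictors: a continuity score and two confounders
continuity = rng.normal(131, 20, n)   # cf. patient-scale mean of 131 (SD 20)
income = rng.normal(30, 10, n)        # hypothetical, in $1,000s
severity = rng.normal(50, 15, n)      # hypothetical problem-severity score

# Simulated outcome with a true continuity effect of 0.3
outcome = (20 + 0.3 * continuity + 0.5 * income
           - 0.2 * severity + rng.normal(0, 5, n))

# Ordinary least squares with an intercept column; coefs[1] is the
# continuity effect adjusted for the two confounders
X = np.column_stack([np.ones(n), continuity, income, severity])
coefs, *_ = np.linalg.lstsq(X, outcome, rcond=None)
adjusted_continuity_effect = coefs[1]
```

Because the confounders are included in the design matrix, the fitted continuity coefficient recovers the simulated effect net of income and severity, which is the logic behind the adjusted models reported in Table 4.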
Endpoint continuity ratings were completed for 411 of 486 participants (85 percent). No significant differences were found between participants who were followed up and those who were lost to follow-up in terms of age, number of years of education, employment status, previous hospitalizations, duration of illness, or social support, although men (p<.01) and participants with greater problem severity at baseline (p<.05) were more likely to be lost to follow-up. No significant differences in continuity were found by region. All analyses are reported for the complete sample. Demographic and clinical characteristics are shown in Table 1.
The sample was somewhat biased toward women (60 percent) and toward patients with mood disorders (65 percent). Nevertheless, it did represent a relatively severely affected group in terms of comorbid psychiatric illness (67 percent), chronic illness (mean duration of illness 22 years), and functional impairment (only 18 percent of the patients were currently working). No significant change in BPRS scores was noted from baseline to endpoint, which suggests that the patients in this sample, although chronically ill, were relatively stable. The average follow-up time was 16.8±.9 months; two-thirds of the sample was followed up within a narrow range of four weeks.
The mean score on the patient-rated scale was 131±20 out of a possible 185, and the mean score on the observer-rated scale was 39±10 out of a possible 59. Score distributions were reasonably normal for most items and for the total scores on both scales. Relationships between continuity and demographic and clinical variables are presented as mean continuity scores according to categories of demographic and clinical variables in Table 2. Among the demographic variables, lower observer-rated continuity scores were associated with older age (up to 65 years) and lower annual household income but not with sex, educational level, current employment status, personal annual income, or ethnicity. This finding suggests that there is reasonable equity and response to need (at least in terms of care continuity) on demographic characteristics among patients who have already gained access to these health care systems.
Curiously, no demographic variable was significantly associated with patient-rated continuity. With respect to clinical variables, higher observer-rated continuity scores were associated with a diagnosis of psychosis, no suicidality or comorbid substance abuse, and lower problem severity scores. A diagnosis of psychosis, no comorbid substance abuse, and lower problem severity scores were also significantly associated with higher patient-rated continuity.
Table 3 shows bivariate associations between mean health outcome scale scores and quartiles of both continuity scales. Consistent patterns of association were seen across all variables except BPRS scores. Associations between observer-rated continuity and quality of life (both disease-specific and generic) and service satisfaction and between patient-rated continuity and quality of life, functioning, and satisfaction were statistically significant at the .01 level.
Associations between continuity and health outcomes held up in multiple linear regression models (Table 4). The relationship between patient continuity ratings and the EQ-5D 100-point visual analogue scale remained highly significant after adjustment for income and baseline problem severity. In addition, the relationship between observer-rated continuity and EQ-5D index scores remained significant after adjustment for diagnosis, age, suicidality, and income.
We observed consistent associations between both patient- and observer-rated continuity of care measures and health outcomes in a prospective cohort of persons with severe mental illness across all service events over a 17-month period, after adjustment for empirically defined confounder variables. This pattern of positive associations with several outcome measures is very encouraging in light of the minimal and very mixed evidence to date.
Current evidence is based on studies that used simple, provider-defined measures of continuity. Other methodologic challenges that may have contributed to mixed findings include variations in outcome measures, insufficient sample sizes, low follow-up rates, weak designs, overreliance on secondary data, and overadjustment of variables (4,13,18). Significant associations have been found in other studies between continuity and symptom scores (31,32), inpatient service use (33,34), and lower Medicaid costs (34), but no evidence for relationships between continuity of care and quality of life (31,32), functioning (31,32,35), Health of the Nation Outcome Scales (HoNOS) scores (35), general life or health satisfaction (34), homelessness, and inpatient or outpatient days (32) has been found.
This is the first study of which we are aware to use a comprehensive, psychometrically validated measure of continuity that includes both the patient's perspective and independent observer ratings. In addition, we followed, with minimal attrition, a large sample of patients drawn from all service sites in integrated regional health care systems and collected information about need and health service use across the full spectrum of services. We included a comprehensive set of self- and observer-rated outcome measures as well as demographic and clinical variables that may explain or confound the relationship of interest.
Nevertheless, a number of characteristics of this study may have affected the findings. First, given the study's observational design, the causal direction of the association between continuity and outcomes cannot be determined. Although the continuity measures were made in relation to changing need and the full series of service events were collected prospectively and longitudinally, outcomes were measured only at the endpoint. Ideally, baseline scores would have been taken for one or more of the outcome measures. Our decision about choice and timing of measures at baseline balanced the known performance of these measures in this population, concerns about respondent burden, study resources, and a separate plan for a subsequent experimental study. Although this study design can test for the existence of an association and describe relationships with potentially confounding variables, it cannot support an unequivocal conclusion that better continuity leads to better outcomes. It is possible that persons who have better functioning and quality of life were more capable of continuity-maintaining behaviors such as appointment compliance.
Second, enormous diversity in care patterns was observed, ranging from no follow-up at all to minimal care—for example, intermittent appointments with a single provider—to intensive daily support by several community or institutional providers. Although this heterogeneity contributes to healthy variability in measurement and enhances generalizability, it also creates challenges in comparability of information used to judge continuity across patients. Third, because of time and resource limits, our sample was drawn from patients who were already in treatment. Although such patients constitute the population to whom continuity is most relevant, changes in health status with treatment would not be as marked as for individuals receiving a first episode of care. Thus continuity and outcome associations may have been attenuated.
Fourth, neither patient- nor observer-rated continuity scales were perfect measures. The patient scale required recall over a fairly lengthy follow-up period as well as judgments about a relatively abstract concept across multiple service experiences. The observer scale also required subjective judgments about need and appropriateness of care. This approach was necessary to advance the science of the measurement of continuity from simple to more dynamic measures, but it brings with it the inherent issues of reliability and validity of judgments for some of the less concrete observer-rated items. Not surprisingly, we noted enormous variation in the quality of existing records that was not fully mitigated, even with multiple source verification. Although we were able to record all significant care events in a total system, our ratings were nevertheless based on relatively high-level indicators of continuity. Perfect measurement would involve accompanying each patient and observing service operation virtually daily—a level of omniscience not possible in larger samples. Despite this limitation, the intent of having continuity scales from both perspectives was for each scale to mitigate the weaknesses of the other. Finding a similar pattern of results across both scales is encouraging in this respect.
Fifth, the magnitudes of difference in continuity scores for different outcome levels were not large. A possible explanation is that continuity plays only a partial role in outcomes, or simply that the noise in measurement or other biases made it difficult to find larger effect sizes.
Sixth, we were unable to assess the extent to which our sample represented all eligible patients, because compliance among care providers with our system for tracking study referrals was poor, and our ethics board did not permit us to review nonparticipants' charts. Our sample was almost certainly biased toward patients who were less seriously ill and more compliant, who were referred by more interested and conscientious providers. Thus the levels of continuity described here are probably optimistic.
Despite these uncertainties, internal validity would have been more seriously compromised by higher levels of attrition. Although Fischer and associates (36) have demonstrated that individuals who are lost to follow-up in psychiatric studies are more likely to be of lower socioeconomic status, male, from an ethnic minority group, and single and to have histories of aggression and substance abuse, we found no significant differences in age, education, income, number of previous hospitalizations, duration of illness, or social support between those who were followed up and those who were not and only slight differences in sex and baseline problem severity.
These findings were confirmed anecdotally by field staff, who indicated that as many participants dropped out because of recovery and the wish to put the illness period behind them as because of apparently poor functioning. This relatively nondifferential loss might have been attributable to our follow-up methods, which included community-based assertive outreach rather than only the telephone and mail-based contact attempts used by Fischer and colleagues. Even though small, the bias introduced by loss to follow-up would likely be toward a finding of no association. Finally, although our field research associates were trained not to intervene in any way except in life-threatening circumstances, the regular communication itself could have contributed to better continuity and outcomes, possibly inflating the measured associations.
The utility of our continuity instrument is currently limited to the research context. Neither scale is as yet refined enough for use in routine performance measurement. The magnitude of score differences would require large samples for the patient-rated scale, and extensive knowledge of the patient's needs and service events across settings would be required for the observer scale. Future research may be able to identify the best item subsets for performance measurement in the continuity domain. More research is also needed to provide an understanding of why patient- and observer-rated continuity are only moderately correlated and to fully explain the causal relationships among continuity, outcomes, and the many and complex intervening variables. Intervention studies are needed that go beyond single program interventions to multiple program changes at the system level.
The implication of this research for practice is that improving continuity in mental health services may contribute to improved outcomes. Specific initiatives are suggested by the scale items, such as ensuring timely record sharing and adjusting frequency of appointments according to need. Although the evidence base for many such strategies is still incomplete in relation to health outcomes (4), there is room for service changes motivated by quality- and client-centered care. The findings also clearly point to patients who are at risk of poor continuity and who thus might benefit from targeted interventions. In particular, baseline CCAR scores were very predictive of poor continuity. This instrument might be useful as an eligibility screen for interventions designed to improve continuity.
In this study we found consistent, positive relationships between continuity of care and quality of life, community functioning, and service satisfaction among persons with severe mental illness. Associations between continuity and quality of life held up in multivariate models. Although further research is clearly needed before the causal web of associated relationships is fully understood, these findings suggest that efforts at improving continuity in and among mental health services may be fruitful.
This study was funded by grant RC2-2709 from the Canadian Health Services Research Foundation, cosponsored by the Alberta Heritage Foundation for Medical Research, the Alberta Mental Health Board, the Institute of Health Economics, and an unrestricted grant from Eli Lilly (Canada) Inc. The authors thank Doreen Ma, M.Sc., Lynne Kostiuk, M.Ed., and Gisele Marcoux, B.A. The interpretation and conclusions contained herein are those of the researchers and do not necessarily represent the views of the Government of Alberta or Alberta Health and Wellness.
Dr. Adair is affiliated with the department of psychiatry and the department of community health sciences of the University of Calgary, 3330 Hospital Drive, N.W., Calgary, Alberta, Canada T2N 4N1 (e-mail, firstname.lastname@example.org). Dr. McDougall is with the department of psychiatry of the University of Calgary and the Alberta Mental Health Board. Dr. Mitton is with the department of health care and epidemiology of the University of British Columbia in Vancouver. Dr. Joyce, Dr. Gordon, and Dr. Costigan are with the department of psychiatry of the University of Alberta in Edmonton. Dr. Wild is with the Centre for Health Promotion Studies of the University of Alberta. Ms. Kowalsky, Ms. Pasmeny, and Ms. Beckie are with the Alberta Mental Health Board. This paper was presented in part to the Canadian Academy of Psychiatric Epidemiology in Halifax, Nova Scotia, on October 30, 2003.
Table 1: Demographic and clinical characteristics of participants in a study of continuity of care among patients with severe mental illness in Canada
Table 2: Relationships between Alberta Continuity of Services Scale for Mental Health scores and demographic and clinical variables
Table 3: Bivariate analysis of mean scores on health outcome measures in a study of continuity of care among persons with severe mental illness, according to Alberta Continuity of Services Scale for Mental Health (ACSS-MH) quartiles
Table 4: Associations between Alberta Continuity of Services Scale for Mental Health scores and EQ-5D scores in multiple linear regression models