Letters
Large Data Sets Are Powerful
John A. Pandiani, Ph.D.; Steven M. Banks, Ph.D.
Psychiatric Services 2003; doi: 10.1176/appi.ps.54.5.745

To the Editor: In February's Taking Issue, Drake and McHugo voice their concern that "large data sets can be dangerous" (1). We believe that the large administrative and operational databases that our society generates on an ongoing basis can contribute to an unprecedented growth of knowledge in medical and behavioral sciences. Instead of focusing on the negative, we propose that researchers focus on the strengths of these data sets and develop new methodologies that will allow us to learn from our massive stores of data regarding public health, vital records, social and criminal justice programs, public and private insurance, and positive participation in society—for example, in school and in gainful employment. We live in an information-rich society. We have a responsibility to use this resource to promote the advancement of knowledge.

Administrative and operational databases have many advantages over narrowly focused, special-purpose data collection. One of the greatest strengths of these databases is comprehensiveness. Minority populations are included in numbers adequate to provide confidence in findings, identical outcome measures for relevant comparison groups exist within the databases, the problem of subjects lost to contact is minimized, and studies can be replicated at minimal cost because the data and analytical tools are already in place. Unlike experimental research, use of administrative and operational databases allows examination of treatments as they are routinely administered in community settings where best practices may not be universal. Administrative and operational databases also avoid many reactive effects of testing (2).

Criticisms of the quality of administrative and operational data frequently overlook the fact that these systems typically include strong data quality controls, such as audits and utilization review. Submission of false-positive records, such as insurance claims and death certificates, is often punishable by fine or imprisonment. Failure to report is limited by economic forces, as in the case of insurance claims, or by legal mandates, as in reports of births and deaths. Critics judge administrative and operational databases by their weakest data elements rather than by the data elements used in an analysis. In mortality databases, for instance, the objective fact of death is rarely refuted. We believe all research, including research using existing databases, should acknowledge the degree of objectivity and subjectivity involved in the creation of the data being analyzed.

The cost of research using large data sets is minimal compared with the cost of research that involves special-purpose data collection. We believe that the research community should discuss the relevance of an ethic of efficiency to medical and behavioral research. Is it ethical to conduct a very expensive study when a very inexpensive study has equal—or superior—promise of generating useful knowledge?

Our administrative and operational databases have the potential to move science forward at a rapid rate. In the past, science tended to operate deductively, seeking repeated verification of hypothesized relationships. As anomalies converged, new theories emerged to challenge the old (3). Our wealth of administrative and operational data could accelerate this process by supporting a world of research in which inductive and deductive models coexist and interact (4). Medical and behavioral researchers should explore new models for creating and sharing knowledge, models that embrace the wealth of information and analytical power that we now have at our fingertips. Instead of looking backward, medical and behavioral research should focus on developing methodologies that maximize the knowledge we extract from our administrative and operational databases.

The authors are affiliated with the Bristol Observatory in Bristol, Vermont.


References

1. Drake RE, McHugo GJ: Large data sets can be dangerous! Psychiatric Services 54:133, 2003

2. Webb EJ, Campbell DT, Schwartz RD, et al: Unobtrusive Measures: Non-reactive Research in the Social Sciences. Chicago, Rand McNally, 1966

3. Kuhn TS: The Structure of Scientific Revolutions. Chicago, University of Chicago Press, 1962

4. Glaser BG, Strauss A: The Discovery of Grounded Theory. Chicago, Aldine, 1967
