Letters
Large Data Sets Are Powerful
Robert E. Drake, M.D., Ph.D.; Gregory J. McHugo, Ph.D.
Psychiatric Services 2003; doi: 10.1176/appi.ps.54.5.746

In Reply: We are in complete agreement with the comments of Drs. Pandiani and Banks and with many of those of Dr. Segal. They have described the other side of the same coin. Theirs is the more commonly viewed side, the one that inspires numerous and increasing efforts to make use of existing large data sets, and the one that is presumably familiar to most readers of the journal. We were asked by the editor of Psychiatric Services to illuminate the dark side precisely because it is less well known.

Our contention is that large data sets "can be" dangerous, not that they are inherently dangerous. Our goal was not to stigmatize research using large data sets but rather to remind the scientific community of the frequently overlooked limitations of large data sets and of the seductive ways that they can lead investigators astray. A parallel editorial in the March 2003 issue of Scientific American suggests that the same concerns are pertinent in other areas of science (1). As the editors of Scientific American point out, the dangers of information overload, poor data quality, and capitalization on chance abound.

Often where there is opportunity there is liability. The use of large data sets presents many opportunities for the advancement of knowledge that is relevant for practice and policy, but it also requires careful attention to data quality and the disciplined application of statistical and inferential methods. The warnings in our editorial address the latter issues, which, on the basis of the journal's experience with manuscripts submitted for publication and our own experience as peer reviewers, appear to be less salient to some users of large data sets. Besides sharing the optimism of Drs. Pandiani and Banks about the potential of large data sets, we also wish that articles sent for review showed their admirable attention to quality.

1. Total information overload. Scientific American 288(3):12, 2003


