Case-Control Study: you are studying the disease and seeing whether you can associate risk factors with it. Case-control studies are typically done for rare diseases: you have a rare disease, so you take that small number of patients, compare them with a control group, and study their history to see whether there was a risk factor that might have caused their disease. Note that a case-control study cannot be prospective, because you already have the outcome (disease versus no disease); you are just studying what has led to this outcome.

Retrospective Cohort: you are studying the risk factor and seeing whether you can associate a disease with it. Note that a cohort study can be retrospective, i.e., you follow the cohort in past time (history), for example by checking their hospital records. You can measure incidence in both prospective and retrospective cohort studies, and therefore you use Relative Risk and Attributable Risk. For example, you can say the incidence of pulmonary tuberculosis was xx% in Asian immigrants who immigrated to the US in the period 2005-2010.

The classic example is smoking and lung cancer. If you conduct a study to find the percentage of smokers among lung cancer patients versus the percentage of smokers among those who do not have lung cancer, then this is a case-control study. Conversely, if you follow smokers and non-smokers over time and see who develops lung cancer, then this is a cohort study.

The case-control design is widely used in retrospective database studies, often leading to spectacular findings. However, results of these studies often cannot be replicated, and the advantage of this design over others is questionable. To demonstrate the shortcomings of applications of this design, we replicate two published case-control studies. The first investigates isotretinoin and ulcerative colitis using a simple case-control design. The second focuses on dipeptidyl peptidase-4 inhibitors and acute pancreatitis, using a nested case-control design. We include large sets of negative control exposures (where the true odds ratio is believed to be 1) in both studies. Both replication studies produce effect size estimates consistent with the original studies, but also generate estimates for the negative control exposures showing substantial residual bias. In contrast, applying a self-controlled design to answer the same questions using the same data reveals far less bias. Although the case-control design in general is not at fault, its application in retrospective database studies, where all exposure and covariate data for the entire cohort are available, is unnecessary, as other alternatives such as cohort and self-controlled designs are available. Moreover, by focusing on cases and controls it opens the door to inappropriate comparisons between exposure groups, leading to confounding that the design has few options to adjust for. We argue that this design should no longer be used in these types of data. At the very least, negative control exposures should be used to prove that the concerns raised here do not apply.

Case-control [1] studies consider the question "are persons with a specific disease outcome exposed more frequently to a specific agent than those without the disease?" Thus, the central idea is to compare "cases," i.e., individuals who experience the outcome of interest, with "controls," i.e., individuals who did not experience the outcome of interest. Often, one matches controls to cases based on characteristics such as age and sex to make them more comparable. The comparison focuses on differential exposure to the agents of interest in the two groups: greater exposure amongst the cases than amongst the controls suggests a possible positive association. Perhaps the greatest success of the case-control design remains its contribution to the evidence that smoking causes lung cancer. [2] Since this landmark finding, researchers have increasingly applied the design, occasionally generating spectacular findings leading to headlines in major news outlets, such as a recent study linking anticholinergic drugs to an increased risk of dementia, [3] or another recent study linking immunosuppressants to a lower risk of Parkinson's disease. The case-control design was originally developed to support prospective studies in situations where data on subjects were costly to acquire, and study budgets did not allow for recruiting and following large cohorts. [5] However, its application in retrospective database studies, for example those using electronic health records and insurance claims where longitudinal person-level data have already been captured and the analysis is performed solely within the resident data, has become commonplace. Because the data already exist, and their acquisition is a sunk cost, the efficiency argument that initially justified the design becomes a moot point.
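
The effect measures named in the text above (the odds ratio used when comparing cases with controls, and the relative risk and attributable risk that a cohort allows) can be sketched from a 2x2 exposure-by-outcome table. This is only an illustrative sketch: the `TwoByTwo` class, the function names, and the counts are hypothetical and do not come from any study mentioned here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Hypothetical 2x2 table of exposure status by outcome status."""
    exposed_cases: int        # exposed, developed the disease
    exposed_noncases: int     # exposed, did not develop the disease
    unexposed_cases: int      # unexposed, developed the disease
    unexposed_noncases: int   # unexposed, did not develop the disease


def odds_ratio(t: TwoByTwo) -> float:
    """Odds of exposure among cases divided by odds of exposure among non-cases.

    This is the measure a case-control comparison can estimate.
    """
    return (t.exposed_cases * t.unexposed_noncases) / (
        t.exposed_noncases * t.unexposed_cases
    )


def risk_exposed(t: TwoByTwo) -> float:
    """Incidence (risk) of the disease in the exposed group."""
    return t.exposed_cases / (t.exposed_cases + t.exposed_noncases)


def risk_unexposed(t: TwoByTwo) -> float:
    """Incidence (risk) of the disease in the unexposed group."""
    return t.unexposed_cases / (t.unexposed_cases + t.unexposed_noncases)


def relative_risk(t: TwoByTwo) -> float:
    """Ratio of incidences; meaningful only when the table comes from a cohort."""
    return risk_exposed(t) / risk_unexposed(t)


def attributable_risk(t: TwoByTwo) -> float:
    """Difference of incidences; meaningful only when the table comes from a cohort."""
    return risk_exposed(t) - risk_unexposed(t)


# Made-up cohort counts: 100 smokers and 100 non-smokers followed over time.
cohort = TwoByTwo(
    exposed_cases=30,
    exposed_noncases=70,
    unexposed_cases=10,
    unexposed_noncases=90,
)

print(f"odds ratio:        {odds_ratio(cohort):.2f}")
print(f"relative risk:     {relative_risk(cohort):.2f}")
print(f"attributable risk: {attributable_risk(cohort):.2f}")
```

In a case-control sample the numbers of cases and controls are fixed by the investigator, so the row totals do not reflect incidence. That is why only the odds ratio is normally reported for that design, while relative risk and attributable risk are reserved for cohort data.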