
Epidemiology produces too many false positive results! Do you agree? If so, what should we do about it?

March 13, 2012


In planning for the future, we can learn from the past. Epidemiology has been criticized for producing too many false positive research findings as well as too many findings that merely replicate previous ones. Have we moved beyond the unnecessary repetition of studies that was described 10 years ago? How do we know when to move from observation to acceptance of a research finding? Where is the “sweet spot” for decisions on future epidemiologic research: that ideal balance between repeating research to confirm or refute previous findings and avoiding redundant research that wastes resources?

Posted by Epidemiology Branch, NHLBI

10 Comments
  1. Monika Safford permalink
    March 14, 2012 6:07 pm

    If this is a priority, NIH could make this a specific review criterion during peer review. Targeting studies on specific high-risk and understudied subgroups (e.g., complex patients, minorities) can avoid purely duplicative work.

  2. Paul Sorlie permalink
    March 14, 2012 8:04 pm

    Great start with new thinking. Do you think that this is too subjective for reviewers or could they handle it? Are some components of a research proposal duplicative and others innovative? Could a review group suggest deletion of some components? I would like to know what others think.

  3. Tim Lash permalink
    March 15, 2012 1:46 pm

    Concern about false-positive epidemiologic associations stems from the insistence on counting study results instead of topic areas. The pressure to publish statistically significant results has created a machine for generating false-positive associations based on under-powered study designs. These overestimated associations must then be overcome by replication, using bigger studies that are better designed. With the accumulation of evidence, true positive associations stand the test of time. If we count topic areas (and especially topic areas for which intervention policies have been promulgated), the false-positive proportion is drastically reduced. We should worry less about false-positive associations in single studies and appreciate that few false-positive policy recommendations have been made, which is a result of patience while results accumulate and the research topic matures.

    “Duplication” is almost always a bad idea, especially for newly designed epidemiologic studies on the same topic. “Replication” is a much better term. Studies should endeavor to BOTH replicate earlier results AND advance the topic area by improving the research quality. Improvements can include larger sample size, better measurement of exposure, outcome, or important covariates, better control for confounding, or reduced susceptibility to selection bias. If an initial study is susceptible to multiple biases and based on small study size, then the replicating studies should each strive to improve the evidence base by addressing limitations of earlier studies. Replication and advance is the most productive approach; duplication alone is a bad idea unless based on existing data sets with low marginal cost to generate the result.
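
The argument above, that a single significant result from an under-powered study is often wrong while accumulated replications are much more trustworthy, can be illustrated with a rough calculation. This is only a sketch under stated assumptions: the prior probability that an association is real, the power values, and the independence of successive studies are all hypothetical choices, and bias is ignored entirely.

```python
def positive_predictive_value(prior: float, power: float, alpha: float = 0.05) -> float:
    """Probability that a statistically significant association is real.

    prior: assumed share of tested associations that are truly non-null
    power: chance a study detects a real association
    alpha: chance a study flags a null association as significant
    """
    true_positives = power * prior
    false_positives = alpha * (1.0 - prior)
    return true_positives / (true_positives + false_positives)


def after_replications(prior: float, powers: list[float], alpha: float = 0.05) -> float:
    """Update the probability after a sequence of independent significant studies.

    Each study's result becomes the prior for the next one. Independence of
    the studies is an assumption of this sketch, not a property of real data.
    """
    probability = prior
    for power in powers:
        probability = positive_predictive_value(probability, power, alpha)
    return probability


if __name__ == "__main__":
    # Hypothetical inputs: 1 in 10 tested associations is real.
    single_small_study = positive_predictive_value(prior=0.10, power=0.20)
    small_then_large = after_replications(prior=0.10, powers=[0.20, 0.80])
    print(f"one under-powered study:        {single_small_study:.2f}")
    print(f"plus one well-powered replicate: {small_then_large:.2f}")
```

Under these made-up inputs, most of the gain comes from the second, better-powered study, which is in line with the "replicate and advance" approach rather than duplication of the same weak design.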

  4. david herrington permalink
    March 18, 2012 8:48 am

    In addition to the excellent suggestions outlined above by Tim, the population science community should make better use of tools designed to establish the generalization error of findings. This will not overcome the problems with sample bias and poor power due to small sample size, but will reduce some false positives that are due to overfitting.
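
One simple tool of this kind is a hold-out check: pick the "best" finding in one part of the data and see whether it survives in a part that played no role in the selection. The sketch below is an illustration on simulated pure-noise data, not a method taken from the comment; the function names, sample sizes, and number of candidate predictors are all hypothetical.

```python
import random


def pearson(x, y):
    """Plain Pearson correlation coefficient for two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def dredge_and_validate(n_per_half=50, n_predictors=200, seed=0):
    """Select the strongest of many noise predictors, then re-test it.

    The outcome and every predictor are independent noise, so any
    association found in the discovery half is a product of selection.
    Returns (correlation in discovery half, correlation in hold-out half).
    """
    rng = random.Random(seed)
    n = 2 * n_per_half
    outcome = [rng.gauss(0, 1) for _ in range(n)]
    predictors = [[rng.gauss(0, 1) for _ in range(n)] for _ in range(n_predictors)]

    discovery = slice(0, n_per_half)
    holdout = slice(n_per_half, n)

    # Overfitting step: keep whichever predictor looks best in discovery data.
    best = max(predictors, key=lambda p: abs(pearson(p[discovery], outcome[discovery])))

    return (
        pearson(best[discovery], outcome[discovery]),
        pearson(best[holdout], outcome[holdout]),
    )


if __name__ == "__main__":
    r_discovery, r_holdout = dredge_and_validate()
    print(f"discovery r = {r_discovery:+.2f}, hold-out r = {r_holdout:+.2f}")
```

As the comment notes, a check like this addresses overfitting only; it does nothing for a biased sample, because the hold-out half shares whatever bias the discovery half has.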

  5. Denise Simons-Morton permalink
    March 24, 2012 2:06 am

    Question posed: “How do we know when to move from observation to acceptance of a research finding?”

    Answer: When you’ve done a randomized controlled trial that confirms the causal hypothesis.

    Observational epidemiology is only hypothesis generating. When substantial evidence of causality is accumulated (e.g. dozens of studies with consistent findings showing associations with the correct temporal sequence and dose-response relationships, with sufficient control for confounding), it should lead to trials to test the hypotheses that are generated.

    Examples of hypothesis confirmation by trials: CVD outcomes from blood pressure and blood cholesterol lowering. Examples of non-confirmation: CVD and mortality outcomes from arrhythmia suppression post MI, HRT, and intensive blood glucose lowering in T2DM. Substantial epidemiological evidence supported the latter hypotheses, but it was insufficient to move to “acceptance.” The problem is that you don’t know which findings will be supported by trials unless you do the trials.

    • epi-dunce permalink
      March 24, 2012 9:30 am

      But, when do you know that you have enough information to move to a trial? What do you do when a trial is impossible, unethical, or unaffordable?

  6. fred permalink
    March 24, 2012 9:48 pm

    As argued by our colleagues in Social Sciences, “false positives” is the wrong term for the problems we’re trying to describe. The risk factors we identify are not *exactly* independent of disease, just like experimental drugs in randomized trials do not have *exactly* the same effect as placebo. Thus, the null hypothesis is not true, and thus we cannot “falsely” claim to reject it. So, strictly speaking, we never generate false positives.

    This does not, of course, rule out epidemiologists (and social scientists, and others) i) grossly over-stating the statistical and/or practical significance of their results ii) getting the direction of effect wrong iii) attributing causal claims to associations which do not provide such justifications. Problems i) and ii) are known as “Type M” (for magnitude) and “Type S” (for sign) errors. Problem iii) is a case of the generalization problem which David Herrington mentions.

    These problems are all different. If you want to insist (reasonably enough, but not universally accepted) that science is moved forward by having replicable results, then we need to re-focus analyses on what that replication would look like. That is, we need to supplement analyses with honest, well-calibrated measures of what it would take to replicate the main results. We do not currently do this; for example, all the little choices one makes in unplanned analyses contribute to Type M errors, but are typically just ignored. Not having a replication population sufficiently similar to the original leads to generalization errors, as well as errors of Types M and S. Data dredging grossly inflates the Type M problem. Not accounting for measurement errors leads to errors of Types M and S; one could go on. Yet these (and other) problems are routinely addressed by weak, informal, “hand-waving” arguments, tagged onto the end of many papers just to mollify reviewers.

    What to do about this? Epidemiologists can help by looking more critically at their own results; trying to formulate, honestly, what a replication study would be expected to require would be a useful first step. Journals can help by insisting that the concerns above are addressed by referees – the good ones already do this, but not all; the “p-value culture” is a blight, and sensitivity analyses are rarely seen as important parts of papers – “more research is needed” is a mantra, but few papers say what that research would look like, or what it might achieve. NHLBI can help by insisting that reviewers appraise projects by using direct measures of their scientific contribution – i.e. that, however the results come out, they will result in replicable and useful knowledge. Considering what to do about (non-existent) “false positives” is, in large part, the wrong question to ask.
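
The Type M and Type S errors described in this comment can be made concrete with a small simulation. This is a minimal sketch, assuming a normally distributed estimate with a known standard error and a two-sided 5% significance threshold; the true effect (0.1) and standard error (0.5) are hypothetical values chosen to represent a low-powered study.

```python
import random


def type_m_and_s(true_effect=0.1, std_error=0.5, n_studies=100_000,
                 z_crit=1.96, seed=1):
    """Simulate many identical studies and keep only the significant ones.

    Returns a tuple of:
      power        - share of studies reaching significance
      exaggeration - mean |estimate| among significant studies divided by
                     |true_effect| (the Type M, or magnitude, error)
      wrong_sign   - share of significant estimates whose sign is opposite
                     to the true effect (the Type S, or sign, error)
    """
    rng = random.Random(seed)
    significant = []
    for _ in range(n_studies):
        estimate = rng.gauss(true_effect, std_error)
        if abs(estimate) > z_crit * std_error:
            significant.append(estimate)

    power = len(significant) / n_studies
    mean_abs = sum(abs(e) for e in significant) / len(significant)
    exaggeration = mean_abs / abs(true_effect)
    wrong_sign = sum(1 for e in significant if e * true_effect < 0) / len(significant)
    return power, exaggeration, wrong_sign


if __name__ == "__main__":
    power, exaggeration, wrong_sign = type_m_and_s()
    print(f"power: {power:.3f}")
    print(f"exaggeration ratio among significant results: {exaggeration:.1f}")
    print(f"share of significant results with the wrong sign: {wrong_sign:.2f}")
```

Running the same function with the standard error a replication study would realistically achieve is one way to state, in advance, what that replication could be expected to show.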

  7. gadfly permalink
    April 4, 2012 11:42 am

    Suppose investigators had to register their specific aims for epidemiology studies when they receive funding, on something similar to a clinical trial registry, along with their analysis plan for those aims. Then journal submissions could cite the registration as justification for publication whether the findings are positive or negative. This would reduce publication bias and lead to more balanced scientific productivity from research funding (research that was the basis for funding by peer review would be publicly reported). Papers developed on topics not in the specific aims could be submitted to journals as results from data mining or hypothesis-generating exercises, and the journal review could treat them as it thinks appropriate.

  8. Christie Ballantyne permalink
    April 4, 2012 12:28 pm

    As someone who sees patients, is active in clinical trials and also is active in epi research, I have a different viewpoint. The epidemiological studies, as pointed out by others, find associations and identify potential targets which may or may not be in the causal pathway for disease and thus may be good targets for therapy, either drugs or lifestyle or devices. The question of false positives is occasionally due to power, i.e., the observation is not confirmed, but more commonly is due to studies which show an association which are followed by trials of therapies (vitamin E, treating homocysteine, hormone replacement therapy, etc.) which fail.

    As also pointed out, the success of cholesterol-lowering and blood-pressure-lowering therapies in reducing morbidity and mortality from CVD is impressive. It is extraordinary how the treatment of these two conditions has evolved since I finished my internship in 1983, when a cholesterol value was considered high when over 300 mg/dL and when SBP was high if greater than the age of the patient plus 100, i.e., if you were 75, then anything under 175 systolic was OK. Clearly the epi studies have made enormous contributions to this area, and more importantly, the combination of epi studies with new approaches in genetics, in which we can examine both common and rare variants, will have an even more profound impact on the development of rational targets for drug therapies. The best example is LDL as a target of therapy. There was a nice presentation at ACC by Ference as a late-breaking trial which showed that the genetic variants in multiple genes that were associated with lower LDL-c levels were all associated with reduced CHD events. He went a step further and showed that if you adjusted for the LDL-c difference for each variant, there was no difference in the reduction in CHD for the 9 SNPs in 5 different genes. This suggests that all of the genes examined may be good targets for drugs, and one of them, HMGCR, clearly is a good target as this is the gene product which statins target.
    The most promising target for new drugs in many years is PCSK9, and every presentation shows the data from the ARIC study in which individuals with a loss-of-function genetic variant have reduced CHD events. This does not prove that the benefits of a therapy will outweigh the risks of a therapy, which can only be addressed in large RCTs, but it does provide a strong scientific rationale for a potential therapeutic drug target. The emerging studies on HDL are quite different: some of the genes associated with HDL levels are NOT associated with CHD, and thus therapies which raise HDL-c levels may not have any benefit if they target the wrong pathway.

    In summary, large, well-powered, well-phenotyped population studies with extensive genetic data represent an extraordinary opportunity not only to generate hypotheses, but to examine whether “nature’s experiments” of genetic variation alter the disease process. Clearly this requires validation, and it gets very complex when one considers interactions between genes and environment, but nonetheless, I feel that the approaches taken by the NIH and NHLBI have been extraordinarily productive and have made a very large impact on public health, in particular the marked decline in CVD. The new explosion of genetic data will be extraordinarily helpful to guide basic scientists as to genes, gene products and pathways that are important in human disease, and in providing data on which new targets make the most sense to pursue for therapeutic approaches.


