Civil Justice Memo
No. 33 September 1997
Science in the Courts
by Peter W. Huber and Kenneth R. Foster
On March 17, 1997, the U.S. Supreme Court agreed to review the case of General Electric v. Joiner. At issue is how critically an appellate court may review a trial judge’s decision to admit or exclude expert scientific testimony. The case was filed by an electrician and former smoker, who blamed his lung cancer on exposure to trace PCBs in transformer oil. The trial judge granted summary judgment for General Electric, concluding that Joiner’s “expert” witnesses had in fact submitted no credible scientific evidence supporting the claimed link between PCB exposure and lung cancer. A divided panel of the Eleventh Circuit Court of Appeals reversed. The two judges in the majority concluded that “particularly stringent” review of such decisions is in order, but only when a trial judge rejects expert testimony, not when the judge accepts it.
The Supreme Court isn’t likely to approve this biased and asymmetric standard of appellate review. But Joiner is important for another reason: it presents the High Court with its second opportunity to discuss the core evidentiary question the Court first addressed in Daubert v. Merrell Dow Pharmaceuticals four years ago. What is “scientific knowledge” and when is it reliable?
These deceptively simple questions have been the source of endless controversy. Whether “creation science” should be taught in schools along with the theory of evolution turns on whether creation science—or evolution, for that matter—can fairly be called “science” rather than “belief,” “faith,” or something else. In the courtroom, the outcome of litigation—criminal, paternity, first amendment, and civil liability cases, among others—often turns on scientific evidence, the reliability of which may be hotly contested. Sometimes, as with DNA testing in capital murder trials, the reliability of a claim presented as “scientific” is a matter of very grave consequence indeed.
When presented with a witness who proposes to give expert testimony about some aspect of science, a judge must make an up-or-down call on admissibility. The evidence either will be presented to the jury or it will not. Judges are gatekeepers. The jury has the separate, more flexible job of “weighing” whatever evidence has been admitted. Decisions on admissibility may, and often do, determine the outcome of a trial, but that’s beside the point. The decision the judge makes is not, in itself, directed at the parties to litigation. It is directed at discrete packages of testimony and evidence. The issue before a judge is not whether electromagnetic fields caused this plaintiff’s cancer (for example), or whether such fields ever cause cancer; the issue for the judge is whether an expert is offering sufficiently reliable, solid, trustworthy evidence that they do and did.
In any discussion of what evidence rules should be, it is essential to maintain a clear perspective of how broadly these rules reach. Rules of evidence apply in criminal cases as well as civil ones. And they apply equally to prosecutors and defendants. Rules that are “liberal” for civil plaintiffs in “toxic tort” cases will be “liberal” for prosecutors in capital murder cases too. Rules that may seem to tilt things in a socially desirable direction in today’s case may tilt the opposite way in tomorrow’s.
It is equally important to recognize that labeling evidence “inadmissible” is not the same as labeling it “false.” Judges applying the rules of evidence are not passing judgment on the ultimate truth of specific scientific propositions. Comparisons with Galileo miss the mark completely: The judge who declines to admit an expert’s testimony does not then send the expert to prison. In enforcing rules of evidence, judges are simply ruling on whether a particular proposition, presented by a particular witness, is sufficiently reliable and well grounded to be admissible in this trial, at this point in time. For similar reasons, ordinary (non-expert) witnesses are likewise forbidden to present “hearsay” testimony. The hearsay rules of evidence are not based on the (plainly incorrect) presumption that all gossip is always false. Gossip is quite often true. But judges concluded long ago that gossip, though sometimes true, just isn’t reliable enough to be presented to a jury in court. Preliminary, unpublished, and tentatively framed scientific findings may turn out to be true too, but there is good reason to exclude evidence of this character, nevertheless.
In its 1993 Daubert ruling, the Supreme Court discussed what factors federal judges should weigh in performing their proper role as “gatekeeper” when scientific experts arrive at the courthouse doors. “Faced with a proffer of expert scientific testimony,” Justice Blackmun wrote for a seven-Justice majority, “the trial judge must determine . . . whether the expert is proposing to testify to (1) scientific knowledge that (2) will assist the trier of fact to understand or determine a fact in issue . . . . Many factors will bear on the inquiry, and we do not presume to set out a definitive checklist or test. But some general observations are appropriate.”1 The Court’s “general observations” followed.
The Daubert majority relied on unusual authorities. It cited two of the most influential philosophers of science in this century, Carl Hempel (1905-) and Sir Karl Popper (1902-1994), as well as John Ziman, a prominent physicist turned commentator on science. The High Court cited the editors of influential medical journals. It cited three amicus briefs filed in Daubert on behalf of groups of scientists, and quoted from two of them.2 One of those groups comprised eighteen scientists, including six Nobel Laureates,3 with expertise in chemistry, physics, meteorology, epidemiology, environmental medicine, and teratology. The second scientists’ amicus brief had been filed on behalf of the American Association for the Advancement of Science and the National Academy of Sciences. The third group of scientists included Stephen Jay Gould, the renowned author and paleontologist.
As Chief Justice Rehnquist pointed out in his Daubert dissent, Supreme Court opinions do not ordinarily rest on this kind of intellectual foundation. Indeed, few of the scientific sources cited by the majority would be readily at hand for most judges to consult, nor would the broader literature that those sources summarize and represent. Yet seven Justices of the Supreme Court agreed that the meaning of a key phrase in the Federal Rules of Evidence— “scientific knowledge”—cannot be given intelligent meaning without venturing beyond the standard law library into the domains of scientists and philosophers.
The Court wrote at some length about the factors that judges should consider in their Rule 702 analyses. It directed trial judges to determine whether a theory or technique can be (or has been) tested, and whether it is “falsifi[able].”4 “Peer review” is to be weighed as an important, though not dispositive, factor.5 Trial judges are to consider the “known or potential rate of error” of a scientific technique and the “existence and maintenance of standards controlling the technique’s operation.”6 “General acceptance” within a scientific community is an additional, important factor bearing on admissibility.7 The Court noted that “the inquiry is a flexible one, and its focus must be solely on principles and methodology, not on the conclusions that they generate.”
The criteria enumerated by the Daubert Court center on the “reliability” and “validity” of proffered testimony. But the meaning of “validity” when used in connection with a general scientific proposition is neither unambiguous nor uncontested among scientists themselves; the “validity” (or otherwise) of any one scientific paper or study is more ambiguous still. How are judges to decide “scientific validity” when scientists and philosophers themselves cannot agree what those words mean, or when they apply?
The Daubert Court began with the issues of “relevance,” or “fit.” Can a test, study, calculation, or observation be related, by a credible theory, to the issue at hand? A fact may be reliable and accurate, yet the inferences that it suggests may still be confusing, misleading, or plain wrong. It may be true that Jupiter was aligned with Mars; that fact is almost never relevant to whether a driver born under Scorpio behaved negligently when entering an intersection. Out of context, or viewed from the wrong perspective, true facts often imply false conclusions.
The question of “fit” often ends up turning on how much other evidence there is and how consistently the totality of the evidence at hand supports a theory explaining what is going on. The better we understand the precise cause(s) of a disease, the less we tend to argue about “fit.” Discordant bits and pieces of evidence that do not “fit” in an otherwise coherent and overwhelming mass of evidence are, in practice, simply disregarded by risk assessors. But when there is no coherent theory, and no overwhelming mass of data, there is no “discordant” evidence, either.
The inquiry about “fit” in effect requires judges to survey the sum total of scientific evidence that bears on the proposition in dispute. “Fit” acquires meaning only in a broader context. A judge ruling on admissibility therefore has no choice but to consider how each expert’s testimony fits into the larger context of the scientific case. “Fit” boils down to placing bits and pieces of evidence in the larger context of what is known. Discrete assertions of fact or theory that clash too sharply with the totality of the scientific evidence at hand must be excluded, particularly if the expert does not carefully reconcile the discordant evidence with other established knowledge.
A “scientific proposition,” the Daubert Court continued, must be phrased in a way that can be falsified by other scientists if it is wrong. The majority cited Hempel and Popper. In dissent, Chief Justice Rehnquist insisted that he was “at a loss to know what is meant when it is said that the scientific status of a theory depends on its ‘falsifiability.’”8
While most scientific papers are not written in strictly Popperian terms, the “testability” and “falsifiability” criteria are both useful and used by scientists. These terms tell us something about how a proposition must be framed to be admissible as “scientific” in federal court. Propositions that are so loosely framed as to be incapable of being tested are very slippery indeed. Many misuses of expert testimony in court involve “nonfalsifiable” diagnoses of disease—disease with nonspecific symptoms, for which no definite criteria exist to determine when patients do not have it. The diagnostic criteria for multiple chemical sensitivities, for example, are so loose as to make it impossible to prove that any person does not have the syndrome. In the worst cases, “cause” and “effect” are verbalized out of existence, replaced by terms like “activated,” “precipitated into disabling manifestation,” “aroused into disabling reality,” and other lawyerly prevarications. As the great but irascible physicist Wolfgang Pauli once remarked about a report he had read: “That paper isn’t even good enough to be wrong!”
The third factor emphasized in Daubert is the problem of scientific error. Errors arise in many different ways, and scientists have formulated nuts-and-bolts prescriptions to try to avoid them. There is an extensive scientific literature on the subject. It confirms that the individual scientist alone is not good at estimating or correcting error. There are, however, normative prescriptions for estimating and reporting errors that come from the larger scientific community and are articulated in widely accepted standards. Judges can quite readily locate and enforce such standards, if they care to. Some are discussed in Judging Science. And since Daubert was decided, the scientific and legal communities have set about generating a serious literature on this subject for judges to consult.
The first of two positive criteria articulated in Daubert was that expert testimony must be “reliable.” As the Daubert Court put it: “evidentiary reliability will be based on scientific validity.” The majority distinguished “reliable” science from mere “conjecture,” “subjective belief,” or “unsupported speculation.”9
Even the best of science commonly begins as conjecture. The problem isn’t conjecture itself, but the untested and hence unreliable nature of conjecture. As Popper argued, a scientific theory is tested by comparing conclusions that can be deduced from the theory among themselves, by investigating the logical form of the theory, by comparing the theory with other theories, and by testing the theory empirically. As a general, though not invariable rule, what moves claims from the “conjecture” side of that continuum to the “scientific knowledge” side is the active involvement of a broader scientific community.
Bayes’ Theorem provides formal, though somewhat difficult, proof that the inherent “reliability” of an observation depends on two factors. One is the reliability of the observer. Biologists evaluating a report about an abominable snowman in the Andes may instinctively, and with some justification, inquire as to the sobriety and observational skill of the person who makes the report. A judge applying the Daubert criteria to the testimony of a specific witness likewise has no choice but to inquire, in some measure, about the expert’s personal reliability as an observer and interpreter of science. However objectively “valid” the substance of proffered testimony, it will not be accepted if the expert arrives to testify visibly intoxicated.
The second component of reliability centers on the “objective” probability of the thing being observed. The answer to that question requires knowledge of the background incidence of the thing being tested for—HIV infection, say, or child abuse. Daubert plainly does not contemplate that experts will self-certify their own “reliability.” This suggests a conclusion of profound, practical importance in court. The only way to assess the “reliability” of any single scientist’s observation or opinion is to combine information about the observational tools used by the individual expert with external information about the actual likelihood of the thing the scientist claims to have observed. Some judges may reasonably conclude that the most convenient, and perhaps only “external” information about “background” likelihoods is the view most generally accepted by other scientists in the field. Other approaches are possible, but all look to some external referent beyond the individual expert. Bayes impels an important legal conclusion. It is not possible to assess the “reliability” of an observation without some extrinsic, “objective” yardstick of the background probability of the thing observed.
Thus, when an individual scientist testifies that a subject’s birth defects were caused by the mother’s use of a drug during pregnancy, there is no way to assess the reliability of the scientific proposition underlying that opinion without reliable information on background rates of similar birth defects among children of mothers who used the drug and mothers who didn’t. Absent such information, the reliability of the individual diagnosis cannot be gauged at all. That does not mean that the opinion or theory is inherently unreliable (or reliable). It simply means that we are missing a key input. One cannot measure the area of a rectangle by measuring only one side, no matter how carefully one measures it.
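The dependence on background rates can be made concrete with a short calculation. The sketch below is illustrative only: the function name, the observer's accuracy figures, and the two background rates are hypothetical numbers chosen for the example, not data from Daubert or from any study. It applies Bayes' Theorem to the same observer under two different background rates.

```python
def posterior_probability(prior, sensitivity, false_positive_rate):
    """Bayes' Theorem: probability the condition is really present,
    given that the observer reports it.

    prior               -- background rate of the condition (hypothetical)
    sensitivity         -- chance the observer reports it when present
    false_positive_rate -- chance the observer reports it when absent
    """
    true_positives = sensitivity * prior
    false_positives = false_positive_rate * (1.0 - prior)
    return true_positives / (true_positives + false_positives)


# The same observer (99% sensitive, 1% false positives) in two settings.
# All figures are assumptions made for illustration.
rare = posterior_probability(prior=0.001, sensitivity=0.99,
                             false_positive_rate=0.01)
common = posterior_probability(prior=0.10, sensitivity=0.99,
                               false_positive_rate=0.01)

print(f"background rate 0.1%: report is correct {rare:.1%} of the time")
print(f"background rate 10%:  report is correct {common:.1%} of the time")
```

Under these assumed figures, an identical and apparently very accurate observer is right about 9 percent of the time when the condition is rare and about 92 percent of the time when it is common. The observer's own skill is one side of the rectangle; the background rate is the other.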
The second positive criterion set out in Daubert, and undoubtedly the most important, is one that attempts to clarify what the Court means by reliability. “In a case involving scientific evidence, evidentiary reliability will be based upon scientific validity,” the Court wrote, italicizing the final two words.10 The Court returned to the term scientific “validity” at numerous other points in its opinion. According to the Court, the “overarching subject” of the debate about admitting expert testimony is “the scientific validity—and thus the evidentiary relevance and reliability—of the principles that underlie a proposed submission.”11
“Validity” has several components. The first is logical validity. This is a matter of internal consistency. A mathematical proof is either “valid” or it isn’t. But less formal arguments can be tested for logical validity too. Arguments built on non sequiturs, ad hominem attacks, appeals to ignorance, begging the question, special pleading, and so on lack “validity” at a basic level of logic. A scientific argument built on logic that does not parse is invalid on its face, and must fail Daubert’s “validity” requirement.
Logical issues aside, however, “scientific validity” is not a binary, yes-or-no quantity, and it is not easy to gauge. Popper used the metaphor of science rising on piles above a swamp. There is never an utterly solid bedrock beneath, no “‘natural’ or ‘given’ base.” Scientists simply stop driving deeper when they are satisfied that the piles “are firm enough to carry the structure, at least for the time being.” Stopping points are determined, instead, by strictly practical, social considerations—how the science is to be used.
But Popper also did list various positive ways in which the strength and soundness of a theory can be weighed. It can be tested for logical consistency. It can be compared with other theories, and it can be tested empirically. These criteria are cumulative; the more advanced the testing, the more confidence we can have in a theory’s strength. In practice, any scientific theory can be tested through the same process of skeptical questioning that any intelligent layperson would apply to any factual claim about the world. Theories that are logically inconsistent, or that disagree with already established scientific principles, are not “valid” in the lay sense of the word.
“Validity” is also a term of art in science, particularly in sciences like epidemiology that rely heavily on statistics. In these contexts, at least, science itself offers more precise definitions of the term. In particular, statistics offers clear guidelines for the valid use of tests for statistical inference, guidelines that follow from the assumptions made in developing the tests. It is safe to assume that the Daubert Court had no precise definition in mind when it used the term “scientific validity.” But it is equally reasonable to assume that any analysis that involves serious misuse of statistics fails the test of “validity” and would be inadmissible under Daubert. The statistical tests have been formulated with an eye to issues entirely analogous to the “evidentiary reliability” that concerned the Daubert Court.
Acceptance in The Scientific Community
The last factor noted by the Daubert Court as bearing on the quality of “scientific knowledge” was peer review and “general acceptance.” Science has mechanisms for articulating what well-informed scientists “generally accept” to be true about issues ranging from the way that research should be conducted to socially important issues such as health and disease. Reports of well-constituted groups under the auspices of the National Academy of Sciences, or National Council on Radiation Protection and Measurements, for example, set forth the best scientific thinking that is available about important issues. Such reports often include minority statements or other indications of the range of informed opinion. Any individual witness who holds views sharply at variance with statements like these should bear a heavy burden of explaining why he is right and the community, speaking through its committees, academies, or institutes, is wrong. Other expressions of “general acceptance” include the reports adopted by major medical societies that set forth diagnostic criteria for diseases, or reports by the Association of Official Analytical Chemists describing methods of analysis for chemicals of regulatory or health significance.
“Peer review” is one gatekeeping mechanism in science that detects and reduces error, and publication in a peer-reviewed journal is an important step in gaining “general acceptance” of a theory within the scientific community. Peer review does not ensure the validity of a scientific finding (although the process usually does weed out flagrantly unsupportable claims) but is an initial gatekeeper in the knowledge filter of science.
Prejudicing, Confusing, Or Misleading The Jury
Finally, the Supreme Court alluded to the distinct issues of jury prejudice and confusion. Even valid science may not be presented in ways likely to encourage valid inference by lay jurors. The Federal Rules of Evidence independently require exclusion of any evidence that presents too great a risk of that kind. Photographs of an autopsy are routinely excluded not because they somehow lack any scientific or medical validity, but because they are more likely to provoke revulsion than rational thought among ordinary jurors.
Sophistic presentation of scientific data is a particular problem in court because science often relies on statistical reasoning which is often very counter-intuitive or at least difficult to fathom for jurors unaccustomed to quantitative reasoning. Moreover, the jury may incorrectly believe that the individual scientist speaks with the authority of science—believe, in other words, that the individual in some sense represents a broader body of established scientific learning. A similar error is encouraged by pasting a doctor’s picture on a bottle of medicine. The medicine may or may not work. But the picture of the doctor suggests not merely the individual endorsement of one physician, but the approval of an entire medical community. A more subtle problem is the claim (either implicit or explicit) that “science” has some privileged epistemic position by virtue of a scientific method.
The issue here is legal, not scientific, except perhaps insofar as it implicates the science of psychology. The “scientist” comes to court with an impressive title—in a white coat, either literally or figuratively. The white coat, and all it implies, has an effect: Lay people often tend to put more faith in his claims and arguments than they merit. They do this because science as a whole has high credibility and influence in society. The problem of prejudice, in other words, derives from the fact that the expert witness comes into court implicitly claiming (1) special status for “scientists” and “scientific knowledge,” and (2) his own, upstanding membership in the exclusive club that confers that status.
In the minds of ordinary jurors, that special status undoubtedly derives from the perception that scientists as a group have special education, special means for discovering scientific knowledge, and special procedures for checking out each other’s work. An ordinary juror may thus readily assume that an individual scientist comes to court as an ad hoc but faithful representative of a larger professional community. If the individual scientist in fact presents views that have not been derived, shared, or checked by other scientists, there is a subtle but serious problem of misrepresentation.
The scientist in court presents a serious risk of being too credible, precisely because the implicit message is that the witness speaks with the authority of a larger community behind him. The simplest solution—and perhaps the only one—is for judges to make sure that the expert witness relies on theories or methods that have survived extensive testing in the scientific community.
That is pretty much what common-law rules of evidence called for decades before Daubert was written. Daubert provided new detail, and, more importantly, new determination to take a close look at evidence packaged as “scientific.” The Supreme Court will have more to say on the general subject when it decides General Electric v. Joiner this Fall. Justice Blackmun, the author of the Daubert majority opinion, has retired. His replacement, Justice Breyer, has a keen and sophisticated interest of his own in how science is handled in court, and even published a short, influential book on risk analysis (Breaking the Vicious Circle) before he was nominated to serve on the High Court. It will be particularly interesting to read what, if anything, he chooses to write in this case.
In all likelihood, however, the Daubert standard will emerge unchanged from Joiner. Federal judges, at both trial and appellate levels, will continue to give expert testimony the careful scrutiny that Daubert demands and that most judges have begun to apply.
To find out more about Judging Science: Scientific Knowledge and the Federal Courts (MIT Press),
contact Andrew Hazlett at Manhattan Institute, 52 Vanderbilt Ave., NYC 10017, 212-599-7000
We call readers’ attention to an important article by Samuel Jan Brakel in Vol. 31, No.1 of the Georgia Law Review (Fall 1996), pp. 77-200. The article is entitled “Using What We Know About Our Civil Justice System: A Critique of ‘Base Rate’ Analysis and Other Apologist Diversions.” For some years a small band of commentators have suggested in law reviews and elsewhere that America does not in fact have unusually high rates of litigation, that litigants seldom press abusive or speculative claims in our civil courts, and that irrational verdicts and outsize damage awards are rare exceptions of little practical importance to the general run of cases. To back up these seemingly counterintuitive assertions they have offered a variety of statistics regarding case filings, median damage awards and other indicators. Brakel extensively surveys this “apologist” literature and finds it wanting in numerous respects, especially its highly selective and often tendentious use of statistics. A limited number of copies of Brakel’s article are available to friends of the Institute; contact Andrew Hazlett.
1. Daubert v. Merrell Dow Pharmaceuticals, Inc., 113 S. Ct. at 2796.
2. Daubert, 113 S. Ct. at 2795 (citing and quoting from Brief for Nicolaas Bloembergen et al., as Amici Curiae 9); ibid. (citing and quoting from Brief for American Association for the Advancement of Science and the National Academy of Sciences as Amici Curiae 7-8); id. at 2798 (citing Brief for Ronald Bayer et al. as Amici Curiae).
3. Nicolaas Bloembergen, in Physics, 1981; Dudley R. Herschbach, in Chemistry, 1986; Jerome Karle, in Chemistry, 1985; Wassily Leontief, in Economics, 1973; William N. Lipscomb, in Chemistry, 1976; Arno A. Penzias, in Physics, 1978.
4. Daubert, 113 S. Ct. at 2796.
5. Daubert, 113 S. Ct. at 2797.
6. Daubert, 113 S. Ct. at 2797.
7. Daubert, 113 S. Ct. at 2797 (citing United States v. Downing, 753 F.2d 1224, 1238 (3d Cir. 1985)).
8. Daubert, 113 S. Ct. at 2800.
9. Daubert, 113 S. Ct. at 2795 n.9.
10. Daubert, 113 S. Ct. at 2795 n.9.
11. Daubert, 113 S. Ct. at 2797.