He Said / She Said: How to Keep Liars Out of Court

By Jonathan Marin (Copyright 2000)


In recent years, frequent instances of wrongful criminal convictions and mistaken verdicts in prominent civil cases have attracted media attention and become a matter of public concern. They have reinforced the sense that attorneys, including prosecuting attorneys, often willingly and even eagerly make use of highly suspect testimony. The Rampart police perjury scandal in Los Angeles, and similar scandals in Illinois, West Virginia, and other states, have heightened public awareness of the untrustworthiness of testimony by police and informants, especially jailhouse informants and accomplices testifying under plea agreement. A corollary of the public awareness of official misconduct has been a high volume of accusations against law enforcement officers, and public skepticism when, as usually happens, the officers are cleared. While the accusations are pending, however, they are a source of anxiety and distraction for innocent officers. The volume of accusations, together with the “he said / she said” nature of most cases and the presumed credibility of an officer vis-à-vis an accused criminal, creates a screen behind which the occasional rogue cop can successfully hide, sometimes for years. This article looks at how a novel, limited use of polygraph results might address these problems.

Whenever a question has been debated for a long time, where champions of both sides are intelligent, in good faith, and fully informed as to the facts, there is a high probability that it is the wrong question. The polygraph debate has centered on the question of whether polygraph results should be admitted as evidence in court. This article sidesteps the traditional positions of both proponents and opponents of this use of the polygraph and instead asks the question: “In light of what is known about its validity and reliability, how can the polygraph best be used?” A strong case emerges for an innovative, but sound, “third way.” Some constituencies of both groups may perceive the proposal as a threat to their interests. Nevertheless, I think this “third way” is a reform worth fighting for, in order to restore public confidence in testimony by police officers and cooperating witnesses and to prevent miscarriages of justice.

The article addresses the issues and problems–scientific, legal, and social–that surround the use of polygraph evidence in court and suggests an approach to its use that I think safely navigates the minefield they present. It presents the idea that polygraph results should not be admitted as evidence in their own right but rather should be used as a tool to screen out untrustworthy evidence.

Courts should exclude testimony from a witness who has tested “deceptive” where–and only where–that result is corroborated by a “nondeceptive” result on the opposing side (“paired results”). Using only paired results resolves the unquantifiable uncertainties and reduces by a factor of at least 5 the quantifiable uncertainties that underlie most legitimate resistance to the polygraph. If individual examination results are incorrect 15% of the time, then after correction for chance the probability that two together will be in error is only 4.5%. Juries make mistakes. Evidence that has at most a 4.5% likelihood of being true is too untrustworthy to warrant submitting it for their consideration. Excluding it will reduce the incidence of perjury and the number of mistaken verdicts. It will discourage frivolous claims and frivolous defenses, thereby reducing court caseloads and backlogs. Wherever possible, the decision to exclude should be made before trial.

The article addresses broad problems of admissibility and exclusion of evidence affecting both civil and criminal cases. In respect to criminal cases, it considers the constitutional implications of the approach, particularly those having to do with the right against self-incrimination. It also analyzes alternative approaches and their drawbacks.

He Said / She Said: Polygraph Evidence in Court


In many court cases, civil as well as criminal, the two sides present witnesses whose factual claims clash. Both cannot be telling the truth. A woman swears she was raped; the defendant swears it was consensual. An arrested suspect charges that police used excessive force; the officers deny it. A jailhouse “informant” swears that the defendant confessed a murder to him; the defendant swears it isn’t so. In these situations, polygraph results may prove invaluable, if used correctly.

Polygraph results are no longer barred from the courtroom. The Supreme Court has left it to the courts of each jurisdiction to determine how and when to allow them, or to exclude them altogether [United States v. Scheffer, 523 U.S. 303 (1998)]. I believe that the courts’ safest, simplest, and most productive use of the polygraph is to exclude testimony about a fact from any witness who has tested “deceptive” about that fact whenever a witness from the opposing side has tested “nondeceptive” about that fact. No jury would ever hear, or hear of, the polygraph results themselves. In civil cases, a witness’s refusal to submit to a polygraph examination on any factual claim would be treated as a “deceptive” result in regard to that claim; it would be treated similarly in criminal cases except where the refusing witness is a defendant.

A defendant would have the right to demand that a jailhouse informant be polygraphed concerning an alleged confession and to be polygraphed himself. If the informant refused to take the test, his testimony would be inadmissible. If the informant tested positive for deception, and the defendant negative, then the informant’s testimony would be inadmissible. If the defendant did not demand that the informant be tested, or the test produced any other combination of results, the informant’s testimony would be admissible, and, except that neither side would be allowed to make any reference to the polygraph, he would be subject to cross-examination as any other witness.
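The screening rule described above can be sketched as a small decision function. This is only an illustrative sketch, not a statement of any court rule: the names `Result`, `effective_result`, and `exclude_testimony` are hypothetical, and it assumes that an inconclusive result never supports exclusion and that a criminal defendant's refusal is simply treated as no result.

```python
from enum import Enum


class Result(Enum):
    NONDECEPTIVE = "nondeceptive"
    DECEPTIVE = "deceptive"
    INCONCLUSIVE = "inconclusive"
    REFUSED = "refused"


def effective_result(result, is_criminal_defendant=False):
    """Map a refusal to 'deceptive', except for a criminal defendant."""
    if result is Result.REFUSED:
        # Assumption: a defendant's refusal counts as no result at all.
        if is_criminal_defendant:
            return Result.INCONCLUSIVE
        return Result.DECEPTIVE
    return result


def exclude_testimony(witness, opponent, witness_is_defendant=False):
    """Return True if the witness's testimony on the disputed fact is excluded.

    Exclusion requires paired results: the witness counts as 'deceptive'
    and the opposing witness actually tested 'nondeceptive' on that fact.
    """
    witness_result = effective_result(witness, witness_is_defendant)
    return (witness_result is Result.DECEPTIVE
            and opponent is Result.NONDECEPTIVE)


# Informant tests deceptive, defendant tests nondeceptive: excluded.
print(exclude_testimony(Result.DECEPTIVE, Result.NONDECEPTIVE))
# Both test deceptive: nothing is excluded, both may testify.
print(exclude_testimony(Result.DECEPTIVE, Result.DECEPTIVE))
```

Any other combination of results leaves the testimony admissible and subject to ordinary cross-examination, with no reference to the polygraph before the jury.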

Polygraph results are unreliable to some degree. How a polygraph chart is interpreted can depend on the thresholds of physiological variance above which a response will be called “deceptive” and below which it will be called “nondeceptive.” (Responses falling between the thresholds are called “inconclusive.”) It will be the court’s responsibility to determine that the thresholds and the conditions under which tests have been given are in accordance with the standards enforced by federal agencies–or at least the recommendations of the American Polygraph Association.  Results should be accompanied by an unedited beginning-to-end videotape of the examination. Determination of examiners’ competence should not be based on their academic credentials, their years of experience, or the number of tests they have conducted, but on their proven accuracy in simulated-crime studies where the protocol and number of subjects are comparable to those in the published peer-reviewed literature.

Reliability, Validity and the Power of Pairing

Eight recent studies – four field and four laboratory – that sought to quantify polygraph tests’ accuracy found both false positives and false negatives to be less than 10% [Honts, US v Scheffer (Amicus)]. “Accuracy” can be misleading, however, especially where the proportion of subjects who are in fact telling the truth is high. Suppose that from a sample of 100 subjects, of whom only 1 is “deceptive,” a test found 2 of the subjects to be “deceptive.” It could claim 99% accuracy (if one of them was the deceptive subject). Impressive. But the likelihood that a failing subject had actually been deceptive would be only 50%. The field studies cited in Honts (average false positive rate = 9.5%) imply that the 50% figure probably does approximate the ratio of false positives to true positives in the real world of criminal investigations. That figure is far too high to allow the results of individual tests, standing alone, to be admitted into evidence.
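The 100-subject example can be checked with a few lines of arithmetic. The sketch below is illustrative only; the function name `screening_summary` and its parameters are hypothetical, and it assumes, as the example does, that the one deceptive subject is among the two who fail.

```python
def screening_summary(subjects, true_positives, false_positives,
                      false_negatives=0):
    """Return (overall accuracy, share of failing subjects who were deceptive)."""
    errors = false_positives + false_negatives
    accuracy = (subjects - errors) / subjects
    flagged = true_positives + false_positives
    share_truly_deceptive = true_positives / flagged
    return accuracy, share_truly_deceptive


# 100 subjects, 1 deceptive; the test flags 2, one of whom is the liar.
accuracy, share = screening_summary(subjects=100,
                                    true_positives=1,
                                    false_positives=1)
print(f"accuracy: {accuracy:.0%}")
print(f"failing subjects who were actually deceptive: {share:.0%}")
```

The two numbers answer different questions: the first describes the test across all subjects, the second describes what a single failing result means.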

Drawing correct inferences from stand-alone results requires knowledge about the relevant samples that is rarely possible outside the laboratory and an understanding of statistical inference that is beyond the experience of jurors. Admitting single results can easily clear the guilty and imperil the innocent. When two people dispute a fact within the personal knowledge of both, however, usually one is telling the truth and the other is lying. Pairing results therefore assures the balanced samples necessary to support sensible inferences. Even allowing for a modest percentage of witnesses who are honestly mistaken, and of cases where both are lying, the known accuracy of the test can be safely applied. When results are paired and the second result confirms the first, then according to probability theory the probability of an erroneous conclusion is the product of the two individual probabilities. Since no testimony will be excluded when both subjects pass or when both fail, the procedure operates under an effective base rate of 50%. Conservatively supposing the tests’ individual probability of error to be as high as 15%, the probability that confirmed “deceptive” testimony would be true would be only 4.5% (0.15 x 0.15 = 0.0225; doubled to correct for the chance rate of 0.5, 0.0225 x 2 = 0.045, or 4.5%), and the probability that it would be false would therefore be 95.5% (100 – 4.5 = 95.5%).
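The arithmetic can be sketched as follows. The first function reproduces the figure used in this article. The second is an alternative calculation added here only for comparison: it conditions directly on one witness testing “deceptive” and the other “nondeceptive.” Both assume, hypothetically, that the two tests err independently at the same rate and that exactly one witness in each pair is lying; the function names are illustrative.

```python
def paired_error_article(error_rate):
    """Article's figure: product of the two error rates, doubled for the 50% base rate."""
    return 2 * error_rate ** 2


def paired_error_conditional(error_rate):
    """P(wrong witness excluded | one 'deceptive' and one 'nondeceptive' result).

    Assumes independent errors and exactly one deceptive witness per pair.
    """
    both_wrong = error_rate ** 2          # truthful witness fails, liar passes
    both_right = (1 - error_rate) ** 2    # liar fails, truthful witness passes
    return both_wrong / (both_wrong + both_right)


for e in (0.10, 0.15, 0.20):
    print(f"error rate {e:.0%}: "
          f"article figure {paired_error_article(e):.1%}, "
          f"conditional figure {paired_error_conditional(e):.1%}")
```

Under these assumptions the conditional figure comes out somewhat below the doubled product, which suggests the 4.5% figure is best read as a conservative ceiling.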

Numerical Examples:

Suppose paired testing were implemented to exclude testimony from any subject found to be “deceptive” regarding facts where a contradicting witness has tested “non-deceptive” about those facts. Here are examples of the numerical consequences that would ensue assuming 80%, 85%, and 90% accuracy respectively, per 100 pairs where a conclusive result is obtained for both subjects.

+ means exclusion increased the likelihood of a just result

- means exclusion increased the likelihood of an unjust result

= means no exclusion (same as under present system)

Supposing 80% accuracy:

80 Non-deceptive subjects correctly identified

    64 (of deceptives paired to the 80) correctly identified, properly excluded (+)

    16 (of deceptives paired to the 80) incorrectly identified, neither subject excluded (=)

20 Non-deceptive subjects, incorrectly identified

     4 (of deceptives paired to the 20) incorrectly identified, wrong subject excluded (-)

    16 (of deceptives paired to the 20) correctly identified, neither subject excluded (=)

TOTALS (64+, 4-, 32=) or (+) in more than 94% of cases where exclusion applies.

    64 cases where deceptive subject is excluded, non-deceptive subject allowed

    4 cases where deceptive subject allowed, non-deceptive subject excluded

    32 cases where both parties are allowed to testify as they would be now

Supposing 85% accuracy:

85 Non-deceptive subjects correctly identified

    72 (of deceptives paired to the 85) correctly identified, properly excluded (+)

    13 (of deceptives paired to the 85) incorrectly identified, neither subject excluded (=)

15 Non-deceptive subjects, incorrectly identified

    2 (of deceptives paired to the 15) incorrectly identified, wrong subject excluded (-)

    13 (of deceptives paired to the 15) correctly identified, neither subject excluded (=)

TOTALS (72+, 2-, 26=) or (+) in more than 97% of cases where exclusion applies.

    72 cases where deceptive subject is excluded, non-deceptive subject allowed

    2 cases where deceptive subject allowed, non-deceptive subject excluded

    26 cases where both parties are allowed to testify as they would be now

Supposing 90% accuracy:

90 Non-deceptive subjects correctly identified

    81 (of deceptives paired to the 90) correctly identified, properly excluded (+)

     9 (of deceptives paired to the 90) incorrectly identified, neither subject excluded (=)

10 Non-deceptive subjects, incorrectly identified

    1 (of deceptives paired to the 10) incorrectly identified, wrong subject excluded (-)

    9 (of deceptives paired to the 10) correctly identified, neither subject excluded (=)

TOTALS (81+, 1-, 18=) or (+) in more than 98% of cases where exclusion applies.

    81 cases where deceptive subject is excluded, non-deceptive subject allowed

     1 case where deceptive subject allowed, non-deceptive subject excluded

    18 cases where both parties are allowed to testify as they would be now
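The tallies above can be regenerated for any assumed accuracy. The sketch below is illustrative: the function name `paired_outcomes` is hypothetical, and it assumes each pair contains exactly one truthful and one deceptive witness, that both tests are conclusive, and that the two tests err independently at the same rate. It works with fractional expected counts, so its figures may differ slightly from whole-number tallies.

```python
def paired_outcomes(accuracy, pairs=100):
    """Expected outcomes per `pairs` pairs under the stated assumptions.

    Returns (correct exclusions, wrong exclusions, no exclusion,
    share of exclusions that fall on the deceptive witness).
    """
    error = 1 - accuracy
    correct_exclusions = pairs * accuracy * accuracy   # both tests right (+)
    wrong_exclusions = pairs * error * error           # both tests wrong (-)
    no_exclusion = pairs - correct_exclusions - wrong_exclusions  # (=)
    share_correct = correct_exclusions / (correct_exclusions + wrong_exclusions)
    return correct_exclusions, wrong_exclusions, no_exclusion, share_correct


for accuracy in (0.80, 0.85, 0.90):
    plus, minus, equal, share = paired_outcomes(accuracy)
    print(f"{accuracy:.0%} accuracy: {plus:.2f}+, {minus:.2f}-, {equal:.2f}=, "
          f"(+) in {share:.1%} of cases where exclusion applies")
```

Passing other values for `accuracy` shows how quickly the share of mistaken exclusions shrinks as individual test accuracy rises.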


Courts continue to agonize over whether to accept polygraph results as “scientific” and admit them into evidence. But information doesn’t have to go before a jury (or other finder of fact) in order to be useful. Using the polygraph to exclude testimony that has a ~95% likelihood of being false accords both with common sense and with the Supreme Court’s view that “Exclusion . . ., is usually premised on the view that admission would lead to the frequent presentation of perjured testimony to the jury” and that “untrustworthy evidence should not be presented to the triers of fact” [Chambers v Mississippi, 410 U.S. 284 (1973)]. Utilizing paired test results to exclude untrustworthy testimony would not require modifying or overturning longstanding precedent against admissibility. Excluding such testimony would not usurp the role of the jury as ultimate fact-finder any more than such time-honored exclusions as the rule against hearsay. Doing so could well prove as valuable as the hearsay rule in steering juries away from mistaken results.

There are at least two quite distinct purposes that polygraph evidence can serve in court. One is to present negative (“nondeceptive”) test results in order to bolster the credibility of witnesses. The other is to present positive (“deceptive”) results in order to preclude witnesses from testifying or impeach their credibility. Both arise from the same technology, but the scientific and statistical bases for trusting them, and the practical and legal considerations surrounding them, differ greatly.

Through the years, the primary focus of the polygraph debate has been the admissibility of individual “nondeceptive” polygraph results to bolster testimony, especially that of criminal defendants and prisoners whose test results point to their innocence. The points raised by both sides focus on the trustworthiness of polygraph results treated on a stand-alone basis.

The proponents of wider use have argued for admitting test results as trial evidence. Admissibility is a difficult argument to win, and its proponents have rarely been able to win it. Results could be admitted only after an elaborate, tedious, and time-consuming courtroom minuet. They would have to be supported by the examiner, and perhaps other experts, as well as be subjected to challenge by cross-examination and the presentation of contrary evidence, and to a web of instruction, some of it highly technical, by the court. Exclusion based on paired tests circumvents those difficulties. Because of the benefits it offers to police, prosecutors, courts, defense and civil bar, and honest parties, it is the approach that provides proponents of widening the courts’ use of the polygraph their best prospect of success.

Acceptance of paired results can help free many wrongly convicted prisoners, whereas stand-alone results face an insurmountable public acceptance problem. Suppose that 90% of prisoners are guilty of the crimes for which they are incarcerated and that false negatives and false positives each average about 10%. Then for every 100 prisoners tested, there would be 81 true positives, 9 true negatives, 9 false negatives, and 1 false positive. Half the people that stand-alone tests would release would in fact be guilty. Acting only where prisoners test negative and their accusers test positive would reduce that to 10%. Freeing 9 innocent persons, at the price of freeing 1 guilty one, is an objective capable of winning public acceptance.
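The prisoner example can be worked through numerically. This is a rough sketch under labeled assumptions rather than an empirical model: the function name `guilty_share_of_releases` is hypothetical, the prisoner's and accuser's tests are assumed to err independently at the same rate, and the accuser is assumed to be truthful when the prisoner is guilty and deceptive when the prisoner is innocent.

```python
def guilty_share_of_releases(prisoners=100, guilty_share=0.90, error_rate=0.10):
    """Return the guilty share of releases under (stand-alone, paired) testing."""
    guilty = prisoners * guilty_share
    innocent = prisoners - guilty

    # Stand-alone: release every prisoner who tests "nondeceptive".
    false_negatives = guilty * error_rate
    true_negatives = innocent * (1 - error_rate)
    standalone = false_negatives / (false_negatives + true_negatives)

    # Paired: release only if the prisoner tests "nondeceptive"
    # AND the accuser tests "deceptive".
    guilty_released = guilty * error_rate * error_rate
    innocent_released = innocent * (1 - error_rate) * (1 - error_rate)
    paired = guilty_released / (guilty_released + innocent_released)

    return standalone, paired


standalone, paired = guilty_share_of_releases()
print(f"guilty share of releases, stand-alone tests: {standalone:.0%}")
print(f"guilty share of releases, paired tests: {paired:.0%}")
```

Changing `guilty_share` or `error_rate` shows how sensitive the stand-alone figure is to the base rate, which is the difficulty pairing is meant to address.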

Stand-Alone Negative Results

The case–scientific, legal, and social–against allowing negative (“nondeceptive”) results on a stand-alone basis is strong. The statistical underpinning of negative results is problematical because of the difficulty of quantifying the false negatives in the absence of “ground truth”–an external yardstick by which to measure whether subjects are deceptive. In field work, ground truth is notoriously difficult to determine.

There is no straightforward way to ascertain false negative rates–the percentage of subjects testing “nondeceptive” who were in fact deceptive–in real-world samples. To be a known false negative, a subject must first beat the test and later be found out. That rarely happens. When people beat the test, it usually remains their secret. It is not known how many cases go unsolved because a false negative was excluded from further investigation and how many because the culprit was not among those tested. In laboratory tests, false negative rates are usually about 10%, but extrapolating them to the real world is difficult. The physiological changes the equipment measures are affected by the subjects’ fear. The higher the stakes, the greater the fear of being caught in a lie, and the greater the measured response. However expert the examiner and well conducted the test, the high stakes of real-world tests cannot be duplicated in the laboratory.

Allowing stand-alone “nondeceptive” polygraph evidence is fraught with other difficulties. Once it was allowed, litigants would seek to introduce polygraph evidence to buttress many, even most, witnesses. Juries would come to expect them to do so. Since polygraph results cannot be introduced into evidence without the testimony of the examiner or other expert to interpret them, this would mean a de facto return to the archaic voucher system of the Middle Ages, when litigants were expected to produce “voucher witnesses” to vouch for the credibility of their witnesses.

The parade of voucher witnesses would tie up dockets and, by lengthening trials, would add to the cost of litigation for all parties. Moreover, polygraph examinations are expensive, and examiners are well paid for their time in court. In civil cases, the “voucher effect” would tend to raise the price of justice, aggravating the already serious disadvantage faced by parties with limited budgets. It would be especially pernicious in criminal trials, where strategy considerations often preclude defendants’ taking the stand. Allowing “nondeceptive” results into evidence in support of prosecution witnesses would practically compel criminal defendants to be polygraphed and testify, giving prosecutors an unacceptable subterfuge around the right against self-incrimination.

Many possible countermeasures that would enable deceptive subjects to fool polygraph examiners have been suggested. Their utility remains unproven, but to the extent they may be or become effective, their use would affect only negative results. Police officers testify frequently, and they are trained to do it effectively. Many are professional witnesses. If effective countermeasures could be mastered, unscrupulous police and other “professional witnesses” would be among the first to learn them. The danger is out of proportion to the number of such witnesses because of the number of times each would testify through the course of a career.

If negative results were to be admitted as evidence on behalf of criminal defendants, they would practically guarantee acquittal. To the extent that individuals’ false negative results are repeatable, the selectivity bias could create a threat to public safety. Those individuals, knowing themselves to be practically impervious to prosecution, would be enabled to break the law with impunity. The special protections afforded criminal defendants introduce another selectivity bias into the process. If defendants had the option of introducing polygraph evidence, their counsel could be counted on to bury unfavorable results. Defendants would take tests privately–a no-risk option. Most defendants opting for private tests would presumably fail. The court would never know. Only defendants with favorable test results would introduce them. If defendants in, say, 10% of trials presented “nondeceptive” test results, would that mean that 10% of defendants are innocent? That the polygraph is subject to 10% false negatives? This selectivity bias applies primarily to criminal defendants, not to most other witnesses. But no court that allowed stand-alone negative results to bolster the testimony of some witnesses could constitutionally bar criminal defendants as a class from using them. To the extent false negatives occur, the selectivity bias would lead to wrongful acquittals.

James K. Murphy, the former polygraph unit chief at the FBI laboratory in Washington, D.C., has testified (http://truth.boisestate.edu/polygraph/MURPHY1.HTML) that the FBI annually administers polygraph examinations to about 5,000 applicants for sensitive jobs. Each applicant takes two tests. Applicants almost always pass the first test, which focuses on counterintelligence issues: Applicants are asked whether they’ve ever been in contact with anybody from a foreign intelligence service and whether they were directed to seek FBI employment. The failure rate is about 0.5%.

The applicants’ charts from the first test are used for comparison with the charts from their second test, which deals with use of illegal drugs, abuse of legal drugs, and falsification of the application for employment. In accord with ordinary knowledge and common sense, the failure rate on the second test is much higher: More applicants have had undesirable experience with drugs than have an involvement with espionage. More than 70% of applicants failing the second test have validated the examination results through confession or through admission at the time of the test.

The FBI believes that these results support validation, through the correspondence of the results with the known statistical base rates for those two subject areas, and achieve reliability as the test relates to them. It relies heavily on these results, notwithstanding that the test results provide only a weak inference regarding false negatives.

Despite the scientific and statistical difficulties with “nondeceptive” results, federal, state, and local police and prosecutors place great confidence in them and make important decisions based upon them. The methodological argument against their use on a stand-alone basis is not that they are valueless, but that their value is so uncertain. The rub is that the testimony of interested parties, informants, and plea-bargained accomplices is also uncertain.

Stand-Alone Positive Results

The methodological, scientific, and statistical grounds for confidence in estimates of the rate of false positives are stronger. The FBI, OSI, and CIA have administered polygraph examinations to tens of thousands of past, present, and prospective government employees and armed forces personnel. The 0.5% failure rate cited by Mr. Murphy of the FBI indicates that when tests are given under proper conditions by competent examiners, and interpreted using a high threshold of physiological variance, false positives, taken as a percentage of tests administered, can be extremely low. This low rate of positive results occurs in a real-world setting where the stakes for the examinees are not only their jobs, but also the unpleasantness of becoming the subject of an espionage investigation. Since there obviously cannot be more false positives than there are positives, the percentage of positive results establishes a rigorous limit for those thresholds. But for purposes of evaluating the significance of an individual positive result, it is the ratio of false positives to all positives that matters. If there is one spy among a population of subjects, and two (including the culprit) fail the test, then that ratio is 50%, irrespective of the number of subjects in the sample.

In testing conducted by police for the purpose of eliminating possible suspects, where the subjects are people who have an appreciable likelihood of being involved in a crime, positives occur more often. Many subjects who get positive results confess and provide independent evidence that supports their confession or are convicted by juries with no knowledge of the polygraph results, thereby reducing the number of positives that might be false positives and helping scientists further refine their estimates of the trustworthiness of positive test results.

Nevertheless, I believe that no testimony from a witness who has tested “deceptive” should be excluded unless a contradicting witness has tested “nondeceptive.” The second result increases confidence, by a factor of 5 or more, that the excluded testimony is really untrustworthy. Both witnesses may be untruthful, and no advantage should accrue to the one who has refused to be tested. In criminal cases, defendants would be freed from the Hobson’s choice of having to testify before the jury in order to contradict testimony they know to be false.

Police and Prosecution Issues

Police and prosecutors have consistently opposed allowing polygraph results into court. The polygraph is an extremely useful investigative tool that enables them to screen possible suspects and focus their resources effectively. Police cannot compel suspects to take polygraph tests, due to the rule against self-incrimination. If failing results could be introduced into court, even many innocent people would be reluctant to risk consent, and the police would lose an invaluable time-saver. If, in order to keep the tool, they were to promise not to use results in court, their relation to the technology would remain exactly as it is now. Only defendants would stand to benefit from admissibility. Even if acknowledging a refusal to take the police polygraph became a condition of defendants’ introducing “nondeceptive” results, such refusals would be credibly explainable, in light of the favorable test result, as due to an innocent person’s distrust of the police.

By limiting prosecutors’ use of a positive result to exclusion, the subjects’ risk is reduced and the legitimate police concern addressed. Their stated concerns no doubt mask the unwillingness of some law enforcement people to forgo the advantage they gain from police perjury and other dubious testimony. To the extent that the polygraph removes that advantage, it removes a blight. The Rampart scandal in Los Angeles, and similar scandals elsewhere, come into an atmosphere of increasing public awareness of wrongful convictions. Together, they threaten to foster a deep and long-lasting suspicion against testimony by police and informants, especially jailhouse informants and accomplices testifying under plea agreement.

It is no mere public relations problem. It is a serious cloud, and it will require concrete measures to dispel it. Apart from problems arising from a generalized negative attitude toward police, there is the specific danger of lost convictions due to excessive juror skepticism. I think the use of polygraph results suggested here is a reform that will help to restore public confidence in testimony by police officers and cooperating witnesses. Departments that opt for the idea will find allies among the media and among groups that are opposed to prosecutorial misconduct and wrongful convictions, whatever their attitude toward police image problems.


Police departments hold classes and workshops to teach officers how to testify effectively. Consequently, the persuasive impact of officers’ testimony, whether honest or not, is greater than that of most other witnesses. And it is often false. Indeed, perjured testimony by police is so common that there is a word for it, “testilying,” that is used and understood in every station house and D.A. office. In civil litigation, false statements, oral and written, enable parties asserting an invalid claim (or defense) to exact a penalty from opponents with worthy cases. Especially where the party in the wrong is financially stronger, this frequently compels the honest party to agree to an unconscionable settlement, or even abandon the case altogether.

In both civil and criminal litigation, fact-finders evaluate witness credibility using a host of spurious factors such as age, ethnicity, physical attractiveness, speaking voice, clothes, and occupational status. When litigation attorneys assess whether a person will make a “good” witness, it is those factors rather than the truthfulness of the person’s testimony they are usually talking about.

Imagine that a couple of King Arthur’s knights arrived here in a time machine, and happened upon a refrigerator. One of them suggests using the unfamiliar object as a boat, while the other says no, let’s fill it with dirt and grow vegetables. To me, the debate over the use of polygraph results in court feels much the same. One side wants to bar them altogether, while the other wants the results of single stand-alone tests to be admitted before juries. They are equally incorrect. It is a third course–using paired results to screen testimony–that I believe is the right one.

Paired polygraph results can streamline court proceedings, lighten prosecution and legal assistance caseloads, reduce litigation costs, and move the docket along. Countless laborious cross-examinations will never take place. Many witnesses will not appear at all. Many fraudulent and frivolous cases will never be brought, and many others will never reach trial.

Ensuring that the utilization is proper is nontrivial but straightforward. Judges need to know about proper polygraph examination procedure and interpretation of results. They need to be able to reject unqualified and “bought” examiners, and conclusions based on unreasonable thresholds. Judges are capable of learning this and applying the knowledge on an ongoing basis. Unlike juries, whose secret deliberations give few clues as to the weight given to admitted evidence, judges’ decisions are open, would apply to specific testimony, and would be based on examination charts and videotapes that will become part of the record, making their decisions subject to review.

The approach offers the prospect of reducing the number of wrongful criminal convictions, and of wrongful outcomes of civil cases, in a way that skirts a potential minefield of technical uncertainties as well as legal and social complications. The courts of every jurisdiction would do well to consider it.


