The Jolly Contrarian
The Jolly Contrarian on Crime and Punishment
Lucy Letby: confirmation bias

How to get from “there is no evidence of murder whatsoever” to “the only logical explanation is that the nurse is a serial murderer, and she killed all of them”.
[Image: a Moorlands zebra]
Well, you never know.

Take the following hypothetical scene, which describes Lucy Letby’s experience at the Countess of Chester Hospital:

Internal investigation

A hospital experiences a cluster of deaths and collapses materially in excess of its usual rates for such events.

Staff notice a particular nurse was present during an abnormal proportion of the collapses.

Concerned at the possibility of foul play, hospital management investigates the collapses, focusing on those where the nurse was present. Those where she was absent are removed from the cluster.

The investigation is widened to include other unexplained collapses at which the nurse was present. Some are added to the suspicious cluster.

External investigation

Though hospital management has not thoroughly considered alternative possibilities — that the cluster was simply “statistical noise” or that it had a different, non-malicious explanation — it presents its suspicions to the police.

Police investigate the narrow question “whether there are sufficient grounds to prosecute the nurse” and not, specifically, “what else could have caused the original cluster?” Police do not interrogate the hospital’s internal investigation methodology.

Instead, Police seek firstly evidence consistent with foul play — typically technical medical analysis — and secondly, given there has been foul play, evidence consistent with the nurse being behind it — this will be weak circumstantial evidence of “unusual behaviour” consistent with guilt. These two steps are markedly distinct. Notably, Police disregard behaviours “consistent with” the nurse’s innocence.

Media involvement

The media pick up the story. Through repetition and hyperbole, the nurse’s identified “guilty” behaviours are sensationalised to be ever more incriminating.

“Experts” speculate on the nurse’s motivation: “God complex”, “craving attention”, “Munchausen’s by proxy”, “secretive behaviour”, for example. Though these descriptions frequently contradict each other and none was formally diagnosed before (or after) the investigation, they are taken in the round as further corroboration of the nurse’s guilt.

This may sound like a carefully tailored account of Ms. Letby’s apprehension and conviction, but it is not: it describes the pattern followed in the prosecutions of Beverley Allitt, Benjamin Geen, Colin Norris, Susan Nelles, Jane Bolding, Victorino Chua, Lucia de Berk, Daniela Poggiali and dozens of other health professionals prosecuted in the western world for murder by poisoning.


First find a suspect, then find a crime

Now, this is not to imply that none of these individuals committed murder nor, necessarily, that all were wrongly prosecuted — though some of them were1 — but rather to note that the “healthcare serial murder” scenario presents distinctive challenges uncommon in normal crime scene investigations.

Normally, criminal investigations start with little doubt there’s been a crime, but a lot about who committed it. A dead body with an ice-pick in its ear and a bullet-holed playing card in its breast pocket is all but certain to have been murdered. The investigator’s main challenge is working out by whom: once identified, linking the villain to the crime tends to be straightforward.

With healthcare serial murder cases, the “route to suspicion” is curiously backward. It starts with a cluster of events that could be but, all else being equal, are most likely not, criminal in nature. Assuming, nevertheless, they are criminal, investigators then hypothesise about the most likely suspect. Inevitably, their suspicion falls upon the person present at the most identified events.

There may then follow an exercise in “refining” the cluster to better fit the “foul play” theory. Collapses at which the nurse was not present may be dropped — recharacterised as non-suspicious2 — while previously unsuspicious collapses during the nurse’s shifts may be upgraded to “suspicious” in light of her presence.

The refined cluster will, by her constant presence alone, much more strongly implicate the nurse than the original one did. This is, of course, the Texas sharpshooter at work before our very eyes.
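
To see how much work the “refining” step does on its own, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the ward size, the number of events, the staff per shift and the function names `simulate_cluster`, `most_present` and `refine` are hypothetical, and the events are scattered across shifts purely at random, so nobody in this toy ward harms anyone.

```python
import random

def simulate_cluster(n_nurses=30, n_events=25, staff_per_shift=6, seed=0):
    """Return a list of events, each the set of nurses on duty at the time.

    Staffing is drawn at random for every event: there is no murderer here.
    All parameter values are illustrative assumptions.
    """
    rng = random.Random(seed)
    nurses = range(n_nurses)
    return [set(rng.sample(nurses, staff_per_shift)) for _ in range(n_events)]

def most_present(events):
    """Pick the nurse present at the most events; return (nurse, count)."""
    counts = {}
    for on_duty in events:
        for nurse in on_duty:
            counts[nurse] = counts.get(nurse, 0) + 1
    # Ties are broken in favour of the lowest-numbered nurse.
    nurse = max(counts, key=lambda n: (counts[n], -n))
    return nurse, counts[nurse]

def refine(events, suspect):
    """The 'refining' step: drop every event at which the suspect was absent."""
    return [on_duty for on_duty in events if suspect in on_duty]

events = simulate_cluster()
suspect, hits = most_present(events)
refined = refine(events, suspect)

print(f"Original cluster: suspect present at {hits} of {len(events)} events")
print(f"Refined cluster:  suspect present at {len(refined)} of {len(refined)} events")
```

Whatever the seed, the refined cluster shows the suspect at every single event, because her absence was the criterion for removing the others. The perfect correlation is produced by the filter, not by the data.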

With that “stronger” framework for suspicion, investigators will set about finding evidence to support their theory of foul play. This process may start before police are involved, but will carry on with great gusto once they are on the case. It will often gloss over two important steps:

Firstly, if investigators put any effort at all into ruling out innocent explanations, it will be early in the process, and will be peremptory.

Secondly, investigators will make little effort to link any wrongdoing directly to the suspect. The main “identification” evidence will be the nurse’s “unique opportunity”. Since, the reasoning goes, she is the only person who could have done it, little needs to be done to show that she actually did do it.

Ms. Letby’s is perhaps the most pristine example. The consultants started with the plainest concession to police that they had no grounds for suspecting foul play in any of the collapses, and no grounds for concern about Ms. Letby:

Nurse

As part of the review staffing was looked at, there was a notable high statistical relationship between a member of the nursing staff and babies deteriorating in the unit. There is no evidence, other than coincidence.

[...]

The nurse has been working at COCH for approximately 8 years full time, she is a Cheshire resident, and a single parent. The staff member has since placed a grievance against COCH. There has been no formal investigation of misconduct and no motive identified. There are no mental health issues known and nothing has been highlighted by occupational health. There are no management issues.3

Yet they still managed to move a jury from “no evidence of any crime” to “legal certainty about most of them”.

Just how that happened is the topic of this post.

Epistemology and the law

A quick reminder about the epistemology of the law. Where there is a dispute about what happened, any court case is beset by “epistemic” uncertainty. History is fixed. Deeds are done. What those deeds were and whether they were dirty may be as immutable as the stars but, still, certain knowledge of them is beyond our mortal grasp.

A court does not and cannot know what really happened. It, and the whole of the rest of the outside world, is in a state of permanent, incurable un-knowledge about it. The best it can do is assess likelihoods.

A criminal trial is a highly artificial contrivance. One way of looking at it is that it is designed to solve this exact problem — incurable epistemic uncertainty — by generating reliable probability estimates. Criminal trials cannot and do not purport to generate certainties. The process is fundamentally probabilistic.

Those outraged that anyone should question the delivered verdict of a British jury should remember this: even a flawlessly-conducted trial between expert counsel examining perfectly-motivated witnesses before an enlightened jury can deliver injustice. “Beyond reasonable doubt” is an epistemic step short of certainty. The system’s reliability can itself be measured in probabilities.

According to ONS data, the annual number of homicides in the UK has been a relatively stable 550 since 1970. According to University of Exeter data, in that time there have been 150 false convictions. This gives a false conviction rate of just under 1%. If we take the “95%” reading of beyond reasonable doubt at face value, this means the system is meeting, or exceeding, its SLA: a 95% certainty threshold tolerates a false conviction rate of up to 5%.
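
The arithmetic behind that estimate can be sketched as follows. The 550 and 150 figures are the ones quoted above; the 55-year window and the idea of roughly one conviction per recorded homicide are my simplifying assumptions, not figures from the ONS or Exeter data, and the function name is made up for the illustration.

```python
from fractions import Fraction

# Figures quoted in the text.
HOMICIDES_PER_YEAR = 550
FALSE_CONVICTIONS = 150

# Assumptions, not from the text: a window of about 55 years since 1970,
# and roughly one conviction per recorded homicide.
YEARS = 55

def false_conviction_rate(false_convictions: int, per_year: int, years: int) -> Fraction:
    """Share of convictions that were false, under the assumptions above."""
    return Fraction(false_convictions, per_year * years)

rate = false_conviction_rate(FALSE_CONVICTIONS, HOMICIDES_PER_YEAR, YEARS)
print(f"Estimated false conviction rate: {float(rate):.2%}")
print(f"Within a 5% tolerance: {rate <= Fraction(5, 100)}")
```

On these assumptions the rate comes out at roughly half a percent, comfortably inside the 1% figure used in the rest of this post, so rounding up to 1% is, if anything, unkind to the courts.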

A criminal trial is designed to isolate an artificially limited set of legally relevant facts and assign them probabilities as a way of assessing what is most likely to have happened and whether that satisfied the artificial legal construct called a “crime”.

That court assessments are expressed in words — “beyond reasonable doubt”, “more likely than not” — should not obscure the fact that they are probabilistic assessments. Explicitly, criminal courts convict on probabilities lower than 1 (see note 4). A small, but statistically significant, error rate is factored in. There is no magic to it: the criminal process shares all our human cognitive frailties.

Though barristers tend to, we should not put it on a pedestal.

“I wouldn’t start from here”

There is an old joke, these days not politically correct, even though it articulates unassailable wisdom:

A lost tourist asks a local policeman for directions to Dublin’s cathedral. The policeman scratches his chin and says, “Well, if I were you, I wouldn’t start from here.”

The soundness of your conclusion depends on the plausibility of your premises. All else being equal, start with the most likely explanation, and stick with it until something rules it out. If you are curious about a cluster of collapses, don’t start with murder.

Patients do unexpectedly collapse at every hospital every year. Nurses don’t, very often, murder patients. Starting with a cluster of unexpected collapses and proceeding directly to a murder investigation is a bit like hearing hoofbeats and looking for a zebra.

If that is your frame — and in healthcare serial murder cases, invariably it is — you are at grave risk of bad confirmation bias.

Good and bad confirmation bias

Confirmation bias
kɒnfəˈmeɪʃᵊn ˈbaɪəs (n.)

The tendency to search for, interpret, and recall information that confirms a pre-existing belief, preferring information that supports that belief, ignoring information that contradicts it, interpreting ambiguous information to be consistent with the belief, and more readily recalling information that confirms the belief than that which challenges it.

Not all confirmation bias is bad. In fact, it is one of Homo sapiens’ principal reasoning strategies. Being instinctive pattern-matchers, we owe our sustained success in the evolutionary lottery to our ability quickly and instinctively to observe, orient, decide and act. We survey and assess and react to a situation in real-time based on our experience, usually before our deliberative function has made it out of bed. In common scenarios it works well. When we tread on the pavement, we trust it will bear our weight. When we flip the switch, we are confident the kettle will boil.

When we hear hoofbeats, we assume horses.

We thereby practise a kind of unwitting Bayesian heuristic. I have walked along the footpath outside my house thousands of times. It has always taken my not inconsiderable weight. While there is a non-zero chance that, this morning, a sinkhole could open up and swallow me, my experience tells me it will not. This is a form of confirmation bias. On regular days, it is highly unlikely to let me down.

Our cognitive machine is well-oiled, fast and, mostly, reliable. Occasionally, it lets us down: usually in unusual scenarios, edge cases, and where our emotions are engaged. We are curiously vulnerable to deception by malign interlopers at the periphery. But still, when it comes to healthcare professionals, there are few “malign interlopers”. The credentialisation process in the healthcare industry is, in part, designed to weed them out.

In any case, quick, low-intensity decision making — heuristics in place of careful analysis in familiar cases — is a core survival tactic. It is skewed, however, to err in favour of self-preservation: the man who mistakes a rock for a bear survives. The one who mistakes a bear for a rock does not. When in doubt, to hell with the other guy.

Our doubts sit on a continuum between normalcy and oddness. Most of them are at the oddness end. There are many “samey” days, a few unusual ones, and a tiny number that are totally weird. If you aren’t sure what kind of day it is, assume it’s a samey one. We are most unlikely to misread “samey” days. It is less likely to matter if we do. It is weird days that catch us out.

The surprising risk of false positives

Just how likely are false positives? Well, here is a fun thought experiment: let’s compare the base rate probability of a false conviction for murder with the probability a given nurse actually is a murderer.

When you combine these probabilities it produces a rather surprising result.

For the false conviction base rate let’s take our 1% estimate of the miscarriage of justice rate and give the courts the benefit of an order of magnitude of doubt. We will divide that error rate by ten. Say only one in one thousand murder convictions are wrongful, giving a wrongful conviction rate of 0.1%.

Now let’s estimate, roughly, the prevalence of murderers in the nursing community. We can do this by reference to healthcare serial murder convictions over the last 50 years across America and Western Europe. In that time, there has been a surprisingly steady rate of about 1 conviction per year.5 If we say there were, at a conservative estimate, roughly 5 million medical professionals in North America and Europe at any time over that period,6 the going “rate” of murderers among medical professionals runs at about 1 in five million. That is pleasingly rare.

So:

1 in 1,000 miscarriages of justice
1 in 5,000,000 healthcare serial murderers

Base rate neglect: why it is overwhelmingly likely a conviction is false

Now: if we happened to pull that one-in-five-million “true positive” killer nurse out of a hat and put her on trial before a court that errs only 0.1% of the time, we could be extremely confident she would be convicted. Happy days.

But she is but a single needle in a haystack of 4,999,999 innocent nurses. What if, by accident, we pulled an innocent nurse out of our hat and put her on trial instead? What would her chance of acquittal be?

Also, 99.9%. So, also, happy days!

But hold on. There is something — well, someone — missing: we have our one “true positive” murderer — that’s great — and we also have the 99.9% “true negative” innocents — also, great — but 99.9% of 5 million is only 4,995,000.

Where are the other 4,999 innocent nurses?

Well, this is awkward: they are in prison.

The corollary of that healthy 99.9% accuracy rate is that, for every one actual murderer you convict, you would expect to convict 4,999 innocent nurses. That is a low strike rate. It means if we try a nurse at random, she is overwhelmingly likely to be innocent, and (a bit less) overwhelmingly likely to be acquitted.

But if we try a nurse at random and she is convicted, she is overwhelmingly likely to be innocent.
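
For anyone who wants to check the sums, this is Bayes’ theorem applied to the two base rates above. It is a sketch, not a model of any real court: it assumes the same 99.9% accuracy whether the defendant is guilty or innocent, as the thought experiment does, and the function name `posterior_guilt` is my own.

```python
from fractions import Fraction

def posterior_guilt(prior: Fraction, p_convict_if_guilty: Fraction,
                    p_convict_if_innocent: Fraction) -> Fraction:
    """Bayes' theorem: P(guilty | convicted)."""
    guilty_and_convicted = prior * p_convict_if_guilty
    innocent_and_convicted = (1 - prior) * p_convict_if_innocent
    return guilty_and_convicted / (guilty_and_convicted + innocent_and_convicted)

POPULATION = 5_000_000                 # nurses in the hat (figure from the text)
PRIOR = Fraction(1, POPULATION)        # one murderer among them
ACCURACY = Fraction(999, 1000)         # assumed: court is right 99.9% of the time
ERROR = 1 - ACCURACY                   # 0.1% wrongful conviction rate

p = posterior_guilt(PRIOR, ACCURACY, ERROR)
innocents_convicted = (POPULATION - 1) * ERROR   # expected wrongful convictions

print(f"P(guilty | convicted) = {float(p):.6f}")
print(f"Expected innocent nurses convicted: {float(innocents_convicted):.0f}")
```

The posterior works out at about one in five thousand: a convicted nurse drawn at random from this population is innocent roughly 99.98% of the time. The expected number of wrongly convicted nurses is about five thousand, which is the 4,999 above give or take some rounding.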

But we don’t just try nurses at random

“But hold on, JC: this all presumes we put nurses on trial for murder at random. We don’t: we only try nurses we are pretty sure, on good grounds, will be found guilty. Aren’t these bamboozling statistics you are laying on us just a cheap parlour trick?”

If we put nurses on trial only when we first had solid direct evidence of murder — eyewitnesses, CCTV footage, freely-given confessions and so on — this would be true. But we don’t. The nature of “healthcare serial murder” is that, usually, at first there is no direct evidence. The suspicion of foul play, at first, arises from a given nurse’s opportunity.

If there is a statistical parlour trick going on here, that is it: we are relying on “implausible coincidence” rather than “compelling evidence of criminal behaviour” to convict our nurse. We don’t have any better grounds for believing this nurse to be a murderer than any other nurse. Almost always, the nurse just happened to be there. That’s the incriminating evidence. There is nothing to differentiate her, qualitatively, from the general population of nurses who do not commit murder.

When it comes to healthcare serial murder investigations, it seems, we do just try nurses at random.
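
How “implausible” is the coincidence, really? A rough way to ask the question is to compare the chance that one named nurse is on duty for a striking share of events with the chance that some nurse, any nurse, is. The sketch below does that with a binomial tail. The ward of thirty staff, the twenty-five events, the one-in-five chance of being on shift and the threshold of ten are all illustrative assumptions, and treating nurses’ rotas as independent is a simplification.

```python
from math import comb

def binomial_tail(k: int, m: int, p: float) -> float:
    """P(X >= m) for X ~ Binomial(k, p)."""
    return sum(comb(k, i) * p**i * (1 - p)**(k - i) for i in range(m, k + 1))

def chance_someone_stands_out(n_nurses: int, k: int, m: int, p: float) -> float:
    """P(at least one of n_nurses is present at >= m of k events).

    Treats each nurse's presence as independent of the others,
    which real rotas are not: this is a simplification.
    """
    single = binomial_tail(k, m, p)
    return 1 - (1 - single) ** n_nurses

# Illustrative numbers only.
single = binomial_tail(k=25, m=10, p=0.2)
anyone = chance_someone_stands_out(n_nurses=30, k=25, m=10, p=0.2)

print(f"A named nurse at 10+ of 25 events by chance: {single:.3f}")
print(f"Some nurse, any nurse, among 30: {anyone:.3f}")
```

The first number is small, which is what makes the coincidence feel damning. The second is much larger, and it is the relevant one when the suspect is chosen after the fact precisely because she is the one who stands out.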

How confirmation bias works

A prosecutor must prove, beyond reasonable doubt, that the cluster not only was caused by malice, but by the malice of a single nurse, since the prospect of multiple independent murderers in one hospital is unthinkable even to those in the grip of prosecutor’s tunnel vision.

This forces an odd dilemma onto the prosecution. If we step through it we can get a sense of how a highly improbable possibility morphs into an apparent certainty, through a series of unconscious, invalid, cognitive manoeuvres.

  1. On day one, there is no direct, or even circumstantial, evidence of murder. The prosecution’s suspicion that there might be murder arises from:

    1. an unusual cluster of deaths and

    2. the presence, at many of those deaths — not yet all of them: we’ll get to that — of one nurse.

  2. The suspicion of murder emerges from the entirety of the cluster, and not out of any single case.

  3. There is an important preliminary point in the reasoning here:

    1. If it is a “medical” murder in a secure hospital environment — Spartan if — the list of possible suspects is limited to those with the skills, apparatus and access to carry out the murder.

    2. That means, if it is murder (again, a Spartan if), the murderer is almost certain to be an on-duty medical professional.

    3. It will be tempting to hear the second part of that sentence — “the murderer is almost certain to be an on-duty medical professional” — while ignoring the first part, upon which it is entirely conditional: “if it is murder”. This would be a grave mistake: as we have seen, murder by a medical professional in an intensive care unit is very, very unlikely.

  4. There’s a second subtle point here, too. Given the lack of strong social ties between a nurse and a patient, the usual psychological motivations for “one-off” murders — jealousy, grudge, age-old familial resentment — are unlikely. A nurse who murders is likely to be a sociopath. Sociopathic motivations are enduring: they are unlikely to stop at one victim. Conversely, someone who does “stop at one victim” is not likely to be a sociopath.

  5. Murderers who are not sociopaths will generally be in social networks with their victim: this is another way of saying “normally adjusted people don’t randomly murder strangers”. Nurses tend not to be in social networks with their patients — especially when those patients are infants. So if the murderer is not a sociopath, it is most unlikely to be a nurse. So we have this weird world where a nurse is either a serial murderer, or not a murderer at all. Since we suspect the nurse, we are obliged to seek out evidence of multiple offences.

  6. Back to our cluster. Besides opportunity, we don’t, yet, have any better grounds to suspect our nurse than any other nurse on the ward. We had better find some. Investigators systematically search the record for information to corroborate the suspicion, discarding information that tends to contradict it. The risk of confirmation bias arises.

  7. It would be better to seek evidence that there was no murder, as that is more likely. Without a pre-existing theory we are less prone to confirmation bias here: it could be, literally, anything. But looking for non-apparent reasons not to prosecute is not generally what detectives do.

  8. Instead, the police will engage new experts to sift through the records looking for evidence of murder that was previously missed. They will inspect x-rays, analyse immunoassay test results, re-examine post mortems, study internet search histories, interrogate ex-boyfriends and dig up gardens. Anything consistent with murder is fitted into the frame. Anything else is set aside.

  9. Inevitably, plenty of “circumstantial evidence” like this will turn up. Much of it will be ambiguous, probative of little, and highly prejudicial. The “medical evidence” arising from re-evaluated medical records will not personally incriminate the suspect but will instead be presented as evidence of criminality in the abstract: even though none of the duty doctors or pathologists detected foul play at the time, prosecution experts will be categorical, months after the fact, that there has been foul play and the records prove it. The original doctors and pathologists, who made the records, will not be called as witnesses.

  10. The tacit inference will be drawn that if any of the collapses are murder, some or all of the others are likely to be murder too, since such a sociopath would not be able to stop herself. There is a circularity here, though, because by the same logic a lack of other attacks would tend to imply that there was not a sociopathic nurse on the loose.

  11. In light of this re-evaluation the original cluster is revised, filtered, augmented and reassembled as a charge sheet. It will now much more clearly implicate the suspect: though she wasn’t present at all the events in the original cluster, she was present at every one of the deaths as now charged.

  12. Though there is still little evidence directly incriminating our suspect for any offence, by logical loop there are grounds to presume she committed all of them. This is a subtle, unjustified, logical shift.

  13. Remember our conditional probability above: “in the highly unlikely event of a murder, the murderer is almost certainly an on-duty medical professional.”

  14. The focus now switches to supporting the inference that there has been foul play by someone: it need not incriminate the suspect.

  15. In at least one case — it only needs to be one case — there will be plausible — but not definitive — grounds for that belief. The defendant may herself acquiesce in the view that “someone administered poison”, as Ms. Letby did in cross-examination.7

  16. With the presumption there has been a murder, the prosecution seems to have hurdled that tricky conditional probability. The weak sentence: “in the highly unlikely event of a murder, the murderer is almost certainly an on-duty medical professional” has turned into a very strong one: “There is a murderer, so it is almost certainly an on-duty medical professional”.

  17. The only question now is who.

  18. Of the thirty-odd medical professionals who could have been on duty, only one, the suspect, had the opportunity for each case. It does not matter that she does not match the behavioural profile of a serial killer: we have our suspect. It literally cannot have been anyone else.

  19. Two further subtle shifts in our reasoning have happened:

    1. We are now sure there is a serial murderer on the ward, so if there are any cases of foul play they are highly likely to be perpetrated by her. After all, the chance of there being two murderers on the ward operating independently is vanishingly remote.

    2. Without a known murderer on the ward, the chance a given collapse was murder is very low. But where there is a confirmed serial murderer operating in a ward, the chance a given collapse is a murder is much, much higher. The explanation that the suspect — whom we have deduced is a sociopath — murdered all of them — is suddenly very plausible.

And there, in a nutshell, is the reasoning, with confirmation bias, that gets you from “there is no evidence of murder whatsoever” to “the only logical explanation is that the nurse is a serial murderer, and she killed all of them”.
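
The pivotal move in the steps above is the quiet dropping of a condition. A minimal sketch makes the size of that jump visible. The numbers are placeholders chosen only to illustrate the shape of the argument: I am assuming a one-in-a-thousand chance that any murder happened at all, and reading “almost certainly” as 99%. Neither figure comes from any case.

```python
from fractions import Fraction

def p_murder_by_professional(p_murder: Fraction,
                             p_professional_given_murder: Fraction) -> Fraction:
    """Joint probability: P(there was a murder AND an on-duty professional did it)."""
    return p_murder * p_professional_given_murder

# Placeholder values, assumed for illustration only.
P_PROFESSIONAL_GIVEN_MURDER = Fraction(99, 100)  # "almost certainly"
P_MURDER_DAY_ONE = Fraction(1, 1000)             # "highly unlikely event of a murder"
P_MURDER_PRESUMED = Fraction(1)                  # the condition, silently dropped

honest = p_murder_by_professional(P_MURDER_DAY_ONE, P_PROFESSIONAL_GIVEN_MURDER)
inflated = p_murder_by_professional(P_MURDER_PRESUMED, P_PROFESSIONAL_GIVEN_MURDER)

print(f"With the condition intact:  {float(honest):.5f}")
print(f"With the condition dropped: {float(inflated):.2f}")
print(f"Inflation factor: {float(inflated / honest):.0f}x")
```

Nothing about the evidence changes between the two lines. The only difference is whether “if it is murder” is still being treated as an open question, and the inflation factor is simply the reciprocal of whatever prior was discarded.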

Conclusion

If you start with a relatively unlikely scenario and do not systematically gather evidence supporting more likely explanations, instead collecting anything, however tendentious, that fits your theory, you are at high risk of “bad” confirmation bias.

The conditions for a well-intended, but all the same wrongful, healthcare serial murder prosecution are the same ones that propel catastrophes of other kinds with which we are wearily familiar: financial crashes, air crashes, reactor meltdowns.

They are what Charles Perrow would call “normal accidents”: they do not depend on malice aforethought but are baked into our institutions. The system by its normal operation is fated to generate these outcomes every now and then. It will produce conditions that will lead to conviction of nurses for murder whether they are guilty or not.

When there are no witnesses, confessions, or clear motives, the most statistically likely explanation is that confirmation bias has manufactured a case against an innocent person.


References

1

Of these, two were dropped before trial, one was acquitted, two were acquitted on appeal, three are subject of live miscarriage applications, and two rest as unchallenged convictions.

2

This is somewhat facilitated by the possibility of labelling the same collapse as “suspicious” (ostensibly caused by foul play) if the suspect was present, but “unexplained” (of unclear origin, ambiguous, but presumptively innocent) if she was not.

3

Cheshire Police briefing note of meeting with COCH, 5 May 2017. The reference to Ms. Letby being a single parent appears to be an error.

4

The criminal standard of proof, “beyond reasonable doubt” is widely regarded to be something like 95%. The civil burden of proof of the “balance of probabilities” — a more explicitly probabilistic label — is 51%.

5

Notwithstanding inaccuracies in our sample caused by known unknowns and unknown unknowns, as discussed here. I have data and can prove this if anyone is interested!

6

Again, this is a conservative estimate: in 2025 there are 15 million.

7

This is the cross-examination in question:

Letby is asked if Child E was poisoned with insulin.
Ms. Letby: Yes I agree that he had insulin.
Mr. Johnson: Do you believe that somebody gave it to him unlawfully?
Ms. Letby: Yes.
Mr. Johnson: Do you believe that someone targeted him?
Ms. Letby: No.
Mr. Johnson: It was a random act?
Ms. Letby: Yes ... I don’t know where the insulin came from.
Mr. Johnson: Do you agree [Child L] was poisoned with insulin?
Ms. Letby: From the blood results, yes.
Mr. Johnson: Do you agree that someone targeted him specifically?
Ms. Letby: No ... I don’t know how the insulin got there. ... I don’t believe that any member of staff on the unit would make a mistake in giving insulin.
