On Forensic DNA Evidence and Forged DNA

For context's sake, let me say that at time of writing I am between Cork Airport and somewhere in London, attending the EU DIYbio Continental Congress, with the collective goal of developing a community code of bioethics and biosafety that we can hopefully ask the community at large to adhere to. I'll be summarising and expounding on the experience afterwards, but for now I'm motivated to discuss something mostly unrelated.

DNA Evidence; Great for a Portfolio, Bad in a Database

There are rumblings here in Ireland of a Garda DNA database, like those established elsewhere for law-enforcement purposes. As a geneticist, this worries me, and I'd like to share with you why that is. DNA evidence (generally in the form of DNA fingerprinting) is an incredibly powerful tool that has doubtless solved some of the most intractable crimes of our generation. I don't want anyone to think that I discourage the appropriate use of DNA evidence in solving crimes.

However, there is a general misconception, unfortunately endemic in the forensics community and the justice system (particularly among juries who've seen CSI), that DNA evidence is infallible. This is a dangerous and unjust fallacy.

DNA evidence normally takes the form of length polymorphisms at certain sites in the human genome; i.e. you and I will share the same regions of DNA but will probably differ in the length of those regions. By assembling a set of such sites and characterising the lengths of each, a so-called “fingerprint” can be assembled, which is almost always accurate enough to match a DNA sample from a crime scene to a suspect.

Note “almost always”, however. The likelihood of a match between a suspect and a sample being down to chance, assuming perfect laboratory technique and ignoring many confounding factors, lies somewhere between one in one hundred billion and one in five hundred quintillion, depending on the battery of tests. Therefore, if you have a suspect who you have reason to believe committed a crime and you can use DNA evidence to match them to the scene, it's practically airtight evidence that they were there. Not necessarily that they committed the crime, though, which is a whole other problem.

However, that's making some huge assumptions. Appealing to our cultural taste for movie and soap-opera drama, we must first consider monozygotic (identical) twins, who actually axe a huge chunk (up to seven orders of magnitude!) out of that statistical trophy all by themselves. Then there are chimeras (those who are composed of two or more fused embryos, and therefore have multiple genomes), who present an as-yet unknown confounding effect; certain studies have already documented how the chimera confounding effect has “proven”, using DNA evidence, that mothers are not the parents of their own children, for example. A chimera may have blood derived from one genome and cheek cells from another, for instance. Bone marrow transplant recipients could appear to have the blood of their donor, too, making them artificial chimeras.

More worrying still is the effect of operator or machine error. PCR is infamously temperamental, as those who perform it daily will attest, and PCR machines are prone to significant wear and tear which can affect their ability to accurately measure and alter temperatures. A poorly maintained or overused machine could produce off results, and a tired, inept or uncaring operator could easily misinterpret or taint results. On top of that, it's pretty easy for DNA evidence to get contaminated with other DNA samples (for precisely the same reason that it's so easy to get DNA evidence: it's really, really sensitive), whether through on-scene contamination or contamination with skin, breath or hair in the lab.

Indeed, even if the sample is perfectly analysed by perfect machines and operators, it might itself be too degraded for a full analysis; certain sites may be absent, with each missing site removing zeroes from the denominator of the spurious-results frequency.
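To make that arithmetic concrete, here is a minimal Python sketch of how a fingerprint's odds are built up and how quickly they shrink when sites go missing. It assumes the usual simplification that the sites are statistically independent, so the per-site frequencies simply multiply. The 13 sites and the 1-in-10 frequency at each are made-up round numbers chosen for illustration, not real population data, and the function name is hypothetical.

```python
from math import prod

def random_match_probability(genotype_frequencies):
    """Multiply per-site genotype frequencies, ASSUMING the sites are independent."""
    return prod(genotype_frequencies)

# HYPOTHETICAL numbers: 13 sites, each genotype shared by 1 person in 10.
full_profile = [0.1] * 13
# The same profile after degradation wipes out five of the sites.
partial_profile = full_profile[:8]

full = random_match_probability(full_profile)
partial = random_match_probability(partial_profile)

print(f"full profile:    1 in {1 / full:.3g}")
print(f"partial profile: 1 in {1 / partial:.3g}")
print(f"a chance match is {partial / full:.3g} times more likely with the partial profile")
```

Under these made-up numbers, each lost site costs exactly one zero; with real allele frequencies the loss per site varies, but the multiplicative effect is the same.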

Now imagine an entirely possible scenario: your ideal-scenario odds of one in 500 quintillion (already based on an assumption of entirely random genetic assortment, which I'm tempted to question) are axed by identical twins alone down to one in 50 trillion or so. Assuming that chimeras are very rare, they might take another factor of ten out, leaving us with one in five trillion. Still pretty good. Now apply the evidence that forensic error rates are moderate-to-high, the moderate likelihood of contamination, and the possibility of an incomplete analysis (which, at least, would have to be announced). Remember that the 0.2% of the population who are identical twins took us from 500 quintillion to 50 trillion, and plug in your own numbers for each of the above.
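If you'd like to plug in your own numbers, a rough Python sketch follows. It treats each source of error as independent and asks how likely it is that at least one of them produces a false match. The one-in-five-trillion figure is the one reached above; the lab error and contamination rates are placeholders I've invented purely to show the shape of the calculation, not measured values, and the function name is hypothetical.

```python
def combined_false_match_probability(error_sources):
    """Chance that at least one error source yields a false match (sources ASSUMED independent)."""
    p_no_error = 1.0
    for p in error_sources.values():
        p_no_error *= 1.0 - p
    return 1.0 - p_no_error

sources = {
    "coincidental genetic match": 1 / 5e12,  # one in five trillion, from the paragraph above
    "lab or operator error": 1 / 1e6,        # HYPOTHETICAL placeholder, substitute your own
    "sample contamination": 1 / 1e6,         # HYPOTHETICAL placeholder, substitute your own
}

combined = combined_false_match_probability(sources)
print(f"combined odds of a false match: roughly 1 in {1 / combined:,.0f}")
```

Whatever values you choose, the largest term dominates the result: once any human error rate is far above the genetic odds, the headline genetic figure barely matters.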

It's easy to arrive at an estimate below a one-in-a-million chance. I've seen professional estimates of less than one in six hundred thousand. Considering the scale of existing databases, that's already very worrying and implies that at least some people already convicted using DNA evidence are probably innocent. It gets even worse as databases swell, or start sharing data. Worse still when databases stop limiting themselves to prior offenders, which is likely to happen soon if nothing changes.
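The effect of database size can be sketched the same way. The snippet below takes the one-in-six-hundred-thousand figure at face value as the chance of a false match per comparison, and assumes one crime-scene sample is compared independently against every profile held. The database size of five million profiles is a hypothetical round number, not the size of any real database, and the function names are my own.

```python
def expected_chance_matches(database_size, false_match_probability):
    """Average number of spurious hits when one sample is compared against every profile."""
    return database_size * false_match_probability

def probability_of_any_chance_match(database_size, false_match_probability):
    """Chance of at least one spurious hit, ASSUMING independent comparisons."""
    return 1.0 - (1.0 - false_match_probability) ** database_size

false_match_probability = 1 / 600_000  # figure quoted above, taken at face value
database_size = 5_000_000              # HYPOTHETICAL database size

expected = expected_chance_matches(database_size, false_match_probability)
p_any = probability_of_any_chance_match(database_size, false_match_probability)

print(f"expected spurious hits per search: {expected:.1f}")
print(f"chance of at least one spurious hit: {p_any:.2%}")
```

The expected number of spurious hits grows in direct proportion to the number of profiles searched, which is why pooling or sharing databases makes matters worse rather than better.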

Considering this, the attitude evinced by law enforcement personnel worldwide toward DNA evidence needs to change, immediately. Even a well-instructed jury is open to fallacies based on poorly presented DNA evidence, and many law enforcement or court personnel are poorly informed or poorly trained in discriminating tight evidence from bad evidence. Even then, the “dark matter” of lab error probably won't be factored into the actual odds, artificially inflating them by leaps and bounds. By covering up their mistakes and claiming perfect technique, forensic lab technicians could be allowing others to be convicted falsely.

This is not to say that a database cannot be of use to an investigation, but a database match should only be used as a way to find potential candidates for investigation who have not already been identified by existing evidence. When database results alone lead one to a suspect, it's time to abandon that line of enquiry. Unfortunately, a “hit” will always bias a human against the person identified; even when coupled with genuine evidence, how can we be sure that justice was carried out impartially, or that justice remained blind in the face of a cold computer match? Perhaps worse, the fact that someone was even considered as a suspect can be enough to cause them social and professional harm, so how are they to be protected from spurious matches?

Forging DNA Evidence in the Kitchen

That's not all. DNA evidence may be pretty reliable today, but the near future is certain to see some incredible upheavals, though not before many more innocent people are found guilty of crimes they had no knowledge of.

The reason I say this is pretty simple; given the PCR primers that are used to perform DNA fingerprinting analysis and some junk DNA, perhaps extracted from my own cells and sonicated, I could probably forge some false fingerprints in short order.

By sonicating DNA to dice it up into a collection of randomly sized fragments and selecting the fragments of the size I want, I could ligate the primers to either end of a fragment to generate a false fingerprint for a given set of primers. Indeed, if I design my primers to be degenerate and self-amplifying, I could even do away with stock DNA and use just the primers to assemble a set of length polymorphisms, potentially matching the mimicked sites at the code level, too.

Given a little time, I could quickly generate a mix of synthetic DNA that would easily foul and fool any forensic test. Worse, if I had a sample of someone else's DNA to work with, I could very easily forge their “fingerprint” and spray it everywhere. Doing so, in fact, would be easier than the above methods by far; just by performing the fingerprinting analysis I am creating a set of DNAs that can be used to forge that fingerprint.

One can easily imagine a “kit” consisting of every DNA polymorphism for every fingerprinting analysis locus, from which false “alleles” can be chosen to forge any desired fingerprint, and which would easily fit in a briefcase (if consisting of tube samples, although one could equally spot the DNA onto sheets of paper and have a “book” that might fit in a pocket).

If you'd been thinking until now “ok, why not outlaw PCR to prevent this?”, think again; firstly, that'd be practically impossible, because the requirements for a basic PCR are quite low and can be met without too much trouble. Secondly, once someone has done it even once, kits for forging evidence would be trivially easy to sell on the black market. I could conceal a dried set of forged fingerprints on the pages of an old Nancy Drew book and send it by post; all the recipient would have to do is snip some paper out and dissolve it in water, then spray it wherever desired.

All this means two things, in plain summary:

  1. DNA fingerprinting is, at present, used badly and can easily incriminate innocent people. Indeed, it probably has. This is almost entirely due to the use of databases, as supplementary use of DNA is still fantastic.
  2. DNA fingerprinting is already easier to forge than a fingerprint, given a little savvy. With some relatively trivial work by an amateur biotechnician using readily available techniques, it could be rendered entirely irrelevant, and the forgeries could then be shared with others in a virtually untraceable manner.

In other words, it's time to return to genuine forensics, guys. DNA evidence should be taken only as part of a greater portfolio of evidence against a likely suspect. Building databases will only bake bias into the system and distract from more trustworthy methods.
