Mutated Manuscripts: The Evolution of Genes and Texts


Recently, Christian G. Specht, a neurobiologist, wrote an intriguing piece in The Scientist about how scientific citations -- references by one paper to previous ones -- mutate over time. What Specht meant is that a popular paper, instead of being read and cited directly, often gets cited by copying the reference from another paper's citation list. This somewhat lazy approach is unfortunately all too common, and if one scientist mistypes the reference, there is suddenly a mutated version out there that other scientists copy in turn, leading to a proliferation of errors. By studying these mutations, you can learn about the history of the article that is being cited.

But, of course, such errors are not confined to modern scientific articles. People have been miscopying text for thousands of years. And understanding the errors in these manuscripts is actually quite similar to understanding genetics. This sounds a bit odd. What do handwritten manuscripts, from the medieval period or earlier, have to do with genetics? On the surface, nothing: one is a hard experimental science and the other is a distinguished part of the humanities. However, while those who study each of these fields have very little to do with each other, it turns out that there is a great deal of symmetry between them. And it mainly comes down to mutation. Scholars who study paleography (the field of research that examines ancient writing) are all too well aware of the mistakes that scribes make when copying a text. These types of errors, which can be used to understand the provenance and history of a document, are actually nearly identical to the types of errors made by polymerase enzymes, the proteins responsible for copying DNA strands.

It's clear what a mutation is in genetics: a strand of DNA gets hit by a cosmic ray, or is copied incorrectly, and some error gets introduced into the sequence. For example, an 'A' gets turned into a 'G', although mutations can also be much larger. Their effects range from causing no problem whatsoever (don't worry, the majority are like this) to causing large-scale issues from a change in a single letter of DNA, as in the case of sickle-cell anemia.

Well, there are also systematic errors in copying a text. Whether it's skipping a word or duplicating it, there is order to the ways in which a scribe's mind wanders during transcription. Many of the errors can be grouped into categories, just like the different types of genetic mutations. And not only are there regularities in how both DNA and ancient manuscripts are copied, but it gets even better: despite the differences in terminology, these types of errors are often identical.

For example, there is a scribal error known by the Greek term homeoteleuton. This refers to a type of deletion in which two passages end similarly and the scribe's eye jumps from the first ending to the second, so that the text in between is never transcribed. For example, if a section read, "And you should do the following things because I am the Lord. Here's what you should do, because I am the Lord. Amen." and it was instead copied as "And you should do the following things because I am the Lord. Amen." that would be a homeoteleuton.

Well, in genetics this kind of error is a mutation known as slipped-strand mispairing. AATTCGATATACGA gets copied as AATTCGA. Smaller deletions of this kind also occur in manuscripts, where they are known as haplography: the miscopy and deletion happen within a single word (like going from metoposcopy to metoscopy, both quite rare words).
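
To make the parallel concrete, here is a minimal sketch in Python of the "eye-skip" behind all three examples. The function name eye_skip and its parameters are invented for illustration, and it is a toy model of a careless copyist, not a model of how a polymerase actually slips: on reaching the first occurrence of a repeated ending, the copyist resumes just after the second occurrence.

```python
def eye_skip(text: str, ending: str) -> str:
    """Copy `text`, but jump from the first occurrence of `ending`
    to just after its next occurrence, dropping what lies between.

    Hypothetical helper for illustration only.
    """
    first = text.find(ending)
    if first == -1:
        return text  # the ending never appears: nothing to skip
    resume_from = first + len(ending)
    second = text.find(ending, resume_from)
    if second == -1:
        return text  # the ending is not repeated: faithful copy
    # Keep everything up to the first ending, then resume after the second.
    return text[:resume_from] + text[second + len(ending):]


passage = ("And you should do the following things because I am the Lord. "
           "Here's what you should do, because I am the Lord. Amen.")

print(eye_skip(passage, "because I am the Lord."))  # homeoteleuton
print(eye_skip("AATTCGATATACGA", "CGA"))            # AATTCGA
print(eye_skip("metoposcopy", "o"))                 # metoscopy
```

The same few lines reproduce the scribal example, the DNA example, and the single-word haplography, which is really the whole point: only the alphabet changes.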

Insertions can occur during copying in both genetics and paleography as well. This is called dittography for manuscripts and, well, insertion in genetics. There are also reversals: metathesis in paleography and chromosomal transpositions in genetics. And point mutations, the substitution of a wrong base when copying DNA, have a counterpart in handwritten manuscripts too. In both cases, the wrong letter is written, and the likeliest mistakes are between letters that resemble each other. In DNA, A and G are quite similar chemically (both are purines) and can be confused relatively easily. In ancient Greek, lambda and delta look similar and are more likely to be exchanged as well. And the list goes on.
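
These three kinds of error can be written as equally small string operations. The sketch below is only illustrative: the function names are borrowed from the paleographic terms, the example strings are made up, and nothing here models which errors are more or less likely.

```python
def dittography(text: str, start: int, end: int) -> str:
    """Insertion: the span text[start:end] is accidentally written twice."""
    return text[:end] + text[start:end] + text[end:]


def metathesis(text: str, i: int) -> str:
    """Reversal: the characters at positions i and i + 1 swap places."""
    if not 0 <= i < len(text) - 1:
        raise IndexError("need two adjacent characters to swap")
    return text[:i] + text[i + 1] + text[i] + text[i + 2:]


def substitution(text: str, i: int, wrong: str) -> str:
    """Point error: the character at position i is replaced by `wrong`."""
    if not 0 <= i < len(text):
        raise IndexError("position outside the text")
    return text[:i] + wrong + text[i + 1:]


print(dittography("the Lord", 0, 4))    # the the Lord
print(metathesis("form", 1))            # from
print(substitution("GATTACA", 1, "G"))  # GGTTACA
```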

While it is fun to chronicle such similarities, they can also be exploited in the same way. Mutational differences between DNA sequences can be used to understand the evolutionary history of a population, or even a group of species. And so too with variants of the same manuscript. A famous example of this is a 1998 research article in the journal Nature that quantitatively studied the differences between the 80 surviving versions of Geoffrey Chaucer's The Canterbury Tales. By subjecting the variants to a battery of analyses borrowed from genetics, the researchers were able to better understand the contents of the ancestral version, Chaucer's own copy!
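
As a rough illustration of the first step in such an analysis, the sketch below counts the copying errors that separate each pair of versions and finds the closest pair. It is not the method of the Nature study: it uses plain edit distance (the minimum number of single-letter insertions, deletions, and substitutions), and the three "witnesses" are invented variants of the poem's opening line, labeled A, B, and C purely for the example.

```python
from itertools import combinations


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between a and b, computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


# Hypothetical witnesses: invented variants, not real manuscript readings.
witnesses = {
    "A": "whan that aprill with his shoures soote",
    "B": "whan that aprill with his shoures sote",
    "C": "whan that aprill with shoures sote",
}

# Distance for every unordered pair of witnesses.
pairwise = {
    (x, y): edit_distance(witnesses[x], witnesses[y])
    for x, y in combinations(sorted(witnesses), 2)
}
closest = min(pairwise, key=pairwise.get)

print(pairwise)  # {('A', 'B'): 1, ('A', 'C'): 5, ('B', 'C'): 4}
print(closest)   # ('A', 'B')
```

Versions that share an error are more likely to descend from a common copy, so a table of distances like this is the raw material from which a family tree of manuscripts, or of species, can be inferred.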

And in fact, the relationship between the fields has become more than figurative or expedient. Recent research has brought paleography and genetics together through the examination of the DNA of the animal skins on which the texts were written. In the past few years, scientists have sequenced genetic material from these ancient manuscripts in order to determine which animals furnished the raw materials for writing.

I look forward to the day when studying biology is a prerequisite for a PhD in Classics, and biblical criticism can help round out your doctorate in evolutionary biology.