We are now firmly into 2020 in my blog backlog, and that was, as you presumably remember, so very different a year that I amassed rather fewer stubs than usual and might even move through it mercifully quickly. For now, however, we’re in mid-February of that year, when an old friend who likes to scour the Internet for medievalist news, or, as in this case, even older, picked up on a recent study of digital methods for dating ancient texts and posed me the reflection which forms the title above: was this digital palæography finally coming of age?1
Now, I am less concerned than some have reason to be about the possibility of my expertise and training being replaceable by automation, although with every attempt to automate marking or package teaching content in such a way that anyone can deliver it whether expert or not, we get a step closer.2 Still, the actual doing of historical analysis, whether I am paid for that or not, will probably remain a thing beyond computerised automation until we somehow go full-on Hari Seldon, and the database categories you’d need for such an analysis will probably take a few more civilisations to work out, so I think I’m safe. But at the fringes of the historical endeavour, if I was picking a discipline for highest vulnerability to digitisation and automation, it might well be palæography. That’s not just because almost no institution wants to pay for there to be palæographers, despite the near endless potential they have for research contributions; it’s also because at its absolute basic simplest, the discipline of palæography is based on the ability to recognise consistent graphical patterns, that is, letter-forms, and graphical pattern recognition (rather than social pattern recognition à la Seldon) is a thing computers are good at.
Accordingly, it’s not surprising that almost since computing and the humanities first tentatively shook hands, people have been trying to get computers to recognise and date ancient and medieval scripts. The earliest reference I have on this goes back to 1994 and relates to Egyptian papyri, and that was little more than an expression of hope, but by 2006, when I myself was briefly professionally interested in image recognition, people were getting closer.3 Back then the academic work was ahead of Google Image Search, but that lead didn’t last long, and soon technology like theirs was getting into humanities computing labs and I was seeing papers about it.4 Now those papers are coming out and people are clearly making great progress, especially it seems with South Asian scripts, so the fact that the one my friend had pointed me to existed was not surprising to me.5 But whether because she hadn’t been looking for this sort of stuff already or because I am just more cynical, I wasn’t expecting as much from this article as my friend suggested was in it.
There are, I guess, at least three ways a scientific study on something from my periods of interest can disappoint. The most annoying is when even I can see that it’s scientifically faulty, because of minuscule sample size, unconsidered error margins, lack of reproducibility or whatever.6 Nearly as annoying is when the science appears to be good but the historical context is more or less derived from the 1950s textbooks which apparently supplied either the lead researchers’ own undergraduate study or the Wikipedia page on which they based their questions; that’s annoying because they could just have asked (and then ideally credited) a historian, and I myself would love to be asked, so you know, come on.7 But much the most common and least reproachable, though still annoying for the non-scientific reader, is the study which is actually out to test or validate a method, not to find out something historical, and which therefore stops at ‘we have therefore shown that this could work’ without actual results.8 And this is one of those, a study of how we might digitally date the many undated fragments among the Dead Sea Scrolls which, nonetheless, does not actually date any of them, because what its authors are trying to do is make their systems match the dates humans have already assigned to such fragments.
You might then ask why, if they in fact had a viable method demonstrated, they didn’t at least go so far as to show it in action. It might have been because they were attempting to avoid the risk of showing their historical ignorance, like those behind a new pottery dating method back in the day; but actually, it’s worse: they didn’t yet have a viable method.9 Instead, their conclusions section is full of fixes which might be applied to make the method work better: a new date-calculation method which would, ideally, not require even intervals (which they didn’t have, because the palaeographical datings they were trying to match worked in historic periods, not mathematical ones), or a specialised Hebrew character recognition tool, for example.10 Their error margins were reckoned to be about 23 years either side of the central year in any given dating period; that would be better than the few radio-carbon dates that have come off the Scrolls, if it were accurate, but when one of the periods into which they are trying to date fragments is only thirty years long – less, we might note, than the lifespan of most of the people writing in the appropriate style – you can see how that wasn’t enough.11 It doesn’t quite end with ‘so, back to the drawing board’, but it’s very much, ‘don’t come in, we’re not ready yet’.
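The mismatch between those error margins and the periodization can be put in a couple of lines of arithmetic. Here is a toy illustration in Python: the thirty-year span and the ±23-year margin are the figures reported above, but the example period and its dates are entirely invented for the sketch.

```python
# Toy illustration (invented period; only the 30-year span and the
# +/-23-year margin come from the reported figures): how a +/-23-year
# error margin around a period's central year compares with a
# 30-year palaeographical period.
period_start, period_end = -50, -20          # a hypothetical 30-year period
centre = (period_start + period_end) / 2     # central year the method aims at
margin = 23                                  # reported error either side

low, high = centre - margin, centre + margin
window = high - low                          # total uncertainty window

print(f"period length: {period_end - period_start} years")   # 30
print(f"uncertainty window: {window:.0f} years")             # 46
print(f"spills outside the period: {window > (period_end - period_start)}")  # True
```

In other words, the method’s uncertainty window (46 years) is half as wide again as the period it is supposed to be assigning fragments to, so a ‘correct’ central year still licenses dates well into the neighbouring periods.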
For me, however, this study does not fail because of the weakness of the computing techniques used. I’m quite prepared to believe that for the values they’ve set up, those techniques could be refined, and at least they eliminate several as being unhelpful for the endeavour. But the problem they don’t see is the human element, in two places: in the creation of their source matter and in the provision of their classifications. The latter of these, the fact that the datings they were trying to train their method to match were all subjective by-eye evaluations by human beings, be they never so learned, the authors at least wave at in the introduction, saying that one advantage of a digital palæographical method might be to reduce subjectivity before proposing one based entirely on subjectively derived datings.12 But the fact that humans, individual ones many of whose working lives probably overlapped their period boundaries, actually made the things they’re trying to date, almost eludes them. They do admit that scribes demonstrably change their writing styles over time, before saying that they are after a method which captures period-level shift in script instead; but they don’t seem to see that the former factor is a component of the latter.13 This is partly just the problem of database categorisation: something must fall one side of a line or the other; it can’t be ‘sort of both’.14 But it’s also humans in action, muddling along, trying something different, going back to the old ways disappointed, maybe trying again later. Every one of those decisions and choices could throw a close palæographical dating way out. A good palæographer knows all this and tries, subjectively, to account for it with context and background knowledge. Remove that subjectivity, and every palæographical judgement would need to come with huge error bars which would be labelled, if there were space, ‘unless this is a weird one’.
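To make the categorisation problem concrete, here is a toy sketch in Python. The period labels and dates are merely illustrative, loosely modelled on conventional Dead Sea Scrolls script-period names, and the scribe is invented; the point is only that a database field which must put each hand in exactly one period will split a single scribe’s output across a boundary.

```python
# A toy of the categorisation problem (hypothetical periods and an
# invented scribe, not anyone's real data): a hard period assignment
# cannot say 'sort of both'.
from dataclasses import dataclass

# (label, start year, end year) -- illustrative boundaries only
PERIODS = [("Hasmonaean", -150, -50), ("Herodian", -50, 70)]

@dataclass
class Scribe:
    name: str
    active_from: int
    active_to: int

def assign_period(year: int) -> str:
    """Hard assignment: the year must fall one side of the line."""
    for label, start, end in PERIODS:
        if start <= year < end:
            return label
    raise ValueError("outside all periods")

scribe = Scribe("Scribe A", active_from=-60, active_to=-40)  # straddles the line
# Two fragments by the same hand, written ten years apart, land in
# different categories.
print(assign_period(-55))  # Hasmonaean
print(assign_period(-45))  # Herodian
```

One scribe, one working style, two labels: exactly the kind of human muddle that a period-level classifier is obliged to misdescribe.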
Long ago, a then-lawyer friend of mine angrily told me in a pub, “the trouble with you historians, Jon, is you forget that people are weird!” Probably a fair complaint; but I’m not the only one guilty… So in the end perhaps the human palæographer has not yet got to fear robotic replacement: the computers will certainly end up better able to match patterns than we can, but the task of working out what the patterns mean is going to remain gloriously and resistantly fuzzy.15
1. Maruf A. Dhali, Camilo Nathan Jansen, Jan Willem de Wit & Lambert Schomaker, “Feature-extraction methods for historical manuscript dating based on writing style development”, edd. Francesca Fontanella, Francesco Colace, Mario Molinara, Alessandra Scotto di Freca & Filippo Stanco in Pattern Recognition Letters Vol. 131 (Amsterdam 2020), pp. 413–420, DOI: 10.1016/j.patrec.2020.01.027.
2. Cf. Innovating Pedagogy: Exploring new forms of teaching, learning and assessment, to guide educators and policy makers by Agnes Kukulska-Hulme, Carina Bossu, Tim Coughlan, Rebecca Ferguson, Elizabeth FitzGerald, Mark Gaved, Christothea Herodotou, Bart Rienties, Julia Sargent, Eileen Scanlon, Jinlian Tang, Qi Wang, Denise Whitelock & Shuai Zhang, Open University Innovation Report 9 (London 2021), online here, or Wayne Holmes & Ilkka Tuomi, “State of the art and practice in AI in education” in European Journal of Education Vol. 57 (Oxford 2022), pp. 542–570, DOI: 10.1111/ejed.12533, which both think otherwise.
3. The 1994 paper is Janet Johnson, “Computers, Graphics and Papyrology” in Adam Bülow-Jacobsen (ed.), Proceedings of the 20th International Congress of Papyrologists, Copenhagen, 23–29 August 1992 (Copenhagen 1994), pp. 618–620. By 2007 one could also count Ikram Moalla, Frank LeBourgeois, Hubert Emptoz and Adel M. Alimi, “Contribution to the Discrimination of the Medieval Manuscript Texts: Application in the Palaeography” in Horst Bunke and A. Lawrence Spitz (edd.), Document Analysis Systems VII: Proceedings, Lecture Notes in Computer Science 3872 (Berlin 2006), pp. 25–37, or M. Bulacu and L. Schomaker, “Automatic Handwriting Identification on Medieval Documents” in 14th International Conference on Image Analysis and Processing (ICIAP 2007) (New York City NY 2007), pp. 279–284, online here, one of the authors of which shows up again in the paper under discussion. I’m sure there was lots more. The team I was part of myself was concerned with coins (inevitably) and showed up with Martin Kampel, “Computer Aided Analysis of Ancient Coins” in Robert Sablatnig, James Hemsley, Paul Kammerer, Ernestine Zolda and Johann Stockinger (edd.), Digital Cultural Heritage – Essential for Tourism (Wien 2008), pp. 137–144, and eventually Jonathan Jarrett, Sebastian Zambanini, Reinhold Hüber-Mork and Achille Felicetti, “Coinage, Digitization and the World-Wide Web: numismatics and the COINS Project” in Brent Nelson and Melissa Terras (edd.), Digitizing Medieval and Early Modern Material Culture (Tempe AZ 2012), pp. 459–489.
4. For example, Arianna Ciula, “The Palaeographical Method under the Light of a Digital Approach”, presented at the International Medieval Congress, University of Leeds, 8 July 2008, and Peter Stokes, “Computing for Anglo-Saxon Paleography, Manuscript Studies and Diplomatic”, presented at the International Medieval Congress, University of Leeds, 13 July 2011, both mentioned here in their seasons.
5. Ciula’s did, at least, as Arianna Ciula, “The Palaeographical Method Under the Light of a Digital Approach” in Malte Rehbein, Patrick Sahle & Torsten Schaßan (edd.), Kodikologie und Paläographie im digitalen Zeitalter. Codicology and Palaeography in the Digital Age (Norderstedt 2009), pp. 219–235; Stokes’s I haven’t seen, but he did mastermind DigiPal, so it’s not like he left the game. One could also see Florian Kleber, Robert Sablatnig, Melanie Gau and Heinz Miklas, “Ruling Estimation for Degraded Ancient Documents based on Text Line Extraction” and Maria C. Vill, Melanie Gau, Heinz Miklas and Robert Sablatnig, “Static Stroke Decomposition of Glagolitic Characters”, both in Sablatnig, Hemsley, Kammerer, Zolda & Stockinger, Digital Cultural Heritage, pp. 79–86 & 95–102, or Jinna Smit, “The Death of the Palaeographer? Experiences with the Groningen Intelligent Writer Identification System (GIWIS)” in Archiv für Diplomatik Vol. 57 (München 2011), pp. 413–425, as steps along the way, and Mike Kestemont, Vincent Christlein and Dominique Stutzmann, “Artificial Paleography: Computational Approaches to Identifying Script Types in Medieval Manuscripts” in Speculum Vol. 92 (Cambridge MA 2017), pp. S86–S109, for where we are now or were recently. Again, I could cite lots more. On South Asian scripts, see Shaveta Dargan and Munish Kumar, “Gender Classification and Writer Identification System based on Handwriting in Gurumukhi Script” in International Conference on Computing, Communication, and Intelligent Systems (ICCCIS 2021) (New York City NY 2021), Vol. I, pp. 388–393, online here, and S. Brindha and S. Bhuvaneswari, “Repossession and recognition system: transliteration of antique Tamil Brahmi typescript” in Current Science Vol. 120 (Bengaluru 2021), pp. 654–665.
6. Discussed here but harmless: Michael McCormick, Paul Edward Dutton and Paul A. Mayewski, “Volcanoes and the Climate Forcing of Carolingian Europe, A.D. 750-950” in Speculum Vol. 84 (Cambridge MA 2007), pp. 869–895. Nastier: Mario Slaus, Zeljko Tomicić, Ante Uglesić and Radomir Jurić, “Craniometric relationships among medieval Central European populations: implications for Croat migration and expansion” in Croatian Medical Journal Vol. 45 (Zagreb 2004), pp. 434–444, PMID: 15311416.
7. S. R. H. Jones, “Devaluation and the Balance of Payments in Eleventh-Century England: an exercise in Dark Age economics” in Economic History Review 2nd Series Vol. 44 (1994), pp. 594–607; for an example where they did ask a historian but then didn’t credit her, see Susan M. Adams, Elena Bosch, Patricia L. Balaresque, Stéphane J. Ballereau, Andrew C. Lee, Eduardo Arroyo, Ana M. López-Parra, Mercedes Aler, Marina S. Gisbert Grifo, Maria Brion, Angel Carracedo, João Lavinha, Begoña Martínez-Jarreta, Lluis Quintana-Murci, Antònia Picornell, Misericordia Ramon, Karl Skorecki, Doron M. Behar, Francesc Calafell and Mark A. Jobling, “The Genetic Legacy of Religious Diversity and Intolerance: Paternal Lineages of Christians, Jews, and Muslims in the Iberian Peninsula” in American Journal of Human Genetics Vol. 83 (Bethesda 2008), pp. 725–736, DOI: 10.1016/j.ajhg.2008.11.007, where Dolors Bramon is acknowledged p. 734.
8. For example Alice M. W. Hunt and Robert J. Speakman, “Portable XRF analysis of archaeological sediments and ceramics” in Journal of Archaeological Science Vol. 53 (Amsterdam 2015), pp. 626–638, which more or less says, ‘this is a silly thing to do but if you must, here’s how’; cf. Warren W. Esty, “Estimation of the Size of a Coinage: a Survey and Comparison of Methods” in Numismatic Chronicle Vol. 146 (London 1986), pp. 185–215, for another example from a different discipline.
9. My whipping boy this time is Moira A. Wilson, Margaret A. Carter, Christopher Hall, William D. Hoff, Ceren Ince, Shaun D. Savage, Bernard McKay & Ian M. Betts, “Dating fired-clay ceramics using long-term power law rehydroxylation kinetics” in Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences Vol. 465 (London 2009), pp. 2407–2415, DOI: 10.1098/rspa.2009.0117, on whose problems see my old Cliopatria post here.
10. Dhali & al., “Feature-extraction methods”, p. 419.
11. Ibid., p. 418 (error margins) & pp. 414–415 (periodization), with the problems it causes expressed p. 419.
12. Ibid., pp. 413 and 413–414.
13. Ibid. p. 414. For more on the problem see Jesús Alturo and Tània Alaix, “Categories of Promoters and Categories of Writings: The Free Will of the Scribes, Cause of Formal Graphic Differences” in Barbara Shailor and Consuelo W. Dutschke (edd.), Scribes and the Presentation of Texts (from Antiquity to c. 1550), Bibliologia 65 (Turnhout 2021), pp. 123–149.
14. Cf. Jonathan A. Jarrett, “Poor tools to think with: the human space in digital diplomatics” in Antonella Ambrosio, Sébastien Barret and Georg Vogeler (edd.), Digital diplomatics: The computer as a tool for the diplomatist?, Beihefte der Archiv für Diplomatik 14 (Köln 2014), pp. 291–302.
15. It wasn’t deliberate, but it’s probably no coincidence that the position I thus finish with is similar to that in Smit, “Death of the Palaeographer?” and Arianna Ciula, “Digital palaeography: What is digital about it?” in Digital Scholarship in the Humanities Vol. 32 Supplement 2 (Oxford 2017), pp. ii89–ii105, DOI: 10.1093/llc/fqx042.
I wouldn’t start with the scrolls. I’d start with stuff where the dates are well established rather than expert guesses, and try my techniques on them. Then you can bin the techniques that fail but still entertain the ones that succeed.
How about choosing parchments from medieval monasteries? Are there large numbers that are closely datable?
There are, of course, you’re right; all my charters, for a start, and while there’s some argument about whether charter hands and book hands are the same and whether a scribe in common can be identified if he or she used both, that would, I suppose, be among the things one set out to test. Lots of manuscripts from later on have dated colophons, too, though sometimes they themselves are copied from older manuscripts. Still, these are all things palaeographers are used to getting round. The problem I foresee is that script development isn’t like tree-ring chronology, where if we just get enough bits they must all join up on the same continuum. Being able to reliably machine-date 13th-century European Gothic script per region, for example, wouldn’t necessarily mean that the same engine would be OK with 6th-century Greek uncial, or any script culture where there was not so great a training set now available. I don’t think one could create a universal digital palaeographer. Maybe this is why as far as I know no-one has tried what you suggest. And all the same, I feel that you’re right that using a closely defined and well-dated sample would still be a better way to validate one’s methods than so enigmatic a corpus as the Dead Sea Scrolls.
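For what it’s worth, the validation my interlocutor proposes could be sketched in a few lines; everything below (the techniques, the documents and the dates) is invented for illustration. The idea is simply to score each dating technique against documents whose dates are secure, such as dated charters, and keep or bin it on its average error.

```python
# A sketch of validation against securely dated documents (all data
# invented): techniques are kept or binned on their mean absolute
# error against known dates.
from statistics import mean

# (year actually recorded in the charter, year the technique estimated)
RESULTS = {
    "technique_A": [(1210, 1215), (1236, 1228), (1250, 1249)],
    "technique_B": [(1210, 1160), (1236, 1290), (1250, 1202)],
}

TOLERANCE = 25  # acceptable mean absolute error in years (arbitrary)

for name, pairs in RESULTS.items():
    mae = mean(abs(known - guessed) for known, guessed in pairs)
    verdict = "keep" if mae <= TOLERANCE else "bin"
    print(f"{name}: mean error {mae:.1f} years -> {verdict}")
```

Even this crude version shows the shape of the test: the ground truth has to come from outside the technique being tested, which is precisely what the Dead Sea Scrolls corpus cannot provide.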
Come to think of it, if you have access to ChatGPT 4.0 would you like to throw it some questions on this topic?
Or, more thrilling, maybe ask it to decrypt Linear A?
I don’t actually know if ‘we’ do have such access, I admit; we’re currently much more concerned that our students do, and are not institutionally able to adopt the most obvious solution to that problem, viz. change our assessments very quickly. But we are into something new, there; I did not use to walk the corridors of a university and walk past casual and fairly shameless conversations about cheating on assessments as I now regularly do. Anyway… As I understand the GPT thing – the name stands for ‘generative pre-trained transformer’ – it works through prediction of patterns, not analysis of any kind; thus, if we fed it a large enough test set of medieval graphemes, it might reasonably be able to guess what a given text was, but it wouldn’t read it or parse it. That might, just, work for Latin palaeography some of the time; it might be able to do what current lexomic software can already do and identify the text, though it would then presumably smooth out any exciting variants in a given copy. But it wouldn’t read what was on the page, and it couldn’t work at all for Linear A. At best it could try and guess what the text might say, but we’d then not be able to rely on that or deduce anything from it. And to be honest, there are codebreaker programs that would be more obviously applicable, though I don’t know if any have ever been applied…