I’m designing the graduate seminar I’ll teach in the Department of English this fall (2015) on the subject of ‘Algorithmic Criticism,’ a title I took from the subtitle of Stephen Ramsay’s 2011 book, Reading Machines. It’s an introduction to computational text-analysis for students of literature, from word frequency to topic modelling.
By the end of the course, students will be comfortable moving between close reading and distant reading, or what Matthew Jockers calls micro-, meso-, and macro-analysis. (Along with Ramsay’s book, Jockers’ 2013 study Macroanalysis and his 2014 guide, Text Analysis with R for Students of Literature, will be required readings.)
Students will learn some programming basics in Python and R and put them into practice, so they can see what happens when natural-language processing and other tools parse and rearrange the words of individual texts and of larger corpora. I haven’t developed more detailed course outcomes than that. We’ll use Codecademy’s Python tutorials alongside Jockers’ book on R.
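To give a rough sense of the kind of first exercise I have in mind, here is a minimal word-frequency sketch in Python – only an illustration, not part of the syllabus, and the filename is a stand-in for whatever public-domain text we end up pulling down from a repository like Project Gutenberg:

```python
import re
from collections import Counter

# Hypothetical input: a plain-text, public-domain novel saved locally
# (say, a Project Gutenberg download).
with open("novel.txt", encoding="utf-8") as f:
    text = f.read().lower()

# A deliberately crude tokenizer: keep runs of letters and apostrophes.
words = re.findall(r"[a-z']+", text)

# Count every word and print the fifty most frequent, tab-separated.
counts = Counter(words)
for word, n in counts.most_common(50):
    print("{}\t{}".format(word, n))
```

Even something this small should be enough to start a classroom conversation about what counts as a ‘word,’ why stopwords dominate the top of the list, and what it means to read a novel as a bag of frequencies.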
So which literary texts do you assign for old-fashioned linear close readings in a course like this? They should be long enough to give us plenty of words to work with, and complex enough to contain a wide range of topics. They should provide good contrasts with each other – that is, differ noticeably in vocabulary and subject matter – yet be close enough in time that the comparison makes sense. And they should be in the public domain, so we’re free to manipulate the texts in whatever repository we draw them from.