Authorial London

Authorial London, a literary geography web application launched by Stanford University Libraries’ Center for Interdisciplinary Digital Research (CIDR) in April 2016, aims to facilitate exploration of place in literature for researchers, students, and the general public. Users can explore the project corpus through biographical, literary, and geographical perspectives: displaying, mapping, and filtering (to date) roughly 1600 place-inflected passages from 193 works by 47 authors who lived in and wrote of London between the 14th and 20th centuries. A generalized version of the project software, Authorial {X}, will be made publicly available this summer.

As detailed below, this effort extends earlier work by Professor Martin Evans of the English Department, and has been developed collaboratively by Kenneth Ligda, a literary scholar and instructional designer in the Digital Learning Design Team of the Vice Provost for Teaching and Learning (VPTL), and Karl Grossner, a geographer and research developer for CIDR.

Background

Three stages of Authorial London development

In the Fall of 2014, Ligda, then Academic Technology Specialist in Stanford’s English Department, was awarded a CIDR development team grant to revamp Authorial London, a project initiated in 2004 by Professor Evans as a resource for students, with particular regard for those who didn’t have the chance to visit London in person. Working from a deep knowledge of London literature, Evans had first created a Literary London web site in 2004. Renamed Authorial London, A Student’s Guide in 2009, the site featured short, London-specific biographies of 47 writers spanning from Geoffrey Chaucer to Sylvia Plath, together with contemporary pictures Evans had taken while visiting London.

In 2010 Evans enlisted Stanford Libraries’ Digital Humanities Specialist Elijah Meeks to build a new site integrating interactive maps. Titled A Guide to Authorial London, it was launched publicly in 2011. Sadly, Evans passed away in 2013, and not only was the site now untended, its underlying technology (Adobe Flash/Flex) would soon become obsolete.

The evanescence of online projects is notorious, and allowing a site as rich and intriguing as Authorial London to slip into oblivion would be an extravagant waste. This was an opportunity for the University, via CIDR, to take the long view of intellectual endeavor: a quietly revolutionary move to put institutional support behind significantly furthering a digital project and placing it on a more sustainable technological footing.

What to do?

Surveying the site with an eye to improvement, we began with the obvious and moved to the exciting. First some basics: take it out of Flash and into a more powerful and adaptive open-source JavaScript environment. Make the maps more prominent. Make navigating the authors easier and more flexible by adding filtering for attributes like period, genre, literary form, and social standing. But here was an opportunity to do much more. Professor Evans had envisioned opening up the site for contributions, especially on the many great authors not yet included. So the site should be extensible, supporting future contributions. And then there was something both obvious and exciting: the 47 subjects of the site were united by being authors, so how wonderful it would be to actually allow exploration of their writings. As a geographer, Karl wanted to enable exploration of literary representations of particular places and how they have altered over time. This dovetailed perfectly with Kenneth’s interest in mapping narrative passages, comparing locations in authors’ and literary communities’ lives and works, and studying place as a literary device. We would have to build a corpus of spatially inflected passages from works by all 47 authors.

Finally, there was a moonshot idea about all the literature that lies far beyond London.  Most of the world’s literature is uncharted terrain.  We decided to build Authorial London as the first instance of an Authorial {X} platform, allowing people interested in other cities or regions to use a generalized copy of the software to illuminate the literary geography that mattered to them.

Taken together, we hope these extensions will make the new site valuable for research and teaching, while appealing to general audiences as well.

Nuts and bolts

Authorial London was built over an 18-month period, necessarily interleaved with other projects within the two sponsoring organizations, CIDR and VPTL.

A geo-tagged corpus
The corpus contents were selected by Kenneth, and built with an engineering assist. It began with his manual selection, transcription, and geo-tagging of several hundred passages referring to place within a few dozen books in copyright. The result was a spreadsheet with the tagged passage text and a list of 200 or so place references (placerefs). Not all placerefs are toponyms, strictly speaking, but each was matched with one of a shorter list of distinct geocoded toponyms (places). For example, the placerefs “Father Thames,” “Thames waters,” and “the silver Flood” all refer to a single place record and geometry for “the Thames.” The place reference and place lists were then supplemented from generic lists of London neighborhoods and districts.
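
The many-to-one relationship between placerefs and places can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Place` class, the `resolve` helper, the identifier value, and the coordinates are all hypothetical placeholders, not the project's actual schema or geometry.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Place:
    """A distinct geocoded toponym (hypothetical structure)."""
    place_id: int
    name: str
    lon: float  # placeholder coordinate, not the project's geometry
    lat: float  # placeholder coordinate, not the project's geometry


# One place record shared by several place references.
PLACES = {1: Place(1, "the Thames", -0.10, 51.51)}

# Placerefs are stored lowercased and point at a place id.
PLACEREFS = {
    "father thames": 1,
    "thames waters": 1,
    "the silver flood": 1,
}


def resolve(placeref: str) -> Optional[Place]:
    """Return the place a reference points to, or None if it is unknown."""
    place_id = PLACEREFS.get(placeref.strip().lower())
    return PLACES.get(place_id) if place_id is not None else None
```

Keeping the references in their own lookup means a new way of naming a place can be added without touching the place record or its geometry.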

Next we downloaded and indexed about 690 texts from Project Gutenberg (gutenberg.org), limiting ourselves for the time being to the original 47 authors. Karl wrote a Python script to identify, tag, and extract passages containing one or more of our (now ~400) place references, then built a simple web page that allowed Kenneth to efficiently review around 10,000 results for inclusion in our corpus. In the course of that work, many new place references and places were discovered. We repeated these machine-extraction and manual-culling steps twice, ultimately adding around 1300 passages from out-of-copyright works to the original set. The resulting dataset at project launch included 2802 text references to 882 mapped places in 1591 passages from 193 works by 47 authors, written between the 14th and 20th centuries.
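
The extraction step might look something like the following minimal sketch. It is not the script used in the project: the function names, the choice of blank-line-delimited paragraphs as the passage unit, and the `[[...]]` tagging convention are all assumptions made for illustration.

```python
import re


def build_pattern(placerefs):
    """Compile one case-insensitive pattern matching any place reference."""
    # Longest first, so a longer reference wins over a shorter overlapping one.
    ordered = sorted(placerefs, key=len, reverse=True)
    alternation = "|".join(re.escape(ref) for ref in ordered)
    return re.compile(r"\b(?:" + alternation + r")\b", re.IGNORECASE)


def extract_passages(text, placerefs):
    """Return paragraphs that mention at least one place reference.

    Each result holds the normalized passage, a copy with matches wrapped
    in [[...]] (a hypothetical tagging convention), and the matched strings.
    """
    if not placerefs:
        return []
    pattern = build_pattern(placerefs)
    results = []
    for paragraph in re.split(r"\n\s*\n", text):
        passage = " ".join(paragraph.split())  # collapse line wraps
        hits = [m.group(0) for m in pattern.finditer(passage)]
        if hits:
            tagged = pattern.sub(lambda m: "[[" + m.group(0) + "]]", passage)
            results.append(
                {"passage": passage, "tagged": tagged, "placerefs": hits}
            )
    return results
```

A candidate list produced this way still needs human review, since a string match cannot tell whether a passage is genuinely about the place it names.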

A suitable software environment
Our requirements for a) ongoing additions of curated data, and b) a generic software core that could be shared for creation of any Authorial {X}, necessitated a far more structured, complex, and sustainable software development stack than any CIDR project to date has used. Authorial London has a Backbone/Marionette JavaScript front-end built against the back-end stack favored by SUL’s Digital Library Systems and Services division (DLSS): Ruby-on-Rails and a PostgreSQL database. A full three months was taken up with learning Backbone and Rails, a grueling but ultimately rewarding experience, and an essential step for ensuring the application will be long-lived.

Future work

Authorial London is a dynamic site; authors, works, passages, and places will be added over time by its Editor, Kenneth. He has expressed particular interest in leveling its current gender imbalance. Two faculty members have expressed interest in incorporating the site into instruction, which offers another route for faculty-curated content to make its way into the corpus.

As mentioned above, Authorial London is the first instance of a generic software platform, Authorial {X}, which will be published to GitHub in the summer of 2016. The platform will not be "plug-and-play"; its implementation will require web development skills and access to a Ruby-on-Rails web server.

Analytics
The geo-tagged corpus of Authorial London is amenable to various kinds of linguistic and spatio-temporal analyses, but so far the only analysis performed has been the computation of "distinctive terms" associated with London neighborhoods. We plan to investigate further possibilities along these lines, including spatio-temporally indexed topic modeling and concept analysis using sophisticated term collocation measures.
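
For readers unfamiliar with the idea, one common way to compute "distinctive terms" is to compare how often a word appears in passages for one neighborhood against how often it appears everywhere else. The sketch below uses a smoothed log-ratio of relative frequencies; this measure, the tokenizer, and the function names are assumptions chosen for illustration, and the measure actually used for the site may well differ.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def distinctive_terms(passages_by_area, area, top_n=5):
    """Rank terms by how much more frequent they are in `area` than elsewhere.

    Score = log of the add-one-smoothed relative frequency in the target
    area minus the same quantity for all other areas combined.
    """
    target, rest = Counter(), Counter()
    for name, passages in passages_by_area.items():
        bucket = target if name == area else rest
        for passage in passages:
            bucket.update(tokenize(passage))
    vocab_size = len(set(target) | set(rest))
    target_total = sum(target.values()) + vocab_size
    rest_total = sum(rest.values()) + vocab_size
    scores = {
        term: math.log((target[term] + 1) / target_total)
        - math.log((rest[term] + 1) / rest_total)
        for term in target
    }
    # Highest score first; ties broken alphabetically for stable output.
    return sorted(scores, key=lambda t: (-scores[t], t))[:top_n]
```

With a toy input such as two invented passages per neighborhood, words shared by every area (like "the") score near zero, while words confined to one area rise to the top.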