Hathi’s Automatic Genre Classifier

The HathiTrust Digital Library is a massive collection of digital books: As of 2017, it contains 5 billion pages from 15 million volumes (7 million titles). About 40% of these are public-domain works, meaning anyone can search and read them. Some of these have been marked for their textual genre. Here I do a little […]

OED Gender Genre

In “Sex in the OED” I ran through some figures on female vs male representation in OED quotation evidence, comparing the original OED1 with the later Supplements that resulted in OED2. Here I look a little closer at what kinds of works by women the two editions tended to cite. Below are two charts breaking […]

Burchfield’s Reach-Backs

The vast majority of the quotation evidence in Robert Burchfield’s OED Supplements comes from after the first (1928) edition was completed. The median date for these is 1944, whereas for the first edition it’s 1742. However, in some circumstances the Supplements did reach back into periods already covered by OED1 — if it could antedate […]

Sex in the OED

Two subprojects concerning OED quotation metadata are now close enough to completion to present some preliminary results. They concern the sex of the authors quoted in the OED, in both the first edition (1928) and the later Supplements (1933, 1972-86). The most focused work on this question so far has been Baigent, Brewer, and Larminie, […]

Guest Post: Moving from 2.0 to 3.0

Danielle Griffin recently completed her co-op term as a full-time research assistant at The Life of Words. Here she offers some thoughts about her work on identifying the textual genre of quotations in the Oxford English Dictionary: When I started my job as an RA, Dr. Williams had me tagging quotations five days a week […]

Vector Space and Poetic Logic

I’ve been spending the weekend experimenting with vector space modelling and poetic language. Vector space word embedding models use learning algorithms on very large corpora in order to map a unique location in n-dimensional space to each token (=word) in the corpus. “N-dimensional space” is just a mathy-sounding way of saying that multiple (or n) features […]

Bowie on OED

It’s a day for sharing David Bowie quotations on the social medias. One in particular just crossed my path: I presume the person who wrote out, photographed, and posted this little tidbit (making sure to draw attention to their book store’s own social media outposts) found it among the collected Bowie quotations on some “famous […]

The Colour of Greyhounds

Do you know the old joke, “What colour was Napoleon’s white horse?” Well, I have another one for you: “What colour was Napoleon’s greyhound?” Not sure? You may consult the Official Greyhound Colour Chart: Here’s how OED sums up the situation: Apparently < a first element cognate with Old Icelandic grey bitch (further etymology uncertain: […]

Two Notes on T. S. Eliot and the OED

I have two upcoming notes in the journal Notes & Queries concerning T. S. Eliot and the Oxford English Dictionary. Though they won’t be published until later next year, the self-archiving policy at Oxford Journals allows me to make an unrevised pre-print version available here. The two articles are: “The ‘Oxford Dictionary’ in T. S. […]

Interview with Paul Muldoon

Here are some excerpts from an interview I did with Paul Muldoon a couple of years ago, which focused on dictionaries and etymology. A full .pdf version of the interview can be downloaded here: [Interview with Paul Muldoon]. PM: I’ve never really been into the OED Online. Maybe I should. I think I might even […]