Author Archives: D-AW

David-Antoine Williams. I’m an assistant professor of English at St Jerome’s University, in the University of Waterloo. See the “About Me” page in the menu above for details.

One last round with metadata from Hathi and Underwood

In “Hathi’s Automatic Genre Classifier” and “Hathi Genre Again – Zero Recall”, I ran a couple of experiments comparing genre categories assigned by human taggers working on the Life of Words OED mark-up project to two sources of genre metadata associated with the HathiTrust Digital Library. The first post looked at data from the automatic […]

Poetry Competition Time

As part of our OMRI funding, LOW runs an annual poetry competition, open to all high school students in Ontario. Last year’s pilot run had a few dozen submissions, from which we picked one winner, two runners-up, and twelve honorable mentions, all collected in our 2016 Anthology. Last year’s theme was “write a poem […]

Shakespeare’s Earliest Citations in the OED

No author’s representation in the OED has received more comment than Shakespeare’s: if you ever come across a mention of OED citation evidence, more than likely it’s being used to substantiate (sometimes challenge or qualify) a claim that Shakespeare invented the most English words, or made up the most new meanings for existing words, or […]

OED Subject Matter

In my last post I described using HathiTrust’s Solr Proxy API to fetch Hathi genre metadata for OED quotations. But genre is not the only metadata that Hathi sends back down the intertubes when I ask it a question. For most works, I also get a Library of Congress Classification code for the volume. This […]

Hathi Genre Again – Zero Recall

In “Hathi’s Automatic Genre Classifier” [17.01.06] I compared the consolidated automatic genre metadata for a subset of HathiTrust Digital Library texts (available here) to the genre classifications arrived at for human-inspected works as part of the OED quotation tagging project under way at The Life of Words. My process there was pretty closely supervised, but the […]

Hathi’s Automatic Genre Classifier

The HathiTrust Digital Library is a massive collection of digital books: As of 2017, it contains 5 billion pages from 15 million volumes (7 million titles). About 40% of these are public-domain works, meaning anyone can search and read them. Some of these have been marked for their textual genre. Here I do a little […]

OED Gender Genre

In “Sex in the OED” I ran through some figures on female vs male representation in OED quotation evidence, comparing the original OED1 with the later Supplements that resulted in OED2. Here I look a little closer at what kinds of works by women the two editions tended to cite. Below are two charts breaking […]

Burchfield’s Reach-Backs

The vast majority of the quotation evidence in Robert Burchfield’s OED Supplements comes from after the first (1928) edition was completed. The median date for these is 1944, whereas for the first edition it’s 1742. However, in some circumstances the Supplements did reach back into periods already covered by OED1 — if it could antedate […]

Sex in the OED

Two subprojects concerning OED quotation metadata are now near enough to completion to present some preliminary results. They concern the sex of the authors quoted in the OED, in both the first edition (1928) and the later Supplements (1933, 1972-86). The most focused work on this question so far has been Baigent, Brewer, and Larminie, […]

Entitled Professor

I happen to have an interest and a certain amount of expertise in words that mean their own opposites. You might say I’m qualified to post here on that topic. You might even say I’m entitled to my opinion on a wider range of things in which I’m not necessarily expert. But if you call […]