Monthly Archives: February 2017

OED Subject Matter

In my last post I described using HathiTrust’s Solr Proxy API to fetch Hathi genre metadata for OED quotations. But genre is not the only metadata that Hathi sends back down the intertubes when I ask it a question. For most works, I also get a Library of Congress Classification code for the volume. This […]

Hathi Genre Again – Zero Recall

In “Hathi’s Automatic Genre Classifier” [17.01.06] I compared the consolidated automatic genre metadata for a subset of HathiTrust Digital Library texts (available here) to the genre classifications arrived at for human-inspected works as part of the OED quotation tagging project underway at The Life of Words. My process there was pretty closely supervised, but the […]

Guest Post: Magazines and the Dentist Test

Cosmin Dzsurdzsa is a research assistant working on identifying the textual genre of quotations in the OED. Here he writes the first in a series of posts on borderline and difficult genre determinations. Filtering quotation blocks is essential to optimizing our results, given the quantity of data we deal with here at LOW. For a […]