LOWBot Goes a(n)-Antedating

In an earlier post, OED Antedating OED, I documented how OED3’s rate of antedating had improved dramatically since the revision kicked off in 2000: from around 35-40% of word entries antedated in the first five or six years of updates to above 60% since 2012. I noted there that one reason for the improvement must be the coming online of big historical text repositories such as EEBO and ECCO.

Recently OED launched one of their storied ‘Appeals’ to the public, which highlighted this fact:

… for the entries we worked on in the early years of the project, there’s a good chance of being able to improve upon the dates of our earliest quotations by searching in a number of now readily accessible databases that simply weren’t available then.

And this is where you come in. As editors are concentrating on updating the unrevised text of the OED, it is unlikely that they will be able to go back systematically over the revised ranges for some time. Carrying on the long tradition of crowdsourcing employed by the OED, we’d like to invite you to try your hand at antedating any sense that has been revised or added in the range M-R…

[As a small aside, who was the first to refer to OED as a crowdsourced or crowdsourcing project? I did in 2013 (both terms, and noting that they were not yet in OED–they are now), but there must have been others before me. I’ll get the bots on it!]

The Appeal motivated me to work on an experiment I’ve been wanting to run for some time, so a couple of days ago I set longtime team member LOWbot to look for potential antedatings in EEBO and ECCO. It has already started tweeting antedatings, and will continue at a rate of one per hour (well, one minute earlier each hour, in the spirit of things) for the next seven days. You can check up on its progress in the feed embedded below, or follow @lifeofwordsbot, #oedantedatings, or #hourly_antedating on Twitter [update – the hashtags don’t seem to be picking up LOWbot’s tweets or my retweets – follow or search the bot itself if interested].
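[For the curious, here is a rough sketch of that schedule in Python. It is not LOWbot's actual code: it assumes that "one minute earlier each hour" means a tweet every 59 minutes, and the function name and start time are made up purely for illustration.]

```python
from datetime import datetime, timedelta


def tweet_times(start, days=7, interval_minutes=59):
    """Yield tweet times beginning at `start`, one every `interval_minutes`,
    stopping once `days` days have elapsed.

    The 59-minute interval is an assumption: one reading of
    "one per hour, one minute earlier each hour".
    """
    end = start + timedelta(days=days)
    step = timedelta(minutes=interval_minutes)
    t = start
    while t < end:
        yield t
        t += step


# Hypothetical start time, chosen only for illustration.
start = datetime(2000, 1, 1, 12, 0)
times = list(tweet_times(start))

print(len(times))                   # number of tweets over the seven days
print(times[1].strftime("%H:%M"))   # second tweet: 59 minutes after the first
print(times[2].strftime("%H:%M"))   # third tweet: another 59 minutes on
```

[On that reading, each tweet lands a minute earlier on the clock face than the one before, and seven days comes to about 170 tweets.]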

[Note that not all of these will be true antedatings–LOWbot is a good thing but it is not designed or destined to be a lexicographer. Even so, I do think a healthy number of these hits will probably turn out to be good. Let’s see.]

 

3 Comments

  • kts wrote:

    Some early uses of “crowdsourcing” or “crowdsourced” to describe the OED:

    Heidi Harley on Language Log, February 02, 2007:

    The original editor of the OED, James A.H. Murray, invented crowdsourcing long before the advent of Wikipedia, the personal computer, or the internal combustion engine.

    Vocabulary.com interview with Jesse Sheidlower, July 30, 2008:

    VT: Nowadays that would be called “crowdsourcing.”
    JS: Yes, and this process still goes on with the North American Reading Program and the OED’s other reading programmes.

    The first use by an official representative of the OED describing itself may have been the appeals announcement in October 2012, cited in the paper that you link. And you just barely missed the entry of “crowdsourcing” when you submitted that paper — it was added in June 2013.

  • kts wrote:

    Looking over LOWBot’s tweets: You replied to its tweet finding machaira in Elyot, Bibliotheca Eliotae (1542) with

    Elyot recorded many Latin terms — is this a definition or a translation ? OED often blurs that line.

    But as far as I can tell, OED is very careful *not* to blur that line. Bibliotheca Eliotae is a Latin-English dictionary, and the OED cites it many times but never counts the Latin headwords as evidence of use in English, even if they were borrowed into English later. It uses only the English definitions as quotation support for English words, e.g. this quotation for dog:

    1542 T. Elyot Bibliotheca Canina facundia, dogge eloquence.

    It cites the Latin headwords only in etymologies (alcyonium, angiology, vaccinium, zeta n.2) or in square brackets before the English quotations (acipenser, dogmatist, reliquiae), as evidence that the Latin word was known to English speakers. The only exception is alligator, n.1 (one who binds or ties), which is labeled as a ghost word (“only attested in dictionaries or glossaries”) since they have no citations except dictionaries, and that quote really should be in square brackets as well.

    So, sorry LOWBot, machaira doesn’t count. Neither do manubrium, mataeotechnia, materfamilias, and meconium. You’re only a bot, we don’t expect you to know the difference between Latin and English.

  • LOWbot might have cause to feel let down, here. The tweets themselves were lightly curated, in that I did a very quick pass to cull obvious false positives based on a couple of indicators. Probably I should have nixed Elyot for the reasons you give.
