Most people know that the OED is in the midst of a wholesale revision of legacy OED material dating back in some cases to the 1890s, in addition to the regular updates and additions we hear about in quarterly bulletins. The revised material began to be published almost 20 years ago, and the work may go on for a similar span before the project finishes revising and shifts mainly to adding and updating entries.
I’m using data that is almost a year old by now (the December 2018 update), but since I’ve been working on cross-edition comparisons, it occurred to me to have a look at what exactly we are looking at when we consult OED.com, to gauge where the OED really is in the course of its revision.
We tend to think about revision in terms of numbers of entries revised, which makes a certain amount of sense. In that reckoning, the revision project is just about halfway through: 50% of entries are either new in OED3 (7%) or have been revised by OED3 staff (43%).
But there is another way to look at the revision which might give a better account of the real amount of work that has gone into it, and how much remains to be done. What I’ve done below is map out the actual amount of sequential text in the OED, in a grid of 50 x 3,800 squares, where each square represents 2,000 characters. The squares are then coloured according to the original edition of the entry in which they appear, and their revision status, as follows:
So, to give an example, bright yellow dots represent original OED1 text, whereas streaky yellow/blue dots represent OED3 revised text in an entry that first appeared in OED1 (note: here I don’t separate out intermediate revisions, e.g. OED1 entries revised in the Second Supplement; these show up simply as OED1-originating entries). By this reckoning, the current OED3 is just over two-thirds (67%) revised material.
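For anyone wanting to try something similar, here is a minimal sketch in Python of how a grid like the one described above, and the two ways of counting, could be computed. It is not the code used to produce the images in this post. The status labels, the toy entry lengths, and the rule that a square takes the status of whichever entry covers its first character are all assumptions made for illustration.

```python
SQUARE_CHARS = 2_000  # characters per square, as stated in the post
GRID_WIDTH = 50       # squares per row, as stated in the post

# Hypothetical status labels; the real data may distinguish more categories.
REVISED = {"oed3_revised", "oed3_new"}


def label_squares(entries):
    """Lay entries end to end and label each 2,000-character square.

    `entries` is a list of (status, length_in_characters) tuples in
    dictionary order. Assumption: a square takes the status of the
    entry that covers the square's first character.
    """
    labels = []
    offset = 0             # character position where the current entry starts
    next_square_start = 0  # character position where the next square starts
    for status, length in entries:
        end = offset + length
        while next_square_start < end:
            labels.append(status)
            next_square_start += SQUARE_CHARS
        offset = end
    return labels


def share_by_entries(entries, revised):
    """Fraction of entries whose status counts as revised."""
    return sum(1 for status, _ in entries if status in revised) / len(entries)


def share_by_text(entries, revised):
    """Fraction of characters that sit in entries counted as revised."""
    total = sum(length for _, length in entries)
    return sum(length for status, length in entries if status in revised) / total


# Toy data (invented): two short unrevised entries, one long revised
# entry, and one short new entry.
toy = [("oed1", 1000), ("oed1", 1000), ("oed3_revised", 5000), ("oed3_new", 1000)]

labels = label_squares(toy)
# Break the flat list of squares into rows of GRID_WIDTH for plotting.
rows = [labels[i:i + GRID_WIDTH] for i in range(0, len(labels), GRID_WIDTH)]

print(share_by_entries(toy, REVISED))  # 0.5
print(share_by_text(toy, REVISED))     # 0.75
```

In the toy data, half of the entries are new or revised, but because the revised entry is long, three-quarters of the text is. The same effect, on a much larger scale, would account for the gap between the 50% and 67% figures. For scale, a completely filled grid of 50 x 3,800 squares is 190,000 squares, or about 380 million characters.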
Here’s what you might call a bird’s eye view of the entire dictionary, in five columns, broken up alphabetically:
The fully revised M-R range stands out clearly in this graph (for a number of years the revision proceeded alphabetically, starting at M). It appears yellow/brown, with flecks of dark blue for new OED3 entries and a few entries from the supplements and additional series. You can even make out dispersed dark blue dots for new OED3 entries across the rest of the alphabetical range.
Not much more can be gleaned from this, so I give below the highest resolution version I can manage in this format (click here to download a single jpg):