In my previous post, I used OED3’s etymologies to chart the languages that gave English its words, noting that most English words come from other English words. I then dug deep into all the non-English sources of English. Today I’ll take a closer look at the etymological sources of English words developed within English.
Lexical extension is one good sign that a word has been thoroughly integrated into the language, or, as I like to say, “Englished.” Take my first sentence, above: post, n. and chart, v. are English words each formed on an identical English word, which at some point was borrowed from French or Latin. In the case of post there is probably a long line of development through English between the French or Latin original and my use. These words have been thoroughly Englished, unlike, say, the word Zeitgeist, which retains something of its exotic quality. It’s hard to imagine it extended very much in meaning, let alone to hear it naturally verbed or adjectivized.
It’s Englishing that I’m looking at today, and the historical sources of Englished words. In the charts below, I’ve counted only words with at least one count of “English” in the OED3 etymology. I then collected all the other antecedent language sources for that word and for all of the word’s listed etymons. The first graph is a simple count of all of these sources. I group “English” with “Germanic” to represent the basic core of the language, to which other languages are contributing.
It’s important to bear in mind that the chart is counting sources, not words, since a word may have (indeed, must have) many sources. Totals are for the trailing 100 years, just to smooth out the curve. The double-hump shape is the familiar shape of English word production as recorded in OED (discussed briefly last time).
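For readers who want to see the mechanics, here is a minimal Python sketch of the counting step. It is only an illustration: the records are invented toy data rather than OED3 entries, the name `trailing_source_counts` is hypothetical, and the exact window boundaries are an assumption. For simplicity, each toy record merges the word's own sources with those of its etymons into one list.

```python
from collections import Counter

# Invented toy records, NOT real OED3 data:
# (year first recorded, all source languages collected for the word and its etymons)
words = [
    (1150, ["English", "Germanic"]),
    (1180, ["English", "French", "Latin"]),
    (1190, ["French", "Latin"]),   # no "English" source, so it is skipped
    (1240, ["English", "Latin"]),  # outside the window ending in 1200
]

def trailing_source_counts(records, year, window=100):
    """Count sources (not words) for words first recorded in (year - window, year]."""
    counts = Counter()
    for first_year, sources in records:
        if "English" not in sources:
            continue  # keep only words with at least one "English" source
        if year - window < first_year <= year:
            counts.update(sources)  # every listed source adds one to its language
    return counts

counts_1200 = trailing_source_counts(words, 1200)
```

Two words fall inside the window here, but they contribute five sources between them, which is the sources-versus-words distinction the chart depends on.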
I want to normalize for this, so I’m going to divide each total by the sum for that year, giving me a % contribution chart, here:
Figure 2 shows a gradual and steady decline of English/Germanic among the various sources for English words over time. To paraphrase the chart, taking the high point, if you count up all the languages that are sources of all the words first recorded in the 100 years leading up to the year 1200, 90% would be English or a direct Germanic ancestor of English. By 2000, only 40% would be: this is the cumulative effect of steadily incorporating foreign source languages over time.
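The normalization behind a chart like Figure 2 can be sketched in a few lines of Python. The counts below are made-up numbers for a single year, and both the `GROUPS` mapping and the function name `percent_shares` are assumptions for illustration, following the choice above to merge English with Germanic.

```python
# Made-up trailing-window source counts for a single year (not OED3 figures).
counts = {"English": 5, "Germanic": 2, "French": 3}

# Assumed grouping: English and Germanic are merged into one core category.
GROUPS = {"English": "English/Germanic", "Germanic": "English/Germanic"}

def percent_shares(counts):
    """Divide each grouped total by the sum for the year, as a percentage."""
    grouped = {}
    for lang, n in counts.items():
        key = GROUPS.get(lang, lang)  # languages outside GROUPS keep their name
        grouped[key] = grouped.get(key, 0) + n
    total = sum(grouped.values())
    if total == 0:
        return {}  # nothing recorded for this year, so no shares to report
    return {key: 100.0 * n / total for key, n in grouped.items()}

shares = percent_shares(counts)
```

Because every year is divided by its own total, the shares always add up to 100, which removes the double-hump shape of raw word production from the picture.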
Let’s focus in on the non-English/Germanic sources, to see what English is Englishing, and when:
Figure 3. Percentages before around 1400 can be a bit patchy, due to the small totals involved. The early spike in “Other European” languages is not, as far as I can tell, due to any one main factor: there are mainly Celtic and Romance sources, a little Scandinavian, but the overall figures are small and are probably due to only a handful of source texts. One detail to keep an eye on is the rise in Greek and German after 1750 (more on this below).
Naturally Latin and French would dominate Figure 3, and would be correlated, since essentially all French words are originally Latin at some point (but not français, which is Germanic!). That doesn’t mean OED will always give the Latin etymon, but in most cases it does. We can reduce this noise by mapping each word onto only one language, in one of two ways: in Figures 4 and 5, I show the earliest and the latest source language, respectively:
The earliest listed source language gives a sense of how languages have perpetuated themselves down through the line of English development. It makes sense that in the century leading up to 1700, up to a quarter of new English words based on older English could be traced back to Latin – there is the combined effect of the Norman French infusion having seeded Latin throughout English, with neoclassical attitudes which would tend to encourage the germination of these seeds. It is also, as Fig. 1 shows, a very productive time in terms of vocabulary formation and extension (subject to the usual caveats about OED evidence).
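The two single-language mappings used for Figures 4 and 5 can be sketched as follows. This assumes each word's non-English sources are available as a chain ordered from oldest ancestor to most recent donor; that ordering, the placeholder words, and the helper name `single_language` are all hypothetical, not a description of how OED3 stores its etymologies.

```python
from collections import Counter

# Invented chains for placeholder words, oldest source first, latest donor last.
chains = {
    "word_a": ["Latin", "French"],
    "word_b": ["Latin", "French"],
    "word_c": ["Greek", "German"],
}

def single_language(chain, which):
    """Map a word to one language: the 'earliest' or the 'latest' in its chain."""
    if not chain:
        return None  # no non-English source listed
    return chain[0] if which == "earliest" else chain[-1]

# One count per word, instead of one count per source.
earliest = Counter(single_language(c, "earliest") for c in chains.values())
latest = Counter(single_language(c, "latest") for c in chains.values())
```

In this toy data the earliest view credits Latin and Greek, while the latest view credits French and German, so the same three words tell two different stories depending on which end of the chain is counted.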
We could turn things around, however, and look at the latest source language, the donor or contact language:
Figure 5. This chart is, I think, the most telling of all. We see clearly:
- The effect of contact with Norman invaders, beginning about 1300 and peaking about 1550. Many direct borrowings that begin about the same time, but peak around 1400 (see previous post), will have been thoroughly Englished in this period.
- The rise in words Englished from Latin loanwords, starting in the 1600s and remaining high till today.
- The rise in words Englished from Greek etymons, starting about 1800. This coincides with a rise in words Englished from German. They are, in fact, closely related phenomena: most of these words are scientific terms formed with particles such as meta- and micro-, -ite and -ogen, which OED3 recognizes as fully Englished by the 1800s. Much of this scientific terminology was first coined in German and borrowed into English from there. The rest was formed in English based on Greek particles. Sometimes both things happened, as in my example of last time: mesotrophic is listed by OED3 as being formed, on a German model, with the Englished Greek particles meso- and -trophic.
And here are the same values stacked up, to give a sense of how it all adds up: