New Map of Indigenous American Words in English

For some time I’ve been meaning to update my map of pathways into English of Indigenous American words, which was based on the Second (1989) Edition of the Oxford English Dictionary. With a couple of hours to spare while watching the kids this week, I managed to get around to it, using data from the current OED Online (January 2020 update):

[a larger PNG is linked from the image, and a PDF can be downloaded here]

This map has almost double the number of words of the previous one (homonyms are ignored here). OED3’s language hierarchy also allows me to group them according to language family.

As with the previous map, many of these words are ethnonyms and/or names of languages (Wallawalla, Potawatomi, Garifuna).

Plenty are familiar common words such as caribou, chocolate, igloo, inukshuk, kayak, mesquite, moose, potlatch, skunk, tomato.

An interesting category is calques: words formed within English on the model of an indigenous expression. These include white mouse (sense 3, the name of a type of lemming, from the Chipewyan dlunegai, < dlune mouse + gai white), firewater (after Ojibwa ishkodewaaboo, < ishkode fire + waaboo liquid), and Rocky Mountains (Cree asinîwaciya, < asiniy stone, rock + waciy mountain).

Yours to explore, if you like that sort of thing.


  • John Cowan wrote:

    Bloomfield pointed out that the Algonquian word for ‘liquor’ could be reconstructed to Proto-Algonquian, although I have not been able to find out what reconstruction he proposed. I am also not convinced that the metaphor first appeared in so westerly a language as Ojibwa; I bet it started pretty close to the East Coast in Abenaki or Massachuset or whatever, and then spread west by etymological nativization.

  • Yes, that’s interesting. The OED entry for /firewater/ was revised in 2015. It cites a form of the Ojibwa word from 1703 (I guess in Lahontan’s /Nouveaux Voyages/ though it doesn’t say and I can’t locate it). Perhaps it was simply the first to be recorded.

  • John Cowan wrote:

    For the benefit of anyone else reading this, I should add that of course Proto-Algonquian ceased to be spoken long before alcohol distilling was introduced, so the reconstruction is a false one made to prove a point about what is and what is not evidence of relatedness. Similarly, by choosing suitable words you can reconstruct English to Proto-Romance, but English morphosyntax rules out that idea and tells us that the Italic words in English are the result of four layers of borrowing (Latin, Normand, Central French, Latin again).

    I think you are right about the Ojibwa form being the first to be recorded (as far as we know) and of course that’s what the OED has to cite.

  • Still no pogonip.

  • kts wrote:

    Methodology: OED3 entries with Indigenous American language names occurring in the “etymon language” or “source language” fields were isolated

    How did you get the source language field? Is that visible to any subscriber? If so, I’m not smart enough to find it; I’ve been frustrated by the Language of Origin field in the Advanced Search page, since (as far as I can figure out) it only searches the *immediate* source language. That is, if you select Algonquian in that field, the results won’t include caribou or pecan, which came in via French.

    (Somehow, your map does include caribou, but not pecan — though you do have pekan, a marten, < French < Micmac.)

    The only way I can think of to get such a list would be to search for all indigenous language names in the Etymology/Language field. But then where do you get a list of all indigenous American language names (in the exact spellings they use)?

    But I don’t think you did that, since a search on a specific language, e.g. Chipewyan, turns up a few that you missed: Beaver (ethnonym), translating Chipewyan Tsa-ttiné; Dene (as opposed to Na-Dene) from “an Athabaskan language (compare e.g. Chipewyan dëne …)”; little chief hare (= American pika) “[after Chipewyan bek’ódheri gah yaze]”.

    (“White mouse” in the Chipewyan-calqued sense (= lemming) is misdated on the map: it dates from the 19th century, not the 16th (which would’ve been well before any European-Chipewyan contact). The 1592 date is for “white mouse” in the English compositional sense.)

    Also, by using the OED, you’re going to miss place names where they didn’t bother to give an etymology, e.g. Michigan, Mississippi via French from Algonquian languages. Considering how many of the words here are proper names of languages and peoples, I see no reason to exclude such familiar place names.

  • The language is sourced from the etymonLanguage and sourceLanguage attributes in the [Entry] field. That’s always given in the language family tree, e.g. for caribou, sourceLanguage="Native American languages/North American languages/Algic/Algonquian|European languages/Italic/Romance/Italo-Western/Gallo-Romance/French".

    This misses that sense of BEAVER because it’s (unusually!) a blended etymology folded within the European-derived headword. I’m guessing they stuck it in there in the draft additions for some reason — perhaps it will get its own headword when revised.

    As for PECAN it looks like the entry itself is defective (from a certain point of view), as only the French is given in the field. The “< Illinois pakani” etym is separated out in a [lang] field within [etym], but isn’t listed as a source or etymon language. (You can’t use [lang] since any language mentioned in the etym will attract this tag.)

    This all goes to your final point about what OED misses…
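
The attribute format described above (alternative pathways separated by "|", hierarchy levels separated by "/") can be filtered roughly as follows. This is only a minimal sketch under stated assumptions: the Entry element and the sourceLanguage and etymonLanguage attribute names come from the reply above, but the surrounding dump layout, the headword attribute, and the helper function names are hypothetical and are not taken from the real OED3 XML.

```python
import xml.etree.ElementTree as ET

# Hypothetical miniature dump. Only the Entry element and the two language
# attributes are mentioned in the thread; "dump" and "headword" are assumptions.
SAMPLE = """<dump>
  <Entry headword="caribou" sourceLanguage="Native American languages/North American languages/Algic/Algonquian|European languages/Italic/Romance/Italo-Western/Gallo-Romance/French"/>
  <Entry headword="pecan" sourceLanguage="European languages/Italic/Romance/Italo-Western/Gallo-Romance/French"/>
</dump>"""

ROOT_FAMILY = "Native American languages"


def language_paths(value):
    """Split a pipe-separated attribute value into lists of hierarchy levels."""
    return [path.split("/") for path in value.split("|") if path]


def indigenous_paths(entry):
    """Return the hierarchy paths of one entry that start at ROOT_FAMILY."""
    paths = []
    for attr in ("sourceLanguage", "etymonLanguage"):
        paths.extend(language_paths(entry.get(attr, "")))
    return [p for p in paths if p[0] == ROOT_FAMILY]


def indigenous_entries(xml_text):
    """Map each matching headword to the most specific label of each path."""
    result = {}
    for entry in ET.fromstring(xml_text).iter("Entry"):
        paths = indigenous_paths(entry)
        if paths:
            result[entry.get("headword")] = [p[-1] for p in paths]
    return result


print(indigenous_entries(SAMPLE))
```

In this toy sample caribou is picked up through its Algonquian path, while an entry shaped like the pecan case, with only French in the attribute, is skipped. That is the kind of gap discussed above.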

  • Writing in haste this morning I skipped over the first part of your q: no, I don’t think the end user can isolate these fields exactly, though I haven’t tried hard to do so. I’m currently using an XML dump of the Dec 2019 OED3.

  • kts wrote:

    Thanks very much for the explanation. I’ll try to find my way around the prototype API. I don’t know if it allows searches by the source_language field, but at least it shows what they have in that field.

    You may want to cross-check your previous map against this one: there are quite a few missing from the new one for the same reason as pecan, i.e., faulty entries in the source_language field. For example: woodchuck, alpaca, canoe, rokeag, piache, cassava, hurricane, iguana, tobacco, vicuña, viscacha, Navajo, buccan, hominy, tanager, … That seems like a lot, and that’s just out of the first 130 or so that I checked. Do they need better QC? Does anyone else actually *use* the source_language field?

    Anyway, it’s a great visual design, very clear.

  • Thanks – the impetus was actually to get some practice with “dataviz” so I’m glad it’s easy to read.

    PS I tidied up my previous, where I had put the field names in angle brackets, which of course got eaten up in the html render.

  • PPS It’s worth mentioning that these XML tag attributes were added algorithmically from the existing text, which had [lang] tags going back to OED2. Overall that process worked better with etymology languages than it did with, e.g., regional category tags. I had a couple paras on that in “Alien vs. Editor”:
