Q: What’s the difference between having a SQUIRREL FACE and having a FACE SQUIRREL?



Generally speaking, if you want a word for a MORP that has FUZ, you call it a FUZ-MORP, right? And if there’s a FUZ that gets rid of your MORPS, you call it a MORP-FUZ. That’s just how our language works when it wants to compress things that affect other things into one noun. Which term goes first and which goes second is governed by what’s called headedness, and it’s neat once you notice it (which you don’t, because language).

A recent xkcd comic proposes that we ditch lexical precision in favour of compound headedness:

The <x> that is held by <y> is also a <y><x>, so if you go to a food truck, the stuff you buy is truck food. A phone that's in your car is a carphone, and a car equipped with a phone is a phonecar. When you play a mobile racing game, you're in your phonecar using your carphone to drive a different phonecar. I'm still not sure about bananaphones.

[I enter the quibble that “lifeboat” is not in fact a boat that holds boats; rather it is a boat held by boats — this shipship is a boatboat in the sense the chart implies — but that is by the by.]

Now, by nature or by accident there must be plenty of words like “boathouse”, which have a “houseboat” counterpart firmly established in the lexicon. (Whether by nature or accident might be something for a linguist to probe.)

Let’s propose “boathouse words”, or “houseboat words”, or “chi compounds”, or maybe “bidirectional compound terms”, as a label for this class of compound word pairs, where the component items form compounds in either order — e.g. “houseboat” and “boathouse”, or, say:

  • “blow-back” (what blows back) and “backblow” (a blow on the back);
  • “bath oil” (oil for your bath) and “oil bath” (a bath of oil);
  • “deer-mouse” (a mouse) and “mouse deer” (a deer);
  • “aloe tree” (the tree) and “tree aloe”;
  • “revenue tax” (a tax on your revenue) and “tax revenue” (the proceeds from those taxes and more).

When the xkcd was picked up by Language Log, Mark Liberman surmised that there might be dozens or hundreds of such compounds, and challenged readers to think of as many as they could in a minute. Some commenters managed well; others thought it a tough assignment. Others still just didn’t get it, or maybe were trolling.

I didn’t have a minute to probe my intuitions, but I did have about a half-hour to mine the OED3 dataset. This fingered 2,568 possible boathouse words, according to the formal criteria: NOUN1+NOUN2, where NOUN1 and NOUN2 are both nouns in OED3, may or may not be separated by SPACE or HYPHEN, and also turn up in the reverse order as NOUN2+NOUN1.
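For the curious, here is a rough sketch of how a search along those lines could work. It is not the actual OED3 query: the function names `normalise` and `find_reversible_pairs` and the toy word list are all made up for illustration. It assumes the lexicon is just a flat list of noun headwords, strips spaces and hyphens, and then tries every way of cutting a headword into two nouns.

```python
import re


def normalise(entry: str) -> str:
    """Lowercase and drop spaces/hyphens so 'boat-house' matches 'boathouse'."""
    return re.sub(r"[\s\-]+", "", entry.lower())


def find_reversible_pairs(nouns):
    """Return sorted (word, reversed_word) pairs found in a list of nouns.

    A headword qualifies when it can be cut into NOUN1 + NOUN2, both of
    which are headwords themselves, and NOUN2 + NOUN1 is a headword too.
    Purely string-based, so it knows nothing about real morphology.
    """
    forms = {normalise(n) for n in nouns}
    pairs = set()
    for word in forms:
        # Try every split point inside the word.
        for i in range(1, len(word)):
            left, right = word[:i], word[i:]
            if left in forms and right in forms and (right + left) in forms:
                # Sort the two members so each pair is stored only once.
                pairs.add(tuple(sorted((word, right + left))))
    return sorted(pairs)


# Toy lexicon (hypothetical stand-in for the OED3 noun list).
toy_nouns = [
    "boat", "house", "boathouse", "houseboat",
    "bath", "oil", "bath oil", "oil bath",
    "bon", "bon-bon",
    "man", "age", "ageman", "manage",
    "deer",
]

for pair in find_reversible_pairs(toy_nouns):
    print(pair)
```

Because the matching is done on strings alone, a sketch like this happily reports anything that merely looks like two nouns glued together, and it pairs a word with itself whenever both halves are the same.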

Here’s a small sample from the results:

The whole dataset is linked here as a CSV.

Some brief notes and highlights:

  • Although for now it’s best to define “boathouse terms” as widely as possible, not all of the collected terms are true boathouses. Some exceptions are simply algorithmic false positives, e.g. “ageman”~“manage”, “guitar”~“targui”.
  • Another whole class of words satisfies the algorithm: the “boatboat” type (to borrow again from xkcd), where both elements are identical. So “agar-agar” shows up, as do “akeake”, “bon-bon” and so on. There are 135 such words.
  • Almost all of the identical (“boatboat” type) boathouses are loanwords, translations, transliterations, onomatopoeias, or reduplications. There are true boatboats among them, such as “daughter daughter”, “spin-spin”, and “log log”, but whether these count as boathouses depends on whether you think there needs to be, in addition to modification, two reciprocal senses (i.e. a boat that holds boats, and a boat that is held by a boat). These don’t have that.
  • Some houseboats are synonyms: “toy-boy”|“boy toy”, “Emperor-King”|“King-Emperor”, “manservant”|“servant-man”.
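
Pulling the identical-element “boatboat” type apart from the rest is easy enough to sketch, assuming each hit is kept as a pair of its two component nouns. The name `split_by_type` and the sample hits below are hypothetical; telling the false positives and the synonyms apart would still need a human eye.

```python
def split_by_type(element_pairs):
    """Partition (noun1, noun2) hits into identical-element and mixed pairs."""
    identical = [p for p in element_pairs if p[0] == p[1]]
    mixed = [p for p in element_pairs if p[0] != p[1]]
    return identical, mixed


# Hypothetical sample of hits, stored as (NOUN1, NOUN2).
sample_hits = [
    ("boat", "house"),
    ("bon", "bon"),
    ("agar", "agar"),
    ("bath", "oil"),
]

identical, mixed = split_by_type(sample_hits)
print(len(identical), "identical;", len(mixed), "mixed")
```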

