Lately I’ve been working with several different gender-inference tools, tweaking them here and there to serve my purposes. Since I’m working with a historical dataset of about eight million records, from 1800 to today, one of the packages I’m using is the gender library for R by Lincoln Mullen, which uses historical US census and Social Security information to predict the gender of a given name. This is important because some names are unisex (or androgynous), and many change their gender tendency over time.
As a diversion, I thought I’d try to find the most-changey names in the datasets used by this particular tool. I’ve graphed a selection of these below. Note that the graphs concatenate the pre-1930 IPUMS data, drawn from Census samples, and the post-1930 SSA data, so treat the borderline (1920-1940) as somewhat fuzzy, and caveat emptor if you want to draw conclusions across it.
Each graph plots a ±10-year average for each year from 1800 to 2000. That is, the values for 1950 represent all people born 1940-1960, and so on. To be selected, a name had to be at least 90% female at one point in time and at least 90% male at another.
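For anyone who wants to try the same selection on their own data, here is a minimal sketch of the windowing and the 90% rule in Python. It is not the code behind the graphs: the function names, the `counts` layout (birth year mapped to a pair of female and male counts), and the choice to pool raw counts across the window rather than average yearly proportions are all assumptions made for illustration.

```python
def windowed_female_share(counts, years, half_window=10):
    """Return {year: female share} pooled over year +/- half_window.

    counts is a hypothetical layout: {birth_year: (female_count, male_count)}.
    Years whose window contains no records are left out of the result.
    """
    shares = {}
    for year in years:
        female = male = 0
        for y in range(year - half_window, year + half_window + 1):
            f, m = counts.get(y, (0, 0))
            female += f
            male += m
        if female + male > 0:
            shares[year] = female / (female + male)
    return shares


def is_gender_shifting(shares, threshold=0.9):
    """True if the name was >= threshold female at one point and >= threshold male at another."""
    values = list(shares.values())
    was_female = any(v >= threshold for v in values)
    was_male = any((1 - v) >= threshold for v in values)
    return was_female and was_male


# Toy, made-up counts for a single name (not real census or SSA figures).
toy_counts = {1900: (0, 100), 1950: (100, 0)}
toy_shares = windowed_female_share(toy_counts, range(1890, 1961))
print(is_gender_shifting(toy_shares))
```

Pooling counts weights each year by how many people were born in it, so a sparse year cannot swing the window on its own; averaging yearly proportions would be a different, equally defensible choice.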
A couple trends to spot:
- Most shifty names move in the direction from Male->Female. I’ve included a couple that buck that trend, Auguste and Augustine, both of which have become more male.
- From WWII to the end of the Boomer generation (1960s), many previously male names became increasingly, and often exclusively, female. This group includes Stacey, Robin, Lynn, Leslie, Leigh, Lauren, and Hillary.
- A few names, like Shirley, Meredith, Fay, and Kim, become increasingly female starting in the 19th C.
- Lacey is maybe the changeyest name in this group, going back and forth from over 70% male to over 70% female at least four times throughout the period. Kerry is also up and down.
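The back-and-forth described for Lacey can be counted mechanically. The sketch below is one possible way to do it, assuming a hypothetical mapping from year to female share; the function name and the rule of ignoring years that fall between the two 70% bands are my own choices for illustration, not something taken from the gender package.

```python
def count_swings(shares_by_year, threshold=0.7):
    """Count moves between 'over threshold male' and 'over threshold female'.

    shares_by_year is a hypothetical layout: {year: female share in [0, 1]}.
    Years inside the middle band do not change the current state, so small
    wobbles around 50% are not counted as swings.
    """
    state = None
    swings = 0
    for year in sorted(shares_by_year):
        share = shares_by_year[year]
        if share > threshold:
            new_state = "female"
        elif (1 - share) > threshold:
            new_state = "male"
        else:
            continue  # ambiguous year: keep the previous state
        if state is not None and new_state != state:
            swings += 1
        state = new_state
    return swings


# Made-up shares for illustration only.
toy = {1800: 0.2, 1820: 0.5, 1840: 0.8, 1860: 0.6, 1880: 0.1, 1900: 0.9}
print(count_swings(toy))
```

Requiring the share to clear 70% on the opposite side before counting a swing acts as a simple hysteresis, which matters for rare names where a handful of births can move the yearly share a lot.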
There seem to be a lot of very sharp changes from male->female around 1930-1940. Any idea what prompted those? Some reinforcement of gender differences that occurred during the Great Depression or something?
The very sharp shifts right at 1930 (or smoothed a little bit into the 1930s) are an artefact of the underlying data change-over from the IPUMS dataset (based on a sample of US census data) to the SSA dataset (based on registrations at birth). There could be a lot of reasons for this effect, but I’m not sure what the real cause is. However, it’s also true that a lot of names continue sharply in the direction of femaleness through the 30s and 40s. I’m just speculating, but it tracks somewhat with an increase in the social capital of women in America, so maybe appropriating traditionally male names is part of that. There may be phonetic motivators too (Sh-, -ee, -en all seem to gravitate female), but that doesn’t explain the timing, per se. Another somewhat obscured effect is that, because names are faddish, they can be influenced by a single prominent individual, so what starts out as a gradual increase in one gender’s share of a name might turn into a spike or a stronger trend 20 or 30 years later.
[…] Gender Shifts in American Baby Names – An interesting study of the gender flips some names undergo over the generations. Would be interesting to see if patterns can be found in the shifts, maybe tendencies for certain underlying phonetic elements to shift in gender? […]
Avery and Vivian should be on here!