Aug 16, 2013

Author Names

It occurs to me that the list of books I've read over the past 31 years can be treated as a source of names. Because the vast majority are from the English-speaking world, the names, too, are a spectrum of English Christian names. I thought I might plot the lot to see what patterns might emerge.

So here's what I did: I read into an R session the names of all my authors, removed duplicates (1844 names remaining), extracted the first names, and removed all those that have only an initial (e.g. 'J.'). I ended up with 1610 first names, not all distinct, of course. Here's a random selection of them: Robin, Marta, Michael, Hanif, Brian.
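The clean-up steps above might look roughly like this. The original work was done in R; this is only a Python sketch, and the function name, the rule for spotting initials, and the sample names are all made up for illustration.

```python
import re

def first_names(authors):
    """Return first names from full author names, skipping bare initials."""
    # Drop duplicate authors while keeping the original order.
    unique_authors = list(dict.fromkeys(authors))
    result = []
    for full_name in unique_authors:
        parts = full_name.split()
        if not parts:
            continue
        first = parts[0]
        # Assumed rule: a token made only of letter-plus-dot pairs
        # ('J.', 'J.R.R.') is an initial, not a first name.
        if re.fullmatch(r"(?:[A-Za-z]\.)+", first):
            continue
        result.append(first)
    return result

# Hypothetical sample, not the real reading list.
sample = ["Robin Alpha", "Robin Alpha", "Robin Beta", "J. Gamma"]
print(first_names(sample))
```

Duplicates are removed at the level of whole author names, so two different authors called Robin still contribute two first names, which is why the resulting list is "not all distinct".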

Observe the Hanif (Kureishi). Not a Christian name, what?

Of these 1610 names, 637 only occur once. These are names like: Wladyslaw, Vernor, Slavenka and Pamela.
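Separating the one-off names from the repeated ones is a simple tally. A minimal Python sketch, assuming the first names are already in a list; the function name and the repeated sample entries are hypothetical:

```python
from collections import Counter

def split_by_frequency(names):
    """Return (names seen exactly once, {name: count} for names seen more often)."""
    counts = Counter(names)
    singletons = sorted(name for name, n in counts.items() if n == 1)
    repeated = {name: n for name, n in counts.items() if n > 1}
    return singletons, repeated

# Tiny illustrative sample, not the real 1610 names.
sample = ["John", "John", "Vernor", "Pamela", "Mary", "Mary", "Mary"]
singletons, repeated = split_by_frequency(sample)
print(singletons)
print(repeated)
```

Only the `repeated` part would go into a barplot of names that occur more than once.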

Here's a barplot of the names that occur more than once.

The labels are not very clear unless you click through and zoom. Bit of a pain, that. 

So let me do this: I'll split the plot into 4 chunks.

Names occurring fewer than 4 times

Names occurring fewer than 10 times

Names occurring fewer than 20 times

The top names

Unsurprisingly, the top names are all male - I think I documented elsewhere that I've read far more books by men than by women. What's interesting is that the most popular female name is Mary (9 times), and twenty-two male names occur more often than that, all the way up to the most popular male name, John (46 occurrences).
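The four-way split of the barplot could be sketched as below. This is Python rather than R, and it assumes the chunks do not overlap (2-3, 4-9, 10-19, and 20 or more occurrences), which is one reading of the captions rather than something stated outright. Only the counts for John and Mary come from the text; the other entries are placeholders.

```python
def bucket_counts(counts):
    """Group {name: count} into the four plot chunks, ignoring one-off names."""
    limits = [
        (4, "fewer than 4"),
        (10, "fewer than 10"),
        (20, "fewer than 20"),
    ]
    buckets = {label: {} for _, label in limits}
    buckets["top"] = {}
    for name, n in counts.items():
        if n < 2:
            continue  # names occurring once are not plotted
        for limit, label in limits:
            if n < limit:
                buckets[label][name] = n
                break
        else:
            # No limit matched, so the name belongs to the top chunk.
            buckets["top"][name] = n
    return buckets

# John (46) and Mary (9) are from the text; NameA, NameB, NameC are placeholders.
sample = {"John": 46, "Mary": 9, "NameA": 3, "NameB": 12, "NameC": 1}
for label, members in bucket_counts(sample).items():
    print(label, members)
```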

This is still a tad irregular, because obviously there are names like Tom and Thomas, or Will and William, and Marie and Maria and Mary which are variants: should these be treated as different? These days there's a vogue for erstwhile diminutives to be treated as true names (there are a few Charlies I know - and they assure me they are not Charles's; likewise, Jack is an immensely popular name now). The combining of the variants into one class is not something I can do automatically with a bit of coding magic, so I have to create a new list manually.

Here I'm a bit baffled again. Are Andrea, Andrew and Andrei the same? Is every Alex an Alexandra? Some could be Alexander, of course, while others might be Alexis or Alexei. Unless I look into the antecedents of each of the names, I'm not going to be able to make much progress here.

Let me make some hand-waving assertions, though.

If I club Ann, Anna, Anne and Annie into one class, then there are 14 of these.

If I club Dan and Daniel, then there are 10 of these.

Daves and Davids together are 32.

Erics and Erichs together are 10.

Freds, Fredericks, Friedrichs together are 10.

George and Georges together are 20.

Geoffrey and Jeffrey together are 7.

Helen, Helene and Helena together are 7.

Iain and Ian total 17.

Jack and John and Johann together are 51.

Jose, Joe and Joseph together are 14.

Mary, Marion, Marie, Maria and Mari are 13.

Mike, Mikhail, Michael, Michele, Mikael and Miguel total 35.

Niccolo, Nicholas and Nick total 12.

Bob, Robert, Roberto are 30.

Stephen, Steven, Steve and Stefan are 26.

Thomas and Tom are 17.

Will, William, Willem and Bill are 25.
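Since the clubbing is a manual judgment, the natural representation is a hand-written lookup from each variant to a class label. A Python sketch using three of the groupings above; the choice of label for each class and the sample names are assumptions made for illustration:

```python
from collections import Counter

# Hand-made classes: label -> variants. The groupings follow the text;
# which variant serves as the label is arbitrary.
VARIANT_CLASSES = {
    "Ann": ["Ann", "Anna", "Anne", "Annie"],
    "Daniel": ["Dan", "Daniel"],
    "Thomas": ["Thomas", "Tom"],
}

# Invert to variant -> label for quick lookup.
CANONICAL = {
    variant: label
    for label, variants in VARIANT_CLASSES.items()
    for variant in variants
}

def clubbed_counts(names):
    """Count names after mapping each known variant to its class label."""
    # Names without an entry in the lookup are counted as themselves.
    return Counter(CANONICAL.get(name, name) for name in names)

# Illustrative sample, not the real list.
sample = ["Tom", "Thomas", "Anne", "Ann", "Annie", "Hanif"]
print(clubbed_counts(sample).most_common())
```

Ambiguous cases such as Alex or Andrea are simply left out of the lookup, so they stay as their own classes until someone decides where they belong.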

So what do we have?

The most popular women's names are now Ann and Mary and Helen and Jane, in that order.

The most popular men's names are John, Michael, David, Robert, Stephen, William.

Now, this website gives the top names in the US from 1900-2000. Observe that John, Michael and David were very popular in the early years, but gradually drifted down until only Michael and William remained in the top 10 in 2012. Between 1980 and 1990, John waned (heh).

Among women, Mary vanished from the top 10 in 1970; Anna and Helen were in the top 10 before the 1930s; Jane doesn't even figure in the list.

For some reason, the Scots have kept records of popular names over the last century (John tops the list till 1975, after which it plummets; Mary, Helen and Annie fade from the 1950s). Given that England and Wales have had censuses for yonks, I'm a bit puzzled that I can't readily find UK-wide statistics. Perhaps I'm not looking closely enough.

But here's the thing! The Johns in my list - Bowers (1928- ), Brunner (1934-1995), Buchan (1875-1940), Creasey (1908-1973), Derbyshire (1945- ), Dickson Carr (1906-1977), Gribbin (1946- ), Fuller (1913-1990), Grant (1949- ), Grisham (1955- ), Harvey (1938- ), Norwich (1929- ), Kay (1948- ), Keay (1941- ), Lanchester (1962- ), Lawton (1949- ), le Carré (1931- ), Ford (1951-2006), Man (1941- ), Masters (1914- ), Mortimer (1923- ), McWhorter (1965- ), Reader (1937- ), Steinbeck (1902-1968), Wheeler (1911-2008), Wyndham (1903-1969) - were nearly all born in the heyday of the name. If I had the puff to look at the authors born after 1970, I'd probably find very few Johns.

On the other hand, the Anns in my list - Digby (1935- ), Weale (1929-2007), Frank (1929-1945), Dukthas (*), Lamott (1954- ), Tyler (1941- ), Fadiman (1953- ), Stothard (1983- ), Reid (1965- ), Cleeves (1954- ), Dee Ellis (??), Perry (1938- ), Proulx (1935- ), Maria Ortese (1914-1998) - are exceptions - most of them were born after the name's popularity dropped in the 1930s.
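The "most of them" claim about the Anns can be checked with a quick tally of the birth years listed above. In this Python sketch the cutoff of 1930 is an assumption standing in for "the name's popularity dropped in the 1930s"; the two entries without a birth year (Dukthas, Dee Ellis) are left out.

```python
# Birth years as given in the list of Anns above.
ANN_BIRTH_YEARS = {
    "Digby": 1935, "Weale": 1929, "Frank": 1929, "Lamott": 1954,
    "Tyler": 1941, "Fadiman": 1953, "Stothard": 1983, "Reid": 1965,
    "Cleeves": 1954, "Perry": 1938, "Proulx": 1935, "Maria Ortese": 1914,
}

CUTOFF = 1930  # assumed start of the decline; not a figure from the source data

born_later = sorted(
    name for name, year in ANN_BIRTH_YEARS.items() if year >= CUTOFF
)
print(len(born_later), "of", len(ANN_BIRTH_YEARS), "born in", CUTOFF, "or later")
```

With this cutoff, 9 of the 12 dated Anns were born in 1930 or later; a later cutoff would lower that number.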

Make of it all what you will!

(* - Ann Dukthas is a pseudonym of Paul C. Doherty.)


Parmanu said...

You are insane, Fëanor.

Insanely good.

Fëanor said...

Too kind, too kind.

