While counting the tiles in a Scrabble set (in preparation for the Hudson Area Library Scrabble Tournament), I found this distribution (pictured left). Could this really reflect how often letters are used in English? Off to Google land. From the Oxford English Dictionary website came this interesting chart:
“We did an analysis of the letters occurring in the words listed in the main entries of the Concise Oxford Dictionary (11th edition revised, 2004) and came up with the following table:
The third column represents proportions, taking the least common letter (q) as equal to 1. The letter E is over 56 times more common than Q in forming individual English words.
The frequency of letters at the beginnings of words is different again. There are more English words beginning with the letter ‘s’ than with any other letter. (This is mainly because clusters such as ‘sc’, ‘sh’, ‘sp’, and ‘st’ act almost like independent letters.) The letter ‘e’ only comes about halfway down the order, and the letter ‘x’ unsurprisingly comes last.”1
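The proportions column Oxford describes is simple arithmetic: count each letter, then divide every count by the count of the rarest letter. Here is a minimal Python sketch of that calculation. The function name `relative_frequencies` and the three-word sample list are my own placeholders, not Oxford's method or data; a real run would need the dictionary's full list of main entries.

```python
from collections import Counter
import string


def relative_frequencies(words):
    """Map each letter to (count, percent of all letters, ratio to rarest letter)."""
    # Count only the 26 unaccented letters, ignoring case.
    counts = Counter(
        ch
        for word in words
        for ch in word.lower()
        if ch in string.ascii_lowercase
    )
    if not counts:
        return {}
    total = sum(counts.values())
    least = min(counts.values())  # the rarest letter is scaled to 1
    return {
        letter: (n, 100 * n / total, n / least)
        for letter, n in counts.most_common()
    }


# Hypothetical sample, far too small to say anything about English.
sample_words = ["queen", "tree", "see"]
table = relative_frequencies(sample_words)
for letter, (count, percent, ratio) in table.items():
    print(f"{letter.upper()}  {count:3d}  {percent:6.2f}%  {ratio:6.2f}")
```

Running the same function over only the first letter of each word would give the beginnings-of-words ranking mentioned in the quote.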
Comparing the piles in my picture to this chart immediately suggested that something was fishy about the distribution in Scrabble. “S”, for example, is the eighth most frequent letter according to Oxford’s analysis. But in my picture the “S” pile is noticeably shorter than might be expected. It turns out that the inventor of Scrabble, Alfred Butts, based his frequency analysis on a scan of the NY Times but included only four “S” tiles to make it more difficult to use pluralization as a scoring strategy.2 So the distribution in Scrabble reflects not just letter frequency in English but also the strategy of the game itself.