So what is Soundex? What does that click box do to my search? Why do people keep telling me to use it when I do a search?
If
you started your genealogy research B.C. (before computers) you
probably know these answers. You also may have just nodded and sort of
smiled about the "good old days." But if you are newer to genealogy
research -- A.C. (after computers) especially after the early years --
you may not know that there is a lot behind that simple click on a
search form. And that's okay; today we're going to change that.
Soundex is one of many phonetic algorithms that allow us to index words
(mainly names) by the sound of the word. So, regardless of minor spelling
differences, similar-sounding words are grouped (indexed) together.
For genealogists, that means we can find all the Smith, Smyth, Smythe, etc.
names in one spot. This makes it easier for us because, as we go back in
time, spelling was not standardized, and more people were illiterate and
may not have known how to spell their name anyway. Soundex gives us a
fighting chance to find them in many cases.
According to Wikipedia, Soundex was developed by Robert C. Russell and
Margaret King Odell and patented in 1918 and 1922. A variation called
American Soundex was used in the 1930s to index the US censuses from
1890 through 1920. The National Archives and Records Administration (NARA)
maintains the rules for implementation for the US Government.
Rules? Yes, rules. [Please read on and learn all about it. Or read on and see how much you remember from the "good old days."]
Soundex
converts words to a letter and three numbers -- no matter how many
letters make up the word. If you have a Michigan driver's license, the
letter and first three numbers of your license number are the Soundex
code for the surname on your license. Note: not all states use this as
part of the driver's license number.
Before computers (for some of you this equals before you were born),
we figured the code with paper and pencil. [It's okay if you use a blank
notepad or Word document file but it isn't quite the same as the "old
days."]
Take a surname, any surname, and write it
down. Then put four underscores/dashes ( _ _ _ _ ) to the right of the
surname or above the surname. As you figure the Soundex code this is
where you are going to put your "answers" as you determine the code for
the surname you wrote down.
1. The letter portion of the code is always the first letter of the surname/word you are converting.
It does not matter if that first letter is a consonant or a vowel. So write that letter on the first underscore/dash.
Now we figure out the number part of the code (three numbers) from the remaining letters of the surname.
2. Eliminate/cross-out the vowels and a few other letters in the surname.
A, E, I, O, U, H, W, Y
3. Below the remaining letters of the surname, convert each letter to the appropriate number from the list below.
1 = B, F, P, V
2 = C, G, J, K, Q, S, X, Z
3 = D, T
4 = L
5 = M, N
6 = R
4. Now read and apply any of these additional rules to the surname you wrote down.
Double Letters
If the surname has any double letters, ignore (cross out) the second occurrence of the same letter.
Letters Side-by-Side that Convert to the Same Soundex Number
If
the surname has two different letters side-by-side that become the same
Soundex number, ignore (cross out) the second occurrence of the number.
This includes situations where the first letter of the surname (which
remains a letter) and the second letter would code to the same number.
Names with Prefixes
In this situation, you need to convert the surname to two different
Soundex codes: one using the prefix and one not using the prefix. This
covers you for the different indexing methods that may have been used.
Note: Mac and Mc are not considered prefixes, while Van, Le, De, etc. are
prefixes.
Names with Consonant Separators
If a vowel (a, e, i, o, u) separates two consonants with the same number
code, the consonant to the right of the vowel is coded -- you use the
second occurrence of the code number. But if the letter h or w separates
the two consonants with the same number code, you do not use the second
occurrence of the code number. Check this against the original spelling,
before the letters from step 2 are crossed out.
Out of Letters
If you run out of letters, use a 0 (zero) to fill in any of the three Soundex numbers still vacant.
5. Following all the rules, you should now have the number portion of the
Soundex code for the surname you wrote down. Transfer your three numbers
to the remaining underscores/dashes.
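If you would rather let a computer hold the pencil, the steps above can be sketched in a few lines of Python. This is only a rough sketch and not NARA's official implementation. The names soundex and CODES are made up for this illustration. It also assumes that Y acts like a vowel for the consonant separator rule (the rules above only name H and W as the exceptions), and it does not handle prefixes.

```python
# Letter-to-number table from step 3 (the name CODES is made up for this sketch).
CODES = {
    letter: str(digit)
    for digit, letters in enumerate(
        ("BFPV", "CGJKQSXZ", "DT", "L", "MN", "R"), start=1
    )
    for letter in letters
}


def soundex(name):
    """Return a letter plus three numbers for a name (a sketch, not official)."""
    letters = [ch for ch in name.upper() if ch.isalpha()]
    if not letters:
        return ""
    first = letters[0]
    digits = []
    # Start from the first letter's own number so that a second letter
    # with the same number is skipped (the "Pfropper" case).
    previous = CODES.get(first, "")
    for ch in letters[1:]:
        if ch in "HW":
            # H and W do not separate: the previous number is remembered.
            continue
        code = CODES.get(ch, "")
        if code and code != previous:
            digits.append(code)
        # Vowels (and Y, by assumption) have no number, so they reset
        # `previous` and let the same number be used again afterwards.
        previous = code
    # Keep three numbers at most and pad with zeros if letters ran out.
    return first + "".join(digits)[:3].ljust(3, "0")


for surname in ("Lincoln", "Wellington", "Pfropper", "See", "Tymczak"):
    print(surname, soundex(surname))
```

Under those assumptions, soundex("Lincoln") comes out as L524 and soundex("See") as S000, the same answers you get with paper and pencil.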
Examples:
Lincoln = L524 (L, 5 for N, 2 for C, 4 for L)
Wellington = W452 (W, 4 for L, ignore the second L, 5 for N, 2 for G, remaining coded consonants are ignored)
Pfropper = P616 (P, ignore F since it codes the same as P, 6 for R, 1 for P, ignore second P, 6 for R)
See = S000 (S, e is a vowel which is ignored and there are no remaining letters so use 000)
Sy = S000 (S, y is not coded and there are no remaining letters so use 000)
The
National Archives and Records Administration's explanation of the rules has a good example of the consonant separator rule.
Tymczak = T522 (T, 5 for M, 2 for C, ignore Z since it codes to 2 also, vowel separates so 2 for K)
VanGogh
with prefix = V522 (V, 5 for N, 2 for G, the vowel O separates the two Gs so 2 for the second G is used)
without prefix = G200 (G, the vowel O separates the two Gs so 2 for the second G, no coded letters remain so use 00)
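The "two codes for a prefixed name" rule can be sketched in Python as well. Treat this as an illustration only. PREFIXES holds just the three prefixes named in the rules above, and split_prefix and soundex_codes are hypothetical helper names. The "capital letter after the prefix" test is my own assumption, there so that names like Lee or Dean are left alone. A compact soundex helper is included so the snippet runs on its own; it assumes Y acts like a vowel.

```python
CODES = {
    letter: str(digit)
    for digit, letters in enumerate(
        ("BFPV", "CGJKQSXZ", "DT", "L", "MN", "R"), start=1
    )
    for letter in letters
}

# Only the prefixes named in the post; a real list would be longer.
PREFIXES = ("Van", "Le", "De")


def soundex(name):
    """Compact Soundex sketch: first letter plus three numbers."""
    letters = [ch for ch in name.upper() if ch.isalpha()]
    if not letters:
        return ""
    digits, previous = [], CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":  # H and W do not separate same-number consonants
            continue
        code = CODES.get(ch, "")
        if code and code != previous:
            digits.append(code)
        previous = code  # vowels reset this to ""
    return letters[0] + "".join(digits)[:3].ljust(3, "0")


def split_prefix(surname):
    """Return (prefix, rest) for names written like 'VanGogh' or 'Van Gogh'."""
    for prefix in PREFIXES:
        if surname.lower().startswith(prefix.lower()):
            rest = surname[len(prefix):].lstrip(" -'")
            # Assumed heuristic: a capital letter must follow the prefix.
            if rest[:1].isupper():
                return prefix, rest
    return "", surname


def soundex_codes(surname):
    """Code the full name, plus the name without its prefix if it has one."""
    prefix, rest = split_prefix(surname)
    codes = [soundex(surname)]
    if prefix:
        codes.append(soundex(rest))
    return codes


print(soundex_codes("VanGogh"))
print(soundex_codes("McDonald"))
```

With these assumptions, VanGogh gives both V522 and G200, while McDonald gives only M235 because Mc is not treated as a prefix.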
Want
to check if you figured your code correctly? There's a converter for
that. In the early days of genealogy on the internet, an automatic
Soundex Converter was "a big thing." Today, you can use it to check your
work, or use it just for the fun of it. One
Soundex Converter
is hosted by RootsWeb. There are likely more still out there on the
internet. Almost all genealogy programs have a feature to tell you the
Soundex code for a surname.
Now this indexing system
takes into account many spelling variations. But not all of them.
[Doesn't there almost always seem to be a caveat?]
Soundex will not help you if the first letter of the surname was switched,
like when a census enumerator (not of the same ethnicity as the resident)
heard a V when a resident with a German accent said a name spelled with a
W. In German, a W is pronounced more like V. Thus Wandschneider can
become Vonsnider on a census. And as you can see, the Soundex code for
Wandschneider (W532) is not the same as the code for Vonsnider (V525), and
you won't find these different spellings in the same place (group). So
remember to think about how someone said something and how it may have
been heard. You may need to play with letters a bit.
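One way to "play with letters" is to jot down the spellings you think an enumerator might have written and see which codes they fall under. Here is a small Python sketch of that idea. The helper name search_codes is hypothetical, and the spellings fed to it are just guesses you supply yourself. A compact soundex helper is included so the snippet runs on its own; it assumes Y acts like a vowel.

```python
CODES = {
    letter: str(digit)
    for digit, letters in enumerate(
        ("BFPV", "CGJKQSXZ", "DT", "L", "MN", "R"), start=1
    )
    for letter in letters
}


def soundex(name):
    """Compact Soundex sketch: first letter plus three numbers."""
    letters = [ch for ch in name.upper() if ch.isalpha()]
    if not letters:
        return ""
    digits, previous = [], CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":  # H and W do not separate same-number consonants
            continue
        code = CODES.get(ch, "")
        if code and code != previous:
            digits.append(code)
        previous = code  # vowels reset this to ""
    return letters[0] + "".join(digits)[:3].ljust(3, "0")


def search_codes(spellings):
    """Return the distinct codes you would need to search, in sorted order."""
    return sorted({soundex(spelling) for spelling in spellings})


# Minor spelling differences land under one code...
print(search_codes(["Smith", "Smyth", "Smythe"]))
# ...but a name that was heard differently does not.
print(search_codes(["Wandschneider", "Vonsnider"]))
```

The Smith spellings all share S530, so one search covers them. Wandschneider and Vonsnider come back as two codes, W532 and V525. Swapping only the first letter is not enough here either, since Vandschneider would code to V532: the D the enumerator dropped changes the numbers too.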
So, besides the
caveat, that is the mystery behind the curtain of today's simple click
to use Soundex in your search form. Started your genealogy B.C.? Hope
your memory wasn't too rusty.
©2014, goneresearching. All text and photos in this post are
copyrighted & owned by me (goneresearching) unless indicated
otherwise. No republication (commercial or non-commercial) without prior
permission. You may share (tell others) of this blog as long as you
give credit and link to this site (not by downloading or copying any
post). Thank you.