Alan E. Mann, AG

fhfair@alanmann.com                                                                                Accredited Genealogist

www.alanmann.com/articles                                                                        prepared January 2005

                                 

 

Search Engines for Genealogists

 

I often tell people the Internet is the richest source of genealogical information available today. The amount, scope, and availability of data are staggering, even incomprehensible. It is virtually certain that there is valid information about your ancestors on the Internet that you don’t have, information that you would probably want if you only knew it was there. So how can you find it? With a lot of searching. This session looks at tools to help your searching.

 

Generally, a genealogist’s Internet searching has two phases.

  1. First, you must find the website that may have the desired information.
  2. Then, you search that website for the desired information.

This presentation focuses on the first step—finding the website.

 

While “search engine” can be correctly used in several different ways, the most common usage is a web tool used to find web pages on a specific topic. An Internet search engine is like a catalog to the Internet. Companies and individuals with web sites notify the search engines about their sites because they want people to find out about their site. Search engines also use spiders or robots—tools that go looking around the Internet, capturing pages and then indexing them. Beware—not all search engines are equal. Some index the first few sentences of a web page only. Others index every word. No search engine indexes all of the web. Some have billions of pages and others only have a few hundred million…
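
To make the difference between partial and full indexing concrete, here is a minimal sketch in Python. It is not how any particular search engine works; the page addresses, the page text, and the max_words setting are all invented for illustration. It builds a simple word-to-page index twice, once from every word and once from only the first five words of each page.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Lowercase the text and split it into words."""
    return re.findall(r"[a-z']+", text.lower())

def build_index(pages, max_words=None):
    """Map each word to the set of page addresses that contain it.

    max_words=None indexes every word; a small number imitates an
    engine that reads only the beginning of each page.
    """
    index = defaultdict(set)
    for address, text in pages.items():
        words = tokenize(text)
        if max_words is not None:
            words = words[:max_words]
        for word in words:
            index[word].add(address)
    return index

# Hypothetical pages, invented for this illustration.
pages = {
    "example.org/brooks": "Welcome to our family page. Wills left by the Brooks family of Ohio.",
    "example.org/music": "Garth Brooks will be appearing in concert.",
}

full_index = build_index(pages)                  # every word
partial_index = build_index(pages, max_words=5)  # first five words only

print(sorted(full_index["brooks"]))     # both pages
print(sorted(partial_index["brooks"]))  # only the page that mentions Brooks early
```

In this toy example the partial index misses the family page entirely, because the surname does not appear until the second sentence.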

 

Search engines are not designed specifically for genealogy; they simply look for whatever words you enter. There are thousands of search engines. One source claims to list over 809,000 search engines. Basically, a search engine visits web sites and indexes their content. While a search engine can index the name Richard Poor, it wouldn’t be able to distinguish between a person by that name, Poor Richard’s Almanack, and a play that had the line “Alas, poor Richard…” Once when searching for wills left by my Brooks family ancestors, a search engine confidently directed me to a page where I found the sentence “Garth Brooks will be appearing…” There are some tricks to using search engines. Use unusual names whenever possible (see tip 7 under Search Methodology, below). When searching for a common family name, add the word genealogy, the phrase “family history”, or “was born” after the name to narrow down your search.
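
The narrowing trick can be sketched in a few lines of Python. The function name narrowed_queries is my own invention, and the sketch assumes the search engine treats text inside quotation marks as an exact phrase, which many engines do but not all. It simply builds one search string for each narrowing term.

```python
def narrowed_queries(name, extras=("genealogy", "family history", "was born")):
    """Return one search string per narrowing term.

    The name and any multi-word term are wrapped in quotation marks,
    on the assumption that the engine treats quoted text as a phrase.
    """
    quoted_name = f'"{name}"'
    queries = []
    for term in extras:
        quoted_term = f'"{term}"' if " " in term else term
        queries.append(f"{quoted_name} {quoted_term}")
    return queries

for query in narrowed_queries("Richard Poor"):
    print(query)
```

Each printed line can be pasted into a search box as-is. Trying all three is worthwhile, since a page that says “was born” may never use the word genealogy.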

 

Search Methodology

 

Don’t forget the need to know what you are searching: what’s the scope, what’s the source, and how do you use it? Here are a few pointers that apply to genealogical searches on the Internet.

 

  1. Don’t stop just because you succeed! One common mistake is to stop searching when we find something. It’s great to find something about our ancestor and consider our search a success, but we should continue our searches with the other search aspects even after one method was successful. Another method may lead to additional information about that same ancestor.
  2. In general, we should start our searches broad and then narrow them down when necessary. This is because too much detail may cause us to miss something we might have found useful. For example, when visiting FamilySearch.org, we can enter first name, last name, birth year, birth place, spouse’s name, father’s full name, and mother’s full name. I suggest that you enter just the first and last name unless the name is fairly common (this wouldn’t work for Thomas Walker, William Jones, or Mary Taylor). If the surname is very unusual, you may want to enter surname only for your search. Consider leaving the place blank; you never know when family members might be found in an unexpected part of the world. If you get too many hits, then redo the search with some added detail.
  3. Be aware of what is being searched. You may want to search at either a higher or lower level (e.g., IGI only vs. “all resources”).
  4. Check out search help. This is one time that help often helps. It may be hidden in a link to “advanced search.” There may be appealing options that you really don’t want to use, like exact spelling in FamilySearch.
  5. Find out what options you have in searching. You may be able to use partial or truncated searches.
  6. The most important tip of all is the simplest. READ THE SCREEN. If you take time to do that, you can avoid many rookie mistakes.
  7. Consider searching for uncommon names first. If John Smith married Hortense Frinzwilter, don’t search for John; search for Hortense Frinzwilter. If David Brown had a brother Eliphalet Brown, search for Eliphalet. Then David may appear on the same page.
  8. If you enter a year, ALWAYS select “range of years” or + or – x years. Years are often estimated, approximated, or incorrectly reported.
  9. One way to use a default search engine is to just type what you want to find into the address bar of your web browser. Try it and see what happens. My wife loves doing this.
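
Tips 2 and 8 can be illustrated together with a small Python sketch. Everything here is an assumption made for illustration: the records are invented, the field names are my own, and no real web site searches this way internally. The sketch starts with the name only, adds one more detail at a time, and stops as soon as the list of hits is short enough. Years are matched within a tolerance rather than exactly.

```python
def matches(record, criteria, year_tolerance=5):
    """True if the record satisfies every criterion that was supplied."""
    for field, wanted in criteria.items():
        found = record.get(field)
        if field == "birth_year":
            # Treat years as approximate (tip 8).
            if found is None or abs(found - wanted) > year_tolerance:
                return False
        elif found is None or found.lower() != wanted.lower():
            return False
    return True

def broad_then_narrow(records, steps, max_hits=2):
    """Add criteria one step at a time until the hit list is manageable."""
    criteria = {}
    hits = list(records)
    for step in steps:
        criteria.update(step)
        hits = [r for r in records if matches(r, criteria)]
        if len(hits) <= max_hits:
            break
    return criteria, hits

# Invented records, for illustration only.
records = [
    {"first": "David", "last": "Brown", "birth_year": 1820, "place": "Ohio"},
    {"first": "David", "last": "Brown", "birth_year": 1823, "place": "Kent"},
    {"first": "David", "last": "Brown", "birth_year": 1861, "place": "Ohio"},
    {"first": "Eliphalet", "last": "Brown", "birth_year": 1818, "place": "Ohio"},
]

steps = [
    {"first": "David", "last": "Brown"},  # start broad: name only
    {"birth_year": 1821},                 # then add an approximate year
    {"place": "Ohio"},                    # add a place only if still needed
]

criteria, hits = broad_then_narrow(records, steps)
print(criteria)
print(hits)
```

With these sample records the search stops after the year is added, so the place is never entered and the David Brown recorded in Kent is still found.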

 

 

MetaSearching

 

What is metasearch?  The term does not yet appear in most dictionaries, but is a common term on the web. It is used most often to describe an Internet metasearch engine. The general idea is that you submit keywords in its search box, and it then transmits your search to several individual search engines simultaneously. Within a few seconds, you get back results from several search engines. Metasearch engines do not have their own index or database of Web pages; they send your search terms to those kept by search engine companies, and then combine the results from their indexes.
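
Here is a rough Python sketch of that idea. The three “engines” are stand-in functions that return made-up addresses; a real metasearch engine would contact real search services, and I don’t claim any of them combine results exactly this way. The sketch sends one query to all of the engines at the same time, then merges the answers so that each page is listed once.

```python
from concurrent.futures import ThreadPoolExecutor

# Stand-ins for real search engines; each returns a ranked list of addresses.
def engine_a(query):
    return ["example.org/poor-family", "example.org/almanac"]

def engine_b(query):
    return ["example.org/almanac", "example.net/richard-poor"]

def engine_c(query):
    return ["example.net/richard-poor", "example.org/poor-family", "example.com/play"]

def metasearch(query, engines):
    """Send one query to every engine at once and combine the answers."""
    with ThreadPoolExecutor(max_workers=len(engines)) as pool:
        result_lists = list(pool.map(lambda engine: engine(query), engines))

    combined = {}
    for results in result_lists:
        for rank, address in enumerate(results):
            entry = combined.setdefault(address, {"engines": 0, "best_rank": rank})
            entry["engines"] += 1
            entry["best_rank"] = min(entry["best_rank"], rank)

    # Pages reported by more engines come first; ties go to the better rank,
    # then to alphabetical order so the output is predictable.
    return sorted(
        combined,
        key=lambda a: (-combined[a]["engines"], combined[a]["best_rank"], a),
    )

merged = metasearch("Richard Poor", [engine_a, engine_b, engine_c])
print(merged)
```

Notice that the merged list contains nothing that at least one of the engines did not already return. The metasearch adds convenience and ordering, not new pages.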

 

What you need to know about metasearch engines is that the quality of their results depends on what they search and how they organize the results. A metasearch cannot be better than the sum of the individual databases it queries.

 

There are some good general web metasearch engines which are not designed as genealogical search tools, but which can be used to search for genealogy or genealogically-related topics. 

 

This class extends the idea to genealogy web pages. Generally speaking, a genealogy metasearch tool would be something that searches several databases or several web sites. Using this definition, metasearches can either be those that search several databases on a single site or tools that search several web sites and combine the results. This class will look at examples of both types of metasearches.

 

 

Single Site Metasearches

 

Here, we can talk about Ancestry, Heritage Quest Online, FamilySearch, RootsWeb, or a variety of other web sites. These are sites that have a lot of databases, but have a (meta)search that looks through and presents results from all of the different databases.

 

USGenWeb. The simplest would be the USGenWeb Archive. Here, there are hundreds of thousands of files representing extracts, transcriptions, abstracts, and indexes to many millions of names. The site search engine allows you to search all of their files at once or all of the files for any one state. However, the search options are extremely limited. Basically, you can search for any word in any of the files selected, but you cannot specify whether the word is a name, place, relationship, or something else. This is called a freeform, general, or unfielded search. While it does offer the advantage of searching many things at once, it doesn’t give much flexibility to limit or narrow the search results.
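
The difference between an unfielded and a fielded search is easy to see in a small Python sketch. The records and field names below are invented, and this is not a description of how the USGenWeb search is actually built. The first function looks for a word anywhere in a record; the second looks only in the field you name.

```python
# Invented records, for illustration only.
records = [
    {"surname": "Brooks", "given": "Hannah", "place": "Salem", "note": "will proved 1790"},
    {"surname": "Carter", "given": "John", "place": "Brooks County", "note": "land deed"},
    {"surname": "Miller", "given": "Ann", "place": "Dover", "note": "married Thomas Brooks"},
]

def unfielded_search(records, word):
    """Match the word anywhere in the record, as a freeform search does."""
    word = word.lower()
    return [r for r in records if any(word in value.lower() for value in r.values())]

def fielded_search(records, field, word):
    """Match the word only in the named field."""
    return [r for r in records if r[field].lower() == word.lower()]

anywhere = unfielded_search(records, "Brooks")
surname_only = fielded_search(records, "surname", "Brooks")

print(len(anywhere))      # Brooks as a surname, a place, and a spouse
print(len(surname_only))  # Brooks as a surname only
```

The unfielded search returns all three records, while the fielded search returns just the one person whose surname is Brooks.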

 

Ancestry. This is a site with many different databases. They have made a default search that searches across those databases—census, wills, family history books, obituaries, and more. You usually search by name, but can add country, state or province, year range, keyword, or record type. You can also specify whether to use soundex or exact spelling. The search template does not change when you specify a record type. But if you select a record type from the list at the right, the search template changes. You will then probably only be able to specify name and keyword. You also will get a list of databases so that you can further restrict your search. The general policy for Ancestry is to search the database for the items specified, but to ignore any input fields that don’t apply to that database. For example, if you specify a range of years, but the database being searched doesn’t specify years, Ancestry’s metasearch will just ignore the year range and display any results from that database.
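
That policy of ignoring fields that don’t apply can be sketched in Python as follows. This is only my illustration of the behavior described above, not Ancestry’s actual code; the database names, field names, and rows are all hypothetical.

```python
def row_matches(row, field, wanted):
    """Compare one field; a year criterion is a (low, high) range."""
    if field == "year":
        low, high = wanted
        return low <= row["year"] <= high
    return row[field].lower() == wanted.lower()

def lenient_search(database, criteria):
    """Apply only the criteria this database has fields for; ignore the rest."""
    usable = {f: w for f, w in criteria.items() if f in database["fields"]}
    return [
        row for row in database["rows"]
        if all(row_matches(row, f, w) for f, w in usable.items())
    ]

# Hypothetical databases: one records years, the other does not.
census = {
    "fields": {"surname", "year"},
    "rows": [{"surname": "Mann", "year": 1860}, {"surname": "Mann", "year": 1900}],
}
family_books = {
    "fields": {"surname"},
    "rows": [{"surname": "Mann"}, {"surname": "Poor"}],
}

criteria = {"surname": "Mann", "year": (1850, 1870)}
census_hits = lenient_search(census, criteria)
book_hits = lenient_search(family_books, criteria)

print(census_hits)  # the year range is applied
print(book_hits)    # the year range is silently ignored
```

The practical lesson is that a hit from a database without years has not been checked against your year range at all, so it deserves a closer look.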

 

FamilySearch. This site lists the databases it searches along the left. The default is “all resources,” a metasearch. You can limit your search to a specific database to get additional search options. FamilySearch’s general policy is to restrict your search fields to just those fields that are common to the databases being searched. Thus, an all resources search has only the few search fields shared by every database. When you select census records, you get different search fields. When you select one census year to search, you get yet more options unique to that census. Exceptions include the web site search, which disregards everything you enter except surname (this search is nearly useless, except for unusual surnames).
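
The “common fields only” policy amounts to taking the overlap of each database’s field list. The Python sketch below shows the idea with hypothetical database names and field lists; the real FamilySearch fields differ, and I am not describing the site’s internal workings.

```python
# Hypothetical databases and the fields each one supports.
database_fields = {
    "census_a": {"first", "last", "year", "place", "household"},
    "vital_index": {"first", "last", "year", "place", "spouse"},
    "pedigree_file": {"first", "last", "spouse", "father", "mother"},
}

def common_fields(selected):
    """Return the fields shared by every selected database."""
    field_sets = [database_fields[name] for name in selected]
    return set.intersection(*field_sets)

all_resources = sorted(common_fields(database_fields))  # every database
one_database = sorted(common_fields(["census_a"]))      # a single database

print(all_resources)
print(one_database)
```

Searching everything leaves only first and last name in this example, while choosing one database brings back all five of its fields.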

 

Heritage Quest Online. This site has some very useful ways of grouping results and actually has the most flexible census searches. It is less of a metasearch than the other sites listed here because it has three categories, and has no single search that searches all three categories. Nonetheless, there are thousands of databases searched within a category.

 

 

Genealogical Metasearches

 

Perhaps the best example of a multiple-site metasearch is Internet Family Finder (www.genealogy.com/ifftop.html). This searches over 300,000 separate family history databases. Unfortunately, the search has no true fields other than first and last name, but those have been well identified, making it much more useful than USGenWeb’s text-only search.

 

Another example is more of a tool than a search—Culman’s MultiGen. Found at http://ourworld.compuserve.com/homepages/CACulman/MultiGen.htm, this site has you enter a name once, then submits a search request to ten genealogy sites at once.  If you select the “open new window” option and then click on “Search them all,” you will get ten windows with the separate results from each of the ten sites. While not a true metasearch because it doesn’t combine the results, it can save time and conduct several searches at once…
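
A tool of this kind can be imitated with a short Python sketch. The site addresses and parameter names below are placeholders that I made up; they are not the sites MultiGen uses, and every real site defines its own search address and field names. The sketch takes a name once and builds a separate search address for each site.

```python
from urllib.parse import urlencode

# Hypothetical sites: (search address, first-name parameter, last-name parameter).
SITES = {
    "Site One": ("https://search.example.org/find", "first", "last"),
    "Site Two": ("https://records.example.net/query", "given", "surname"),
}

def build_search_urls(first_name, last_name):
    """Enter the name once; get back one search address per site."""
    urls = {}
    for site, (base, first_key, last_key) in SITES.items():
        query = urlencode({first_key: first_name, last_key: last_name})
        urls[site] = f"{base}?{query}"
    return urls

urls = build_search_urls("Hortense", "Frinzwilter")
for site, url in urls.items():
    print(site, url)
```

To open each result in its own browser window, as the “open new window” option does, each address could be passed to Python’s webbrowser.open_new function. As with MultiGen, the results stay separate; nothing is combined.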

 

 

Clustering Metasearches

 

Clustering metasearch engines find results and group them by common terms found on the resulting pages. This can be very helpful because it suggests other terms that you may recognize and use to narrow down your search results. Three distinct examples of clustering metasearches are Clusty (www.clusty.com), Kartoo (www.kartoo.com), and Vivisimo (www.vivisimo.com).
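
The grouping idea can be shown with a very simple Python sketch. Real clustering engines use far more sophisticated methods than this, and the result titles and the list of words to skip are invented for illustration. The sketch groups result titles under each word, other than the search word itself, that at least two results share.

```python
import re
from collections import defaultdict

# A short, arbitrary list of words too common to be useful as cluster labels.
STOP_WORDS = {"the", "of", "and", "in", "a", "to", "for", "by", "at"}

def cluster_by_term(results, query_words, min_size=2):
    """Group result titles under each non-query word that several share."""
    ignored = STOP_WORDS | {w.lower() for w in query_words}
    clusters = defaultdict(list)
    for title in results:
        words = set(re.findall(r"[a-z]+", title.lower())) - ignored
        for word in words:
            clusters[word].append(title)
    return {term: titles for term, titles in clusters.items() if len(titles) >= min_size}

# Invented result titles for a search on the surname Brooks.
results = [
    "Brooks family wills of Virginia",
    "Garth Brooks concert tickets",
    "Virginia wills index: Brooks",
    "Garth Brooks concert review",
]

clusters = cluster_by_term(results, ["Brooks"])
for term, titles in sorted(clusters.items()):
    print(term, len(titles))
```

In this example the clusters labeled wills and virginia point to the genealogical pages, while concert and garth gather the pages you can safely skip.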

 

 

Yet More Search Engine Information

 

A good Internet metasearch engine searches good databases, accepts complex searches, integrates results well, eliminates duplicates, and offers additional features such as clustering by subjects within your search results.

 

Yet another metasearch with some extra helpful features is ZapMeta (www.zapmeta.com).  Try turning snapshots on. Try the other features out!

 

For more general Internet search tools and information on search engines, see http://searchenginewatch.com/links/article.php/2156241, http://www.searchenginewatch.com/facts/index.php, and http://www.netstrider.com/search/.

 

 

 

©Copyright 1997-2005 by Alan E. Mann. All rights reserved. Written permission to reproduce all or part of this syllabus material in any format, including photocopying, data retrieval, or the Internet, must be secured in advance from the copyright holder.