Alan E. Mann, AG

---

alan.familyhistory@gmail.com                                                            Accredited Genealogist

BYU 2008 Annual Family History and Genealogy conference www.alanmann.com/articles           

Thursday, 31 July 2008                                                                 1:30-2:30 pm and 5:30-6:30 pm

 

Search Engines: Getting More from Google and Using More than Google

The Internet is the richest source of genealogical information available today. The amount, scope, and availability of data are staggering, even incomprehensible. It is likely there is valid information about your ancestors on the Internet that you don’t have, information you would probably want if only you could find it. So how can you find it? That’s the topic of this session.

Internet Search Tips

1.        Start your searches broad and narrow them down when necessary. Too much detail may cause you to miss something useful. For example, when visiting FamilySearch.org, we can enter first name, last name, birth year, birth place, spouse’s name, father’s full name, and mother’s full name. Enter just the first and last name unless the name is fairly common (you probably shouldn’t search for just Thomas Walker, William Jones, or Mary Taylor, but add some place or time period when searching for common names). If the surname is unusual, enter surname only for your search. To start, leave the place blank; you never know when family members might be found in an unexpected part of the world. If you get too many hits, then add some detail to a new search, but only as much detail as necessary to reduce the search results to a manageable level.

2.        Be aware of what is being searched. You may want to search at a higher or lower level (e.g., IGI only vs. “all resources”). Sometimes, restricting your search to a single database gives you additional search options or effectively narrows your search down to something manageable.

3.        Check out search help—it often actually helps. This may be hidden in a link to “advanced search.” There may be options that look appealing but that you shouldn’t use—like exact spelling in FamilySearch.

4.        Find out what options you have in searching. You may be able to use partial, truncated, or wildcard searches.

5.        The most important tip of all is the simplest. READ THE SCREEN. If you take time to do that, you can avoid many rookie mistakes.

6.        Consider searching for uncommon names first. If John Smith married Hortense Frinzwilter, don’t search for John Smith—search for Hortense Frinzwilter. If David Brown had a brother Eliphalet Brown, search for Eliphalet. David may appear on the same page or in the same source.

7.        If you enter a year, ALWAYS select “range of years” or + or – x years. Years are often estimated, approximated, or incorrectly reported.

8.        Don’t stop just because you succeed! Success can be a barrier to greater success. One common mistake is to stop searching when we find something. It’s great to find something about our ancestor, but we should continue our search. Finish the list of search results or “hits”, then continue with the other search aspects (see tip 9) even after finding what you were seeking.

9.        Alter your search approach to find information not just by name, but also by places your ancestors lived, topic (e.g., ethnicity, religion, society), time period, event (e.g., war, famine), characteristic, record type needed, or keyword. Another approach may lead to additional information.

10.     The Internet is dynamic. If you don’t find it today, try again later.  If you find something today, you may be able to find yet more later. This means that our searches may need to be repeated from time to time to locate new information or newly indexed or categorized information.

11.     Did you know you can use site: or inurl: to focus on the best results? For example, suppose I want to see whether any of my Iron County, Utah STUBBS family appears on the USGenWeb page for Iron County, but the site doesn’t have a search engine. I just type site:rootsweb.com inurl:utiron stubbs into my Google search, and I’ve cut hours of searching to seconds.
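
The site: and inurl: query from tip 11 can also be assembled by a short script, which is handy when the same restricted search has to be repeated for many surnames. The following is a minimal sketch, not an official Google tool: the function names build_query and search_url are made up for illustration, and the search URL with its q parameter is assumed to be the ordinary Google results address.

```python
from urllib.parse import urlencode


def build_query(terms, site=None, inurl=None):
    """Combine plain search terms with optional site: and inurl: operators."""
    parts = []
    if site:
        parts.append(f"site:{site}")
    if inurl:
        parts.append(f"inurl:{inurl}")
    parts.extend(terms)
    return " ".join(parts)


def search_url(query, base="https://www.google.com/search"):
    """Return a results URL for the query (base address is an assumption)."""
    return f"{base}?{urlencode({'q': query})}"


# The Iron County example from tip 11.
query = build_query(["stubbs"], site="rootsweb.com", inurl="utiron")
print(query)
print(search_url(query))
```

Changing the list of terms (for example, to another surname) reuses the same site and URL restriction without retyping the operators.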

What is a Search Engine?

While the term “search engine” can be correctly used in several different ways, the most common usage is a web tool used to find web pages on a specific topic. It is important to understand what a search engine is, what it includes, and how to best use it. The answers to these questions vary from one search engine to another. Generally, a search engine indexes the web sites it has been able to identify. Some search engines index every word, some index only the first page, and a few just the first few sentences.

Companies and individuals with web sites notify the search engines about their sites because they want people to find out about their site. Search engines also use crawlers, spiders or robots—tools that go looking around the Internet, capturing pages and then indexing them. Beware—not all search engines are equal. Some index the first few sentences of a web page only. Others index every word. No search engine indexes all of the web. Some have billions of pages and others only have a few hundred million…

Search engines are not designed specifically for genealogy, but rather search for whatever words you input to search. Originally, while a search engine could index the name Richard Poor, it wouldn’t be able to distinguish between a person by that name, Poor Richard’s almanac, and a play that had the line “Alas, poor Richard…” Once when searching for wills left by my Brooks family ancestors, a search engine confidently directed me to a page where I found the sentence “Garth Brooks will be appearing…” Search engines are becoming more sophisticated and some ability to distinguish is being designed into their search findings. There are some tricks to using search engines. Use unusual names whenever possible (see sidebar below).
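
To make the indexing differences and the Brooks example concrete, here is a small sketch of a word index. It is only an illustration under stated assumptions: the two pages and their text are invented, and real search engines are far more elaborate. The max_words parameter imitates an engine that indexes only the beginning of each page.

```python
import re
from collections import defaultdict


def build_index(pages, max_words=None):
    """Map each word to the set of pages containing it.

    max_words=None indexes every word; a number indexes only
    that many words from the start of each page.
    """
    index = defaultdict(set)
    for name, text in pages.items():
        words = re.findall(r"[a-z']+", text.lower())
        if max_words is not None:
            words = words[:max_words]
        for word in words:
            index[word].add(name)
    return index


def search(index, *terms):
    """Return the pages that contain every search term."""
    sets = [index.get(term.lower(), set()) for term in terms]
    return set.intersection(*sets) if sets else set()


# Invented sample pages.
pages = {
    "concert.html": "Tour news. Garth Brooks will be appearing in Salt Lake City.",
    "probate.html": "Probate abstracts for the county. The will of Thomas Brooks names his wife.",
}

full = build_index(pages)                 # every word indexed
partial = build_index(pages, max_words=5)  # only the first five words

print(search(full, "brooks", "will"))
print(search(partial, "brooks", "will"))
```

The full index returns both pages because it matches words, not meanings, so the concert notice looks as relevant as the probate abstract. The partial index misses the probate page entirely, since the name appears after the first five words.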

The best known search engine today is Google. Google indexes every word on the sites that it has copied. It is estimated that Google’s 25 billion + indexed pages represent somewhere around 20% of the web. That means that about 80% of the web remains unrepresented in Google! Consider using more than just Google, because no search engine indexes more than about 20% of the web and no two of them index the SAME portion. There are sites on each that may not be listed on others. Another reason is that there are different methods of searching, that is, different ways to apply your search terms. You need to read each search engine’s help page and advanced search tips, and experiment with each to find the best way to use that search engine.

In addition, there are nearly a million other search engines. Some are specialized, some are general. The four major ones are Google, Yahoo, Ask, and Live (formerly MSN). General information with links to search tips can be found at www.geocities.com/familyhistory.geo/search2006.htm. A comparison chart that may help you understand options in the major search engines is available at http://www.lib.berkeley.edu/TeachingLib/Guides/Internet/SearchEngines.html.

The lines between different types of search engines are blurring. The various types have been adopting the features and some of the functions of the other types, making distinctions almost nonexistent. Nonetheless, it would be worthwhile to look at four special types or characteristics of search engines, namely metasearch, clustering, federated, and custom searching.

Metasearching

Originally, a metasearch engine was one in which you submit keywords in its search box, and it then transmits your search to several individual search engines simultaneously. Within a few seconds, you get back results that came from several search engines. Metasearch engines do not have their own index or database of Web pages; they send your search terms to those kept by search engine companies, then combine the results from their indexes. Merriam-Webster defines meta- as “more comprehensive : transcending <metapsychological> -- usually used with the name of a discipline to designate a new but related discipline designed to deal critically with the original one.” Therefore, a metasearch would be a comprehensive or transcending search, or, in other words, a search which includes more than one search.

Single Site MetaSearch. Here, we can talk about Ancestry, FamilySearch, RootsWeb, Heritage Quest Online, USGenWeb Archive, or a variety of other web sites. These sites have many databases, with a tool that will search through and present results from all of their databases.

Multiple Site MetaSearch. What you need to know about metasearch engines is that the quality of their results depends on what they search and how they organize the results. A metasearch cannot be better than the sum of the individual databases it queries. What makes a good Internet metasearch is an engine that searches good databases, accepts complex searches, integrates results well, eliminates duplicates, and offers additional features such as clustering by subjects within your search results.
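
The core of a multiple-site metasearch (send one query to several engines, integrate the results, and eliminate duplicates) can be sketched in a few lines. This is an illustration only: engine_a and engine_b are hypothetical stand-ins that return canned addresses rather than querying any real search engine, and the URL normalization is deliberately crude.

```python
def normalize(url):
    """Reduce a URL to a rough key so near-identical addresses compare equal.

    Simplification: lowercases the whole address and strips the scheme,
    a leading "www.", and a trailing slash.
    """
    url = url.lower().rstrip("/")
    for prefix in ("https://", "http://"):
        if url.startswith(prefix):
            url = url[len(prefix):]
    if url.startswith("www."):
        url = url[4:]
    return url


def metasearch(query, engines):
    """Send the query to every engine, merge the lists, and drop duplicates."""
    merged = {}
    for engine_name, engine in engines.items():
        for rank, url in enumerate(engine(query)):
            entry = merged.setdefault(
                normalize(url), {"url": url, "engines": [], "best_rank": rank}
            )
            entry["engines"].append(engine_name)
            entry["best_rank"] = min(entry["best_rank"], rank)
    # Pages found by more engines come first, then pages ranked higher.
    return sorted(
        merged.values(),
        key=lambda e: (-len(e["engines"]), e["best_rank"], e["url"]),
    )


# Hypothetical engines returning made-up results.
def engine_a(query):
    return ["http://www.example.org/stubbs/", "http://example.com/iron-county"]


def engine_b(query):
    return ["https://example.org/stubbs", "http://example.net/utah-wills"]


results = metasearch("stubbs", {"A": engine_a, "B": engine_b})
for entry in results:
    print(entry["url"], entry["engines"])
```

Ranking a page higher when several engines agree on it is one possible design choice; a real metasearch engine may weigh its sources quite differently.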

While there are many metasearch engines, three companies have tried to apply the broad metasearch concept to genealogy. The two still in business are Internet Family Finder and MyHeritage. Perhaps the best example of a multiple-site metasearch is Internet Family Finder (www.genealogy.com/ifftop.html), which searches over 300,000 separate family history databases. Unfortunately, the search has no true fields other than first and last name, but those have been well identified, making it much more useful than text-only searches.

Clustering

Clustering metasearch engines find results and group them by common terms found on the resulting pages. At first, the difference between a regular metasearch and a clustering metasearch is difficult to see. Both allow searching for specific terms within a set of results. The difference is that clustering tools suggest terms rather than just searching for what you input. This can be very helpful, because the suggested terms may be ones you recognize and can use to narrow down your search results. The leaders in this field are www.clusty.com and www.zapmeta.com. Two unique examples of clustering metasearches are Kartoo (www.kartoo.com) and Vivisimo (www.vivisimo.com). The strategy in using these tools is to search for a name, record type, or concept and then use the words in common on the left to focus in on what you are looking for.
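
The idea of suggesting terms can be sketched as follows. This is a simplified illustration, not how Clusty or any other product actually works: the result titles are invented, the stopword list is tiny, and a term is suggested simply because it appears in at least two results.

```python
import re
from collections import defaultdict

# A tiny, assumed list of words too common to be useful as suggestions.
STOPWORDS = {"the", "of", "in", "and", "a", "for", "to"}


def cluster(results, query):
    """Group result titles under terms that at least two of them share."""
    query_words = set(query.lower().split())
    clusters = defaultdict(list)
    for title in results:
        words = set(re.findall(r"[a-z]+", title.lower()))
        for word in words - query_words - STOPWORDS:
            clusters[word].append(title)
    shared = {term: titles for term, titles in clusters.items() if len(titles) >= 2}
    # Largest clusters first, then alphabetical.
    return dict(sorted(shared.items(), key=lambda kv: (-len(kv[1]), kv[0])))


# Invented result titles for a search on "stubbs".
results = [
    "Stubbs family cemetery records in Iron County",
    "Iron County census index: Stubbs",
    "Stubbs census transcription 1880",
    "Cemetery photos, Stubbs plot",
]

suggested = cluster(results, "stubbs")
for term, titles in suggested.items():
    print(term, len(titles))
```

Here the suggested terms (such as "census" or "cemetery") were never typed by the searcher; they come from the results themselves, which is the difference from a plain search within results.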

Federated Searching

A federated search is a type of metasearch. Rather than the freeform searching typical of most search engines, a federated search uses organized or fielded data and combines two or more sites with similar data structure into a single results list. For example, both Ancestry.com and FamilySearch.org allow you to search by first name, last name, birth year, and place. A federated search might allow you to input each of those four pieces of data, then search both FamilySearch and Ancestry, then present the results in a single list. The leader in federated searching is WebFeat. MyHeritage (www.myheritage.com/research) combines many sites with search tools. BYU-Idaho has a tool that federates searching of biography databases, but it only works on campus. See the list of databases that are federated at http://abish.byui.edu/library/r_biographies.cfm.
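
A federated search over fielded data might look something like the sketch below. Everything in it is an assumption made for illustration: the Record fields follow the four pieces of data named above, the two sources are invented in-memory lists (not real Ancestry or FamilySearch data or interfaces), and the birth year is matched within a range of years rather than exactly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    first_name: str
    last_name: str
    birth_year: int
    place: str
    source: str


def matches(record, first_name, last_name, birth_year=None, place=None, year_range=2):
    """Compare one record against the four search fields."""
    if record.first_name.lower() != first_name.lower():
        return False
    if record.last_name.lower() != last_name.lower():
        return False
    # Years are often estimated, so accept anything within year_range.
    if birth_year is not None and abs(record.birth_year - birth_year) > year_range:
        return False
    if place is not None and place.lower() not in record.place.lower():
        return False
    return True


def federated_search(sources, **criteria):
    """Run the same fielded search on every source and return one list."""
    hits = []
    for records in sources:
        hits.extend(r for r in records if matches(r, **criteria))
    return sorted(hits, key=lambda r: (r.birth_year, r.source))


# Invented sample data standing in for two separate sites.
site_a = [
    Record("William", "Stubbs", 1832, "Iron, Utah", "Site A"),
    Record("William", "Stubbs", 1870, "Cheshire, England", "Site A"),
]
site_b = [
    Record("William", "Stubbs", 1833, "Iron, Utah", "Site B"),
    Record("Mary", "Stubbs", 1832, "Iron, Utah", "Site B"),
]

hits = federated_search(
    [site_a, site_b],
    first_name="William", last_name="Stubbs", birth_year=1832, place="Utah",
)
for record in hits:
    print(record.source, record.birth_year, record.place)
```

The search works only because both sources share the same field structure, which is what distinguishes a federated search from a freeform metasearch.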

Custom Searching

A new type of searching is Cloud searching – see cloudsearch.net.

Google has made it possible for those wanting to try a little programming to design customized Google searches. For a few examples of such searches, see:

Yet More Search Engine Information

Your web browser has a default search engine, which it uses when you type what you want to find into the address bar. Try it and see what happens. My wife loves this! If you think your research might be in Europe, www.euroseek.com finds European sites and lets you search in foreign languages.

For more information about search engines, see www.searchengineshowdown.com, searchenginewatch.com/links/article.php/2156241 (metasearching), searchenginewatch.com/resources/index.php (facts, tutorials, explanations), and www.searchenginewatch.com/facts/index.php. A couple more concepts to cover: personalization and intelligent searching.

Several search engines now have personalization. An example is iGoogle, which keeps track of the searches you do and the results you click on. You can access Google from any computer, log in, and pick up where you left off on any other computer. It can also store web pages and source documentation using Google Notebook.

Intelligent Searching

In an intelligent search, the computer gathers the results from the many web sites and evaluates whether any of them could be the ancestor you want. A semantic web would mark up records, transcripts, abstracts, or indexes with the context of the records. Thus, not only name, but place, time period, and record type would be identified within the meta-tags embedded in the page itself. This is the future of genealogical research. The only question is how long it will take.
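
As a rough picture of what such markup could make possible, the sketch below reads name, place, time period, and record type from meta-tags embedded in a page and then evaluates whether the record could be the person sought. The tag names (genealogy.person and so on), the sample page, and the matching rule are all hypothetical; no existing standard or web site is being described.

```python
from html.parser import HTMLParser


class MetaTagParser(HTMLParser):
    """Collect name/content pairs from <meta> tags in a page."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag == "meta":
            attrs = dict(attrs)
            if "name" in attrs and "content" in attrs:
                self.fields[attrs["name"]] = attrs["content"]


def read_fields(html):
    parser = MetaTagParser()
    parser.feed(html)
    return parser.fields


def could_be_ancestor(fields, name, place, year):
    """Hypothetical test: same name, matching place, year inside the period."""
    start, end = (int(y) for y in fields.get("genealogy.period", "0-0").split("-"))
    return (
        fields.get("genealogy.person", "").lower() == name.lower()
        and place.lower() in fields.get("genealogy.place", "").lower()
        and start <= year <= end
    )


# Invented page using made-up tag names.
page = """<html><head>
<meta name="genealogy.person" content="Richard Poor">
<meta name="genealogy.place" content="Boston, Massachusetts">
<meta name="genealogy.period" content="1700-1760">
<meta name="genealogy.record-type" content="will">
</head><body>The will of Richard Poor names his son.</body></html>"""

fields = read_fields(page)
print(fields["genealogy.record-type"])
print(could_be_ancestor(fields, "Richard Poor", "Boston", 1735))
```

Because the page itself declares that "Richard Poor" is a person and that the record is a will, a program no longer has to guess whether the words refer to an ancestor, an almanac, or a line from a play.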

 

©Copyright 2006-8 by Alan E. Mann and Intellectual Reserve, Inc. All rights reserved. Written permission to reproduce all or part of this syllabus material in any format, including photocopying, data retrieval, or the Internet, must be secured in advance from the copyright holders.