Alan E. Mann, A.G.
Accredited Genealogist
fhfair@alanmann.com
Granger West Stake Family History Fair
April 2004

Genealogy Metasearch Tools

I often tell people the Internet is the richest source of genealogical
information available today. The amount, scope, and availability of data are
staggering, even incomprehensible. It is virtually certain that there is valid
information about your ancestors on the Internet that you don’t have,
information that you would probably want if you only knew it was there. So how
can you find it? With a lot of searching. This session looks at tools that
implement a concept called metasearching.

What is metasearch? The term does not yet appear in most dictionaries, but it
is common on the web, where it most often describes an Internet metasearch
engine. The general idea is that you submit keywords in its search box, and it
then transmits your search to several individual search engines simultaneously.
Within a few seconds, you get back results that came from several search
engines. Metasearch engines do not have their own index or database of Web
pages; they send your search terms to the indexes kept by search engine
companies, then combine the results.
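
To make the idea concrete, here is a minimal sketch in Python of what a metasearch does: it sends one keyword to several engines at the same time and combines what comes back. The engine names, the sample pages, and the function names are all made up for illustration; a real metasearch engine would query real search engines over the Internet rather than the small in-memory lists assumed here.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for independent search engines. Each one maps a
# keyword to the pages its own index knows about; a real engine would be
# queried over the network instead.
ENGINE_INDEXES = {
    "engine_a": {"mann": ["a.example/mann-family", "a.example/mann-census"]},
    "engine_b": {"mann": ["b.example/mann-wills"]},
    "engine_c": {"smith": ["c.example/smith-bible"]},
}


def query_engine(engine_name, keyword):
    """Ask one engine for its hits; the metasearch keeps no index of its own."""
    hits = ENGINE_INDEXES[engine_name].get(keyword.lower(), [])
    return [(engine_name, url) for url in hits]


def metasearch(keyword):
    """Send the same keyword to every engine at once and combine the results."""
    with ThreadPoolExecutor() as pool:
        per_engine = pool.map(lambda name: query_engine(name, keyword), ENGINE_INDEXES)
        combined = [hit for hits in per_engine for hit in hits]
    return combined


if __name__ == "__main__":
    for engine, url in metasearch("Mann"):
        print(engine, url)
```

Notice that the sketch keeps each result labeled with the engine it came from. Knowing where a result originated matters just as much in genealogy as it does in general web searching.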

While the term usually applies to Internet search engines used to find web
pages on a specific topic, this class extends the idea to genealogy web pages.
Generally speaking, a genealogy metasearch tool is something that searches
several databases or several web sites. Under this definition, a metasearch can
either search several databases on a single site or search several web sites
and combine the results. This class will look at examples of both types.

Single Site Metasearches

Here, we can talk about Ancestry, Heritage Quest Online, FamilySearch,
RootsWeb, or a variety of other web sites. These are sites that have many
databases, along with a single search that looks through all of them and
presents the combined results.

USGenWeb. The simplest example is the USGenWeb Archive. Here, there are hundreds of thousands of
files representing extracts, transcriptions, abstracts, and indexes to many
millions of names. The site search engine allows you to search all of their
files at once or all of the files for any single state. However, the search
options are extremely limited. Basically, you can search for any word in any of
the files selected, but you cannot specify whether the word is a name, place,
relationship, or something else. This is called a freeform, general, or unfielded search. While it does offer the advantage of
searching many things at once, it doesn’t give much flexibility to limit or
narrow the search results.
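
The difference between an unfielded and a fielded search can be sketched in a few lines of Python. The two sample records and their field names (name, place, text) are invented for illustration and do not come from any actual USGenWeb file.

```python
# Hypothetical transcribed records; the field names are illustrative only.
RECORDS = [
    {"name": "George Washington", "place": "Virginia", "text": "will of George Washington"},
    {"name": "Mary Ball", "place": "Washington County", "text": "marriage of Mary Ball"},
]


def unfielded_search(records, word):
    """Match the word anywhere in a record, as a freeform search does."""
    word = word.lower()
    return [r for r in records if any(word in value.lower() for value in r.values())]


def fielded_search(records, field, word):
    """Match the word only in the named field."""
    word = word.lower()
    return [r for r in records if word in r[field].lower()]


if __name__ == "__main__":
    print(len(unfielded_search(RECORDS, "Washington")))        # matches as a name or a place
    print(len(fielded_search(RECORDS, "name", "Washington")))  # matches as a name only
```

In this sketch the unfielded search for "Washington" returns both records, because it cannot tell the person from the county, while the fielded search on the name field returns only the first.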

Ancestry. This is a site with many different databases. Its default search
runs across those databases: census, wills, family history books, obituaries,
and more. When you go to the site, the search has several different fields. You
search by name, but can add country, state or province, year range, keyword,
and record type. You can also specify whether to use Soundex or exact spelling.
Specifying a record type within the search form does not change the search
template. But if you select a record type from the list at the right, the
search template changes, and you will then probably be able to specify only
name and keyword. You will also get a list of databases so that you can further
restrict your search. The general policy for Ancestry is to search the database
for the items specified, but to ignore any input fields that don’t apply to
that database. For example, if you specify a range of years, but the database
being searched doesn’t record years, Ancestry’s metasearch will just ignore
the year range and display any results from that database.
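
The "ignore what doesn't apply" policy described above can be sketched as follows. This is only an illustration of the policy as I have described it, not Ancestry's actual code; the database names, fields, and records are hypothetical.

```python
# Hypothetical databases: each lists the fields it supports and a few records.
DATABASES = {
    "census_1880": {
        "fields": {"surname", "year"},
        "records": [{"surname": "Mann", "year": 1880}, {"surname": "Mann", "year": 1870}],
    },
    "family_histories": {
        "fields": {"surname"},  # no year is recorded in this database
        "records": [{"surname": "Mann"}, {"surname": "Smith"}],
    },
}


def search_ignoring_inapplicable(databases, surname, year_range=None):
    """Apply each search criterion only where the database has that field."""
    results = []
    for name, db in databases.items():
        for record in db["records"]:
            if record["surname"] != surname:
                continue
            # The year range is checked only if this database records years.
            if year_range and "year" in db["fields"]:
                low, high = year_range
                if not low <= record["year"] <= high:
                    continue
            results.append((name, record))
    return results


if __name__ == "__main__":
    for hit in search_ignoring_inapplicable(DATABASES, "Mann", (1875, 1885)):
        print(hit)
```

With a year range of 1875 to 1885, the sketch drops the 1870 census entry but still returns the family history entry, because that database has no year to test against. The practical lesson is that some results in such a metasearch may fall outside the limits you entered.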

FamilySearch. This site lists the databases that it searches along the left.
The default is “all resources,” which is the metasearch. You can also select a
specific database to search. When you choose a specific database, you get
additional search options. FamilySearch’s general policy is to restrict your
search fields to just those fields that are common to the databases being
searched. Thus, when you do an all resources search, you have a certain search
template. When you select census records, you get some different options. When
you select a specific census to search, you get still more options that are
unique to that census. There are some exceptions, however. For example, the web
site search disregards everything you enter except surname (which makes this
search nearly useless, except for unusual surnames).
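
The "common fields only" policy amounts to taking the intersection of the field lists of whichever databases are selected. The sketch below illustrates that idea; it is not FamilySearch's actual code, and the database names and field lists are hypothetical.

```python
# Hypothetical field lists for three databases; the names are illustrative.
DATABASE_FIELDS = {
    "census_1880": {"first_name", "last_name", "birth_year", "birthplace", "race"},
    "census_1881": {"first_name", "last_name", "birth_year", "birthplace"},
    "pedigree_files": {"first_name", "last_name", "spouse"},
}


def search_template(selected):
    """Return the fields shared by every selected database, sorted for display.

    Assumes at least one database is selected.
    """
    field_sets = [DATABASE_FIELDS[name] for name in selected]
    return sorted(set.intersection(*field_sets))


if __name__ == "__main__":
    print(search_template(list(DATABASE_FIELDS)))            # all resources
    print(search_template(["census_1880", "census_1881"]))   # census records only
    print(search_template(["census_1880"]))                  # one specific census
```

As the selection narrows from all resources, to census records, to a single census, the template in the sketch grows from two fields to four to five. That is the same pattern you see on the site.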

Heritage Quest Online. This site has some very useful ways of grouping results
and actually has the most flexible searches. It is less of a metasearch than
the other sites listed here because it has three categories and no single
search that covers all three. Nonetheless, there are thousands of databases
searched within a category.

Multiple Site Metasearch Tools

Perhaps the best example of a multiple-site metasearch is Internet Family
Finder (www.genealogy.com/ifftop.html). This searches over 300,000 separate
family history databases. Unfortunately, the search has no true fields other
than first and last name, but those have been well identified, making it much
more useful than USGenWeb’s text-only search.

Another example is more of a tool than a search: Charles Culman’s MultiGen.
Located at http://ourworld.compuserve.com/homepages/CACulman/MultiGen.htm,
this site allows you to enter a name once, and then submit a search request to
ten different genealogy sites at once. If you select the “open new window”
option and then click on “Search them all,” you will get ten windows with the
separate results from each of the ten sites. While not a true metasearch
because it doesn’t combine the results, it does save you some time by
conducting several searches at once.
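
A tool of this kind can be sketched as a set of URL templates filled in from a single name entry. The sketch below is an assumption about how such a tool might work, not a description of MultiGen itself; the site addresses and parameter names are placeholders, since every real site uses its own.

```python
from urllib.parse import urlencode

# Hypothetical search endpoints and parameter names; real sites each use
# their own URLs and query parameters.
SITE_TEMPLATES = {
    "site_one": ("https://one.example/search", "first", "last"),
    "site_two": ("https://two.example/find", "given", "surname"),
}


def build_search_urls(first_name, last_name):
    """Build one search URL per site from a single name entry."""
    urls = {}
    for site, (base, first_key, last_key) in SITE_TEMPLATES.items():
        query = urlencode({first_key: first_name, last_key: last_name})
        urls[site] = f"{base}?{query}"
    return urls


if __name__ == "__main__":
    for site, url in build_search_urls("Alan", "Mann").items():
        print(site, url)
```

Each URL could then be opened in its own browser window, for example with Python's standard webbrowser.open_new function. As with MultiGen, the results stay separate; nothing here combines them.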

Other Information about Metasearching

One important thing to know about metasearch tools is that the quality of
their results depends on what they search and how they organize the results. A
metasearch cannot be better than the sum of the individual databases it
queries.

There are some good general web metasearch engines which are not designed as
genealogical search tools, but which can be used to search for genealogy or
genealogically related topics. Let’s take a quick look at them.
Three of the most distinctive are Kartoo (www.kartoo.com), ZapMeta
(www.zapmeta.com), and Vivisimo (www.vivisimo.com). These find results and try
to group them by common terms found on the result pages. This can be very
helpful, and ZapMeta offers some extra features (turn snapshots on). Try them
out! For more general Internet search tools, see
http://searchenginewatch.com/links/article.php/2156241.

A good Internet metasearch engine searches good databases, accepts complex
searches, integrates results well, eliminates duplicates, and offers additional
features such as clustering by subjects within your search results.
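
Two of those features, eliminating duplicates and clustering, can be sketched briefly. The sample results are invented, and the rules used here are deliberately crude assumptions for illustration: two addresses count as the same page if they differ only in letter case or a trailing slash, and results are clustered by whole words found in their titles. Real engines use more careful methods.

```python
from collections import defaultdict

# Hypothetical combined results from several engines: (url, title) pairs.
RESULTS = [
    ("http://a.example/mann", "Mann family census records"),
    ("http://A.example/mann/", "Mann family census records"),
    ("http://b.example/wills", "Mann wills and probate"),
    ("http://c.example/census", "Utah census index"),
]


def normalize(url):
    """Treat URLs that differ only in case or a trailing slash as the same page."""
    return url.lower().rstrip("/")


def eliminate_duplicates(results):
    """Keep the first result seen for each normalized URL."""
    seen = {}
    for url, title in results:
        seen.setdefault(normalize(url), (url, title))
    return list(seen.values())


def cluster_by_term(results, terms):
    """Group result titles under each term that appears as a word in them."""
    clusters = defaultdict(list)
    for _, title in results:
        words = title.lower().split()
        for term in terms:
            if term in words:
                clusters[term].append(title)
    return dict(clusters)


if __name__ == "__main__":
    unique = eliminate_duplicates(RESULTS)
    print(len(unique), "unique results")
    print(cluster_by_term(unique, ["census", "wills"]))
```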

Don’t forget the need to know what you are searching: what’s the scope, what’s
the source, and how do you use it?

© Copyright 2004 by Alan E. Mann. All rights reserved. Written permission to
reproduce all or part of this syllabus material in any format, including
photocopying, data retrieval or the Internet, must be secured in advance from
the copyright holder.