WWW Search Engines

Objectives:

  1. Learn about WWW Search Engines
    1. AltaVista
    2. Excite
    3. HotBot
    4. Infoseek
    5. Lycos
    6. WebCrawler
    7. Yahoo
    8. Google
  2. Learn about Meta-searchers
    Meta-searchers search many indexes at once.

    1. MetaCrawler
    2. search.com

AltaVista

http://www.altavista.com/

Provided by: Digital Equipment Corporation
Browsable index: No
Search capabilities: Yes
Notable features: Size of the index, speed of retrieval
Help and FAQ: No, for the product. Yes, for simple and advanced search.
AltaVista indexes millions of Web pages as well as the text on those pages. It also provides access to thousands of Usenet newsgroups. Boolean AND/OR/NOT searching is supported, as well as phrase, proximity, truncation, and field searching. AltaVista returns results in order of relevance but does not display relevance scores. Its two detailed searching FAQs are essential reading for exploiting the full power of the system. It also offers LiveTopics, which takes a retrieved results page and presents grouped tables of related terms that the user can add to or remove from the search.
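The Boolean and truncation searching described above can be illustrated with a small sketch. This is not AltaVista's implementation, which the text does not describe; it assumes a toy inverted index, a made-up three-page collection, and a trailing `*` as the truncation marker. The names `PAGES`, `build_index`, and `lookup` are hypothetical.

```python
from collections import defaultdict

# Hypothetical three-page collection; the IDs and text are made up for illustration.
PAGES = {
    1: "search engines index web pages",
    2: "usenet newsgroups carry discussion threads",
    3: "web searching with boolean operators",
}

def build_index(pages):
    """Map each lowercase word to the set of page IDs that contain it."""
    index = defaultdict(set)
    for page_id, text in pages.items():
        for word in text.lower().split():
            index[word].add(page_id)
    return index

def lookup(index, term):
    """Return pages matching a term; a trailing '*' (assumed syntax) means prefix matching."""
    if term.endswith("*"):
        prefix = term[:-1]
        hits = set()
        for word, ids in index.items():
            if word.startswith(prefix):
                hits |= ids
        return hits
    return set(index.get(term, set()))

INDEX = build_index(PAGES)
ALL_PAGES = set(PAGES)

web_and_search = lookup(INDEX, "web") & lookup(INDEX, "search*")  # web AND search*
web_or_usenet = lookup(INDEX, "web") | lookup(INDEX, "usenet")    # web OR usenet
not_web = ALL_PAGES - lookup(INDEX, "web")                         # NOT web
```

In this sketch AND, OR, and NOT map directly onto set intersection, union, and difference, which is why Boolean queries are cheap to evaluate once an index exists.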
 

Excite

http://www.excite.com/

Provided by: Excite, Inc.
Browsable index: Yes
Search capabilities: Yes
Notable features: Size of index, concept-based searching
Help and FAQ: Yes, for the product and the searching system
Excite indexes millions of Web pages and Usenet news articles. Its Intelligent Concept Extraction (ICE) search engine is built around concept searching but also supports Boolean AND/OR/NOT and phrase searching. Users may search by “concept” or by keyword. Concept-based searching is easier and can retrieve more items, but it sometimes returns sites that are unrelated to the query. Excite offers relevance feedback: it marks each result with a colored icon that reflects its level of relevance and provides a percentage confidence rating. Clicking the relevance icon next to an item returns similar items (query by example). Results can also be grouped by site, with sites listed hierarchically. Users should read the searching FAQs carefully to use the system to its fullest advantage. Excite also contains a browsable subject index in 14 major categories.
 

HotBot

http://hotbot.lycos.com/

Provided by: Inktomi and the HotWired Network
Browsable index: No
Search capabilities: Yes
Notable features: Size of the index, speed of retrieval
Help and FAQ: Yes, for the product and the searching system
HotBot indexes millions of documents on the Web and Usenet news. Through a forms-based interface, it allows Boolean AND/OR/NOT and phrase searching. In “open all” mode, it also supports field searching. Results are returned in order of relevance, with each relevance score shown as a percentage.
 

Infoseek Ultraseek

http://www.go.com/

Provided by: Infoseek Corporation
Browsable index: Yes
Search capabilities: Yes
Notable features: Size of index, speed of retrieval
Help and FAQ: Yes, for the product. Yes, for the searching system.
Ultraseek indexes the full text of millions of pages. It supports Boolean AND/OR/NOT and phrase searching, as well as field searching in four categories (link, site, url, and title). Results are returned in order of relevance, with each relevance score shown as a percentage. Infoseek also contains a browsable subject index in 13 major categories.
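Field searching over the four categories named above (link, site, url, and title) can be sketched as follows. The `field:value` query form, the substring matching rule, and the sample records are all assumptions made for illustration; the text does not specify Infoseek's actual query syntax or matching behavior. `PAGES`, `matches`, and `field_search` are hypothetical names.

```python
from urllib.parse import urlparse

# Hypothetical page records; every URL, title, and link below is invented.
PAGES = [
    {"url": "http://www.example.edu/library/index.html",
     "title": "Library Catalog",
     "links": ["http://www.example.org/"]},
    {"url": "http://www.example.com/news/today.html",
     "title": "Daily News",
     "links": ["http://www.example.edu/library/index.html"]},
]

def matches(page, field, value):
    """Test one page against a single field restriction (case-insensitive)."""
    value = value.lower()
    if field == "title":
        return value in page["title"].lower()
    if field == "url":
        return value in page["url"].lower()
    if field == "site":
        # Compare against the host name only, not the full URL.
        return (urlparse(page["url"]).hostname or "").endswith(value)
    if field == "link":
        return any(value in link.lower() for link in page["links"])
    raise ValueError(f"unknown field: {field}")

def field_search(pages, query):
    """Run a query of the assumed form 'field:value' and return matching URLs."""
    field, _, value = query.partition(":")
    return [p["url"] for p in pages if matches(p, field.strip().lower(), value.strip())]
```

For example, `field_search(PAGES, "site:example.edu")` restricts results to pages hosted on that site, while `field_search(PAGES, "link:example.edu")` finds pages that link to it.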
 

Lycos

http://www.lycos.com/

Provided by: Lycos, Inc.
Browsable index: Yes
Search capabilities: Yes
Notable feature: Size of index
Help and FAQ: Yes, for the product and the searching system
Originally provided by Carnegie Mellon University, this site is now maintained by Lycos, Inc. It supports Boolean AND/OR/NOT and truncation searching as well as relevance feedback, and it lets the user control the level and amount of feedback, as well as the level of relevance. Each result carries an annotation drawn from the page itself. Lycos also maintains two browsable subject indexes: the a2z Guide and the well-known Point Top 5% directory of Web sites.
 

WebCrawler

http://www.webcrawler.com/

Provided by: Excite, Inc.
Browsable index: Yes
Search capabilities: Yes
Notable feature: Quick and easy page locations
Help and FAQ: Yes, for the product and the searching system
Originally provided by the University of Washington, then by America Online, this site is now maintained by Excite, Inc. It allows phrase, Boolean AND/OR/NOT, and proximity searching. Relevance feedback is available (if you select “Show Summaries”), as is a short summary taken from the page itself. It is a good basic tool for a “quick and dirty” search. WebCrawler also contains a browsable subject index in 18 major categories.
 

Yahoo

http://www.yahoo.com/

 

Provided by: Yahoo, Inc.
Browsable index: Yes
Search capabilities: Yes
Help and FAQ: Yes, for the product. Yes, for the searching system.
Originally written by two graduate students at Stanford, Yahoo is now provided by Yahoo, Inc. The service consists of a subject catalog divided into 14 top-level categories and hundreds of sub-categories. Yahoo shows how many links are under each category and flags categories where new links have been added recently. Short annotations are provided. Yahoo is very large and well known, and is the first place many people go for a subject catalog. A tip for using Yahoo: because there are hundreds of categories, finding a specific topic can be difficult if you don’t know where it has been filed. It may therefore help to run a quick search first to find where your subject is filed, and then go to that category and browse it.
 

MetaCrawler

http://www.metacrawler.com/

Provided by: go2net, Inc.
Browsable index: No
Search capabilities: Yes
Notable features: Searches multiple indexes simultaneously
Help and FAQ: Yes, for the product and the searching system.
With a single search request, MetaCrawler searches six search engines: AltaVista, Excite, Infoseek, Lycos, WebCrawler, and Yahoo. It supports Boolean AND/OR and phrase searching. MetaCrawler collects confidence scores from each of the search engines used, combines them, and presents the search results in order of relevance based on the combined confidence score. It does not, however, return the individual confidence scores. MetaCrawler allows the user to focus the search by geographic region and by selected Internet domain type, e.g. “com,” “edu,” and “gov.” It also lets the user limit the time spent searching and set the number of results returned per source.
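The score-combining step described above can be sketched in a few lines. The text does not say how MetaCrawler actually combines confidence scores, so the scheme here is an assumption: each engine's scores are scaled so that its best hit counts as 1.0, and scaled scores are summed for URLs that several engines return. The function name `merge_results` and all sample URLs and scores are invented.

```python
from collections import defaultdict

def merge_results(results_by_engine):
    """Combine per-engine (url, score) lists into one ranked list of URLs.

    Assumed scheme: scale each engine's scores so its best hit is 1.0,
    then sum the scaled scores for URLs returned by more than one engine.
    """
    combined = defaultdict(float)
    for engine, hits in results_by_engine.items():
        if not hits:
            continue
        top = max(score for _, score in hits)
        if top <= 0:
            continue
        for url, score in hits:
            combined[url] += score / top
    # Highest combined score first; ties broken alphabetically by URL.
    return sorted(combined, key=lambda url: (-combined[url], url))

# Invented sample data: the engine names are real, the URLs and scores are not.
SAMPLE = {
    "AltaVista": [("http://a.example/", 900), ("http://b.example/", 450)],
    "Excite":    [("http://b.example/", 80), ("http://c.example/", 40)],
}
RANKED = merge_results(SAMPLE)
```

Scaling before summing matters because engines report scores on different scales; without it, the engine with the largest raw numbers would dominate the merged ranking. Like the service described above, the sketch returns only the ordered list, not the individual scores.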
 

search.com

http://www.search.com

Provided by: c|net, inc.
Browsable index: Yes, of search engines
Search capabilities: Yes
Notable features: The number of search engines available. User can customize a page of favorite search engines. Brief annotations and searching tips for each search engine are provided.
Help and FAQ: Yes, for the product. Yes, for the searching system. A Boolean searching primer is also provided.
Search.com provides the user with direct access to hundreds of search engines. The engines are organized into over 25 subject categories, allowing the user to narrow a search by selecting engines that specialize in general topics such as art, science, health, news, sports, or entertainment. Each engine is accompanied by a short annotation, as well as one or two searching tips for that engine. The best engines, as determined by search.com, are indicated by a “top pick” icon. The service lets the user create a personalized page of useful engines that appears each time that user opens search.com. Each subject hierarchy can be searched across selected search engines. The number of engines available, as well as the organization of the site, makes it a valuable Internet searching tool.
 

google.com

http://www.google.com

Provided by: Google Inc., founded in 1998 by Larry Page and Sergey Brin, two Stanford Ph.D. candidates who developed a technologically advanced method for finding information on the Internet.
Browsable index:
Search capabilities: Yes
Notable features: PageRank™ technology and hypertext-matching analysis developed by Larry Page and Sergey Brin.
Help and FAQ: Yes, for the product. Yes, for the searching system. Many help topics are covered.
Introduction – Google runs on a unique combination of advanced hardware and software. The speed you experience can be attributed in part to the efficiency of our search algorithm and in part to the thousands of low-cost PCs we’ve networked together to create a superfast search engine. The heart of our software is PageRank™, a system for ranking web pages developed by our founders Larry Page and Sergey Brin at Stanford University. And while we have dozens of engineers working to improve every aspect of Google on a daily basis, PageRank continues to provide the basis for all of our web search tools.
PageRank Explained – PageRank relies on the uniquely democratic nature of the web by using its vast link structure as an indicator of an individual page’s value. In essence, Google interprets a link from page A to page B as a vote, by page A, for page B. But Google looks at more than the sheer volume of votes, or links, a page receives; it also analyzes the page that casts the vote. Votes cast by pages that are themselves “important” weigh more heavily and help to make other pages “important.” Important, high-quality sites receive a higher PageRank, which Google remembers each time it conducts a search. Of course, important pages mean nothing to you if they don’t match your query. So Google combines PageRank with sophisticated text-matching techniques to find pages that are both important and relevant to your search. Google goes far beyond the number of times a term appears on a page and examines all aspects of the page’s content (and the content of the pages linking to it) to determine if it’s a good match for your query.
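The voting idea described above can be sketched as a simple iterative calculation. This is an illustrative simplification, not Google's production system. The damping factor of 0.85, the iteration count, and the handling of pages with no outgoing links are assumptions that do not come from the text, and the four-page link graph is invented.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Estimate PageRank by repeatedly passing votes along links.

    links maps each page to the list of pages it links to; every
    link target must also appear as a key.
    """
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page starts each round with a small baseline share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outlinks in links.items():
            if outlinks:
                # A page splits its vote evenly among the pages it links to.
                share = damping * rank[page] / len(outlinks)
                for target in outlinks:
                    new_rank[target] += share
            else:
                # Assumed rule: a page with no links spreads its vote over all pages.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank

# Hypothetical four-page web: A, B, and C all link to D; D links back to A.
LINKS = {"A": ["D"], "B": ["D"], "C": ["D"], "D": ["A"]}
RANKS = pagerank(LINKS)
```

In this toy graph D ranks highest because it receives three votes. A receives only one vote, but it comes from the highly ranked D, so A ends up well above B and C, which illustrates the point that votes cast by “important” pages weigh more heavily.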
Copyright Internet Scout Project, 1994-1998. http://scout.cs.wisc.edu/
