Using the Internet for College Research
Taming the Search Engine Monster

Keywords - great, now you only have 200,000 URLs to check.

Boolean Logic - If you are Mr. Spock, you can narrow it down to a couple of hundred.

Search Sites' Shocking Secret - They Stink. They're Getting Worse. And It's Deliberate.
Don't just take my word for it; check this editorial from ZDNet.

    (from How to Search the Web, A Guide To Search Tools by Terry A. Gray)

    * Enter as many precise search terms or phrases (if allowed) as possible in order to limit the search. The biggest problem is noise. That is, irrelevant or inconsequential sites among the jewels. Use of the required/prohibited term operator (prepending +/-) helps in reducing noise: +radio* -radiology

    * Enter singular terms. Most search engines will find the substring and return rivers for river. To generalize a subject, use wildcards where allowed (surg* for surgery, surgeries, surgical).

    * Do not use common, generic search terms, or if you must, include them in a phrase with more specific terms. The term book would be far too generic unless it was part of a phrase like "book binding."

    * Enter multiple spellings where appropriate: Khaddafi Quadafy Kaddafi Qadaffi... If you know the correct spelling, using synonyms will broaden a search that is returning too little.

    * Use booleans and especially proximity operators to increase the relevancy of your hits. Where allowed, (Altavista, for example) you may control relevancy based on search terms. Including words in phrases with some search engines (Infoseek) is the same as using proximity operators. Use the adjacency operator where word order is important. Webcrawler has the best implementation of proximity and adjacency operators.

    * Most of all, be persistent and creative. It's a big web out there. The search tools are wonderful but far from complete. Be prepared to supply the ingenuity to make the most of their features.



Boolean AND: Narrows your search to include documents that contain BOTH keywords.

* Al AND Gore

Boolean OR: Broadens your search to include ANY of the keywords.

* Use for alternative spellings such as Chanukah OR Hanukkah.
* Use for common misspellings such as Klu Klux Klan OR Ku Klux Klan.
* Some systems assume OR, so religious beliefs may be treated as religious OR beliefs.

Boolean NOT: Narrows search by excluding one meaning of a word.

* cowboys BUT NOT Dallas
* Gold Rush AND NOT Alaska
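
The three Boolean operators behave like set operations, which a few lines of Python can show. This is only a sketch to illustrate the idea, not how any particular engine works: the five sample pages and the helper pages_with are made up for the example.

```python
# Hypothetical mini-web: page name -> page text (invented for illustration).
PAGES = {
    "p1": "al gore speaks on climate",
    "p2": "al franken comedy",
    "p3": "gore vidal novels",
    "p4": "dallas cowboys schedule",
    "p5": "cowboys of the old west",
}

def pages_with(word):
    """Return the set of page names whose text contains the word."""
    return {name for name, text in PAGES.items() if word.lower() in text.split()}

# AND narrows: pages must contain BOTH words (set intersection).
al_and_gore = pages_with("al") & pages_with("gore")

# OR broadens: pages may contain EITHER word (set union).
al_or_gore = pages_with("al") | pages_with("gore")

# NOT excludes: pages with the first word, minus pages with the second.
cowboys_not_dallas = pages_with("cowboys") - pages_with("dallas")

print(sorted(al_and_gore))         # ['p1']
print(sorted(al_or_gore))          # ['p1', 'p2', 'p3']
print(sorted(cowboys_not_dallas))  # ['p5']
```

Notice that AND returns the fewest pages and OR the most, which is why AND and NOT are the tools for cutting down a long list of hits.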

Nesting: By combining Boolean operators with parentheses, you can perform multiple tasks at once.

* Saturn AND (car OR automobile) is useful for synonyms
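
Why the parentheses matter can be sketched in Python. The sample pages and the pages_with helper are invented for this illustration, and the "unnested" reading assumes an engine that applies AND before OR, which is an assumption rather than a rule every engine follows.

```python
# Hypothetical sample pages (invented for illustration).
PAGES = {
    "p1": "saturn car dealers near you",
    "p2": "saturn automobile reviews",
    "p3": "saturn rings and moons",
    "p4": "used car prices",
    "p5": "antique automobile show",
}

def pages_with(word):
    """Return the set of page names whose text contains the word."""
    return {name for name, text in PAGES.items() if word in text.split()}

# The part inside the parentheses is worked out first.
synonyms = pages_with("car") | pages_with("automobile")  # (car OR automobile)
nested = pages_with("saturn") & synonyms                  # Saturn AND (...)

# Assumed alternative reading without parentheses:
# (Saturn AND car) OR automobile, which lets in any automobile page.
unnested = (pages_with("saturn") & pages_with("car")) | pages_with("automobile")

print(sorted(nested))    # ['p1', 'p2']
print(sorted(unnested))  # ['p1', 'p2', 'p5']
```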

Truncation: Searches on the root of a word and matches its different endings or plurals.

* Educat* searches educator, education, educational, educated...
* Some engines truncate automatically, so tribe may also retrieve tribes and tribal.
* Other engines recognize that the plural tribes should also retrieve the variants tribe and tribal.
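
A rough Python sketch of truncation with a trailing asterisk follows. The word list stands in for an engine's index and is made up for the example, as is the function name truncation_match; real engines may handle wildcards differently.

```python
# Hypothetical word list standing in for an engine's index.
INDEX_WORDS = ["educator", "education", "educational", "educated",
               "edit", "tribe", "tribes", "tribal"]

def truncation_match(pattern, words):
    """Match a pattern ending in * against every word that starts with the root."""
    if pattern.endswith("*"):
        root = pattern[:-1].lower()
        return [w for w in words if w.lower().startswith(root)]
    # No wildcard: only an exact match counts.
    return [w for w in words if w.lower() == pattern.lower()]

print(truncation_match("Educat*", INDEX_WORDS))
# ['educator', 'education', 'educational', 'educated']
print(truncation_match("trib*", INDEX_WORDS))
# ['tribe', 'tribes', 'tribal']
print(truncation_match("tribe", INDEX_WORDS))
# ['tribe']
```

The last call shows what happens on an engine that does not truncate automatically: the singular finds only itself.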

Controls: By adding + or - in front of a word you are saying that the word MUST or MUST NOT be included in the "hits," another name for the results of your search.

* Pocahontas -Disney (information about the woman NOT discussing the Disney movie)
* Pocahontas +Disney (information about the woman in the Disney movie)
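
The + and - controls can be sketched as a simple filter. The three pages and the search function below are invented for illustration, and treating a word with no prefix as required is an assumption; engines differ on that point.

```python
# Hypothetical sample pages (invented for illustration).
PAGES = {
    "p1": "pocahontas and the jamestown colony",
    "p2": "disney pocahontas movie soundtrack",
    "p3": "disney theme parks",
}

def search(query):
    """Keep pages containing every +word and none of the -words.

    Words without a prefix are treated as required here, which is an
    assumption; real engines differ on this point.
    """
    required, prohibited = set(), set()
    for term in query.lower().split():
        if term.startswith("-"):
            prohibited.add(term[1:])
        else:
            required.add(term.lstrip("+"))
    hits = []
    for name, text in PAGES.items():
        words = set(text.split())
        # Every required word present, no prohibited word present.
        if required <= words and not (prohibited & words):
            hits.append(name)
    return hits

print(search("Pocahontas -Disney"))  # ['p1']
print(search("Pocahontas +Disney"))  # ['p2']
```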

Phrase: Searches a phrase or words that have a unique meaning when linked:

* "Wounded Knee" - you add the quotes
* (Westward Expansion) - you add the parentheses
* Michael FOLLOWED BY Jackson
* Some engines treat two words typed together, such as Bill Clinton, as Bill OR Clinton
* Engines may not recognize certain punctuation, so sex education returns hits on health curriculum and hits with the commonly used phrase "sex, education and income." In this case if you add NOT income to sex education your search will retrieve better results.
* Some engines drop common words or one-letter words within phrases, so a search for the phrase vitamin A becomes equivalent to searching for vitamin, and a search for New Orleans becomes Orleans.
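
The difference between a true phrase search and the OR reading some engines apply can be sketched as follows. Both functions and both sample sentences are made up for the example; this is an illustration of the idea, not any engine's actual method.

```python
def contains_phrase(text, phrase):
    """True if the words of phrase appear in text consecutively and in order."""
    words = text.lower().split()
    target = phrase.lower().split()
    n = len(target)
    return any(words[i:i + n] == target for i in range(len(words) - n + 1))

def contains_any(text, phrase):
    """True if text contains at least one word of phrase (the OR reading)."""
    words = set(text.lower().split())
    return any(w in words for w in phrase.lower().split())

page_a = "the massacre at wounded knee in 1890"
page_b = "he wounded his knee playing football"

print(contains_phrase(page_a, "Wounded Knee"))  # True
print(contains_phrase(page_b, "Wounded Knee"))  # False
print(contains_any(page_b, "Wounded Knee"))     # True
```

The football page is exactly the kind of noise a phrase search screens out and an OR search lets through.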

Proximity: Searches for one word near another word.

* tribal gaming
* Indian NEAR casinos
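
A NEAR operator can be sketched by comparing word positions. The function name near, the sample sentences, and the window of five words are all assumptions made for this illustration; each engine that supports NEAR picks its own distance.

```python
def near(text, word_a, word_b, max_gap=5):
    """True if word_a and word_b occur within max_gap words of each other.

    The window size of 5 is an arbitrary choice for this sketch; engines
    that support NEAR each pick their own distance.
    """
    words = text.lower().split()
    pos_a = [i for i, w in enumerate(words) if w == word_a.lower()]
    pos_b = [i for i, w in enumerate(words) if w == word_b.lower()]
    return any(abs(i - j) <= max_gap for i in pos_a for j in pos_b)

close = "indian tribes open new casinos in the state"
far = ("indian food festival draws crowds downtown this weekend "
       "while tourists visit casinos")

print(near(close, "Indian", "casinos"))  # True
print(near(far, "Indian", "casinos"))    # False
```

Both sentences would satisfy Indian AND casinos; only the first satisfies Indian NEAR casinos.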

Case Sensitive: Most engines do not distinguish capital letters from lowercase letters.

* Newt and newt (the politician and the salamander) are treated identically
* AIDS and aids (the disease and the verb) are treated identically
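
Ignoring capitals usually comes down to lowercasing both the query and the page before comparing them. The sketch below uses a made-up matches function and two invented sentences to show the difference.

```python
def matches(query, text, case_sensitive=False):
    """Report whether query appears as a whole word in text."""
    if not case_sensitive:
        # Case folding: compare lowercased copies of both sides.
        query, text = query.lower(), text.lower()
    return query in text.split()

politician = "Newt proposed the bill"
salamander = "a newt lives in ponds"

# Case-insensitive (the common behavior): both pages match.
print(matches("Newt", politician), matches("Newt", salamander))  # True True

# Case-sensitive: only the capitalized page matches "Newt".
print(matches("Newt", politician, case_sensitive=True),
      matches("Newt", salamander, case_sensitive=True))          # True False
```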

Searching Specific Fields: Searches only specific parts of web pages, such as the words on the browser's title bar (the document's title) or the first heading.
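
Field searching can be pictured as looking in only one labeled part of each page. In this sketch the two pages, their field names, and the field_search function are all invented for illustration; the way you ask for a field varies from engine to engine.

```python
# Hypothetical pages split into fields (invented for illustration).
PAGES = [
    {"title": "Wounded Knee Massacre", "heading": "Background",
     "body": "events of 1890"},
    {"title": "Sports Injuries", "heading": "Knee pain",
     "body": "a wounded knee heals slowly"},
]

def field_search(field, word):
    """Return titles of pages whose given field contains the word."""
    word = word.lower()
    return [p["title"] for p in PAGES if word in p[field].lower().split()]

print(field_search("title", "knee"))  # ['Wounded Knee Massacre']
print(field_search("body", "knee"))   # ['Sports Injuries']
```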

Relevance: The engine calculates how well the hits match your search request and ranks them in order of relevance.

* Pages which have your keywords in the heading or first paragraph are ranked high.
* Pages in which your keywords appear frequently are ranked high.
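
The two ranking rules above can be sketched as a simple scoring function. Real engines keep their formulas to themselves, so everything here is an assumption: the sample pages, the bonus of 3 for a heading match, and the choice to model only the heading and not the first paragraph.

```python
# Hypothetical pages with a heading and body text (invented for illustration).
PAGES = [
    {"name": "p1", "heading": "Gold Rush history",
     "body": "the gold rush drew miners west"},
    {"name": "p2", "heading": "Mining towns",
     "body": "gold was found and gold prices rose"},
    {"name": "p3", "heading": "Travel tips",
     "body": "visit a gold mine"},
]

HEADING_BONUS = 3  # arbitrary weight chosen for this sketch

def score(page, keywords):
    """Count keyword occurrences in the body, with extra credit for the heading."""
    body = page["body"].lower().split()
    heading = page["heading"].lower().split()
    total = 0
    for kw in keywords:
        kw = kw.lower()
        total += body.count(kw)
        total += HEADING_BONUS * heading.count(kw)
    return total

def rank(keywords):
    """Return (page name, score) pairs ordered from most to least relevant."""
    ranked = sorted(PAGES, key=lambda p: score(p, keywords), reverse=True)
    return [(p["name"], score(p, keywords)) for p in ranked]

print(rank(["gold"]))  # [('p1', 4), ('p2', 2), ('p3', 1)]
```

Page p2 mentions gold more often in its body, but p1 wins because the keyword sits in its heading.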

Query by Example: The engine has an option of asking for similar pages when you find a good hit.

Natural Language: When it is hard for you to design your search precisely, some engines allow you to ask for information as if you were thinking aloud.

* I want to know about the treaties that Native Americans made when they went to reservations is treated as treaties AND Native Americans AND reservations.

Become a Search Engine Savant


Web site designed and maintained by Prof. Terry Dugas
Page last updated 10/10/2000. © 1997, W. Terry Dugas.