Tutorial: Information and Communication Skills—Searching

Back in 1998 or so I was working with an educational technologist named Alan November, an internationally-known technology speaker and writer. At one point, Alan had me write this article on searching the Internet (still a new thing at the time). For years the article was on Alan’s website, but then he redesigned his site in 2005 and my article evidently didn’t survive. This is a good thing, as it was sadly out of date by then—this was written just before Google, for instance, and while Northern Light was still open to the public—and is even more so today. Nonetheless, it’s one of the first things I ever wrote for money, and it contains some good advice sprinkled throughout. Consider this of historical interest if nothing else.

As we turn to the sorts of skills kids need in order to use the Internet effectively and usefully, it is important to stress that the Internet is only one source among many. It just isn’t practical to expect to find the answers to every question on the Internet. In particular, information that was created prior to the 1980s is pretty scarce on the Internet, mostly due to the fact that a lot of that information wasn’t digitized. And even if it was digitized, it’s not always in a format that is easily transferable to the Web.

More than that, certain kinds of information just aren’t on the Internet yet, and even when information is somewhere out there, you can’t always find it. That’s why it’s important to stress to kids that they leverage all the resources they have available to them, including the library, other teachers, and other experts in the community.

Searching

Let’s start with search engines, since they are used so widely by Internet users to find Web sites. It’s really easy for kids to learn the basic drill—they sit down, type in www.altavista.com or www.infoseek.com or www.yahoo.com, enter their search interest, and begin plowing through the 254,876 hits they get back.

The problem is, though, that this is too easy. Kids and teachers need to understand a few important things about search engines. They really need to. The Web is growing at an astounding rate—close to 50%, according to some studies (See Glave). As the Web continues to expand, it becomes harder and harder to find anything without resorting to AltaVista, Infoseek, and Yahoo, to name but a few.

Search Engines and Directories

Actually, grouping AltaVista, Infoseek, and Yahoo together like that really isn’t correct. AltaVista and Infoseek are search engines, while Yahoo is a directory. Search engines employ specialized software (called, variously, bots, spiders, and crawlers) to scan the Web for information on Web pages and store that information in gigantic databases. When you connect to a search engine and type in your search terms, you are really looking through that search engine’s database.

Since the Internet is so large, it is impossible for any one search engine to have the entire Web catalogued. Worse, the database the search engine uses can be weeks, even months or years, out of date. A Web site that contains exactly the perfect information you need might exist out there, but if it has only been available for a week, and your search engine’s database is one week and one day old, then you will not be able to find it.
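To make the idea of a search engine's database concrete, here is a minimal sketch in Python. It is only an illustration, not how any particular engine works: the page addresses and text are made up, and `build_index` is a hypothetical helper name. It shows why a page published after the last crawl cannot turn up in a search.

```python
import re
from collections import defaultdict

def build_index(pages):
    """Map each lowercased word to the set of page addresses containing it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(url)
    return index

# Hypothetical pages the spider had visited when the database was last built.
crawled = {
    "http://example.com/bach": "Bach wrote the Brandenburg Concertos",
    "http://example.com/mozart": "Mozart wrote concertos for piano",
}
index = build_index(crawled)

# A page published after the crawl is on the Web but not in the database.
new_page = "http://example.com/new-bach-site"

bach_hits = index["bach"]            # only the page that was crawled
concerto_hits = index["concertos"]   # both crawled pages
print(new_page in bach_hits)         # the newer page cannot be found
```

A search is a lookup in `index`, never a live scan of the Web, so the results are only as fresh as the last crawl.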

Directories like Yahoo share this same problem, but for a different reason. Where search engines automate the process by which they discover new sites on the Web, directories use humans to find and organize material. If someone wants to add a site to Yahoo’s database of Web sites, he or she must fill out an online form requesting that someone from Yahoo take a look at the site. If the Yahoo employee thinks the site is worthy, it is added. This can make adding new sites to the database used by the directory an even slower process than with search engines.

Directories also structure their information a bit differently. Many search engines simply have a text box where you can type your terms in and a button marked “Search.” Yahoo, however, makes information available through hierarchical categories. On Yahoo’s home page, you are presented with a list of 14 top-level categories, including Arts and Entertainment, Computers and Internet, Government, Health, and Reference. To use Yahoo to find out something about the great composer Bach, you would first choose Arts and Entertainment, then Performing Arts, then Music, then Genres, then Classical, then Composers, then Baroque, and finally you are presented with 27 different Web sites devoted to “Bach, Johann Sebastian (1685-1750).”

Now, you might think that you already had to know something about Bach in order to finally arrive at that final list of 27 Web sites. For instance, you had to know that Bach would be found in the Baroque period. That might not be known by everyone searching for information on the composer. In that case, Yahoo allows you to do a search within categories or subcategories. For instance, once you got to the Classical subcategory, you might find yourself stuck. Which way to turn now?

At the top of the page, though, there is a text box into which you could type “Bach,” and underneath it you are given two choices: you can search all of Yahoo, or you can search only in Classical. This arrangement is true no matter where you are within Yahoo. If you get stuck, you can always widen your search to include all of Yahoo’s database, or you can target your search in the area you wish. And if your search in all of Yahoo turns up nothing, Yahoo automatically forwards your search to AltaVista, which searches its much larger database for Web sites matching your query.
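A directory's categories can be pictured as a tree, and the two search choices as searching either one branch or the whole tree. The following Python sketch assumes a tiny, made-up slice of a directory: the category names follow the Bach path described above, the "Jazz" listing is invented for contrast, and `descend` and `search` are hypothetical helpers.

```python
SITES = "_sites"  # key holding the site titles listed directly in a category

# A made-up fragment of a directory tree, not Yahoo's real data.
directory = {
    "Arts and Entertainment": {
        "Performing Arts": {
            "Music": {
                "Genres": {
                    "Classical": {
                        "Composers": {
                            "Baroque": {SITES: ["Bach, Johann Sebastian (1685-1750)"]},
                        },
                    },
                    "Jazz": {SITES: ["Bach to Jazz (made-up listing)"]},
                },
            },
        },
    },
}

def descend(tree, path):
    """Follow a list of category names down from the top of the directory."""
    node = tree
    for name in path:
        node = node[name]
    return node

def search(node, term):
    """Collect matching site titles in this category and everything beneath it."""
    term = term.lower()
    hits = [site for site in node.get(SITES, []) if term in site.lower()]
    for name, child in node.items():
        if name != SITES:
            hits.extend(search(child, term))
    return hits

classical = descend(directory, ["Arts and Entertainment", "Performing Arts",
                                "Music", "Genres", "Classical"])
narrow = search(classical, "Bach")   # search only in Classical
wide = search(directory, "Bach")     # search the whole directory
```

Searching from `classical` skips the Jazz listing, while searching from the top of `directory` picks it up too.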

Saving Us From Our Own Mistakes

There are some more caveats we need to keep in mind as we use these wonderful tools. You might remember the old phrase, “Garbage in, garbage out,” which described a fact about programming: if you program poorly, you’re going to get poor results. The same thing holds true when we’re searching. Search engines can only work with the terms we give them. If we’re searching for the wrong thing, or if we’ve misspelled our terms, or if we don’t include as many terms as we should, we’re not going to get useful results.

For instance, if we want to find out about the Brandenburg Concertos of Johann Sebastian Bach, but we (a) search for “classical music,” (b) search for “John Sebasstian Back,” or (c) search just for “Bach,” we’re probably not going to be too happy with our results. Either we will get results that at best peripherally relate to the composer’s work, or we will get nothing at all, or we will find ourselves overwhelmed by hundreds of thousands of hits.

Some search engines are getting better at helping us find the things we want to find. The relatively new and powerful Northern Light offers one solution. Go to Northern Light and search for, say, just “Bach.” On the right side of the screen is the list of hits (122,265!) that is pretty much standard with any search engine. But the left side of the screen is different. Here those 122,265 results are organized into groups of results that Northern Light calls “Custom Search Folders.” After my search for Bach, my Custom Search Folders included the following: Commercial sites, Personal pages, www.jsbach.org, Educational sites, Cantata, Johann Sebastian Bach, Choral music, Record reviews, Documents in German, History, Industrial machinery.

It’s obvious that some of these would be of no use for me as I look for information on the Brandenburg Concertos: Industrial machinery has nothing whatsoever to do with my subject, Documents in German wouldn’t help because I don’t speak German, and Cantata is the wrong musical form. But it’s just as obvious that some of those would be tremendously helpful. I didn’t even know that www.jsbach.org existed, so that would probably be my first stop, along with Johann Sebastian Bach, Record reviews, and History. Northern Light takes what would otherwise be an overly-broad term and helps me narrow it down in a useful, transparent fashion, with absolutely no input required from the user.

Ask Jeeves is another new site that helps us with our queries. Users of Ask Jeeves pose questions using natural language. Instead of typing “Bach Brandenburg Concertos,” which can feel awkward to many people, Ask Jeeves allows you to ask, “Where can I find information about Bach’s Brandenburg Concertos?” It’s almost spooky how Ask Jeeves seems to really understand your questions and then points you towards pages that are usually exactly what you’re looking for.

After I posed my question above, Jeeves gave me four options for my further investigation:

Remember, I said “usually exactly what you’re looking for.” But I’m very happy with the three results I’ve been given, and that fourth one makes me laugh, so I’m OK with it too.

Here’s another great thing about Ask Jeeves, something that solves the spelling problem I raised above: next to the box in which you type your question is a checkbox labeled “Check my spelling.” Checking this box does just that. If I ask, “Where can I find information about Bach’s Brandenberg Conshertos?” Ask Jeeves politely responds with this: “I think you may have misspelled something. Did you mean: Where can I find information about Bach’s Brandenburg Concertos?”

What’s really useful is that both “Brandenburg” and “Concertos,” my two misspelled words, are really popup menus that give me other choices if Ask Jeeves’ suggestions aren’t correct. If I click on “Concertos,” for instance, I find that I can also choose from “Concerts,” “Consortium,” and “Concerns,” if those would more closely match what I’m seeking.
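I have no idea how Ask Jeeves actually checks spelling, but the general idea of offering the closest known words can be sketched with Python's standard `difflib` module. The word list and the `suggest` helper below are assumptions made for illustration only.

```python
import difflib

# A made-up vocabulary; a real spelling checker would use a full dictionary.
KNOWN_WORDS = ["bach", "brandenburg", "concertos", "concerts",
               "consortium", "concerns"]

def suggest(word, limit=3):
    """Return up to `limit` known words that look most like `word`, best first."""
    return difflib.get_close_matches(word.lower(), KNOWN_WORDS,
                                     n=limit, cutoff=0.6)

for typo in ["Brandenberg", "Conshertos"]:
    print(typo, "->", suggest(typo))
```

The first suggestion plays the role of Jeeves' "Did you mean," and the rest of the list corresponds to the other choices in the popup menu.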

Boolean and Other Operators

Northern Light and Ask Jeeves are two great tools we can use as we investigate topics on the Web, and they help solve several annoyances we find when we search the Net, but there are other tricks and tips that need to be taken into account as we use search engines and directories. These involve how we enter our search terms into the search engines.

First of all, anyone who uses a search engine needs to understand Boolean searching. Named for George Boole, the English mathematician who lived from 1815-64, this means of searching involves the use of three simple words, or operators: AND, OR, and NOT.

If you want to find out about Bach’s Brandenburg Concertos, and you want to make sure that the sites the search engine suggests to you contain the specific words “Bach Brandenburg Concertos,” then you would want to place the word AND between each of your three words (most search engines don’t care if you use capitals or lower case; I like to use capitals because it’s easier to distinguish between my search terms and my Boolean operators that way). It would look something like this: Bach AND Brandenburg AND Concertos. Any returns you will get will have those three words somewhere on the page.

Note, however, that the words are not necessarily together on the page. You could get back a link to a Web page created by George Bach of Brandenburg, Kentucky devoted to Mozart’s Concertos for Piano and Orchestra. Even so, AND can really help narrow down your search to pages specifically addressing your topic.

OR is useful when you’re searching for a variety of things that are somehow related to each other. For instance, if you wanted to search for information about either Bach or Mozart, you would place the word OR between your terms, like this: Bach OR Mozart. You know what the result of such a search would be, however: an unbelievably long list of hits (1,357,387 when I did such a search on Infoseek!). That’s why OR is often combined with AND to narrow down your results.

Let’s say you wanted to find out about Mozart’s concertos or Bach’s concertos. Using Boolean logic, you would search for the following: (Mozart OR Bach) AND Concerto. Just as in algebra, you have to use the parentheses in order to let the search engine know exactly what you’re searching for; otherwise, it could think you’re looking for pages about Mozart or for pages about Bach’s Concertos. With the addition of OR, your searches can really be tailored to your exact needs. For instance, imagine a search for (Mozart OR Bach) AND (Concerto OR Mass). That would manage to be both broad and focused at the same time.

There are other Boolean terms, but I’m only going to focus on one more, and it’s just as important as AND and OR. Use NOT to exclude things from your search. We’ve all searched for something, gotten way too many hits back, and then found that a consistent number of them had nothing to do with our topic, but came back in the search anyway because of some relation to our topic. For instance, let’s say you were interested in Bach’s concertos, but you most definitely did not want anything about the Brandenburg concertos to show up. NOT would be quite useful here. Just search for the following: Bach AND concerto NOT Brandenburg. That would narrow your search down nicely.

Of course, you can also combine the three Boolean terms together. Let’s say you wanted to find out about Bach’s concertos, but not the Brandenburg, and also Mozart’s concertos. Just search for this: (Bach OR Mozart) AND Concerto NOT Brandenburg. As you can see, with a little forethought and some practice, Boolean terms can really help you tremendously in your searching.
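One way to picture the three operators is as operations on sets of pages: AND is an intersection, OR is a union, and NOT is a difference. The Python sketch below uses four made-up pages, each reduced to the set of words it contains. It is an illustration of the logic, not of any search engine's internals, and `containing` is a hypothetical helper.

```python
# Four made-up pages, each reduced to the set of words it contains.
pages = {
    "bach-brandenburg": {"bach", "brandenburg", "concerto"},
    "bach-violin": {"bach", "violin", "concerto"},
    "mozart-piano": {"mozart", "piano", "concerto"},
    "mozart-mass": {"mozart", "mass"},
}

def containing(word):
    """Return the names of all pages that contain the given word."""
    return {name for name, words in pages.items() if word in words}

bach, mozart = containing("bach"), containing("mozart")
concerto, brandenburg = containing("concerto"), containing("brandenburg")

and_hits = bach & concerto                  # Bach AND Concerto
or_hits = bach | mozart                     # Bach OR Mozart
with_parens = (bach | mozart) & concerto    # (Bach OR Mozart) AND Concerto
without_parens = mozart | (bach & concerto) # Mozart OR (Bach AND Concerto)

# (Bach OR Mozart) AND Concerto NOT Brandenburg
combined = ((bach | mozart) & concerto) - brandenburg
```

Comparing `with_parens` and `without_parens` shows why the parentheses matter: without them, the Mozart page about the Mass slips into the results even though it never mentions a concerto.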

But which search engines support Boolean terms? Fortunately for us, all of the big six—AltaVista, Excite, HotBot, Infoseek, Lycos, and Yahoo—support the three Boolean search terms I discussed above, along with several that I didn’t discuss. So does Northern Light, another of my favorites.

To see if your favorite search engine supports Boolean operators, go to its home page and look for a link or button titled, “Search Help” or “Search Tips” or something like that. Click on it and see if it mentions any of the terms I’ve discussed above.

If you want to learn more about Boolean searching, I’d recommend starting at Keith Nichols’ "Boolean Operators Cheatsheet," bookmarking or printing that page out, and then going to Nichols’ fine article entitled "Master Online Searching." For more information on which search engines support which Boolean terms, take a look at Keith Nichols’ "Comparison of Search Engine Operators". (See Nichols)

Some search engines don’t require you to explicitly use Boolean terms when you use more than one word to look for Web pages about a specific topic. These search engines default to certain Boolean operators. If a search engine defaults to an implied AND, this means that if you type in “Bach Brandenburg Concertos,” the search engine treats it as though you really typed “Bach AND Brandenburg AND Concertos.” That’s a nicely limited search, and you didn’t have to type in those Boolean operators. Any time I don’t have to type more things into my computer, I’m happy.

On the other hand, think about the results you’d get if the search engine used an implied OR. Instead of your nicely limited search, you would have had a much larger pool of hits, and more of them undoubtedly would have been spurious.

So what search engines use which implied terms? Unfortunately, there’s no real standardization. Yahoo, HotBot, Lycos, and Northern Light use an implied AND. AltaVista, Excite, and Infoseek use an implied OR. The important thing to note here is that you need to familiarize yourself with the ways your search engine works.
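The difference between the two defaults can be sketched as a small lookup table. The defaults in the table are the ones listed above as of this writing; the `expand` function is a hypothetical helper that only spells out what each engine would treat a plain query as meaning.

```python
# Implied operators as listed in this article; engines may change them.
IMPLIED_OPERATOR = {
    "Yahoo": "AND",
    "HotBot": "AND",
    "Lycos": "AND",
    "Northern Light": "AND",
    "AltaVista": "OR",
    "Excite": "OR",
    "Infoseek": "OR",
}

def expand(query, engine):
    """Show the explicit Boolean query a plain query is treated as."""
    operator = IMPLIED_OPERATOR[engine]
    return f" {operator} ".join(query.split())

print(expand("Bach Brandenburg Concertos", "HotBot"))
print(expand("Bach Brandenburg Concertos", "Infoseek"))
```

The same three words typed into two different engines turn into two very different searches, which is why it pays to check the help page first.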

Another way search engines use Boolean operators without appearing to do so involves the plus and minus signs. In these cases, a “+” before a search term acts as AND, and a “-” acts as NOT. In other words, a search for "+Bach +Brandenburg +Concertos" means that all 3 of these words have to appear on a page for it to come up. If you had instead searched for "+Bach -Brandenburg +Concertos," then you would have received a list of Web pages that have information about Bach’s Concertos, but none about the Brandenburg Concertos. (Notice, by the way, that there shouldn’t be a space between the plus or minus sign and the word you wish to affect.)
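Here is a rough Python sketch of how a plus-and-minus query might be read: words marked with + must appear, and words marked with - must not. The `parse` and `matches` helpers and the sample pages are assumptions for illustration; real engines handle unmarked words and ranking in their own ways.

```python
def parse(query):
    """Split a query into required (+), excluded (-), and unmarked words."""
    required, excluded, unmarked = set(), set(), set()
    for token in query.lower().split():
        if token.startswith("+") and len(token) > 1:
            required.add(token[1:])
        elif token.startswith("-") and len(token) > 1:
            excluded.add(token[1:])
        else:
            unmarked.add(token)
    return required, excluded, unmarked

def matches(page_words, query):
    """True if the page has every required word and none of the excluded ones."""
    required, excluded, _ = parse(query)
    return required <= page_words and not (excluded & page_words)

# Two made-up pages, each reduced to its set of words.
brandenburg_page = {"bach", "brandenburg", "concertos"}
violin_page = {"bach", "violin", "concertos"}

query = "+Bach -Brandenburg +Concertos"
print(matches(brandenburg_page, query), matches(violin_page, query))
```

Because the sign is attached directly to its word, splitting the query on spaces is enough to tell the three kinds of terms apart.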

Sometimes it’s easier or quicker to use the + and - when you’re searching. Fortunately, things are a little easier here than with the default Boolean operators above: Yahoo, AltaVista, Excite, HotBot, Infoseek, Lycos, and Northern Light all know how to work with both the plus and the minus. Standardization can be nice!

When You Must Find It

It can be pretty annoying when you just can’t find what you’re looking for. Even if you’re spelling everything right, and you’re using Boolean operators, and you’re using the plus and the minus, the search engines can still disappoint you. Fortunately, there are a few tricks left in the bag that you can try.

If you’re looking for a particular phrase, like “Bach’s Brandenburg Concertos,” and you want to find that exact phrase on a Web page, then enter it into your search engine with quotation marks around it. Using quotation marks tells the search engine that you’re looking for those three words next to each other. This would be different than using the Boolean AND, because the AND tells the search engine that you’re looking for those three words anywhere on the page, while the use of quotation marks means that you’re actually looking for the phrase. If you use the quotation marks, you’ll never get pages created by George Bach of Brandenburg, Kentucky devoted to Mozart’s Concertos for Piano and Orchestra. And, once again, we’re in luck.

The use of quotation marks for finding a phrase is supported by all the major search engines, which means AltaVista, Excite, HotBot, Infoseek, Lycos, Yahoo, and Northern Light. If you’re using something else, don’t forget to read the particulars about how that search engine works.
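The difference between AND and an exact phrase can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the two page texts are made up, and `has_all` and `has_phrase` are hypothetical helpers that ignore punctuation and capitalization.

```python
import re

def words(text):
    """Break text into lowercase words, ignoring punctuation."""
    return re.findall(r"[a-z0-9]+", text.lower())

def has_all(text, query):
    """AND-style match: every query word appears somewhere on the page."""
    page = set(words(text))
    return all(word in page for word in words(query))

def has_phrase(text, phrase):
    """Phrase match: the words appear next to each other, in order."""
    page, target = words(text), words(phrase)
    n = len(target)
    return any(page[i:i + n] == target for i in range(len(page) - n + 1))

kentucky = ("George Bach of Brandenburg, Kentucky presents Mozart's "
            "Concertos for Piano and Orchestra")
real_thing = "Notes on the Brandenburg Concertos by Bach"

print(has_all(kentucky, "Bach Brandenburg Concertos"))   # AND is fooled
print(has_phrase(kentucky, "Brandenburg Concertos"))     # the phrase is not
print(has_phrase(real_thing, "Brandenburg Concertos"))
```

The Kentucky page passes the AND test because all three words are on it, but it fails the phrase test because “Brandenburg” and “Concertos” never sit side by side.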

Capitalization is another trick that can come in handy. If you’re seeking information about something that has a proper noun for a name, then by all means, capitalize the appropriate words. This helps certain search engines focus on the exact thing you want. For instance, Infoseek will treat a search for “concertos” and “Concertos” differently. If you type in “Concertos,” you may come up with pages devoted to the Brandenburg Concertos, while if you type in “concertos,” the Brandenburg Concertos may be very, very far down on your list of hits.

Unfortunately, only a few of the biggest search engines support case sensitivity: Yahoo, AltaVista, and Infoseek. HotBot supports case sensitivity if you include quotation marks around your phrase, but Excite, Lycos, and Northern Light don’t support it at all.

Here’s one last trick when it comes to finding something you’re determined to track down. URLs, or Uniform Resource Locators, are the addresses of Web pages; <{CCM:BASE_URL}>, for instance, is the URL for my Web site. Oftentimes, though, a URL is a lot longer than this. Let’s say you’ve typed in a URL or clicked on a link, and you find yourself at a Web page with a URL of http://www.companyname.com/coolstuff/fun/interesting.html. But there’s nothing there. You get Error 404: Page Not Found. This could mean several things. The page could have been moved or taken down. You may have typed the URL incorrectly, or else the person who created the link to the page may have typed it incorrectly. But let’s say that the information on interesting.html sounds really great. You want it. But it’s not there. How do we find that information?

Definitely check your spelling if you typed in the URL. Just make sure. Sometimes you’ll find that you typed “interestng.html” when you meant “interesting.html.” I’ve done it many times. If you didn’t type the URL, and instead clicked on a link, you could change the spelling in the URL and then hit the Return key again to see if that fixes it. In other words, look up at the URL in the Location or Address field at the top of your Web browser, notice that it says “interestng.html,” and then insert the “i” manually. Hit Return and see if the Web page pops up then.

This trick doesn’t always work, because sometimes people purposely name Web pages with letters missing, especially if they want to adhere to those old DOS and Windows file name limits of 8 characters. In that case, it would make sense to see a page named “intrstng.htm.” But you won’t see it that often.

Here’s one last thing to do, and this sometimes really saves the day. It involves truncating the URL. Go back to that URL I made up: http://www.companyname.com/coolstuff/fun/interesting.html. You’re there, and you see Error 404. You know you typed it correctly, you know the link was correct, and you know the words are spelled correctly. Fine. Try this. Highlight the words “interesting.html” and then hit backspace, or click after the final “l” in “html” and then hit backspace several times, until you’re left with this: http://www.companyname.com/coolstuff/fun/. Now hit return. A lot of times, you’ll be at a page that may link to the newly named “interesting.html,” now called “really_interesting.html.” Or you may see a list of Web pages and files, and you may be able to find the one you’re looking for, or one just as good. If that doesn’t work, then get rid of “fun/” so that you’re left with this: http://www.companyname.com/coolstuff/. Try the same thing. If necessary, keep going until you get to http://www.companyname.com/. At that point, you’re probably out of luck. Your only hope is to go back to the search engines. But still, I’ve found that URL truncation can work wonders when you know that Web page you want is on the server you’re connected to, but it’s been moved.
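The truncation steps above are mechanical enough to sketch in Python using the standard `urllib.parse` module. The `truncations` function is a hypothetical helper that only builds the list of shorter addresses to try; it does not visit them.

```python
from urllib.parse import urlsplit, urlunsplit

def truncations(url):
    """Return progressively shorter URLs, ending at the site's home page."""
    parts = urlsplit(url)
    segments = [segment for segment in parts.path.split("/") if segment]
    shorter = []
    while segments:
        segments.pop()  # drop the last file or folder name
        path = "/" + "/".join(segments) + ("/" if segments else "")
        shorter.append(urlunsplit((parts.scheme, parts.netloc, path, "", "")))
    return shorter

for candidate in truncations(
        "http://www.companyname.com/coolstuff/fun/interesting.html"):
    print(candidate)
```

Each address in the list is one more backspace session in the browser: first the folder that held the page, then its parent, and finally the home page.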

57 Varieties of Searching

Probably the biggest piece of advice I could give when it comes to search engines is to always use more than one. You would be amazed at the number of times I find people using only Infoseek, or just Yahoo, or nothing but Lycos, and then complaining that they can’t find very much information on their topic. Even though Northern Light and Ask Jeeves are excellent resources, they are still limited because, as I pointed out above, the Internet is just too large and is too much of a moving target for just one search tool to capture and index it.
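To see why several engines beat one, imagine combining their hit lists. The Python sketch below uses made-up results for three unnamed search tools, and `merge_results` is a hypothetical helper that keeps each address once, in the order it was first seen.

```python
def merge_results(*result_lists):
    """Combine several hit lists, keeping first-seen order and dropping repeats."""
    seen = set()
    merged = []
    for results in result_lists:
        for url in results:
            if url not in seen:
                seen.add(url)
                merged.append(url)
    return merged

# Made-up hit lists standing in for three different search tools.
engine_a = ["http://example.com/bach", "http://example.com/brandenburg"]
engine_b = ["http://example.com/brandenburg", "http://example.org/concertos"]
engine_c = ["http://example.net/baroque"]

combined = merge_results(engine_a, engine_b, engine_c)
print(len(combined), "distinct pages from", 3, "engines")
```

No single list in this example contains every page, but together they cover four distinct addresses.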

But don’t forget that there are an unbelievable number of highly specialized search engines out there as well. If you’re interested in finding out about Great Danes, Shih-tzus, or Cocker Spaniels, try the American Kennel Club’s search page. Looking for pictures of great works by Michelangelo, O’Keeffe, or Hokusai Katsushika? Try out the World Wide Arts Resources. I highly recommend the US Department of Health and Human Services’ Healthfinder if you’ve got students researching diabetes, hernias, or AIDS, as all the sites there provide the reliable health information that can be hard to find on the Web.

As you can see, there is a search engine for every subject. It just takes some digging. Fortunately, we’ve now got search engines that specialize in indexing search engines! Try those out, and you’re sure to find what you need. Some of my favorites include Virtual Search Engines, All-In-One Search Page, and Beaucoup Search Engines.

Bibliography

Glave, James. "Dramatic Internet Growth Continues." 16 February 1998.
http://www.wired.com/news/news/email/other/technology/story/10323.html

Nichols, Keith. "Master Online Searching." InternetUser, 23 February 1998.
http://www.zdnet.com/products/garage/search/search.master/

Search Engines

All-In-One Search Page
http://www.albany.net/allinone/

AltaVista
http://www.altavista.com/

American Kennel Club’s search page
http://www.akc.org/search.htm

Ask Jeeves
http://www.askjeeves.com/

Beaucoup Search Engines
http://www.beaucoup.com/engines.html

Excite
http://www.excite.com/

Healthfinder
http://www.healthfinder.gov/

HotBot
http://www.hotbot.com/

Infoseek
http://www.infoseek.com/

Lycos
http://www.lycos.com/

Northern Light
http://www.northernlight.com/ or http://www.nlsearch.com/

Virtual Search Engines
http://www.dreamscape.com/frankvad/search.html

World Wide Arts Resources
http://wwar.com/

Yahoo
http://www.yahoo.com/
