Search 101: how search engines work
By Vicki Waschkowski • Nov 18th, 2008 • Category: Research
The itsy bitsy spider climbed through the world-wide web.
At a basic level, most of us are aware that elusive “spiders” are part of how search engines work. But what are spiders? What do they do? How do search engines find and rank pages, and in particular, how can you ensure your page gets ranked above the rest? If these questions are on your mind, read on. This is the first in a series of posts in which we will attempt to demystify search engines with a simple explanation of what they are and how they work. In future posts we will explore how you can best optimize your site for search.
What is a Search Engine?
Google, while certainly not the first or only search engine on the web (other important search engines include Microsoft’s Live Search, Yahoo, and Ask.com), may be the one that comes to mind when you think “search.” Its ubiquity has made its name a verb: “to Google” something. But what are Google and other search engines? At their core, they are databases containing an index of content from the Internet. When you enter a search query, the engine runs it against that database and returns a list of the web pages that match.
What are Spiders?
Search engines create their web listings by using automated software programs called spiders. These spiders (also called “crawlers” or “bots”) constantly crawl through the world wide web to check out web pages, index their information, and follow the links listed within them. They return to pages over and over again to catalogue updates. This catalogue of information is the search engine database.
Spiders do not crawl into private networks, password-protected sites, or other prohibited information repositories. They only go where they are allowed, and they find new pages through links. A spider makes a copy of each page it visits so that the search engine can later index the downloaded pages and provide fast search results. It starts by visiting a URL, identifies the meta tags and hyperlinks on that page, and then visits those hyperlinks as well.
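For the curious, here is a minimal sketch of that crawl loop in Python. It is an illustration only, not how any real search engine is built. The `FAKE_WEB` dictionary, the `DISALLOWED` set, and the `example.com` URLs are all made up for this example, and the pages live in memory so nothing is fetched over the network.

```python
from collections import deque
from html.parser import HTMLParser

# Hypothetical in-memory "web" (URL -> HTML). A real spider would download
# each page over HTTP instead of looking it up in a dictionary.
FAKE_WEB = {
    "http://example.com/": (
        '<html><head><meta name="description" content="Home page"></head>'
        '<body><a href="http://example.com/about">About</a>'
        '<a href="http://example.com/private">Private</a></body></html>'
    ),
    "http://example.com/about": (
        '<html><head><meta name="description" content="About us"></head>'
        '<body><a href="http://example.com/">Home</a></body></html>'
    ),
    "http://example.com/private": "<html><body>Members only</body></html>",
}

# Hypothetical list of places the spider is not allowed to go.
DISALLOWED = {"http://example.com/private"}


class PageParser(HTMLParser):
    """Collects the hyperlinks and named meta tags found on one page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("href"):
            self.links.append(attrs["href"])
        elif tag == "meta" and "name" in attrs:
            self.meta[attrs["name"]] = attrs.get("content", "")


def crawl(start_url):
    """Visit start_url, then follow links breadth-first, skipping disallowed pages."""
    seen = {start_url}
    queue = deque([start_url])
    pages = {}  # URL -> meta tags found on that page
    while queue:
        url = queue.popleft()
        if url in DISALLOWED or url not in FAKE_WEB:
            continue  # only go where we are allowed
        parser = PageParser()
        parser.feed(FAKE_WEB[url])
        pages[url] = parser.meta
        for link in parser.links:
            if link not in seen:  # each page is queued only once
                seen.add(link)
                queue.append(link)
    return pages


pages = crawl("http://example.com/")
```

Starting from the home page, the sketch records the home and “about” pages and skips the disallowed one, even though a link points to it.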
How do Search Engines Work?
Every search engine works slightly differently, but the basic process is this: first, a consumer types a query into the search bar. The search engine software then immediately sorts through its database to find matches to the query and presents its results to the consumer, ranked in order of relevance. Relevance is determined by a constantly changing process designed uniquely (and confidentially) by each search engine. Google, for instance, uses its unique “PageRank™ algorithm” that considers “more than 500 million variables and 2 billion terms.” Search engines serve different niches and build their algorithms in unique ways, which is why different search engines may serve up different sites for the same query.
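To give a feel for link-based ranking, here is a simplified, textbook-style version of the PageRank idea in Python. It is only a sketch: the real algorithms are confidential and weigh far more signals. The `pagerank` function, the damping value of 0.85, and the four-page `links` graph are assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank: links maps each page to the pages it links to."""
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}  # start with equal scores
    for _ in range(iterations):
        # Every page receives a small baseline score.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outlinks in links.items():
            # A page shares its score among the pages it links to;
            # a page with no links shares its score with everyone.
            targets = outlinks if outlinks else pages
            share = damping * rank[page] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical link graph: "c" is linked to by three pages, "d" by none.
links = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
rank = pagerank(links)
best = max(rank, key=rank.get)
worst = min(rank, key=rank.get)
```

In this toy graph the page with the most incoming links ends up with the highest score, and the page nobody links to ends up with the lowest.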
Search engines do not search the entire world wide web, but rather their own database of pages. Google, for instance, indexes about 8 billion pages.
Think of a search engine as a really fast librarian. You tell her what you are looking for, and she searches through her database of all the books in the library to find the best ones for your search. She doesn’t search all the books in the world, but she has a big selection to choose from.
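The librarian’s card catalogue can be sketched as a small “inverted index”: a lookup from each word to the pages that contain it. The Python below is a minimal illustration under simplifying assumptions. The `build_index` and `search` functions and the sample pages are hypothetical, and results are ranked by a plain word count rather than by any real engine’s relevance formula.

```python
from collections import defaultdict


def build_index(pages):
    """Map each word to the pages containing it, with a count per page."""
    index = defaultdict(dict)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word][url] = index[word].get(url, 0) + 1
    return index


def search(index, query):
    """Return matching pages, ranked by how often the query words appear."""
    scores = defaultdict(int)
    for word in query.lower().split():
        for url, count in index.get(word, {}).items():
            scores[url] += count
    # Highest score first; ties are broken alphabetically by URL.
    return sorted(scores, key=lambda url: (-scores[url], url))


# Hypothetical pages the spider has already copied into the database.
sample_pages = {
    "a.example": "spiders crawl the web",
    "b.example": "search engines rank web pages on the web",
    "c.example": "the librarian finds books",
}
index = build_index(sample_pages)
results = search(index, "web pages")
```

Notice that `search` never touches the pages themselves. It only consults the index that was built ahead of time, which is why a search engine can answer so quickly.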
How Do You Ensure Your Site is Indexed?
There are two ways your site can be listed on a search engine results page. The first is through organic search: these are the URLs that naturally appear in the results for a given query, based on the relevance determined by that search engine’s algorithm. These results appear in the centre of the search engine results page. You may have heard the term “search engine optimization” (SEO), which refers to things you can do to ensure spiders crawl your site and that you rank more highly in the list of organic results delivered to the consumer for relevant queries.
The second way to get your site listed in the results for particular queries is through paid search placement. In this scenario, you define the relevant search terms, then pay for your rank within the paid search portion of the results. Paid search results typically appear in a block at the top of the results or in the right-side panel of the first page. All other results listed are organic.
Details of paid and organic search, along with some tips for search engine optimization, are the topics of the second post in this series. A glossary of terms is also coming soon to help you understand search engine marketing as you explore with us the significance, opportunities, and growth of search, so stay tuned to Lucidity to find out more.