How search engines work
We all use search engines to find content on the World Wide Web. In the very beginning there were only a few hundred websites; now there are probably more than a billion. So how do search engines find all this information? There are numerous search engines, but they all work in a similar fashion:
- They send out a software robot, sometimes referred to as a spider, to find content around the web.
- They store and index the words the spider finds in a huge database.
- They provide a search facility so that users can search that database.
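The three steps above can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions: the "web" is a hypothetical in-memory dictionary (TOY_WEB) rather than real pages fetched over the network, and the function names crawl, build_index and search are made up for this example. Real search engines are vastly more elaborate.

```python
from collections import defaultdict, deque

# Hypothetical in-memory "web": URL -> (page text, outgoing links).
TOY_WEB = {
    "a.example": ("spiders crawl the web", ["b.example"]),
    "b.example": ("search engines index the web", ["a.example", "c.example"]),
    "c.example": ("users search the index", []),
}

def crawl(start, web):
    """Step 1: follow links from a start page and collect each page's text."""
    seen, queue, pages = {start}, deque([start]), {}
    while queue:
        url = queue.popleft()
        text, links = web[url]
        pages[url] = text
        for link in links:
            if link in web and link not in seen:
                seen.add(link)
                queue.append(link)
    return pages

def build_index(pages):
    """Step 2: map every word to the set of pages that contain it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index

def search(index, word):
    """Step 3: look a word up in the index and return the matching pages."""
    return sorted(index.get(word.lower(), set()))

pages = crawl("a.example", TOY_WEB)
index = build_index(pages)
print(search(index, "web"))
```

The point of the index is that a search never has to re-read the pages themselves: looking up a word is a single dictionary access, however many pages were crawled.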
Spidering the web
Let’s be clear about what the search engines are looking for. The internet isn’t just the World Wide Web; it is many other things too, but web search engines are only interested in searching the World Wide Web. In its simplest form, a search engine uses a software robot called a spider to record all the main words of a web page. It does not record short words such as ‘a’ and ‘an’, as they carry little meaning on their own. Spiders generally start by visiting very popular websites. The spider records every word on the website, follows every link it finds there, and then does the same on the pages those links lead to.
Using Google as our example of just one way search engines work: when Google ‘indexes’ a web page within a website, it looks at the words on the page and where on the page they are. It also looks at titles, subheadings and meta tags. This gives the search engine a fairly complete picture of what is on that page. Meta tags allow the publisher of a web page to make it easier for Google to determine the content of the page, using a piece of hidden code that is not displayed to the user. This is open to abuse by a publisher trying to fool Google into thinking that the page contains something other than what is actually there, so Google checks the meta tags against the content to ensure they match. In this way Google builds a huge database of information from around the web.
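As a rough illustration of the indexing described above, the sketch below records where each word appears on a page, skips a few short words, and flags meta keywords that the visible text does not support. It uses Python's standard html.parser module. The class name PageIndexer, the helper unsupported_keywords, the tiny stop-word list and the sample page are all assumptions made for this example, and the check only handles single-word keywords. It is not a description of how Google actually does this.

```python
from html.parser import HTMLParser

STOP_WORDS = {"a", "an", "the"}  # illustrative only; real lists are longer

class PageIndexer(HTMLParser):
    """Collects the title, meta keywords and word positions of one page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta_keywords = []
        self.positions = {}      # word -> list of positions in the visible text
        self._in_title = False
        self._next_pos = 0

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and (attrs.get("name") or "").lower() == "keywords":
            content = attrs.get("content") or ""
            self.meta_keywords = [k.strip().lower()
                                  for k in content.split(",") if k.strip()]

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data.strip()
            return
        for raw in data.split():
            word = raw.strip(".,!?;:'\"()").lower()
            # Stop words are skipped but still advance the position counter.
            if word and word not in STOP_WORDS:
                self.positions.setdefault(word, []).append(self._next_pos)
            self._next_pos += 1

def unsupported_keywords(indexer):
    """Meta keywords that never appear in the visible text of the page."""
    return [k for k in indexer.meta_keywords if k not in indexer.positions]

html = """<html><head><title>Garden Spiders</title>
<meta name="keywords" content="spiders, garden, casino">
</head><body><h1>A guide to garden spiders</h1>
<p>Spiders spin a web in the garden.</p></body></html>"""

indexer = PageIndexer()
indexer.feed(html)
indexer.close()
print(indexer.title)
print(indexer.positions["spiders"])
print(unsupported_keywords(indexer))
```

Here the keyword ‘casino’ is declared in the meta tag but never appears in the text, which is the kind of mismatch a search engine would want to catch.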
Searching the database
To make all this information useful to us, the internet users, the search engine needs to organise its database so that searching it is fast. The results of a search also need to be relevant, and Google achieves this by giving each web page a ‘rank’. The rank, or weight, assigned to a web page is used to list the results in order of relevancy to the user’s search term. Unfortunately each search engine can rank a web page quite differently, which is why the same search on different search engines returns results in a different order. So in theory we should get relevant results in order of relevancy. Well, not quite!
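To see why two search engines can order the same pages differently, here is a minimal sketch with two made-up scoring rules: one counts how often the search words occur, the other divides that count by the length of the page. The function names and sample pages are assumptions for illustration only; real ranking uses far more signals than this.

```python
def score_by_count(text, query_words):
    """Score = how many times the query words occur on the page."""
    words = text.lower().split()
    return sum(words.count(q) for q in query_words)

def score_by_density(text, query_words):
    """Score = occurrences of the query words divided by page length."""
    words = text.lower().split()
    return score_by_count(text, query_words) / len(words) if words else 0.0

def rank(pages, query, score):
    """Return the matching URLs, best score first (ties broken by URL)."""
    query_words = query.lower().split()
    scored = [(score(text, query_words), url) for url, text in pages.items()]
    ordered = sorted(scored, key=lambda pair: (-pair[0], pair[1]))
    return [url for s, url in ordered if s > 0]

PAGES = {
    "long.example": "garden spiders and more garden spiders plus many "
                    "other notes on insects birds and plants",
    "short.example": "garden spiders",
}

print(rank(PAGES, "garden spiders", score_by_count))
print(rank(PAGES, "garden spiders", score_by_density))
```

The first rule puts the long page on top because it mentions the words more often; the second puts the short page on top because it is about nothing else. Neither is wrong, they simply weigh the evidence differently.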
Relevancy vs. Advertising
Searching the Google database is free, and that’s one of the main reasons we all use it, but free comes with a price!
Google couldn’t possibly be the giant it is now without an income, and it has two main income generators: Google AdWords and Google AdSense. So Google’s top priority as a business is actually to cater for its advertisers, and the ranking of your search results changes because Google gives priority to them. There are two main areas on the results page returned by Google. The top and right-hand columns show advertisers and are referred to as the ‘Paid Ads’ sections; the main content area is known as the ‘Natural Search Results’. OK, we get the top and right-hand columns: the more advertisers pay, the higher the position they get. But what about the ‘Natural Search Results’? Surely they will be ordered in a relevant manner? Well, a whole industry has been created around the ‘Natural Search Results’ section, and that is ‘Search Engine Optimisation’ (SEO). Its practitioners, known as SEO gurus, will optimise your web pages to encourage Google to rank your web page higher than your competition’s. So now the ‘Natural Search Results’ are not really natural; they are determined by the skill of the SEO guru! An example of the relevancy of a Google search can be found in ‘Are Search Engines a waste of time?’
The future of the Internet Search Facility
Google was a trailblazer in the world of search engines, and as we needed ways of finding what interested us on the web, we used it. We are so accustomed to using the big search engines that we don’t actually realise how far the quality of search results has degraded over the years. Our example above demonstrates how one particular search resulted in 14% relevancy and only a 5% click-through rate. So what is the answer, and how can Google change its ways?
While Google provides its search facility for free, it has to have advertisers to fund the business, so advertising and biased search results will never go away. The method of using only keywords to index its database is also unlikely to change, as Google’s whole algorithm is built around indexing its database in this way.
The future, in our opinion, lies in the ability to search the web directly on behalf of the user. Instead of searching a predetermined, controlled database, you search the web yourself using your own personal web spider. The web spider creates a network of things that interest you, based around your search term. It does not deliver advertising. It simply takes the keywords you entered, analyses the pages it finds for those words, and uses the links on those pages to decide what would be more relevant. So there is no need to index a database, and the results are relevant and listed in order of relevancy.
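A very rough sketch of the idea of a personal spider follows. It assumes a hypothetical in-memory "web" (TOY_WEB) in place of live pages and scores pages by simple keyword counts. The names relevance and personal_spider are invented for this example, and it does not describe how any real product works. The spider starts from a seed page, always visits the most promising page next, follows links only from pages that match the search term, and keeps no stored index.

```python
import heapq

# Hypothetical in-memory "web": URL -> (page text, outgoing links).
TOY_WEB = {
    "start.example": ("a page about garden spiders",
                      ["spiders.example", "cars.example"]),
    "spiders.example": ("garden spiders spin webs and garden spiders hunt",
                        ["webs.example"]),
    "cars.example": ("used cars for sale", ["ads.example"]),
    "webs.example": ("how spiders build webs", []),
    "ads.example": ("buy cheap spiders now", []),
}

def relevance(text, keywords):
    """Count how often the search keywords occur in the page text."""
    words = text.lower().split()
    return sum(words.count(k) for k in keywords)

def personal_spider(seed, query, web, max_pages=10):
    """Crawl outward from a seed page, most relevant pages first."""
    keywords = query.lower().split()
    # heapq is a min-heap, so scores are negated to pop the best page first.
    frontier = [(-relevance(web[seed][0], keywords), seed)]
    seen = {seed}
    results = []
    while frontier and len(results) < max_pages:
        neg_score, url = heapq.heappop(frontier)
        if neg_score == 0:
            continue  # irrelevant page: not listed, and its links are ignored
        results.append((url, -neg_score))
        for link in web[url][1]:
            if link in web and link not in seen:
                seen.add(link)
                score = relevance(web[link][0], keywords)
                heapq.heappush(frontier, (-score, link))
    # List the pages found in order of relevancy.
    return sorted(results, key=lambda r: -r[1])

for url, score in personal_spider("start.example", "garden spiders", TOY_WEB):
    print(score, url)
```

Because links are only followed from pages that match, the page ads.example is never visited at all: it is only reachable through cars.example, which has nothing to do with the search term.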
There are a number of groups attempting to perfect this technology, and Yirika is one of the first to employ it to create a better, richer search experience for the user. Yirika decided that the influence of advertising is too strong to allow unbiased search results, so it charges a small annual fee for its service in order to provide a more powerful and relevant search engine. For those who want relevant, advertising-free, unbiased search results, Yirika could be the alternative that saves you both time and money.