Thursday, April 29, 2010

The Future of Search Engines

How does a search engine work? A typical person will respond that search engines retrieve data related to a keyword. Search engines do indeed work that way. But this answer isn't general enough to capture the gist of what a search engine does.

We have identified a two-step process by which search engines work:

Step 1: Infer what the user intends to find.
Step 2: Retrieve all webpages that match the user's intent.
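The two steps can be sketched in code. This is only an illustration: the intent table and the tiny in-memory index are invented for the example, and a real engine would infer intent from query logs and language models rather than a lookup table.

```python
def infer_intent(query):
    """Step 1: guess what the user intends to find."""
    # Hypothetical intent table; a real engine would infer this
    # from context, query logs, and language understanding.
    intents = {
        "jaguar": ["jaguar car", "jaguar animal"],
    }
    return intents.get(query, [query])

def retrieve(intent, index):
    """Step 2: return all pages whose text matches the inferred intent."""
    return [url for url, text in index.items() if intent in text]

# A toy index mapping URLs to page text.
index = {
    "example.com/cars": "the jaguar car is fast",
    "example.com/zoo": "the jaguar animal lives in the rainforest",
}

for intent in infer_intent("jaguar"):
    print(intent, "->", retrieve(intent, index))
```

Note that the ambiguous query "jaguar" produces two distinct intents, each retrieving different pages; collapsing the two steps, as the next paragraph describes, would lump these results together.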

This is basically how search engines work, but they usually conflate the two steps. They retrieve the webpages that match the user's search term even when the term is too ambiguous or vague to reveal what the user's intentions were. This is problematic, because a search engine may retrieve webpages without knowing what the user really intended to find. Ambiguous keywords and phrases are examples of this.

Some solutions are simple, such as suggesting keyword refinements. But in general, this remains a problematic step for search engines.
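One simple way to suggest refinements can be sketched as follows. This is an assumption about one possible approach, not how any particular engine works: here we mine two-word phrases containing the search term from indexed text, whereas real engines draw mostly on query logs.

```python
from collections import Counter

def suggest_refinements(term, documents, top_n=3):
    """Suggest refinements by counting two-word phrases that start
    with the ambiguous term across the document collection."""
    phrases = Counter()
    for doc in documents:
        words = doc.lower().split()
        for i, word in enumerate(words):
            if word == term and i + 1 < len(words):
                phrases[f"{term} {words[i + 1]}"] += 1
    # Most frequent phrases first.
    return [phrase for phrase, _ in phrases.most_common(top_n)]

docs = [
    "the jaguar car was unveiled today",
    "a jaguar cat prowls at night",
    "the new jaguar car model",
]
print(suggest_refinements("jaguar", docs))  # ['jaguar car', 'jaguar cat']
```

The user can then pick the refinement that matches their intent, which pushes step 1 (inferring intent) back onto the user instead of guessing.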

But all search engines must rely on inferring the meaning of the user's query and inferring the meaning of the webpages. And even inferring the user's meaning is far from an exact science.

We will argue what will happen in the future of web search:

  1. Web pages are evaluated by thousands of heuristics; no single heuristic can override the evaluation done by the others. This is useful because if one heuristic goes wrong by producing a false positive, the overall evaluation isn't affected as much. Including thousands of heuristics thus lessens the impact of individual errors.
  2. Common heuristics include the number of incoming links, the keyword density of the term, and how popular the site is.
  3. The more a search engine understands webpages, the less its algorithm will rely on incoming links and keyword density.
  4. Search engine companies are increasingly incorporating artificial intelligence (AI) into their evaluation functions. Thus, their algorithms will rely less on ad hoc heuristics and more on the actual content.
  5. Thus, search engine optimization (SEO) specialists are going to focus more on quality content and less on link building and keyword-density schemes. They will invest more in hiring writers to generate original content rather than scraping content from other sources.
  6. As a side effect, black hat SEO firms will introduce more sophisticated ways to build links and generate content automatically. For example, they will use artificial intelligence to generate seemingly original articles.
  7. The arms race between search engine providers and black hat SEO firms is going to be more knowledge-based, and less based on link-building algorithms.
  8. There will be a significant increase in the use of artificial intelligence on both sides. Search engines will use AI to combat spam and to detect whether content is genuine rather than machine-generated. Black hat SEO firms will also get more sophisticated, and will use their own AI to deceive the search engines' AI.
  9. Search engines will become "smarter." But I doubt they will be real AI.
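Points 1 and 2 above can be sketched as a weighted combination of bounded heuristics. The heuristic names, normalization caps, and weights below are all invented for illustration; the point is only that when each heuristic is clamped to [0, 1] and averaged, a single misfiring heuristic can shift the total score by at most its share of the weight.

```python
def score_page(page, heuristics, weights):
    """Combine many bounded heuristics into one score.

    Each heuristic maps a page to a value in [0, 1], so no single
    heuristic can move the weighted average by more than its weight.
    """
    total = sum(weights[name] * h(page) for name, h in heuristics.items())
    return total / sum(weights.values())

# Toy versions of the common heuristics named in point 2, each
# normalized against an arbitrary cap so it stays in [0, 1].
heuristics = {
    "incoming_links": lambda p: min(p["links"] / 100, 1.0),
    "keyword_density": lambda p: min(p["density"] / 0.05, 1.0),
    "popularity": lambda p: min(p["visits"] / 10000, 1.0),
}
weights = {"incoming_links": 1.0, "keyword_density": 1.0, "popularity": 1.0}

page = {"links": 50, "density": 0.02, "visits": 8000}
print(round(score_page(page, heuristics, weights), 3))  # 0.567
```

Even if a spammer inflates one signal to its maximum (say, keyword density), the score rises by at most one third here; with thousands of heuristics, the ceiling per heuristic is far smaller, which is exactly the robustness point 1 describes.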
