Google’s sophisticated ranking algorithms

Google brought a new concept to evaluating web pages. This concept, called PageRank, has been important to the Google algorithm from the start. PageRank is an algorithm that weights a page’s importance based upon the incoming links. PageRank estimates the likelihood that a given page will be reached by a web user who randomly surfed the web, and followed links from one page to another. In effect, this means that some links are more valuable than others, as a higher PageRank page is more likely to be reached by the random surfer.
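
The random-surfer idea can be made concrete with a small sketch. The code below is a simplified illustration, not Google's implementation: the function name `pagerank`, the four-page graph, the damping value of 0.85 (a conventional choice, not stated above), and the handling of pages with no outgoing links are all assumptions made for the example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Estimate random-surfer scores by repeated redistribution of rank.

    `links` maps each page to the list of pages it links to.
    `damping` is the assumed probability that the surfer follows a link
    rather than jumping to a random page.
    """
    # Collect every page that appears as a source or a target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    pages = sorted(pages)
    n = len(pages)

    # Start with the surfer equally likely to be on any page.
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page receives a baseline share from random jumps.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its rank evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Assumption: a page with no outgoing links spreads its
                # rank evenly over all pages.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical link graph: "c" is linked to by three pages, "d" by none.
graph = {
    "a": ["b", "c"],
    "b": ["c"],
    "c": ["a"],
    "d": ["c"],
}
ranks = pagerank(graph)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(page, round(score, 3))
```

In this toy graph, page "c" ends up with the highest score because it has the most incoming links. Page "a" comes second even though only one page links to it, because that one link comes from the high-scoring "c". This is the sense in which some links are more valuable than others.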

The PageRank algorithm proved very effective, and Google began to be perceived as serving the most relevant search results. On the back of strong word of mouth from programmers, Google became a popular search engine. Google weighted off-page factors more heavily than on-page factors because it judged off-page factors to be more difficult to manipulate.

Although PageRank was more difficult to game, webmasters had already developed link building tools and schemes to influence Inktomi, an earlier search engine that used similar off-page factors, and these methods proved similarly applicable to gaining PageRank. Many sites focused on exchanging, buying, and selling links, often on a massive scale. An online industry thus emerged around selling links designed to improve PageRank and link popularity. Links from higher PageRank pages, valued in part for the human visitors they can drive, sell for more money.

A proxy for the PageRank metric is still displayed in the Google Toolbar, though the displayed value is rounded to the nearest integer, and the toolbar is believed to be updated less frequently than the value used internally by Google. In 2002 a Google spokesperson stated that PageRank is only one of more than 100 algorithms used in ranking pages, and that while the PageRank toolbar is interesting for users and webmasters, “the value to search engine optimization professionals is limited” because the value is only an approximation. Many experienced SEOs recommend ignoring the displayed PageRank.

Google and other search engines have, over the years, developed a wider range of off-site factors to use in their algorithms. The Internet was reaching a vast population of non-technical users who were often unable to use advanced querying techniques to find the information they were seeking, and the sheer volume and complexity of the indexed data was vastly different from that of the early days. Helped by increases in processing power, search engines have begun to develop predictive, semantic, linguistic and heuristic algorithms. Around the same time as the work that led to Google, IBM had begun work on the Clever Project, and Jon Kleinberg was developing the HITS algorithm.

A search engine may use hundreds of factors in ranking the listings on its search engine results pages (SERPs). The factors themselves and the weight each carries can change continually, and algorithms can differ widely: a web page that ranks #1 in one search engine may rank #200 in another, or even in the same search engine a few days later.

Google, Yahoo, Microsoft and Ask.com do not disclose the algorithms they use to rank pages. Some SEOs have carried out controlled experiments to gauge the effects of different approaches to search optimization. Based on these experiments, often shared through online forums and blogs, professional SEOs attempt to form a consensus on what methods work best, although consensus is rarely, if ever, actually reached.

SEOs widely agree that the signals that influence a page’s rankings include:

Keywords in the title tag.
Keywords in links pointing to the page.
Keywords appearing in visible text.
Link popularity (PageRank for Google) of the page.
Keywords in the H1, H2 and H3 heading tags of the page.
Internal links from one page to the site’s inner pages.
Placing the punch line (the key message) at the top of the page.
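
Several of the on-page signals listed above can be checked mechanically. The sketch below uses Python's standard `html.parser` module to report whether a keyword appears in the title tag, in H1 to H3 headings, and in visible text. It only detects presence; it does not reproduce any search engine's scoring. The class `SignalParser`, the function `keyword_signals`, and the sample page are hypothetical names and data made up for this illustration.

```python
from html.parser import HTMLParser


class SignalParser(HTMLParser):
    """Collect text from the title, H1-H3 headings, and visible content."""

    HEADINGS = {"h1", "h2", "h3"}
    HIDDEN = {"script", "style"}  # text inside these is not shown to users

    def __init__(self):
        super().__init__()
        self._open = []  # stack of currently open tags
        self.text = {"title": [], "heading": [], "visible": []}

    def handle_starttag(self, tag, attrs):
        self._open.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the matching open tag, discarding unclosed tags
        # such as <br> or <img> that were pushed along the way.
        if tag in self._open:
            while self._open.pop() != tag:
                pass

    def handle_data(self, data):
        if any(tag in self.HIDDEN for tag in self._open):
            return
        if "title" in self._open:
            self.text["title"].append(data)
            return
        if any(tag in self.HEADINGS for tag in self._open):
            self.text["heading"].append(data)
        # Headings are also part of the visible text.
        self.text["visible"].append(data)


def keyword_signals(html, keyword):
    """Return which page regions contain the keyword (case-insensitive)."""
    parser = SignalParser()
    parser.feed(html)
    parser.close()
    needle = keyword.lower()
    return {
        region: needle in " ".join(chunks).lower()
        for region, chunks in parser.text.items()
    }


# Hypothetical sample page for illustration.
page = (
    "<html><head><title>Blue widgets for sale</title></head>"
    "<body><h1>Widgets</h1><p>We sell blue widgets.</p>"
    "<script>var note = 'blue widgets';</script></body></html>"
)
signals = keyword_signals(page, "blue widgets")
print(signals)
```

For the sample page, the keyword is found in the title and the visible text but not in a heading, and the copy inside the script element is ignored because it is never displayed to the reader.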

Many other signals, such as historical data, may affect a page’s ranking, as indicated in a number of patents held by various search engines.
