Microsoft’s Local Search Algorithm: A Privacy Nightmare?

Local search is big, and everyone is jumping into this segment. In fact, most of us realize that the local results from major search engines are nothing to brag about, unless you are in an area with high internet density like New York or San Francisco.

While companies have been devising different ways to go local, Microsoft’s new technology seems to offer very relevant local search results, except that it could be a privacy nightmare.

In a patent filed recently with the US Patent and Trademark Office (USPTO), Microsoft has described a new way to rank local search results. The inventors cite the following issues with current search results:

  • Search engines use link authority to rank results. While this works well in most cases, it is not very relevant when a person is searching for, say, ‘Italian restaurants near MG Road, Bangalore’. This is because restaurants around MG Road in Bangalore might not necessarily have a high PageRank (PR) value.
  • Some search engines use ‘click popularity’, where sites that have been clicked more often tend to be ranked higher. This creates a positive feedback loop for those sites, which does not help in showing the most relevant results.

To overcome these issues, Microsoft has proposed using the user’s access log to study the pages visited in a specific time period and build an implicit PageRank for those pages, which will be used as a factor when displaying search results. So in the earlier example, if the user has visited BangaloreRestaurants.com, results from this website could be weighted more heavily than results from a site like Yelp.
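
To make the idea concrete, here is a rough Python sketch of how an access log could feed into ranking. This is only an illustration and not the method described in the patent: the function names (`implicit_scores`, `rerank`), the 30-day window, the visit-count scoring, the `weight` parameter and all the sample dates, URLs and base scores are assumptions made up for this example.

```python
from collections import Counter
from datetime import datetime, timedelta
from urllib.parse import urlparse


def implicit_scores(access_log, now, window_days=30):
    """Return a per-site score in [0, 1] based on visits inside the time window.

    Assumption: the score is simply the visit count, normalized by the
    most-visited site. The real patent may weigh visits very differently.
    """
    cutoff = now - timedelta(days=window_days)
    visits = Counter(
        urlparse(url).netloc.lower()
        for visited_at, url in access_log
        if visited_at >= cutoff
    )
    top = max(visits.values(), default=0)
    return {site: count / top for site, count in visits.items()} if top else {}


def rerank(results, scores, weight=0.5):
    """Sort (url, base_score) pairs by base score plus a weighted log-based boost."""
    def combined(result):
        url, base_score = result
        return base_score + weight * scores.get(urlparse(url).netloc.lower(), 0.0)

    return sorted(results, key=combined, reverse=True)


# Hypothetical data: a small access log and two search results with made-up base scores.
now = datetime(2024, 1, 31)
access_log = [
    (datetime(2024, 1, 20), "http://bangalorerestaurants.com/mg-road"),
    (datetime(2024, 1, 25), "http://bangalorerestaurants.com/italian"),
    (datetime(2024, 1, 28), "http://example.org/news"),
    (datetime(2023, 6, 1), "http://yelp.com/bangalore"),  # outside the window, ignored
]
results = [
    ("http://yelp.com/italian-mg-road", 0.8),
    ("http://bangalorerestaurants.com/italian", 0.6),
]

scores = implicit_scores(access_log, now)
ranked = rerank(results, scores)
```

With this sample data, the BangaloreRestaurants.com result moves ahead of the Yelp result even though its base score is lower, because the user visited that site twice within the window. Note that the sketch only works if something has recorded every page the user visited, which is exactly where the privacy question comes in.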

Microsoft says this technology will be particularly useful for ranking pages from intranet websites. While the algorithm sounds interesting, making use of a user’s access log sounds scary. Users are not always comfortable giving third-party websites access to the list of sites they visit. Something does not sound right in Microsoft’s plan to record this log, process it for implicit page ranking and deliver results back to the user.

What do you think? Are the fears justified or are they unfounded?