According to Wikipedia, web scraping is defined as “a computer software technique of extracting information from websites.” Google has now taken measures to penalize site scrapers in an effort to reduce what it considers webspam.
Many websites offer an RSS feed of their content. In the early days, many sites just provided a headline and a paragraph of the content, with the hope that people would follow the link back to their website. Now, it’s more common to include all of the content along with images. While that provides a better experience for people using RSS aggregators such as Google Reader or Bloglines, this also gives a site scraper everything they need to post an entire article on their website.
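To see how little effort a full-content feed demands of a scraper, here is a minimal sketch using only the Python standard library. The feed XML is a made-up example, not any real site's feed; a scraper would fetch the real thing over HTTP, but the extraction step is just this:

```python
# A minimal sketch of how easily a full-content RSS 2.0 feed can be scraped,
# using only the Python standard library. SAMPLE_FEED is a hypothetical
# stand-in for a feed fetched from a real site.
import xml.etree.ElementTree as ET

SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example Blog</title>
    <item>
      <title>First Post</title>
      <link>http://example.com/first-post</link>
      <description>The entire article text, images and all.</description>
    </item>
  </channel>
</rss>"""

def extract_items(feed_xml):
    """Return (title, link, content) for every item in an RSS 2.0 feed."""
    root = ET.fromstring(feed_xml)
    items = []
    for item in root.iter("item"):
        items.append((
            item.findtext("title"),
            item.findtext("link"),
            # Full-content feeds put the whole article body here.
            item.findtext("description"),
        ))
    return items

for title, link, content in extract_items(SAMPLE_FEED):
    print(title, "->", link)
```

A few lines of parsing is all it takes to republish an article wholesale, which is exactly the trade-off full-content feeds make.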
Google is stepping in because many of these scraper sites have dramatically improved their SEO rankings and, in some cases, now rank higher in search results than the original source website.
The reality is that many of these site scrapers will look for workarounds, and it will turn into a cat-and-mouse game. Virus and spambot writers have played this game for years, but it’s in Google’s best interest to clean up its search results, so I suspect it will bring its sizeable resources to bear on the problem.
What do you think…
Is ‘site-scraping’ morally acceptable? Should Google do whatever is necessary to combat it?