This is the last of our 8-part series on developing search engine friendly website structures. It was originally written by Rand Fishkin and Moz Staff, and posted on Moz. Image courtesy Wikimedia Commons.
How scrapers steal your rankings
Unfortunately, the web is littered with unscrupulous websites whose business and traffic models depend on plucking content from other sites and re-using it (sometimes in strangely modified ways) on their own domains. This practice of fetching your content and re-publishing it is called “scraping,” and scrapers can perform remarkably well in search engine rankings, often outranking the original sites.
When you publish content in any type of feed format, such as RSS or XML, make sure to ping the major blogging and tracking services (Google, Technorati, Yahoo!, etc.). You can find instructions for pinging services like Google and Technorati directly from their sites, or use a service like Pingomatic to automate the process. If your publishing software is custom-built, it’s typically wise for the developer(s) to include auto-pinging upon publishing.
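For custom-built publishing software, auto-pinging could look roughly like the sketch below. It assumes the ping service accepts the common weblogUpdates.ping XML-RPC call; the endpoint URL, blog name, and blog URL are placeholders and assumptions, so check each service's current instructions before relying on them.

```python
import xmlrpc.client

# Assumed endpoint for Ping-o-Matic; verify against the service's own docs.
PING_ENDPOINT = "http://rpc.pingomatic.com/"


def build_ping_payload(blog_name: str, blog_url: str) -> str:
    """Return the XML-RPC request body for a weblogUpdates.ping call."""
    return xmlrpc.client.dumps((blog_name, blog_url),
                               methodname="weblogUpdates.ping")


def send_ping(blog_name: str, blog_url: str, endpoint: str = PING_ENDPOINT):
    """Send the ping over the network and return the service's response."""
    proxy = xmlrpc.client.ServerProxy(endpoint)
    return proxy.weblogUpdates.ping(blog_name, blog_url)


if __name__ == "__main__":
    # Inspect the request without touching the network.
    print(build_ping_payload("My Blog", "https://www.example.com/"))
```

A publishing hook would call send_ping once per new post, ideally catching network errors so that a failed ping never blocks publication.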
Next, you can use the scrapers’ laziness against them. Most scrapers on the web re-publish content without editing it. So, by including links back to your site, and to the specific post you’ve authored, you can ensure that the search engines see most of the copies linking back to you (indicating that your source is probably the originator). To do this, you’ll need to use absolute, rather than relative, links in your internal linking structure. Thus, rather than linking to your home page using:
<a href="../">Home</a>
You would instead use:
<a href="https://www.example.com/">Home</a>
(Here www.example.com is a placeholder for your own domain.)
This way, when a scraper picks up and copies the content, the link remains pointing to your site.
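If your templates or feed generator currently emit relative links, one way to convert them before publishing is sketched below. This is a minimal illustration, not a full HTML parser: the function name absolutize_links and the www.example.com URLs are hypothetical, and the regular expression only handles quoted href and src attributes.

```python
import re
from urllib.parse import urljoin

# Matches quoted href="..." or src="..." attributes (simplified on purpose).
LINK_ATTR = re.compile(r'(?<![\w-])(href|src)=(["\'])(.*?)\2', re.IGNORECASE)


def absolutize_links(html: str, page_url: str) -> str:
    """Rewrite relative href/src values as absolute URLs based on page_url."""
    def replace(match: re.Match) -> str:
        attr, quote, target = match.groups()
        # urljoin leaves already-absolute URLs (and mailto: links) unchanged.
        return f"{attr}={quote}{urljoin(page_url, target)}{quote}"

    return LINK_ATTR.sub(replace, html)


if __name__ == "__main__":
    post = '<a href="../">Home</a> <a href="/about">About</a>'
    print(absolutize_links(post, "https://www.example.com/blog/"))
```

Running the conversion at publish time, rather than by hand, keeps every copy of the content (including the feed) pointing at your domain.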
There are more advanced ways to protect against scraping, but none of them is entirely foolproof. You should expect that the more popular and visible your site gets, the more often you’ll find your content scraped and re-published. Many times, you can ignore this problem. But if it becomes severe, and you find the scrapers taking away your rankings and traffic, you might consider using a legal process called a DMCA takedown. Moz CEO Sarah Bird offers some quality advice on this topic: Four Ways to Enforce Your Copyright: What to Do When Your Online Content is Being Stolen.