Content Scraping: Is Someone Ripping Your Content? – Part 1
You won’t even know it if you don’t check regularly.
Content scrapers are simple programs that harvest on-topic content from sites or blogs and repost it to the scraper's own site. The sole purpose of content scrapers is to rip content, post it to a series of junk sites slathered with PPC and paid advertising, and make money on clickthroughs.
Spamming Search Engines
Sites built on content scraping don’t cost a lot of money – just a low-cost hosting fee. Scrapers search the web for sites that post content on the “subject” of the scraper’s website. So all of the content is related to a topic – usually something about your health and well-being, your family, your finances or some other topic with wide appeal.
Scrapers register a dozen domain names (cost: less than $30 USD) and build a dozen duplicate sites. Then the scraper links all of these garbage sites together, in effect spamming search engines, which “think” that all of this low-level interlinking indicates frequently visited sites and steer searchers toward any of the dozen scraper sites.
Google has cracked down on scraper sites, but they’re still out there. And, (1) they may be scraping content for use on their ring of low-rent PR0 sites, or (2) they may be linking to your higher ranking site and dragging you down – without your knowledge.
Scrapers make their money on clickthroughs. No self-respecting site owner would place a paid advert on one of these sites, knowing that the site’s sole objective is to generate revenues via clickthroughs.
Google hates link farms. The prime objective of any search engine is to deliver quality SERPs – the best possible search results based on the user’s query keywords. Link farms water down the quality of SERPs and, as a result, diminish the value of Google’s #1 product – links to relevant sites. It’s not nice to make Google mad.
Most link farms scrape content from blogs and web sites with impunity. Let’s face facts: copyright enforcement on the W3 is practically non-existent. Do you think I’m going to take the time to sue some content scraper who ripped off an article from my site and posted it to her site without permission? How much do you think I’d collect in damages? Add in crossing international borders and dealing with the copyright laws of another nation, and it’s not worth the expense – and content scrapers know it.
So they rip you off. They post your content and content scraped from a dozen other sites, build a bunch of low-rent sites, populate these PR0 sites with ripped content, link the sites together and create the appearance of a series of quality sites.
How Do You Know If You’ve Been Scraped?
The easiest and least expensive way to see if you’re the victim of content scraping is to use search engines, since they’re the target of content scrapers (so don’t take it personally).
Go back into your archives. Find a sentence in each post you’ve made. Choose a sentence that has a distinctive phrase or even anchor text in it.
Next, copy and paste that sentence into the Google search box and wrap it in quotes. This tells Google that you ONLY want SERPs that contain this exact sentence. With luck, any scraped copies will show up right there in the results.
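If you have a lot of archived posts, checking them one at a time gets tedious. The steps above can be sketched as a small Python helper that builds the exact-phrase search URL for each distinctive sentence (the sentence used here is just a placeholder – substitute lines from your own posts):

```python
from urllib.parse import quote_plus

def quoted_search_url(sentence: str) -> str:
    """Build a Google search URL for an exact-phrase match.

    Wrapping the sentence in double quotes tells Google to return
    only pages that contain this exact sentence.
    """
    return "https://www.google.com/search?q=" + quote_plus(f'"{sentence}"')

# Example: a distinctive sentence pulled from one of your archived posts.
print(quoted_search_url("the rules fly right out the window"))
```

You can loop this over a list of sentences and open each URL in a browser tab; scripting the actual searches is left out here, since automated querying runs afoul of Google’s terms of service.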
However, if you get a “No results found” message, or your blog is the only one that shows up on SERPs, it still doesn’t mean you haven’t been scraped. Content scrapers are black hats so the rules fly right out the window when dealing with these clowns.
Another way to detect scraping is to type the search query link:website.com into Google, or to install the free and mighty SEO for Firefox tool and have a look at your backlinks. If any of those linking domains consist of numbers with dashes in them, they’re most likely sites you want to stay away from. Click through, and if ads are flying all over the page – bingo, you’ve just caught a thief with your own bare hands.
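That “numbers with dashes” pattern is easy to screen for automatically. Here is a minimal sketch in Python – the sample backlink list and the exact regex are illustrative assumptions, not a definitive scraper fingerprint:

```python
import re

# A domain like "123-4567-89.info" – nothing but digits and dashes
# before the TLD – is a common pattern for throwaway scraper sites.
SUSPECT = re.compile(r"^[\d-]+\.[a-z]+$", re.IGNORECASE)

def looks_like_scraper_domain(domain: str) -> bool:
    """Flag domains made up entirely of digits and dashes."""
    return bool(SUSPECT.match(domain))

# Hypothetical backlink list, e.g. exported from a backlink tool.
backlinks = ["example.com", "123-4567-89.info", "myblog.org"]
suspects = [d for d in backlinks if looks_like_scraper_domain(d)]
print(suspects)  # → ['123-4567-89.info']
```

A flagged domain isn’t proof by itself – it’s just a cheap first filter before you click through and look for the wall-to-wall ads.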
Some content scrapers will take your beautifully written piece and stuff it with their own keywords, inserting them throughout your article. So a simple Google search using quotes may not turn up a content scraper.
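When exact-phrase search fails because of keyword stuffing, fuzzy matching can still catch the copy. A minimal sketch using Python’s standard-library difflib (the sample strings are invented for illustration): a keyword-stuffed copy still scores close to 1.0, while unrelated text scores low.

```python
from difflib import SequenceMatcher

def similarity(original: str, candidate: str) -> float:
    """Similarity ratio in [0, 1] between two passages of text."""
    return SequenceMatcher(None, original.lower(), candidate.lower()).ratio()

original = "Content scrapers rip articles and repost them without permission."
stuffed = ("Content scrapers cheap hosting rip articles and repost them "
           "best SEO tips without permission.")

# Inserted keywords lower the score only slightly; it stays high.
print(similarity(original, stuffed))
```

In practice you would fetch a suspect page, strip its HTML, and compare each of your posts against it, flagging anything above a threshold (say 0.7) for manual review.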
Read Part 2 here