
A billionaire's guide to stopping "theft" of your online newspaper content...and relegating your business to obscurity

Recently, I wrote that billionaire Sam Zell's inflammatory remarks about Google stealing the content of online newspapers seemed unfair and inaccurate:

“If all of the newspapers in America did not allow Google to steal their content for nothing, what would Google do?” he asked. “We have a situation today where effectively the content is being paid for by the newspapers and stolen by Google, etcetera. That can last for a short time, but it can’t last forever. I think Google and the boys understand that.”
- Sam Zell, Stanford Business Daily

Google isn't stealing this material. The newspapers have left the content wide open and simply haven't asked Google not to use it. Google News appropriates Fair Use materials (captions and image thumbnails) and subsequently drives significant traffic through to online news sites that might not otherwise receive page views or revenues from these readers (at this time, Google News does not even earn revenue from this activity). A reporter from the Seattle Post-Intelligencer recently said that 10 percent of their site's traffic comes from Google News. Online news aggregators like Google News and NewsCloud are good for the online newspaper business.

Yet I feel bad for Mr. Zell, having spent $8 billion on the Los Angeles Times and Chicago Tribune while barely having any technical knowledge of how the Internets work. To help him out, I've written an easy how-to guide for stopping "theft" of your online newspaper content ... but it might as well be called "How to relegate your online newspaper to obscurity and minimize your subscriber base" or "Minimizing the bandwidth usage of your online newspapers" or "My Secrets of Search Engine De-Optimization". Mr. Zell needs to understand that the way these sites are operating today is essentially like leaving your garage door open every day with a sign that says "Community Tool Lending Library". You're just asking for someone to use your stuff.

Here are my nine easy steps to stop the Google Boys from driving traffic to your business:


1. Tell Google not to index your online newspaper

Google won't crawl your site and drive you all that bandwidth-eating, revenue-generating traffic if you don't want them to. "Google News obeys standard web protocols for robots.txt files and robots meta tags. For more detailed information about creating robots.txt files and robots meta tags, please visit http://www.robotstxt.org/wc/exclusion.html"

Simply put a file called robots.txt in your root directory with the following content to block Google:


User-agent: Googlebot
Disallow: /

or block all search engines:

User-agent: *
Disallow: /

or put this inside the <HEAD></HEAD> section of any HTML page you want to exclude:

<META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW">
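Before you deploy, you can check offline that the two robots.txt variants above really do slam the door. This is only a minimal sketch using Python's standard urllib.robotparser module; the article URL and the "SomeOtherBot" crawler name are made up for illustration, and real crawlers may interpret edge cases differently than this parser does.

```python
from urllib.robotparser import RobotFileParser

# The two robots.txt variants from step 1.
BLOCK_GOOGLE = """User-agent: Googlebot
Disallow: /
"""

BLOCK_ALL = """User-agent: *
Disallow: /
"""

# Hypothetical article URL, used only for illustration.
ARTICLE_URL = "http://www.latimes.com/news/story.html"


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for name, rules in (("block Google", BLOCK_GOOGLE), ("block all", BLOCK_ALL)):
        for agent in ("Googlebot", "SomeOtherBot"):
            verdict = "allowed" if is_allowed(rules, agent, ARTICLE_URL) else "blocked"
            print(f"{name}: {agent} is {verdict}")
```

With the first variant only Googlebot is turned away; with the second, every crawler that honors robots.txt is.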

2. Don't allow anyone (especially those pesky buzz-building bloggers) to link to your content

Use Apache's mod_rewrite and .htaccess to redirect all incoming traffic back to your home page. Force new visitors to search for the content they want on your site - that'll be a good experience for them. Put this code in your .htaccess file:

RewriteEngine On
RewriteCond %{HTTP_REFERER} !^$ [NC]
RewriteCond %{HTTP_REFERER} !^http://latimes\.com [NC]
RewriteCond %{HTTP_REFERER} !^http://www\.latimes\.com [NC]
RewriteRule ^.*$ http://www.latimes.com/ [R,L]
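If you'd like to see who gets bounced before you inflict this on real readers, the referer logic can be modeled in a few lines. This is a rough sketch, not Apache itself: it uses Python's re module in place of mod_rewrite's regex engine, writes the hostname with its dots escaped, and the story URL and blog address are invented for the example.

```python
import re

HOME_PAGE = "http://www.latimes.com/"

# Referers that count as "our own site" (with or without www, any case).
OWN_SITE = re.compile(r"^http://(www\.)?latimes\.com", re.IGNORECASE)


def destination(requested_url: str, referer: str) -> str:
    """Model of the step 2 rules: where does this visitor end up?

    A visitor is redirected to the home page only when the Referer
    header is non-empty AND does not point at our own site.
    """
    if referer and not OWN_SITE.match(referer):
        return HOME_PAGE
    return requested_url


if __name__ == "__main__":
    story = "http://www.latimes.com/news/story.html"  # hypothetical URL
    print(destination(story, "http://someblog.example/post"))  # external link
    print(destination(story, ""))  # typed URL or bookmark
    print(destination(story, "http://www.latimes.com/"))  # internal link
```

Note what the empty-referer condition means in practice: anyone arriving from a bookmark, or from a browser that withholds the Referer header, still gets straight through.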

3. Don't allow people to hot link your images

The above solution works great, but this one will put an unfriendly error graphic of your choice on the offending site:

RewriteEngine On
RewriteCond %{HTTP_REFERER} !^http://(.+\.)?mysite\.com/ [NC]
RewriteCond %{HTTP_REFERER} !^$
RewriteRule .*\.(jpe?g|gif|bmp|png)$ /images/nohotlink.jpe [L]

A List Apart has some suggestions as well, but that's only for people who care about their readers' experience. Those ALA guys are a bit lefty if you ask me.
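The hotlink rules in step 3 can be checked offline the same way. Again, this is a sketch with Python's re standing in for mod_rewrite, and the sample paths and blog address are invented. One detail worth noticing: the replacement graphic ends in .jpe, which the image pattern does not match, so the substitute image presumably isn't rewritten a second time when the offending page requests it. That reading is mine; the snippet itself doesn't say so.

```python
import re

# Referers that count as our own site, including any subdomain.
OWN_SITE = re.compile(r"^http://(.+\.)?mysite\.com/", re.IGNORECASE)

# Same image pattern as the RewriteRule (case-sensitive, like the original).
IMAGE = re.compile(r".*\.(jpe?g|gif|bmp|png)$")

REPLACEMENT = "/images/nohotlink.jpe"


def served_path(path: str, referer: str) -> str:
    """Model of the step 3 rules: which file is served for this request?"""
    hotlinked = bool(referer) and not OWN_SITE.match(referer)
    if hotlinked and IMAGE.match(path):
        return REPLACEMENT
    return path


if __name__ == "__main__":
    blog = "http://someblog.example/post"  # hypothetical hotlinking page
    print(served_path("photos/front.jpg", blog))
    print(served_path("photos/front.jpg", "http://www.mysite.com/index.html"))
    print(served_path("images/nohotlink.jpe", blog))
```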

4. Don't let anyone frame your content

Minimize the revenue outside aggregators and news readers generate for you. Put annoying anti-framing JavaScript in place. Anytime someone tries to frame a page on your site, it'll jolt them out of that convenient browsing experience:

<script language="JavaScript" type="text/javascript">
<!--
function framebreakout()
{
  // Generated by thesitewizard Frame Breakout JavaScript Wizard 2.0
  // Visit http://www.thesitewizard.com/ to get your own
  // frame breakout script FREE!
  if (top.location != location) {
    top.location.href = document.location.href;
  }
}
//-->
</script>
<body onload="framebreakout()">
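If a script feels too easy for readers to disable, browsers also honor response headers that ask them not to render a page inside a frame at all: X-Frame-Options and the Content-Security-Policy frame-ancestors directive. Here is a minimal sketch of sending both, assuming (and it is purely an assumption, since nothing here says what the newspaper's servers run) that the site is a Python WSGI application. The article_app function is a hypothetical stand-in for the real site.

```python
def deny_framing(app):
    """WSGI middleware that adds anti-framing headers to every response."""

    def wrapped(environ, start_response):
        def start_with_headers(status, headers, exc_info=None):
            # Append the two headers that tell browsers not to frame the page.
            headers = list(headers) + [
                ("X-Frame-Options", "DENY"),
                ("Content-Security-Policy", "frame-ancestors 'none'"),
            ]
            return start_response(status, headers, exc_info)

        return app(environ, start_with_headers)

    return wrapped


def article_app(environ, start_response):
    """Hypothetical stand-in for the newspaper's real application."""
    start_response("200 OK", [("Content-Type", "text/html; charset=utf-8")])
    return [b"<html><body>Story text</body></html>"]


# Wrap the application once; every page it serves now carries the headers.
app = deny_framing(article_app)
```

Unlike the script, this approach doesn't break readers out of the frame; a browser that honors the headers simply declines to show the page there.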

5. Turn off your RSS feeds

If you don't want people stealing your titles and captions, shouldn't you stop publishing them in a ready-made machine-readable format? Besides, the only reason you have RSS feeds is because your geek engineers thought it would be cool and help build traffic...but really, they just have too much time on their hands. Maybe you should fire them and outsource future work overseas.

If you have fired them already, just delete any lines that look like this from your Web pages:
<link rel="alternate" type="application/rss+xml" title="Los Angeles Times Welcomes Feed Readers" href="http://www.latimes.com/rss.xml" />

6. Force people to pay for your content

The Wall Street Journal requires readers to pay for access. The New York Times requires you to pay to read its Op-Ed contributors such as Maureen Dowd, Paul Krugman and Thomas Friedman. What? Haven't heard of them? That's probably because the online readership of their content has dropped significantly since The New York Times put them behind a subscription firewall.

But that's probably for the best: you don't want your contributors to have influence or impact. That might just increase your bandwidth costs. To implement a subscription firewall, look into a free open source user authentication solution like Rampart/UMA.

7. Don't let people copy text from your Web site

Pesky bloggers are notorious for copying excerpts of your articles and pasting them all over their sites. They often evade the law by only copying small amounts of text (they call it fair use). Well, let's close that loophole. The following script on your Web pages will prevent them from selecting and copying text:

<!-- Paste this code into an external JavaScript file named: disableSelect.js -->

/* This script and many more are available free online at
The JavaScript Source :: http://javascript.internet.com
Created by: James Nisbet (morBandit) :: http://www.bandit.co.nz/ */

window.onload = function() {
  document.onselectstart = function() { return false; } // ie
  document.onmousedown = function() { return false; } // mozilla
}

/* You can attach the events to any element. In the following example
I'll disable selecting text in an element with the id 'content'.
Use this version instead of the one above, not in addition to it:
a second window.onload assignment replaces the first. */

window.onload = function() {
  var element = document.getElementById('content');
  element.onselectstart = function() { return false; } // ie
  element.onmousedown = function() { return false; } // mozilla
}

<!-- Paste this code into the HEAD section of your HTML document.
You may need to change the path of the file. -->

<script type="text/javascript" src="disableSelect.js"></script>

8. Don't let anyone print articles from your Web site

Another risk of the Internets is that people often print articles and share them with their friends. This might only lead people to talk about your newspaper and, again, increase future bandwidth costs. Put this code in your style sheet and your readers will print only blank pages. Ha, the joke's on them:

/* disable print */
@media print {body {display:none;}}

9. Turn off those APIs and Web Services

APIs and Web Services just make it easy for people to steal your content. Just turn them off ... and fire anyone who even mentions the word API.

In closing, preventing "theft" is up to you. Open source developers (and others) have created the technology to protect you. But in this personal responsibility world, it's up to you, Sam. Go ahead, indulge your Luddite ambitions.

Comments

I sometimes imagine a true open-source automobile -- a small stylish car like the Mini Cooper that could be built from a kit. There have been kit cars before, odd-looking plastic sports car replicas, but I'm talking about a well-designed daily driver that's hip and fun. It would arrive in big boxes in your driveway and you'd spend the afternoon putting it together.

Some people would love to spend an afternoon doing that, others would like to but might not feel up to it. The handy fellow down the street might put it together for you if you bake him a pie or help him with his computer.

Somebody might open a shop in town, hire some high school kids and put cars together for people.

People would get their cars, some people would make some money, some people would still want to buy elaborate professionally built vehicles, but the automakers would be severely disrupted.

That's what's happening to news companies. I wonder though how people would feel if every industry succumbed to the "x wants to be free" concept. What if we're all just selling Google ads for products none of us wants to pay money for?

We can pay each other in Googleads. A loaf of bread? That's 2 Googleads. Root canal? Mega-Googleads.

I agree with your tongue-in-cheek commentary but I (following a link from Poynter, ta da) was hoping for tips on how to prevent other sites from flat-out stealing my unique content.

As the editor of a small daily I pay a professional staff to report news from our communities that you can't get anywhere else. I love it when people are drawn to my Web site by links from other sites. It drives my traffic. Watch what happens when one of your stories is discussed on Drudge or Fark.

That's a good thing. However, upstart sites think they can cut and paste our stories and attribute them to our paper. Not good. I spend a lot of time chasing these down.

What we do has great value and we spend a lot of money producing it. Link to those stories and you're my friend. Steal them and we have a problem, Houston.

Why do you charge for your site's various services? For T-shirts? Because it costs money to produce things. Wouldn't you get more comments if you didn't have to have other accounts?

The Wall Street Journal doesn't have much trouble selling its content to large numbers of people, to no apparent detriment to their paper product. Sometimes this is a little difficult to communicate to the 'something for nothing' crowd.

Do you expect to be paid for what you do or what you create? How complicated is that? You can't eat traffic, or use it to pay the mortgage. Web advertising is rising, but it's still a relative pittance, and is likely to remain so for the next 8-10 years.

If Dell and Apple gave away their PCs, wouldn't they stand to make much more in sales of spare parts, programs and replacement components? Wouldn't that reasoning work for Honda and Toyota? They'd make exponentially more money on spare parts if they just gave the cars away.

If, Justice Dept. and AP permitting, local newspapers would all decide not to give away unique local content on the net (apart from providing it as a convenience to paper subscribers), they might be able to make enough money to keep paying the people who report and edit it.

Mark I. Pinsky
Senior Reporter/Religion
Orlando Sentinel

I am not averse to subscription models, micropayment or advertising - I just think Zell's comments were unfair and ignorant. The technology exists to restrict Google's access to the content. It's not being applied.

If you have the means to achieve your goals, and those means are widely accepted and published, then complaining about the problem in a public forum seems, to me, inconsiderate of Google's efforts to provide those options.

I wonder if I could sell Search-Engine De-Optimization as a service? (mind wandering)

Re: "the Seattle Post-Intelligencer recently said that 10 percent of their site's traffic comes from Google News." I'd bet that the percentage would be much higher if they included plain Google searches.

It seems that most newspapers are really missing out on attracting eyeballs to their sites -- because they do not use common SEO practices. Their audience could be so much larger.
