Reddit: The Bandwidth Thief

For those of you who are unfamiliar with reddit, it’s a social bookmarking site much like Digg. It’s got a smaller user base but, in my experience, there is a different atmosphere or feeling amongst its users. It’s a lot more friendly in the comments section.

Despite its smaller user base I’ve had posts on some of my sites receive as many as 50,000 hits a day, so the traffic it sends is not something to be dismissed lightly.

Yet, despite my praise of reddit, I do have an issue with them, and my biggest problem is that they actively encourage hot-linking.

If you visit the “pics” section of reddit you will clearly see the following guidelines posted on the right-hand sidebar:

  • NSFW or nudity containing posts should be marked as such, or will be banned. SFW marking is optional for posts with ambiguous titles.
  • Do not put URLs in your images
  • Direct links to images are preferred. No blogspam.

Let’s look at this from the bottom up. Reddit encourages their users to link directly to the images on your blog. If you’ve created that image, shot that photograph or spent time crafting a post around that image, well, you can forget about anybody seeing your site or your post.

All they will see is the image.

Reddit will use your bandwidth to show thousands of people your image without so much as a link back or a thank you.

Let’s say you have a 200kB image that gets submitted to reddit and is seen 50,000 times. That’s 10,000,000 kB, or 9,765.625 MB, or 9.53674 Gigs.

That’s over 9.5 Gigs of your bandwidth used up, most likely without you even knowing it until you check your bandwidth logs and receive your bandwidth bill.
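For anyone who wants to plug in their own numbers, the arithmetic above can be sketched in a few lines of Python. It uses the same figures as the example (200 kB, 50,000 views) and assumes 1024 kB to the MB and 1024 MB to the Gig; the variable names are just for illustration.

```python
# Rough bandwidth estimate for a hot-linked image, using the post's figures.
IMAGE_SIZE_KB = 200   # size of one image, in kB
VIEWS = 50_000        # number of times the image is served

total_kb = IMAGE_SIZE_KB * VIEWS   # 10,000,000 kB
total_mb = total_kb / 1024         # assumes 1024 kB per MB
total_gb = total_mb / 1024         # assumes 1024 MB per GB

print(f"{total_kb:,} kB = {total_mb:,.3f} MB = {total_gb:.5f} GB")
```

Change the two constants at the top to match your own image size and traffic.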

Forget about the adverts you run on your site paying for your bandwidth because nobody will see them. Nobody will click through to your other posts. Most people will never even be aware of what site the image came from.

So greedy and selfish is reddit that not only do they want to use your bandwidth to show your images to their users, they don’t even want you to be able to have your URL in your image in the vague hope that somebody might type it into the address bar and check it out.

How dare you want any credit for your own work! How dare you!

When an image is submitted there isn’t a single link back to the originating site. If you submitted an image from this site the only links associated with that image would be the one directly to the image itself and one to search for more images from the originating domain.

That sucks. That’s encouraging bandwidth theft and hot-linking. It’s leeching off of bloggers without so much as the courtesy of a link back.

That’s so cheap that if reddit were at a gay orgy they would be the only ones refusing to give a reach around.

Has Social Media Legitimized Hotlinking in Web 2.0?

Hotlinking used to be a bad thing. Essentially bandwidth theft, hotlinking is the practice of displaying an image (or video, audio file etc.) on your site but actually having the file served by someone else’s server. In other words, you use their bandwidth to display the file on your site. (This is not the same as content theft, which is taking someone else’s work and presenting it as your own. That’s another discussion for another post.)

We all know that bandwidth isn’t free.

Last night, as Daily Shite clicked over its 1 millionth page view for February, I was looking at our bandwidth usage and wondering where all of the 423 Gigs we’ve transferred over the past 15 days went.

Obviously, on a humor site like ours that aggregates the best content from around the web, hotlinking can be an issue.

We don’t hotlink. We make sure that all images and files are hosted on our own server and even go as far as having measures in place to ensure that anything that might accidentally get hotlinked is automatically cached on our server and the link rewritten.
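To give a rough idea of what a “cache it and rewrite the link” measure could look like, here is a minimal Python sketch. It is not the actual Daily Shite setup: the host name `example.com`, the `/cache/` prefix and both function names are assumptions made up for illustration, and it only handles `<img src="...">` tags written with double quotes. It rewrites the links and returns the list of remote URLs that would still need to be downloaded; the download itself is left out.

```python
import hashlib
import re
from urllib.parse import urlparse

OWN_HOST = "example.com"    # hypothetical: the site's own hostname
CACHE_PREFIX = "/cache/"    # hypothetical: where local copies would live

# Matches <img ... src="URL" and captures the URL separately.
IMG_SRC = re.compile(r'(<img\b[^>]*?\bsrc=")([^"]+)(")', re.IGNORECASE)


def local_path(url):
    """Map a remote image URL to a stable local cache path."""
    digest = hashlib.sha1(url.encode("utf-8")).hexdigest()[:16]
    ext = urlparse(url).path.rsplit(".", 1)[-1].lower()
    if ext not in {"jpg", "jpeg", "png", "gif"}:
        ext = "img"  # unknown or missing extension
    return f"{CACHE_PREFIX}{digest}.{ext}"


def rewrite_hotlinks(html):
    """Return (rewritten_html, remote_urls_to_cache)."""
    to_fetch = []

    def replace(match):
        url = match.group(2)
        host = urlparse(url).hostname
        # Relative URLs (no host) and our own host are left untouched.
        if host is None or host == OWN_HOST:
            return match.group(0)
        to_fetch.append(url)
        return match.group(1) + local_path(url) + match.group(3)

    return IMG_SRC.sub(replace, html), to_fetch
```

A real version would also fetch each URL in the returned list and save it at the matching cache path before publishing the rewritten post.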

Yet so many people hotlink today, and I feel that sites like Digg, Facebook, Reddit and even my beloved Google Reader are to blame. Instead of supporting the idea that hotlinking is unacceptable, they have played a major hand in making it acceptable.

There are ways and means to prevent hotlinking. It’s easy, actually. The problem comes not with the prevention, but with the fact that oftentimes you need to allow it in order to promote your content on the web.

If hotlinking is prevented then the images won’t show in Google Reader. Facebook, Reddit and Digg won’t be able to show thumbnails for the posts. This is a negative thing for most publishers, whether they are aggregators (like we are on Daily Shite) or content creators (like I am on this blog).

No thumbnails means less incentive for users to click through and people can be such very visual creatures.

I suppose I could keep a separate directory of thumbnails outside the “protected directories” of my site for other sites to use, but that is quite frankly a huge pain in the backside to organize, and something that 99% of people will never think about, let alone have the ability to do.
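To make the idea concrete, here is a minimal Python sketch of one common way to do it: check the HTTP `Referer` header on each image request, and exempt a public thumbnails directory. The host names, the `/thumbs/` prefix and the function name are all made up for illustration. Bear in mind that the `Referer` header can be missing or faked, so this is a deterrent rather than real protection.

```python
from typing import Optional
from urllib.parse import urlparse

# Hypothetical values for illustration; none of these come from a real site.
OWN_HOSTS = {"example.com", "www.example.com"}
PUBLIC_PREFIX = "/thumbs/"   # thumbnails other sites are allowed to embed


def should_serve(path: str, referer: Optional[str]) -> bool:
    """Decide whether to serve an image request.

    path    -- the requested URL path, e.g. "/images/cat.jpg"
    referer -- the value of the HTTP Referer header, or None if absent
    """
    # Thumbnails sit outside the protected directories, so always serve them.
    if path.startswith(PUBLIC_PREFIX):
        return True
    # Some clients send no Referer at all; blocking those would also break
    # ordinary direct visits, so let them through.
    if not referer:
        return True
    # Otherwise serve only when the embedding page is one of our own.
    return urlparse(referer).hostname in OWN_HOSTS
```

With this in place, a request for `/images/cat.jpg` embedded on another site would be refused, while `/thumbs/cat.jpg` would still be served to Facebook, Reddit or Digg for their previews.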

To be honest I don’t begrudge Facebook or Reddit using my images when someone promotes one of our posts, in fact I encourage it. It’s promotion for us.

But what I do have an issue with is that sites like Digg, Stumbleupon and Reddit allow linking to an image directly outside of the structure of the site hosting it and that is a problem for me.

When 20,000 or 50,000 people stumble an image on your site and all they see is the image itself, with none of your content or the advertising that pays for the bandwidth used to serve that image 50,000 or even 100,000 times, then there is an issue.

For some sites, 50,000 or 100,000 views of a reasonably sized image could eat their entire bandwidth cap and leave them having to shell out money, because the major media sharing sites are too lazy to implement a simple bit of code that would ensure the page an image is on must be linked to, rather than stealing some poor sod’s bandwidth.

Sharing on sites like Digg, Facebook and Stumbleupon is a partnership. They may promote our content for us when it is submitted, but we also provide them with content for their community to view and to discuss. Without our content these sites are nothing and we get considerably less traffic without them.

It’s a partnership, or at least it’s supposed to be. When it comes to images and hotlinking, the partnership is sorely one-sided.

What the heck is Buzzwatch and why is it eating your bandwidth?

A good friend of mine got hit with a notice from his web host today that his site was exceeding his allocated bandwidth for the month.

The site in question has a bandwidth allocation of 2.75 GB, which should be more than enough for its current traffic load.

Anyway, I was asked to take a look at the site to see if there was any reason for such a spike this month and this is what I found on the list of hosts:

[Screenshot: bandwidth usage by host]

That is 513.85 MB being used up by one host in just January alone.
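If your host doesn’t give you a per-host breakdown like this, you can get a rough one from the raw access log. The Python sketch below assumes the log is in the Common Log Format (host, ident, user, time, request, status, bytes); the sample lines, IP addresses and function name are made up for illustration and are not taken from the site in question.

```python
import re
from collections import Counter

# Common Log Format: host ident user [time] "request" status bytes
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]*\] "[^"]*" \d{3} (\d+|-)')


def bytes_per_host(lines):
    """Return a Counter mapping each client host to total bytes sent to it."""
    totals = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # skip lines that are not in the expected format
        host, size = match.groups()
        if size != "-":  # "-" means no response body was sent
            totals[host] += int(size)
    return totals


# Made-up sample lines using documentation-only IP addresses.
sample = [
    '192.0.2.10 - - [31/Jan/2008:04:45:00 +0000] "GET /a.jpg HTTP/1.1" 200 204800',
    '192.0.2.10 - - [31/Jan/2008:04:45:01 +0000] "GET /b.jpg HTTP/1.1" 200 102400',
    '198.51.100.7 - - [31/Jan/2008:04:45:02 +0000] "GET / HTTP/1.1" 304 -',
]

# Print the heaviest consumers first.
for host, total in bytes_per_host(sample).most_common():
    print(f"{host}: {total / (1024 * 1024):.2f} MB")
```

To run it against a real log, pass an open file instead of `sample`, for example `bytes_per_host(open("access.log"))`, where `access.log` is a placeholder for wherever your server writes its log.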

The IP address resolves to a site called Buzzwatch, which appears to be a product of Lokion (introduced at the end of last year), whose site offers only this brief explanation:

Buzzwatch identifies your market and brand opportunities by distilling the comments and information posted on user sites such as blogs, forums, and review sites. Used in conjunction with usability studies and focus groups, this invaluable view into consumer commentary gives you actionable insights in a cost-efficient manner.

Actually the Buzzwatch site itself does not offer much more of an explanation and does not contain much more text than this:

Buzzwatch identifies your market and brand opportunities by distilling the comments and information posted on user sites such as blogs, user forums, and review sites. Buzzwatch also culls information from sites where online communities discuss your brand or industry. Used in conjunction with usability studies and focus groups, this invaluable view into consumer commentary will give you actionable insights in a cost-efficient manner.

I’ve asked the site owner whether or not they’ve signed up for Buzzwatch or even heard of them before and they haven’t.

So that leads me to ask: What is Buzzwatch and why is it pulling half a gig of bandwidth from a single blog?

Can any of the O’Flaherty readers help here?

Have you dealt with Buzzwatch, or do you have some insight into what they do and why they are so data hungry as to pull half a gig in one month from a blog?

I have emailed Lokion for more information, but if anybody has an answer tonight I’ll be happy, as it’s 4:45 am and I have yet to get to bed for my 5:30 am start.

Update: 18:51 31st January 2008

I just received this email from the VP of Strategy at Lokion:

Paul,

I apologize our system was using up that much bandwidth on your friend’s site, that is not how the application should have been behaving and is unintentional. Primarily based on this we have completely turned off the application that caused the problem and have discontinued it. I’m sorry if this caused any inconvenience.

Please feel free to call me directly if you have any concerns.

I’m still in the dark as to what Buzzwatch is supposed to do though!