taylorbarstow.com

SEO Attacks

People have discovered many ways to disable, harm, or otherwise compromise websites and web services over the years. We have cross-site scripting (XSS), denial of service (DoS), and phishing, to name a few. Today I’m writing about one that I’m surprised hasn’t gotten more attention or, frankly, become more of a problem: so-called SEO attacks.

What’s an SEO attack? It’s a way of making a site appear less attractive and/or relevant to search engines. This actually isn’t at all hard to do. Anywhere you can post a public comment on a page, you can participate in an SEO attack.

At its worst, an SEO attack could be blind comment spamming, which would be easy for a site owner to detect and prevent. At its best, however, an SEO attack could be very difficult to trace indeed…

Attack #1: Profanity

Google’s SafeSearch definitely filters out results containing profanity—I’ve experienced this with my own blog. (I’m not sure how other search engines handle this.) This makes an SEO attack extremely simple. Just post a public comment that is seemingly on-topic yet has a profanity slipped in there. If the comment gets posted, you just killed the page’s chances of showing up in Google search results.

Attack #2: Link Spamming

We know that all web crawlers use hyperlinks to move from page to page and from site to site. The best web search algorithms also use hyperlinks to help determine a page’s authority on/importance to a given topic. You could pretty easily launch an SEO attack given this information. For example, you could post a comment with a link to a porn site, or to a known link farm.

Potential Harm

Could this really hurt someone’s business? Sure it could. A couple of sites (Stack Overflow and Amazon) jump to mind very quickly. Both of these derive significant benefit from Google searches (Stack Overflow especially, since it is still a baby). Any real cut in search engine relevance would be felt immediately.

Solutions?

These attacks may seem absurdly simple, but in fact they are not easy to stop. Moderation works but doesn’t scale (Stack Overflow or Amazon could never moderate every post by hand, for example). Automated filtering works for profanity, but not for link spamming.
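To make the "automated filtering works for profanity" part concrete, here is a minimal Ruby sketch of a word-list check that holds a comment for review instead of publishing it. Everything in it is an assumption for illustration: the `BLOCKED_WORDS` list is a tame placeholder, and `blocked_words_in` and `publishable?` are made-up names, not part of any blogging platform. It only catches exact whole-word matches, so creative spellings would still slip through.

```ruby
require "set"

# Hypothetical blocklist; a real one would be far longer and kept elsewhere.
BLOCKED_WORDS = Set.new(%w[darn heck]).freeze

# Returns the blocked words found in a comment, matched case-insensitively
# on whole words only.
def blocked_words_in(comment)
  comment.downcase.scan(/[a-z']+/).select { |word| BLOCKED_WORDS.include?(word) }.uniq
end

# True when the comment contains no blocked words and can go straight out.
def publishable?(comment)
  blocked_words_in(comment).empty?
end

["Great post, thanks!", "Well, heck, that is a fair point."].each do |comment|
  status = publishable?(comment) ? "publish" : "hold for review"
  puts "#{status}: #{comment}"
end
```

Nothing comparable exists for links, since deciding whether a linked site is harmful means knowing something about the site, not just about the text of the comment.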

Conclusions

Uh, not sure I have any. SEO attacks seem like a potential problem, and I don’t see an easy solution. At the very least, I’ve concluded that this is a topic that warrants further investigation.

Here’s wishing you an SEO-safe site!

(Admittedly, SEO attacks are a particular type of XSS attack in that they involve injecting something harmful into someone else’s website. XSS attacks, however, more generally use some discrete technical vulnerability to do their work. SEO attacks are all about content which is publicly posted and visible to anyone, and they require zero technical know-how to launch.)

Test Awareness Month

It’s Test Awareness Month, folks, and I guess I’d better do my part. For one thing, let me echo Bryan Liles and say that you should Test All The Fucking Time. I’m not going to write a big monologue on testing (I love it, and you should too), but here’s a super useful link if you are wondering how to go about testing Rails apps (it’s not always obvious; how should one test Active Record relationships, for example?).

Happy testing!

The Adobe AIR Socket Implementation Sucks

AIR sockets write data asynchronously—that is, the data you send to the socket goes into a buffer and the socket implementation decides when to actually write it out. Problem is, there is no way of knowing how much data has been written vs. how much is still in the buffer—no event to subscribe to, no bytesWritten property. There is a flush method on the socket, but it returns immediately without blocking so you can’t use this to force a synchronous write. (Since AIR doesn’t support multiple threads yet, writing synchronously isn’t really an option anyway.)

WTF Adobe? This is a glaring omission. Adrian raises an interesting question: did they actually do this on purpose? How hard could this possibly be?! I mean, they are simply exposing an underlying socket implementation provided by the OS, right? And every socket implementation on the block can tell you how much data is in the stupid buffer. grrrr.
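For contrast, here is a rough Ruby sketch (not AIR or ActionScript) of the bookkeeping I am asking for. Ruby’s `IO#write_nonblock` returns the number of bytes the OS actually accepted, so a thin wrapper can always say how much has been written and how much is still waiting. The `TrackedWriter` class and its method names are made up for this example, and a pipe stands in for a real socket so the snippet runs on its own.

```ruby
# Hypothetical wrapper that tracks written vs. still-buffered bytes.
class TrackedWriter
  attr_reader :bytes_written

  def initialize(io)
    @io = io
    @pending = String.new # our own outbound buffer (binary)
    @bytes_written = 0
  end

  # Bytes queued by the caller but not yet accepted by the OS.
  def bytes_pending
    @pending.bytesize
  end

  def enqueue(data)
    @pending << data.b
  end

  # Hands the OS as much as it will take right now, without blocking.
  # Returns the number of bytes accepted on this call.
  def flush_some
    return 0 if @pending.empty?
    accepted = @io.write_nonblock(@pending, exception: false)
    return 0 if accepted == :wait_writable
    @pending = @pending.byteslice(accepted..-1)
    @bytes_written += accepted
    accepted
  end
end

# A pipe stands in for a socket here (assumption for the demo).
demo_reader, demo_writer = IO.pipe
tracked = TrackedWriter.new(demo_writer)
tracked.enqueue("hello world")
tracked.flush_some
puts "written=#{tracked.bytes_written} pending=#{tracked.bytes_pending}"
```

That return value is the whole trick: with it, a progress bar or a "safe to close now?" check is a few lines of code. Without it, you are guessing.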

Someone posted this as a bug on AIR’s issue tracker back in April. They didn’t get a response until August. Want to know what it said? “This is an enhancement request, i’ve logged it internally.” Sweet guys, thanks for the update.

Alas, it’s back to platform shopping for me. I hope Adobe can figure this out. Not only is it annoying, it makes them look downright stupid. Oh well.

Resolution #1: Eat (er, Consume) Locally

As hokey as it might seem, I am a fan of New Year’s resolutions. This may be something about me, or it may be something that everyone experiences (perhaps because we have been conditioned to think this way), but I have generally been able to accomplish quite a bit around this time of year:

  • Quitting smoking (some eight years ago now)
  • Becoming vegetarian (a change which lasted 4 years, though I did eat fish)
  • Beginning an exercise regimen

My first resolution for this year is to eat more locally. Although I will make some exceptions (garlic, nuts, and coffee, for example) I will do my best to buy locally produced food. This includes vegetables, meats, and dairy products. This means I will also be eating more seasonally; if I can’t find local broccoli, I will do without it.

Of course, living more locally goes beyond food—I will also do my best to buy local beer, and to support local merchants rather than regional or national (or multinational!) chains.

This might sound easy to the casual reader, but it really isn’t. When I went vegetable shopping last night, I ended up with some squash from Massachusetts, a huge parsnip from Vermont, some garlic, and (luckily) some local lettuce grown in a greenhouse. Of course the store had all kinds of vegetables—carrots, celery, fennel, broccoli, you name it—but most of it was from California.

You might ask why I am motivated to put myself through this. I have a few reasons, but two are chief among them:

  1. I want to do everything I can to reduce my dependence on fossil fuels—did you know that the average food item in the average American meal has traveled 1500 miles from its source to your plate?
  2. I want to gain a greater understanding and appreciation of where my food comes from, rather than ignoring all that and blindly buying whatever I feel like eating. As I see it, the first step is to buy food that is grown around me. No strawberries in New England in the winter—it just doesn’t make sense.

Anyway, that’s my newest journey. If you’re interested, I hope you will check back later for updates on how it goes. And if you are trying to do something similar, how’s it going and where do you live?

Daemons in Ruby; or, Frustration

Over at unbiasd, we use a couple of daemons to keep our web interface snappy. Rather than letting our mongrels sit around and handle long-running tasks (such as outbound web requests), we offload these responsibilities using the delayed_job plugin. Currently we’re using collectiveidea’s branch because it includes a nifty script to run delayed jobs in a daemon environment using the daemon generator plugin.

The bad news? The daemon crashes all the time. Sure, I could write a monitor daemon to restart it when it crashes (daemon generator even includes a prepackaged one of these). But that seems somehow wrong—what if my monitor starts crashing? Write another monitor? What if that starts crashing? In programming I believe in finding and solving the real problem rather than ignoring it and programming around it. The former is akin to an elegant solution, the latter to brute force.

So how do I proceed? Step 1 is clearly to remove all third-party code, or at least to become intimately familiar with it. Ideally I’d like to keep using delayed_job, so I’m going to whittle away at the other third-party code first. This means getting rid of my daemon generator crutch. This has some obvious advantages:

  1. I can use tobi’s delayed_job, which is clearly more up to date than collectiveidea’s
  2. I can learn the ins and outs of writing daemons in Ruby
  3. Hopefully I can write a good daemon and submit it back to tobi (with the magic of github)
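
As a starting point for item 2, a daemon written without the generator might look roughly like the sketch below. This is not the code unbiasd runs, only an outline under assumptions: `run_worker_loop` and `start_daemon` are made-up names, the `max_iterations` argument exists only so the loop can be tried out without running forever, and `process_next_job` in the usage comment is a hypothetical stand-in for whatever actually works off the job queue. The one design choice worth noting is that job errors are rescued and logged, so a failing job leaves a trace instead of silently killing the process.

```ruby
# Calls the given block repeatedly until TERM or INT is received.
# Returns the number of iterations completed.
def run_worker_loop(interval: 5, max_iterations: nil)
  stop = false
  %w[TERM INT].each { |sig| Signal.trap(sig) { stop = true } }

  iterations = 0
  until stop
    begin
      yield
    rescue StandardError => e
      # Log the failure and keep going, so the cause of a "crash" is visible.
      warn "job failed: #{e.class}: #{e.message}"
    end
    iterations += 1
    break if max_iterations && iterations >= max_iterations
    sleep interval unless stop
  end
  iterations
end

# Detaches from the terminal, records a pid file, and runs the loop.
def start_daemon(pid_file:, log_file:, &work)
  Process.daemon(true, true) # keep the working directory and open streams
  $stdout.reopen(log_file, "a")
  $stderr.reopen(log_file, "a")
  $stdout.sync = $stderr.sync = true
  File.write(pid_file, Process.pid.to_s)
  run_worker_loop(&work)
ensure
  File.delete(pid_file) if File.exist?(pid_file)
end

# Usage (process_next_job is hypothetical):
#   start_daemon(pid_file: "/tmp/worker.pid", log_file: "/tmp/worker.log") do
#     process_next_job
#   end
```

Stopping it is then a matter of sending TERM to the pid in the pid file; the loop notices the flag on its next pass and exits cleanly.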

So this is an introductory post—a prelude to a journey. More to come.