Stefan Haflidason

The blog of Stefan Haflidason, PhD.

Throwing a Spanner in the Works – On Purpose


I’ve just been reading an interesting article by the Netflix staff describing their experience of moving their infrastructure to Amazon Web Services (AWS). I was pondering the same kind of move for our systems again today, so I took particular interest in it.

One paragraph describes something very cool, something that I’ve not witnessed at any company I’ve worked for [1]:

One of the first systems our engineers built in AWS is called the Chaos Monkey. The Chaos Monkey’s job is to randomly kill instances and services within our architecture. If we aren’t constantly testing our ability to succeed despite failure, then it isn’t likely to work when it matters most – in the event of an unexpected outage.

Running websites with a global audience can mean that you’re effectively on call 24/7/365, which can be damaging to your health. The only way to get any sort of peace of mind is to make your systems robust by assuming the worst-case scenario and actually testing it.
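To make the idea concrete, here is a toy sketch of the "kill things at random and check you still work" approach. It is not Netflix's implementation and it doesn't touch AWS at all: the Instance, Service and chaos_monkey names are all made up for illustration, and the "instances" are just in-memory objects.

```python
import random


class Instance:
    """Hypothetical stand-in for a server instance."""

    def __init__(self, name):
        self.name = name
        self.alive = True

    def handle(self, request):
        if not self.alive:
            raise ConnectionError(f"{self.name} is down")
        return f"{self.name} served {request}"


class Service:
    """Toy service that fails over to the next instance on error."""

    def __init__(self, instances):
        self.instances = instances

    def handle(self, request):
        for instance in self.instances:
            try:
                return instance.handle(request)
            except ConnectionError:
                continue  # try the next instance
        raise RuntimeError("total outage: no live instances")


def chaos_monkey(instances, rng, max_kills=1):
    """Kill up to max_kills randomly chosen live instances.

    Returns the names of the instances that were killed.
    """
    live = [i for i in instances if i.alive]
    victims = rng.sample(live, min(max_kills, len(live)))
    for victim in victims:
        victim.alive = False
    return [victim.name for victim in victims]


if __name__ == "__main__":
    rng = random.Random()
    pool = [Instance(f"web-{n}") for n in range(3)]
    service = Service(pool)
    print("killed:", chaos_monkey(pool, rng, max_kills=2))
    # With one instance left, the request should still succeed.
    print(service.handle("GET /"))
```

The point of the exercise is the last line: if that request fails after a random kill, you have found a weakness on your own schedule rather than during a real outage.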

Unfortunately, this is directly at odds with producing new systems, and it can be hard to persuade your partners/bosses that it’s worth postponing the development of a new feature to revisit your infrastructure design. They may want you to pump out new features so that you can stay ahead of the competition, but in the end it may be your infrastructure that helps you maintain your competitive advantage. After all, if you get popular, those new features won’t matter if your infrastructure can’t handle the traffic or you lose your users’ data.

Your competitive edge will come from the hard problems you have solved. Reliability, robustness and scalability are likely to be harder problems than most incremental new features you might want to implement, so I’d say focus on that infrastructure and make sure it’s solid; that is the way to get ahead and stay there.

[1] http://techblog.netflix.com/2010/12/5-lessons-weve-learned-using-aws.html

 


Written by Stefan

December 16, 2010 at 8:14 pm

Posted in Systems

