Napkin Series: Hardware as a Single Point of Failure? Not really …

Looking at the root cause of several cloud outages, it’s neither a power outage nor a hardware crash. Not even a natural disaster. We made hardware, disks, network, cooling, power, even data centers redundant, and we arrived at the next challenge: the software. The software that runs in these super-redundant units has become the single point of failure. With the same software running in multiple locations, one little bug can stop everything at the same time, despite the redundant hardware. Windows Azure and other cloud services do a great job of rolling out software updates across multiple update zones, but still, that little bug is just waiting to kick off at the right time and make the press chew on the next big story.

So, what is redundant software like?

Don’t look at me, I don’t know the answer. But I assume that redundant software is built by two separate teams who are not supposed to speak to each other. They need to make sure that they solve the same problem – but differently. They take the same inputs and produce the same outputs, but do the work in between differently. The last decade was about hardware failures. Now they are the past. This decade is when software comes to life.
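To make the idea a bit more tangible, here is a napkin-sized sketch in Python. Everything in it is an assumption for illustration only: the two "teams" are just two functions (`mean_team_a`, `mean_team_b`) that compute an average in different ways, and `run_redundant` is a made-up wrapper that runs both and compares the answers. It is not how any cloud provider actually does this.

```python
from typing import Callable, Sequence


class VersionsDisagree(Exception):
    """Raised when the independent versions return different answers."""


def mean_team_a(values: Sequence[float]) -> float:
    # Hypothetical team A: sum everything, then divide by the count.
    return sum(values) / len(values)


def mean_team_b(values: Sequence[float]) -> float:
    # Hypothetical team B: update a running mean one value at a time.
    mean = 0.0
    for count, value in enumerate(values, start=1):
        mean += (value - mean) / count
    return mean


def run_redundant(
    versions: Sequence[Callable[[Sequence[float]], float]],
    values: Sequence[float],
    tolerance: float = 1e-9,
) -> float:
    """Run every version on the same input and cross-check the outputs."""
    results = []
    for version in versions:
        try:
            results.append(version(values))
        except Exception:
            # One version crashing should not stop the others.
            continue
    if not results:
        raise RuntimeError("every version failed")
    first = results[0]
    for other in results[1:]:
        if abs(other - first) > tolerance:
            raise VersionsDisagree(f"got {first} and {other}")
    return first


if __name__ == "__main__":
    print(run_redundant([mean_team_a, mean_team_b], [1.0, 2.0, 3.0, 4.0]))
```

With only two versions, a disagreement can be detected but not resolved, because there is no way to tell which team is right. A third independent version would allow a majority vote, at the price of a third team.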

What do you think? Any comments and conversations – welcome!


David Szabo, SaaS Strategy Advisor, startup-addict, blogger at http://cloudstrategyblog.com and LEGO SERIOUSPLAY facilitator.

This entry was posted in Cloud Architecture, Entrepreneur, Kindle, Napkin Series, Windows Azure.

9 Responses to Napkin Series: Hardware as a Single Point of Failure? Not really …

  1. Interesting view, but not workable, IMHO. I think that what you describe here is purely theoretical: even if those 2 teams are given the same inputs and the same outputs, the result will be fundamentally different. Just off the top of my head:
    1) User interface: There aren’t two people on this earth who can design the same UI. Even following strict guidelines, there will be differences in, say, some screen refreshes, the layout of the fields, etc.
    2) Performance: Supposing that the two teams build the same logic (I don’t think so…), the differences in database query syntax will be enough to produce different performance issues and thus a different user experience (not acceptable!)
    3) Bugs: One team’s bug will not necessarily be the other’s, too. Imagine 2 programs writing to the same database table with slightly different logic (due to a bug, I mean). Disastrous.
    If you push me real hard, I will think of other problems, too. Anyway, I don’t mean that redundant software is a bad idea, I just think that our technology still has some ground to cover before we are in a position to approach the issue more maturely.

    • David says:

      Hehe, I fully agree – it would be a disaster. 🙂 Yet still, software is a single point of failure in cloud environments. This is an old topic in disaster recovery handbooks for business systems, it just goes by a different name: return to manual operations. 🙂 Obviously, this is not doable in clouds; they instead rely on heavy and thorough testing, which makes sense. But still, as examples show, a single point of failure keeps causing trouble, despite the best testing processes.


    • Ice says:

      Shoot, so that’s that one sueosspp.

    • accutane says:

      Tip top stuff. I’ll expect more now.

  2. Teresa says:

    Thank you, I’ve been searching for information about this topic for ages and yours is the best I’ve come upon so far.
    However, what about the bottom line?
    Are you positive about the source?

  3. Quality articles or reviews are the secret to attracting people to visit a web page, and that’s what this web site is providing.

  4. Tayna says:

    Hi! I love your writing so much! Can we communicate more about your
    article on AOL? I need an expert in this area to solve my
    problem. Maybe that is you! Looking forward to seeing you.

  5. oblacco.com says:

    If you wish to take much from this post, then you have to apply such strategies to your own webpage.
