Why Scripting is Evil
Written by Erik Troan   
Friday, 16 April 2010

Editor's Note: This article is part one in a two-part series, published under Creative Commons License. For those who may not know, Erik Troan is one of the original authors of RPM (Red Hat Package Manager). As every HPC administrator knows, writing scripts is part of the job. Erik offers some insights as to why this can lead to unexpected problems.

There are two kinds of people in the world. Those who divide the world into two kinds of people and those who don't.

Okay, an old joke, but I'm clearly the first kind of person. I try to split everything into two buckets. System automation solutions lend themselves to this two-sizes-fit-all mantra, with the approaches splitting between scripting solutions and model-based approaches.

It seems like most people think of scripting as the best approach to automation. Whether it's simple shell (or PowerShell) scripts, attaching scripts to Opsware machine definitions, or running an inscrutable Perl command line, scripting is king. Engineers and system administrators tend to break problems down into steps, and scripting is a way of codifying those steps and running them on lots of machines.

Scripting is easy to understand and a natural step, but scripting also has serious problems. It's a great tool when there is nothing else available, but scripts are difficult to write, impossible to test, unverifiable and non-invertible.

Why are scripts difficult to write? There are a few reasons; the most obvious is that you're describing how to get to a new state, rather than just describing how things should wind up. Think about how building architects work. They draw up blueprints that completely describe how the important parts of the new building need to end up: which columns support which beams, where the electrical wiring needs to go to meet code, and what plumbing needs to be put in for fire safety. The construction crews then decide how to get the steel, wires and pipes into place. Can you imagine if the architect had to write a detailed list of instructions describing how to build a building? Down to what kind of screws to use to hold up the drywall and what kind of drill should be used to put them in place?

Writing scripts for system automation means you have to describe every step. Every time. This forces scripting languages to be Turing-complete; they're powerful languages that can solve any problem. That power has a price: no tool can take an arbitrary script and tell you, in general, whether it is correct.
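To make the blueprint idea concrete, here is a minimal sketch in Python. It is an illustration only: the `plan` function, the package names and the version strings are all made up for this example and are not taken from any real tool. The desired state is plain data, and one small function works out the steps for each machine, so nobody has to hand-write the steps per machine.

```python
# Hypothetical illustration: every name and value here is an assumption.

def plan(current: dict, desired: dict) -> list:
    """Return the actions needed to move `current` to `desired`."""
    actions = []
    for key in sorted(desired):
        if key not in current:
            actions.append(("add", key, desired[key]))
        elif current[key] != desired[key]:
            actions.append(("change", key, desired[key]))
    for key in sorted(current):
        if key not in desired:
            actions.append(("remove", key, None))
    return actions


# The "blueprint": what every machine should end up with.
desired = {"ntp": "4.2.6", "openssh": "5.3"}

# Two machines that have drifted differently.
machine_a = {"ntp": "4.2.4", "telnet": "0.17"}
machine_b = {"ntp": "4.2.6", "openssh": "5.3"}

print(plan(machine_a, desired))
print(plan(machine_b, desired))
```

Because the desired state is data rather than a program, it can be inspected and compared without running anything. The planner itself is ordinary code, but it is written and tested once instead of once per change.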

So once you've written a script, how do you test it? Install a machine and run it? Go log into a box and run it there, watching it closely? How do you know the box you're testing it on is a good enough representative of the 1,000 machines you're about to run the script on? Like it or not, machines drift. Configurations change, and software installs change. That handcrafted script has to be able to adapt to every one of these divergent machines. You can test a few cases, but are you really testing it exhaustively? Have you tested the error cases, or will things silently fail? Or even worse, will they fail in a manner which leaves the system unresponsive? Software companies pay a lot of people to test their code under every conceivable situation, and we still wind up with Vista. Does an IT staff test their scripts that carefully?

Let's say you got the script written and tested. How do you know that it's doing the right thing when you run it? If it makes a configuration change, can you verify the change was made correctly? Chances are you haven't had to describe the change anywhere other than a whiteboard. So what checks those 1,000 machines to make sure the script did the right thing? How do you audit the system to make sure the script didn't break the change that the previous script was supposed to have made?
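The audit question gets much easier once the intended configuration is written down somewhere a program can read it. The sketch below is a hedged illustration, not a real auditing tool: the host names, the settings and the `drift` and `audit` helpers are all invented for this example, and it assumes each machine's actual settings have already been collected into a dictionary.

```python
# Hypothetical illustration: hosts, settings and helper names are assumptions.

def drift(actual: dict, desired: dict) -> dict:
    """Map each setting that differs to a (expected, found) pair."""
    keys = sorted(set(actual) | set(desired))
    return {
        key: (desired.get(key), actual.get(key))
        for key in keys
        if desired.get(key) != actual.get(key)
    }


def audit(fleet: dict, desired: dict) -> dict:
    """Return only the hosts that do not match the desired state."""
    report = {}
    for host, actual in fleet.items():
        differences = drift(actual, desired)
        if differences:
            report[host] = differences
    return report


desired = {"PermitRootLogin": "no", "Port": "22"}

fleet = {
    "node001": {"PermitRootLogin": "no", "Port": "22"},
    "node002": {"PermitRootLogin": "yes", "Port": "22"},  # an earlier change was undone
    "node003": {"PermitRootLogin": "no"},                 # setting never applied
}

for host, differences in audit(fleet, desired).items():
    print(host, differences)
```

A missing setting shows up as `None` in the pair, which is good enough for a sketch but would need more care if `None` were a legitimate value.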

Finally, if you've navigated all of those mines, what if the change was simply incorrect? The script did what it was supposed to do, but it turned out to be a bad idea. How do you undo it? Scripts are, by nature, non-invertible. You can't say "oops, let's just undo that." Instead, you're writing a new "undo" script, and testing it, and (hopefully) checking that it did the right thing. Making non-invertible changes to production systems is crazy.
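For contrast, here is a rough sketch of what an invertible change could look like when the state lives in a model instead of in the side effects of a script. Everything in it is an assumption made for illustration: the `apply_change` and `rollback` helpers and the setting names are hypothetical, and real systems have side effects (restarted services, deleted files) that a dictionary cannot capture.

```python
# Hypothetical illustration: helper names and settings are assumptions.

_MISSING = object()  # marks a setting that did not exist before the change


def apply_change(state: dict, change: dict):
    """Apply `change` to a copy of `state`; return (new_state, undo_record)."""
    new_state = dict(state)
    undo = {}
    for key, value in change.items():
        undo[key] = state.get(key, _MISSING)  # remember the old value
        new_state[key] = value
    return new_state, undo


def rollback(state: dict, undo: dict) -> dict:
    """Reverse a change using the record produced by apply_change."""
    restored = dict(state)
    for key, old_value in undo.items():
        if old_value is _MISSING:
            restored.pop(key, None)  # the setting was new, so remove it
        else:
            restored[key] = old_value
    return restored


before = {"MaxStartups": "10"}
after, undo = apply_change(before, {"MaxStartups": "100", "UseDNS": "no"})

print(after)
print(rollback(after, undo) == before)
```

The undo record is produced at the moment the change is applied, so nobody has to write and test a separate "undo" script after the fact.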

Face it. Scripts suck.


Erik Troan is the founder and chief technology officer for rPath, an innovator in automating application deployment and maintenance across physical, virtual and cloud environments. Learn more about rPath at http://www.rpath.com, follow rPath on Twitter at @rpath and contact Erik at: e...@rpath.com
Last Updated ( Friday, 16 April 2010 )
 
Creative Commons License
  ©2005-2008 Seagrove LLC, Some rights reserved. Except where otherwise noted, this site is licensed under a
Creative Commons Attribution-NonCommercial-ShareAlike 2.5 License.
Cluster Monkey Logo and Monkey Character are Trademarks of Seagrove LLC.