One risky point

July 28, 2008, 02:47 PM - Computerworld

Single point of failure. That's the right term for talking about the mess in San Francisco, where last week the city government finally regained control of its backbone network. Terry Childs, the net admin jailed for locking down administrative access, turned over the passwords during a secret visit from Mayor Gavin Newsom.

Childs' lawyer said Childs hadn't divulged the passwords sooner because he believed "none of the persons who requested the password information ... were qualified to have it," according to court filings.

We're starting to get solid information on the case, now that the impasse has broken. But until Childs revealed the passwords, all we knew for sure was that Childs was in jail and that the network was still working but couldn't be managed.

Beyond that, it's been Rashomon in IT. Depending on who's telling the story, Childs is a brilliant network engineer who did nothing wrong. Or possibly a cyberterrorist who held the government hostage. Or maybe just an overstressed, burned-out guy who's the victim of a misunderstanding.

San Francisco's IT management? That's a bunch of tech-clueless bureaucrats. Or maybe it's a gang of goons who are out to get Childs no matter the cost. Or perhaps it's a group of conscientious public servants whose only concern was regaining control of a crucial network that might have been full of booby traps.

Childs' erstwhile co-workers? They're halfwits who couldn't manage that backbone with both hands and a map. Or innocent victims of a network guru with a God complex. Or enablers who helped create the mess by their silence.

From news reports to blog comments, the reactions have been stunning in their vehemence and variety. And there's not one yawning gulf here, but many: between techies and nontechnical managers, between gurus and regular IT grunts, between designers and administrators, between security wonks and operations guys, between practicing network experts and best-practices pundits.

It seems like suddenly we can agree on nothing. But maybe we can all recognize this:

Terry Childs was a single point of failure.

Never mind whether he's saint or sinner, villain or victim. Set that aside for now.

Focus on this: Childs was the only guy who understood that fiber backbone network. He designed it. He ran it. He maintained it. He controlled it. And nobody could replace him.

In other words, a single point of failure.

Forget whether that situation was because of cheapness, arrogance, incompetence or paranoia. The result was the same: If something happened to Childs -- a stroke, a car accident, a breakdown, a job-related "misunderstanding" -- that single point would fail.

And it did.

Look, this San Francisco fiasco has thrown a spotlight on every ugly division in the IT profession. We see it as a matter of control or expertise or responsibility or stupidity or freedom. We see it as us vs. them, and that reaches into our deepest fears and anger.

So remember this: A single point of failure is a reliability problem. That's something techies and managers, gurus and grunts can understand.

We all have at least one single point of failure lurking somewhere in our IT operations. Waiting until it generates a crisis that spirals into finger-pointing, frustration and fear is not the way to go.

There's really only one good way to deal with a single point of failure: Find it and cure it before it fails.

Frank Hayes is Computerworld's senior news columnist. Contact him at frank_hayes@computerworld.com.

» posted by ITworld staff
