Comments about technological history, system fractures, and human resilience from James R. Chiles, the author of Inviting Disaster: Lessons from the Edge of Technology (HarperBusiness 2001; paperback 2002) and The God Machine: From Boomerangs to Black Hawks, the Story of the Helicopter (Random House, 2007, paperback 2008)

Monday, June 25, 2012

Howl of the Machine: Audio from Rena on the rocks

Nine months ago I was intrigued by an article from the New Zealand Herald in which witnesses aboard the wrecked Rena described unearthly howls, shrieks, and echoing rumbles as waves slowly tore the vessel apart while it lay stranded on the rocks off the port of Tauranga. 

Here's a picture of the cracked hull from NZDF:
Maritime New Zealand's Bruce Anderson said this about the sound: "I wouldn't say it's eerie, but it's quite spooky. It would be really interesting for people to hear the grinding sound being made as the two parts of the ship work together."

The ship finally broke in half in early January. This photo from AP:
No audio was available to the public in October, but now it is.  

Here's an audio clip courtesy of Maritime New Zealand, recorded on October 24 by officials inside the ship. I think you'll find it memorable, and perhaps well suited for techno-haunted houses. Particularly striking is a rising roar not quite halfway through the clip.

Friday, June 22, 2012

Root Cause Analysis: The Traveler's Tale, Part 1

Following is an overview of root-cause analysis, in the manner of a fable ... with time machine! “Root cause analysis” is a common term in industry and in news articles like the recent one about the origins of cracks in a shield building at the Davis-Besse nuclear power reactor, but it's impossible to summarize in a few sentences so most writers just assume that readers know the method. But root cause analysis is a whole array of methods, developed over decades by many people. 

And now for "The Traveler's Tale," Part 1.

From: Richard C. Asplundh, Staff Engineer, 
 Rapid Prototyping Div.
To: Boss
Re: Mixed Results with Prototype X-1A Time Machine
Date: September 4, 1877
Via: Catamount Brand Whiskey bottle

Well, boss, the new time machine works! In one direction, anyway! I'm leaving my progress report in a bottle and burying it where I hope someone will find it in your time zone, and send it to the home office. Meanwhile, I'm staying busy back here and trusting that you still have the meter running, paycheck-wise.

Remember how you asked me to do just one little test run before lunch with the X-1A? As in, “Ricky, old boy, how about you go backward just a few minutes, and try out the Back-to-Base Homing Mode?”

I'm here to say the homing mode doesn't work yet. Also, you need to tell the lab rats that they miscalibrated the ChronoCounter, because that "couple of minutes" they dialed in was considerably more than a century.

Fortunately, because of my deep training in all forms of Root Cause Analysis, I've been able to make myself useful back here, pending some help from your direction.

Before I get down to details, here's the view from 40-K feet: I get a job in a hard-rock gold mine. I immediately learn that the boys at the Acme Mine are smack in the middle of an all-out business crisis. But it can't withstand my root-cause skills for long, and that's me without my laptop, or my notebook, or the stack of proprietary software. 

But as I always like to say, root cause analysis is more headware than hardware. Me and my rough-hewn buddies get things sorted out, against all odds.

The big adventure starts like this: There's a haze, I feel dizzy, then find myself in a mountain forest. The machine's control panel says it's just a few minutes before I left, but I can see that this doesn't look like the inside of our company's Southwest South Dakota Warehouse at all. I wait a day in case you might send a rescue squad from the future, but no dice, so I decide to trudge off and meet the natives, whoever they may be.

I have to say it's pretty exciting to go off exploring when you don't know if you're fifty years off the beam or fifty thousand. But I find a grubby town and get my bearings: I've touched down outside of Deadwood, Dakota Territory. It's August 16, 1877, about three years into the Black Hills gold rush.
Nothing too glamorous about this side of the Old West: unpainted slab-sided buildings up high, and mud down low. I head for a “help wanted” sign, walk inside, and a thin geezer with a visor says they need a man to tend the couple of dozen Missouri mules that live in a stable at the bottom of the Acme Mine. 

Graybeard identifies himself as Too-Tall Johanson. He says that mules pull the ore cars from the heading down a little iron track, back to the main shaft, where a steam engine drags the rock to the surface. The mules live down there, never seeing daylight … like our IT people.

I tell the guy behind the counter that I don't know one end of a mule from the other. But Too-Tall hires me for two dollars a day, hands me a shovel, and says I'll pick it up. “The way things are going at this mine, it won't be for long anyway,” Too-Tall says. This is about when my fact-finding antennae go into hyper-activity.

“The owners in Frisco are about to close us down and everything else is going wrong too!” says he. “Cain't hardly understand it!” I clap Too-Tall on the shoulder and tell him that help has arrived from an unexpected direction. He shakes his head and I go off to grab some worn-out old miner togs from a heap in the back, which feel like they were hacked out of old pine shingles. I buy a carbide lamp on credit at the company store, and down into the dark I go.

After a week I switch to the midnight shift. That way, I can give a few pointers to the mine's baseball team at batting practice after dinner. It's in the cellar, with 43 losses and 26 wins.

The mine is in even worse shape. According to the boys, things seemed to go south all of a sudden. Starting about two weeks before I drop in, gold-ore production took a nosedive first in quantity and then quality. And all of a sudden there were new, weird problems nobody had seen before. Miners are a superstitious bunch and morale took a tumble. 

I find out why nobody wanted the mule-tending job: there was an unexplained explosion in the mule stable a week back and it's made them all superstitious that another mule is about to blow up. Meanwhile, all the mules have belly aches and make a lot of noise. We are buying gallons of Brother Jubal's patent medicine and mixing it in the water trough, but it doesn't help.

I find out plenty of other things the first week. I buy some paper and a box of pencils and make a stack of notes back at the bunkhouse. Soon it's time to start my root-cause analysis, frontier-style. It's the world's first. (As I explained at a staff picnic last year, while the core concepts behind root causes are recognizable in Aristotle's notions of moral responsibility and determinism, the discipline didn't get going until Operations Research during World War II and the postwar study of loss control. So right now I'm seventy years ahead of the competition.)

I initiate my work one Saturday night when I'm whooping it up with the graveyard shift in the Dirty Dog Saloon. I shove aside the shot glasses and peanut shells, pull out two sheets of foolscap with my incident description, and pass them around to the boys for a review.

There isn't room for all my work product in one whiskey bottle, but it went like this:  
“Series of mishaps and problems at the Acme Mine beginning around August 2, continuing to date. Most time-critical is rapid deterioration in gold-ore production. Problem first noticed with downward trend in ore deliveries in tons/day, falling to 33% below targets. Persisted for two weeks. Tonnage recovered to acceptable range by August 12 but starting August 7, assayed ore quality at the stamp mill fell from 10 troy oz/ton to 1 troy oz/ton. Monthly gross revenue from stamp mill dropped 32% year over year. Owners plan to close mine at end of fiscal.”
  I was careful to word this like the classic I.D.: focus on describing the most serious symptom and don't point fingers or guess about solutions. Too early for that!
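
If I had my laptop back here, I'd check those numbers with a few lines of Python. Here's a rough sketch of the arithmetic, working only from the figures in the I.D. The function name, the two tidy "phases," and the guess that "acceptable range" means 100% of target are my own simplifying assumptions (the real slumps overlapped for a few days), and it makes no attempt to reproduce the 32% revenue figure.

```python
# Relative gold output = (tonnage as a fraction of target) x (assayed grade).
# Figures come from the incident description; the names and the choice to
# treat "acceptable range" as 100% of target are assumptions.

BASELINE_GRADE = 10.0  # troy oz per ton before the trouble started


def relative_output(tonnage_fraction, grade_oz_per_ton):
    """Gold output as a fraction of normal (full tonnage at baseline grade)."""
    return tonnage_fraction * grade_oz_per_ton / BASELINE_GRADE


# Phase 1: deliveries 33% below target, grade still normal.
quantity_slump = relative_output(1.0 - 0.33, 10.0)

# Phase 2: tonnage back on target (assumed), grade down to 1 oz/ton.
quality_slump = relative_output(1.0, 1.0)

print(f"quantity slump: {quantity_slump:.0%} of normal output")
print(f"quality slump:  {quality_slump:.0%} of normal output")
```

Under these assumptions the quantity slump leaves about two-thirds of normal gold output and the quality slump leaves about a tenth, so the drop in ore grade is the bigger hit.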

Well, incident descriptions are something new to the crowd at the Dirty Dog Saloon, so I buy another round of rotgut liquor and warm them up to the idea. The audience even adds a couple of bullet points, along with a bullet hole following gunplay between a tinhorn gambler and a placer miner two tables over.

As you know, my next task is the Problem Statement, describing the deviation from the desired state, and of course it's scoped to stay within our field of control. I draft a short paragraph, in declarative sentences, stating the goal to achieve. 

Now the hard part is about to begin: getting buy-in from the powers that be. Something tells me that they don't place a lot of faith (yet) in statistics and decision trees.

Which follows in the second whiskey bottle, so stay tuned for Part 2 of "Root Cause Analysis: The Traveler's Tale."

Thursday, June 14, 2012

Prometheus, and thoughts on interplanetary contamination

Noting this CBSNews item on the probability that flakes from a Teflon seal may contaminate core samples to be gathered with the Curiosity rover's sample drill. Curiosity is on its way to Mars and due to make a nerve-wracking touchdown in Gale Crater late on August 5, JPL-time.

Since Teflon is a carbon-based polymer, the effect will be to exaggerate results from the tiny onboard analytical lab about how much native carbon is in Martian dirt.

Unintentional interplanetary contamination is an interesting subject because it may be difficult to manage. This post is about half of the problem, back-contamination: offworld material coming back home in a spacecraft, and posing a risk due to chemical or biological activity.

After leaving the theater, ears ringing with the volume, watchers of Prometheus might ponder how leaders of a real expedition could explore the surface of a planet without bringing back creepy crawlies, particularly the micro-sized ones.

The movie does not set a shining example here. Upon the explorers' return from the first trip out to alien-land, the hangar deck and apparently the crew's quarters are wide open to anything clinging to the suits' outsides. Or to the insides of the suits, given that the explorers doffed their helmets when checking out the alien dome.

In that respect the back-contamination precautions shown in Prometheus are no more advanced than the methods employed during Apollo's manned missions to the lunar surface, all of which failed to keep loose moondust off the astronauts' bodies or out of the capsule when it returned to Earth.

How did dust spread so widely? At the start of each EVA, each pair of Apollo astronauts put on their suits, emptied the lunar module of air, and climbed outside to take samples, explore, and (in the later missions) drive their moon buggy around:
At the end of each EVA they climbed back inside the Lunar Module, closed the hatch, and refilled the cabin with oxygen from tanks. They ate, slept, took notes, then went back outside for another activity. What's the problem? Every trip brought more dust into the cabin. 

It didn't turn into a crisis, but it could have. Lunar dust turned out to be a safety risk and possibly a health risk as well. Its grains are the size of finely ground flour, but sharp-edged, reactive, and far more abrasive. Here's a micrograph, from NASA:

It's also highly adhesive in a high-tech way. Note the hook-like protrusions, which grab onto space-suit fabric then dig in; and it carries an electrostatic charge that binds it firmly to smooth surfaces like helmets, camera lenses, and solar panels. Wiping with a cloth only scratches the surface (literally).

Once astronauts started using the moon buggy to get around, the dust problem got totally out of control as soon as a fender came loose. At that point, a rooster-tail of dust sprayed both riders. A NASA controller commented that the suits turned so dark from the buggy rides that it looked like the two men had been playing in a coal bin. Back in the LM cabin, dust worked itself into everything including the astronauts' noses and eyes.

The later expeditions found that after three extended EVA trips on the surface, the suits were played out: moving joints were grinding to a halt, and the dust was wearing through the fabric.

Some dust migrated from the LM through the docking tunnel into the command module, so a portion came back to Earth in an uncontrolled manner. While lunar dust didn't harbor a killer interstellar virus, the latest plans call for trips to planets that could harbor life. So serious preventive work is needed before manned trips to Mars, and will be far more urgent when we dispatch expeditions to Earth-like planets with signs of life.

One solution could be the suitport:
In the suitport approach, astronauts never bring their suits back into a crew cabin or any occupied space; they climb into them from the back, working out of a shirt-sleeves environment on wheels:
I saw no suitports in Prometheus, but the gadget could be a lifesaver in the real world ... or worlds.

Thursday, June 7, 2012

Superjet Crash: Flight data recorder also found

Following up on my earlier post about the cockpit voice recorder (CVR) found at the Superjet 100 crash site on May 15. 
As predicted, the solid-state recording chip in the CVR was in good shape, despite the unit's battered appearance in news photos. Indonesia's investigative agency is running the last 20 minutes past three interpreters to put together a common transcript. (Why 20 minutes? The passenger jet had been aloft for just 12 minutes, following takeoff from Jakarta.) 

That text may be all we ever learn about the cockpit chatter; as in most crashes, there are no plans to release the audio itself.

Local residents found the flight data recorder (FDR) last week. It was in good condition under a heap of dirt. The maneuvers of the aircraft near Mt. Salak are already documented from radar tracks, but if the plane had some kind of mechanical problem, data on the FDR should reveal that.

Meanwhile, given the news vacuum, one Russian tabloid blames American sabotage, but most articles cite alleged recklessness by the pilot, trying to impress customers with the regional jet's agility.

Pushing the maneuvering envelope to show off a new plane wouldn't come as a big surprise, but I haven't seen a convincing explanation of why the pilot would ask ATC for clearance to descend from a safe altitude so he could fly through the mist, among the volcanoes. One maneuver near the volcano was a 360-degree turn.

It was truly an odd request, assuming that he knew his whereabouts and wasn't distracted.