Comments about technological history, system fractures, and human resilience from James R. Chiles, the author of Inviting Disaster: Lessons from the Edge of Technology (HarperBusiness 2001; paperback 2002) and The God Machine: From Boomerangs to Black Hawks, the Story of the Helicopter (Random House, 2007, paperback 2008)

Monday, July 30, 2012

Big Bay Bang: In a Long Line of Accidental Pyrotechnics

As a writer in history and technology I try to group current events with some class of earlier, comparable mishaps. In this case the class is the "inadvertent and instantaneous firing of multiple pyrotechnics."

On the evening of July 4, a half million fireworks-fans were treated to a spectacular show over San Diego Bay, all five locations (four barges and a pier) launching their entire arsenal of 7,000 pyrotechnics in a few seconds, rather than 15 minutes (Photo Ben Baller, AP):
Each pyro device (typically launched out of a short, vertical mortar tube) is ignited from a central control. The control closes circuits according to a programmed chronology, sending an electrical charge to an electric match that produces a brief flame inside a lifting charge. One variety of the electric match is the nichrome wire that my brothers and I used as kids, when launching our Estes rockets.

The contractor, Garden State Fireworks, later released a three-page and not particularly instructive paper that lays blame on the interaction between a primary and a backup computer program. Many details are left unexplained, but we can gather that a typical Big Bay Boom show is commanded over a continuous radio link transmitted from a central point, because the five launching locations are scattered across 14 miles of harbor.

The local firing computer on each barge was to have on hand a stored ignition sequence so that it could continue running its part of the choreographed show, in case the radio link were broken.

The show's radio-command link was supposed to start running five minutes before the first launch, so that operators stationed at each local computer could verify that all systems were nominal. And the radio transmission did begin at this time. Normally the program runs "in the dark," meaning that even though the sequence has begun, there's no visible result except on computer displays.

Not this time. Garden State's mea culpa paper says that the backup program somehow got crossways with the main program, and the combination immediately instructed each computer to send firing signals to all the electric matches at once, somehow deleting the planned interval. 
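
For readers who like to see the idea in code, here is a minimal sketch of a timed cue list and what it looks like when the planned interval disappears. It is only an illustration of the failure as described above: the `Cue` class, the tube labels, and the even spacing are my own assumptions, not anything taken from the contractor's firing software.

```python
from dataclasses import dataclass


@dataclass
class Cue:
    offset_s: float  # seconds after show start when the circuit closes
    tube: str        # hypothetical mortar-tube label


def build_schedule(tubes, show_length_s):
    """Spread the cues evenly across the planned show length (an assumption)."""
    step = show_length_s / len(tubes)
    return [Cue(offset_s=i * step, tube=t) for i, t in enumerate(tubes)]


def collapse_intervals(schedule):
    """Mimic the reported failure: each cue keeps its tube but loses its delay."""
    return [Cue(offset_s=0.0, tube=c.tube) for c in schedule]


def duration(schedule):
    """Time between the first and the last firing signal."""
    offsets = [c.offset_s for c in schedule]
    return max(offsets) - min(offsets)


tubes = [f"tube-{n}" for n in range(1, 11)]  # 10 tubes stand in for 7,000
planned = build_schedule(tubes, show_length_s=15 * 60)
collapsed = collapse_intervals(planned)

print(duration(planned))    # 810.0 seconds between first and last cue
print(duration(collapsed))  # 0.0, everything fires at once
```

The real show took a few seconds rather than zero, so treat the collapsed schedule as a cartoon of the event, not a reconstruction.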

So this year's aptly-titled Big Bay Boom show started five minutes ahead of schedule and ended soon thereafter. Despite the early start and the intensity of the fusillade, we're told that none of the fireworkers were injured, having sought out armored shelters nearby. We can hope that they were all wearing hearing protection at the time. 

Reading about this reminded me of a more sobering simul-blast 53 years earlier, also in early July. The location was Chennault AFB in Louisiana, where a B-47E Stratojet bomber, No. 53-4212, was being prepared to join other planes already on the flight line. All were to be ready for a quick takeoff in case of nuclear war.

Date: July 6, 1959. A single H-bomb (probably a Mark-15) was on board and No. 4212 was fully fueled. Attached to the rear of the fuselage were dozens of solid-fueled takeoff-assist rockets, often abbreviated JATO bottles. Each would provide a thousand pounds of thrust upon ignition. Here's a B-47 photo that will give an idea of the JATO bottles' placement on the underside.
Here's an old movie of a B-47 taking off with the help of JATOs:
As the B-47's pilot stood on the ladder giving directions, somehow an airman's electrical checks ignited the entire rack of 30 JATO bottles. (I am guessing the root cause can be traced to a continuity check to verify the bridgewires, but if so, something went terribly wrong because a proper continuity-checking device uses currents far below the level needed for ignition.)

All 30 rockets belching flame and smoke, the 100-ton airplane lumbered over its chocks and accelerated toward the flight line. A wing broke off, spilling fuel. The navigator jumped out, his clothes on fire, but survived. The pilot tried to jump free but was killed when the airplane rolled over him. After a few hundred feet the plane lurched off the pavement and crashed in a fireball.

It was a lot scarier than the usual one-sentence mention to be found in the history of the weapons program, revealing no more than a "ground fire in a B-47 at Chennault AFB consumed one nuclear weapon." So scary, in fact, that some base workers heard the emergency sirens, saw the flames, jumped in their cars, and didn't slow down till they crossed the Texas border. 

JATO demonstrations are still a highlight at many military air shows, pushing C-130s into steep climbouts. A large number of JATOs figured in a daring plan to land a C-130 in a Tehran stadium during the Iranian hostage crisis, and take off again. But the prototype crashed in tests and the entire rescue plan fell through for other reasons.

Monday, July 23, 2012

Oceans of Mystery: Another newly discovered seamount

In this January 2011 post I described a North Pacific seamount discovered in 1950 by the research ship Cobb. In waters thought to be two miles deep, Cobb's sonar detected the tip of an undersea mountain that reached to within 110 feet of the surface. Here's a map from Popular Mechanics showing the location:
Now we have similar news from the HMS Echo: during explorations of the Red Sea last year, that Royal Navy survey ship found an unmapped seamount rising to within 131 feet of the surface, in waters shown on charts as more than 1,200 feet deep. Here's a depth chart of the Red Sea:
Here's a depth-coded sonar image of the newly christened seamount:
Some Red Sea fishermen knew about the Gibraltar-sized pinnacle (they used the peak as a holding ground for their anchors), but the mappers did not.

While no ship draws 131 feet of water (the record holder is Seawise Giant, an ultra-large oil tanker that drew 80 feet of water), the seamount would pose a hazard for submarines.

In the previous post, I described how the nuclear submarine Pogy (SSN-647) ran into an uncharted Pacific seamount in 1972. The sailors survived that high-speed collision, but the boat took major damage.

Wednesday, July 18, 2012

A Traveler's Tale, Part 3: The Root-Cause Conclusion


Part 1 and Part 2: Rick Asplundh plans an easy test of the company's time machine at the request of the R&D Division chief, intending to go back in time just a few minutes and then return. But the uncalibrated machine goes back much further, dropping him into the Black Hills outside Deadwood, Dakota Territory. And the “return to home” button doesn't work either. 

Asplundh waits by the stranded machine, then goes into town to find work. He takes a job at a gold mine tending mules underground, and quickly learns that the underground workings are deep in crisis: Earlier that month, something very odd started afflicting the mules and the mining gear too, reducing daily tonnage, and then ore quality. Seeing the miners spiral into despair, Asplundh realizes he has a skill nobody in 1877 has: training in root cause analysis! 

Co-opting the Acme mine's baseball team to serve as fellow investigators, Rick uses all the tools in the root-cause-analysis toolbox to narrow the huge initial brainstormed list of possible causes. He feels he is closing in on the chain of events until one fateful day, when the Acme mine's management decides that he has been planning to sabotage the operations for a rival owner.

Now, Part 3 of the memo from our hero to his far-away boss, “A Traveler's Tale.”

                         ~  ~  ~  ~  ~  ~ ~  ~

THIRD MEMO
From: Richard C. Asplundh, Staff Engineer, Rapid Prototyping Div.
To: Boss
Re: Mixed Results with Prototype X-1A Time Machine, Continued
Date: September 8, 1877
Via: Sulphur Springs Tonic Water


As you recall, I was in hot water earlier this week because I had my baseball team work up an Anticipatory Failure Determination. (Note to self: I should have gotten the mine bosses on board with AFD first, which is a powerful tool but one that can sound pretty strange to a stranger.)

As you know, AFD asks experts to brainstorm how they'd make sure that some specific and catastrophic problem comes about.

I'm summoned to the mine's front office and accused of planning to sabotage the mine. Why? Because I had asked the company baseball team (doubling as my informal root-cause team) to work out answers to the following question, while hewing to the principles of AFD: “If you wanted to guarantee that the miners in Acme's 'Queen of the North' payzone go after the worst ore possible -- ore so bad that it would crash the whole Acme business -- how could you be sure they do it wrong every time, day after day?”

By turning the tables on the usual prevention-oriented questions, AFD has a way of prying out new and creative thinking on cause and effect. It can be very useful when departments are on the defensive and no longer contributing.

So the mine bosses prepare to wring out a confession. I'm guessing they think I've been sent from a rival mine to ruin their business, so the enemy can buy it for a song.

Time to show my cards. “Let's think this out first," I say, displaying a steely-eyed confidence I don't yet feel. I point to the calendar on the wall. "The problem started two weeks before I even landed here,” I say, "so I didn't have anything to do with it. Yep, yep, it was an odd question and I should have cleared it with you gentlemen first, but it was just my way of confirming what's really wrong with the mine's bottom line. What's more, we found the root cause! A few simple steps, and the Acme Mine's problem is solved!”

That brightens them up enough to let me pace the floor while I talk. The superintendent had been fingering a length of rope, perhaps planning to make a hangman's knot, but he sets it down. No noose is good noose!  

Cautionary note: at this point I actually haven't nailed down all the details, but it's no time to dither. I figure that just talking through the facts one more time will make the picture clearer. So I take a deep breath and reach over for the long roll of butcher paper holding my team's last two weeks of work, including the Current Reality Tree. It's damned impressive – it'd run across the street and into the saloon if I rolled it all out.

I point to a date in the timeline. "Your problem started when the Health & Safety person, Yosemite Jack, decided to get rid of the mine-rats by starving them out,” I say. “It had some unintended consequences. On or about July 28 Jack ordered the miners to stop feeding the rats the scraps from their buckets when on lunch break in the mine. The men weren't so sure about this, but they decided to go along. The rats got hungrier and hungrier. What to do? Our furry little friends started nibbling on the leather steam hoses.

"The steam leaks got worse and worse, so the steam drills didn't work like before. The daily tonnage went down just a little, day by day. Look at this line right here, the first week of August,” and I jab at the chart. “That's from the Queen of the North payzone. The problem starts to really hit home on August 8, when some goofball spills a big pail of lard on eight crates of dynamite sitting in the sun, before they're taken down the hoist for storage in the mine. How did that happen?”

“The lard was to grease the axles on the mine cars,” says a shift boss, “and I guess somebody from the Supply Department knocked the keg over when horsing around. It got hot, the bung comes loose, it dripped out.”

“Aha! From little acorns, great oaks grow! Nobody thinks to clean the lard off and two hundred pounds of yummy dynamite go down to the bottom of the mine for storage in the powder magazine. The rats sniff out the lard, like icing on a cake. Unfed, unloved, they start dragging sticks of dynamite off to nibble them later. They hide them … where?” I point to a diagram of the mine. My finger starts at the powder magazine and slides over to the underground mule stable, a few hundred feet down a side tunnel.

Too-Tall Johnson offers: “In the mule feed?”

“Exactly! Rats live in hay because it's warm. Rats are always hiding food and they hide some in the hay for future meals. The mules come along and munch them up along with the hay and it gives the mules a bad headache. Worse: one mule bites down a little too hard, and away it goes. Now we have sick mules with headaches, we've got superstitious miners, we've got an unexplained explosion in the stable, we've got leaky steam hoses, we've got drills that underperform, and we've got a dynamite shortage.

"So why does this go on? Nobody puts the pieces together, and some of the pieces are trouble to even talk about. We root-cause people see this all the time: the facts are there but nobody puts them together.

"Take the powder monkeys. They don't say anything because they know they're short on the dynamite count, and they don't want to let management know that somehow they're a couple of cases short. Nobody wants to speak up - they might get accused of stealing dynamite for some gang of highwaymen, or for some cheapskate mine wanting to cut its costs. What to do? They can hold back on how much dynamite they put into the holes before blasting. That seems to make sense because the holes aren't so deep as before, what with the drills losing power from leaky steam hoses.

"So now neither the drills nor dynamite are working like they did before, and the daily tonnage really starts to fall off the cliff. Look at this line on the chart here, when you managers order them to bring the tonnage back up, or else. What's the only way to bring the tonnage up?” I point to my Anticipatory Failure Diagram.

The bookkeeper speaks up. “They could have tried to fake the scalehouse numbers, but that's double-checked at the stamp mill and outside our containment zone.” He stands up to read better, “… Or they could start blasting out crumbly ore that's weak and easy to break out, whatever the assay! That would bring the tonnage up, but at the cost of quality.”

“Exactly! Each department tried to paper over their own problem, just enough to stay out of trouble. So we get the Undesirable Effects box, listed right here at the top of the Current Reality Tree: First a big drop in tonnage, then a drop in assayed ore quality. The men in the Queen of the North payzone went after low-quality quartz that's all rotten and fractured. The mining engineer was out all this time with broken bones so he couldn't get around to see what was going on.

"He wasn't a root cause of the problem, but when he fell down the shaft and got laid up, he opened the way for the little causes to join up. So: we talked about sick mules, exploding hay, an injured engineer, leaky steam hoses, and bad cases of Giant Powder. None of these were a root cause, but they opened the way for the mine rats to cause havoc with your business plan. I'd have to admit that it's the first Rat Cause Analysis I've ever worked on.”

So that's the story, boss! They gave me a rousing cheer. Better yet, they approved a project budget and appointed two action teams, one aboveground and one underground. Next day the Acme Mine got rid of all the greasy dynamite, they cleaned out the explosive hay in the mule stable, they smoked a peace-pipe with the mine rats, they brought in a temporary mining engineer, and they bought new steam hoses and drill bits. Acme is back up and hitting the numbers on tonnage and quality too. Now we'll see if they remember my closing suggestion: weekly followup for the next two months, then monthly through the end of Fiscal 1878. The landscape is littered with great action plans that fell apart because nobody in authority followed up.

Speaking of the next two months, I'm still stuck here in 1877, but I'm going to try a little Bayesian inference to troubleshoot my misbehaving time machine. I'll need a lot more data to plot, though, so that means more trial runs of the X-1A. 

So keep me on the payroll, and stay alert for more messages! My memos might come packaged up in anything from a Sumerian clay pot to a force-field bottle.

Monday, July 16, 2012

Train collision in Oklahoma: Jump or die


I mentioned in this post, Unstoppable and the Story of CSX 8888, that railroad companies don't permit engineers to leap from a moving train except in cases of imminent collision. This may have struck some as a quaint but unlikely scenario, a reference to engineer Casey Jones' last words to his fireman: "Jump, Sim, Jump!" 


But it still happens. A recent incident came with the June 24 collision of two Union Pacific freight trains near Goodwell, Oklahoma. (Photo: Trudy Hart, Guymon Daily Herald)
Each head-end engine (the locomotive in front) had two crewmen, for a total of four lives at risk.

How could two dispatched trains smash into each other in mid-morning on a straight, flat stretch of track? So far there's no explanation why the westbound train didn't get to its assigned siding in time, though railfans have plenty of speculations. 

We have only the bare facts from the NTSB preliminary statement: speed of each train, total weight (about 12,000 tons), the weather (clear and hot), and the trains' location on a main line in the UP's Pratt Subdivision. But an explanation will be forthcoming: for one, there's recorded data. While the collision destroyed black-box recorders in the lead engines, each train had a locomotive at the rear with its own event recorder, and the UP's traffic computers should have a record of what happened when.

And there's a survivor to interview. While three crewmembers died in the crash and flames, a fourth jumped from the slower of the two trains and tumbled to a stop with minor injuries. 

Knowing when to stay or go is a key part of what I call the Survivability Zone -- more on that later.

Saturday, July 7, 2012

Root Cause Analysis: The Traveler's Tale, Part 2

Recap of Part 1: In the first installment about our analytically-inclined time traveler, Rick Asplundh plans an easy test of the company's time machine at the request of the R&D Division chief, intending to go back in time just a few minutes and then return. But the uncalibrated machine goes back much further, dumping him in the Black Hills outside Deadwood, Dakota Territory. Date: 1877. And the “return to home” button doesn't work either. Asplundh waits by the stranded machine, then goes into town to find work. He takes a job at a gold mine tending mules underground, and quickly learns that the underground workings are deep in a business crisis. Earlier that month, something very odd started afflicting the mules and the mining gear too, reducing daily tonnage, and then ore quality. 

Seeing the miners spiral into despair, Asplundh realizes he has a skill nobody in 1877 has: training in root cause analysis! He convenes a group of fellow miners at a saloon and takes them through the initial scoping steps. Rick uses their bullet points to write up an incident description, but that's just the beginning. He needs the Acme mine management to empower a team and get behind a business-improvement process that won't be invented for another seventy years. Now, Part 2 of the memo that Rick is writing to his boss, “A Traveler's Tale.”

       ~ ~ ~ ~ ~ ~ ~ ~ ~

           SECOND MEMO

From: Richard C. Asplundh, Staff Engineer, Rapid Prototyping Div.
To:      Boss
Re:     Mixed Results with Prototype X-1A Time Machine, Continued
Date:   September 6, 1877
Via:    "Winged-Victory" Brand beer bottle

Picking up from the first memo: I was on the way to Too-Tall Johnson's office after getting off work one morning, pondering how I could get the mine bosses to take my incident description and root-cause analysis seriously.

I find the front office abuzz with sports talk. Forget the gold mine going bust: today's panic is the mine's baseball team, because the playoffs are coming up. Another mine has hired away our coach, and odds are long. It looks like the crises are just piling up into a big explosion, but I am not discouraged. Using principles from Eliyahu Goldratt's Theory of Constraints, I think I'll try an Evaporating Cloud diagram. It's one way to work through apparently unsolvable dilemmas to a win-win solution by breaking down preconceptions. I go to a dusty counter, unroll a sheet of butcher paper, and draw the boxes and arrows, Goldratt-style: The mine bosses want the business problem to be fixed but they are so discouraged they don't want to invest any effort or staff … unless it's for the baseball team. I probably could help the mine by nominating and leading a root-cause investigation team if I could get one with knowledge and implementation authority, but that's a hard sell in the 1870s. I've got a true dilemma on my hands that, if left unsolved, will wreck the business and my confidence as well.

Goldratt said that every dilemma offers a way out – no exceptions. The word “team” sticks in my mind. I decide to focus my thinking on the mine's nine and its place in this puzzling picture, then Eureka!

I evaporate the cloud by injecting more details and getting past old assumptions, just like Goldratt described in his business novel The Goal. Fact: the Acme Nine needs a head coach. Fact: I coached Little League baseball. Fact: from going to the games, I know that our batting roster has almost all the mining knowledge I need, from hard-rock miners to drillers, bookkeepers to blacksmiths to the steam hoist operators. We even have a shift boss as a third bagger, and we'll need a supervisor to carry the weight when the going gets tough, as it does in every root-cause process. The roster has a few holes, but I can figure a way to fill any knowledge gaps, or my initials aren't RCA.

I turn, call for attention, and shout these immortal words: “What if I coach the team, and finish out the season?”

A pause, then “Huzzah for Rick!” they say. They know from my color commentary in the stands that I understand the game, and I've shown them a few pitches outside the Gem Theater that turned their heads. Or was it the beer?

“But first!” I pull out my first-draft Problem Statement and ask for some advice. After some editing around the potbelly stove, it reads like this when signed by the mine superintendent:

“Problem: Beginning in early August the Acme Mine began suffering from a mysterious, abrupt production crisis, first reflected in daily tonnage and then a declining ore assay. Together these have slashed revenue and threaten to close the operation. Goal: Find the root cause and meet or exceed revenue targets before the next quarterly report to Corporate.”

“One more thing,” I add as I fold up the paper, which is critical to scoping my investigation and pursuing implementation schemes. “We've got a problem in the infield, so I need to switch out our shortstop with a Welshman who works in the powder magazine. I'm thinking Scorch-Face Smith.” Yes, Scorch is a good switch-hitter when he's sober, the superintendent agrees, but wants to know why. I think fast and say daily exposure to nitroglycerine is going to give Scorch that extra burst of speed to steal some bases.

The real reason is that the first wave of data-gathering has given me an intuition (yes, hunches are a valid part of any root-cause investigation, if followed with data and logical cause-and-effect rigor) that we can't dig up our root cause without an explosives guy on board.

Why? I hear that the powder monkeys know more about some problems down below than they have let on. And that reluctance doesn't necessarily mean they are part of the root cause. Typical for root cause investigations: somebody's always dragging their feet, or actively misdirecting the effort, and guilt may have nothing to do with it. They act out of fear their department will get blamed or maybe they'll have to do more work because somebody else screwed up. I'd say that office politics is in the top three reasons why root-cause efforts fail, along with lack of persistence and poor followup at the end of the improvement process, during implementation.

So on top of everything else in managing my new 1870s lifestyle and running the mule team downstairs, I'm supposed to coach a pretty rough bunch into a championship baseball team. I teach them a few pitching and hitting tricks that Abner Doubleday never thought up. They appreciate the tips and that's why they're willing to indulge my root-cause work on the side. They think I'm half-addled – I can't even keep my story straight on where I came from. That's okay – whatever it takes!

I don't have to remind you that mustering the right investigation team is key: they need to have subject-matter expertise, good interpersonal skills to dig out the “who, what, when, where, why, and how” information, and finally the ability to implement a solution. Because it looks like we have a long and tangled chain of causation that produces multiple symptoms (nothing simple here), they need a good facilitator to keep them on track. That's me.

One of my jobs is to keep the root cause team/ baseball team in the containment zone. That means sticking to the range of relevant operations: not all events that affect business operations, but only those that they have the ability to change. So when they start blaming gold prices in Frisco, that's out of bounds. I'm not looking for cheery faces all the time: I usually find constructive disagreement gives the best results.

And there's plenty of unhappiness to keep me happy. Right away I have to put a new man in right field since I need a really smart guy from the steam-drilling department. He's got bad eyesight and this would have caused an uproar except that our pitcher is developing quite a slider and not many balls are getting out to right field.

Two weeks pass. The baseball team has won a few games nobody expected and now it's time to introduce my newly energized baseball team to the Ishikawa fishbone diagram. “Now, I know what you're going to say, so just hold your horses,” I start out. “Scorch, you were going to say that fishbones have their detractors, but! I like fishbone diagrams because filling them out brings forth rich speculation about the full range of possible causes and connections. So bear with me.” These are hard-rock miners who haven't been doing a lot of diagramming on the frontier or any kind of symbolic logic for that matter, so it's not easy to sell them on the idea, but I insist that this is part of my strategy to raise team morale before the playoffs.

So after the practice each afternoon, and with a few gallons of beer that I buy, we flesh out the fishbone. You'd recognize it right away: problem statement goes on the right, linked to a spine that ties together branches representing major categories of possible causes. We settle on six fishbone categories adjusted for the setting:
  • Machines: Steam drills (drills, boilers that supply steam); tools for dressing bits; tools used at the working face in the mine; Hoist gear; Mine cars; Mine rails; Timbers to hold up workings
  • Men: Miners (poor training, poor safety discipline, labor unrest); Hoist operators; Mule wranglers and stablehands (I make sure to include my department, to make the point that no one is immune from the evil eye)
  • Materials used up daily: Dynamite, Coal used for raising steam, Steel drill bits, water quality
  • Management: Superintendent; shift foremen
  • Measurement: Assay office, scalehouse that weighs tonnage, surveys of underground workings
  • Mine (Environment): Ore bodies; Temperature; Humidity; Flooding; Fire; Cave-ins
Normally I'd keep my fishbone nice and dry in a conference room with a big whiteboard, using markers and sticky notes. A good fishbone is something the Japanese keep posted in meeting rooms for months, in fact. But in the Wild West, I settle for a whitewashed wall on the back of a toolshed by the baseball diamond and that's my whiteboard. We just nail the pieces of paper to the board.

It can be hard to move the notes around on such a board but we've got to have something that keeps the paper from blowing away in the Chinook wind. Four-Finger Halloran, the second baseman, says we should move the mining engineer from “Men” to “Management” since he does the geology and his plans guide the excavations. The bat-boy, who's wanted in Texas for a murder or two, speaks up: “Yep, if the engineer was on the job! He was visiting his uncle over in the Glory Hole Mine last month, stepped on a rotten trapdoor and fell down a shaft. He's been laid up all month.” I tell them not to jump to conclusions like blaming it all on the absent engineer. But something here is worth pursuing, so I enrich this stem with some notes and move the Mine Engineer stem and all its twigs to a new spot under the Management branch. I need a nail to hold it and one of the boys obliges with a throwing-knife: it lodges in the paper and misses me. I give him a hard look but we now have a set of possible intermediate causes and will add earlier causes to them.

As I've said a million times or more, the typical fishbone diagram starts as no more than a categorized, broad-span brainstorm list based on current knowledge, in which the main categories show up as four to eight big branches. Within each branch are stems. Here, “drills” and “hoists” are stems under “machinery.” Each stem that looks promising to the RCA team deserves the addition of twigs that list the events leading to the specific problem. 
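
For anyone who wants to keep a fishbone somewhere drier than a toolshed wall, the branch, stem, and twig layout maps onto a nested dictionary. This is a rough sketch only: the category names come from the list above (trimmed to a few stems each), while the helper names `add_twig` and `move_stem` are made up for illustration.

```python
# Branches hold stems; each stem holds a list of twigs (notes on possible causes).
fishbone = {
    "problem": "Drop in daily tonnage, then in ore assay",
    "branches": {
        "Machines": {"Steam drills": [], "Hoist gear": [], "Mine cars": []},
        "Men": {"Miners": [], "Hoist operators": [], "Mule wranglers": []},
        "Materials": {"Dynamite": [], "Coal": [], "Steel drill bits": []},
        "Management": {"Superintendent": [], "Shift foremen": []},
        "Measurement": {"Assay office": [], "Scalehouse": []},
        "Mine (Environment)": {"Ore bodies": [], "Flooding": [], "Cave-ins": []},
    },
}


def add_twig(diagram, branch, stem, note):
    """Attach a note to a stem, creating the stem if it is not there yet."""
    diagram["branches"][branch].setdefault(stem, []).append(note)


def move_stem(diagram, stem, src, dst):
    """Move a stem and all of its twigs from one branch to another."""
    diagram["branches"][dst][stem] = diagram["branches"][src].pop(stem)


# The second baseman's suggestion: the mining engineer belongs under Management.
add_twig(fishbone, "Men", "Mining engineer", "Laid up all month after a fall down a shaft")
move_stem(fishbone, "Mining engineer", "Men", "Management")

print(sorted(fishbone["branches"]["Management"]))
```

No throwing-knife required to hold the stem in its new spot.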

It's this later work, probing into possible causes with methods like the Five Whys, that brings the project to a good conclusion. I don't believe that using “why” exactly five times for each symptom is always the best method, and sometimes a better question is “how could,” but the boys seem to prefer the Whys. There are too many permutations to investigate to the Nth degree, but there's enough wisdom on the baseball team to pick the most promising theories for a closer look. I buck them up by saying that the odds are good that somewhere in this burgeoning list is our root cause, or causes, along with the chain of events that led to the bad events described in the Problem Statement. Now we need facts to prove, or disprove, whether the circled, top-priority causes were a factor.

“Okay. Now set aside all your guesses on why the mine is in so much trouble,” I say. “Just go out and gather the data and let the bones fall where they may.” At first the boys struggle with this, since it could reflect badly on their department, but I compare it to a bloodhound. Who can lead a bloodhound when it picks up a scent? Nobody! The bloodhound finds its own trail.

Duly instructed if not inspired, the Acme Nine baseball/root-cause team indulges me by bringing new information after each practice. Some of it is in tabular form. I gather it up and try out a Pareto chart – trying to find the classic 20% of causes that will bring 80% of the benefit. Paretos need a heap of statistics and these are sparse along the frontier. All I have for each day are things like mining tonnages, water in the sump, steam generation, and staffing levels. The mining engineer is still out from his fall down the shaft, so I don't have the benefit of his expertise. In fact, it may be better that he stays away. The mood was pretty ugly last week, because some of the fellows thought he had sabotaged the mine.
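
The arithmetic behind a Pareto chart is simple enough to sketch: rank the categories by size, keep a running share of the total, and see how few of them it takes to reach about 80 percent. The cause names and tonnage figures below are invented purely for illustration (the memo gives no such table), and the function names are my own.

```python
def pareto(losses):
    """Rank causes by size and attach the running share of the total."""
    total = sum(losses.values())
    ranked = sorted(losses.items(), key=lambda kv: kv[1], reverse=True)
    rows, running = [], 0.0
    for cause, amount in ranked:
        running += amount
        rows.append((cause, amount, running / total))
    return rows


def vital_few(rows, threshold=0.8):
    """Smallest leading group of causes whose cumulative share reaches the threshold."""
    picked = []
    for cause, _amount, share in rows:
        picked.append(cause)
        if share >= threshold:
            break
    return picked


# Hypothetical lost tons per week, by suspected cause (made-up numbers).
lost_tons = {
    "hoist delays": 4,
    "steam supply": 50,
    "flooding": 3,
    "explosives": 35,
    "mule haulage": 8,
}

rows = pareto(lost_tons)
for cause, amount, share in rows:
    print(f"{cause:<14}{amount:>4}{share:>8.0%}")

print(vital_few(rows))  # ['steam supply', 'explosives']
```

With sparse frontier data the ranking will be rough, but even a rough ranking tells the team where to dig first.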

We circle a dozen possible causes to start with, though we'll probably have to dig into more of them later. To illustrate how hard they work at this, take the Steam Drill Problem, which is just one twig under the branch labeled Machinery:

Why – Level 1: “The steam drills are under-performing by 35% when making holes for placement of explosives - Why?” They guess that it could be the drills, the driller, or the steam supply. The drill squad goes off to investigate and reports back to the full team that nothing is wrong with the drills or drillers, but the steam flow is low.

So that leads to the Second Why: “Why is the supply of steam to the drills low?” The team says it could be due to at least five causes, including the boiler machinery (such as plugged tubes, from mineral scaling), or a change in fuel supply, or efficiency of the steam lines. The “steam team” goes off to check and reports back that supply pressure is nominal at the boiler side, but strangely low at the far end of the hoses. A clue!

Now the Third Why: “Why is steam pressure low at the drills' inlet?” Before they charge off, I have them brainstorm all possible causes: there could be too much hose in the run, the hose could be leaky, or maybe it's plugged with rust or debris. “Give me numbers!” I cry, and the steam team hustles off to check. They discover that the hoses are leaking more than usual: an average of 2.5 leaks per 20 feet of hose. We are starting to close in on a contributing cause, I'm sure.

And a fourth Why: “Why are there so many leaks in the steam hoses?” They hypothesize as follows: the leather or rubber could be getting old, the man in charge of patching the routine holes might be falling down on the job, there could be sabotage, or something could be wearing them out faster. A newly appointed Hose Team goes off to investigate.

So you get the idea. By now everybody in the mine begins to see that using Five Whys is no shortcut, and that root-cause work is more perspiration than inspiration.
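
The chain of Whys above can be written down as a simple list, one record per level, with the finding left blank until a team reports back. This is just one way to keep track, sketched under my own assumptions: the `Why` class and the helper names are hypothetical, while the questions and findings are lifted from the steam-drill example.

```python
from dataclasses import dataclass


@dataclass
class Why:
    question: str
    candidates: list   # brainstormed possible causes
    finding: str = ""  # stays empty until the team reports back


chain = [
    Why("Why are the steam drills under-performing?",
        ["drills", "drillers", "steam supply"],
        "steam flow is low"),
    Why("Why is the supply of steam to the drills low?",
        ["boiler machinery", "fuel supply", "steam lines"],
        "pressure is nominal at the boiler but low at the far end of the hoses"),
    Why("Why is steam pressure low at the drills' inlet?",
        ["too much hose in the run", "leaky hose", "hose plugged with rust or debris"],
        "hoses are leaking more than usual"),
    Why("Why are there so many leaks in the steam hoses?",
        ["aging leather or rubber", "poor patching", "sabotage", "faster wear"]),
]


def open_questions(whys):
    """Questions that still need somebody to go and gather facts."""
    return [w.question for w in whys if not w.finding]


def trail(whys):
    """The chain of confirmed findings so far, in order."""
    return " -> ".join(w.finding for w in whys if w.finding)


print(trail(chain))
print(open_questions(chain))
```

Each answered level becomes the next level's question, which is all the Five Whys really asks of the record-keeping.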

Meanwhile I'm compiling a detailed chronology for the Change Analysis. That's a complete list of events at the mine, day to day, over the last six weeks. It took a lot of work, but I like chronologies: a chronology is one way to sift through the causes and effects, in this case what might have driven the deterioration in production statistics. My Pareto chart tells me that there's no significant difference in ore production between the day shift and the night shift, but there was a striking change in tonnage over time in one of the three payzones where miners are working. That payzone, called Queen of the North, had been accounting for most of the revenue, and now it's way down, so the Pareto charts indicate a cause or causes should be found there.

We carry on for another week and I feel like we're closing in on it. The Acme baseball team is on a winning streak. Better yet, a plausible chain of causation is emerging out of our many Cause and Effect Diagrams and a Current Reality Tree.

I start them working on an Anticipatory Failure Analysis. Shortly afterward the crisis hits, which it always does at some point among us root-cause practitioners. But this crisis is a little more pointed: namely, the point of a Colt .45. When I walk into the office, there's a committee waiting for me. Or shall I say a posse? I'll finish my report when and if they let me out of the hoosegow.

         ~ ~ ~ ~ ~ ~ ~ ~ ~
Conclusion in Part 3!