Tell me about pipeline software?

greenspun.com : LUSENET : TimeBomb 2000 (Y2000) : One Thread
Could we be facing a series of pipeline blowouts brought on by rushed Y2k fixes?

>>>>>>>>>>>>> background - the Bellingham blowout
The story as reported was (1) they were doing something with computer programs when (2) a "maintenance program" used up too much memory and the system bogged down, so (3) they rebooted, and somewhere along there the line blew out.
I recast (2) as: a housekeeping module failed to prevent a memory leak. This system handles x-hundred chunks of data a second, or something, and now it's got a memory problem. Maybe they had the problem all along, and had been rebooting, and just failed to catch it in time THIS time. Or maybe this is a brand new problem in a new module.
So maybe the line had a weak place just waiting to blow. Or maybe a faulty shut-down routine or a timing glitch in the program caused some kind of transient pressure spike in the line. If we have a brand new program with a serious memory leak, then I'm not too reluctant to think it might have other faults in timing routines.
If this is an error in a mainline logic path, and it bogged pretty quick after loading it, then it must have been tested pretty lightly. A memory leak shows up quick under load, unless it's in a low-volume part of the code. So who's installing a poorly-tested program in mid-1999? Sounds a lot like a fix that was rushed in for Y2k compliance. Ok, got no real evidence at all for this, I'm just mulling it over.
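To make the memory-leak point concrete, here's a minimal sketch (invented Python, nothing to do with the actual pipeline system): a housekeeping table that adds entries but never evicts them. It sails through a small unit test and only becomes a problem under production-scale message rates, which is exactly why light testing misses it.

```python
# Hypothetical sketch of a memory leak that only bites under load.
# All names and numbers here are invented for illustration.

processed_log = {}  # leak: entries are added but never evicted

def handle_reading(msg_id, payload):
    """Record each sensor reading so duplicates can be detected."""
    if msg_id in processed_log:
        return False                    # duplicate, ignore
    processed_log[msg_id] = payload     # BUG: nothing ever removes this
    return True

# A unit test sending 100 messages never notices the problem:
for i in range(100):
    handle_reading(i, b"ok")
print(len(processed_log))   # 100 entries -- harmless in a short test

# At a few hundred messages a second, the table grows by millions of
# entries a day, until memory runs out and the system bogs down:
for i in range(100, 1_000_000):
    handle_reading(i, b"ok")
print(len(processed_log))   # 1000000 and still climbing
```

The point of the sketch is only that the failure is invisible at test volumes: the leak's growth rate is proportional to traffic, so a light benchtest proves nothing about behavior under load.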
>>>>>>>>>>>>>>>>> end background
If I'm gonna build a pipeline, I'm not gonna put it in the ground and hire some kid out of vo-tech to write a VB program to run it. This has got to be heavy engineering, fluid dynamics, telecomm, you name it. You've got to have time-sensitive scripts for opening and closing valves, so they won't induce pressure spikes. And if it's new, it's gotta have a whiz-bang GUI to impress the manager who bought it, right? So we're talking major intellectual investment to write this thing.
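The "time-sensitive scripts so valves won't induce pressure spikes" point has a standard engineering basis: the water-hammer (Joukowsky) relation, where an abrupt velocity change produces a surge dP = rho * a * dv. Here's a back-of-envelope illustration with made-up fluid numbers, showing why a ramped closure is so much gentler than slamming a valve shut:

```python
# Illustration only -- the density and wave-speed values are assumed,
# not taken from any real pipeline.

RHO = 850.0          # crude oil density, kg/m^3 (assumed)
WAVE_SPEED = 1000.0  # pressure-wave speed in the line, m/s (assumed)

def joukowsky_surge(delta_v):
    """Peak pressure rise (Pa) for an abrupt velocity change delta_v (m/s)."""
    return RHO * WAVE_SPEED * delta_v

def ramped_closure(v0, steps):
    """Close the valve in equal velocity steps; each step's surge is smaller."""
    return [joukowsky_surge(v0 / steps) for _ in range(steps)]

slam = joukowsky_surge(2.0)        # slam the valve shut against 2 m/s flow
gentle = ramped_closure(2.0, 20)   # same closure spread over 20 steps
print(slam)          # ~1.7 MPa spike
print(max(gentle))   # each step only ~85 kPa
```

(Real surge analysis also has to account for wave reflections and closure timing relative to the pipe's period, so this understates the complexity — which is bw's point: this is heavy engineering, not a vo-tech VB project.)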
But there aren't all that many pipelines, few hundred, whatever, so you have a real narrow market that requires a big investment to break into. That tells me we have only one or two vendors writing all the pipeline management software.
How's that work? Probably have a generalized engine with all kinds of parameter controls, load your local configuration, where the valves are, the tanks, etc. Maybe the vendor customizes it just a little for every client, special reports, whatever. That would correspond real well with some systems that I have personal knowledge of, systems in heavy-engineering but narrow markets.
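A sketch of the architecture being guessed at here — one generic engine, per-site configuration — might look like this (every name, tag, and limit below is invented):

```python
# Hypothetical "generalized engine with parameter controls" loading a
# local site configuration. Purely illustrative.

SITE_CONFIG = {
    "line_name": "Example Lateral 7",
    "valves": {"V-101": {"max_psi": 1400}, "V-102": {"max_psi": 1200}},
    "tanks": ["TK-1", "TK-2"],
    "poll_interval_s": 0.5,
}

class PipelineEngine:
    """Generic engine; everything site-specific comes from the config."""
    def __init__(self, config):
        self.valves = config["valves"]
        self.tanks = config["tanks"]

    def check_pressure(self, valve_id, psi):
        """True if a reading is within this site's configured limit."""
        return psi <= self.valves[valve_id]["max_psi"]

engine = PipelineEngine(SITE_CONFIG)
print(engine.check_pressure("V-101", 1350))  # True: within limits
print(engine.check_pressure("V-102", 1350))  # False: over this valve's limit
```

The business consequence bw is driving at: in this model a bug in the shared engine ships to every customer at once, while the per-site customization layer is where the vendor's scarce staff time goes.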
JUST SUPPOSE that this is not an isolated hunk of code, but is simply the first installation of the new "Y2k compliant" version of the package that runs all the pipelines in the country? Do we have a bunch of managers elsewhere saying "Whoops, better wait for a patch"? Are the big pipelines browbeating the vendor hoping for priority on THEIR customized system, while little pipelines are just hoping for the best, no time to test, we'll just have to put it in and hope they got it right?
Anybody out there who can fill in the blanks?
-- bw (home@puget.sound), June 18, 1999
No need for facts. Feel free to speculate. Speculation will be taken as fact.
-- cd (artful@dodger.com), June 18, 1999.
Nope, need some facts. Speculation's value is in sketching scenarios, because it suggests the next questions to ask. A lab scientist runs a particular test because he thinks it will prove or disprove a guess he has already made.
Speculation on the pipeline suggests that useful questions might be: (1) How many companies are writing pipeline management software? (2) Will this company make any statements about whether this particular program change was Y2k related? (3) Did the software that (apparently) messed up come from a vendor, or was it written in-house? (4) If from a vendor, is the pipeline manager happy with the response time they're getting from that vendor? How big is the vendor's staff? Did a vendor rep come and install this package and tell them to turn it on, or did they benchtest using this pipeline's configuration first? Were any programming changes made by that vendor rep onsite?
The scientific method, contrary to myth, does not consist of gathering lots of facts and then the truth appears by magic. It consists of making guesses and then figuring out whether the facts support or undercut that guess. The tough part is in letting go of a bad guess, when facts are against it.
So far, in the pipeline case, we have virtually no facts. Time to figure out what questions to ask.
-- bw (home@puget.sound), June 18, 1999.
bw wrote: "If this is an error in a mainline logic path, and it bogged pretty quick after loading it, then it must have been tested pretty lightly. A memory leak shows up quick under load, unless it's in a low-volume part of the code. So who's installing a poorly-tested program in mid-1999?"
-------------------
I may not know anything about pipeline software, but I do know programmers. And I know way too many programmers who do unit and system testing with a minimal amount of data. If the software isn't fully tested by another person, it can go into production and fail when the data overload happens. Unfortunately, the testers are sometimes as bad as the programmers when it comes to the amount of data they use to test. It drives me up the wall!!!
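DJ's complaint — tests run with a minimal amount of data — has a classic failure shape worth illustrating (invented example, not pipeline code): an algorithm whose cost grows quadratically with input size looks instant in a 100-record unit test and chokes at production volumes.

```python
# A linear scan per record: fine at unit-test sizes, quadratic overall.

def dedupe_naive(records):
    """Return records with duplicates dropped, counting comparisons made."""
    seen, out, comparisons = [], [], 0
    for r in records:
        comparisons += len(seen)   # scans everything seen so far
        if r not in seen:
            seen.append(r)
            out.append(r)
    return out, comparisons

_, small = dedupe_naive(list(range(100)))    # the size a lazy test uses
_, big = dedupe_naive(list(range(5000)))     # closer to a production burst
print(small)   # 4950 comparisons -- looks fine
print(big)     # 12497500 comparisons -- ~2500x the work for 50x the data
```

Whether the Bellingham problem was a leak, a quadratic scan, or something else entirely, the lesson is the same: volume-dependent bugs are exactly the class that small test datasets cannot catch.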
DJ
-- DJ (reality@check.com), June 18, 1999.
I'll call some people at Pipeline & Gas magazine on Monday - let you know more then. Technically, don't put too much stock in the "memory leak" theory - it could have happened, but not necessarily in that specific way.
Equally likely: a "simple" open-close error, or a sensor-off = signal-off = turn-off command chain. Or some other problem - we don't know enough yet to say. At every decision point there is one way to code the command correctly, and at least one way to code it wrong. Given several hundred thousand decision points, all potentially affected by re-introduced bugs after the software upgrade, hundreds of new errors could have crept in.
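The back-of-envelope arithmetic behind that claim (the per-point error rate below is an assumption, chosen only to show the shape of the argument): with N decision points each carrying a small independent chance p of a re-introduced bug, the expected number of new errors is N * p, and the chance the upgrade is completely clean is (1 - p) ** N.

```python
# Made-up rates, illustrative only.

decision_points = 300_000   # "several hundred thousand", per the post
p_error = 1e-4              # assumed 1-in-10,000 chance per point

expected_new_bugs = decision_points * p_error
prob_totally_clean = (1 - p_error) ** decision_points

print(expected_new_bugs)    # ~30 expected new errors
print(prob_totally_clean)   # essentially zero chance of none at all
```

Even if the true per-point rate is ten times better, the expected count stays in the single digits and the probability of a flawless upgrade stays near zero — which is why the next paragraph's conclusion (you can't test them all) follows.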
Testing all of them can't be done - at best you can filter out the easiest to find, and test as many conditions as possible, and hope for the best.
As stated, then: expect many more incidents like this as time runs out and the management pressure to put software into use in the field, without thoroughly testing all of the new interfaces, becomes extraordinarily high.
-- Robert A. Cook, PE (Kennesaw, GA) (cook.r@csaatl.com), June 20, 1999.