did you all notice this y2k power plant example in that chicago trib story?

I know that the Chi Trib ("global warning") was posted earlier today, but I just wanted to point out this example from that story, for those who wonder about "hard facts" on y2k, or "concrete examples:"

**As part of an experiment last year, technicians at the huge Xingo hydroelectric dam on Brazil's Sao Francisco River set the dates on the plant's main computer forward to Jan. 1, 2000.

**What happened next is still sending chills through Latin America.

**"When they put the date forward, the whole control board went haywire," remembers Marcos Ozorio, one of the members of Brazil's presidential Year 2000 commission. "Twelve thousand warning lights flashed all across the board, with all kinds of alarm information."

**Technicians quickly switched back the date, and are now ferreting out the plant's Y2K bugs. But "if you had been surprised by a situation like this, what you'd have had to do is shut down the plant until you found where the failures were," Ozorio said. "Automatically you'd be taking off the energy board 30 percent of northeast Brazil."

Gee, I guess it's serious, huh?

Link:

A global warning on Y2K



-- Drew Parkhill/CBN News (y2k@cbn.org), March 22, 1999

Answers

Oh well...so much for Hydro being a reliable source of post y2k juice...

-- a (a@a.a), March 22, 1999.

Thanks Drew. I saw this article but missed that part. It's added now.

-- Rob Michaels (sonofdust@com.net), March 22, 1999.

Thank you very much Drew. Talk about timing! I just linked this with "Old info but apparently still being ignored." I was hoping to find Mr. Cook tonight and get his comments, but this is a home run! You da man! <:)=

-- Sysman (y2kboard@yahoo.com), March 22, 1999.

Drew,

This example proves the folly of the "fix on failure" strategy for responding to Y2K. If this had happened on January 1, 2000, how long would it have taken to restore power?

-- Incredulous (ytt000@aol.com), March 22, 1999.


Sounds like they went from the "let's see what happens" to the "OH, SHIT!" phase pretty fast. I usually have the same feeling after a plumbing repair when I turn the water back on!

"The types of problems we will see should be fixed within 72 hours", famous last words of Sen. Dodd.

-- Bill (y2khippo@yahoo.com), March 22, 1999.



Note that nowhere in the story does it say that the power generation equipment did (or would have) shut down, only that a series of alarms were generated. If changing the date back cleared all of the alarms, that tells me that nothing significant was impacted by rolling it forward and the alarms generated were probably nuisance alarms that would have disappeared after being acknowledged. Plus, it sounds like they just invented their contingency plan if something does go wrong -- set the date back to 1998 and everything is fine.

ANP

-- Another NORMal Person (ANP@BettyFord.com), March 22, 1999.


ANP,

>Note that nowhere in the story does it say that the power generation equipment did (or would have) shut down, only that a series of alarms were generated.

Why do you think they *have* alarms, ANP? Do you think that maybe, just maybe, those alarms are intended to signal that there are problems in the power generation equipment? So that the first-level presumption when an alarm signals is that there is something wrong in the power-generating function to which the alarm is related? So that when "twelve thousand warning lights flashed all across the board, with all kinds of alarm information", if it had occurred during normal operation, not a test, the technicians would have been justified in proceeding on the assumption that something was really wrong?

Suppose a certain warning light, indicating a turbine bearing failure, came on in normal operation. Wouldn't you agree that the technicians should do something other than simply resetting the alarm? If so, then do you not consider twelve thousand warning lights to be a bit of a problem?

>If changing the date back cleared all of the alarms, that tells me that nothing significant was impacted by rolling it forward

You mean that did _not_ tell you that the alarm functions themselves were impacted? Or do you not consider alarm functions to be significant?

Do you ever replace the batteries in your smoke detectors? Do you have smoke detectors?

>the alarms generated were probably nuisance alarms that would have disappeared after being acknowledged

How do you justify that speculation?

>Plus, it sounds like they just invented their contingency plan if something does go wrong -- set the date back to 1998 and everything is fine.

Riiiight ... because the twelve thousand warning lights and associated alarms are just nuisances ... someone really pulled a fast one on that electric company, didn't they?

-- No Spam Please (No_Spam_Please@anon_ymous.com), March 22, 1999.


The true effects on the power grid are illustrated by this example. Very very few problems have been found by utilities in power generation. It's the human reaction to monitoring misinformation or conflicting readings in power distribution that contributes to "incidents."

I trust Robert will verify that the world's two major nuclear accidents were contributed to by improper operator reaction to data.

There is a school of thought that we could have no power problems at all during rollover...if all the operators are handcuffed and blindfolded.

-- PNG (png@gol.com), March 22, 1999.


Thanks, Drew.

It did slip by.

Another NORMal Person,

As I recall, they did a rollover "test" at NORAD, and a similar, not exact, thing happened. Screens blanked. The operators were stunned. That's one of the times when they began to understand the "human" and international government scope of the Y2K problem.

Diane

-- Diane J. Squire (sacredspaces@yahoo.com), March 22, 1999.


My discussion with ANP continues at the above noted link.

PNG - I also hope to get input from Robert (Dr or P.E.) <:)=

-- Sysman (y2kboard@yahoo.com), March 23, 1999.



If there were no changes in loads, circuits, trips, or equipment, and if the alarms really were NOT due to faults in equipment, trips, overspeeds, or undervoltage, or low lube oil, or no water, or too high a temp, or too low a fuel pressure, or any of "ten thousand" other things - real and potential and imaginary - then you're right: tie up the operators, let the machinery keep running.

But it would only take one turbine and generator - at 35 tons - 50 tons rotating mass at 1800 rpm - to lose lube oil, overspeed, lock up, shutdown, or trip out - to destroy the guts of the power station below the dam. The damage is astounding when one of those things gets loose. The fault - as here, and as at Peach Bottom nuclear station - could be alarms only. But if the same _kind_ of failure tripped the lube oil or overspeed safety switch - you're looking at millions of dollars of damage and possibly years off line in the hydro facility.

Or anywhere else a fatal casualty strikes.

Now, how many fossil plants have truly tested their systems?

-- Robert A. Cook, P.E. (Kennesaw, GA) (cook.r@csaatl.com), March 23, 1999.


OK Spam-Man

I've worked with industrial control systems for 15 years. Have you ever been in a control room? If you believe that every alarm that a control system generates is an impending failure, you are completely clueless. Try not to be so condescending when you are talking about things you don't know about.

If an alarm was due to a bearing failure, then changing the date would have no effect whatsoever on it. Changing the date may cause an erroneous calculation or two but, as I have said in another post, there is absolutely nothing in a regulatory controller that requires an absolute date reference. A wrong date will, at the most, cause a maintenance logging program to generate error messages indicating that the instrumentation is beyond its pre-specified maintenance interval. Even so, a maintenance routine will NEVER shut down a controller or system on its own; it will simply generate thousands of alarms until the maintenance procedures are followed and the alarms acknowledged. Hmmm... thousands of alarms caused by changing the date that disappear when the date is changed back, sounds familiar...
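A minimal sketch of the mechanism described in this paragraph, assuming a maintenance logger that compares the system clock against each instrument's last service date. The instrument tags, service dates, 180-day interval, and the function name `overdue_alarms` are all hypothetical; this illustrates the argument only and is not a claim about what the Xingo plant's software actually does.

```python
from datetime import date, timedelta

# Hypothetical instrument tags and last-service dates (illustrative only).
LAST_SERVICED = {
    "LIC202": date(1998, 11, 2),
    "FT305": date(1998, 12, 15),
    "PT410": date(1999, 1, 20),
}
MAINTENANCE_INTERVAL = timedelta(days=180)  # assumed interval


def overdue_alarms(system_date):
    """Return tags whose maintenance interval has elapsed as of system_date."""
    return sorted(
        tag
        for tag, serviced in LAST_SERVICED.items()
        if system_date - serviced > MAINTENANCE_INTERVAL
    )


real_today = date(1999, 3, 23)
rolled_forward = date(2000, 1, 1)

print(overdue_alarms(real_today))      # [] : nothing is overdue yet
print(overdue_alarms(rolled_forward))  # ['FT305', 'LIC202', 'PT410']
print(overdue_alarms(real_today))      # [] : alarms clear when the date is set back
```

In this toy model the alarms depend only on the clock, so they appear when the date is rolled forward and vanish when it is restored. Whether the real alarms were all of this kind is exactly what the thread is arguing about.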

And, no, I do not believe the alarm functions themselves were impacted or simply changing the date back would not have solved the problem. If for some bizarre reason, setting the date ahead had caused a whole host of setpoints to be recalculated as zero (or infinity), they would not have been able to recover merely by resetting the date. And, yes, I have both smoke and CO detectors -- thanks for your concern!

Finally, the contingency plan idea was tongue in cheek, but what is wrong with it? The whole idea of a contingency is to have a quick recovery plan if something goes wrong despite your best efforts to prevent it. If they have already done this, what would be wrong with resetting their clocks to 1998 or earlier temporarily? If it is the difference between having power and having the correct date appear on a computer screen, that's a no-brainer.

To Diane:

Nothing personal but this sounds like another anonymous, unverifiable anecdote. Can you provide more specifics?

-- Another NORMal Person (ANP@BettyFord.com), March 23, 1999.


Hi Robert. Ignore my page. Meet my friend ANP. Oh well, I was going to bed, but think I'll lurk for a while! <:)=

-- Sysman (y2kboard@yahoo.com), March 23, 1999.

But, they only misread - then misinterpreted - one alarm for one valve at TMI - guess it couldn't be so bad ...... small error. Shut down the plant though.

My former engineering officer only let one small nut get caught in the reduction gear - no big deal - we were only laid up for 8 months for main engine (turbine) repairs. That Soviet nuclear icebreaker that blew radioactive water all over the auxiliary space only had one valve misoperated - destroyed the whole ship though, didn't it? Couple of Soviet subs sunk - only one thing broke. The Thresher went down - killed a lot of people - but only one sil-brazed joint failed - just one.

We are not real sure what killed those men on the Scorpion - but I'm sure "most" of the ship was still okay when that one thing failed that let water in the hull.

Change in conditions is very threatening when you cannot believe your senses - and your sensors. And Jan 01 to Feb 29, 2000 won't be routine anything.

-- Robert A. Cook, P.E. (Kennesaw, GA) (cook.r@csaatl.com), March 23, 1999.


Agreed 100% Robert. Differentiating good data from bad is the danger. Which annunciator light on which annunciator panel is telling me the truth? Are they both wrong? Are they both right? [deltas in os, w, e, Hz, uv, ov, instantaneous ol that auto-actuate are e/m ...right?]

-- PNG (png@gol.com), March 23, 1999.


Oh man, where is ANP? It's been an hour! We went back and forth all evening and half the afternoon, and I'm just an amateur (well, software). Now that I finally get you pros hooked up, he just vanishes? And he didn't even say good-nite! We even got Mr. Yourdon to chip in. I hope he comes back. All this effort, and nothing to show for it. This ain't easy. I need a vacation... <:(=

-- Sysman (y2kboard@yahoo.com), March 23, 1999.

good night sysman...Don't worry. Someone else will come along.

-- PNG (png@gol.com), March 23, 1999.

Robert,

Even if the alarms were erroneous, what if the systems were running in cascade? Couldn't a false alarm in the right place cause real ripple effects throughout the system as they respond to the failure?

-- margie mason (mar3mike@aol.com), March 23, 1999.


Assume the alarms were false. Now you have hundreds of alarms going off. Then several minutes later, something does go wrong. How would you know till it was too late? You wouldn't!

-- SCOTTY (BLehman202@aol.com), March 23, 1999.

Comments to Sysman:

Sorry I bagged on you last night, thought you were gone. See my response to Chuck in the "Old info.." thread.

Comments to Mr. Cook:

You may be misinterpreting me but I did not say that generating all of those alarms was not a problem. But operators learn very quickly which ones are relevant and which are not. As I said, if an operating hydroelectric plant can have its clocks set ahead, generate 12,000 alarms, and return to normal operation simply by setting the clocks back with NO loss of operations, that is good news, not bad. Had any bad control targets actually been calculated and accepted and valves opened or shut incorrectly, simply changing the date back would not have restored the system. Since it did, then the majority of those alarms had to have been 'nuisance type' (i.e. "LIC202 requires tuning", "FT305 requires recalibration", etc.) the type that operators see every day, acknowledge every day, and simply generate a maintenance request to take action. In the worst case, the plant now knows that they can operate on January 1, 2000 by doing nothing more than setting their clocks backwards while they resolve any issue they might have missed. Remember, this WAS a Y2K test, designed to identify whether or not they were ready. Obviously they were not and are now armed with the data necessary to become ready.

I think this is a good indicator of why companies are hesitant to say anything. If they have not tested, they are bad. If they do test and the system is not 100% compliant, they are worse. If they are fixing their problems, they aren't doing it fast enough. If they say they are done fixing, they are lying.

ANP

-- Another NORMal Person (ANP@BettyFord.com), March 23, 1999.


But ANP, you've identified the "missing" test results - given this kind of systemic error appearing "often" as clocks are actually "set ahead" during integrated testing - even after remediation and repair is complete - where are the "rest of the systems"?

Granted, I will not be so rash as to claim the following - "That similar failures and results are not reported means that 'the rest of the world' is not fully testing, is not yet ready to test, is not doing real integrated testing, is (at most) doing piece part and component-level testing. It does not mean that test results are only incidentally reported or occur, get fixed, and the test repeated."

See, one cannot prove a positive by showing that a negative doesn't exist. Nor can you establish a negative by showing that "sometimes" a positive exists. Each test that is reported is a positive sign - for that particular company, in that particular section, to help the customers of that single company. My concern is that the very few positives are being promoted so widely because they are so infrequent. We should have now, something like 40 or 50 power plants and grid control and telephone and chemical plants reporting "testing complete" every day.

And they aren't. A successful test should not be news. A compliant power plant should not be news. And no "top 50" city government or county government, no state government, and only one federal agency has completed.

Because this systemic alarm/control failure has occurred several times as clocks are set forward after remediation - it does, however, indicate that such alarm/automatic shutdown conditions apparently occur frequently, and thus process failure is likely. As a person who has done system-wide testing, and component testing, and program testing - I reserve my opinion - until proved incorrect by millions of reports of successful process testing, because there are millions of processes out there in millions of different businesses - that "complete system-wide integrated testing" is not being done on a wide-spread basis. I disagree with your assumption that you can predict the outcome of a process (paper mill, chemical refinery, oil recovery, or plastic production, or chip production and etching, etc.) based on "design logic". It works in theory in one controller - but not on the whole process. The effect on the whole process - from truck coming in the front door to truck going out the back door with invoice in hand - can only be assured of success by full-up integrated testing.

Or by "fix after failure" - if the failure implied by "fix on failure" left the machinery and the factory still standing to make fixing possible.

-- Robert A. Cook, P.E. (Kennesaw, GA) (cook.r@csaatl.com), March 23, 1999.


Good afternoon ANP. Glad to see that you've returned. No problem with last night. I was disappointed that you and Mr. Cook didn't get into more of a discussion. I've only been a "regular" here for about two months, but I have grown to respect his opinion.

I want to thank you for the time and effort that you are putting in here. This forum needs more people like yourself that can present a valid and logical argument. This is an area of concern for all of us, and we need all the information that we can find on this topic. Please stay with us for a while and continue to share your experience with us. <:)=

-- Sysman (y2kboard@yahoo.com), March 23, 1999.


Good afternoon Robert. A pleasant surprise! <:)=

-- Sysman (y2kboard@yahoo.com), March 23, 1999.

Same article, different topic, I hate to start another thread for this. There is another detail of this article which I am surprised no one has pointed out. Each country was rated on a scale of 1 to 4. The U.S. being a 1, China & Russia for example being 4's.

Now I realize I am nit picking, but...Italy was a '2'.

What do we know about Italy? They just started. They don't take this seriously.

from article:

"The Year 2000 doesn't exist in Italy," said Paolo Tedone, vice president of a Milan software company. "You try to raise the awareness in Italy, you speak to people, you show them the evidence, the reports from the U.S., but somehow it's impossible for Italians to believe this is serious. They think that we'll find a way to fix it at the last minute." (snip)

Italy, however, is deemed to be a "country at risk." The Italian government did not get around to establishing a commission to look into the matter until early this year. (snip)

Okay, a long way to go for a short point, but if Italy rates a two, what does that tell us about threes & fours???? The rest of the two's?

GROUP 1: Highly prepared.

Australia, Belgium, Bermuda, Netherlands, Ireland, Israel, Switzerland, Sweden, United States

GROUP 2: Some systems ready.

Bahamas, Brazil, Chile, Finland, Hungary, Iceland, Italy, Japan, Mexico, New Zealand, Norway, Peru, Portugal, Singapore, South Korea, Spain, Taiwan, Thailand

GROUP 3: Significant shortcomings.

Argentina, Armenia, Austria, Bulgaria, Colombia, Czech Republic, Dominican Republic, Egypt, Guatemala, India, Jamaica, Jordan, Kuwait, Malaysia, Panama, Poland, Puerto Rico, Saudi Arabia, South Africa, Sri Lanka, Turkey, United Arab Emirates, Venezuela, Yugoslavia

GROUP 4: Highly vulnerable to disruptions.

Afghanistan, Bahrain, Bangladesh, Cambodia, Chad, China, Congo, Costa Rica, Ecuador, El Salvador, Ethiopia, Fiji, Haiti, Indonesia, Kenya, Laos, Lithuania, Morocco, Mozambique, Nepal, Nigeria, Pakistan, Philippines, Romania, Russia, Somalia, Sudan, Uruguay, Vietnam, Zimbabwe

( In surveying business and governments in 87 nations and regions, the Gartner Group, an independent consultancy, found a wide-range of vulnerability to the Year 2000 computer problem.)

This was hardly comforting to me.

-- Deborah (infowars@yahoo.com), March 23, 1999.


Hi Deb. Here's another chart I came across yesterday, from Gartner, dated March 15 <:)=

Preliminary World Wide Y2K Compliance Survey

-- Sysman (y2kboard@yahoo.com), March 23, 1999.


Mr. Cook:

I'm not sure I understand everything you said so bear with me. I have not heard of any problems as described above occurring "after remediation and repair is complete" so please clarify.

Also, I think we are somewhat in agreement that a successful test does not mean everything is hunky dory any more than an unsuccessful test means that the world is doomed. I do disagree, however with a few of your statements.

>Each test that is reported is a positive sign - for that particular company, in that particular section, to help the customers of that single company.

Not true, it actually benefits any plant of similar design or any plant using similar equipment. As non-critical systems and components are tested and found to be compliant, companies are able to focus more resources on the really critical and unknown systems. A fully integrated, simultaneous test of every system in a plant is simply not possible nor will it occur. Industrial systems are not nearly as integrated as some would have you believe. The term "islands of automation" is a more appropriate description of most large industrial plants. I am most familiar with paper plants and I can assure you that there is no paper plant anywhere in the world that is capable of setting all of its clocks in all of its systems ahead simultaneously, nor would they want to. The different process areas have process interrelationships but the computers and control systems are separate.

>My concern is that the very few positives are being promoted so widely because they are so infrequent. We should have now, something like 40 or 50 power plants and grid control and telephone and chemical plants reporting "testing complete" every day.

>And they aren't. A successful test should not be news. A compliant power plant should not be news. And no "top 50" city government or county government, no state government, and only one federal agency has completed.

Why should we have 40-50 per day? Simply because there are X number of plants and Y number of days, you will not see X/Y per day reporting completion. That is another way that people not familiar with the issue try to simplify it by neatly categorizing it as good or bad and trying to force a nice linear completion curve so they can tell whether we are ahead of schedule or behind. Most companies have set their completion target dates at either June 30 or September 30 to give themselves 3 or 6 months cushion. So, I would be very surprised if there were that many companies declaring themselves fully compliant, at least for another 3 months or so. The real measure of progress is how many companies are ahead of or behind their self-imposed deadlines for being Y2K ready.
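To make the X/Y point concrete, here is a rough sketch comparing a linear completion curve with one where every plant reports on its target date. The plant count of 10,000 and the even split across the two deadlines are hypothetical, picked only so the arithmetic is visible; the dates are the ones mentioned in this thread.

```python
from datetime import date

today = date(1999, 3, 23)      # date of this post
rollover = date(2000, 1, 1)
deadlines = [date(1999, 6, 30), date(1999, 9, 30)]  # target dates named above

PLANTS = 10_000  # hypothetical count, not a real figure


def linear_rate(total, start, end):
    """Completions per day if reports were spread evenly from start to end."""
    return total / (end - start).days


def clustered_reports(total, deadlines, as_of):
    """Completions reported by as_of if each plant reports on its deadline,
    with plants split evenly across the deadlines (an assumption)."""
    share = total / len(deadlines)
    return sum(share for d in deadlines if d <= as_of)


print((rollover - today).days)                          # 284
print(round(linear_rate(PLANTS, today, rollover), 1))   # 35.2
print(clustered_reports(PLANTS, deadlines, today))      # 0
print(clustered_reports(PLANTS, deadlines, date(1999, 7, 1)))  # 5000.0
```

Under the clustered assumption, a near-zero count of completion reports in March says nothing either way; the count only becomes informative once the first deadline has passed.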

>As a person who has done system-wide testing, and component testing, and program testing - I reserve my opinion - until proved incorrect by millions of reports of successful process testing, because there are millions of processes out there in millions of different businesses - that "complete system-wide integrated testing" is not being done on a wide-spread basis.

As a person with similar experience, I will reserve MY opinion until proven wrong by verifiable accounts of catastrophic failures of systems after remediation efforts have been completed.

>The effect on the whole process - from truck coming in the front door to truck going out the back door with invoice in hand - can only be assured of success by full-up integrated testing.

As I said above, the general accounting and administrative systems are not integrated with the process control systems and even the various control systems are not so tightly integrated that all must be tested simultaneously. Design logic will tell you a great deal about where the potential problems exist. Code checking routines can identify all uses of date codes and date fields to simplify the pre-test auditing. If there is no exchange of date related information between two separate sub-systems, they can be tested independently. We have, in fact, done this in house, connecting several controllers together that exchanged information but nothing date related. Two of the controllers were intentionally non-compliant but, when they failed, the built-in error handling of the other controllers accepted the loss of communication and continued to operate. Obviously not all control functions were available but the basic regulatory functions were unaffected, meaning the plant will continue to operate, albeit at a slightly degraded performance level. Finally, I cannot speak for other companies but I can say that we have seen 100% compliance when we have tested our systems after fixing known non-compliance issues. In other words, there have been no gotchas, unexpected occurrences, or issues hidden in the grass.
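The kind of code-checking routine mentioned above might look roughly like this sketch, which scans source text for identifiers that look date related. The identifier patterns, the sample controller logic, and the function name `find_date_uses` are assumptions for illustration; nothing in this thread says which tools were actually used.

```python
import re

# Assumed naming patterns for date-related identifiers. A real audit would
# tailor these to the vendor's conventions and review every hit by hand.
DATE_PATTERN = re.compile(r"\b\w*(date|year|yy|century|julian)\w*\b", re.IGNORECASE)


def find_date_uses(source):
    """Return (line_number, identifier) pairs for date-like identifiers."""
    hits = []
    for number, line in enumerate(source.splitlines(), start=1):
        for match in DATE_PATTERN.finditer(line):
            hits.append((number, match.group(0)))
    return hits


# Hypothetical controller logic, invented for this example.
SAMPLE = """\
setpoint = level_target * gain
last_cal_year = 99
if current_year - last_cal_year > 1: raise_alarm()
send(peer, flow_rate)
"""

for line_number, name in find_date_uses(SAMPLE):
    print(line_number, name)
```

In this sample the hits fall on the calibration check (lines 2 and 3), while the regulatory calculation on line 1 and the value sent to the peer on line 4 carry no date fields. A pattern this loose will also flag harmless names such as anything containing "update", so it narrows the audit rather than replacing it.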

ANP



-- Another NORMal Person (Sam Malone@BettyFord.com), March 23, 1999.


Sysman,

"Revised: March 15, 1999." Is this really current?? Oh, ouch. Sure would be nice to see some fives.

Thanks

-- Deborah (infowars@yahoo.com), March 23, 1999.


Interesting to see that the legal profession scored a 0. Since lawyers are gearing up for Y2k suits, this leads to the interesting possibility of lawyers suing other lawyers for gross negligence.

-- (someone@somewhere.com), March 23, 1999.

ANP (I will continue to address you in the way you sign your own messages),

>And, no, I do not believe the alarm functions themselves were impacted or simply changing the date back would not have solved the problem.

1. Folks at the power plant set the date forward.

2. Then, warning lights flashed, with alarm information.

3. Setting the date back stopped the warnings and alarms.

But you "do not believe the alarm functions themselves were impacted"?

Apparently your definition of "function" differs from those of people for whom an "alarm function" includes the man-machine interface at the control board.

Personally, I consider that an alarm function that does not properly signal at the intended man-machine interface is _not_ working properly. When the detectors, relays, and so forth, at the measuring end work properly, but the appropriate signals do not appear, or inappropriate signals do appear, at the intended display, to me that means that the alarm function is not working properly end-to-end.

Are you contending that partial-path success is just as good as end-to-end success? Is that the basis of our disagreement?

-- No Spam Please (No_Spam_Please@anon_ymous.com), March 24, 1999.

