Dependable Software

Overview of Dependable Software. 9.5.1 Requirements on Software Dependability: Failure Rates; Physical versus Design Faults. 9.5.2 Software Dependability Techniques: Fault Avoidance and Fault Removal; On-line Fault Detection and Tolerance; On-line Fault Detection Techniques; Recovery Blocks; N-version Programming; Redundant Data. 9.5.3 Examples: Automatic Train Protection; High-Voltage Substation Protection.
Slide 1

Industrial Automation (Industrielle Automation)
9.5 Dependable Software (Logiciel fiable, Verlässliche Software)
Prof. Dr. H. Kirrmann & Dr. B. Eschermann, ABB Research Center, Baden, Switzerland
2010-05-15, HK

Slide 2

Overview: Dependable Software
9.5.1 Requirements on Software Dependability: Failure Rates; Physical versus Design Faults
9.5.2 Software Dependability Techniques: Fault Avoidance and Fault Removal; On-line Fault Detection and Tolerance; On-line Fault Detection Techniques; Recovery Blocks; N-version Programming; Redundant Data
9.5.3 Examples: Automatic Train Protection; High-Voltage Substation Protection

Slide 3

Requirements for Safe Computer Systems

Required failure rates according to the standard IEC 61508:

safety integrity level    control systems [per hour]    protection systems [per operation]
4                         ≥ 10^-9 to < 10^-8            ≥ 10^-5 to < 10^-4
3                         ≥ 10^-8 to < 10^-7            ≥ 10^-4 to < 10^-3
2                         ≥ 10^-7 to < 10^-6            ≥ 10^-3 to < 10^-2
1                         ≥ 10^-6 to < 10^-5            ≥ 10^-2 to < 10^-1

Most safety-critical systems: < 1 failure every 10 000 years (e.g. railway signalling).
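To relate the per-hour failure rates on this slide to the "one failure every 10 000 years" figure, the following minimal Python sketch converts a rate into a mean number of years between failures and looks up the matching integrity level. The function names and the dictionary are illustrative assumptions, not part of IEC 61508 or of the original slides; the band limits are the per-hour values for control systems given on this slide.

```python
HOURS_PER_YEAR = 24 * 365  # 8760 hours, leap years ignored

# Upper limits of the per-hour failure-rate bands for control systems,
# keyed by safety integrity level (values taken from the slide).
SIL_UPPER_LIMIT_PER_HOUR = {4: 1e-8, 3: 1e-7, 2: 1e-6, 1: 1e-5}


def years_between_failures(rate_per_hour):
    """Mean time between failures in years for a constant failure rate."""
    return 1.0 / (rate_per_hour * HOURS_PER_YEAR)


def sil_for_rate(rate_per_hour):
    """Return the highest level whose upper limit the rate stays below.

    Simplification (assumption): rates below 1e-9 per hour are also
    reported as level 4, since no higher level exists. Returns None if
    the rate is too high even for level 1.
    """
    for sil in (4, 3, 2, 1):
        if rate_per_hour < SIL_UPPER_LIMIT_PER_HOUR[sil]:
            return sil
    return None


if __name__ == "__main__":
    for sil, limit in SIL_UPPER_LIMIT_PER_HOUR.items():
        years = years_between_failures(limit)
        print(f"level {sil}: limit {limit:g}/h, about {years:,.0f} years")
```

With these numbers, the level 4 limit of 10^-8 failures per hour corresponds to roughly one failure in 11 400 years of continuous operation, which is where the order of magnitude "10 000 years" comes from.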

Slide 4

Software Problems

Did you ever see software that did not fail once in 10 000 years (i.e. it never failed during your lifetime)?
• First space shuttle launch delayed due to a software synchronization problem, 1981 (IBM).
• Therac 25 (radiation therapy machine) killed 2 people due to a software defect leading to massive overdoses in 1986 (AECL).
• Software defect in the 4ESS telephone switching system in the USA led to a loss of $60 million due to outages in 1990 (AT&T).
• Software error in Patriot equipment: a missed Iraqi Scud missile in the Kuwait war killed 28 American soldiers in Dhahran, 1991 (Raytheon).
• ... [add your favourite software bug].

Slide 5

The Patriot Missile Failure

The Patriot Missile failure in Dhahran, Saudi Arabia, on February 25, 1991, which resulted in 28 deaths, is ultimately attributable to poor handling of rounding errors. On February 25, 1991, during the Gulf War, an American Patriot Missile battery in Dhahran, Saudi Arabia, failed to track and intercept an incoming Iraqi Scud missile. The Scud struck an American Army barracks, killing 28 soldiers and injuring around 100 other people.

A report of the General Accounting Office, GAO/IMTEC-92-26, entitled Patriot Missile Defense: Software Problem Led to System Failure at Dhahran, Saudi Arabia, analyses the causes (excerpt):

"The range gate's prediction of where the Scud will next appear is a function of the Scud's known velocity and the time of the last radar detection. Velocity is a real number that can be expressed as a whole number and a decimal (e.g., 3750.2563...miles per hour). Time is kept continuously by the system's internal clock in tenths of seconds but is expressed as an integer or whole number (e.g., 32, 33, 34...). The longer the system has been running, the larger the number representing time. To predict where the Scud will next appear, both time and velocity must be expressed as real numbers. Because of the way the Patriot computer performs its calculations and the fact that its registers are only 24 bits long, the conversion of time from an integer to a real number cannot be any more precise than 24 bits. This conversion results in a loss of precision causing a less accurate time calculation. The effect of this inaccuracy on the range gate's calculation is directly proportional to the target's velocity and the length of time the system has been running. Consequently, performing the conversion after the Patriot has been running continuously for extended periods causes the range gate to shift away from the center of the target, making it less likely that the target, in this case a Scud, will be successfully intercepted."
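The mechanism described in the report can be illustrated with a small Python sketch. It is not the Patriot code. It assumes that the tick length of 0.1 s is stored truncated to 23 binary digits after the point (a commonly quoted reading of the 24-bit register), and it uses 100 hours of uptime and a Scud speed of about 1676 m/s as illustrative inputs; none of these three figures appears in the excerpt above, and all names are hypothetical.

```python
from fractions import Fraction

FRACTION_BITS = 23          # assumption: binary digits kept after the point
TICK = Fraction(1, 10)      # the internal clock counts tenths of a second
SCUD_SPEED_M_PER_S = 1676   # assumption: approximate Scud velocity


def chop(value, bits):
    """Truncate a non-negative value to `bits` binary fraction digits."""
    scale = 2 ** bits
    return Fraction(int(value * scale), scale)


def clock_drift_seconds(hours_running, bits=FRACTION_BITS):
    """Accumulated timing error after `hours_running` hours of uptime."""
    ticks = hours_running * 3600 * 10          # number of 0.1 s ticks
    error_per_tick = TICK - chop(TICK, bits)   # 0.1 is not exact in binary
    return float(ticks * error_per_tick)


if __name__ == "__main__":
    for hours in (1, 8, 20, 100):
        drift = clock_drift_seconds(hours)
        shift = drift * SCUD_SPEED_M_PER_S
        print(f"{hours:4d} h: drift {drift:.4f} s, range gate shift {shift:.0f} m")
```

Under these assumptions the drift grows linearly with uptime and reaches about 0.34 s after 100 hours, which at Scud speed corresponds to a range gate displaced by several hundred metres. This matches the report's statement that the effect is proportional to both the target's velocity and the running time.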

Slide 6

Ariane 501 failure

On June 4, 1996 an unmanned Ariane 5 rocket launched by the European Space Agency exploded just forty seconds after its lift-off from Kourou, French Guiana. The rocket was on its first voyage, after a decade of development costing $7 billion. The destroyed rocket and its cargo were valued at $500 million. A board of inquiry investigated the causes of the explosion and in two weeks issued a report (no longer available at the original site):

"The failure of the Ariane 501 was caused by the complete loss of guidance and attitude information 37 seconds after start of the main engine ignition sequence (30 seconds after lift-off). This loss of information was due to specification and design errors in the software of the inertial reference system. The internal SRI* software exception was caused during execution of a data conversion from 64-bit floating point to 16-bit signed integer value. The floating point number which was converted had a value greater than what could be represented by a 16-bit signed integer."

*SRI stands for Système de Référence Inertielle or Inertial Reference System.

Code was reused from the Ariane 4 guidance system. The Ariane 4 has different flight characteristics in the first 30 s of flight, and exception conditions were generated on both inertial guidance system (IGS) channels of the Ariane 5. There are several instances in other domains where what worked for the first implementation did not work for the second.

"Reuse without a contract is folly"

90% of safety-critical failures are requirement errors (a JPL study).
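The conversion problem named in the report, a 64-bit floating-point value that does not fit into a 16-bit signed integer, can be sketched in Python as follows. This is only an illustration of two generic ways to guard such a conversion (explicit range check, or saturation); it is not the Ariane flight software, and the function names are hypothetical.

```python
import math

INT16_MIN = -2 ** 15      # -32768
INT16_MAX = 2 ** 15 - 1   #  32767


def to_int16_checked(value):
    """Convert a float to a 16-bit signed integer, truncating toward zero.

    Raises OverflowError instead of silently producing a wrong value
    when the input is NaN or outside the representable range.
    """
    if math.isnan(value) or not (INT16_MIN <= value <= INT16_MAX):
        raise OverflowError(f"{value!r} does not fit into 16 bits")
    return int(value)


def to_int16_saturating(value):
    """Convert a float to a 16-bit signed integer, clamping at the limits."""
    if math.isnan(value):
        raise ValueError("cannot convert NaN")
    return int(max(INT16_MIN, min(INT16_MAX, value)))


if __name__ == "__main__":
    print(to_int16_checked(1234.9))      # in range, truncated
    print(to_int16_saturating(40000.0))  # clamped to the upper limit
    try:
        to_int16_checked(40000.0)
    except OverflowError as exc:
        print("rejected:", exc)
```

Which of the two behaviours is appropriate is a requirements question: a range check only helps if the caller has a defined reaction to the rejected value, and saturation only helps if a clamped value is still safe to use. That choice depends on the flight profile the code is specified for, which is the point of the remark on reuse above.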

Slide 7

Malaysia Airlines 124: influence of the human operator

By Robert N. Charette, December 2009 (IEEE Spectrum, February 2010)

The passengers and crew of Malaysia Airlines Flight 124 were just settling into their five-hour flight from Perth to Kuala Lumpur late on the evening of 1 August 2005. Approximately 18 minutes into the flight, as the Boeing 777-200 series aircraft was climbing through 36 000 feet altitude on autopilot, the aircraft, suddenly and without warning, pitched to 18 degrees, nose up, and started to climb rapidly. As the plane passed 39 000 feet, the stall and overspeed warning indicators came on simultaneously, something that is supposed to be impossible, and a situation the crew is not trained to handle. At 41 000 feet, the command pilot disconnected the autopilot and lowered the airplane's nose. The auto throttle then commanded an increase in thrust, and the craft plunged 4000 feet. The pilot countered by manually moving the throttles back to the idle position. The nose pitched up again, and the aircraft climbed 2000 feet before the pilot regained control. The flight crew notified air traffic control that they could not maintain altitude and requested to return to Perth. The crew and the 177 shaken but uninjured passengers safely returned to the ground.

The Australian Transport Safety Bureau investigation found that the air data inertial reference unit (ADIRU), which provides air data and inertial reference data to several systems on the Boeing 777, including the primary flight control and autopilot flight director systems, had two faulty accelerometers. One had gone bad in 2001. The other failed as Flight 124 passed 36 571 feet. The fault-tolerant ADIRU was designed to operate with a failed accelerometer (it has six). The redundant design of the ADIRU also meant that it wasn't mandatory to replace the unit when an accelerometer failed.

However, when the second accelerometer failed, a latent software anomaly allowed inputs from the first faulty accelerometer to be used, resulting in the erroneous feed of acceleration information into the flight control systems. The anomaly, which lay hidden for a decade, wasn't found in testing because the ADIRU's designers had never considered that such an event might occur.

The Flight 124 crew had fallen prey to what psychologist Lisanne Bainbridge in the early 1980s identified as the ironies and paradoxes of automation. The irony, she said, is that the more advanced the automated system, the more crucial the contribution of the human operator becomes to the successful operation of the system. Bainbridge also discusses the paradoxes of automation, the main one being that the more reliable the automation, the less the human operator may be able to contribute to that success. Consequently, operators are increasingly left out of the loop, at least until something unexpected happens. Then the operators need to get involved quickly and flawlessly, says Raja Parasuraman, professor of psychology at George Mason University in Fairfax, Va., who has been studying the issue of increasingly reliable automation and how that affects human performance, and therefore overall system performance.

"There will always be a set of circumstances that was not expected, that the automation either was not designed to handle or other things that just cannot be predicted," explains Parasuraman. So as system reliability approaches, but doesn't quite reach, 100 percent, "the more difficult it is to detect the error and recover from it," he says. And when the human operator can't detect the system's error, the consequences can be tragic.

Slide 8

Airbus Paris - Rio

Sunday Times, June 18, 2009
Airbus computer bug is main suspect in crash of Flight 447
Charles Bremner in Paris

Faulty speed readings and electronic failures were cited by crash investigators yesterday as they said they were closer to understanding the loss of Air France Flight 447 on June 1, with the deaths of all 228 people on board. Paul-Louis Arslanian, head of the French accident investigation bureau, said that it was too early to pronounce on the events that led the Airbus A330 to crash into the Atlantic about 1,000km (600 miles) off Brazil, but added: "I think we may be getting closer to our goal." His remarks reinforced suspicion among experts that a bug in the automated flight system of the Airbus could be the key to the disaster. Brazi
