Maker Pro

computer reliability

JosephKK

Found this recently:

++++++++++

Subject: Tech worker: 'Blue screen of death' on oil rig's computer

Gregg Keizer, *Computerworld*, 26 Jul 2010

A computer that monitored drilling operations on the Deepwater Horizon
had been freezing with a [BSOD] prior to the explosion that sank the
oil rig last April, the chief electrician aboard testified Friday at a
federal hearing.

In his testimony Friday, Michael Williams, the chief electronics
technician aboard the Transocean-owned Deepwater Horizon, said that
the rig's safety alarm had been habitually switched to a bypass mode
to avoid waking up the crew with middle-of-the-night warnings.

Williams said that a computer control system in the drill shack would
still record high gas levels or a fire, but it would not trigger
warning sirens. He also said that five weeks before the April 20
explosion, he had been called to check a computer system that
monitored and controlled drilling. The machine had been locking up
for months. "You'd have no data coming through." With the computer
frozen, the driller would not have access to crucial data about what
was going on in the well.

The April disaster left 11 dead and resulted in the largest oil spill
in U.S. history.

==========

What can i say? MS Windows should not be used for safety critical
systems in any way.
 
Jamie

JosephKK said:
What can i say? MS Windows should not be used for safety critical
systems in any way.
I didn't know BSOD was $MS ?
 
Grant

What can i say? MS Windows should not be used for safety critical
systems in any way.

Related story in latest comp.risks says they turned off the alarm
system at night so workers could sleep and not have to wake up for
the frequent false alarms at 3:30 :(

Grant.
 
Rich Grise

What can i say?  MS Windows should not be used for safety critical
systems in any way.

MS Windows should not be used for _any_ systems. ;-)
Old
The Yorktown lost control of its propulsion system because its computers
were unable to divide by the number zero, the memo said. The Yorktown's
Standard Monitoring Control System administrator entered zero into the
data field for the Remote Data Base Manager program. That caused the
database to overflow and crash all LAN consoles and miniature remote
terminal units, the memo said.

http://gcn.com/articles/1998/07/13/software-glitches-leave-navy-smart-ship-dead-in-the-water.aspx

What kind of idiot "programmer" fails to check for a divide-by-zero
condition? Maybe I'm O/C, but when I write a program that uses data, I
mercilessly limit-check the data - of course, what action to take with
bad data would depend on the application.
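A minimal sketch of that kind of limit-checking, in Python. The function names and the bounds are made up for illustration; nothing here reflects the actual Yorktown software. The idea is simply that a value is rejected at entry, before it can reach a division.

```python
def parse_bounded(raw, low, high):
    """Return raw as a float if it lies within [low, high]; otherwise raise ValueError."""
    try:
        value = float(raw)
    except (TypeError, ValueError):
        raise ValueError(f"not a number: {raw!r}") from None
    # This comparison also rejects NaN, because any comparison with NaN is False.
    if not (low <= value <= high):
        raise ValueError(f"{value} is outside the allowed range {low}..{high}")
    return value


def safe_ratio(numerator, raw_divisor, low=0.1, high=3600.0):
    """Divide only after the divisor has passed the limit check (bounds are hypothetical)."""
    divisor = parse_bounded(raw_divisor, low, high)
    return numerator / divisor
```

What to do when the check fails (re-prompt, fall back to the last good value, raise an alarm) is left to the caller, since that depends on the application.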

And I certainly wouldn't do it on a Doze platform! <Mr. Yuck icon>

Cheers!
Rich
 
Rich Grise

-----------------------------------------
Waaayyyy too much reading to do in a reasonable amount of time. If you can
point to any documentation that would be applicable to the subject of this
thread, please do so.
I'm not a Windows proponent, but since it's the OS that runs all of the apps
that I need and like, it's the one that I use and prefer until something
much better comes along.

On a Linux system, when an app crashes it doesn't take down the whole
furshlugginer system.

Cheers!
Rich
 
Robert said:
Whoever wrote the data entry program
should be strung up by the balls for NOT checking
the validity of EVERY parameter entered during entry!
There is absolutely NO excuse!
The Rules of Operating System Design
#1 Applications must never crash the OS.
#2 APPLICATIONS MUST NEVER CRASH THE OS.

No. The OS must not be *able* to be crashed by an application. *WHATEVER*
mischief the application tries to get into.
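A rough illustration of the principle, in Python, with a parent process standing in for the OS. This is only an analogy: a real kernel enforces isolation with hardware memory protection, not with `subprocess`. The `FAULTY_APP` string and `run_isolated` helper are invented for the example.

```python
import subprocess
import sys

# Hypothetical stand-in for a buggy application: it divides by zero and dies.
FAULTY_APP = "print(1 / 0)"


def run_isolated(code, timeout=30):
    """Run code in a separate interpreter process; return (exit status, stderr text)."""
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return result.returncode, result.stderr


status, errors = run_isolated(FAULTY_APP)
# The child has crashed, but the parent is still running and can log,
# restart the child, or raise an alarm.
print("child exit status:", status)
```

The supervisor only sees a nonzero exit status and an error message; the fault stays inside the child's own address space.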
 
BSODs are usually caused by a bug in the OS itself -- some user mode
application makes a system call, and some driver or other part of the OS
doesn't check parameters or whatever and -- poof! -- a bug causes a critical
bit of memory to be overwritten or some important process table trashed.

Ok, but that doesn't change the point; *nothing* in user-mode should *ever*
crash the OS. This failure was one caused by exactly this (invalid entry).
What people are really saying is that, "those writing device drivers and the
OS itself need to be held to a higher standard than those just writing user
mode apps," and I'd agree with that.
Certainly.

Writing device drivers is also not the
kind of thing you usually see beginning programmers do either (there is no,
"Windows Device Drivers for Dummies" or "Windows Device Drivers in 24hrs" book
out there -- yet). Nevertheless, over time there have been plenty of buggy
drivers written by well-known companies that certainly had the resources to do
better.

Like M$. How many times have they done kernel mode things in user mode?
E.g., some Creative Labs Sound Blaster drivers would crash and burn
on multi-processor PCs, because they didn't bother to appropriately lock and
synchronize access to their various queues and other data structures. They
had this problem for years, and chose to ignore it because, up until the point
that Intel started putting multiple cores on a single IC (and true
multi-processing became inexpensive), it was only high-end users and
"enthusiasts" with dual- or quad-CPU motherboards and Creative felt that was a
tiny enough market that they could ignore it. :-(
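For readers who haven't met the problem, here is a small sketch of what "lock and synchronize access" to a shared queue means. It is Python, not driver code, and the `LockedQueue` class is invented for illustration; a real Windows driver would use kernel primitives such as spin locks instead. Python's interpreter lock also hides many races that would bite on real multi-processor hardware, so this shows the discipline rather than the failure.

```python
import threading
from collections import deque


class LockedQueue:
    """A queue whose every read and write happens while holding one lock."""

    def __init__(self):
        self._items = deque()
        self._lock = threading.Lock()
        self.total_put = 0

    def put(self, item):
        # Appending and updating the counter form one critical section,
        # so no other thread can observe them half-done.
        with self._lock:
            self._items.append(item)
            self.total_put += 1

    def get(self):
        with self._lock:
            return self._items.popleft() if self._items else None

    def __len__(self):
        with self._lock:
            return len(self._items)


def producer(target, count):
    for i in range(count):
        target.put(i)


buffers = LockedQueue()
workers = [threading.Thread(target=producer, args=(buffers, 1000)) for _ in range(4)]
for w in workers:
    w.start()
for w in workers:
    w.join()
```

Without the lock, two CPUs updating the queue and its counter at the same moment can leave them inconsistent, which is the class of bug described above.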

Creative has always ignored reliability. Their products have *always* sucked
as badly as M$, or worse. I'm surprised they've survived.
 
Grant

Creative survives because everyone else is just as bad. On desktop
linux boxes, the only thing I run are C-media boards. At least the
drivers work.

Firewire devices seem to be very reliable. What did they get right
that USB didn't?

USB is simply a souped up keyboard and mouse clocked serial data
interface: 5V, clock, data, 0V down only just so far of shielded
cable. I think USB3 introduces some LVDS lane tricks for the
higher speed link options, like the SATA serial data connection.

Grant.
 
Martin Brown

What can i say? MS Windows should not be used for safety critical
systems in any way.

Neither should Transocean. Odd that BP should have to pay for their
mistakes. I guess Transocean is too small to be worth suing.

Regards,
Martin Brown
 
Paul Keinanen

BSODs are usually caused by a bug in the OS itself -- some user mode
application makes a system call, and some driver or other part of the OS
doesn't check parameters or whatever and -- poof! -- a bug causes a critical
bit of memory to be overwritten or some important process table trashed.

In Windows NT 3.5x most of the graphic interface was handled in user
mode, causing a lot of slow user/kernel/user mode changes. To improve
speed, in NT 4.0, most of the GUI code was moved into kernel mode, but
they forgot to add the parameter checks :). Just calling some innocent
looking GUI routine and passing a null pointer by mistake, where a
valid pointer was required could cause the BSOD :).

NT4.0 SP1 at least added parameter checks.

The situation with NT4 service packs was as bad as with the base
operating system version, only every other service pack was usable,
since it patched the bugs introduced by the previous SP :).

Regarding the kernel mode overwriting problems, part of the blame goes
to Intel, since the 386+ family only contained write protection on the
segment register level, but not on virtual memory page level. Super
minicomputers in the 1970's had page level write protection, so this
was nothing new when the 386 was created.

If the OS had used sensible code and data virtual address mapping,
even the limited segment based protection would have helped a lot to
catch bugs.

While handling exceptions caused by kernel mode access is risky in a
production system, at least a lot of kernel driver bugs could have
been detected during driver testing, if the driver address
space could be limited.

Fortunately, the most recent versions of the x86 architecture will
provide some page level protection against illegal memory access.
 
Nico Coesel

Dave M said:
-----------------------------------------
Waaayyyy too much reading to do in a reasonable amount of time. If you can
point to any documentation that would be applicable to the subject of this
thread, please do so.
I'm not a Windows proponent, but since it's the OS that runs all of the apps
that I need and like, it's the one that I use and prefer until something
much better comes along.

Also, the BSOD can be attributed to Windows malfunction or misconfiguration,
a hardware failure, or application software failure or misconfiguration. I
haven't heard whether the actual cause of the BSOD was ever determined.
Until that can be known, you can't put the blame on the OS. At any rate,
the brunt of the blame should rest on the computer tech, since, apparently,
the problem was never resolved.

I agree here. In my experience Windows can run very reliably (uptime
 
Grant

Nice Gaussian TR/TF/ But Intel owns the assignment. Working with
Intel I'd guess they freely license such stuff... they love
peripherals :)

...Jim Thompson

Gotta work hard to get their crap paged memory CPU architecture out
there in the '80s. On the IBM PC-AT, they nurse the CPU thru reset
to get from extended memory mode back to real mode, with a special
byte reserved inside the RTC chip to tell the CPU where it's at on
reset :) Now, that's a big software fix for an Intel CPU... Fixed
that with the '386 and later. Then they got faster than a Z80 8bit
processor.

Grant.
 
Rich the Cynic

You're not keeping up with the thread.
...and the fact that the term even exists and is widely recognized
is evidence that that platform is the wrong choice.

1) In 1997, the guided missile cruiser USS Yorktown
was dead in the water for over an hour
because **an app** tried to divide by zero,

No, because some dozer scriptkiddie neglected to check for an out-of-
bounds condition before sending his brainchild off into lala land.
2) In 2010, the Deepwater Horizon was running NT
(again, shown unsuitable for mission-critical operations) and
was so unreliable that the operator disabled parts of the system.

Again, human negligence; whoever bought Windoze SW should be prosecuted -
maybe Bill Gates should face murder charges, since it was the failure of
his OS that caused the blast.

Thanks,
Rich
 
No, because some dozer scriptkiddie neglected to check for an out-of-
bounds condition before sending his brainchild off into lala land.

That still should not hang the OS. The app may crash but that's all.
Again, human negligence; whoever bought Windoze SW should be prosecuted -
maybe Bill Gates should face murder charges, since it was the failure of
his OS that caused the blast.

I think your people should get together with BP's people. ;-)
 
JosephKK

-----------------------------------------
Waaayyyy too much reading to do in a reasonable amount of time. If you can
point to any documentation that would be applicable to the subject of this
thread, please do so.
I'm not a Windows proponent, but since it's the OS that runs all of the apps
that I need and like, it's the one that I use and prefer until something
much better comes along.

Also, the BSOD can be attributed to Windows malfunction or misconfiguration,
a hardware failure, or application software failure or misconfiguration. I
haven't heard whether the actual cause of the BSOD was ever determined.
Until that can be known, you can't put the blame on the OS. At any rate,
the brunt of the blame should rest on the computer tech, since, apparently,
the problem was never resolved.

Do you truly use anything that is not replicated in another OS?
As to the Yorktown issue, that problem was most likely an application
software deficiency, not the OS. Any software developer worth 10% of his
pay will trap and handle bad data entry occurrences, which is what that was.
If the application software calculates and attempts to use a zero value in a
calculation it should detect that and handle it so as not to crash either
the OS or the application.

Just the same, an OS that crashes over that is not worthy of the
appellation OS.
 
JosephKK

I agree here. In my experience Windows can run very reliably (uptime

Can and usually does are not the same. I would bet that i can run XP
for 1 year at a crack so long as i do not do any updates (which always
require a reboot). In linux i have done nearly a year, but power
failure got in the way, and i could keep the system completely up to
date.
 
JosephKK

Related story in latest comp.risks says they turned off the alarm
system at night so workers could sleep and not have to wake up for
the frequent false alarms at 3:30 :(

Grant.

Kind of a clue that some serious things were let to just slide. If i
managed a $100 million rig and there was some sloppy and safety
critical software like that, the programmer would be on the rig
troubleshooting it 24/7. And maybe his boss to boot.
 
JosephKK

Richard said:
Old
The Yorktown lost control of its propulsion system because its
computers were unable to divide by the number zero, the memo said.
The Yorktown’s Standard Monitoring Control System administrator
entered zero into the data field for the Remote Data Base Manager
program. That caused the database to overflow and crash all LAN
consoles and miniature remote terminal units, the memo said.

http://gcn.com/articles/1998/07/13/software-glitches-leave-navy-smart-ship-dead-in-the-water.aspx
Whoever wrote the data entry program should be strung up by the
balls for NOT checking the validity of EVERY parameter entered during entry!
There is absolutely NO excuse!

But the software met specifications. Perhaps the team that wrote the
specifications should be strung up instead.
 