John said:
[....]
Programmers have pretty much proven that they cannot write bug-free
large systems.
In every other area, humans make mistakes and yet we seem surprised
that programmers do too.
In most other areas of endeavour, small tolerance errors do not so
often lead to disaster. Boolean logic is less forgiving. And fence-post
errors, which even the best of us are inclined to make, are very
hard to spot: you see what you intended to write and not what is
actually there. Walkthroughs and static analysis tools can find these
latent faults, if the budget permits.
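To make the fence-post point concrete, here's a minimal Python sketch (the function names are mine, purely for illustration): the buggy version reads exactly like what you meant to write, which is why these slip past review.

```python
# Classic fence-post error: N fence panels need N + 1 posts, and loops
# over "inclusive" ranges are easy to get wrong by exactly one.

def count_fence_posts(panels):
    # Each panel sits between two posts, so N panels need N + 1 posts.
    return panels + 1

def sum_inclusive_buggy(lo, hi):
    # Intended: sum of lo..hi inclusive. But range() excludes its end,
    # so this silently drops the last term -- and it *reads* correct.
    return sum(range(lo, hi))

def sum_inclusive_fixed(lo, hi):
    # Explicitly include the end point.
    return sum(range(lo, hi + 1))

print(count_fence_posts(10))       # 11
print(sum_inclusive_buggy(1, 10))  # 45 -- wrong, the 10 is missing
print(sum_inclusive_fixed(1, 10))  # 55
```

A static analyser won't flag the buggy version; only a walkthrough (or a test) catches it.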
Some practitioners - Donald Knuth, for instance - have managed to
produce virtually bug-free non-trivial systems (TeX). OTOH the current
business paradigm is ship it and be damned; you can always sell
upgrades later. Excel 2007 is a pretty good current example of a
product shipped way before it was ready. Even Excel MVPs won't defend
it.
Unless there's some serious breakthrough - which is really prohibited
by the culture - I think there is a fundamental limitation that makes
the programming effort needed for a bug-free large system effectively
infinite. We do seem to be able to make bug-free small systems,
however.
Software programming hasn't really made the transition to a hard
engineering discipline yet. There hasn't been enough standardisation
of reliable, off-the-shelf software components - parts equivalent in
complexity to electronic ICs that really do what they say on the tin,
and do it well.
By comparison, thanks to Whitworth & Co., mechanical engineering has
standardised nuts and bolts out of a huge arbitrary parameter space.
Nobody these days would make their own nuts and bolts from scratch
with randomly chosen pitch, depth and diameter. Alas, they still do in
software.
Compare a software system to an FPGA. Both are complex, full of state
machines (implicit or explicit!), and both are usually programmed in a
hierarchical language (C++ or VHDL) with a library of available
modules - yet FPGAs rarely have bugs that get to the field, whereas
most software is rarely ever fully debugged.
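The "explicit state machine" structure common to both camps can be sketched in a few lines of Python (the traffic-light controller and its transition table here are hypothetical, just to show the shape):

```python
# A toy explicit state machine, written as a transition table -- the
# same structure appears in VHDL case statements and C++ switch blocks.

TRANSITIONS = {
    ("red", "tick"): "green",
    ("green", "tick"): "yellow",
    ("yellow", "tick"): "red",
}

def step(state, event):
    # Unhandled (state, event) pairs are exactly where latent bugs
    # hide in implicit state machines; here we fail loudly instead.
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"no transition from {state!r} on {event!r}")

state = "red"
for _ in range(3):
    state = step(state, "tick")
print(state)  # back to "red" after one full cycle
```

Making the table explicit is what lets tools (and reviewers) check that every state/event pair is covered - the implicit version scattered through ordinary control flow offers no such handle.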
There are a few points here. If you take a "typical" embedded card with
an FPGA and a large program, you'll find the software part has orders of
magnitude more lines of programmer-written code than the FPGA. In an
FPGA, the space is often taken with pre-written code (such as an
embedded processor or other high-level macros), and much of the
remaining space is taken by multiple copies of components. Although
getting each line of the FPGA code right is harder than getting each
line of the C/C++ right, there are fewer lines in total. And for various
reasons (not all of which are understood), studies show that the rate of
bugs in programs is roughly proportional to the number of lines, almost
independent of the language and of the type of programming. Weird, but
apparently true. That's part of the reason for using higher level
languages like Python (or MyHDL for FPGA design) rather than C++ - not
only do programmers typically code faster, they make fewer mistakes.
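A small illustration of the line-count argument, with both versions written in Python for comparison (the task and function names are invented): the C-style version has more lines, so by the bugs-per-line rule it has more places for an off-by-one or a forgotten accumulator to hide.

```python
# Same task twice: mean of the positive values in a list.

def mean_positive_c_style(values):
    # Explicit index loop, manual accumulators -- the idiom a C
    # programmer would translate line for line.
    total = 0.0
    count = 0
    i = 0
    while i < len(values):
        if values[i] > 0:
            total += values[i]
            count += 1
        i += 1
    if count == 0:
        return 0.0
    return total / count

def mean_positive_pythonic(values):
    # Higher-level idiom: fewer lines, fewer places to be off by one.
    positives = [v for v in values if v > 0]
    return sum(positives) / len(positives) if positives else 0.0

data = [3, -1, 4, -1, 5]
print(mean_positive_c_style(data))   # 4.0
print(mean_positive_pythonic(data))  # 4.0
```

Both are correct here, but the roughly-constant bugs-per-line observation says the shorter one is the better bet at scale.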
Of course, an FPGA project typically involves a lot more comprehensive
testing than a typical C++ project, and is typically better planned
(with less feature creep), both of which are critical to getting low bug
rates.
However, you have to remember that hardware (and FPGA) and software do
different jobs. Sometimes there are jobs that can be implemented well
in either, but that's seldom the case. And when there is, it is
generally much faster to develop a software solution. What is often
missing in the software side is a commitment of time and resources to
proper development and testing that would mean the development took
longer, but gave a more reliable result (and thus often saves money in
the long term). With FPGA design, if you don't make such a commitment,
your project will never work at all - thus it is more likely that a
released product is nearly bug-free.
So, computers should use more hardware and less software to manage
resources. In fact, the "OS kernel" of my multiple-CPU chip could be
entirely hardware. Should be, in fact.
There are certainly benefits in putting some of an OS in hardware - but
the hardware can never be as flexible as software. If you want an
example of a device with a hardware OS, have a look at
http://www.innovasic.com/fido.htm (it's 68k based, so you'll like it).
I've seen other CPUs with OS hardware - typically it is to make task
switching more predictable, so that hardware devices like timers and
UARTs can be simulated in software.
Yes. The bug level is proportional to the ease of making revisions.
That's why programmers type rapidly and debug, literally, forever.
No, a lack of commitment to proper design strategy and testing is why
software developers typically start in the middle of a project and never
properly finish it. The ease of making revisions and sending out
updates is part of why such a commitment is never made - managers
believe it is cheaper to ship prototype software and let users do the
testing.
The number of bugs is roughly proportional to the lines of code. It's
the debugging and testing (or lack thereof) that is often the problem,
combined with structural failures due to lack of design.