Maker Pro

How to develop a random number generation device


krw

XP goes for longish periods (ms) with the interrupts off too.
I'd be perfectly happy to go back to Win2K. Unfortunately, that
isn't a reasonable choice.
 

krw

On Sep 15, 11:09 am, John Larkin
[....]
architecture. In a few years we'll have, say, 1024 processors on a
chip, and something new will be required to manage them. It will be a
thousand times simpler and more reliable than Windows.
I think that the number of virtual cores will grow faster than the
number of real cores. With extra register banks and a bit of clever
design, a single ALU can look like two slightly slower ones.

Not register banks, just a couple of bits in the rename register
files.

I think you mistook my point. You would have as many sets of registers
as there are virtual CPUs, perhaps plus some. When a task hits a
point where it needs to wait, its ALU section starts doing the work
for the lower priority task. This could be all hardware so no context
switching time other than perhaps a clock cycle would be needed.

I don't think I did. My point is that you don't need banks of
registers, simply use the renaming that's already there and a couple
of bits to mark which registers are renamed to which virtual CPUs.
No context switch and no bank switching. All the hardware is already
there. More registers are needed in the register files but multiple
copies of the unused ones aren't.
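
To make that concrete, here is a minimal C sketch of the idea (purely
illustrative; the names, widths, and sizes are my assumptions, not any real
core's rename logic): each physical register entry carries a small
virtual-CPU tag, so no separate banks and no save/restore are needed.

    #include <stdint.h>

    #define NUM_PHYS_REGS 64   /* one shared physical register file          */
    #define NUM_VCPUS      4   /* "a couple of bits" -> up to 4 virtual CPUs */

    typedef struct {
        uint64_t value;
        uint8_t  vcpu;      /* which virtual CPU this entry is renamed to  */
        uint8_t  arch_reg;  /* which architectural register it stands for  */
        uint8_t  valid;
    } phys_reg_t;

    static phys_reg_t prf[NUM_PHYS_REGS];

    /* Find the physical register currently mapped to (vcpu, arch_reg);
       returns -1 if there is no mapping.  In hardware this lookup would
       be a CAM, not a loop; the loop is only to show the tagging idea. */
    static int rename_lookup(uint8_t vcpu, uint8_t arch_reg)
    {
        for (int i = 0; i < NUM_PHYS_REGS; i++)
            if (prf[i].valid && prf[i].vcpu == vcpu &&
                prf[i].arch_reg == arch_reg)
                return i;
        return -1;
    }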
I figure they would form some kind of repeating pattern along the
chip. This way the problems have to be solved only once. The amount
of hardware in an FPU is more than in the integer ALU, and
floating-point operations are less common, so I think it would work out.

FPUs are small. I don't remember exactly, but the FPU I worked on
wasn't a lot bigger than the FXU. It certainly wasn't as large as the
VMX units, and those weren't all that big compared to the instruction
decoder, sequencer, and arrays. AFAIC, instruction units aren't the
major issue. In fact, they're often duplicated because they can be,
cheaply.
On the later X86 machines there is a second ALU just for doing
addressing. We already sort of have more ALUs than FPUs in the current
machines.

The PPC-970 had two FPUs, two FXUs, a VMX, and separate ALUs in the
Load/Store *units*. The dual core was still in the 200 sq. mm class.
Most of that area was in arrays.
On operations like 1/sqrt(X), doubling the number of transistors can
more than double the speed. You can make the initial guess very good
and loop much less.

Doubling it again likely won't have the same results though.
Diminishing returns bite hard.
 

MooseFET

I'd be perfectly happy to go back to Win2K. Unfortunately, that
isn't a reasonable choice.

Where I work, we just did another install of SUSE Linux. We have a
huge investment in DOS-based code in the test department. Running
under "dosemu" on SUSE, they work just fine. Under XP there were lots of
problems. Under Vista there was no hope at all.

XP has a character-dropping rate on the RS-232 of about 1 in 10^5 to
10^7. This is much worse than the "it's broken" limit on what is being
tested.

XP also doesn't let DOS talk to USB-to-RS-232 converters. Under SUSE,
it works just fine.

One of the machines is not on a network. XP seems to get unhappy if
it is not allowed to phone home every now and then.
 

John Larkin

Where I work, we just did another install of SUSE Linux. We have a
huge investment in DOS-based code in the test department. Running
under "dosemu" on SUSE, they work just fine. Under XP there were lots of
problems. Under Vista there was no hope at all.

XP has a character-dropping rate on the RS-232 of about 1 in 10^5 to
10^7.

Dang, how do you get it to work that well? By waiting 15 seconds for
all the characters to dribble in from the buffers?

John
 

MooseFET

On Sep 15, 11:09 am, John Larkin
[....]
architecture. In a few years we'll have, say, 1024 processors on a
chip, and something new will be required to manage them. It will be a
thousand times simpler and more reliable than Windows.
I think that the number of virtual cores will grow faster than the
number of real cores. With extra register banks and a bit of clever
design, a single ALU can look like two slightly slower ones.
Not register banks, just a couple of bits in the rename register
files.
I think you mistook my point. You would have as many sets of registers
as there are virtual CPUs, perhaps plus some. When a task hits a
point where it needs to wait, its ALU section starts doing the work
for the lower priority task. This could be all hardware so no context
switching time other than perhaps a clock cycle would be needed.

I don't think I did. My point is that you don't need banks of
registers, simply use the renaming that's already there and a couple
of bits to mark which registers are renamed to which virtual CPUs.
No context switch and no bank switching. All the hardware is already
there. More registers are needed in the register files but multiple
copies of the unused ones aren't.

Yes, I think I mistook your point.


The way I had imagined it was that the registers of the virtual CPUs
that are not currently running would be in a different place than the
ones that are actually being used. My concern was to avoid increasing the
fan-in and fan-out of the busses on the ALU, so that there would be no
increase in the loading and hence delay in those circuits.

I also imagined the register exchanging having its own set of busses.
Perhaps I was too worried about bus times and not worried enough about
ALU times.
FPUs are small. I don't remember exactly, but the FPU I worked on
wasn't a lot bigger than the FXU. It certainly wasn't as large as the
VMX units, and those weren't all that big compared to the instruction
decoder, sequencer, and arrays. AFAIC, instruction units aren't the
major issue. In fact, they're often duplicated because they can be,
cheaply.

You may have a point here. I've never actually measured the sizes of
such things. I was thinking back to the designs of bit slice
machines.

Doubling it again likely won't have the same results though.
Diminishing returns bite hard.

The throughput continues to grow fairly quickly but you end up with a
pipeline. When the circuit gets to a certain point, the stages become
equivalent to a multiplier circuit.

BTW:
There are four ways of getting to a sqrt() function. If you are doing
it on a microcontroller or other machine where dividing is very
costly, Newton's method is the slowest. If you have a fast multiply,
finding 1/sqrt(X) is much quicker.
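
As a concrete illustration of the "fast multiply, no divide" route (a
sketch only; the magic-constant seed is just one well-known way to get a
good starting guess): Newton's iteration for y = 1/sqrt(x) is
y <- y*(1.5 - 0.5*x*y*y), which needs no division, and a better initial
guess means fewer iterations.

    #include <stdint.h>
    #include <string.h>

    float recip_sqrt(float x)
    {
        float y;
        uint32_t i;

        memcpy(&i, &x, sizeof i);           /* reinterpret the bits      */
        i = 0x5f3759dfu - (i >> 1);         /* rough initial guess       */
        memcpy(&y, &i, sizeof y);

        y = y * (1.5f - 0.5f * x * y * y);  /* one Newton step           */
        y = y * (1.5f - 0.5f * x * y * y);  /* second step for accuracy  */
        return y;                           /* sqrt(x) is then x * y     */
    }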
 

David Brown

John said:
You don't install OS patches? How do you manage that?

It's interesting that you didn't know about Patch Tuesday.

As a general rule, I don't install OS patches either - they lead to too
much trouble. When managing a network with windows machines, especially
with a mixture of flavours, you just have to accept that they are
vulnerable and ensure that you avoid bad stuff getting into the network
- there is no point in hoping that the latest windows patches will help.
That means serious checking on incoming email (kill *all* executable
attachments, and virus check the rest) with all other pop3 access
blocked, a decent firewall (obviously not running windows!), and good
user training backed up by nasty threats if anyone tries to do anything
risky on the machines.
 

David Brown

John said:
John said:
On Sep 15, 11:09 am, John Larkin
[....]
architecture. In a few years we'll have, say, 1024 processors on a
chip, and something new will be required to manage them. It will be a
thousand times simpler and more reliable than Windows.
I think that the number of virtual cores will grow faster than the
number of real cores. With extra register banks and a bit of clever
design, a single ALU can look like two slightly slower ones.

I expect to see multicore machines with fewer actual floating-point
ALUs than actual integer ALUs.
Sounds sort of like Sun's Niagara chips, which have (IIRC) 8 cores, each
with 4 threads, but only a few floating point units. For things like
web serving, it's ideal.
Yup. Low-horsepower tasks can just be a thread on a multithread core,
and many little tasks don't need a dedicated floating-point unit.

My point/fantasy is that OS design should change radically if many,
many real or virtual CPUs are available. One CPU would be the manager,
and every task, process, or driver could have its own, totally
confined and protected, CPU, and there would be no context switching
ever, and few interrupts in fact.
That's not going to work for Linux, anyway - there is a utility thread
spawned per cpu at the moment (work is underway to avoid this, because
it is a bit of a pain when you have thousands of cpus in one box).

However, there is no point in having a cpu (or even a virtual cpu)
dedicated to each task. Many sorts of tasks spend a lot of time
sleeping while waiting for other events - a cpu in this state is a waste
of resources.

Only if you think of a CPU as a valuable resource. As silicon shrinks,
a CPU becomes a minor bit of real estate. It makes sense to use it
when there's something to do, and put it to sleep when there's not.
Lots of power gets saved by not doing context switches.

CPUs *are* a valuable resource - modern cpu cores take up a lot of
space, even when you exclude things like the cache (which take more
space, but cost less per mm^2 since you can design in a bit of
redundancy and thus tolerate some faults).

The more CPUs you have, the more time and space it costs to keep caches
and memory accesses coherent. There are some sorts of architectures
which work well with multiple CPU cores, but these are not suitable for
general purpose computing.
My point is that large numbers of CPU cores *will* become common and
cheap, and we need a new type of OS to take advantage of this new
reality. Done right, it could be simple and astoundingly secure and
reliable.

I would be very surprised to see a system where the number of CPU cores
was greater than the number of processes. I expect to see the number of
cores increase, especially for server systems, but I don't expect to see
systems where it is planned and expected that most cores will sleep most
of the time.
I'd be happy to waste a little silicon if I could have an OS that
doesn't crash and that doesn't go to sleep for seconds at a time for
no obvious reason.

Multiple cores give absolutely no benefits in terms of reliability or
stability - indeed, it opens all sorts of possibilities for
hard-to-debug race conditions.
 

John Larkin

What is interesting is that you actually think that is how it is done.

Overkill. You likely won't manage those worth a shit... either.

There's not a lot I can do about Windows (I have to run it for some of
the apps I use) but it's certainly worth $3K to have reliable hardware
and drives. Every time a Dell dies, it costs me or one of my people a
week or two to get everything back to where it was, and we're surely
worth more than $3K a week.

The cool thing about raid hot-plug is that I can occasionally plug in
a blank drive, and my C: drive gets cloned, OS and all. I stash the
clone in a baggie. If my machine dies for any reason, I grab a spare
box from down the hall, plug in the copy of C:, and I'm back online in
5 minutes.

And, once a year maybe, I plug a brand-new drive into one of the RAID
slots, so my drives never die from sheer wear-out.

John
 

John Larkin

John said:
John Larkin wrote:
On Sep 15, 11:09 am, John Larkin
[....]
architecture. In a few years we'll have, say, 1024 processors on a
chip, and something new will be required to manage them. It will be a
thousand times simpler and more reliable than Windows.
I think that the number of virtual cores will grow faster than the
number of real cores. With extra register banks and a bit of clever
design, a single ALU can look like two slightly slower ones.

I expect to see multicore machines with fewer actual floating-point
ALUs than actual integer ALUs.

Sounds sort of like Sun's Niagara chips, which have (IIRC) 8 cores, each
with 4 threads, but only a few floating point units. For things like
web serving, it's ideal.

Yup. Low-horsepower tasks can just be a thread on a multithread core,
and many little tasks don't need a dedicated floating-point unit.

My point/fantasy is that OS design should change radically if many,
many real or virtual CPUs are available. One CPU would be the manager,
and every task, process, or driver could have its own, totally
confined and protected, CPU, and there would be no context switching
ever, and few interrupts in fact.

That's not going to work for Linux, anyway - there is a utility thread
spawned per cpu at the moment (work is underway to avoid this, because
it is a bit of a pain when you have thousands of cpus in one box).

However, there is no point in having a cpu (or even a virtual cpu)
dedicated to each task. Many sorts of tasks spend a lot of time
sleeping while waiting for other events - a cpu in this state is a waste
of resources.

Only if you think of a CPU as a valuable resource. As silicon shrinks,
a CPU becomes a minor bit of real estate. It makes sense to use it
when there's something to do, and put it to sleep when there's not.
Lots of power gets saved by not doing context switches.

CPUs *are* a valuable resource - modern cpu cores take up a lot of
space, even when you exclude things like the cache (which take more
space, but cost less per mm^2 since you can design in a bit of
redundancy and thus tolerate some faults).

The more CPUs you have, the more time and space it costs to keep caches
and memory accesses coherent. There are some sorts of architectures
which work well with multiple CPU cores, but these are not suitable for
general purpose computing.
My point is that large numbers of CPU cores *will* become common and
cheap, and we need a new type of OS to take advantage of this new
reality. Done right, it could be simple and astoundingly secure and
reliable.

I would be very surprised to see a system where the number of CPU cores
was greater than the number of processes. I expect to see the number of
cores increase, especially for server systems, but I don't expect to see
systems where it is planned and expected that most cores will sleep most
of the time.

Well, I remember 64-bit static RAMs and 256-bit DRAMs. I can't see
any reason we couldn't have 256 or 1024 CPUs on a chip, especially if
a lot of them are simple integer RISC machines.
Multiple cores give absolutely no benefits in terms of reliability or
stability - indeed, it opens all sorts of possibilities for
hard-to-debug race conditions.

They don't if you insist on running a copy of a bloated OS on each. A
system designed, from scratch, to run on a pool of cheap CPUs could be
incredibly reliable.

It's gonna happen.

John
 

John Larkin

On Sep 15, 11:09 am, John Larkin
[....]
architecture. In a few years we'll have, say, 1024 processors on a
chip, and something new will be required to manage them. It will be a
thousand times simpler and more reliable than Windows.
I think that the number of virtual cores will grow faster than the
number of real cores. With extra register banks and a bit of clever
design, a single ALU can look like two slightly slower ones.

Not register banks, just a couple of bits in the rename register
files.

I think you mistook my point. You would have as many sets of registers
as there are virtual CPUs, perhaps plus some. When a task hits a
point where it needs to wait, its ALU section starts doing the work
for the lower priority task. This could be all hardware so no context
switching time other than perhaps a clock cycle would be needed.

Right. Move a lot of the functionality of the OS into hardware.
Whether the 1024 CPUs are real hardware or pipeline tricks, similar to
multithreading, we can count on the hardware to work right.

John
 

Vladimir Vassilevsky

Dear John,
Try developing a perfect OS of your own. I did. That was a very
enlightening experience of why certain things have to be done in
certain ways. Particular questions are welcome.

Well, I remember 64-bit static RAMs and 256-bit DRAMs. I can't see
any reason we couldn't have 256 or 1024 CPUs on a chip, especially if
a lot of them are simple integer RISC machines.

1024 CPUs = 1,048,576 software interfaces (1024^2) and a hell of a bus-arbitration problem.

The weak link is the developer. It is obviously more difficult to
develop multicore stuff; hence there is a higher probability of flaws.

Especially if you remember the 50-page silicon errata for pretty
much any modern CPU.
They don't if you insist on running a copy of a bloated OS on each. A
system designed, from scratch, to run on a pool of cheap CPUs could be
incredibly reliable.

What do you think, in particular, would be better for typical desktop
applications?
It's gonna happen.

You have to listen to the screams of the SEL software developers...


Vladimir Vassilevsky
DSP and Mixed Signal Design Consultant
http://www.abvolt.com
 

John Larkin

Dear John,
Try developing a perfect OS of your own. I did. That was a very
enlightening experience of why certain things have to be done in
certain ways. Particular questions are welcome.

I did write three RTOS's, one for the 6800, one for the PDP-11, one
for the LSI-11. As far as I know, they were perfect, in that they ran
damned fast and had no bugs. The 6800 version included a token ring
LAN thing, which I invented independently in about 1974.

1024 CPUs = 1,048,576 software interfaces (1024^2) and a hell of a bus-arbitration problem.

No worse a software interface than if each process was running on a
single shared CPU; much less, in fact, since irrelevant interrupts,
swapping, and context switches aren't going on. Each process
absolutely owns a CPU and only interacts with other processes when
*it* needs to, probably through shared memory and semaphores.
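
Something like the following C sketch (the names and sizes are mine,
purely illustrative) is the kind of interaction meant here: two processes,
each notionally owning its own CPU, touching each other only through a
shared-memory mailbox guarded by semaphores.

    #include <semaphore.h>
    #include <string.h>

    #define MSG_SIZE 64

    struct mailbox {              /* lives in memory shared by both CPUs */
        sem_t empty;              /* init to 1: the slot starts out free */
        sem_t full;               /* init to 0: nothing to read yet      */
        char  msg[MSG_SIZE];
    };

    /* Called by the sending process, on its own CPU. */
    void mailbox_send(struct mailbox *mb, const char *data)
    {
        sem_wait(&mb->empty);                 /* block until slot is free */
        strncpy(mb->msg, data, MSG_SIZE - 1);
        mb->msg[MSG_SIZE - 1] = '\0';
        sem_post(&mb->full);                  /* wake the receiver        */
    }

    /* Called by the receiving process, on its own CPU. */
    void mailbox_recv(struct mailbox *mb, char *out)
    {
        sem_wait(&mb->full);
        memcpy(out, mb->msg, MSG_SIZE);
        sem_post(&mb->empty);
    }

Before use, the semaphores would be set up with sem_init(&mb->empty, 1, 1)
and sem_init(&mb->full, 1, 0), the second argument marking them as shared
between processes.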

As far as bus arbitration goes, they all just share a central cache on
the chip, with a single bus going out to dram. Cache coherence becomes
trivial.
The weak link is the developer. It is obviously more difficult to
develop multicore stuff; hence there is a higher probability of flaws.

Putting a few hundred RISC cores on a chip, connecting to a central
cache, is easy. You only have to get it right once. In our world,
incredibly complex hardware just works, and modestly complex software
is usually a bag of worms. Clearly we need the hardware to help the
software.
Especially if you remember the 50-page silicon errata for pretty
much any modern CPU.

Intel, maybe. Are any of the RISC machines that bad? But my PC doesn't
have hardware problems, it has software problems.

What do you think, in particular, would be better for typical desktop
applications?

Oh, 256 CPUs and, say, 32 FPUs should be plenty.
You have to listen to the screams of the SEL software developers...

Of course lots of software people won't like this. Well, they had
their chance and blew it.

John
 

Joel Kolstad

John Larkin said:
But my PC doesn't
have hardware problems, it has software problems.

Sure it does (have hardware problems); Intel was just smart enough to allow
patching of the microcode so that buggy features are worked around or simply
not used. When the bugs simply can't be fixed, it's the compiler writers who
get burdened with having to keep track of which bugs are still present and
ensuring that code isn't generated that would expose them (very few people
program x86 CPUs in assembly anymore!). Intel is also smart enough to do a
lot of testing any time a new CPU comes out -- I'm sure there are still
plenty of people there who remember the nasty FDIV bug, as well as the
lesser-known problem with the first 1GHz CPUs that would randomly fail to
compile the Linux kernel.

I've been surprised at just how buggy a lot of PC software is if you actually
start pushing it to its limits -- it's clear that much software today only
gets very rudimentary testing. (And as I've stated before, I personally know
"programmers" who believe they can claim on a progress report that they
"finished such and such software" as soon as it *compiles*. :-( )
 

David Brown

John said:
John said:
On Sun, 16 Sep 2007 22:07:42 +0200, David Brown

John Larkin wrote:
On Sep 15, 11:09 am, John Larkin
[....]
architecture. In a few years we'll have, say, 1024 processors on a
chip, and something new will be required to manage them. It will be a
thousand times simpler and more reliable than Windows.
I think that the number of virtual cores will grow faster than the
number of real cores. With extra register banks and a bit of clever
design, a single ALU can look like two slightly slower ones.

I expect to see multicore machines with fewer actual floating-point
ALUs than actual integer ALUs.

Sounds sort of like Sun's Niagara chips, which have (IIRC) 8 cores, each
with 4 threads, but only a few floating point units. For things like
web serving, it's ideal.

Yup. Low-horsepower tasks can just be a thread on a multithread core,
and many little tasks don't need a dedicated floating-point unit.

My point/fantasy is that OS design should change radically if many,
many real or virtual CPUs are available. One CPU would be the manager,
and every task, process, or driver could have its own, totally
confined and protected, CPU, and there would be no context switching
ever, and few interrupts in fact.

That's not going to work for Linux, anyway - there is a utility thread
spawned per cpu at the moment (work is underway to avoid this, because
it is a bit of a pain when you have thousands of cpus in one box).

However, there is no point in having a cpu (or even a virtual cpu)
dedicated to each task. Many sorts of tasks spend a lot of time
sleeping while waiting for other events - a cpu in this state is a waste
of resources.
Only if you think of a CPU as a valuable resource. As silicon shrinks,
a CPU becomes a minor bit of real estate. It makes sense to use it
when there's something to do, and put it to sleep when there's not.
Lots of power gets saved by not doing context switches.
CPUs *are* a valuable resource - modern cpu cores take up a lot of
space, even when you exclude things like the cache (which take more
space, but cost less per mm^2 since you can design in a bit of
redundancy and thus tolerate some faults).

The more CPUs you have, the more time and space it costs to keep caches
and memory accesses coherent. There are some sorts of architectures
which work well with multiple CPU cores, but these are not suitable for
general purpose computing.
My point is that large numbers of CPU cores *will* become common and
cheap, and we need a new type of OS to take advantage of this new
reality. Done right, it could be simple and astoundingly secure and
reliable.
I would be very surprised to see a system where the number of CPU cores
was greater than the number of processes. I expect to see the number of
cores increase, especially for server systems, but I don't expect to see
systems where it is planned and expected that most cores will sleep most
of the time.

Well, I remember 64-bit static RAMs and 256-bit DRAMs. I can't see
any reason we couldn't have 256 or 1024 CPUs on a chip, especially if
a lot of them are simple integer RISC machines.

You can certainly get 1024 CPUs on a chip - there are chips available
today with hundreds of cores. But there are big questions about what
you can do with such a device - they are specialised systems. To make
use of something like that - you'd need a highly parallel problem (most
desktop applications have trouble making good use of two cores - and it
takes a really big web site or mail gateway to scale well beyond about
16 cores). You also have to consider the bandwidth to feed these cores,
and be careful that there are no memory conflicts (since cache coherency
does not scale well enough).
They don't if you insist on running a copy of a bloated OS on each. A
system designed, from scratch, to run on a pool of cheap CPUs could be
incredibly reliable.

That's a conjecture plucked out of thin air. Of course a dedicated OS
designed to be limited but highly reliable is going to be more reliable
than a large general-purpose OS that must run on all hardware and
support all sorts of software - but that has absolutely nothing to do
with the number of cores!
 

David Brown

John said:
I did write three RTOS's, one for the 6800, one for the PDP-11, one
for the LSI-11. As far as I know, they were perfect, in that they ran
damned fast and had no bugs. The 6800 version included a token ring
LAN thing, which I invented independently in about 1974.



No worse a software interface than if each process was running on a
single shared CPU; much less, in fact, since irrelevant interrupts,
swapping, and context switches aren't going on. Each process
absolutely owns a CPU and only interacts with other processes when
*it* needs to, probably through shared memory and semaphores.

A shared memory interface for 1024 cpus? That's going to be absolutely
vast, or have terrible latency.

I still don't understand why you think that interrupts or context
switches are a reliability issue - processors don't have problems with them.

And I'd love to hear you explain to customers that while their web
server has a load average of a couple of percent, they need to buy a
second processor chip just to run an extra cron job. A single cpu per
process will *never* be realistic.
As far as bus arbitration goes, they all just share a central cache on
the chip, with a single bus going out to dram. Cache coherence becomes
trivial.

"Just share a central cache?" It might sound easy to you, but I suspect
it would be *slightly* more challenging to implement.
Putting a few hundred RISC cores on a chip, connecting to a central
cache, is easy. You only have to get it right once. In our world,
incredibly complex hardware just works, and modestly complex software
is usually a bag of worms. Clearly we need the hardware to help the
software.

You are too used to solid, reliable, *simple* cores like the cpu32.
Complex hardware is like complex software - it *is* complex software,
written in design languages then "compiled" to silicon. Like software,
big and complex hardware has bugs.
Intel, maybe. Are any of the RISC machines that bad? But my PC doesn't
have hardware problems, it has software problems.

Yes, many RISC machines have substantial errata. The more complex you
make the design, the more bugs you get.

What you seem to be missing is that although the cores on your 1K cpu
chip are simple (and can therefore be expected to be reliable, if
designed well), they don't exist alone. If you want them to support
general purpose computing tasks, rather than a massive SIMD system, then
you have a huge infrastructure around them to feed them with instruction
streams and data, and enormous complications trying to keep memory
consistent.
Oh, 256 CPUs and, say, 32 FPUs should be plenty.

My desktop machine might well run more than 256 processes. How does
that fit in your device? But most of the time, there are only 2 or 3
processes doing much work - often there will be 1 process which should
run as fast as possible, as single-thread performance is the main
bottleneck for desktop cpus.
 

Richard Henry

Sure it does (have hardware problems); Intel was just smart enough to allow
patching of the microcode so that buggy features are worked around or simply
not used. When the bugs simply can't be fixed, it's the compiler writers who
get burdened with having to keep track of which bugs are still present and
ensuring that code isn't generated that would expose them (very few people
program x86 CPUs in assembly anymore!). Intel is also smart enough to do a
lot of testing any time a new CPU comes out -- I'm sure there are still
plenty of people there who remember the nasty FDIV bug, as well as the
lesser-known problem with the first 1GHz CPUs that would randomly fail to
compile the Linux kernel.

I've been surprised at just how buggy a lot of PC software is if you actually
start pushing it to its limits -- it's clear that much software today only
gets very rudimentary testing. (And as I've stated before, I personally know
"programmers" who believe they can claim on a progress report that they
"finished such and such software" as soon as it *compiles*. :-( )

That's because "Unit testing" is the next block on the Work Breakdown
Structure.
 

Nobody

But it knows what chunks of memory it has allocated to a particular
process. As long as it's in your own memory space, who cares if you
overwrite/overrun your own buffers?

Doing so is the essence of a "buffer overrun exploit", one of the most
common types of security vulnerability for code written in C/C++.

It allows a malicious user to make a program do something that it isn't
supposed to do.

E.g. consider a program being run on a web server to process form
input from a web page. If the program suffers from a buffer overrun flaw,
simply sending the right data in a POST request can allow the attacker to
execute arbitrary code on the web server.

Or a buffer overrun in a mail client could allow someone to run arbitrary
code on the user's machine by sending them a specially crafted email.

This is one of the common ways that computer systems get "hacked".
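
The classic shape of the flaw, as a C sketch (the function name and sizes
are made up for illustration):

    #include <string.h>

    /* 'input' comes straight from an attacker-controlled POST body. */
    void handle_form_field(const char *input)
    {
        char buf[64];
        strcpy(buf, input);   /* no bounds check: anything past 63 bytes
                                 plus the NUL spills over buf, and on many
                                 systems can overwrite the saved return
                                 address, diverting execution to data the
                                 attacker supplied */
        /* ... process buf ... */
    }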

Persuading a process to write outside of its allotted address space is
harmless. The CPU will cause an exception and the OS will typically
terminate the process. Even if it didn't, there's nothing there for it to
damage. With modern hardware (e.g. 80286 and later running in protected
mode), the address space of one process (or the OS kernel) simply isn't
"visible" to another process.
 

Joel Kolstad

Nobody said:
With modern hardware (e.g. 80286 and later running in protected
mode), the address space of one process (or the OS kernel) simply isn't
"visible" to another process.

True, but if you can manage to create a buffer overflow in a kernel process
(the TCP/IP stack being a common target here, often implemented as a
kernel-level driver), you have the keys to the kingdom.
 

Nobody

So you agree at this point.

Yes; subject to the caveat that the term "buffer overrun" is normally used
in reference to the exploitable case, where the overrun occurs between
buffers within the process' address space. E.g. the wikipedia entry for
"buffer overrun" or "buffer overflow" only addresses this case:

http://en.wikipedia.org/wiki/Buffer_overrun

Technical description

A buffer overflow occurs when data written to a buffer, due to
insufficient bounds checking, corrupts data values in memory addresses
adjacent to the allocated buffer. Most commonly this occurs when copying
strings of characters from one buffer to another.

The case where the buffer is at the beginning or end of a mapped region,
and the overrun attempts to modify a different region, is usually ignored.
The OS will just kill the process, so there's no exploit potential, and
it's statistically far less likely (most buffers aren't at the beginning
or end of a mapped region).
Yes, go back and re-read it carefully.

What's to re-read? Exploitation requires the write to succeed, which
requires that the overrun has to occur into memory which is writable by
the task.
You seem to be confused about what we are talking about.

I know what *I'm* talking about, which is what most programmers mean by
"buffer overrun", i.e. the exploitable case, not the segfault case.
We are
talking about making an OS safe. If an application task commits an
overrun that causes that task to fail, it is quite a different matter
than talking about a buffer-overrun-based exploit.

The segfault case is uninteresting; it's a "solved" problem. The
exploitable case is one of the main mechanisms through which computers get
hacked. It's *the* main mechanism for most C/C++ code.
He is talking about process isolation and it not being violated by a
buffer overrun if the OS is well written. He is correct in what he
said.

Indeed. But none of the current OSes are defective in this regard.

Windows 95/98/ME lacked memory protection on certain parts of memory for
backwards compatibility (i.e. portions of the bottom megabyte were shared
between processes and globally writable, for compatibility with real-mode
(8086) code).

And this case isn't what people are normally referring to if they're
talking about "problems", "vulnerabilities" etc of buffer overruns.
You have assumed that by causing the overrun the attacker has gained
control. As I explained earlier this need not be the case.

An attacker *may* gain control through this mechanism. Whether or not they
can depends upon how the variables are laid out in memory and how the
variables affect the program.

The point is that protection against this issue mostly has to be done by
the language and/or compiler.
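
For example, the routine sketched earlier stops being exploitable once the
copy carries an explicit bound (again a sketch; the name mirrors the earlier
hypothetical example), and compilers add further help on top of this, e.g.
stack canaries via options such as GCC's -fstack-protector:

    #include <stdio.h>

    void handle_form_field_safe(const char *input)
    {
        char buf[64];
        /* snprintf never writes past the end of buf and always
           NUL-terminates it, so over-long input is truncated
           instead of overrunning the buffer. */
        snprintf(buf, sizeof buf, "%s", input);
        /* ... process buf ... */
    }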
 