John said:
I have made no such claims.
You have repeatedly said that current OS's (software OS's running on one
or a few cores) are inherently unreliable, while your idea of a massively
multi-core cpu running a task per core would be totally reliable. As
far as I can see, you are the only person who believes this. If I've
misunderstood (either about your claims, or if you can show that others
share the idea), please correct me.
using a hardware design that is obviously (to me, anyway)
Can't help what's obvious to you.
and you offer no justification beyond repeating claims that
Isn't it?
No it isn't. At best, you can compare apples and oranges and note that
a ram chip is more reliable than windows, despite the former having more
transistors than the latter has lines of source code.
We agree that typical hardware design processes are more geared to
producing reliable and well-tested designs than common software design
processes. But that does not translate into a generalisation that a
given task can be performed more reliably in hardware than software.
and therefore you can
I can't guarantee it. My ideas are necessarily simplistic, and would
Perhaps "guarantee" was a bit strong - but you stated confidently that
your 1024-core one-core-per-task devices were "gonna happen".
get more complex in a real system. Like, for example, my multicore
chip would probably have a core+GPU or three optimized for graphics,
and maybe some crypto or compression/decompression gadgets. There's no
point sacrificing performance to intellectual purity.
This is beginning to sound a lot more like a practical system - devices
exist today with several specialised cores, particularly in the embedded
market. Arguably graphics cards fall into this category, as do high-end
network cards with offload engines. But that's a far cry from your
cpu-per-thread idea, and it is done for performance reasons - *not*
reliability.
But the trend towards multiple cores, running multiple threads each,
is a steamroller. So far, it's been along the Microsoft "big OS"
model, but when we get to scores of processors running hundreds of
threads, wouldn't a different OS design start to make sense? The IBM
Cell is certainly another direction.
Forget windows - it's a bad example of an OS, and it's an extreme
example of unreliable software. There is no "Microsoft big OS" model -
they just have a bad implementation of a normal monolithic kernel OS.
There are uses for computers based on running large numbers of threads
in parallel - the Sun Niagara processors can handle 64 threads in
hardware (running on 8 cores). But these do not use a core (or even a
virtual core) per thread - the cores still perform context switches as threads
and processes come and go, or sleep and resume. Clearly you will get
better *performance* when you can minimise context switching - but no
one would plan for a system where context switching did not happen.
There is nothing to suggest that the system could be made more reliable
by avoiding context switches, except in the sense of reliably being able
to complete tasks at the required speed - it's a performance issue.
Sorry, I missed that part. Why is it, or more significantly, why *will
it* be impractical to design a chip that will contain, or act like it
contains, a couple hundred CPU cores, all interfaced to a central
cache?
Perhaps I didn't explain it well, or perhaps you didn't read these posts
- it's hard to follow everything on s.e.d.
The problem with so many cores accessing a shared cache is that you have
huge contention for the cache resources. RAM cells get bigger, slower
and more complex the more ports they have - it's rare to get more than
dual-ported RAM blocks. So if you have 1000 cores all trying to access
the same cache, you're going to have huge latencies. You also need
complex multiplexing hierarchies for your cross-switches - as each cpu
needs to access the cache, you basically require a 1000:1 multiplexer.
Assuming your cache has multiple banks and access to some IO or other
buses, you'd need something like a 1000:10 cross-switch. That would be
really horrible to implement - you'd need to find a compromise between
vast switching circuits and multiple levels introducing delays and
bottlenecks.
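
To put rough numbers on that - and these are only back-of-the-envelope
figures, with the 32-way mux fan-in picked arbitrarily - compare a flat
1000:10 crossbar with a two-level version:

/* Back-of-the-envelope sizing of the switching needed between N cores
 * and M cache banks.  A flat crossbar needs roughly N*M crosspoints;
 * splitting it into two levels of multiplexers saves crosspoints but
 * adds a stage of delay and serialises the cores within each group.
 * The 32-way fan-in is an arbitrary assumption, not a real design. */
#include <stdio.h>

int main(void)
{
    const int cores = 1000;  /* cores contending for the shared cache    */
    const int banks = 10;    /* cache banks / IO ports on the far side   */
    const int fanin = 32;    /* assumed mux fan-in for the two-level tree */

    long flat = (long)cores * banks;          /* crosspoints, one stage */

    int  groups = (cores + fanin - 1) / fanin;
    long level1 = (long)groups * fanin;       /* group multiplexers     */
    long level2 = (long)groups * banks;       /* smaller bank crossbar  */

    printf("flat %d:%d crossbar: %ld crosspoints, 1 stage\n",
           cores, banks, flat);
    printf("two-level version : %ld crosspoints, 2 stages, "
           "one core per group at a time\n", level1 + level2);
    return 0;
}

The two-level tree saves enormously on crosspoints, but it adds a stage
of delay and serialises the cores within each group - exactly the
compromise I'm talking about.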
Here's a brief view of the Niagara II - your device would face similar
challenges, but greatly multiplied:
http://www.theinquirer.net/?article=42256
If each core has an L1 cache to relieve some of the pressure (without
it, the system would crawl), you then have a very nasty problem of
tracking cache coherency. Current cache coherency strategies do not
scale well - they are a big problem on multicore systems.
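
To see why, consider a simple broadcast (snooping) scheme, where every
coherent write has to be checked against every other core's private
cache. With an assumed (and fairly arbitrary) write rate per core, the
total snoop work grows roughly with the square of the core count:

/* Why broadcast (snooping) coherency scales badly: every coherent write
 * must be checked against every other core's private cache, so total
 * snoop work grows roughly with the square of the core count.  The
 * per-core write rate is an arbitrary assumption for illustration. */
#include <stdio.h>

int main(void)
{
    const double writes_per_core = 100e6;  /* coherent writes/s per core (assumed) */

    for (int n = 4; n <= 1024; n *= 4) {
        double snoops = writes_per_core * n * (n - 1);
        printf("%5d cores: %.2e snoop lookups per second in total\n",
               n, snoops);
    }
    return 0;
}

Directory-based schemes cut the broadcasts down, but they add their own
storage and latency costs rather than removing the problem.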
With existing multiprocessor systems, it is the cache and memory
interconnection systems that are the big problem. If you look at
high-end motherboards with 8 or 16 sockets, the cross-bar switches that
keep memory coherent and provide fast access for all the cores cost more
than the processors themselves. Building it all in one device does not
make it significantly easier (although it saves on some buffers).
There are alternative ways to connect up large numbers of cores - a NUMA
arrangement with cores passing memory requests between each other would
almost certainly be easier. But you would have very significant
latencies and bottlenecks, a very large number of inter-core buses, and
you'd still have trouble with the L1 cache coherence.
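
To give a feel for the latencies, here's a crude estimate for a
hypothetical 32 x 32 mesh (1024 cores) with an assumed 2 ns per hop -
both numbers invented purely for illustration:

/* Crude latency estimate for a hypothetical 2-D mesh of cores passing
 * memory requests hop by hop to the node that owns the data.  Both the
 * mesh size and the per-hop cost are invented for illustration only. */
#include <stdio.h>

int main(void)
{
    const int    side   = 32;   /* 32 x 32 mesh = 1024 cores (assumed)   */
    const double hop_ns = 2.0;  /* router + link delay per hop (assumed) */

    /* Average Manhattan distance between two random nodes in an s x s
     * mesh is about 2*(s*s - 1)/(3*s) hops. */
    double hops = 2.0 * (side * side - 1) / (3.0 * side);

    printf("average distance : %.1f hops\n", hops);
    printf("one-way latency  : %.0f ns before the memory access itself\n",
           hops * hop_ns);
    return 0;
}

That's tens of nanoseconds of network traversal before the memory access
even starts, and it says nothing about contention on the links.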
With a new OS, and certain significant restrictions on the software, you
could perhaps avoid many of the L1 cache coherence problems. In
particular, being even more restrictive about memory segments would
allow you to assume that L1 data is private, and thus always coherent.
For example, if all memory came from either a read-only source for code,
or was private to the task using it, then you'd have coherency. You'd
need a system for read and write locks for memory areas, with a central
controller responsible for dishing out these locks and broadcasting
cache invalidations when these changed, but it might work.
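
Something along these lines - and this is only a toy model of the
bookkeeping, with invented names, not a description of any real cache
hardware:

/* Toy model of the central lock controller: each memory region is either
 * unowned, held by any number of readers, or held by one writer.  Granting
 * a write lock is the moment the controller would broadcast an invalidation
 * for that region to every L1.  A sketch of the book-keeping only. */
#include <stdio.h>
#include <stdbool.h>

#define REGIONS 4
#define CORES   8

struct region {
    int readers;   /* cores holding a read lock          */
    int writer;    /* core holding the write lock, or -1 */
};

static struct region table[REGIONS];

static void broadcast_invalidate(int r)
{
    /* In hardware: a message to every core's L1 cache. */
    printf("  invalidate region %d in all %d L1 caches\n", r, CORES);
}

static bool acquire_read(int core, int r)
{
    if (table[r].writer >= 0) {
        printf("core %d: read lock on region %d refused (writer active)\n", core, r);
        return false;
    }
    table[r].readers++;
    printf("core %d: read lock on region %d granted\n", core, r);
    return true;
}

static bool acquire_write(int core, int r)
{
    if (table[r].writer >= 0 || table[r].readers > 0) {
        printf("core %d: write lock on region %d refused (region busy)\n", core, r);
        return false;
    }
    table[r].writer = core;
    broadcast_invalidate(r);   /* stale read-only copies must go */
    printf("core %d: write lock on region %d granted\n", core, r);
    return true;
}

int main(void)
{
    for (int i = 0; i < REGIONS; i++)
        table[i].writer = -1;

    acquire_read(0, 2);
    acquire_read(1, 2);
    acquire_write(3, 2);   /* refused: readers hold the region */
    acquire_write(3, 1);   /* granted: triggers the broadcast  */
    return 0;
}

The point is that private data never needs a lock at all, so the
controller only sees traffic for genuinely shared regions.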
However, you've lost out on a range of requirements here. First off,
your cores are now far from simple, and the glue logic is immense. Thus
you have lost all hope of making the device cheap and reliable.
Secondly, you've still got significant latencies for all memory access,
slowing down the throughput of any given core, crippling your maximum
thread speed. The bottlenecks don't matter so much in the grand view of
the device - the total bandwidth to the cpus should still be more than
if it were a normal multi-core device. Thirdly, you've lost
compatibility with all existing software - it won't run most programs,
as they rely on shared access to data.
Why? Because Windows, and other "big" OS's like Linux, don't support
it?
Yes, that's about it. To be more precise, it will be impractical for
general purpose computing because it won't run common general purpose
programs. Even with the required major changes to the software and
compilation tools, and without the cache restrictions mentioned earlier,
it would run common programs painfully slowly.
It's generally accepted that a microkernel-based OS will be more
reliable than a macrokernel system, because of its simplicity, but the
microkernel needs too many context switches to be efficient.
A microkernel *may* be more reliable because of its modular design -
each part is relatively simple and communicates through limited,
controlled ports. That's far from saying it always *will* be more
reliable. Much of the theoretical reliability gains of a microkernel do
not actually help in practice. For example, the ability of low-level
services to be restarted if they crash is useless when the service in
question is essential to the system. Thus there are no reliability
benefits from putting your memory management, task management, virtual
file system, or interrupt system outside the true kernel - if one of
these services dies, you're buggered whether it kills the kernel or not.
A similar situation is found in Linux - because X is separate from the
kernel, it can die and restart independently of the OS itself. But to
the desktop user, their system has died - they don't know or care if the
OS itself survived.
Most of the benefits of a microkernel can actually be achieved in a
monolithic kernel - you keep your services carefully modularised,
developed and tested as separate units with clear and clean interfaces.
It's a good development paradigm - it does not matter in practice if
the key services are directly linked with the kernel or not, since they
are all essential to the working of the OS. About the only way a
microkernel improves reliability is by enforcing this model - you are
not able to cheat.
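
For a concrete (if simplified) picture of what I mean by a clean
interface inside a monolithic kernel, think of a service that the rest
of the kernel only ever touches through a small table of function
pointers - the names below are invented for the sketch:

/* What "modular inside a monolithic kernel" can look like: the rest of
 * the kernel only touches the service through a table of function
 * pointers, so the service is developed and tested as a separate unit
 * even though it is linked into the same image.  All names invented. */
#include <stdio.h>

struct block_dev_ops {
    int (*read_sector)(unsigned sector, void *buf);
    int (*write_sector)(unsigned sector, const void *buf);
};

/* --- the service itself, compiled and tested on its own --------------- */
static int ram_read(unsigned sector, void *buf)
{
    printf("ramdisk: read sector %u\n", sector);
    return 0;
}

static int ram_write(unsigned sector, const void *buf)
{
    printf("ramdisk: write sector %u\n", sector);
    return 0;
}

static const struct block_dev_ops ramdisk_ops = { ram_read, ram_write };

/* --- the rest of the kernel sees only the interface ------------------- */
static int kernel_read(const struct block_dev_ops *dev, unsigned sector, void *buf)
{
    return dev->read_sector(sector, buf);
}

int main(void)
{
    char buf[512];
    kernel_read(&ramdisk_ops, 7, buf);
    return 0;
}

The ramdisk code can be built and tested entirely on its own; whether it
ends up linked into the kernel image or running as a separate server
changes the performance, not the modularity.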
What *does* make sense is keeping as many device drivers as possible out
of the kernel itself. Non-essential services should not be in the kernel.
http://en.wikipedia.org/wiki/Microkernel
So, let's get rid of the context switches by running each process in
its own real or virtual (ie, multithreaded) CPU. Then nobody can crash
the kernel. A little hardware protection for DMA operations makes even
device drivers safe.
You underestimate the power of software bugs - you'll *always* be able
to crash the kernel!
The context switches in this case are completely irrelevant to
reliability. The issue with microkernels and context switches is purely
a matter of performance - they cost a lot of time, especially since they
involve jumps to a different processor mode or protection ring. If you
want to produce a cpu that minimises the cost of a context switch
through hardware acceleration, then it would definitely be a good idea
and would benefit microkernel OS's in particular. But it's a
performance improvement, not a reliability improvement. Other hardware
for accelerating key OS concepts such as locks or IPC would help too.
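
If you want a feel for the numbers, a crude way to measure it on any
POSIX system is to bounce a byte between two processes through pipes -
each round trip forces at least a couple of context switches, though the
result also includes the pipe and syscall overhead, so treat it as an
upper bound:

/* Rough measurement of the cost under discussion: two processes bounce a
 * byte through a pair of pipes, forcing at least two context switches per
 * round trip.  POSIX only; the figure includes pipe and syscall overhead
 * as well as the switches themselves. */
#include <stdio.h>
#include <unistd.h>
#include <sys/time.h>
#include <sys/wait.h>

#define ROUNDS 100000

int main(void)
{
    int ab[2], ba[2];              /* parent->child and child->parent pipes */
    char c = 'x';
    struct timeval t0, t1;

    if (pipe(ab) < 0 || pipe(ba) < 0) {
        perror("pipe");
        return 1;
    }

    if (fork() == 0) {             /* child: echo every byte straight back */
        for (int i = 0; i < ROUNDS; i++) {
            if (read(ab[0], &c, 1) != 1 || write(ba[1], &c, 1) != 1)
                break;
        }
        _exit(0);
    }

    gettimeofday(&t0, NULL);
    for (int i = 0; i < ROUNDS; i++) {   /* parent: send and wait for echo */
        write(ab[1], &c, 1);
        read(ba[0], &c, 1);
    }
    gettimeofday(&t1, NULL);
    wait(NULL);

    double us = (t1.tv_sec - t0.tv_sec) * 1e6 + (t1.tv_usec - t0.tv_usec);
    printf("%.2f us per round trip (at least two context switches each)\n",
           us / ROUNDS);
    return 0;
}

On most systems you'll see something on the order of microseconds per
round trip - pure overhead of exactly the kind such hardware could attack.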
Deja vu, I guess.
"Scientific minds" are often remarkably ready to attack new ideas,
rather than playing with, or contributing to them. I take a lot of
business away from people like that.
And I'm no dreamer: I build stuff that works, and people buy it.
So do I - but we both make and sell practical solutions which are a step
beyond our competitors. We would not try to sell something that looks
like a revolutionary new idea at first sight but turns out to be terribly
impractical to implement, and lacking the very benefits we first imagined.
There's nothing wrong with dreaming, quite the opposite. But you have
to be able to see when it is nothing but a dream.
mvh.,
David