Maker Pro

another bizarre architecture

Walter Banks

I suspect you would have liked the NS32000: 32 bit, 16 registers, hex
assembly, near perfect instruction set symmetry. Real purty in my book.

One of the better examples where inefficient software tools killed a perfectly
good processor.

w..
 
Didi

which is, of course, event driven software. which is (AIUI) what windows
is all about. perhaps that explains it.

You must be joking? While event and interrupt are quite different
things, to claim that windows is "event driven" with its many seconds range
latencies is laughable at best. A wanna-be event driven, maybe :).

Dimiter
 
Terry Given

Didi said:
You must be joking? While event and interrupt are quite different
things, to claim that windows is "event driven" with its many seconds range
latencies is laughable at best. A wanna-be event driven, maybe :).

I never said it was good, and the last bit ("perhaps..." et al) is
indicative of my low opinion of windoze.

[snip]

tell me about it. A couple of years back I developed some testers that
used a PC to talk to a range of little blue I/O boxes. The PC(s) were >=
1GHz pentiummyjigs, and our pc guru (who is good) couldn't even get a
guaranteed 1ms interrupt out of the poxy OS.

Cheers
Terry
 
Didi

I never said it was good, and the last bit ("perhaps..." et al) is
indicative of my low opinion of windoze.

I thought this was the case, although I now see my posting did
not show it.

tell me about it. A couple of years back I developed some testers that
used a PC to talk to a range of little blue I/O boxes. The PC(s) were >=
1GHz pentiummyjigs, and our pc guru (who is good) couldn't even get a
guaranteed 1ms interrupt out of the poxy OS.

Oh I am sure nobody can even dream of 1 ms latency with windows.
Some time ago, when they had only NT, a guy told me 22 ms was
the best achievable (he was living in a windows world, though, so
I don't know if this was possible or wishful thinking).

Dimiter

You must be joking? While event and interrupt are quite different
things, to claim that windows is "event driven" with its many seconds range
latencies is laughable at best. A wanna-be event driven, maybe :).

I never said it was good, and the last bit ("perhaps..." et al) is
indicative of my low opinion of windoze.

[snip]

tell me about it. A couple of years back I developed some testers that
used a PC to talk to a range of little blue I/O boxes. The PC(s) were >=
1GHz pentiummyjigs, and our pc guru (who is good) couldnt even get a
guaranteed 1ms interrupt out of the poxy OS.

Cheers
Terry
 
Terry Given

Didi said:
I thought this was the case, although I now see my posting did
not show it.




Oh I am sure nobody can even dream of 1 ms latency with windows.
Some time ago, when they had only NT, a guy told me 22 ms was
the best achievable (he was living in a windows world, though, so
I don't know if this was possible or wishful thinking).

ISTR that's about what chris offered me. Absolute shit, even a poxy PIC
can do better.


BTW, if you want to top post, snip out all the other stuff - there was
enough in your top post to make it all superfluous. That way us
bottom-posters don't have to :)

Cheers
Terry
 
Ken Smith

You must be joking? While event and interrupt are quite different
things, to claim that windows is "event driven" with its many seconds range
latencies is laughable at best. A wanna-be event driven, maybe :).


"Event driven" is a classification of how something operates. A "Pinto"
was a car. Windows uses the event FIFO model. This ensures that the
events are taken in turn. It doesn't ensure that they are acted on
quickly. This model actually makes it harder to react quickly to events
but it saves having to implement event commutation. Consider this
happening:

Disk operation complete
Mouse moved to the right 10 mickeys
Printer port interrupt
Serial interrupt
Mouse button clicked

You can safely move the mouse action down the list to after the serial
interrupt. In an interrupt priority system it could be. Windows, however,
can't easily do this sort of thing.
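
A rough sketch of the difference, in Python. The priority numbers are made up for illustration, and the model assumes all five events are already pending when dispatch starts; it is not how any particular OS implements its queues. The two mouse events share one priority so that their relative order is preserved.

```python
import heapq
from collections import deque

# Hypothetical priorities for illustration only (lower number = more urgent).
PRIORITY = {
    "serial interrupt": 0,
    "printer port interrupt": 1,
    "disk operation complete": 2,
    "mouse moved": 3,
    "mouse button clicked": 3,
}

# Arrival order taken from the list above.
ARRIVALS = [
    "disk operation complete",
    "mouse moved",
    "printer port interrupt",
    "serial interrupt",
    "mouse button clicked",
]


def fifo_order(events):
    """Dispatch strictly in arrival order, as an event FIFO does."""
    queue = deque(events)
    order = []
    while queue:
        order.append(queue.popleft())
    return order


def priority_order(events):
    """Dispatch the most urgent pending event first; ties keep arrival order."""
    heap = [(PRIORITY[name], seq, name) for seq, name in enumerate(events)]
    heapq.heapify(heap)
    order = []
    while heap:
        _, _, name = heapq.heappop(heap)
        order.append(name)
    return order


if __name__ == "__main__":
    print("FIFO:    ", fifo_order(ARRIVALS))
    print("Priority:", priority_order(ARRIVALS))
```

In the priority version the serial interrupt is handled first and the mouse movement drops behind it, which is the reordering described above.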
 
Ken Smith

Didi wrote: [....]
tell me about it. A couple of years back I developed some testers that
used a PC to talk to a range of little blue I/O boxes. The PC(s) were >=
1GHz pentiummyjigs, and our pc guru (who is good) couldnt even get a
guaranteed 1ms interrupt out of the poxy OS.


There are special drivers for serial ports that get about that sort of
timing.


The trend these days is to offload the work from the PC to some external
box. This way you can have the PC only set the parameters and run the
user interface. The actual work is done by a much more capable processor
such as an 8051.
 
Vladimir Vassilevsky

John said:
I don't like interrupts. The state of a system can become
unpredictable if important events can happen at any time. A
periodically run, uninterruptable state machine has no synchronization
problems. Interrupts to, say, put serial input into a buffer, and
*one* periodic interrupt that runs all your little state blocks, are
usually safe. Something like interrupting when a switch closes can get
nasty.

Like everything else, this approach has its limits.

1. Once the number of states in the state machine gets over a hundred,
the code is very difficult to manage. The dependencies keep growing.
Changing anything can be a pain in the butt. It is almost
impossible to verify all kinds of transitions between the states. For
that reason it is very easy to overlook something.

2. There are kinds of tasks which call for multithreading: caching,
hashing, calculations, vector graphics and such. Those tasks can be
organized as state machines, but it is going to be messy.
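
For reference, the scheme John describes (a receive buffer filled by the serial interrupt, plus one periodic tick that runs every state block) might be sketched like this in Python. The class names, the two-sample debounce rule and the line-based protocol are assumptions made up for the example. Each block here has only a handful of states; the limits listed above show up when there are many more.

```python
from collections import deque
from enum import Enum, auto


class SwitchState(Enum):
    OPEN = auto()
    MAYBE_CLOSED = auto()
    CLOSED = auto()


class Debouncer:
    """Polled state block: the switch is sampled once per tick, never by interrupt."""

    def __init__(self):
        self.state = SwitchState.OPEN

    def step(self, raw_closed):
        if not raw_closed:
            self.state = SwitchState.OPEN
        elif self.state is SwitchState.OPEN:
            self.state = SwitchState.MAYBE_CLOSED  # first closed sample
        else:
            self.state = SwitchState.CLOSED  # second consecutive closed sample
        return self.state


class LineAssembler:
    """Polled state block: drains the serial buffer and collects complete lines."""

    def __init__(self, rx_buffer):
        self.rx = rx_buffer  # filled elsewhere (in real code, by the serial ISR)
        self.partial = []
        self.lines = []

    def step(self):
        while self.rx:
            ch = self.rx.popleft()
            if ch == "\n":
                self.lines.append("".join(self.partial))
                self.partial = []
            else:
                self.partial.append(ch)


def tick(debouncer, assembler, raw_switch):
    """Body of the single periodic interrupt: run every state block once."""
    assembler.step()
    return debouncer.step(raw_switch)
```

Because `tick` runs to completion and is the only place state changes, the blocks never see each other half-updated.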

Vladimir Vassilevsky

DSP and Mixed Signal Design Consultant

http://www.abvolt.com
 
nospam

Didi said:
Oh I am sure nobody can even dream of 1 ms latency with windows.
Some time ago, when they had only NT, a guy told me 22 ms was
the best achievable (he was living in a windows world, though, so
I don't know if this was possible or wishful thinking).

There is no inherent reason for high interrupt latencies on PCs running
Windows.

I did quite detailed testing on a fast PC running Windows 2000 Server: an
edge on an input pin triggered an interrupt which flipped an output pin.

The delay between input and output edges was nominally 17us -0 +4 which
occasionally stretched to +15 during intense disc activity. The system was
quite happy taking interrupts at 10kHz.

That of course was interrupt latency to a driver interrupt handler, not to
an associated DPC or back through the scheduling system to application
level event handlers.

You do rely on interrupt handlers in other drivers completing promptly; some,
particularly network card drivers, are poor in this respect.

--
 
Didi

The delay between input and output edges was nominally 17us -0 +4 which
occasionally stretched to +15 during intense disc activity. The system was
quite happy taking interrupts at 10kHz.

While 17 us is a bit too long for a GHz range CPU, this is a sane
figure.
I have seen people complain about hundreds of milliseconds response
time to the INT line of ATA drives, but this may have been on windows
95/98 which were... well, nothing one could call "working".

That of course was interrupt latency to a driver interrupt handler, not to
an associated DPC or back through the scheduling system to application
level event handlers.

Well plain user experience is enough to see the latencies there, they
are in the seconds range, sometimes tens of seconds. They may learn
how to do this in another 10 years time...

Dimiter
 
Paul Keinanen

You cannot get a _guaranteed_ 1 ms response from standard Windows
or Linux (or in fact from any system with virtual memory, without
locking all referenced pages into memory), but you can get it for
perhaps 95 % to 99 % of all events.

One way to test response times is to run a half duplex slave protocol
in the device to be tested. This will test the latencies from the
serial card to kernel mode device driver into the user mode protocol
code and then back to the device. Observe the pause between the last
character of the request and the first character of the response with
an oscilloscope or serial line analyzer. With 1 ms serial line
analyzer time stamp resolution, the two way latency was somewhere
between 0 and 2 ms (or 1-2 character times at 9600 bit/s).

Oh I am sure nobody can even dream of 1 ms latency with windows.
Some time ago, when they had only NT, a guy told me 22 ms was
the best achievable (he was living in a windows world, though, so
I don't know if this was possible or wishful thinking).

A few years ago I did some tests with NT4 on 166 MHz and the 20 ms
periodic wakeup occurred within +/-2 ms more than 99 % of the time,
provided that no user interactions happened at the same time. With
user interactions, the worst case wakeup observed was about 50 ms.

Of course, any application using SetTimer can be delayed by seconds,
if the user grabs a window and shakes it all over the screen :).
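
A crude user-mode version of this kind of periodic wakeup test can be sketched in Python. This is not the test described above, just an illustration of the idea: the function names are invented, and the interpreter adds its own overhead, so the numbers only hint at what the OS scheduler is doing.

```python
import time


def measure_wakeup_errors(period_s=0.020, cycles=50):
    """Sleep until each deadline of a fixed schedule; return how late each wakeup was (s)."""
    errors = []
    next_deadline = time.perf_counter() + period_s
    for _ in range(cycles):
        delay = next_deadline - time.perf_counter()
        if delay > 0:
            time.sleep(delay)
        errors.append(time.perf_counter() - next_deadline)
        # Advance from the schedule, not from "now", so errors do not accumulate.
        next_deadline += period_s
    return errors


def fraction_within(errors, tolerance_s):
    """Fraction of wakeups whose error magnitude is within the tolerance."""
    return sum(1 for e in errors if abs(e) <= tolerance_s) / len(errors)


if __name__ == "__main__":
    errs = measure_wakeup_errors()
    print("within 2 ms:", fraction_within(errs, 0.002))
    print("worst error (ms):", max(errs) * 1000)
```

Running it once while the machine is idle and once while dragging a window around gives a feel for the difference between the typical and the worst case.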

Paul
 
Paul Keinanen

Didi wrote: [....]
tell me about it. A couple of years back I developed some testers that
used a PC to talk to a range of little blue I/O boxes. The PC(s) were >=
1GHz pentiummyjigs, and our pc guru (who is good) couldnt even get a
guaranteed 1ms interrupt out of the poxy OS.


There are special drivers for serial ports that get about that sort of
timing.


The trend these days is to offload the work from the PC to some external
box. This way you can have the PC only set the parameters and run the
user interface. The actual work is done by a much more capable processor
such as an 8051.

There are protocol-specific intelligent I/O processors that do all
the protocol handling, but RocketPort 8-32 line multiplexer cards,
for instance, simply implement deep Rx and Tx FIFOs for each
channel in an ASIC. No interrupts are used; the driver scans all
Rx FIFOs once every 1-10 ms and each FIFO is emptied at each scan. The
Tx side works in a similar way.

The latency with such cards does not depend so much on the number
of active channels or number of bytes in a channel, but rather on
the scan rate. So if the scan period is 10 ms, the Rx-processing-Tx two
way latency is 10-20 ms regardless of the number of lines. At 115200
bit/s, about 120 characters can arrive between scans with a 10 ms scan
period. However, if the received message ends just after the previous
scan, there can be a pause of more than 100 character times before the
response is sent.
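
The arithmetic behind those figures can be checked with a few lines of Python. The 10 bits per character (8N1 framing) and the simplified model, in which Rx and Tx are each serviced only at scan instants, are assumptions for illustration and not taken from the driver.

```python
BITS_PER_CHAR = 10  # assumption: 8N1 framing (start bit + 8 data bits + stop bit)


def chars_per_scan(bit_rate, scan_period_s):
    """Characters that can arrive on one line between two scans."""
    return bit_rate / BITS_PER_CHAR * scan_period_s


def round_trip_latency_bounds(scan_period_s):
    """(best, worst) Rx-processing-Tx latency in the simplified model.

    The request waits between 0 and one period to be picked up, and the
    response then waits one more period for the next Tx service.
    """
    return scan_period_s, 2 * scan_period_s


if __name__ == "__main__":
    print(chars_per_scan(115200, 0.010))      # roughly 115 characters
    print(round_trip_latency_bounds(0.010))   # 10 ms to 20 ms
```

With exact 8N1 framing the count comes out slightly under the rounded 120 quoted above, which does not change the conclusion.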

Paul
 
Paul Keinanen

"Event driven" is a classification of how something operates. A "Pinto"
was a car. Windows uses the event FIFO model. This ensures that the
events are taken in turn. It doesn't ensure that they are acted on
quickly. This model actually makes it harder to react quickly to events
but it saves having to implement event commutation. Consider this
happening:

Disk operation complete
Mouse moved to the right 10 mickeys
Printer port interrupt
Serial interrupt
Mouse button clicked

You can safely move the mouse action down the list to after the serial
interrupt. In an interrupt priority system it could be. Windows, however
can't easily do this sort of thing,

What you are describing sounds very much like the 16 bit Windows 3.x
style single thread system as well as 32 bit windowed applications.

However, at least in the Windows NT family, you can run console
multithread applications with ordinary main() and synchronisation
primitives similar to those used in RSX11/VMS.

If the timing is important in your Windows application, stay away from
windowed applications and use console applications instead.

Paul
 
Grant Edwards

There are protocol-specific intelligent I/O processors that do
all the protocol handling, but RocketPort 8-32 line multiplexer
cards, for instance, simply implement deep Rx and Tx FIFOs for
each channel in an ASIC. No interrupts are used; the driver
scans all Rx FIFOs once every 1-10 ms and each FIFO is emptied
at each scan. The Tx side works in a similar way.

The latency with such cards does not depend so much on the
number of active channels or number of bytes in a channel, but
rather on the scan rate. So if the scan period is 10 ms, the
Rx-processing-Tx two way latency is 10-20 ms regardless of the
number of lines. At 115200 bit/s, about 120 characters can
arrive between scans with a 10 ms scan period. However, if the
received message ends just after the previous scan, there can
be a pause of more than 100 character times before the response
is sent.

As one of the maintainers of that driver, I thought I might
comment on the reasoning behind the decision not to use an
interrupt-driven scheme. For a few channels with sporadic data
flow, using interrupts makes a lot of sense -- the driver isn't
using up CPU time unless there is data to be transferred, and
data is handled with low latency.

However, the RocketPort driver is intended to support a large
number of channels with high throughput rates: it supports up
to 256 serial ports at 921.6K baud all in use at 100%. For a
large number of ports with heavy usage, polling the boards at a
fixed interval results in a much lower overhead than handling
interrupts from up to 256 different UARTs.

There's a definite trade-off between low-latency and efficient
high-throughput, and the RocketPort driver leans towards the
latter. [Though CPU speed has increased so much that polling at
1ms isn't really much overhead and provides pretty good
latency.]
 
nospam

Didi said:
While 17 us is a bit too long for a GHz range CPU, this is a sane
figure.

But the PCI bus doesn't run at GHz and has lots of overhead for a single
cycle I/O access. On the same system the minimum width of a software
generated pulse on an I/O line was about 4us.

--
 
Paul Keinanen

As one of the maintainers of that driver, I thought I might
comment on the reasoning behind the decision not to use an
interrupt-driven scheme. For a few channels with sporadic data
flow, using interrupts makes a lot of sense -- the driver isn't
using up CPU time unless there is data to be transferred, and
data is handled with low latency.

However, the RocketPort driver is intended to support a large
number of channels with high throughput rates: it supports up
to 256 serial ports at 921.6K baud all in use at 100%. For a
large number of ports with heavy usage, polling the boards at a
fixed interval results in a much lower overhead than handling
interrupts from up to 256 different UARTs.

No doubt, the RocketPort cards are targeted for ISPs running a large
number of modem lines using a full duplex protocol like PPP. In such
applications, the scan rate or line turn-around time is really not an
issue.

However, in any half-duplex protocol, the latencies and line
turnaround times can seriously degrade the system throughput. A 10 ms
delay at 115k2 corresponds to about 120 characters of dead time, which is
catastrophic for the throughput when using short messages.
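
To put a rough number on that, here is a small Python sketch. The function name, the 8-character message sizes and the single turnaround delay per transaction are assumptions chosen for illustration; real protocols may add further gaps.

```python
def line_efficiency(request_chars, response_chars, turnaround_s,
                    bit_rate, bits_per_char=10):
    """Fraction of a request/response transaction during which the line carries data."""
    char_time = bits_per_char / bit_rate
    busy = (request_chars + response_chars) * char_time
    return busy / (busy + turnaround_s)


if __name__ == "__main__":
    # Hypothetical short exchange: 8-character request, 8-character response.
    for turnaround_ms in (0, 1, 10):
        eff = line_efficiency(8, 8, turnaround_ms / 1000, 115200)
        print(f"{turnaround_ms:>2} ms turnaround: {eff:.0%} of line time is data")
```

Under these assumptions a 10 ms turnaround leaves the line idle for most of each transaction, while a 1 ms turnaround keeps it busy more than half the time.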

There's a definite trade-off between low-latency and efficient
high-throughput, and the RocketPort driver leans towards the
latter. [Though CPU speed has increased so much that polling at
1ms isn't really much overhead and provides pretty good
latency.]

The 1 ms poll rate has not been an issue even on processors below 1
GHz.

While the poll rate for Linux 2.4 drivers was usually 1/HZ, I stumbled
on some driver version that forced a 10 ms poll time even for kernels
with HZ > 100. When forcing the poll rate to 1 ms (HZ=1000), the
throughput and latency performance was quite acceptable at serial
speeds below 115k2.

Paul
 
Ken Smith

[....]
Disk operation complete
Mouse moved to the right 10 mickeys
Printer port interrupt
Serial interrupt
Mouse button clicked

You can safely move the mouse action down the list to after the serial
interrupt. In an interrupt priority system it could be. Windows, however
can't easily do this sort of thing,

What you are describing sounds very much like the 16 bit Windows 3.x
style single thread system as well as 32 bit windowed applications.

Yes, applications that are tied to the user interface have the problem
right up at the surface. In other applications, the FIFO model is more
hidden but it is still there under the surface. The task dispatching
still tends to take the tasks in the order of the events.

The code does not run very much as interrupt code. When the interrupt
happens, the fact is recorded and then the code returns to being
non-interrupt code. It is what happens next that matters at this
point.

However, at least in the Windows NT family, you can run console
multithread applications with ordinary main() and synchronisation
primitives similar to those used in RSX11/VMS.

You still can't get quick response times on things like serial ports. The
problem is at the OS level.
 
John Larkin

But the PCI bus doesn't run at GHz and has lots of overhead for a single
cycle I/O access. On the same system the minimum width of a software
generated pulse on an I/O line was about 4us.

But what was the *maximum* width of that pulse? I'd expect that, under
Windows, the probability tail is still nonzero at some number of
seconds.

John
 
nospam

John Larkin said:
But what was the *maximum* width of that pulse? I'd expect that, under
Windows, the probability tail is still nonzero at some number of
seconds.

It was generated in a driver interrupt handler so the maximum was also
about 4us barring PCI bus contention from other bus masters.
--
 
Paul Keinanen

Yes applications that are tied to the user interface have the problem
right up at the surface. In other applications, the FIFO model is more
hidden but it is still there under the surface. The task dispatching
still tends to take the task in the order of the events.

Sounds like all the threads waiting for various events are running at
the same priority level.

On older Windows NT versions, only the thread priority levels 16, 22-26
and 31 were available in the realtime priority class, so assigning
priorities to various threads was quite tricky. Starting from Windows
2000 the levels 16-31 are available.

In non-realtime priority classes, round-robin scheduling, priority
boost for interactive threads etc. make predicting timing more or
less pointless.

Paul
 