Maker Pro

Larkin, Power BASIC cannot be THAT good:

Jan Panteltje

runtime in C on a 2.4GHz "intel core2 duo" machine is about
3 seconds, about 2 when multithreaded.

dunno about that, those beasts are mainly focused on floating point
performance, and this is an integer problem.

Cray was (maybe still is?) a vector processing hardware platform.
With vector processing I mean this:

memory
add memory
memory
address counter

So basically clock from one memory into the next via the adder.
That is very very fast for huge data sets that all need the same operation.

I have done some hardware like that :)
No memory has 'float', it is all integers of some size :)



What does your C code look like? Mine is in the other posting.


here's the multithreaded one I used

#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <pthread.h>

static long S[64000000];
static short A[64000000];

/* one half of the arrays, handed to one thread */
typedef struct argstype
{
    short *a_start;
    long *s_start;
    long count;
} argstype;

/* add count samples from a_start into the sums at s_start */
static void *thread_func(void *vptr_args)
{
    argstype *arg=vptr_args;
    long *sp=arg->s_start;
    short *ap=arg->a_start;
    long x=arg->count;

    do
        *sp++ += *ap++;
    while (--x);
    return NULL;
}

int main(int argc,char **argv)
{
    long y,n;

    n=10;
    if(argc>1)
        n=atoi(argv[1]);

    for(y=0;y<n;++y)
    {
        argstype instance_a={A,S,32000000};
        argstype instance_b={A+32000000,S+32000000,32000000};
        pthread_t thread_a;

        // start a thread to process half the data
        if (pthread_create(&thread_a, NULL, thread_func, &instance_a) != 0)
        {
            perror("pthread_create");
            return EXIT_FAILURE;
        }
        // process the other half in the foreground
        thread_func(&instance_b);

        // wait for the thread to finish.
        if (pthread_join(thread_a, NULL) != 0)
        {
            perror("pthread_join");
            return EXIT_FAILURE;
        }
    }
    return 0;
}

OK, I tried that, gcc -o test10 test10.c -lpthread
but it won't run on the eeePC, it simply says:
Killed
Looks like 512 MB is not enough :)
And only one core of course.
On another PC with 385964k RAM it starts swapping:
~# nice -n -19 ./test10
nice -n -19 ./test10 19.25s user 0.63s system 99% cpu 20.046 total
 
Martin Brown

Jasen said:
It looks like it, with -O4 optimisation on dual-threaded and single-threaded
versions both run in 1.9 seconds here.

Yes. It is bandwidth limited on the external ram because of the way it
hits sequential locations once per loop. Using DDR2 ram execution is
roughly 2s and older DDR ram is about 3s in execution. The absolutely
dire code generated by debug mode with all optimisation disabled is only
4s. So it isn't really a test of PowerBASIC's code generation at all.

The real test of an optimising compiler is how well it can exploit cache
aware algorithms with multiple nested loops. A cache aware version of
this vector add and accumulate (without using SIMD instructions) is
about 15% faster than the simple loop on a good optimising compiler
(actually that is measured on MSC I haven't checked its code generation
for optimisation - too tedious). I suspect gcc might be better.
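One common cache-aware rearrangement, when the same input array is added in several times (as in the 10x benchmark loop discussed in this thread), is to process one cache-sized block at a time. The following is only a sketch of that idea and not necessarily the version measured above; `accumulate_blocked` and `BLOCK_ELEMS` are made-up names and the block size is an assumption to be tuned per machine.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical block size: small enough that one block of both arrays
   should fit in a typical L2 cache. Tune per machine. */
#define BLOCK_ELEMS 16384

/* Add a[] into s[] `reps` times, one block at a time, so each block is
   fetched from DRAM once and then reused from cache for the other passes. */
static void accumulate_blocked(long *s, const short *a, size_t n, int reps)
{
    assert(s != NULL && a != NULL);
    for (size_t start = 0; start < n; start += BLOCK_ELEMS) {
        size_t end = start + BLOCK_ELEMS;
        if (end > n)
            end = n;
        for (int r = 0; r < reps; ++r)
            for (size_t i = start; i < end; ++i)
                s[i] += a[i];
    }
}
```

This only pays off if the input really is the same on every pass. With fresh data per pass there is nothing to reuse and the loop stays limited by memory bandwidth.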

The optimisation totally disabled version compiled by MSC is:

for (i = 0; i < ARRAY_SIZE; i++) {
004010BE mov dword ptr [i],0
004010C5 jmp main+0D0h (4010D0h)
004010C7 mov ecx,dword ptr [i]
004010CA add ecx,1
004010CD mov dword ptr [i],ecx
004010D0 cmp dword ptr [i],3D09000h
004010D7 jge main+0F7h (4010F7h)
s[i] += a[i];
004010D9 mov edx,dword ptr [i]
004010DC mov eax,dword ptr [a]
004010DF movsx ecx,word ptr [eax+edx*2]
004010E3 mov edx,dword ptr [i]
004010E6 mov eax,dword ptr [s]
004010E9 add ecx,dword ptr [eax+edx*4]
004010EC mov edx,dword ptr [i]
004010EF mov eax,dword ptr [s]
004010F2 mov dword ptr [eax+edx*4],ecx
}
004010F5 jmp main+0C7h (4010C7h)

And takes 4s on P4 3GHz with DDR ram.
Optimiser turns it into something like

mov esi, a
mov edi, s
mov ecx, ARRAY_SIZE
xor edx, edx
forloop:
movsx eax, word ptr[esi+2*edx]
add dword ptr[edi+4*edx], eax
inc edx
loop forloop

And this extremely tight optimised loop code still takes 3s with DDR ram
and just under 2s with DDR2 ram.

MSC has changed the way it does this. My oldest compiler generates a
pointer implementation for array access whereas newer compilers exploit
the additional scaled indexing features of later CPUs. The oldest
compiler generates very slightly faster code for the loop (at least it
does after a manual reordering of the generated instructions).

forloop:
movsx eax,word ptr[esi]
add esi, 2
add dword ptr[edi], eax
add edi, 4
loop forloop

Code snippets above subject to typos.
Be interested to see what code gcc -O4 generates.

Regards,
Martin Brown
 
Nico Coesel

Jan Panteltje said:
Even C++ compiler writers could at one point no longer agree what it was about :)
In my view you must be complete idiot if you invent 'operator overloading',
and write needlessly long things with :: in between.

But ... as with many things, many went for it, many cannot do anything else.
Maybe my strong opinion comes from the fact that I started with hardware,
then BASIC, then asm...., and then finally C.
A bit of php too, nice for websites.
Never ever needed the ++ in C for anything, never.

At some point object oriented programming turns out to be very
convenient. It is possible to use OO programming techniques in plain
C, but it is a pain in the ass and memory allocation becomes tedious
(just look at the Linux kernel). That's where C++ kicks in. I have the
same background but I gradually started to use more and more C++.
 
Vladimir Vassilevsky

Jan Panteltje wrote:

I really do not know for WHAT?

Sure. Let's do everything in assembler. Or, better, directly in the
machine codes.

I will give an example from the Linux area about C++, and how bad it is;
You must (as a Linux user) have heard of Qt.
So... but it made Trolltech some money.

That's the point.
As a money making machine the commerce in C++ compilers, C++ books, and as a stimulus
for ever more powerful hardware buying, C++ does the hardware manufacturers some service.
Now I am not against GUIs, I am against C++ related bloat crap...

1. Bloated crap makes money.
2. C++ is just a tool. If used properly, it helps avoiding certain kinds
of the dumb mistakes and simplifies the cooperation between the developers.
Wow, almost 1.6 MB!!!!
Now if I do (du reports in kB):
du /usr/local/Trolltech/Qt-4.1.0/
348855 /usr/local/Trolltech/Qt-4.1.0/
349 MB YES three hundred forty nine Mega Byte!

Who cares.
If you do not mind my asking : What is that incredible application you need that ++ in C for?

If you don't mind my questions, what was the largest program piece that
you wrote yourself? What was the largest commercial software project
that you participated in as part of a team?


Vladimir Vassilevsky
DSP and Mixed Signal Design Consultant
http://www.abvolt.com
 
Nico Coesel

Jan Panteltje said:
I really do not know for WHAT?

The whole idea behind OO is to keep the data and the functions that
process the data together in one unit. Even more, functions and data
that are internal are hidden from other objects in the program. Sounds
complicated but the static keyword makes symbols hidden to other
object files in C.

Imagine you create a library. The library obviously has internal
functions and data that you don't want to export so you declare them
static or tell the linker not to export some symbols. That is already
some sort of OO programming. OO programming is like building your
application from small internal libraries.
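To make the point about `static` concrete, here is a minimal sketch of one such "small internal library" in a single C file. The module and all names in it (`counter_next`, `counter_reset`, `clamp_to_limit`) are made up for illustration.

```c
#include <assert.h>

/* counter.c: a tiny "module". Only counter_reset() and counter_next()
   have external linkage; everything marked static is invisible to
   other object files. */

static int current_value;            /* private state */

static int clamp_to_limit(int v)     /* private helper */
{
    return v > 1000 ? 1000 : v;
}

void counter_reset(void)
{
    current_value = 0;
}

int counter_next(void)
{
    assert(current_value >= 0);      /* invariant kept by this file only */
    current_value = clamp_to_limit(current_value + 1);
    return current_value;
}
```

Another file can call `counter_next()` but cannot touch `current_value` or `clamp_to_limit()`, and can define its own symbols with those names without a link-time clash.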

Another thing they added is object inheritance so you can create
derived classes from a base class. Last year I wrote a class that can
draw signals on a 'canvas' (drawing area). From this class I derived
two new classes: one that draws to a printer canvas and one that draws
to the screen canvas. Of course the same thing can be done in plain C,
but the code would have been much harder to understand.

Under the hood a C++ object is nothing more than a pointer to a struct
containing pointers to data and functions (vtable).
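As a rough illustration of the "under the hood" remark, the canvas example can be written in plain C with an explicit table of function pointers. This is a sketch of how a typical implementation lays things out, not what any particular C++ compiler emits; every type and function name below is hypothetical.

```c
#include <assert.h>
#include <stdio.h>

struct canvas;

/* The table of "virtual functions" shared by all objects of one class. */
struct canvas_vtable {
    void (*draw_line)(struct canvas *self, int x1, int y1, int x2, int y2);
};

struct canvas {
    const struct canvas_vtable *vt;  /* what C++ hides as the vtable pointer */
    int lines_drawn;                 /* data common to every canvas */
};

static void screen_draw_line(struct canvas *self, int x1, int y1, int x2, int y2)
{
    printf("screen: (%d,%d)-(%d,%d)\n", x1, y1, x2, y2);
    self->lines_drawn++;
}

static void printer_draw_line(struct canvas *self, int x1, int y1, int x2, int y2)
{
    printf("printer: (%d,%d)-(%d,%d)\n", x1, y1, x2, y2);
    self->lines_drawn++;
}

static const struct canvas_vtable screen_vt  = { screen_draw_line };
static const struct canvas_vtable printer_vt = { printer_draw_line };

/* "Base class" code: draws a signal without knowing the target device. */
static void draw_signal(struct canvas *c, const int *samples, int n)
{
    assert(c != NULL && c->vt != NULL);
    for (int i = 1; i < n; ++i)
        c->vt->draw_line(c, i - 1, samples[i - 1], i, samples[i]);
}
```

A caller would set up `struct canvas scr = { &screen_vt, 0 };` and pass `&scr` to `draw_signal()`. Swapping in `&printer_vt` changes the output device without touching the drawing code, which is the bookkeeping a C++ compiler does for you.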
The whole paradigm (difficult word had to look it up one time) or
say perhaps "concept" of "objects" and "object oriented" to me only
means that the one who talks about it does not realize that
processors normally (Turing machines) process things sequentially,

Sorry, but this doesn't make any sense to me. OO has nothing to do
with multi-threading or parallel processing.
No problem with Linux kernel in C, it would have been super-bloated and dead
long time ago if Linus had allowed C++ in it.

I will give an example from the Linux area about C++, and how bad it is;
You must (as a Linux user) have heard of Qt.

Yup. Heard of it. I stayed away from Qt since the beginning. Never
liked the fact it is actually closed source.
Qt is some toolset to create a GUI interface... basically.

It is way more, it is intended to create cross platform applications.
So you can have one source which builds for several platforms.
Therefore Qt must provide a universal replacement for several Windows
and Linux APIs. That takes a lot of code! I use wxWidgets for that
purpose.
Qt3 was big, and heavy, KDE desktop and a whole lot of applications for that, were,
like objects, like widgets on a screen.
If you want a good, fast, small, practically stable, almost bug free, toolkit to make GUIs in Linux
use xforms, I use it all the time, libforms is:
-rwxr-xr-x 1 root root 1580536 2008-02-04 14:42 /usr/X11R6/lib/libforms.so.1.0*
Wow, almost 1.6 MB!!!!

I use wxWidgets to write portable applications (I do most development
on a Windows machine). I admit wxWidgets is bigger than libforms but
wxWidgets does a lot more. I still manage to put X-Windows, GTK2
libraries + engines, a wxWidgets C++ application and misc Linux stuff
(busybox, ftp server, etc) into 12MB of flash with several MB to spare
on an embedded device (industrial control application).
Well, drugs are addictive too.

If you do not mind my asking : What is that incredible application you need that ++ in C for?

See above. But I doubt C++ has anything to do with Qt being bloatware.

Look into Gnash (gnu flash player). Entirely written in C++. It also
runs on the hardware platform I mentioned above with plenty of space
to spare.
 
Frank Buss

Jan said:
For commercial apps it is nice, as you do not have to release source then of
your embedded systems that you sell.
You could buy a Qt license, and you would be OK using it commercially.
Or are you telling me you are not releasing source and using Linux
apps like busybox (see your statement below) in your products?

Nokia bought Trolltech, and now Qt is LGPL for all platforms. Some time ago,
before Nokia bought the company, you could buy a commercial licence for
Linux, embedded platforms and Windows, or you could use the GPL version
(not for Windows). Now you can use it with commercial applications for all
platforms for free, if you publish all changes you made to the Qt library.
But with LGPL you don't have to publish the source code of your program any
more.
 
Nico Coesel

Jan Panteltje said:
Yes, but this can be done in C in functions with local vars very easy.

But not global variables inside one object file. C++ adds the ability
to have multiple levels of variable scopes.
I think not, even the contrary, anytime you define a function and
give it a sensible name like (referring to your example):

#define DESTINATION_PRINTER 1
#define DESTINATION_SCREEN 2
... draw_line (int destination, int x1, int x2, int y1, int y2),

and you see that even in the C code example I posted early in this or a related thread,
it can be called from anywhere in a program simply by name.
If you correctly prototype it, then there will be no errors in the way
of calling it with the wrong variables.
And do parameter checks in the function itself.
Important is to use clearly readable defines, function names, and variables.

Okay, imagine working with several people on one project. You create a
function called draw_circle and your co worker creates a function
called draw_circle. Now that is a bit of a problem if you are not
using objects. Either you need to refactor or he does.
Oh, but it does!

Absolutely wrong. Where did you get that idea?
Yes that Wxwidgets or whatever it is called today did not want to compile here, needed things...
So rm * -rf fixed that.

You should have read the manual, installed the proper packages and run
configure. wxWidgets compiles out of the box on Windows.
mmmm
I have the macromedia? flashplayer.

FYI that doesn't run on MIPS platforms.
 
Nobody

At some point object oriented programming turns out to be very
convenient. It is possible to use OO programming techniques in plain
C, but it is a pain in the ass and memory allocation becomes tedious
(just look at the Linux kernel). That's where C++ kicks in. I have the
same background but I gradually started to use more and more C++.

It's easy enough to implement an OO-style class hierarchy in C. The real
kicker is construction and destruction. C++ lets you ensure that an
object's constructor is called before anything else uses it, and its
destructor is called when it goes out of scope, whether by a return or
by an exception.
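For comparison, this is roughly what the same guarantee costs in plain C: init and destroy functions that the programmer must remember to call on every path. It is a sketch with made-up names (`struct trace`, `trace_init`, `trace_destroy`, `process_two_traces`), not code from any poster.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical "class": a buffer of samples. */
struct trace {
    int *samples;
    size_t count;
};

/* "Constructor": returns 0 on success, -1 on allocation failure. */
static int trace_init(struct trace *t, size_t count)
{
    assert(t != NULL);
    t->samples = calloc(count, sizeof *t->samples);
    t->count = t->samples ? count : 0;
    return t->samples ? 0 : -1;
}

/* "Destructor": safe to call on an already-destroyed trace. */
static void trace_destroy(struct trace *t)
{
    free(t->samples);
    t->samples = NULL;
    t->count = 0;
}

/* Every exit path has to reach the right cleanup label by hand. */
static int process_two_traces(size_t n)
{
    struct trace a, b;
    int rc = -1;

    if (trace_init(&a, n) != 0)
        return rc;
    if (trace_init(&b, n) != 0)
        goto destroy_a;

    /* ... real work would go here ... */
    rc = 0;

    trace_destroy(&b);
destroy_a:
    trace_destroy(&a);
    return rc;
}
```

Nothing stops a caller from using a `struct trace` before `trace_init()` or forgetting `trace_destroy()` on an early return. In C++ the compiler inserts those calls itself.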
 
Martin Brown

That is about right for a non-optimising naive native code compiler that
saves every loop variable back to memory. When fully optimised to be all
in registers the figure should come down to 2.2s or so.

There is a small difference between the BASIC and C. You are using
signed INTEGERs he is using unsigned. It shouldn't affect the runtime
though provided that the C compiler generates the right opcodes.

That is believable. Most compilers get between 0.22 and 0.3 depending on
how fast the memory subsystem is under sustained sequential access.

Have you tested his code on your PC and your code on his embedded
system? It could be that DMA transfer of raw data is robbing him of
memory bandwidth. Or the Kontron board has other memory speed issues.
for (multiply = 0; multiply < 10; multiply ++) // 10 x
{
    for ( index = 0; index < DATA_ARRAY_SIZE; ++index )
        sum_data[index] += inbound_data[index];
}

It is difficult to see how even the dumbest compiler could get this to
take more than 0.5s per loop on modern hardware. Be interesting to see
the generated code for this loop. If it looks sensible then we can
establish that you are looking at a hardware problem.

Only to boneheaded BASIC hackers.
The assembly code produced by the C program is only five opcodes, and
appears to be about as smart as it can be. The only improvement I can
suggest is to count down, not up, so a simple test for zero can end
the loop.

It is the memory subsystem that isn't performing. You could add
additional computation to the loop and it should not affect the timing.
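For reference, the count-up and count-down forms of the inner loop look like this in C. The element types (`long` sums, `short` samples) are an assumption borrowed from the pthread program earlier in the thread, and the function names are made up.

```c
#include <assert.h>
#include <stddef.h>

/* Count-up form, as in the quoted loop. */
static void accumulate_up(long *sum, const short *in, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        sum[i] += in[i];
}

/* Count-down form: the loop ends on a simple test against zero. */
static void accumulate_down(long *sum, const short *in, size_t n)
{
    assert(sum != NULL && in != NULL);
    while (n-- > 0)
        sum[n] += in[n];
}
```

Both produce the same sums. The count-down form also walks memory from the top, so any difference in timing may come from cache contents as much as from the cheaper loop test.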

Regards,
Martin Brown
 
Martin Brown

My first instinct was that it might have created an object where
indexing was in multiples of 6 bytes, but it has correctly padded to 8.
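The padding is easy to check from C itself. A small sketch, assuming the user-defined type is a 32-bit `x` followed by a 16-bit `y` (the member types are a guess from the generated code, and `struct packet` is just an illustrative name):

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical layout: a 32-bit sum next to a 16-bit sample. */
struct packet {
    int   x;   /* running sum */
    short y;   /* incoming sample */
};

/* The first member always sits at offset 0. */
static_assert(offsetof(struct packet, x) == 0, "x is the first member");

/* Bytes of tail padding the compiler appended to each element. */
static size_t packet_tail_padding(void)
{
    return sizeof(struct packet)
         - (offsetof(struct packet, y) + sizeof(short));
}

static void print_packet_layout(void)
{
    printf("sizeof(struct packet) = %zu, y at offset %zu, tail padding %zu\n",
           sizeof(struct packet), offsetof(struct packet, y),
           packet_tail_padding());
}
```

On the usual x86 ABIs this should report 8 bytes per element with 2 bytes of padding, which matches the `8*eax` scaling in the generated loop.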

The explanation is that with two distinct arrays you have something else
to do inside the loop whilst waiting for the cache to load. The
structure you have created hits the same block again far too quickly and
so ends up waiting on both every time. Worse, when the write-through
cache is active, your reads from the second chunk in the same dirty cache
line are compromised.

Because data lengths are different part way through the old loop with
the original code the movsx 16 bit fetch becomes available in cache. The
32 bit fetches and stores are always running ahead of the cache's ability
to satisfy them.

The standard method for cache aware algorithms is to work on as many
distinct cache blocks simultaneously as the architecture will allow.

That way you are using the delay time of the first fetch constructively.
I have to say on the newest Pentiums there is almost no difference. None
of the tricks I know will speed it up significantly past 0.22s on my
box. There isn't enough work being done inside the inner loop.
; arr[i].x += arr[i].y;
;
?live1@128: ; EAX = i, EBX = arr, EDX = j, ESI = startTime
@6:
movsx ecx,word ptr [ebx+8*eax+4]
add dword ptr [ebx+8*eax],ecx
inc eax
cmp eax,64000000
jl short @6
inc edx
cmp edx,10
jl short @4
;
; }
;
; endTime = GetTickCount();



What does the "packet" thing do? I can't find it in any of the C or
C++ keyword lists.


Creates a user defined type 8 bytes long containing a .x and a .y
Does it round up the struct size? The loop appears to be crunching
8-byte chunks, so two bytes are dead weight that wastes cache.

Although it might waste cache space having matched size operands in
separate arrays might still be faster - SIMD vector instructions are
better for that case. SSE 128bit registers can do 4 parallel 32 bit adds
moving 16 bytes at a time. You would need to test it.
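A sketch of what the SSE variant could look like with compiler intrinsics, four 32-bit adds per iteration. It assumes 32-bit sums and 16-bit samples in separate arrays; `accumulate_simd` is a made-up name, and the code falls back to the plain loop when SSE2 is not available. Whether it beats the scalar loop on a memory-bound problem would still have to be measured.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#if defined(__SSE2__)
#include <emmintrin.h>   /* SSE2 intrinsics */
#endif

/* s[i] += a[i] for 32-bit sums and 16-bit samples. */
static void accumulate_simd(int32_t *s, const int16_t *a, size_t n)
{
    size_t i = 0;
    assert(s != NULL && a != NULL);
#if defined(__SSE2__)
    for (; i + 4 <= n; i += 4) {
        /* Load four 16-bit samples into the low 64 bits. */
        __m128i a16 = _mm_loadl_epi64((const __m128i *)(a + i));
        /* Sign-extend to 32 bits: duplicate each word into both halves
           of a dword, then arithmetic-shift right by 16. */
        __m128i a32 = _mm_srai_epi32(_mm_unpacklo_epi16(a16, a16), 16);
        __m128i sum = _mm_loadu_si128((const __m128i *)(s + i));
        _mm_storeu_si128((__m128i *)(s + i), _mm_add_epi32(sum, a32));
    }
#endif
    for (; i < n; ++i)   /* scalar tail, or the whole loop without SSE2 */
        s[i] += a[i];
}
```

The unaligned load and store keep the sketch safe for any array address. With 16-byte aligned arrays the aligned variants could be used instead.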

Regards,
Martin Brown
 
Nico Coesel

Jan Panteltje said:
But then C++ is not really an OO language :)

OO has nothing to do with concurrency either. You should do some rm -f
on your bookshelf. Did you by any chance get books by Ammeraal?
 
Nico Coesel

John Larkin said:
The assembly code produced by the C program is only five opcodes, and
appears to be about as smart as it can be. The only improvement I can
suggest is to count down, not up, so a simple test for zero can end
the loop.

The basic program calculates 64 million numbers, the C version 67.1
million. Very short tests are not very useful for exact comparison
of run times because the OS timers aren't that precise. You should
make the test last for at least 20 seconds to cancel effects of task
switching. The powerbasic and C version should perform equally.
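One way to get longer, steadier measurements is to repeat the workload until a minimum wall-clock time has passed and report the average per call. The sketch below uses POSIX `clock_gettime()`; `time_per_call` and `bump_counter` are made-up names, and the trivial workload is only a stand-in for the real loop.

```c
#define _POSIX_C_SOURCE 200809L  /* for clock_gettime under strict -std= modes */
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Run work(ctx) repeatedly until at least min_seconds of wall-clock time
   have elapsed; return the average seconds per call. */
static double time_per_call(void (*work)(void *), void *ctx, double min_seconds)
{
    struct timespec t0, t1;
    long calls = 0;
    double elapsed;

    assert(work != NULL);
    clock_gettime(CLOCK_MONOTONIC, &t0);
    do {
        work(ctx);
        ++calls;
        clock_gettime(CLOCK_MONOTONIC, &t1);
        elapsed = (double)(t1.tv_sec - t0.tv_sec)
                + (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
    } while (elapsed < min_seconds);

    return elapsed / (double)calls;
}

/* Stand-in workload: counts how often it was called. */
static void bump_counter(void *ctx)
{
    ++*(long *)ctx;
}
```

With the real benchmark loop as the workload, `time_per_call(run_benchmark, &args, 20.0)` would give the 20-second run suggested above (`run_benchmark` and `args` being whatever wraps the loop under test).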
 
Martin Brown

John said:
On Thu, 21 May 2009 09:03:56 +0100, Martin Brown


It's interesting that people seem to be converging to around 0.22
seconds on various machines. I guess we all buy the same DRAM chips.

It is dominated by memory speed under sustained sequential access.
Has anyone tried it on a box with DDR3 ram?
My down-count loop, in Basic, is hitting about 0.207 or so.

It would be interesting to see the code it has generated. I see only a
tiny difference at all between counting up and counting down. There is a
slight gain in doing the opposite of your initialisation code since then
the cache will be preloaded with useful data first time around.

It would also be interesting in a curious sort of way to see the code
that gcc without optimisation generated that was so incredibly slow. It
was way off the mark at 0.7s if the box was capable of 0.2s (even the
Mickeysoft compiler manages 0.4s without any optimisation).

I changed my initialisation code to count down so that the cache would
be preloaded with relevant data at the beginning. It helps on the older
P4s but does nothing at all on the fastest new box.

Regards,
Martin Brown
 
Vladimir Vassilevsky

John said:
A programmer will spend all his career programming, so wants to play
with shiny new toys.

No.

The development of the mass software is a sweat shop production line at
which the low wage employees work from 7:00 till 5:00.
As an engineer, I only program when I absolutely
have to, and want to get it done ASAP, wrapped up, bug free, and not
have to revisit it, so I can get back to designing electronics, which
is much more fun.

You are the engineer. They are the farmers.
So I use simple tools that are easy to remember how
to use, in both the short term and the long term.

So they use the tools which allow them to put the things together so it
somehow works. On schedule and within the budget.
Imagine having to go
back and modify an embedded-product program that was coded in an
industry-approved, academically-beloved state-of-the-art structured
language: Pascal.

What go back? Ship and forget. The product will be obsolete in less than
one year anyway. All of the programmer staff will be turned around by
then as well.


Vladimir Vassilevsky
DSP and Mixed Signal Design Consultant
http://www.abvolt.com
 
Nico Coesel

Jan Panteltje said:
Well, I have
http://en.wikipedia.org/wiki/Object_oriented

and maintain that C++ is not really an OO language.

It also points out what I pointed out before, the relation between GUI related
programming and object oriented programming that seems to exist.

For the rest I do not care, C gives me the freedom to write in any
way I like, without the ++ crap getting in the way.
The argument that C++ would be better for groups is killed by the existence
of Linux, and many Linux applications.

In my view, that what is considered 'OO' is just a help for people who cannot program,
or not have the ability to think analytically enough to construct
a program that works in a normal way.

What really happened with C++, is what happened with me on a much smaller scale for example
in comp.os.development.apps
I needed at some point to compare 2 files, on 2 http servers, original and mirror.
So I wondered if that was possible from my eeePC without actually downloading those long files
as 1) it has only few bytes memory left as it is cram full of extra stuff
I added, and 2) it wears out the FLASH if you store those files.
So I googled, asked... there was no such utility like 'diff' via the network.
So I started writing one, not a small job really.
Then somebody mentioned why not use 'wget' and pipes.
Now that was really attractive, so I wrote a very short C program that used
wget and popen() to do the job, works perfectly.
Then after that somebody said: 'you can do that in bash:
cmp <(wget -O - url1) <(wget -O - url2)'
or something like that.
So what I wrote I did not need.
I wrote it because I did not have enough experience with bash.
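The short wget-plus-popen() program was not posted, but the idea can be sketched as follows: open both commands as pipes and compare their output byte by byte, so nothing is written to disk. `outputs_identical` is a made-up name and the return convention is an assumption.

```c
#define _POSIX_C_SOURCE 200809L  /* for popen/pclose under strict -std= modes */
#include <assert.h>
#include <stdio.h>

/* Compare the standard output of two shell commands byte by byte.
   Returns 1 if identical, 0 if they differ, -1 if a command could not
   be started. */
static int outputs_identical(const char *cmd_a, const char *cmd_b)
{
    FILE *pa, *pb;
    int ca, cb, same = 1;

    assert(cmd_a != NULL && cmd_b != NULL);
    pa = popen(cmd_a, "r");
    if (pa == NULL)
        return -1;
    pb = popen(cmd_b, "r");
    if (pb == NULL) {
        pclose(pa);
        return -1;
    }

    /* Stop at the first mismatch or when both streams hit EOF together. */
    do {
        ca = getc(pa);
        cb = getc(pb);
        if (ca != cb)
            same = 0;
    } while (same && ca != EOF);

    pclose(pa);
    pclose(pb);
    return same;
}
```

For the mirror check the two commands would be something like `wget -q -O - url1` and `wget -q -O - url2`, with the real URLs filled in.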
IN THE SAME WAY Stroustrup started writing a lot of crap on top of C because he did not know
how to program in C (or know how to program at all if you ask me).
Now newbies into programming, the poor suckers that come into that realm, fall for that,
they do not know how to program either, like old rats like me and Larkin etc,
so they think they _need_ all that ++ crap.
And parrot the reasons the sales druid gave when selling them the C++ compiler.
Is that clear?

I'm inclined to recapture the above as 'Don't know about OO. Don't
want to know about OO. Therefore OO must be bad'.

If you can't program, nothing is going to help. OO doesn't make
programming simpler. It's just another way to look at programming and
organise modules in a piece of software in a more logical way. Using OO
doesn't make your program slower either.
 
Jim Thompson

I think a very good example of this is SPICE3: Even though it was specifically
re-written in C (rather than FORTRAN) to make it easily extensible/changeable,
anyone looking to actually implement some extensions faces a much tougher row
to hoe than if it had been written in C++ from the start.

Hmm... I wonder what Mike coded LTSpice in?

I think the first code I was actually *paid* to write might have been in
REXX -- summer internship at IBM after my senior year of high school!

Spice2 _was_ written in Fortran. I had to implement a modification of
Gummel-Poon (~1980) so that Cbe was properly represented during
forward bias... it doesn't keep increasing (as originally modeled)...
it actually drops rapidly.

...Jim Thompson
--
| James E.Thompson, P.E. | mens |
| Analog Innovations, Inc. | et |
| Analog/Mixed-Signal ASIC's and Discrete Systems | manus |
| Phoenix, Arizona 85048 Skype: Contacts Only | |
| Voice:(480)460-2350 Fax: Available upon request | Brass Rat |
| E-mail Icon at http://www.analog-innovations.com | 1962 |

Stormy on the East Coast today... due to Bush's failed policies.
 
Rich Grise

Jan said:
Bloated and slow.
MS is an example how that ++ can get out of hand, adding nothing of real interest, real muscle,
but tons of bloat and bugs.
When you write a "language" that illiterates can program in, you get
illiterate programmers. ;-)

Cheers!
Rich
 
Nico Coesel

Jan Panteltje said:
I'm inclined to recapture the above as 'Don't know about OO. Don't
want to know about OO. Therefore OO must be bad'.

Did you actually read the link?
Up to where it says Criticism?
<quote>
Criticism
object-oriented programming as a false annunciation. Usually this claim is
view that it is a superior way to program."[8]
A study by Potok et al. [9] has shown no significant difference in productivity between OOP and
procedural approaches.
Christopher J. Date stated that critical comparison of OOP to other technologies,
relational in particular, is difficult because of lack of an agreed-upon and
rigorous definition of OOP.[10]. In [11], a theoretical foundation on OOP is proposed.
Alexander Stepanov suggested that OOP provides a mathematically-limited viewpoint and called it,
"almost as much of a hoax as Artificial Intelligence" (possibly referring to the
Artificial Intelligence projects and marketing of the 1980s that are
sometimes viewed as overzealous in retrospect).[12][13]
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
I like that one :) :) Right on!

Paul Graham, a successful web entrepreneur and programming author, has suggested that the
purpose of OOP is to act as a herding mechanism which keeps mediocre
programmers in mediocre organizations from "doing too much damage". This is
at the expense of slowing down productive programmers who know how to use
more powerful and more compact techniques. [1]
<end quote>

99.9% of all programmers world wide are using OO to make their jobs
easier. Based on the opinions (I see no technical reasoning in the
piece you quoted) of a few critics you say 99.9% of the programmers
are wrong. If the critics were right, we would not have *any* piece
of software available OR nobody would use OO. Therefore it is safe to
say OO is a technique which has significant advantages.

People tend to adopt technology that has an added value. I think the
quick adoption of C# is a good example.

Paul Graham obviously likes to make a fool out of himself (I had a
good laugh about so much nonsense and opinions posed as facts):
http://www.paulgraham.com/noop.html

I'm inclined to recapture the text in noop.html as 'Don't know about
OO. Don't want to know about OO. Therefore OO must be bad'. At least
at the end of the piece Paul admits he has no clue and no experience
with OO programming.
 
Nobody

Well, I have
http://en.wikipedia.org/wiki/Object_oriented

and maintain that C++ is not really an OO language.

C++ is most definitely an OO language. The OO features account for most
of the differences between C and C++.
It also points out what I pointed out before, the relation between GUI related
programming and object oriented programming that seems to exist.

OO is a natural fit for GUI toolkits, although it doesn't have to use C++.
Both Xt and GTK are written in C but provide OO interfaces.
For the rest I do not care, C gives me the freedom to write in any
way I like, without the ++ crap getting in the way.
The argument that C++ would be better for groups is killed by the existence
of Linux, and many Linux applications.

In my view, that what is considered 'OO' is just a help for people who cannot program,
or not have the ability to think analytically enough to construct
a program that works in a normal way.

Given that you don't use OO and don't appear to even understand it, your
opinion of OO doesn't really have a great deal of weight.

Of course, it's *possible* to write any application without using an OO
language or even OO concepts. But OO can provide very definite benefits in
terms of modularity.
 