Maker Pro
Maker Pro

Distributed microcontroller computing

X

xarvia

I've got a pure math problem implemented in C that will take about 3
years to solve using all 5pcs available to me (the algorithm is about
as efficient as it will get without some major mathematical insights).

The funny thing about this problem is that I can easily split it into
smaller problems each of which only requires about 1000 bytes of memory
and only uses bitwise operations and additions mod 256.

I was considering the possibility of creating a network of
microcontrollers, probably using I2C, with one main processor dishing
out smaller sub problems and reporting results back to a PC. To date I
have only used Microchip PICs, primarily the 18f4550 and done my
programming in MPLAB ASM. I think I could easily implement this setup
using these chips but they are too slow (40MHz about 20Mips I think).

I have heard that there are some high speed microcontrollers out there,
I had a quick look at Freescales range of PowerPC based embedded
processors. These are about $25/800Mips - as far as I can tell this
would be significantly cheaper on a per MIP basis than buying computer
hardware (even omitting the hdd and using minimal ram).

My questions:
1. What is the best $/Mips microcontroller out there, preferably with
I2C.

2. What is the best $/Mips microcontroller that I will be able to
understand without too much trouble given that I have only used
Microchip products so far?

3. The technical docs for the Freescale embedded processors left me
somewhat confused (quite a jump up from microchip microcontrollers!!!)
what external components do these devices need to run? In particular is
the program code stored in the device or does it obtain this via the
bus?

4. Now I'm not sure about the gory technical details but I assume that
given the simplicity of the instructions my program needs to execute
the MIPS figures for microcontrollers should be fairly comparable with
that for desktops. The main difference I can see here would be
regarding branching - I've heard that modern processors go to great
lengths to predict which instructions will be executed so that no time
is wasted while retrieving program instructions...

5. Another possibility that would keep me busy for a while is
implementing the program in assembler on my PC. Unfortunately every
computer that I have access to has a different architecture
(Athlon,Sempron,Duron,2xPentium M) so I guess I would have to restrict
myself to x86 instruction set? What sort of performance increase do you
think this might produce?

Thank you very much for your time and I look forward to hearing from
you!
 
Z

zwsdotcom@gmail.com

I've got a pure math problem implemented in C that will take about 3
years to solve using all 5pcs available to me (the algorithm is about
as efficient as it will get without some major mathematical insights).

If you can factor this problem as well as you claim, and it is only
going to take 5,475 PC-days to solve it, you're much better off to
either:

1. rent distributed computer time, or

2. buy as many PCs as it will take to solve the problem in the desired
timeframe. Bear in mind that they only need minimal RAM and a tiny hard
disk (or flash card) and network connection, no display or anything
else. Probably each node can be built for ~$100. If you want the answer
in one month, you need 182 computers. If you're happy to get it in six
months, you only need 31 computers (or rather, 26 additional machines -
$2,600, much less than the cost of developing your custom hardware and
probably the same timeframe).

Unless this is an academic exercise - and it sounds more like you want
to get into a specific piece of encrypted data - you're wasting time
and money developing and debugging a custom multiprocessor engine.
 
R

Rich Grise

I've got a pure math problem implemented in C that will take about 3
years to solve using all 5pcs available to me (the algorithm is about
as efficient as it will get without some major mathematical insights).

The funny thing about this problem is that I can easily split it into
smaller problems each of which only requires about 1000 bytes of memory
and only uses bitwise operations and additions mod 256.

The Connection Machine! ;-)
I was considering the possibility of creating a network of
microcontrollers, probably using I2C, with one main processor dishing
out smaller sub problems and reporting results back to a PC. To date I
have only used Microchip PICs, primarily the 18f4550 and done my
programming in MPLAB ASM. I think I could easily implement this setup
using these chips but they are too slow (40MHz about 20Mips I think).

How many chips are you planning to use? Are you saying it will take
5 X 3 PC-years? What happens if you do this same thing over an ordinary
LAN?

I've written 8087 code, to do the inner loop for a Mandelbrot set on an
AT. (anybody remember them?) Don't all PCs these days (I assume by
"5pcs" you mean, "five personal computers") have a math coprocessor?

If you can't do this with today's multi-GHz PCs and a decent LAN, how
many micros are you planning on throwing at this problem? Can I have
a piece of the action? ;-)

Good Luck!
Rich
 
X

xarvia

I realise that building/renting distributed computing time would be the
more efficient way to go were I to take my own time into account.
However this is just a hobby/academic exercise. I should probably have
explained that in my initial post. I think setting this up would be an
interesting exercise and would teach me a lot that I want to understand
about microcontrollers/asm/distributed computing.

Also you can heave a sigh of relief that I'm not attempting to crack
into encrypted data! The problem I'm looking at is one that a few of
the other postgrads and I have in the Math dept had a bit of a play
with. As far as I can tell there is absolutely no practical application
to anything arising from this other than having an answer!!!:

The problem is to assign one of 5 colours for each of the integers from
1 upwards such that if any two integers have the same colour the
integer that they sum to must have a different colour:

eg.
If we have
1-green 2-blue 3-green

then 4 cannot be green or blue as 1+3 = 4 and 2+2 = 4.

A correct sequence of 160 digits for 5 colours is known. I wish to find
a sequence of 162 digits.
 
X

xarvia

Hi Rich,

Communication speed between the processors is of relatively little
importance, depending on how the processes are setup. I suggested I2C
because that's what I'm familiar with with microcontrollers thus far.

The reason I thought of microcontrollers is that I'm utilising very
little of the sophistication of a modern computer and none of the
baggage that comes along with it in terms of motherboard etc. So I
thought perhaps you could do the same with some of the faster
microcontrollers out there which I think can run up to 1000Mips.

I'm just a student so it wouldn't be too many micros. If you're keen to
donate CPU time I'm sure I can find a use for it - Internet mersenne /
SETI @ home style.

Cheers!
 
T

Tim Wescott

xarvia said:
I've got a pure math problem implemented in C that will take about 3
years to solve using all 5pcs available to me (the algorithm is about
as efficient as it will get without some major mathematical insights).

The funny thing about this problem is that I can easily split it into
smaller problems each of which only requires about 1000 bytes of memory
and only uses bitwise operations and additions mod 256.

I was considering the possibility of creating a network of
microcontrollers, probably using I2C, with one main processor dishing
out smaller sub problems and reporting results back to a PC. To date I
have only used Microchip PICs, primarily the 18f4550 and done my
programming in MPLAB ASM. I think I could easily implement this setup
using these chips but they are too slow (40MHz about 20Mips I think).

I have heard that there are some high speed microcontrollers out there,
I had a quick look at Freescales range of PowerPC based embedded
processors. These are about $25/800Mips - as far as I can tell this
would be significantly cheaper on a per MIP basis than buying computer
hardware (even omitting the hdd and using minimal ram).

My questions:
1. What is the best $/Mips microcontroller out there, preferably with
I2C.

2. What is the best $/Mips microcontroller that I will be able to
understand without too much trouble given that I have only used
Microchip products so far?

3. The technical docs for the Freescale embedded processors left me
somewhat confused (quite a jump up from microchip microcontrollers!!!)
what external components do these devices need to run? In particular is
the program code stored in the device or does it obtain this via the
bus?

4. Now I'm not sure about the gory technical details but I assume that
given the simplicity of the instructions my program needs to execute
the MIPS figures for microcontrollers should be fairly comparable with
that for desktops. The main difference I can see here would be
regarding branching - I've heard that modern processors go to great
lengths to predict which instructions will be executed so that no time
is wasted while retrieving program instructions...

5. Another possibility that would keep me busy for a while is
implementing the program in assembler on my PC. Unfortunately every
computer that I have access to has a different architecture
(Athlon,Sempron,Duron,2xPentium M) so I guess I would have to restrict
myself to x86 instruction set? What sort of performance increase do you
think this might produce?

Thank you very much for your time and I look forward to hearing from
you!
The Freescale 32-bit microcontrollers (at least the ones that I know of)
don't have any on-board ROM, flash or otherwise. You'd have to design a
PC board with flash, debug it, then build a bunch of iterations.

Even if your time is free by the time you have something working you'll
have spent more than $100/ea -- PC boards are cheap when you build them
in huge quantities, but not when you do onsie-twosie.
 
T

Tim Wescott

xarvia said:
I realise that building/renting distributed computing time would be the
more efficient way to go were I to take my own time into account.
However this is just a hobby/academic exercise. I should probably have
explained that in my initial post. I think setting this up would be an
interesting exercise and would teach me a lot that I want to understand
about microcontrollers/asm/distributed computing.

Also you can heave a sigh of relief that I'm not attempting to crack
into encrypted data! The problem I'm looking at is one that a few of
the other postgrads and I have in the Math dept had a bit of a play
with. As far as I can tell there is absolutely no practical application
to anything arising from this other than having an answer!!!:

The problem is to assign one of 5 colours for each of the integers from
1 upwards such that if any two integers have the same colour the
integer that they sum to must have a different colour:

eg.
If we have
1-green 2-blue 3-green

then 4 cannot be green or blue as 1+3 = 4 and 2+2 = 4.

A correct sequence of 160 digits for 5 colours is known. I wish to find
a sequence of 162 digits.
If you just see how many cheap motherboards you can pack into one case
with all their ethernet ports wired together you'll learn lots about
your chosen subject -- particularly if you write your core algorithms in
assembly. You'll also end up spending less $/mip than you would if you
tried to build stuff in small quantities from scratch, even if your time
is free.
 
J

John Larkin

I've got a pure math problem implemented in C that will take about 3
years to solve using all 5pcs available to me (the algorithm is about
as efficient as it will get without some major mathematical insights).

The funny thing about this problem is that I can easily split it into
smaller problems each of which only requires about 1000 bytes of memory
and only uses bitwise operations and additions mod 256.

I was considering the possibility of creating a network of
microcontrollers, probably using I2C, with one main processor dishing
out smaller sub problems and reporting results back to a PC. To date I
have only used Microchip PICs, primarily the 18f4550 and done my
programming in MPLAB ASM. I think I could easily implement this setup
using these chips but they are too slow (40MHz about 20Mips I think).

I have heard that there are some high speed microcontrollers out there,
I had a quick look at Freescales range of PowerPC based embedded
processors. These are about $25/800Mips - as far as I can tell this
would be significantly cheaper on a per MIP basis than buying computer
hardware (even omitting the hdd and using minimal ram).

My questions:
1. What is the best $/Mips microcontroller out there, preferably with
I2C.

2. What is the best $/Mips microcontroller that I will be able to
understand without too much trouble given that I have only used
Microchip products so far?

3. The technical docs for the Freescale embedded processors left me
somewhat confused (quite a jump up from microchip microcontrollers!!!)
what external components do these devices need to run? In particular is
the program code stored in the device or does it obtain this via the
bus?

4. Now I'm not sure about the gory technical details but I assume that
given the simplicity of the instructions my program needs to execute
the MIPS figures for microcontrollers should be fairly comparable with
that for desktops. The main difference I can see here would be
regarding branching - I've heard that modern processors go to great
lengths to predict which instructions will be executed so that no time
is wasted while retrieving program instructions...

5. Another possibility that would keep me busy for a while is
implementing the program in assembler on my PC. Unfortunately every
computer that I have access to has a different architecture
(Athlon,Sempron,Duron,2xPentium M) so I guess I would have to restrict
myself to x86 instruction set? What sort of performance increase do you
think this might produce?

Thank you very much for your time and I look forward to hearing from
you!


It's likely that your algorithm can be implemented in an FPGA, perhaps
entirely in parallel, one iteration per clock. Then one largish FPGA
may be able to crunch tens or even hundreds of copies every clock, and
clock at 100 MHz or more. This could potentially out-compute the
fastest Pentium by a factor on the order of thousands.

You can buy inexpensive PCI boards with one or several Xilinx FPGAs on
board.

What's the problem being solved?

John
 
J

Jim Thompson

It's likely that your algorithm can be implemented in an FPGA, perhaps
entirely in parallel, one iteration per clock. Then one largish FPGA
may be able to crunch tens or even hundreds of copies every clock, and
clock at 100 MHz or more. This could potentially out-compute the
fastest Pentium by a factor on the order of thousands.

You can buy inexpensive PCI boards with one or several Xilinx FPGAs on
board.

What's the problem being solved?

John

2+2 ;-)

...Jim Thompson
 
T

Tim Wescott

John said:
It's likely that your algorithm can be implemented in an FPGA, perhaps
entirely in parallel, one iteration per clock. Then one largish FPGA
may be able to crunch tens or even hundreds of copies every clock, and
clock at 100 MHz or more. This could potentially out-compute the
fastest Pentium by a factor on the order of thousands.

You can buy inexpensive PCI boards with one or several Xilinx FPGAs on
board.

What's the problem being solved?

John
Ooh. Clever. I should have thought of that.

He describes it for Rich the Affluent Wacko, or whatever he's calling
himself these days.
 
R

Rich, Under the Affluence

[crossposted to sci.electronics.design,rec.puzzles, followups not set]

I realise that building/renting distributed computing time would be the
more efficient way to go were I to take my own time into account.
However this is just a hobby/academic exercise. I should probably have
explained that in my initial post. I think setting this up would be an
interesting exercise and would teach me a lot that I want to understand
about microcontrollers/asm/distributed computing.

Also you can heave a sigh of relief that I'm not attempting to crack
into encrypted data! The problem I'm looking at is one that a few of
the other postgrads and I have in the Math dept had a bit of a play
with. As far as I can tell there is absolutely no practical application
to anything arising from this other than having an answer!!!:

The problem is to assign one of 5 colours for each of the integers from
1 upwards such that if any two integers have the same colour the
integer that they sum to must have a different colour:

eg.
If we have
1-green 2-blue 3-green

then 4 cannot be green or blue as 1+3 = 4 and 2+2 = 4.

A correct sequence of 160 digits for 5 colours is known. I wish to find
a sequence of 162 digits.

You've clearly come to the wrong group.

The genii at rec.puzzles will answer this in a day or so. ;-)

Good Luck!
Rich
 
Z

zwsdotcom@gmail.com

I realise that building/renting distributed computing time would be the
more efficient way to go were I to take my own time into account.
However this is just a hobby/academic exercise. I should probably have
explained that in my initial post. I think setting this up would be an
interesting exercise and would teach me a lot that I want to understand
about microcontrollers/asm/distributed computing.

Maybe, but if I was your professor and you used this approach, I'd mark
you down because it's poor engineering practice to invent a wheel where
adequate and affordable solutions exist.

Engineering a device that matches the $/mip of a COTS PC is not a
trivial job, especially if you consider reliability, and it's not
necessary. The micros you are considering don't offer any specific
performance benefit that seems particularly apposite to your
application. Micros that approach the performance level of a desktop PC
CPU are exceedingly difficult to design in - there are board-level
engineering issues that you just don't want to get into, plus the parts
themselves are hard to work with by hand.

Custom massively parallel engines are built when one of the following
is true:

1. It's much cheaper, $/op (where "op" is "iteration of your specific
algorithm" to build custom hardware than use COTS hardware with custom
software, or

2. There is some operation that can be vastly speeded by a custom
processor design.
 
D

dalai lamah

Un bel giorno xarvia digitò:
I've got a pure math problem implemented in C that will take about 3
years to solve using all 5pcs available to me (the algorithm is about
as efficient as it will get without some major mathematical insights).

The funny thing about this problem is that I can easily split it into
smaller problems each of which only requires about 1000 bytes of memory
and only uses bitwise operations and additions mod 256.

Looks like the ideal task of one ore more FPGAs. For tasks like this, you
can reach performances some orders of magnitude better than any general
purpose CPU.
 
T

Tim Wescott

Rich said:
[crossposted to sci.electronics.design,rec.puzzles, followups not set]

I realise that building/renting distributed computing time would be the
more efficient way to go were I to take my own time into account.
However this is just a hobby/academic exercise. I should probably have
explained that in my initial post. I think setting this up would be an
interesting exercise and would teach me a lot that I want to understand
about microcontrollers/asm/distributed computing.

Also you can heave a sigh of relief that I'm not attempting to crack
into encrypted data! The problem I'm looking at is one that a few of
the other postgrads and I have in the Math dept had a bit of a play
with. As far as I can tell there is absolutely no practical application
to anything arising from this other than having an answer!!!:

The problem is to assign one of 5 colours for each of the integers from
1 upwards such that if any two integers have the same colour the
integer that they sum to must have a different colour:

eg.
If we have
1-green 2-blue 3-green

then 4 cannot be green or blue as 1+3 = 4 and 2+2 = 4.

A correct sequence of 160 digits for 5 colours is known. I wish to find
a sequence of 162 digits.


You've clearly come to the wrong group.

The genii at rec.puzzles will answer this in a day or so. ;-)

Good Luck!
Rich
If you ever start posting as "Under the Effluent" I'm going to worry.

But then you work as a support person at a university or corporation,
don't you? So I guess you're already there, at least partially.
 
C

cs_posting@hotmail.com

xarvia said:
The funny thing about this problem is that I can easily split it into
smaller problems each of which only requires about 1000 bytes of memory
and only uses bitwise operations and additions mod 256.

What you want is not a general purpose processor, but a dedicated
computational unit implemented in an FPGA... or a lot of them.
 
R

Rich, Under the Affluence

Rich said:
[crossposted to sci.electronics.design,rec.puzzles, followups not set]
I realise that building/renting distributed computing time would be the
more efficient way to go were I to take my own time into account.
However this is just a hobby/academic exercise. I should probably have
explained that in my initial post. I think setting this up would be an
interesting exercise and would teach me a lot that I want to understand
about microcontrollers/asm/distributed computing.

Also you can heave a sigh of relief that I'm not attempting to crack
into encrypted data! The problem I'm looking at is one that a few of
the other postgrads and I have in the Math dept had a bit of a play
with. As far as I can tell there is absolutely no practical application
to anything arising from this other than having an answer!!!:

The problem is to assign one of 5 colours for each of the integers from
1 upwards such that if any two integers have the same colour the
integer that they sum to must have a different colour:

eg.
If we have
1-green 2-blue 3-green

then 4 cannot be green or blue as 1+3 = 4 and 2+2 = 4.

A correct sequence of 160 digits for 5 colours is known. I wish to find
a sequence of 162 digits.

You've clearly come to the wrong group.

The genii at rec.puzzles will answer this in a day or so. ;-)
If you ever start posting as "Under the Effluent" I'm going to worry.

But then you work as a support person at a university or corporation,
don't you? So I guess you're already there, at least partially.

I'm a consulting engineer who's on standby most of the time. It doesn't
pay anything unless there's work, but the fringe bennies are fantastic,
like, I get to sit here and play on the computer all day, and the PHB
isn't charging me any rent to park my RV in his lot. :) (I've been
considering sending in a picture to "Redneck Yard" on "Blue Collar TV" -
I'm usually surrounded by big weldments:
http://www.abiengr.com/~sysop/images/RedneckYard1.jpg :) )

Thanks!
Rich
(BTW, "under the affluence" is the short form of "under the affluence
of incohol.", as in, "Hi, I'm John Audubon. I watch birds." "Hi, I'm
Bill Spooner. I watch birds.") ;-)
 
P

petrus bitbyter

xarvia said:
I realise that building/renting distributed computing time would be the
more efficient way to go were I to take my own time into account.
However this is just a hobby/academic exercise. I should probably have
explained that in my initial post. I think setting this up would be an
interesting exercise and would teach me a lot that I want to understand
about microcontrollers/asm/distributed computing.

Also you can heave a sigh of relief that I'm not attempting to crack
into encrypted data! The problem I'm looking at is one that a few of
the other postgrads and I have in the Math dept had a bit of a play
with. As far as I can tell there is absolutely no practical application
to anything arising from this other than having an answer!!!:

The problem is to assign one of 5 colours for each of the integers from
1 upwards such that if any two integers have the same colour the
integer that they sum to must have a different colour:

eg.
If we have
1-green 2-blue 3-green

then 4 cannot be green or blue as 1+3 = 4 and 2+2 = 4.

A correct sequence of 160 digits for 5 colours is known. I wish to find
a sequence of 162 digits.

Looks like interesting to me. Like others said already, I have the feeling a
dedicated eight bits processor may be the best solution. Still I'd like to
know somewhat more about your approach of this problem. I can imagine a
brute force attack, just trying and backtracking, but I guess your
distributed approach will be better. Can you tell more without revealing top
secret objects? For example, how does the program looks like you want your
micros to run?

Some other questions. How does the 160 digits solution looks like and how
has this been calculated? Are you sure the 162 digits sequence exists? Why
not looking at the 161 digits sequence first?

petrus bitbyter
 
R

Rich, Under the Affluence

(BTW, "under the affluence" is the short form of "under the affluence
of incohol.", as in, "Hi, I'm John Audubon. I watch birds." "Hi, I'm
Bill Spooner. I watch birds.") ;-)

Or, "Hi, I'm Gritch Rice. I jotch bokes. ;-D
(apologies to Spehro, or John, or whoever was the originator of that
particular boke. ;-P )

Cheers!
Rich
 
I

Ian Stirling

xarvia said:
I've got a pure math problem implemented in C that will take about 3
years to solve using all 5pcs available to me (the algorithm is about
as efficient as it will get without some major mathematical insights).

The funny thing about this problem is that I can easily split it into
smaller problems each of which only requires about 1000 bytes of memory
and only uses bitwise operations and additions mod 256.

Also.
It may be possible to rewrite your algorithm, using the MMX units of
a processor, or even just the registers, using the processor as a
SIMD machine.
DES IIRC has been done dramatically faster this way than normal.
 
F

Frithiof Andreas Jensen

xarvia said:
I've got a pure math problem implemented in C that will take about 3
years to solve using all 5pcs available to me (the algorithm is about
as efficient as it will get without some major mathematical insights).

The funny thing about this problem is that I can easily split it into
smaller problems each of which only requires about 1000 bytes of memory
and only uses bitwise operations and additions mod 256.

I was considering the possibility of creating a network of
microcontrollers,

Buy an array of them then: http://www.picochip.com/
My questions:
1. What is the best $/Mips microcontroller out there, preferably with
I2C.

The CPU on the PC or the GPU on the Graphics Card!
5. Another possibility that would keep me busy for a while is
implementing the program in assembler on my PC. Unfortunately every
computer that I have access to has a different architecture
(Athlon,Sempron,Duron,2xPentium M) so I guess I would have to restrict
myself to x86 instruction set? What sort of performance increase do you
think this might produce?

A NEGATIVE increase - the optimiser that comes with the compiler is likely
to kick any naive assembler implementation.

The optimiser/compiler will "know" what works best on the target
architecture - in most cases better than the developer.

Why not look at hacking the GPU instead??

The Graphics processor is *much* more powerful than the CPU is - for simple
logic operations - and it is massively parallel to begin with, depending on
how much money you shell on the card.

Finally, hacking GPU's is a HOT a worthy topic of Papers and Presentation,
while grubbing around with mouse controllers and FPGA's is geeky and so
yesterday.

Some meat here:
http://download.developer.nvidia.com/developer/SDK/Individual_Samples/gpgpu_samples.html
 

Similar threads

Top