Maker Pro

Skybuck's Universal Code 5 (The Native Version)


MooseFET

What do you mean by storing it as a group?

Storing it sequentially or storing it as a group is the same thing to me at
least.

So explain yourself if you believe you were not mistaken.

Sequentially means one after the other on the media. Disks are rarely
used this way. Things are usually written in sectors. The sectors
may not be in order. Normally you don't see this when you read and
write because the OS hides this from you.

If you simply allow the OS to work with the data as a single flat
structure you may have one entry that is partly on one sector and
partly on another. This means that to seek to that element and read
it, two sectors must be read. You will take a major speed hit if you
allow that to happen.
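For illustration, with 512-byte sectors and 8-byte entries (a 32-bit value plus a 32-bit link), an entry that starts too close to a sector boundary lands in two sectors. A minimal C sketch of that check (names and layout are only illustrative, not anyone's actual code):

#include <stdint.h>
#include <stdio.h>

#define SECTOR_SIZE 512u

/* One element of the list: a 32-bit value plus a 32-bit link (8 bytes),
   so 512 / 8 = 64 elements fit exactly in one sector. */
struct entry {
    uint32_t value;
    uint32_t next;
};

/* Report whether an entry written at 'offset' straddles a sector
   boundary, i.e. whether reading it back needs two sector reads. */
static int straddles_sector(uint64_t offset, size_t entry_size)
{
    uint64_t first = offset / SECTOR_SIZE;
    uint64_t last  = (offset + entry_size - 1) / SECTOR_SIZE;
    return first != last;
}

int main(void)
{
    printf("%d\n", straddles_sector(508, sizeof(struct entry)));  /* 1: spans sectors 0 and 1 */
    printf("%d\n", straddles_sector(512, sizeof(struct entry)));  /* 0: starts exactly at sector 1 */
    return 0;
}

Keeping entries a power-of-two size and starting each one on a multiple of that size is enough to guarantee the check never fails.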
 

MooseFET

recent invention hahahahaha =D
1. It does look like a linked list. However linked lists are usually used
to store individual elements.
In this case the entire linked list is a single element or integer. It's
one big integer.
A scalable integer for that matter.
2. I believe the linked list is a better structure than an array.

[...]

Do you know about pointer chasing?

It is likely he doesn't. He is very new to the field.
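The structure being described, a single big integer stored as a linked list of 32-bit pieces, might be sketched in C roughly like this (illustrative names only, not Skybuck's actual code):

#include <stdint.h>
#include <stdlib.h>

/* One "sub integer" of the scalable integer: 32 bits of value plus a
   link to the next, more significant, 32 bits. */
struct limb {
    uint32_t value;
    struct limb *next;   /* NULL marks the most significant limb */
};

/* Add a small amount to the big integer, rippling the carry along the
   list and appending a new limb if the carry spills past the end. */
static void big_add(struct limb *n, uint32_t amount)
{
    uint64_t carry = amount;
    while (carry != 0) {
        uint64_t sum = (uint64_t)n->value + carry;
        n->value = (uint32_t)sum;
        carry = sum >> 32;
        if (carry != 0 && n->next == NULL)
            n->next = calloc(1, sizeof *n->next);   /* grow the integer */
        if (n->next == NULL)
            break;
        n = n->next;
    }
}

Every step of big_add has to follow a pointer before it can touch the next 32 bits, which is exactly the pointer-chasing cost being asked about.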
 

MooseFET

One last note:

To truly make it a universal code, the pointers could be relative to themselves.

So the location of the pointer itself functions as base zero.

This way the maximum gap between elements is 2^32 bytes or something like
that.

Which means unlimited amounts of memory can be used, as long as the gap between
the sub-integers never exceeds that limit.

This idea requires extra processing though:

Take the address of the pointer and add the contents of the pointer to the
address to get the final address of the next element.

Not too bad... just a few extra instructions.
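A minimal C sketch of that computation, assuming a signed 32-bit offset is stored where the pointer would normally go (all names are illustrative):

#include <stdint.h>
#include <stddef.h>

/* An element carries its value and a self-relative link: the offset,
   in bytes, from the 'next' field itself to the next element (0 = end). */
struct rel_node {
    uint32_t value;
    int32_t  next;
};

/* Take the address of the pointer and add the contents of the pointer
   to that address to get the address of the next element. */
static struct rel_node *follow(struct rel_node *n)
{
    if (n->next == 0)
        return NULL;
    return (struct rel_node *)((char *)&n->next + n->next);
}

/* Linking is the inverse: store the distance from the field to the target. */
static void link_to(struct rel_node *n, struct rel_node *target)
{
    n->next = (int32_t)((char *)target - (char *)&n->next);
}

Because the links are relative, the whole structure can be copied, memory-mapped, or written to disk at a different base address and still resolve; a signed 32-bit offset limits the gap between neighbouring elements to about 2 GB in either direction.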

Why let it be an instruction? The add process could have its own ALU
section in the CPU. If the pointer is loaded first, there is time
while the data section load is happening for hardware to perform the
add.

If you have a machine with a 128 bit bus, you would be far better off
going to a pair of 64 or 32 and 96 bit values. If you want speed, you
want to use the whole bus.

There are a few operations that can be done more quickly if you can
get to the MSB of the value. You may want to make your linked list a
doubly linked list and keep a pointer to both ends.
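For illustration, the doubly linked variant with handles on both ends might look like this in C (a sketch, not anyone's actual implementation):

#include <stdint.h>

struct dlimb {
    uint32_t      value;
    struct dlimb *next;   /* toward the more significant end */
    struct dlimb *prev;   /* toward the less significant end */
};

/* Keeping a pointer to both ends lets an operation start from whichever
   end is cheaper: carries ripple up from the LSB, while magnitude tests
   and shifts can start from the MSB. */
struct bigint {
    struct dlimb *lsb;
    struct dlimb *msb;
};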
 

SoothSayer

I assume when I tell the OS to write a file sequentially it will do its
best to do so.


That is not one of the commands at your disposal to even "tell" the OS
to execute.

Wrong layer, dipshit.
 

Skybuck Flying

MooseFET said:
Sequentially means one after the other on the media. Disks are rarely
used this way. Things are usually written in sectors. The sectors
may not be in order. Normally you don't see this when you read and
write because the OS hides this from you.

I assume when I tell the OS to write a file sequentially it will do its
best to do so.

If you believe this is not the case then you will have to prove that ;)
If you simply allow the OS to work with the data as a single flat
structure you may have one entry that is partly on one sector and
partly on another. This means that to seek to that element and read
it, two sectors must be read. You will take a major speed hit if you
allow that to happen.

Most sectors are 512 bytes, so 64 pairs of integer and pointer will fit in
it.

Worst-case scenario would be a little offset, so then 62 pairs would fit in
it.

Also harddisks are pretty smart.

Chances are they will read the next sector and cache the previous
sector anyway.

So no additional seeks are needed if the data happens to spill over onto the
previous or next sector.

Worst-case scenario the sectors are somewhere else, which means 1 full read for
the 62 pairs, 1 read for the previous spill,
and 1 read for the next spill.

However, since the file was written sequentially, the next and previous
integers will be cached/read/written as well.

So these can all be considered full reads.

Which means the spilling onto different sectors doesn't really have to be a
problem.

Which means the file fragmentation is the only real problem.

Bye,
Skybuck.
 

SoothSayer

Most sectors are 512 bytes, so 64 pairs of integer and pointer will fit in
it.


Nope. If it is formatted with a file system recognized by the OS or
APP, then there will be overhead on any data segment YOU wish to write to
it. There has to be a "file" written, regardless of how you think you
want to structure your data set.
 

MooseFET

I assume when I tell the OS to write a file sequentially it will do its
best to do so.

It doesn't.
If you believe this is not the case then you will have to prove that ;)

I know this not to be the case and yes I do have proof. I have the
source code for the part of the OS that does this function. I have
also used tools to examine what is really on the disk.
Most sectors are 512 bytes, so 64 pairs of integer and pointer will fit in
it.

Elsewhere in the thread, you will find my comments as to why it may
not be 64 bits per entry if you want it to go fast.

[....]
Chances are they will read the next sector and cache the previous
sector anyway.

You don't understand what really goes on with cached disk operations.
It still drives up the processing time.
So no additional seeks are needed if the data happens to spill over onto the
previous or next sector.

Think about the FAT based system to see why it still takes some time
even if no seek is needed.
 

David L. Jones

Now that a new day has broken it's time to clarify some things about my
recent invention hahahahaha =D

1. It does look like a linked list. However linked lists are usually used to
store individual elements.

In this case the entire linked list is a single element or integer. It's one
big integer.

A scalable integer for that matter.

Rubbish, it's just a linked list. Anything else is an illusion created
in your silly alternate reality.

Dave.
 

Robert Baer

MooseFET said:
Why let it be an instruction? The add process could have its own ALU
section in the CPU. If the pointer is loaded first, there is time
while the data section load is happening for hardware to perform the
add.

If you have a machine with a 128 bit bus, you would be far better off
going to a pair of 64 or 32 and 96 bit values. If you want speed, you
want to use the whole bus.

There are a few operations that can be done more quickly if you can
get to the MSB of the value. You may want to make your linked list a
doubly linked list and keep a pointer to both ends.
Gee, that reminds me of the FORTRAN Increment and Decrement fields as
well as the contractions for their names used in LISP.
 

TheKraken

How does the operating system find free sectors to write data to?

Not that it matters much.

You have no clue what does or does not matter.
I have no access to the underlying file system.

And writing one's own file system is a bit over the top, don't you think? :)

For you, yes. Considering that you do not even know what is going on
with an existing file system, I hardly think you could deal with
authoring a new one.
 

MooseFET

How does the operating system find free sectors to write data to?

On FAT systems, it searches the FAT table looking for a magic number
that indicates a free sector. Many other systems make the free
sectors into a big linked list.

Not that it matters much.

I have no access to the underlying file system.

And writing one's own file system is a bit over the top, don't you think? :)

It makes sense to know a little about what the OS does so that you can
get the most speed out of it.
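As a rough sketch of the FAT case (FAT16-style, with the table already read into memory; in FAT16 a zero entry marks a free cluster):

#include <stdint.h>

#define FAT_FREE 0x0000u

/* Scan the in-memory FAT for the first free cluster.  Clusters 0 and 1
   are reserved, so data clusters start at 2.  Returns 0 if the volume
   is full, since 0 is never a valid data cluster. */
static uint32_t find_free_cluster(const uint16_t *fat, uint32_t entries)
{
    for (uint32_t c = 2; c < entries; c++)
        if (fat[c] == FAT_FREE)
            return c;
    return 0;
}

A free-list scheme replaces this scan with a single pointer dereference, which is one reason many other systems keep free space in a list (or a bitmap) instead.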
 

MooseFET

Gee, that reminds me of the FORTRAN Increment and Decrement fields as
well as the contractions for their names used in LISP..


It was hardly a new idea I was suggesting, just the application of an
old one to this problem.

Starting with the top bit really speeds up the computation of an
unnatural-log.
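For example, the integer base-2 logarithm of a value is just the position of its most significant set bit, which is cheap if the top of the number is directly reachable. A plain C sketch, no compiler intrinsics:

#include <stdint.h>

/* floor(log2(x)) for x > 0: count how many shifts it takes until only
   the top set bit is left. */
static unsigned ilog2(uint64_t x)
{
    unsigned n = 0;
    while (x >>= 1)
        n++;
    return n;
}

With a linked-list integer and a pointer to the MSB end, the same answer is 32 times the number of limbs below the top one, plus the bit position within the top limb.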
 

Skybuck Flying

Well I sure didn't learn much from you, welcome to the ban list.

Bye,
Skybuck.
 

Skybuck Flying

Nonsense.

Load whatever you need into RAM and proceed from there.

That's what RAM is for.

I have 2 and even 4 GB of RAM.

I have yet to encounter a situation where the harddisk is a real bottleneck.

Bye,
Skybuck.
 

Robert Baer

Skybuck said:
How does the operating system find free sectors to write data to?

Not that it matters much.

I have no access to the underlying file system.

And writing one's own file system is a bit over the top, don't you think? :)

Bye,
Skybuck.
Simple, it writes to the first cluster that is NOT marked as used.
 

MooseFET

Since you clipped the very thing you are so unwisely calling nonsense,
I'll put it back
**** begin insert ***

How does the operating system find free sectors to write data to?

On FAT systems, it searches the FAT table looking for a magic number
that indicates a free sector. Many other systems make the free
sectors into a big linked list.
Not that it matters much.
I have no access to the underlying file system.
And writing one's own file system is a bit over the top, don't you think? :)

It makes sense to know a little about what the OS does so that you can
get the most speed out of it.


**** end inserted ****

Nonsense.

You seem to have an extremely poor understanding of the subject if you
call my posting that.

Load whatever you need into RAM and proceed from there.

To load it into RAM you need to read the disk. You wanted speed. Now
you are throwing away speed hand over fist just because you didn't
understand the subject. Go read up on it a bit and you will see.

That's what RAM is for.

I have 2 and even 4 GB of RAM.

I have 4GB and it isn't even close to enough. Just the raw data file
for one of the things I am doing is over 2GB. Fortunately, I can
operate on it in sections. By carefully arranging how it is broken
up, I have got the processing time to be less than 8 hours.

The nice folks at Linear put a 48-bit "do it yourself" virtual
memory system into LTspice. This is because the data they have is
well over 4GB.

I have yet to encounter a situation where the harddisk is a real bottleneck.

You haven't done much then.
 

MooseFET

Simple, it writes to the first cluster that is NOT marked as used.

That is not how Windows really does it. For FAT devices, it has a
needlessly complex way of selecting which sector to use. The first
fit method tends not to lead to fragmentation. The Windows method is
carefully tuned to lead to a great deal of fragmentation. This is why
you have to defrag so often on a windows box.
 

Vladimir Vassilevsky

MooseFET wrote:

On FAT systems, it searches the FAT table looking for a magic number
that indicates a free sector. Many other systems make the free
sectors into a big linked list.

There are two different things: the data structures which describe how
the disk space is used and the algorithms which operate with those
structures. The smart and the dumb algorithms are equally applicable to
FAT as well as any other structure.

Not too big of a deal. We developed a POSIX-compatible FAT filesystem to be
used with our OS or as a standalone. The allocation algorithm is more or
less smart; there is a severe speed penalty for fragmentation with
flash cards!
You seem to have an extremely poor understanding of the subject if you
call my posting that.

I plonked him a while ago. His posts are mostly nonsense, not worth reading.
I have 4GB and it isn't even close to enough. Just the raw data file
for one of the things I am doing is over 2GB. Fortunately, I can
operate on it in sections. By carefully arranging how it is broken
up, I have got the processing time to be less than 8 hours.

What is it, if this is not a secret? RF design?
The nice folks at Linear put a 48-bit "do it yourself" virtual
memory system into LTspice. This is because the data they have is
well over 4GB.

I know a guy who does Spice-level simulation of microprocessors;
he is complaining about that sort of problem :)


Vladimir Vassilevsky
DSP and Mixed Signal Design Consultant
http://www.abvolt.com
 

Skybuck Flying

Simply read file data sequentially from 0 to N-1.

This is guaranteed to provide the fastest performance.

If it doesn't then it's a file system problem.

I would recommend defragmenting the file system.

It's up to the file system to make sure the files are defragmented as
well as possible.

Then it's up to me to make sure I read the files fast with a good algorithm.

For little files this is easy; for big files over 4 GB, special handling will
be necessary.
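A minimal sketch of that sequential read in plain C (the chunk size is an arbitrary placeholder; because the file is only ever read front to back, no seeking is needed and files larger than 4 GB stream through the same loop):

#include <stdio.h>
#include <stdlib.h>

#define CHUNK (1u << 20)   /* read 1 MiB at a time to keep the disk streaming */

int main(int argc, char **argv)
{
    if (argc < 2)
        return 1;
    FILE *f = fopen(argv[1], "rb");
    if (f == NULL)
        return 1;

    unsigned char *buf = malloc(CHUNK);
    if (buf == NULL) {
        fclose(f);
        return 1;
    }

    unsigned long long total = 0;
    size_t n;
    while ((n = fread(buf, 1, CHUNK, f)) > 0)
        total += n;              /* process the chunk here */

    printf("read %llu bytes sequentially\n", total);
    free(buf);
    fclose(f);
    return 0;
}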

So I am not really seeing the problem.

If I store many, many, many, many integer lists as described, then what is
the problem ?

Simply reading the lists sequentially from the files should work just fine.

Bye,
Skybuck.
 