Maker Pro

Data encoding puzzle - how does SACD work?

ted

I can't quite fathom how Sony's SACD encoding works.

According to their specification, a 44.1 kHz 16-bit audio stream is
transmitted or encoded as a single serial bit sequence at a rate of
2.82 MHz. The encoding is called "Pulse Density Modulation", so the
density of the transmitted bits conveys the amplitude of the audio signal.
A simple low-pass filter is all that is required to recover the audio.

Using simple maths, this means that every original audio sample (at
44.1 kHz, or every 22.7 µs) is encoded using 64 bits of SACD stream. In
contrast, standard PCM uses 16 bits to convey the 65536 analogue
levels required for 16-bit accuracy.
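
A quick check of that arithmetic, as a sketch only: it assumes the nominal SACD bit rate is exactly 64 times the CD sample rate (2.8224 MHz, which rounds to the 2.82 MHz quoted above). The constant and function names are purely illustrative.

```python
# Back-of-envelope check of the figures in the post (nominal rates assumed).
CD_RATE_HZ = 44_100              # CD sample rate
DSD_RATE_HZ = 64 * CD_RATE_HZ    # assumed SACD bit rate: 2,822,400 bits/s


def bits_per_cd_sample(dsd_rate=DSD_RATE_HZ, cd_rate=CD_RATE_HZ):
    """How many 1-bit SACD samples fit into one CD sample period."""
    return dsd_rate // cd_rate


def sample_period_us(cd_rate=CD_RATE_HZ):
    """Length of one CD sample period in microseconds."""
    return 1e6 / cd_rate


def pcm_levels(word_bits=16):
    """Number of distinct levels a PCM word of word_bits can express."""
    return 2 ** word_bits


print(bits_per_cd_sample())            # 64
print(round(sample_period_us(), 2))    # 22.68
print(pcm_levels())                    # 65536
```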

What I cannot understand is this: if the SACD "decoder" is a simple low-pass
filter, how can you get 65536 different levels (as would be required
for every audio sample) out of just 64 serial, equally weighted bits?
A simple low-pass filter just averages the bits, so at most one would
get about 64 levels out of a 64-bit serial sequence.
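
To make the counting argument concrete, here is a small sketch. It assumes each 64-bit block is averaged in complete isolation from its neighbours, which is exactly the assumption being questioned; the function names are made up for illustration. Strictly the count comes out at 65, since the all-zeros word is a level too.

```python
from itertools import product


def block_average_levels(n_bits):
    """Every value the mean of n_bits equally weighted 0/1 bits can take."""
    return [ones / n_bits for ones in range(n_bits + 1)]


def brute_force_levels(n_bits):
    """Same set, found by enumerating all 2**n_bits words (small n only)."""
    return sorted({sum(word) / n_bits for word in product((0, 1), repeat=n_bits)})


# Only the number of ones matters, not where they sit in the block.
print(len(block_average_levels(64)))                       # 65
print(block_average_levels(8) == brute_force_levels(8))    # True
```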

This is a major discrepancy, so what is the catch? Or am I
misunderstanding the whole operation?

In other words, is SACD a "lossy" encoding method, i.e. one where bits
are lost (as opposed to standard PCM)?

PS: I have looked extensively at web sources, which seem to go deeply
into noise and noise-shaping arguments. None seems to address the
argument in the simple IT terms described above, in which SACD appears to
contradict Shannon's principles of communication.

Any ideas anybody??


Thanks in advance

ted
 
John Jardine

I've never come across "SACD", but please bear with me, as I'm going to
pontificate, or 'stick an oar in' (as mentioned in another post :).
Nobody's replied yet, so there's nothing to lose.
After thinking about it for a while, I had to agree with you that
straightforward linear encoding of the 64 bits would, as you say, just
average out at 1 part in 64 accuracy, which of course is pretty poor.
I then thought that maybe Sony were taking advantage of predistorting the
bit significances to fit the graph of a CR filter, and tested the idea by
writing a programme to do this. Yet again, rubbish!
While massaging the prog' I played around with the effects of shifting a
single "+5V" bit position up and down the 64-bit length of a data word (one 1
and sixty-three 0's). It was then obvious what Sony (and numerous others, no
doubt) are up to!
Yes, a little data is encoded in the actual 64 bits but the real data is
encoded in the time-position of the bits. (time-position encoding TPE?)
For example, feed a 20kHz CR low pass filter with a 64 bit string having
only the first bit set and the averaged filter output voltage after the 64th
bit, is near nothing (1E-30V say).
Put that single bit in the last 64th position and the CR filter gives 214mV
out.
(single bit if at 63rd=9mV, 62nd=39uV, 61st=17uV, 60th=1uV, 59th=0.03uV
....).
Essentially, each present or absent data bit will supply or extract (charge,
discharge) a lump of energy from the CR filter at a different point in time.
The final averaged cap' 'sample' voltage at the end of the data string will
reflect all the individual charges and discharges during the preceding 22.7 µs.
As we have a 'linear' setup, the final sample voltage is simply the
sum of the effects of 64 individual CR voltage-versus-time response graphs,
i.e. the sum of 64 individual exponentials.
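
The superposition idea can be sketched as below. This is an idealised model, not the programme described above: it assumes a single-pole RC filter with a 20 kHz corner, 5 V logic, a capacitor that starts at 0 V, and the nominal 2.8224 MHz bit rate. All names are illustrative. In this model the last bit contributes about 0.22 V, and each earlier bit contributes a fixed fraction of the one after it, so other figures need not match those quoted above.

```python
import math

BIT_RATE_HZ = 2_822_400   # assumed nominal SACD bit rate (64 x 44.1 kHz)
CORNER_HZ = 20_000        # filter corner frequency taken from the post
V_HIGH = 5.0              # assumed logic-high level in volts


def decay_per_bit(bit_rate=BIT_RATE_HZ, corner=CORNER_HZ):
    """Factor by which the capacitor voltage relaxes during one bit period."""
    return math.exp(-2 * math.pi * corner / bit_rate)


def rc_final_voltage(bits, bit_rate=BIT_RATE_HZ, corner=CORNER_HZ, v_high=V_HIGH):
    """Capacitor voltage after clocking bits through an ideal RC low-pass.

    The capacitor starts at 0 V; each bit drives the resistor with v_high
    or 0 V for exactly one bit period.
    """
    decay = decay_per_bit(bit_rate, corner)
    v = 0.0
    for b in bits:
        target = v_high if b else 0.0
        v = target + (v - target) * decay   # exponential approach to target
    return v


def bit_weight(position, n_bits=64, **kw):
    """Final voltage produced by a single 1 at position (1-based) in the word."""
    bits = [0] * n_bits
    bits[position - 1] = 1
    return rc_final_voltage(bits, **kw)


# Linearity: any word's final voltage equals the sum of its set bits' weights.
word = [1, 0, 1, 1] + [0] * 60
print(rc_final_voltage(word))
print(bit_weight(1) + bit_weight(3) + bit_weight(4))
```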
This Sony method looks wide-ranging and offers a simple way to serially
pump out low-distortion sinewave data from, say, a PIC micro using a
single CR filter. The fascinating bit will be back-programming the 64 (or
fewer) exponentials that generate a particular wave shape.
regards
john
 
Active8

On Sun, 12 Oct 2003 21:30:32 +0100, John Jardine, wrote...

a pretty good explanation to the OP, but he really f'd me up with that
"CR filter" stuff. after realizing that he's posting from the UK where
they drive on the left side of the road, i realized he was talking about
an "RC filter" :)

smart move, coding a sim.

brs,
mike
 
Picture Man

Sounds to me exactly like the data stream you'd get if you picked
off the digital output of a "64x over-sampling digital filter" just
before it feeds a "1-bit DAC", to use the standard audio industry terms.
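
For what it's worth, here is a minimal sketch of how such a 1-bit stream might be produced. It assumes a first-order delta-sigma loop chosen purely for illustration; commercial converters use higher-order loops, and nothing here is taken from Sony's specification. The function names are made up.

```python
def delta_sigma_1bit(samples):
    """First-order delta-sigma modulator: inputs in [-1, 1] -> list of 0/1 bits."""
    integrator = 0.0
    feedback = 0.0                      # value of the previous output bit
    bits = []
    for x in samples:
        integrator += x - feedback      # accumulate the running error
        bit = 1 if integrator >= 0 else 0
        feedback = 1.0 if bit else -1.0  # 1-bit DAC value fed back next step
        bits.append(bit)
    return bits


def density_to_level(bits):
    """Map the fraction of ones back to [-1, 1] (a crude stand-in for a low-pass)."""
    return 2.0 * sum(bits) / len(bits) - 1.0


target = 0.123456
stream = delta_sigma_1bit([target] * 64_000)
print(density_to_level(stream[:64]))   # coarse: only 64 bits averaged
print(density_to_level(stream))        # much closer: 64,000 bits averaged
```

In this sketch the density error shrinks roughly in proportion to the number of bits averaged, because the loop keeps the accumulated error bounded. An analogue filter never resets at 64-bit boundaries, so the resolution is not capped by a single block.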
 
Paul Burke

Active8 said:
i realized he was talking about
an "RC filter"


An RC filter is for removing Roman Catholics. It's the opposite of a
Catholic converter, as fitted to cars..

Paul Burke
 
ted

John Jardine said:
Yes, a little data is encoded in the actual 64 bits but the real data is
encoded in the time-position of the bits. (time-position encoding TPE?)
For example, feed a 20kHz CR low pass filter with a 64 bit string having
only the first bit set and the averaged filter output voltage after the 64th
bit, is near nothing (1E-30V say).
Put that single bit in the last 64th position and the CR filter gives 214mV
out.
(single bit if at 63rd=9mV, 62nd=39uV, 61st=17uV, 60th=1uV, 59th=0.03uV
...).

John

Thanks for the explanation. Is that the way the Sony system actually
works, or is it something (quite cleverly) that you figured out
independently??

The Sony system states that only a simple low-pass filter is used to
recover the sound. In the system you describe, I assume you will need
a more complex device to sample or accumulate the accrued waveform
every 64 bit times, in which case the sampler would need to be pretty
accurate, negating the benefits of a very simple decoder (which was the
original objective of SACD).

I haven't done the maths, but I would also find it difficult to
believe that 64 exponential "spot" coefficients (added in different
combinations) could be added together to result in 65536 different
levels of amplitude. I may be wrong of course!!

Regards

Ted
 
ted

Picture Man said:
Sounds to me exactly like the data stream you'd get if you picked
off the digital output of a "64x over-sampling digital filter" just
before it feeds a "1-bit DAC", to use the standard audio industry terms.

I think you are right in context. The only difference is that most
manufacturers, such as Crystal Semiconductor, who make these high-resolution
A/D and D/A converters do not use one-bit DAC oversamplers; they use
multibit oversamplers.

They use the one-bit analogy purely for describing how the product
works, but internally they are quite different.


Regards

ted
 
Active8

Paul Burke said:
Active8 said:
i realized he was talking about
an "RC filter"

An RC filter is for removing Roman Catholics. It's the opposite of a
Catholic converter, as fitted to cars..

Paul Burke

ROFLMAO
 
John Jardine

ted said:
John

Thanks for the explanation. Is that the way the Sony system actually
works, or is it something (quite cleverly) that you figured out
independently??

The Sony system states that only a simple low-pass filter is used to
recover the sound. In the system you describe, I assume you will need
a more complex device to sample or accumulate the accrued waveform
every 64 bit times, in which case the sampler would need to be pretty
accurate, negating the benefits of a very simple decoder (which was the
original objective of SACD).

I haven't done the maths, but I would also find it difficult to
believe that 64 exponential "spot" coefficients (added in different
combinations) could be added together to result in 65536 different
levels of amplitude. I may be wrong of course!!

Regards

Ted

Ted, I've honestly no idea if it is the same :). The suggestion was a sort
of idle conjecture for fun, just filling time whilst a real answer came in.
I've just made a Google search on SACD, but all I'm seeing are adverts and
yet more adverts.
What I do know for sure is that there is nothing new under the sun. Your
mention of the 'density wrt amplitude' aspect fits exactly with this kind of
encoding, i.e. a high sample voltage will require a very long run of 1's at
the back end of the 64-bit data word.
A single R+C low-pass filter (regards, Active8!) would seem a perfect means
of accruing the 64 data bits. Indeed, this simple time constant is a wholly
intrinsic part of the data conversion.
From what little I played with the programme, I noticed a wide dynamic range in
sample voltages just from patterning small blocks of data, the variation
increasing enormously as each new bit was added. The way I looked at it was
that a CR discharging from, say, 5V to 0V over a period of time provides an
infinity of voltage level steps. It is "simply" a question of arranging the
preceding bit patterns such that, after the last bit has been presented at
the end of the data word, the accrued voltage on the cap is that which is
wanted, i.e. the "time" variable stops at that point.

Yes, 64 bits' worth of 'course corrections' would seemingly imply that just a few
steps (e.g. 64 levels) are available for use. But it's the time aspect that
must be accounted for, and we're dealing here with exponential voltage wrt
time curves. Energy is constantly being added/extracted/dissipated.
Something akin, it seems, is floating-point maths, where 40 bits (signed
mantissa + exponent) can encode numbers over a 1e38 dynamic range. There are
whole warehouses full of numbers that can't be exactly encoded, but the method
is precise enough for general use.
I'd like to play with it further but, from past experience, know I'd be
wasting my time.
Surely someone out there can clearly explain how the damned Sony SACD
works!
regards
john
 