Maker Pro

Sine Lookup Table with Linear Interpolation


Robert Baer

Jamie said:
But, but, but, but,.... we work in radians!!!!!!!!!!!!!!!!!!!!!!!!!!!!

Except maybe a carpenter!

Jamie
So?? Convert already - or are you one of those heathens or worse,
infidels?
 

Martin Brown

I've been studying an approach to implementing a lookup table (LUT) to
implement a sine function. The two msbs of the phase define the
quadrant. I have decided that an 8 bit address for a single quadrant is
sufficient with an 18 bit output. Another 11 bits of phase will give me
sufficient resolution to interpolate the sin() to 18 bits.

If you assume a straight line between the two endpoints, the midpoint of
each interpolated segment will have an error of
((sin(high)+sin(low))/2) - sin(mid)

Without considering rounding, this reaches a maximum at the last segment
before the 90 degree point. I calculate about 4-5 ppm which is about
the same as the quantization of an 18 bit number.
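That figure is easy to check numerically. Here is a quick Python sketch (not the original spreadsheet; the 8-bit-per-quadrant table size is taken from the post):

```python
import math

# Worst-case chord-midpoint error for linear interpolation of sin()
# over one quadrant, with an 8-bit table address as in the post.
N = 256                      # table entries per quadrant
step = (math.pi / 2) / N     # phase step per entry

worst = 0.0
for i in range(N):
    lo, hi = i * step, (i + 1) * step
    mid = (lo + hi) / 2
    # chord value at the segment midpoint minus the true sine
    err = (math.sin(lo) + math.sin(hi)) / 2 - math.sin(mid)
    worst = max(worst, abs(err))

print(f"worst midpoint error: {worst:.2e} (~{worst * 1e6:.1f} ppm)")
```

This lands at roughly 4.7 ppm, in the last segment before 90 degrees, in line with the 4-5 ppm estimate above.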

There are two issues I have not come across regarding these LUTs. One
is adding a bias to the index before converting to the sin() value for
the table. Some would say the index 0 represents the phase 0 and the
index 2^n represents 90 degrees. But this is 2^n+1 points which makes a
LUT inefficient, especially in hardware. If a bias of half the lsb is
added to the index before converting to a sin() value the value 0 to
2^n-1 becomes symmetrical with 2^n to 2^(n+1)-1 fitting a binary sized
table properly. I assume this is commonly done and I just can't find a
mention of it.
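The half-LSB bias can be sketched like this (a minimal illustration of the convention described above; names are mine, not from any reference):

```python
import math

# Build a 2^n-entry quadrant table with the half-LSB index bias:
# entry i holds sin((i + 0.5) * 90deg / 2^n), so no entry sits exactly
# on 0 or 90 degrees and the table fits a binary-sized address space.
n = 8
N = 1 << n
table = [math.sin((i + 0.5) * (math.pi / 2) / N) for i in range(N)]

# With the bias the second quadrant is the same table read backwards:
# sin(pi - x) = sin(x), so global index N + j maps to table[N - 1 - j].
for j in range(N):
    second_quadrant = math.sin((N + j + 0.5) * (math.pi / 2) / N)
    assert abs(second_quadrant - table[N - 1 - j]) < 1e-12
```

The symmetry check is the point: indices 0 to 2^n - 1 and 2^n to 2^(n+1) - 1 fold onto one binary-sized table with no extra endpoint entry.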

The other issue is how to calculate the values for the table to give the
best advantage to the linear interpolation. Rather than using the exact
match to the end points stored in the table, an adjustment could be done
to minimize the deviation over each interpolated segment. Without this,
the errors are always in the same direction. With an adjustment the
errors become bipolar and so will reduce the magnitude by half (approx).
Is this commonly done? It will require a bit of computation to get
these values, but even a rough approximation should improve the max
error by a factor of two to around 2-3 ppm.
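One rough way to make the errors bipolar can be sketched as follows; this nudges each endpoint by half of its segment's midpoint deficit, which is an illustration of the idea, not the spreadsheet's iterative optimization, and it ignores the 18-bit output quantization:

```python
import math

# Nudge each endpoint by half of its segment's midpoint deficit so the
# chord straddles the curve instead of sitting entirely below it.
n = 8
N = 1 << n
h = (math.pi / 2) / N

def chord_mid_err(i):
    # chord midpoint minus true sine for segment i (negative: sin is
    # concave on the first quadrant, so the chord lies below the curve)
    lo, hi = i * h, (i + 1) * h
    return (math.sin(lo) + math.sin(hi)) / 2 - math.sin((lo + hi) / 2)

plain = [math.sin(i * h) for i in range(N + 1)]
adj = [plain[i] - chord_mid_err(min(i, N - 1)) / 2 for i in range(N + 1)]

def max_err(tab):
    worst = 0.0
    for i in range(N):
        for k in range(16):          # sample points inside each segment
            f = k / 16
            y = tab[i] + f * (tab[i + 1] - tab[i])
            worst = max(worst, abs(y - math.sin((i + f) * h)))
    return worst

print(max_err(plain), max_err(adj))  # adjusted error is roughly halved
```

Even this crude version cuts the maximum deviation by about a factor of two, consistent with the 2-3 ppm estimate.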

The other question is why bother with linear interpolation when you can
have quadratic accuracy with only a little more effort?
Now if I can squeeze another 16 dB of SINAD out of my CODEC to take
advantage of this resolution! lol

One thing I learned while doing this is that near 0 degrees the sin()
function is linear (we all knew that, right?) but near 90 degrees, the
sin() function is essentially quadratic. Who would have thunk it?

It is obvious from the shape of the sine curve!

Classic sin(A+B) = sin(A)cos(B)+cos(A)sin(B)

If A is the actual nearest lookup and B<<1 then you can approximate

cos(B) = 1 - B^2/2
sin(B) = B

For known fixed B these can also be tabulated (exactly or approx).

Then Sin(A+B) = sin(A) + cos(A).B - sin(A).B^2/2

Sin(A) is obtained by indexing forwards in your first quadrant LUT
Cos(A) by indexing backwards from the end of the table

More lookups needed but about the same number of multiplies.

Sorting out phase is left as an exercise for the reader.
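The expansion above can be sketched in Python like this (an illustration; the 8-bit table size is carried over from the earlier post, and the table is read forwards for sin(A) and backwards for cos(A)):

```python
import math

# sin(A+B) = sin(A) + B*cos(A) - (B^2/2)*sin(A), with sin(A) from the
# quadrant LUT and cos(A) read from the same table backwards.
n = 8
N = 1 << n
h = (math.pi / 2) / N
lut = [math.sin(i * h) for i in range(N + 1)]  # one quadrant, inclusive

def sin_quadratic(phase):          # phase in [0, pi/2)
    a = int(phase / h)             # table index; A = a*h
    b = phase - a * h              # small residue B
    sin_a = lut[a]
    cos_a = lut[N - a]             # cos(A) = sin(90deg - A)
    return sin_a + b * cos_a - (b * b / 2) * sin_a

err = max(abs(sin_quadratic(x) - math.sin(x))
          for x in (i * 1e-4 for i in range(15707)))
print(f"max error ~ {err:.1e}")
```

The residual is the dropped B^3/6 term, a few parts in 10^8 here, so the quadratic correction comfortably beats the linear-interpolation error for the same table.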

The other method, popular in FFT computer codes, is to store tabulated
best approximations to the discrete roots of unity, (1, 0)^(1/N)

In practice storing slightly tweaked values of

s = sin(B)
s2 = 2*(sin(B/2)^2)

Where B = 2pi/N

And adjust so that to within acceptable tolerance it comes back to 1,0
after N iterations. Then reset to exactly 1,0 each time around.

sn' = sn - s2*sn - s*cn
cn' = cn - s2*cn + s*sn

(subject to my memory, typos and sign convention errors)

Probably only worth it if you need both sin and cos.
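A sketch of that recurrence, written in the forward-rotation sign convention (which, per the author's own caveat, may differ in signs from the snippet above):

```python
import math

# Coupled sin/cos recurrence with tabulated s = sin(B) and
# s2 = 1 - cos(B). After N steps it should return to (1, 0).
N = 1024
B = 2 * math.pi / N
s = math.sin(B)
s2 = 2 * math.sin(B / 2) ** 2     # == 1 - cos(B), computed stably

cn, sn = 1.0, 0.0                  # start at angle 0
for _ in range(N):
    cn, sn = cn - s2 * cn - s * sn, sn - s2 * sn + s * cn

print(cn, sn)                      # back to ~(1, 0) after N steps
```

Using 2*sin(B/2)^2 instead of 1 - cos(B) avoids the cancellation that would otherwise eat precision for small B, which is why the s2 form is the one worth tabulating.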
 

josephkk

I've been studying an approach to implementing a lookup table (LUT) to
implement a sine function. The two msbs of the phase define the
quadrant. I have decided that an 8 bit address for a single quadrant is
sufficient with an 18 bit output. Another 11 bits of phase will give me
sufficient resolution to interpolate the sin() to 18 bits.

If you assume a straight line between the two endpoints, the midpoint of
each interpolated segment will have an error of
((sin(high)+sin(low))/2) - sin(mid)

Without considering rounding, this reaches a maximum at the last segment
before the 90 degree point. I calculate about 4-5 ppm which is about
the same as the quantization of an 18 bit number.

There are two issues I have not come across regarding these LUTs. One
is adding a bias to the index before converting to the sin() value for
the table. Some would say the index 0 represents the phase 0 and the
index 2^n represents 90 degrees. But this is 2^n+1 points which makes a
LUT inefficient, especially in hardware. If a bias of half the lsb is
added to the index before converting to a sin() value the value 0 to
2^n-1 becomes symmetrical with 2^n to 2^(n+1)-1 fitting a binary sized
table properly. I assume this is commonly done and I just can't find a
mention of it.

The other issue is how to calculate the values for the table to give the
best advantage to the linear interpolation. Rather than using the exact
match to the end points stored in the table, an adjustment could be done
to minimize the deviation over each interpolated segment. Without this,
the errors are always in the same direction. With an adjustment the
errors become bipolar and so will reduce the magnitude by half (approx).
Is this commonly done? It will require a bit of computation to get
these values, but even a rough approximation should improve the max
error by a factor of two to around 2-3 ppm.

Now if I can squeeze another 16 dB of SINAD out of my CODEC to take
advantage of this resolution! lol

One thing I learned while doing this is that near 0 degrees the sin()
function is linear (we all knew that, right?) but near 90 degrees, the
sin() function is essentially quadratic. Who would have thunk it?

Without reading any other replies, I would suggest either the CORDIC
algorithm or Simpson's rule interpolation. From the rule for integration,
you evaluate the curve value at any intermediate point within the
segment covered by the rule. Four intervals on the step size you have will
be limited by the 18 bits of input. Do 20 bits of input and round to 18 or
not. About 1 ppm.

?-)
 

rickman

Without reading any other replies, I would suggest either the CORDIC
algorithm or Simpson's rule interpolation. From the rule for integration,
you evaluate the curve value at any intermediate point within the
segment covered by the rule. Four intervals on the step size you have will
be limited by the 18 bits of input. Do 20 bits of input and round to 18 or
not. About 1 ppm.

I can't say I follow what you mean about Simpson's rule interpolation.
Interpolation is used to derive Simpson's Rule, but I don't see a way to
use an integral for obtaining an interpolation. Isn't Simpson's Rule
about approximating integrals? What am I missing?
 

josephkk

It is simple, the LUT gives values for each point in the table, precise
to the resolution of the output. The sin function between these points
varies from close to linear to close to quadratic. If you just
interpolate between the "exact" points, you have an error that is
*always* negative anywhere except the end points of the segments. So it
is better to bias the end points to center the linear approximation
somewhere in the middle of the real curve. Of course, this has to be
tempered by the resolution of your LUT output.

If you really want to see my work it is all in a spreadsheet at the
moment. I can send it to you or I can copy some of the formulas here.
My "optimization" is not nearly optimal. But it is an improvement over
doing nothing I believe and is not so hard. I may drop it when I go to
VHDL as it is iterative and may be a PITA to code in a function.

Hmm. I would like to take a crack at Simpson's rule interpolation for
this to measure the error budget. I may not be timely but this is
interesting. joseph underscore barrett at sbcglobal daht cahm


?-)
 

josephkk

I'm not sure this would be easier. The LUT and interpolation require a
reasonable amount of logic. This requires raising X to the 5th power
and five constant multiplies. My FPGA doesn't have multipliers and this
may be too much for the poor little chip. I suppose I could do some of
this with a LUT and linear interpolation... lol

No multipliers(?); ouch. My Simpson's rule method may fit but it will
need a trapezoidal strip out of a full Wallace tree multiplier. Preferably
three (speed issue, that may not be that big for you). How do you do the
linear interpolation without a multiplier?

?-)
 

josephkk

I can't say I follow this. How do I get a 22.5 degree sine table to
expand to 90 degrees?

As to my improvements, I can see where most implementations don't bother
pushing the limits of the hardware, but I can't imagine no one has done
this before. I'm sure there are all sorts of apps, especially in older
hardware where resources were constrained, where they pushed for every
drop of precision they could get. What about space apps? My
understanding is they *always* push their designs to be the best they
can be.

I agree. A 45 degree sine and cosine table will do the job, as will a 90
degree sine or cosine table.

?-)
 

josephkk

It could also be a difficult problem to solve without some sort of
exhaustive search. Each point you fudge affects two curve segments and
each curve segment is affected by two points. So there is some degree
of interaction.

Anyone know if there is a method to solve such a problem easily? I am
sure that my current approach gets each segment end point to within ±1
lsb and I suppose once you measure THD in some way it would not be an
excessive amount of work to tune the 256 end points in a few passes
through. This sounds like a tree search with pruning.

I'm assuming there would be no practical closed form solution. But
couldn't the THD be calculated in a simulation rather than having to be
measured on the bench?

The calculation can be done (most SPICE versions will do this; even Excel
can be used), but that doesn't mean that bench testing will give the same
results. The calculation must be done on the final value series.

?-)
 

josephkk

Why can't some harmonic component be added to the table points to improve
downstream distortion? Of course, you'd need a full 360 degree table, not a
folded one.

How do you get that? All harmonics would be within the 90 degree table.
Harmonic phase shifts are a different matter though; that would take
careful characterization of production quantities of the final product,
$$$$$.

?-)
 

glen herrmannsfeldt

(snip, someone wrote)
No multipliers(?); ouch. My Simpson's rule method may fit but it will
need a trapezoidal strip out of a full Wallace tree multiplier. Preferably
three (speed issue, that may not be that big for you). How do you do the
linear interpolation without a multiplier?

The LUT is indexed by the high (4) bits of the input and the low bits
to interpolate with. So it is, pretty much, 16 different interpolation tables
which, in other words, is multiply by table lookup.

-- glen
 

josephkk

You're no fun anymore. :)

It does have a 0.01% FS range, so you'd expect to get a lot better than
-80 dB.

But for fast stuff, I'd probably do the same as you--filter the
daylights out of it and amplify the residuals.

Cheers

Phil Hobbs

Which is how most old school distortion analyzers work. Pop'tronics did a
four part series on one that used auto-tuned parallel-T filters with over
80 dB fundamental notch rejection in the late 60s or early 70s. Really a
slick design in a lot of ways, especially for a hobby mag.

?-)
 

rickman

No multipliers(?); ouch. My Simpson's rule method may fit but it will
need a trapezoidal strip out of a full Wallace tree multiplier. Preferably
three (speed issue, that may not be that big for you). How do you do the
linear interpolation without a multiplier?

I can do multiplies without a multiplier block. It gets to be
problematic to do a *lot* of multiplies. The interp will be done in a
shift/add type multiplier, as well as the 8 bit amplitude multiplier for
the scale factor (which is already in an existing product). The CODEC I
am using has 128 clock cycles per sample, so there is time to use a serial
Booth's multiplier or two.
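The serial shift/add multiply mentioned above can be sketched like this (plain shift-and-add rather than true Booth recoding, which would halve the adds; one partial-product add per "clock"):

```python
# Bit-serial shift/add multiply: one partial-product add per clock,
# n clocks for an n-bit unsigned multiplier value b, and no hardware
# multiplier block needed.
def shift_add_mul(a, b, n):
    acc = 0
    for i in range(n):             # one iteration == one clock cycle
        if (b >> i) & 1:           # examine multiplier bit i
            acc += a << i          # add the shifted multiplicand
    return acc

print(shift_add_mul(300, 25, 8))   # == 7500
```

With 128 CODEC clocks per sample there is easily room for a couple of these per channel.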
 

rickman

Hmm. I would like to take a crack at simpson's rule interpolation for
this to measure error budget. I may not be timely but this is
interesting. joseph underscore barrett at sbcglobal daht cahm

Oh, I thought you were talking about using Simpson's rule as part of the
interpolation. You are talking about using it to estimate the integral
of the error. Great idea! I'll shoot you an email.
 

rickman

(snip, someone wrote)


The LUT is indexed by the high (4) bits of the input and the low bits
to interpolate with. So it is, pretty much, 16 different interpolation tables
which, in other words, is multiply by table lookup.

That is the poor man's LUT approach where the resources are precious.
It requires a "coarse" table using the high and mid bits of the input
and a second table indexed by the high and low bits of the input for a
correction. I have a 512x18 LUT available and in fact, it is dual
ported for simultaneous access if needed. This by itself gives pretty
good results. The linear approximation would be done with a
serial/parallel Booth radix-2 technique, using n clock cycles where n is
the number of bits in the multiplier. I've got some time for each
sample and expect to share the hardware between two channels.
 

robert bristow-johnson

On 3/21/2013 12:18 AM, glen herrmannsfeldt wrote: ....

That is the poor man's LUT approach where the resources are precious. It
requires a "coarse" table using the high and mid bits of the input and a
second table indexed by the high and low bits of the input for a
correction.

that's sorta the case for even the middle-class man's LUT and
interpolation. you're always splitting a continuous-time parameter
(however many bits are used to express it) into an integer part (that
selects the set of samples used in the interpolation) and a fractional
part (that defines how this set of samples are to be combined to be the
interpolated value).
 

rickman


Actually this would not work for anything except odd harmonics. The way
the table is folded they have to be symmetrical about the 90 degree
point as well as the X axis at 0 and 90 degrees. Odd harmonics would do
this if they are phased correctly. I don't think it would be too useful
that way.

Of course you'd have to null each harmonic for gain and phase,
preferably on each production unit, but it wouldn't be difficult or
expensive *if* you had a distortion analyzer with the required PPM
performance. We regularly do automated calibrations, polynomial curve
fits and such, more complex than this.

Might as well use a separate table for the nulling, it would be much
simpler.
 
  Whatever happened to the good old days of generating sine waves
digitally instead of using these DDS chips that no one seems to know
exactly how they work? Well, that seems to be the picture I am getting.
----
   You need an incrementer (32 bit for example), a phase accumulator (32
bits) and let's say a 12 bit DAC.

   All this does is generate a +/- triangle reference via the phase
accumulator. The 31st bit would be the sign, which indicates when to
start generating decel instead of accel in the triangle output.

   Using integers as your phase acc register, this will naturally wrap
around. The idea is to keep summing with the increment, which has
been initially calculated versus the update time (sample) and desired freq.

   Now since the DAC is only 12 bits, you use the MSBs as the ref for
the DAC and the LSBs (20 bits) for the fixed point integer math to get
the needed precision.

  Now that we have a nice triangle wave that can be passed to a hybrid log
amp, the rest is trivial.

  Actually, today I was looking at one of our circuits that seemed to be
generating a trapezoid on the (-) peak instead of a sine, and that was a
log amp. Seems the transistor in the feedback failed somehow, and
it most likely has been like that for a long time; no one ever noticed
it before...

   And on top of that, years ago I've made sinewave oscillators from the
RC point of a 555 timer that drove a log amp designed to give rather
clean sinewaves.

Jamie

well the DDS chips work just like you describe: an accumulator,
but instead of generating a triangle and shaping it, the MSBs index a
sinewave table going to a DAC

you can do it with one quadrant of sine table; with MSB-1 you can
choose the 2nd/4th quadrant by inverting the bits indexing the sinewave
(makes the index count "in reverse")

the MSB chooses the 1st,2nd or 3rd,4th quadrants; for the 3rd and 4th,
negate the sinewave output

-Lasse
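The folding described above can be sketched in Python (quadrant size and the half-LSB phase convention here are illustrative):

```python
import math

# One-quadrant table plus folding: bit MSB-1 reverses the index for the
# 2nd/4th quadrants; the MSB negates the output for the 3rd/4th.
QBITS = 8
N = 1 << QBITS
quarter = [math.sin((i + 0.5) * (math.pi / 2) / N) for i in range(N)]

def dds_sample(phase):
    """phase: a (QBITS+2)-bit value covering one full cycle."""
    quadrant = phase >> QBITS          # top two bits
    idx = phase & (N - 1)
    if quadrant & 1:                   # 2nd and 4th quadrants:
        idx = (N - 1) - idx            # read the table in reverse
    y = quarter[idx]
    return -y if quadrant & 2 else y   # 3rd and 4th: negate

# Check against sin() using the same half-LSB phase convention:
err = max(abs(dds_sample(p) - math.sin((p + 0.5) * math.pi / (2 * N)))
          for p in range(4 * N))
print(err)                             # the folding is exact up to rounding
```

With the half-LSB bias the reversed index works out exactly, so one quadrant's worth of ROM covers the full cycle.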
 
 

glen herrmannsfeldt

In comp.dsp robert bristow-johnson said:
On 3/21/13 6:20 PM, rickman wrote:

(snip, I wrote)
You could instead index one ROM with eight bits, giving the
interpolation size (delta y), and then index another ROM with that
and the low data bits. That would allow closer matching of
the interpolation slope to the actual data.

Seems to me that someone started selling MOS ROMs and then designed
some standard ones ready to sell.
that's sorta the case for even the middle-class man's LUT and
interpolation. you're always splitting a continuous-time parameter
(however many bits are used to express it) into an integer part (that
selects the set of samples used in the interpolation) and a fractional
part (that defines how this set of samples are to be combined to be the
interpolated value).

One would have to go through the numbers to see how much difference
it makes. This was before the EPROM. The final mask layer was generated
from the data file from the user (in large enough quantities to make
it worthwhile) or for the special-case ROMs in the data book.

-- glen
 