Instantaneous (analogue) compression of speech signals

John Woodgate · Jan 7, 2005

With regard to the other question of *increasing* articulation index
from straight linear performance given a system that is otherwise non-
peak power limited, I think this is quite dubious.

It isn't dubious, and the application is not radio. There is strong
evidence for the increase in intelligibility, but one particular piece
of work is unpublished for commercial reasons.

Thank you for your extensive and encouraging discourse.

John Woodgate · Jan 7, 2005

I read in sci.electronics.design that Ken Smith wrote about
'"all pass" thought about (analogue) compression', on Fri, 7 Jan
2005:

Since the band of interest is 300Hz to 3KHz,

It isn't: I've posted that it's 100 Hz to 5 kHz (at least).

we don't have to worry
about the harmonics of the frequencies above 1KHz. Those can be removed
with a simple low pass filter. I haven't verified it yet but it seems
to me that 3 stages of phase shifter and 4 clippers should be able to
make a significant compression of amplitude but make less than 5%
distortion on a sine wave.

Interesting, a bit more ambitious circuit-wise than I really expected,
and how to realise the weird fractional power law?

The intermodulation distortion will not be made zero by this method. If
the input has more than one frequency component, the distortion will be
much higher.

The input is mainly speech, but there could be music as well. In any
case, many frequencies.

Robert Baer · Jan 7, 2005

Rich said:
I guess they haven't been able to get those thiotimoline capacitors off
the production line yet ...

;-)
Rich

The first lab prototype that will be built next month, will go
zonkers, thus producing that Richter 9 earthquake..

Robert Baer · Jan 7, 2005

gwhite said:
You have stated that it is perhaps desirable to limit the distortion term(s) in
the instantaneous compressor (limiter) to only third order with the precept that
distortion will at the least be less objectionable. This is possible to a
point. The polynomial concept recalls the commonplace linear radio PA
terminology of "strong non-linearity" and "weak non-linearity." When a "linear"
device is driven into saturation or cutoff, then it is said to be exhibiting
strong non-linearity. Since no device is perfectly linear, the *approach* to
the hard on-off state is referred to as the weak non-linearity region.

For the weak region of non-linearity, the standard polynomial model is used.
Because the question in this case involves only speech frequencies, we can
comfortably ignore memory effects and avoid the nasty Volterra analysis and
kernel generation that would otherwise be needed. More commonly, the problem is
backwards from the present: we would be hoping to generate a memoryless
polynomial model of the imperfect amplifier and invert it for the purpose of
predistortion cancellation of the non-linear terms in the polynomial model.
Here we hope to *generate* a third order polynomial distortion in the
amplifier.

The following example polynomial expresses the idea:

y = 3*x - x^3, |x| < 1

The gain of the example circuit is three for small signals (coefficient of the
linear term: the whole idea can be scaled, but is normalized here). The domain
limits of +/-1 are where the slope of the transfer function is zero. In
practice this is precisely the boundary where a practical circuit transitions
from weak non-linearity to strong non-linearity. IOW, it is hard clipping where
|x| > 1. This is what was meant in a previous comment about polynomial
representation as "possible up to a point." So the actual transfer function
shall more completely be as follows:

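    { 3*x - x^3,   |x| <= 1
y = { 2,           x > 1
    { -2,          x < -1

(The clamp levels are +/-2 because 3*x - x^3 equals +/-2 at x = +/-1, which
keeps the curve continuous where the slope reaches zero.)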

The practical limit of clamping can ultimately not be avoided (nor would we want
to avoid it given the x^3 term), it can only be traded off regarding the various
specification constraints. To the extent the circuit has fidelity to the
polynomial model, then for all input levels between -1 and +1, *only* third
order distortion products will be present. Thus the clipping is "soft" till it
*smoothly* (no undefined first derivatives) moves into the hard clipping region.
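
The claim that only third-order products appear below the clamp point can be checked numerically. The following is a minimal sketch, not anyone's circuit: it assumes the clamp sits at +/-2 (the value of 3*x - x^3 at x = +/-1, so the curve stays continuous), and the helper names `soft_clip` and `harmonic_amplitudes` are made up for illustration. It passes one cycle of a sine through the curve and measures the first seven harmonics.

```python
import math

def soft_clip(x):
    """Cubic soft clipper: 3x - x^3 for |x| <= 1, held constant outside."""
    if x > 1.0:
        return 2.0   # assumed clamp level: value of 3x - x^3 at x = 1
    if x < -1.0:
        return -2.0
    return 3.0 * x - x ** 3

def harmonic_amplitudes(func, amplitude, max_harmonic=7, n=256):
    """Harmonic amplitudes of func(amplitude * sin(theta)) over one cycle."""
    amps = []
    for h in range(1, max_harmonic + 1):
        s = c = 0.0
        for k in range(n):
            theta = 2.0 * math.pi * k / n
            y = func(amplitude * math.sin(theta))
            s += y * math.sin(h * theta)
            c += y * math.cos(h * theta)
        amps.append(2.0 * math.hypot(s, c) / n)
    return amps

inside = harmonic_amplitudes(soft_clip, 0.8)      # peak stays inside |x| < 1
overdriven = harmonic_amplitudes(soft_clip, 1.5)  # peak runs into the clamp
for h, (a, b) in enumerate(zip(inside, overdriven), start=1):
    print(f"harmonic {h}: peak 0.8 -> {a:.6f}, peak 1.5 -> {b:.6f}")
```

With the peak inside the cubic region, only the fundamental and the third harmonic should come out non-zero. Once the input runs into the clamp, fifth and higher odd harmonics appear as well. This is a single-tone check only; it says nothing about intermodulation with a multi-tone input.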

Since the basic normalized model is known, the next question is simple and
inexpensive implementation. Multipliers can be configured to provide the cubic
function explicitly. However, there is no obvious intuition regarding the
*necessary* smooth transition from cubic polynomial to hard limiting. Moreover,
it is questionable how "simple" the idea of using multipliers is anyway. A
simplistic implementation that comes to mind is to attempt a basic piecewise
circuit of diodes. The "last" diodes to turn on would clamp to the
normalized +/-1 values. Piecewise diode compression is common, likely because
it is straightforward and "always works."

The y = K*tanh(x) idea has also been proposed. The series expansion is:

tanh(x) = x - x^3/3 + 2*x^5/15 - 17*x^7/315 + ..., -pi/2<x<pi/2

Obviously distortion terms of higher order than three exist. This may or may
not be a problem when practical considerations are made. I wonder if a
combination of tanh and cleverly placed/biased diodes might offer the most
elegant solution that has high compliance to third order only performance.

It is interesting to note that the first two terms of my third order equation
and the tanh equation are identical (just multiply my equation by 1/3). This
was not intentional -- I only noticed afterwards. In that sense, the tanh is a
pretty good approximation right out of the gate.

The MATLAB plot is interesting:

Just start to bend the tanh curve with diodes as it approaches the +/-1 saturation
points and you've got it (since the two curves lie right on top of each other
otherwise). I also wonder if the natural saturation of the tanh could be
coordinated to do it *without* the diodes. Maybe...
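
Since the plot itself has not survived, here is a rough numerical stand-in. It is only a sketch: it compares tanh(x) with the scaled cubic (3*x - x^3)/3 = x - x^3/3 at a few hand-picked points, and the function name `scaled_cubic` is an illustrative choice.

```python
import math

def scaled_cubic(x):
    """(3x - x^3)/3, i.e. the first two terms of the tanh series."""
    return x - x ** 3 / 3.0

for x in (0.1, 0.3, 0.5, 0.8, 1.0):
    t = math.tanh(x)
    c = scaled_cubic(x)
    print(f"x = {x:.1f}: tanh = {t:.4f}, cubic = {c:.4f}, difference = {t - c:.4f}")
```

The two agree closely for small inputs, where the leading error term is 2*x^5/15, and drift apart as x approaches 1. That gap is presumably the region where the diodes would have to do their bending.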

~~~~~~~~~~~~~~~~~~~
With regard to the other question of *increasing* articulation index from
straight linear performance given a system that is otherwise non-peak power
limited, I think this is quite dubious. Two-way radios that use the clipping
don't do it because it makes it better at the "exciter." They do it because of
peak transmitter power limitations and noise at the receiver.

Forget equations, ignore distortion measurements.
Feed the audio into a comparator and use its output.
Give it a try and listen to the results (no joke); it may amaze you.
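
For anyone who wants a number to go with the listening test, a comparator is the limiting case of clipping: only the sign of the input survives. The sketch below models an ideal comparator with +/-1 output (an assumption; a real part has hysteresis and finite slew rate) and measures the harmonics it produces from a single sine. The names `comparator` and `sine_harmonic` are illustrative.

```python
import math

def comparator(x):
    """Ideal comparator: +1 for non-negative input, -1 otherwise."""
    return 1.0 if x >= 0.0 else -1.0

def sine_harmonic(func, h, n=4096):
    """Amplitude of harmonic h when a unit sine is passed through func."""
    s = c = 0.0
    for k in range(n):
        # Half-sample offset keeps the samples off the exact zero crossings.
        theta = 2.0 * math.pi * (k + 0.5) / n
        y = func(math.sin(theta))
        s += y * math.sin(h * theta)
        c += y * math.cos(h * theta)
    return 2.0 * math.hypot(s, c) / n

for h in range(1, 8):
    print(f"harmonic {h}: {sine_harmonic(comparator, h):.4f}")
```

The output is a square wave, so the odd harmonics fall off as 4/(pi*n) and the even ones vanish. The output level no longer depends on the input level at all, which is the whole point of the suggestion and also why the distortion figures look so alarming on paper.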

martin griffith · Jan 7, 2005

Does anyone here have any experience of instantaneous (analogue)
compression (aka soft clipping) of speech signals? I've been doing a
little work on it but I'm unable to judge the resulting sound quality.
Why do treble boost controls no longer have any audible effect for me?
(;-)

ISTR the Datong RF clipper from the '70s, which gave a 6dB improvement.
I've done a quick google, but nobody seems to have the circuit

martin

Serious error.
All shortcuts have disappeared.
Screen. Mind. Both are blank.

John Woodgate · Jan 7, 2005

I read in sci.electronics.design that martin griffith wrote about
'Instantaneous (analogue) compression of speech signals', on Fri, 7 Jan
2005:

ISTR the Datong RF clipper from the '70s, which gave a 6dB improvement.
I've done a quick google, but nobody seems to have the circuit

It uses SSB clipping. It's far too complicated for what I want.

John Larkin · Jan 7, 2005

I read in sci.electronics.design that John Larkin wrote about
'Instantaneous (analogue) compression of speech signals', on Wed, 5 Jan
2005:

True, this technique is well-known, but it's costly. I'm looking for an
ingenious low-cost solution.

Sounds ideal for a cheap DSP chip.

John

Ken Smith · Jan 7, 2005

John Larkin said:
Sounds ideal for a cheap DSP chip.

For one channel of voice grade signal, I'd bet a PIC or 8051 based circuit
could do it. The tricky bit is the dynamic range of the ADC. It is easy
to get 24bits worth of analog dynamic range and harder to get that in an
ADC.

Jim Thompson · Jan 7, 2005

I read in sci.electronics.design that Ken Smith wrote about
'"all pass" thought about (analogue) compression', on Fri, 7 Jan
2005:

It isn't: I've posted that it's 100 Hz to 5 kHz (at least).

[snip]

Say you hard low-pass before 10KHz, then an all-pass at 10KHz will
give 100us of delay.

I've used this scheme to process audio to eliminate "pops" from
records.

But I'm unsure of any value for your compression needs.

...Jim Thompson

John Woodgate · Jan 7, 2005

I read in sci.electronics.design that Jim Thompson wrote about
'"all pass" thought about (analogue) compression', on
Fri, 7 Jan 2005:

I read in sci.electronics.design that Ken Smith wrote about
'"all pass" thought about (analogue) compression', on Fri, 7 Jan
2005:

It isn't: I've posted that it's 100 Hz to 5 kHz (at least).

[snip]

Say you hard low-pass before 10KHz, then an all-pass at 10KHz will give
100us of delay.

This 'delay' thing is another issue entirely. You can get useful field
patterns by deploying two or more loops carrying uncorrelated speech
signals. One method of decorrelation is to use a wide-band 90 degree
phase-shift and we've already discussed my AFILS phase-shifter here.
Another method is to use a delay of around 10 ms. This isn't as nice as
it might seem, because it gives a comb-filtered frequency response, and
some people seem to be able to hear it. I couldn't, even before I went
deaf.
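
The comb-filter effect of the 10 ms delay is easy to put numbers on. As a minimal sketch, assume the direct and delayed signals add with equal weight at the listening point (in a real loop layout the relative weight will vary with position). The gain is then |1 + exp(-j*2*pi*f*T)| = 2*|cos(pi*f*T)|. The function name `comb_gain_db` is made up for illustration.

```python
import math

def comb_gain_db(freq_hz, delay_s):
    """Gain in dB of a signal summed equally with a delayed copy of itself."""
    magnitude = abs(2.0 * math.cos(math.pi * freq_hz * delay_s))
    if magnitude < 1e-12:
        return float("-inf")  # treat a numerically complete null as -inf dB
    return 20.0 * math.log10(magnitude)

delay = 10e-3  # the 10 ms figure mentioned above
for f in (25, 50, 100, 150, 200, 1000, 1050):
    print(f"{f:5d} Hz: {comb_gain_db(f, delay):7.1f} dB")
```

Under that equal-weight assumption the nulls fall at odd multiples of 50 Hz and the +6 dB peaks at multiples of 100 Hz, so the speech band contains dozens of closely spaced notches.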

I've used this scheme to process audio to eliminate "pops" from records.

But I'm unsure of any value for your compression needs.

You mean how much compression? Well, that depends on the subjective
evaluation. Too much gives a very penetrating sound, that is quite
unpleasant.

I wasn't able to do the subjective test with my colleagues today; we ran
out of time discussing speech intelligibility measurements!

Ken Smith · Jan 7, 2005

I read in sci.electronics.design that Ken Smith wrote about
'"all pass" thought about (analogue) compression', on Fri, 7 Jan
2005:

It isn't: I've posted that it's 100 Hz to 5 kHz (at least).

That would make the circuit need a few more sections.

Interesting, a bit more ambitious circuit-wise than I really expected,
and how to realise the weird fractional power law?

I'd approximate it with a bit of curve fitting. Perhaps the sum of a few
long-tailed pairs.

The "more ambitious" than expected problem could be, I think, the killer
for this idea.

The input is mainly speech, but there could be music as well. In any
case, many frequencies.

Perhaps there is still something in the idea worth considering. Whatever
clipping curve you apply to the signal could be broken into 2 parts and
a simple all pass filter used. The result should be no worse than the one
stage of clipping and may in fact sound better. Instead of trying to zero
the 3rd harmonic, higher harmonics could be targeted.

John Woodgate · Jan 7, 2005

I read in sci.electronics.design that Ken Smith wrote about
'"all pass" thought about (analogue) compression', on Fri, 7 Jan
2005:

Perhaps there is still something in the idea worth considering.
Whatever clipping curve you apply to the signal could be broken into 2
parts and a simple all pass filter used. The result should be no worse
than the one stage of clipping and may in fact sound better. Instead of
trying to zero the 3rd harmonic, higher harmonics could be targeted.

I don't immediately see how that would work for a broadband input
signal. Splitting the signal into octave bands and processing as you
propose would indeed work, because the third and higher harmonics are
out-of-band, if the all-pass maintains its 180 degree phase-shift,
relative to that at f to 2f, from 3f to 6f, where f is the lower band-
edge frequency of an octave-band filter.

gwhite · Jan 7, 2005

John said:
It isn't dubious, and the application is not radio. There is strong
evidence for the increase in intelligibility, but one particular piece
of work is unpublished for commercial reasons.

I realize it is not radio (and that's actually why my doubt of the application
exists). However, much of the work on articulation index and speech
intelligibility has occurred in the radio/telecom field. More specifically,
radio/telecom work has dealt quite directly with the concept of speech
clippers--probably more so than other fields since it tends to be a distinctly
peak power limited environment (especially radio). It is a source of info
so-to-speak. That is, the effects of clipping on speech intelligibility have
been dealt with directly.

For all my references, clipping does not *increase* intelligibility. On the
other hand it does *not harm* the intelligibility for rather high peak clipping
levels (up to 20 dB or so). It does however degrade the subjective quality.
Again, this is according to my sources. One of which is (ch2):

http://www.noblepub.com/shopexd.asp?id=11

I think the discussion in it is pretty good. "RF/IF clippers" are the best.

I would naturally be interested in technical discussion and evidence to the
contrary.

http://www.dstan.mod.uk/data/00/025/16000100.pdf
(6.1.3.17 Peak Clipping with Noise
"If the speech is clipped before the noise mixes with it there can be
an improvement in intelligibility." Of course, that is exactly the _radio_
problem solved.

6.1.4.10 Peak Clipping
"Peak clipping can improve intelligibility of a relatively noise-free speech
signal from a microphone, in situations where high levels of noise mix with
the speech before it reaches the listener. The clipping must occur before the
noise mixes. Although peak clipping the audio waveform can improve
intelligibility, a better approach is to use Radio Frequency clipping. With AF
clipping the distortion products are spread throughout the speech band. When
the RF waveform is clipped, the distortion products do not overlap the
transposed speech frequencies, and can be filtered out before the RF signal is
transposed back. [Not generally true as I pointed out earlier. Harmonics are
gone, but not odd-order intermod products.] The final audio waveform has
smoothly rounded peaks rather than flattened peaks." [True])

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Bucketloads of freebies on speech/hearing:

http://www.vard.org/prog/98/98prch13.htm
"In backgrounds of noise, both intelligibility and quality were more adversely
affected by peak clipping than by compression or linear amplification. ... These
results indicated that the output of a hearing aid should be limited with
compression rather than peak clipping."

http://www.kemt.fei.tuke.sk/Predmety/KEMT320_EA/_web/Online_Course_on_Acoustics/intelligibility.html

http://www.jblpro.com/pub/technote/spch_intl_1.pdf
http://www.jblpro.com/pub/technote/spch_intl_2.pdf

http://www.icsi.berkeley.edu/~steveng/PDF/Spectral_Slits.pdf
http://www.icsi.berkeley.edu/~steveng/PDF/Bandshift.pdf
http://www.icsi.berkeley.edu/ftp/global/pub/speech/papers/icassp98-uttcomb.pdf
http://www.wramc.amedd.army.mil/departments/aasc/avlab/Eurospeech2003-R.pdf

http://soma.crl.mcmaster.ca/~jeff/NIPS/NIPS_Submission.pdf

http://www.gold-line.com/pdf/articles/p_sti01e14.pdf

http://www.icsi.berkeley.edu/ftp/global/pub/speech/papers/thesis-bedk98.pdf

http://www.phonak.com/com_1998proceedings_6.pdf
http://www.phonak.com/com_1998proceedings_9.pdf

http://www.acoustics-engineering.com/files/TN002.pdf

http://www.svconline.com/mag/avinstall_measuring_intelligibility/

http://fonsg3.let.uva.nl/Proceedings/Proceedings20/ShuzhenWu/ShuzhenWu.html#Heading6

http://www.frye.com/library/acrobat/hrarticle.pdf
http://www.frye.com/library/acrobat/hrarticle2.pdf

http://www.auditory.org/mhonarc/2002/msg00326.html

http://www.cnel.ufl.edu/~markskow/papers/mdsThesisMain.pdf

http://ieeexplore.ieee.org/iel5/8159/23791/01091628.pdf

http://www.eng.uwo.ca/people/vparsa/Audiology/Compression_Tutorial.pdf
"It also seems to be established that compression limiting gives superior
quality to peak clipping, although the hearing aid needs to be sufficiently
saturated for this advantage to occur and it is not known how often this
degree of saturation occurs in practice for various degrees of hearing loss
and for various maximum power output settings. There appears, however, to be
no reason not to use output controlled compression limiting over peak
clipping, except for the most profoundly impaired listeners."

gwhite · Jan 7, 2005

I haven't verified it yet but it seems to me that
3 stages of phase shifter and 4 clippers should be able to make a
significant compression of amplitude but make less than 5% distortion on a
sine wave.

As far as a sine wave goes, the RF clipper eliminates harmonics entirely. A
baseband version has been given. The patent expired.

The intermodulation distortion will not be made zero by this method.

Nor by any method.

John Woodgate · Jan 7, 2005

I realize it is not radio (and that's actually why my doubt of the
application exists).

What do you doubt? I've posted as much detail about the application -
induction-loop systems for use with hearing aids - as I can without
compromising certain interests.

However, much of the work on articulation index
and speech intelligibility has occurred in the radio/telecom field.
More specifically, radio/telecom work has dealt quite directly with the
concept of speech clippers--probably more so than other fields since it
tends to be a distinctly peak power limited environment (especially
radio). It is a source of info so-to-speak. That is, the effects of
clipping on speech intelligibility have been dealt with directly.

Agreed, although those studies are on signals that can be more seriously
degraded (not by the clipping but by other system characteristics -
bandwidth and noise) than those I'm concerned with.

For all my references, clipping does not *increase* intelligibility. On
the other hand it does *not harm* the intelligibility for rather high
peak clipping levels (up to 20 dB or so). It does however degrade the
subjective quality. Again, this is according to my sources. One of
which is (ch2):

http://www.noblepub.com/shopexd.asp?id=11

I think the discussion in it is pretty good. "RF/IF clippers" are the
best.

RF/IF clipping is not an option. The amplifiers concerned are analogue,
with transformer/rectifier power supplies. As such they need minimal EMC
assessment and usually no testing. Introducing RF and/or digital
processing changes the situation greatly and involves significant extra
cost and development time.

Jim Thompson · Jan 7, 2005

What do you doubt? I've posted as much detail about the application -
induction-loop systems for use with hearing aids - as I can without
compromising certain interests.

Agreed, although those studies are on signals that can be more seriously
degraded (not by the clipping but by other system characteristics -
bandwidth and noise) than those I'm concerned with.

RF/IF clipping is not an option. The amplifiers concerned are analogue,
with transformer/rectifier power supplies. As such they need minimal EMC
assessment and usually no testing. Introducing RF and/or digital
processing changes the situation greatly and involves significant extra
cost and development time.

I'm puzzled by "RF/IF" clipping. How does that work to improve the
demodulated audio?

...Jim Thompson

Lasse Langwadt Christensen · Jan 7, 2005

John said:
I read in sci.electronics.design that Anthony C Smith wrote about
'Instantaneous (analogue) compression of speech signals':

Thanks for that. Are you sure it's even harmonics? If the clipping is
precisely symmetrical the harmonics are all odd order.

I believe I've seen it done with LEDs of different colours, to
get asymmetric clipping and thus even harmonics (as well as odd).

-Lasse

Lasse Langwadt Christensen · Jan 7, 2005

John said:
I read in sci.electronics.design that Keith Wootten wrote about
'Instantaneous (analogue) compression of speech signals', on Tue, 4 Jan
2005:

Interesting, because it has potentially less distortion than diode
clipping. It's not quite 'instantaneous', though, and it's not so easy
to find suitable lamps.

I have a pair of speakers that use a lamp to protect the tweeters, but
I'd say it's more of a power limiter than a voltage limiter.
And wouldn't it violate your requirement of not having a time constant to
worry about?

-Lasse

Ken Smith · Jan 7, 2005

I read in sci.electronics.design that Ken Smith wrote about
'"all pass" thought about (analogue) compression', on Fri, 7 Jan
2005:

I don't immediately see how that would work for a broadband input
signal. Splitting the signal into octave bands and processing as you
propose would indeed work, because the third and higher harmonics are
out-of-band, if the all-pass maintains its 180 degree phase-shift,
relative to that at f to 2f, from 3f to 6f, where f is the lower band-
edge frequency of an octave-band filter.

Imagine that we have broken the clipping operation into two steps.
Further imagine that we have made these steps such that if the first step
adds NmV of 5th harmonic to the signal, the second does also. This means
that the second step is a little harder than the first.

For purposes of thinking about it, assume we first pass the signal through
just the clippers with no phase shifter between them and record the
spectrum of the result. Then we do this:

An all pass filter with a modest Q can shift, let's say, the 1KHz to 5KHz
band. The phase curve suddenly starts adding delay at about the
1KHz point.

Any harmonic, made from a signal well below 1KHz, that is above the 1KHz
point will be shifted in phase relative to its fundamental.

If this signal is again clipped, new harmonic components will be created
in the clipping process. These new components will be at some phase angle
to the shifted ones that have passed through the all pass filter.

The sum of two vectors is at its maximum when the vectors are aligned.
Any phase difference between the new harmonics and the ones from the all
pass means that the amplitude of the sum will be less than if there was no
phase shift.

Over some band of frequencies, the phase shift will be between 120 and 240
degrees and the harmonics will tend to cancel.

Since none of the harmonics can be greater than the case where there was
no shifter but some are smaller, the THD is less for the circuit with the
phase shifter.
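
The vector-sum step in this argument can be put into numbers. This is a minimal sketch under the argument's own premise that each clipper contributes the same amplitude of a given harmonic. It evaluates only the phasor sum and models neither the clippers nor the all-pass, and the function name `summed_amplitude` is illustrative.

```python
import cmath
import math

def summed_amplitude(a, b, phase_deg):
    """Magnitude of the sum of two components a and b separated by phase_deg."""
    return abs(a + b * cmath.exp(1j * math.radians(phase_deg)))

for phase in (0, 60, 120, 180, 240, 300):
    print(f"{phase:3d} deg: {summed_amplitude(1.0, 1.0, phase):.3f}")
```

With equal contributions, the sum is at most what one clipper alone would produce whenever the relative phase lies between 120 and 240 degrees, and it cancels completely at 180 degrees. Outside that range the harmonic is still smaller than the in-phase sum, just not smaller than a single stage's contribution.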

That sounded clear to me, but I already know what I was thinking.

Ken Smith · Jan 7, 2005

As far as a sine wave goes, the RF clipper eliminates harmonics entirely. A
baseband version has been given. The patent expired.

Nor by any method.

FT the signal

Raise each amplitude to the 5/7th power but don't change the phase

iFT the new spectrum.

No new frequencies are created and no interaction between the amplitudes
has happened. This method has neither harmonic nor IM distortion.
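
The three steps above can be sketched on a single block of samples. Several things here are assumptions rather than part of the proposal: the block length of 64, the normalisation (bin magnitudes are scaled so a unit-amplitude sine sitting exactly on a bin has amplitude 1 before the power law is applied), and test tones chosen to land exactly on bins. A plain O(N^2) DFT is used to keep the example dependency-free, and the function names are illustrative.

```python
import cmath
import math

def dft(x):
    n = len(x)
    return [sum(x[k] * cmath.exp(-2j * math.pi * m * k / n) for k in range(n))
            for m in range(n)]

def idft(spec):
    n = len(spec)
    return [sum(spec[m] * cmath.exp(2j * math.pi * m * k / n) for m in range(n)).real / n
            for k in range(n)]

def compress_spectrum(x, exponent=5.0 / 7.0):
    """Raise each bin's amplitude to `exponent` while keeping its phase."""
    n = len(x)
    scale = n / 2.0  # DFT magnitude of a unit sine on an exact bin (assumed normalisation)
    out = []
    for z in dft(x):
        amplitude = abs(z) / scale
        out.append(cmath.rect(scale * amplitude ** exponent, cmath.phase(z)))
    return idft(out)

n = 64
two_tone = [math.sin(2 * math.pi * 3 * k / n) + 0.25 * math.sin(2 * math.pi * 5 * k / n)
            for k in range(n)]
y = compress_spectrum(two_tone)
spectrum = [abs(z) for z in dft(y)]
occupied = [m for m, a in enumerate(spectrum[: n // 2]) if a > 1e-6]
print("occupied bins:", occupied)
print("tone amplitudes:", 2 * spectrum[3] / n, 2 * spectrum[5] / n)
```

In this idealised case the weaker tone is raised relative to the stronger one and no other bins become occupied. Real speech will not sit on exact bins, so a practical version would need windowing and overlapping blocks, and the block length brings in exactly the kind of time constant the analogue approach was trying to avoid.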
