Instantaneous (analogue) compression of speech signals

John Woodgate · Jan 4, 2005

I read in sci.electronics.design that Jim Thompson wrote about
'Instantaneous (analogue) compression of speech signals', on Tue,
4 Jan 2005:

As a start, at the very least, don't you need to define the level at
which ONSET of amplitude reduction begins, and then an absolute MAXIMUM
output level?

Sure, but that comes later. I'm first looking for different techniques
to try.

In fact, it's easy to define the critical points in millivolts or
whatever using sine wave signals; it's another matter to make
*meaningful* measurements of the speech signals.
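To make the two critical points concrete, here is a minimal sketch of a memoryless ("instantaneous") soft clipper that is linear up to an onset level and then bends over toward an absolute maximum. The tanh knee and the 100 mV / 300 mV figures are assumptions chosen purely for illustration; nobody in the thread has specified a curve.

```python
import math

def soft_clip(x, onset=0.1, vmax=0.3):
    """Memoryless soft clipper (levels in volts, both values hypothetical).

    Below `onset` the signal passes unchanged. Above it the output bends
    over smoothly and can never exceed `vmax`, however large the input.
    """
    mag = abs(x)
    if mag <= onset:
        return x
    headroom = vmax - onset
    # tanh has unit slope at zero, so the curve has no kink at the onset.
    y = onset + headroom * math.tanh((mag - onset) / headroom)
    return math.copysign(y, x)

# Sine-wave style check of the two critical points.
for level in (0.05, 0.1, 0.2, 0.5, 2.0):
    print(f"in {level:5.2f} V -> out {soft_clip(level):.4f} V")
```

Because the output depends only on the present input sample, there is no time constant anywhere in it, which is the property being asked for.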

Keith Wootten · Jan 4, 2005

John Woodgate said:
I read in sci.electronics.design that Keith Wootten wrote about
'Instantaneous (analogue) compression of speech signals', on Tue, 4 Jan
2005:

Interesting, because it has potentially less distortion than diode
clipping. It's not quite 'instantaneous', though, and it's not so easy
to find suitable lamps.

It may be effective in conjunction with diode clipping - a small lamp
will respond in (probably) some few milliseconds, so the initial harsh
diode clipping would be short-lived and possibly less audible and less
objectionable as a result. Speculation, naturally.

Does he have a web site?

No, it was a poor joke. DSP may be over the top, but a PIC (or
whatever) with ADC, look-up table and DAC would be pretty simple for a
Lo-Fi implementation.
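For what the look-up table idea might look like, here is a rough sketch in Python rather than PIC code. It assumes an 8-bit offset-binary ADC and DAC and a tanh curve with an arbitrary `drive` value; all of those are illustrative assumptions, not anything measured.

```python
import math

BITS = 8                   # assumed converter resolution
MID = 1 << (BITS - 1)      # code for zero signal in offset binary (128)
FULL = MID - 1             # largest positive excursion in codes (127)

def build_table(drive=3.0):
    """Build a table mapping every ADC code to a compressed DAC code."""
    norm = math.tanh(drive)            # full-scale in maps to full-scale out
    table = []
    for code in range(1 << BITS):
        x = (code - MID) / FULL        # roughly -1.0 .. +1.0
        y = math.tanh(drive * x) / norm
        out = MID + round(y * FULL)
        table.append(max(0, min((1 << BITS) - 1, out)))   # stay in DAC range
    return table

TABLE = build_table()

def process_sample(adc_code):
    """Per-sample work on the micro: one table read, no arithmetic."""
    return TABLE[adc_code]
```

The table would be computed offline and stored in program memory, so the run-time cost is one indexed read per sample. Changing the curve means regenerating 256 bytes, nothing more.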

Cheers

Roger Hamlett · Jan 4, 2005

John Woodgate said:
Does anyone here have any experience of instantaneous (analogue)
compression (aka soft clipping) of speech signals? I've been doing a
little work on it but I'm unable to judge the resulting sound quality.
Why do treble boost controls no longer have any audible effect for me?
(;-)

It really does depend what you want. Generally, you can clip quite
aggressively, _provided_ you have an automatic gain adjustment before the
clipping, and may actually improve intelligibility if this is done right.
The IC that used to be commonly used was the Plessey SL6270 VOGAD (voice
operated gain adjustment device), which massively reduced the dynamic
range needed for speech. A search on this may find an equivalent. Are you
trying to listen while speaking, or recording the sound? It is terribly
difficult to 'judge' your own voice through such a circuit. Try using a
text message recorded by somebody else, and playing it through the
circuit.

I might even be able to find a couple of the original Plessey ICs. We
used them many years ago, as part of a speech digitisation system, and
there may still be some in my cupboards somewhere!...

Presumably you have lost most of the higher frequency response in your
hearing. This has practically no effect on speech (you can filter out
everything above about 2.5 kHz, and still understand speech perfectly), but
would massively reduce the effect of treble 'boost' controls.
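As a rough illustration of "gain adjustment first, clipping second", the sketch below runs a slow level estimate ahead of a hard clipper. It is not a model of the SL6270; the attack and release coefficients, the target level and the clip level are all assumed values picked only to show the structure.

```python
import math

def agc_then_clip(samples, target=0.5, clip_level=0.35,
                  attack=0.01, release=0.0005):
    """Slow automatic gain adjustment followed by a hard clipper.

    The level estimate rises quickly (attack) and falls slowly (release).
    Gain is set so the estimated level sits at `target`; setting
    `clip_level` below `target` means the peaks are always clipped.
    """
    envelope = 1e-6
    out = []
    for x in samples:
        level = abs(x)
        coeff = attack if level > envelope else release
        envelope += coeff * (level - envelope)
        gain = target / max(envelope, 1e-3)   # limit gain during silence
        y = gain * x
        out.append(max(-clip_level, min(clip_level, y)))
    return out

# A quiet talker and a loud one (40 dB apart) end up at the same peak level.
quiet = [0.01 * math.sin(2 * math.pi * n / 40) for n in range(4000)]
loud = [1.0 * math.sin(2 * math.pi * n / 40) for n in range(4000)]
quiet_out = agc_then_clip(quiet)
loud_out = agc_then_clip(loud)
print(max(quiet_out[-400:]), max(loud_out[-400:]))
```

The point of the gain stage is that the clipper then always works at the same depth, whatever the talker's level.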

Best Wishes

John Woodgate · Jan 4, 2005

I read in sci.electronics.design that Keith Wootten wrote about
'Instantaneous (analogue) compression of speech signals', on Tue, 4 Jan
2005:

No, it was a poor joke.

It was a reasonable feed for a dead-pan response.

DSP may be over the top, but a PIC (or
whatever) with ADC, look-up table and DAC would be pretty simple for a
Lo-Fi implementation.

True, but anything like that raises EMC issues, which I don't want to
get involved in. Analogue is much less hassle, if it works well enough.

John Woodgate · Jan 4, 2005

I read in sci.electronics.design that Roger Hamlett
<rogerspamignored@ttelmah.demon.co.uk> wrote about
'Instantaneous (analogue) compression of speech signals', on Tue,
4 Jan 2005:

It really does depend what you want.

Almost everything does. (;-)

Generally, you can clip quite
aggressively, _provided_ you have an automatic gain adjustment before the
clipping, and may actually improve intelligibility if this is done right.

Do you have any references for the increase in intelligibility? That's
part of the bigger picture.

The IC that used to be commonly used, was the Plessey SL6270 VOGAD (voice
operated gain adjustment device), which massively reduced the dynamic
range needed for speech.

Yes, I have one of those in my assisted hearing device that I use in
committee meetings. Google shows a number of NOS sources but I don't see
a current equivalent.

A search on this may find an equivalent. Are you
trying to listen while speaking, or recording the sound? It is terribly
difficult to 'judge' your own voice through such a circuit. Try using a
text message recorded by somebody else, and playing it through the
circuit.

That's what I'm using; the Canford Quick Check voice tracks, for
example.

I might even be able to find a couple of the original Plessey ICs. We
used them many years ago, as part of a speech digitisation system, and
there may still be some in my cupboards somewhere!...
Presumably you have lost most of the higher frequency response in your
hearing. This has practically no effect on speech (you can filter out
everything above about 2.5 kHz, and still understand speech perfectly), but
would massively reduce the effect of treble 'boost' controls.

Yes, you've twigged it. I have lost a lot from 1 kHz up and that does
cause problems with speech.

Jim Thompson · Jan 4, 2005

Does anyone here have any experience of instantaneous (analogue)
compression (aka soft clipping) of speech signals? I've been doing a
little work on it but I'm unable to judge the resulting sound quality.
Why do treble boost controls no longer have any audible effect for me?
(;-)

How about "compandors" used by most radio stations to keep their
modulation index maxed out....

http://www.semiconductors.philips.com/acrobat_download/applicationnotes/AN176.pdf

http://www.portset.co.uk/compand.htm

http://www.toko.co.jp/products/ctlg/ic/com_compandor_e.htm

http://www.chipdocs.com/datasheets/datasheet-pdf/Philips-Semiconductors/NE570.html

http://ieeexplore.ieee.org/Xplore/Toclogin.jsp?url=/iel5/4/22551/01050814.pdf

http://www.onsemi.com/site/products/parts/0,4454,62,00.html

...Jim Thompson

John Woodgate · Jan 4, 2005

I read in sci.electronics.design that Jim Thompson wrote about
'Instantaneous (analogue) compression of speech signals', on Tue,
4 Jan 2005:

http://www.semiconductors.philips.com/acrobat_download/applicationnotes/AN176.pdf

This is helpful for theory but the devices are not now available, I
think.

http://www.portset.co.uk/compand.htm

I have concerns about the 'direct' mode, which is clearly non-linear!

http://www.toko.co.jp/products/ctlg/ic/com_compandor_e.htm

http://www.chipdocs.com/datasheets/datasheet-pdf/Philips-Semiconductors/NE570.html

Not 'instantaneous' and data only in Japanese :-(

http://ieeexplore.ieee.org/Xplore/Toclogin.jsp?url=/iel5/4/22551/01050814.pdf

Not accessible to me.

http://www.onsemi.com/site/products/parts/0,4454,62,00.html

Not instantaneous; these use a rectifier and thus involve at least one
time-constant.
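To show what that time-constant does, here is a small sketch of a rectifier followed by a single RC smoothing stage. The 8 kHz sample rate and the 20 ms time constant are assumed figures, not taken from any of the data sheets linked above.

```python
import math

def envelope_follower(samples, sample_rate=8000.0, tau=0.02):
    """Full-wave rectifier followed by one-pole (RC) smoothing.

    `tau` is the smoothing time constant in seconds. The gain-control
    signal in a rectifier-based compandor behaves broadly like this.
    """
    alpha = 1.0 - math.exp(-1.0 / (sample_rate * tau))
    env = 0.0
    out = []
    for x in samples:
        env += alpha * (abs(x) - env)
        out.append(env)
    return out

# A sudden full-level input: the detector needs about tau to reach 63 %.
step = [1.0] * 800
env = envelope_follower(step)
print(f"after 1 sample: {env[0]:.4f}, after 20 ms: {env[159]:.4f}")
```

Until the detector catches up, the gain is still set for the earlier, quieter signal, so the start of each loud syllable passes uncompressed. That lag is the extra variable a truly instantaneous scheme avoids.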

Thanks for your help.

Reg Edwards · Jan 4, 2005

Digital stuff is just a passing fashion.

ANALOGUE INEVITABLY RULES!

You can't design a microprocessor without considering the inter-connections
to be transmission lines with Zo, attenuation and phase delay.

Rob Gaddi · Jan 4, 2005

John said:
True, but anything like that raises EMC issues, which I don't want to
get involved in. Analogue is much less hassle, if it works well enough.

Not too badly. TI's MSP430 line pulls less than 2 mA and doesn't
need an external clock. I haven't read up specifically on any of the
EMC issues, but with no loops or high currents it's hard to see what
would broadcast.

Jim Thompson · Jan 4, 2005

Digital stuff is just a passing fashion.

ANALOGUE INEVITABLY RULES!

You can't design a microprocessor without considering the inter-connections
to be transmission lines with Zo, attenuation and phase delay.

Poor Reg, a senile ending to a once-creative mind :-(

...Jim Thompson

Jim Thompson · Jan 4, 2005

I read in sci.electronics.design that Jim Thompson wrote about
'Instantaneous (analogue) compression of speech signals', on Tue,
4 Jan 2005:

This is helpful for theory but the devices are not now available, I
think.

I have concerns about the 'direct' mode, which is clearly non-linear!

Not 'instantaneous' and data only in Japanese :-(

Not accessible to me.

Not instantaneous; these use a rectifier and thus involve at least one
time-constant.

Thanks for your help.

I thought you were the audio expert ?

What curve would you like and I'll create it in circuitry for you?

...Jim Thompson

John Woodgate · Jan 4, 2005

I read in sci.electronics.design that Jim Thompson wrote about
'Instantaneous (analogue) compression of speech signals', on Tue,
4 Jan 2005:

What curve would you like and I'll create it in circuitry for you?

I couldn't afford your professional services. And as yet I don't know
what I want. I'm still at the breadboard stage. I have something that
'doesn't not work', but I want to know how far off optimum it is. And
the sound quality is important but I can't hear well enough to assess
it. I need 'ears-on' local assistance, and I have some colleagues
visiting on Friday. Maybe they will listen for me.

Jim Thompson · Jan 4, 2005

I read in sci.electronics.design that Jim Thompson wrote about
'Instantaneous (analogue) compression of speech signals', on Tue,
4 Jan 2005:

I couldn't afford your professional services.

I didn't note a fee attached to my offer ;-)

And as yet I don't know
what I want. I'm still at the breadboard stage. I have something that
'doesn't not work', but I want to know how far off optimum it is. And
the sound quality is important but I can't hear well enough to assess
it. I need 'ears-on' local assistance, and I have some colleagues
visiting on Friday. Maybe they will listen for me.

OK. I appreciate how hearing loss slips up on you. I've very little
high frequency response in my left ear... If I bury my right ear in
the pillow I don't even hear the phone ring in our bedroom
(high-pitched electronic "ringer").

...Jim Thompson

John Woodgate · Jan 4, 2005

I read in sci.electronics.design that Jim Thompson wrote about
'Instantaneous (analogue) compression of speech signals', on Tue,
4 Jan 2005:

I didn't note a fee attached to my offer ;-)

Neither did I, and I wanted to confirm that up-front. (;-) Let's see how
things pan out.

Tom MacIntyre · Jan 4, 2005

I didn't note a fee attached to my offer ;-)

OK. I appreciate how hearing loss slips up on you. I've very little
high frequency response in my left ear... If I bury my right ear in
the pillow I don't even hear the phone ring in our bedroom
(high-pitched electronic "ringer").

We're practically twins in that regard, Jim, except mine was sudden,
due to a skull fracture, and complete, profound, 100% loss in the left
ear, with nothing left behind but the tinnitus. It makes it easy to
sleep when lying on the right side, but I have concerns about smoke
detectors, etc.

Tom

Ken Smith · Jan 5, 2005

Jim Thompson said:
How about "compandors" used by most radio stations to keep their
modulation index maxed out....

I like the homomorphic compressor, not because it is better in any way but
because it is unusual.

Take the abs() of the signal but remember the sign.

Take the ln() of the abs()

High pass

do the exp()

Restore the sign.
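The five steps above can be sketched as follows. The one-pole high-pass and its coefficient are assumptions (no particular filter is named), and a small floor is added because ln(0) is undefined at zero crossings.

```python
import math

def homomorphic_compress(samples, alpha=0.995, floor=1e-6):
    """abs -> ln -> high-pass -> exp -> restore sign.

    `alpha` sets the corner of a one-pole high-pass working on the log
    magnitude (assumed filter). `floor` keeps ln() finite near zero.
    """
    out = []
    prev_log = None
    prev_hp = 0.0
    for x in samples:
        sign = -1.0 if x < 0 else 1.0          # remember the sign
        log_mag = math.log(max(abs(x), floor))
        if prev_log is None:
            prev_log = log_mag                 # avoid a start-up transient
        # One-pole high-pass: passes fast changes, removes the slow level.
        hp = alpha * (prev_hp + log_mag - prev_log)
        prev_log, prev_hp = log_mag, hp
        out.append(sign * math.exp(hp))        # undo the log, restore sign
    return out

# The same waveform at two levels 20 dB apart gives the same output.
base = [0.3, -0.5, 0.2, 0.7, -0.1, 0.4] * 50
a = homomorphic_compress(base)
b = homomorphic_compress([10.0 * v for v in base])
print(max(abs(p - q) for p, q in zip(a, b)))
```

A gain change is a multiplication, which the log turns into an added constant, and the high-pass then removes that constant. Only the fast detail of the waveform survives, which is where the compression comes from.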

Rich Grise · Jan 5, 2005

I read in sci.electronics.design that Anthony C Smith ....

Thanks for that. Are you sure it's even harmonics? If the clipping is
precisely symmetrical the harmonics are all odd order.
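The point about symmetry is easy to check numerically. The sketch below hard-clips one cycle of a sine, first symmetrically and then with unequal positive and negative limits, and measures the first six harmonics by direct correlation. The clip levels are arbitrary illustrative values.

```python
import math

def harmonic_amplitudes(waveform, max_harmonic=6):
    """Amplitude of harmonics 1..max_harmonic of one full cycle."""
    n = len(waveform)
    amps = []
    for k in range(1, max_harmonic + 1):
        re = sum(v * math.cos(2 * math.pi * k * i / n)
                 for i, v in enumerate(waveform))
        im = sum(v * math.sin(2 * math.pi * k * i / n)
                 for i, v in enumerate(waveform))
        amps.append(2.0 * math.hypot(re, im) / n)
    return amps

N = 1024
sine = [math.sin(2 * math.pi * i / N) for i in range(N)]
symmetric = [max(-0.5, min(0.5, v)) for v in sine]    # equal limits
asymmetric = [max(-0.5, min(0.8, v)) for v in sine]   # unequal limits

sym_h = harmonic_amplitudes(symmetric)
asym_h = harmonic_amplitudes(asymmetric)
for k in range(6):
    print(f"H{k + 1}: symmetric {sym_h[k]:.5f}  asymmetric {asym_h[k]:.5f}")
```

With equal limits the even harmonics come out at rounding-error level. Even harmonics appear only when the two limits differ, as they would with mismatched diodes or a DC offset ahead of the clipper.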

Anything using a rectifier involves a time constant, and I want to avoid
that because it introduces an extra variable - the time constant.

If a diode clipper is unsatisfactory, would a log amp do?

Thanks,
Rich

Rich Grise · Jan 5, 2005

I read in sci.electronics.design that Keith Wootten wrote about
'Instantaneous (analogue) compression of speech signals', on Tue, 4 Jan
2005:

Interesting, because it has potentially less distortion than diode
clipping. It's not quite 'instantaneous', though, and it's not so easy
to find suitable lamps.
Does he have a web site?

He will, as soon as they finish that span to the grapes.

;-)
Rich

Pig Bladder · Jan 5, 2005

I read in sci.electronics.design that Roger Hamlett <rogerspamignored@tt
Yes, you've twigged it. I have lost a lot from 1 kHz up and that does
cause problems with speech.

You're trying to build a hearing aid, without admitting that you need a
hearing aid, is this it?

John S. Dyson · Jan 5, 2005

Thank you for bringing this up. I now remember, the context of "'Deep'
clipping" was modulated RF, although my mental dredge is coming up AM, as
opposed to SSB, but the point is entirely the same.

Not having actually played with that application myself, it appears that it
would be a damned good approach for speech applications. There have
been some other people responding who know more about this specific
application than I do, but it might be good enough for a lot of applications.

I'm afraid you're out of my league here, although I do want to say that
it's only the implementation I'm ignorant about - I get the _point_ of
what you're saying about transforming signals quite clearly, thanks.

It isn't beyond your abilities AT ALL, I am sure. Basically,
using an FFT is conceptually like running a lot of filters and then applying
a nonlinear operation (e.g. clipping, or some other, perhaps more gentle math
operation) that does the 'limiting' or 'compression'. A 'gentle' operation,
though perhaps inadequate for speech, might be 'sqrt.' This would
have the effect of doing a 2:1 compression on the signal. After the
nonlinear operation, the signal is rebuilt by doing an inverse FFT.

When doing the FFT method (any DSP engineer can help you with this), the
key is to use the correct windowing method (used to meld the chunks of
FFT samples together) and make sure that the math operation doesn't screw
with the phase of the signal. The math operations should only play
with the amplitude, unless there is some kind of much more fancy operation.
It is amazing that the magnitude of the transformed audio can be
really severely damaged, but if the phase is kept the same, then the
reconstructed audio is still recognizable.

For the window, when looking at my (perhaps incorrect) source code,
the comments say that I used the 'Hann window.' If someone really
needs to know, I can probably resurrect the code. I haven't looked
at it seriously in the last 3-5 years.
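For anyone wanting to try the FFT approach, here is a minimal sketch. It is not the code referred to above: the frame length, the 50% overlap, the periodic Hann window and the small built-in FFT are all assumptions made to keep the example self-contained.

```python
import cmath
import math

def fft(x, inverse=False):
    """Recursive radix-2 FFT, unscaled; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    sign = 1.0 if inverse else -1.0
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def spectral_compress(samples, frame=256, shape=math.sqrt):
    """Hann-windowed, 50%-overlap FFT processing of bin magnitudes only.

    `shape` maps each bin magnitude to a new magnitude; sqrt gives 2:1
    compression in dB. The phase of every bin is left untouched.
    """
    hop = frame // 2
    # Periodic Hann window: copies shifted by half a frame sum to exactly 1.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * i / frame)
              for i in range(frame)]
    out = [0.0] * len(samples)
    for start in range(0, len(samples) - frame + 1, hop):
        chunk = [samples[start + i] * window[i] for i in range(frame)]
        shaped = []
        for c in fft(chunk):
            mag = abs(c)
            # Rescale the bin to its new magnitude, keeping its angle.
            shaped.append(c * (shape(mag) / mag) if mag > 0.0 else 0j)
        rebuilt = fft(shaped, inverse=True)
        for i in range(frame):
            out[start + i] += rebuilt[i].real / frame   # overlap-add
    return out
```

Passing `shape=lambda m: m` makes the whole chain transparent away from the two ends of the signal, which is a convenient check that the windowing and overlap are right before trying any nonlinear shaping. The absolute output level with `sqrt` depends on the frame length, so some make-up gain would be needed in practice.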

The method that I used to overlap the FFTs and do the necessary windowing
did a pretty good job of avoiding the expected 'choppiness' in the signal.
The FFT method of signal processing was just too fancy and too aggressive
for my own needs. In fact, I didn't like the sound of multi-band audio
AGC in general, and instead developed a very fancy complex attack/decay
time scheme that gives low distortion with fast effective attack/decay times.
(LF modulation of other signal components and various other kinds of
LF distortion are audibly mitigated by doing a super intelligent control
of the gain... The attack and decay times are totally undefinable except
in an instantaneous sense.) On my desktop machine, my most fancy single
band algorithm (which is probably more complex than many multi-band schemes)
takes about 1/100 of the CPU. Trivial AGC algorithms can probably be
1000X faster than that... My straightforward implementation of the
complex algorithm does use limited numbers of exp and log type operations.

Even a multiband scheme will produce short term distortion
products simply because of the physics and mathematics that define
the limitations of real world frequency domain filters. So, I designed
a single band gain control scheme that confines most of the intermod
problems to short transients -- probably worse than a multiband
scheme, but damned good for single band. It isn't perfect, but is about
as good as a single band scheme can be (in fact, it is probably better
than many multi-band schemes.) The multi-band schemes are still limited
by the phasing effects on the sound (for deep/fast compression.) Both
the pumping avoidance and intermod avoidance are fairly well achieved
in my single band scheme. When the single band AGC is used sanely
(e.g. 1.4:1 through 3:1 compression, in dB) and the
compression is set to be gentle, the audio still sounds really good.
If the compression is aggressively applied, it still sounds 'good',
and still doesn't audibly pump, and maintains a very high density
of the audio, but has little purpose other than perhaps to process
audio for advertisements, shortwave or an AM station transmitter.
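For readers unfamiliar with ratios quoted "in dB", the static part of such a curve can be sketched as below. The -40 dB threshold is an assumed figure for illustration only, and this says nothing about the attack/decay behaviour described above.

```python
def compressed_level_db(in_db, ratio, threshold_db=-40.0):
    """Static compression curve in dB.

    Above the (hypothetical) threshold, every `ratio` dB of input change
    gives 1 dB of output change. Below it the level is left alone.
    """
    if in_db <= threshold_db:
        return in_db
    return threshold_db + (in_db - threshold_db) / ratio

for ratio in (1.4, 3.0):
    levels = [compressed_level_db(db, ratio) for db in (-40.0, -20.0, 0.0)]
    print(ratio, [round(v, 1) for v in levels])
```

At 3:1 a 40 dB range of input above the threshold is squeezed into roughly 13 dB of output; at 1.4:1 the same range still occupies about 29 dB.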

Of course, this is fairly far off topic WRT speech processing, so I won't
bother you with more off topic info (unless someone is interested in
my latest version of my audio AGC code -- very old versions are used in
some free and probably commercial software.) The new stuff (developed in
the last several years) is far, far better than anything else that I have
played with (or developed myself.) It is still ugly, but could be cleaned
up if there were any demand. (It is written in sane C++, and happily uses
inline asms for P4 SSE math operations; some take advantage of the SIMD
capabilities.)

John
