Thank you for bringing this up. I now remember that the context of 'deep'
clipping was modulated RF, although my mental dredge is coming up with AM
rather than SSB; the point is entirely the same.
I haven't actually played with that application myself, but it appears that
it would be a damned good approach for speech applications. Other people
responding know more about this specific application than I do, but it
might be good enough for a lot of applications.
I'm afraid you're out of my league here, although I do want to say that
it's only the implementation I'm ignorant about - I get the _point_ of
what you're saying about transforming signals quite clearly, thanks.
It isn't beyond your abilities AT ALL, I am sure. Basically, using an FFT
is conceptually like running a lot of narrow filters in parallel and then
applying a nonlinear operation (e.g. clipping, or some other, perhaps more
gentle, math operation) to each output to do the 'limiting' or
'compression'. A 'gentle' operation, though perhaps inadequate for speech,
might be 'sqrt', which has the effect of a 2:1 compression on the signal.
After the nonlinear operation, the signal is rebuilt by doing an inverse FFT.
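
As a rough illustration of the steps above (a sketch only, not the code
referred to in this post): transform a block, take the square root of each
bin's magnitude while leaving its phase alone, and transform back. The names
dft, idft, sqrt_compress and process_block are made up for this example, and
a naive O(N^2) DFT stands in for a real FFT so the block needs nothing beyond
the Python standard library.

```python
import cmath
import math

def dft(x):
    """Naive discrete Fourier transform (stand-in for a real FFT)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(bins):
    """Naive inverse discrete Fourier transform."""
    n = len(bins)
    return [sum(bins[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]

def sqrt_compress(bins):
    """Replace each bin's magnitude with its square root; keep the phase."""
    out = []
    for b in bins:
        magnitude = abs(b)
        phase = cmath.phase(b)
        out.append(cmath.rect(math.sqrt(magnitude), phase))
    return out

def process_block(x):
    """Forward transform, magnitude-only nonlinearity, inverse transform."""
    y = idft(sqrt_compress(dft(x)))
    # For real input the result is real apart from rounding error.
    return [v.real for v in y]
```

Two bins whose magnitudes are 100 and 1 (40 dB apart) come out as 10 and 1
(20 dB apart), which is the 2:1 behaviour in dB. Note that a plain sqrt
pivots around a magnitude of 1, so the overall scaling of the input matters.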
When doing the FFT method (any DSP engineer can help you with this), the
key is to use the correct windowing method (used to meld the chunks of
FFT samples together) and to make sure that the math operation doesn't screw
with the phase of the signal. The math operations should only play
with the amplitude, unless there is some much fancier operation involved.
It is amazing that the magnitude of the transformed audio can be
really severely damaged, but if the phase is kept the same, the
reconstructed audio is still recognizable.
For the window, looking at my (perhaps incorrect) source code,
the comments say that I used the Hann window. If someone really
needs to know, I can probably resurrect the code. I haven't looked
at it seriously in the last 3-5 years.
The method that I used to overlap the FFTs and do the necessary windowing
did a pretty good job of avoiding the expected 'choppiness' in the signal.
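
To show why overlapped Hann windows can avoid choppiness, here is a small
sketch (an assumption about one common arrangement, not a reconstruction of
the code mentioned above): a periodic Hann window with 50% overlap sums to
exactly 1 wherever two blocks overlap, so unprocessed blocks add back up to
the original signal. The function names and the block/hop sizes are
illustrative only.

```python
import math

def hann(size):
    """Periodic Hann window of the given length."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / size) for i in range(size)]

def windowed_blocks(signal, size, hop):
    """Cut the signal into Hann-windowed blocks spaced hop samples apart."""
    w = hann(size)
    return [[signal[start + i] * w[i] for i in range(size)]
            for start in range(0, len(signal) - size + 1, hop)]

def overlap_add(blocks, hop):
    """Sum equal-length blocks back together at hop-sample spacing."""
    size = len(blocks[0])
    out = [0.0] * (hop * (len(blocks) - 1) + size)
    for b, block in enumerate(blocks):
        for i, value in enumerate(block):
            out[b * hop + i] += value
    return out
```

With size=16 and hop=8, every sample except the first and last half-block is
covered by two windows whose weights add to 1. In a real processor the
per-block FFT work would sit between windowed_blocks and overlap_add, and a
nonlinear operation there can disturb this exact cancellation, which is one
reason the choice of window matters.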
The FFT method of signal processing was just too fancy and too aggressive
for my own needs. In fact, I didn't like the sound of multi-band audio
AGC in general, and instead developed a very fancy, complex attack/decay
time scheme that achieves low distortion with fast, effective attack/decay
times. (LF modulation of other signal components and various other kinds of
LF distortion are audibly mitigated by doing a super intelligent control
of the gain... The attack and decay times are totally undefinable except
in an instantaneous sense.) On my desktop machine, my most fancy single
band algorithm (which is probably more complex than many multi-band schemes)
takes about 1/100 of the CPU. Trivial AGC algorithms can probably be
1000X faster than that... My straightforward implementation of the
complex algorithm does use a limited number of exp and log type operations.
Even a multi-band scheme will produce short term distortion
products simply because of the physics and mathematics that define
the limitations of real world frequency domain filters. So, I designed
a single band gain control scheme that confines most of the intermod
problems to short transients -- probably worse than a multi-band
scheme, but damned good for single band. It isn't perfect, but is about
as good as a single band scheme can be (in fact, it is probably better
than many multi-band schemes.) The multi-band schemes are still limited
by the phasing effects on the sound (for deep/fast compression.) Both
the pumping avoidance and intermod avoidance are fairly well achieved
in my single band scheme. When the single band AGC is used sanely
(e.g. 1.4:1 through 3:1 compression, in dB, with the compression set
to be gentle), the audio still sounds really good.
If the compression is aggressively applied, it still sounds 'good',
still doesn't audibly pump, and maintains a very high density
of the audio, but has little purpose other than perhaps to process
audio for advertisements, shortwave or an AM station transmitter.
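
For readers unfamiliar with what a ratio like 2:1 or 3:1 'in dB' means, here
is a deliberately trivial single band AGC sketch. It is NOT the scheme
described above, which is far more elaborate; it is the textbook kind with
fixed attack and release times. The threshold, the time constants and all
function names are assumptions chosen for illustration.

```python
import math

def compressed_level_db(level_db, threshold_db, ratio):
    """Above the threshold, each 'ratio' dB of input gives 1 dB of output."""
    if level_db <= threshold_db:
        return level_db
    return threshold_db + (level_db - threshold_db) / ratio

def gain_db(level_db, threshold_db, ratio):
    """Gain change (in dB) needed to reach the compressed level."""
    return compressed_level_db(level_db, threshold_db, ratio) - level_db

def simple_agc(samples, sample_rate, threshold_db=-20.0, ratio=2.0,
               attack_s=0.005, release_s=0.2):
    """Feed-forward gain control with a one-pole envelope follower."""
    attack = math.exp(-1.0 / (attack_s * sample_rate))
    release = math.exp(-1.0 / (release_s * sample_rate))
    envelope = 0.0
    out = []
    for x in samples:
        magnitude = abs(x)
        # Track rising levels quickly and falling levels slowly.
        coeff = attack if magnitude > envelope else release
        envelope = coeff * envelope + (1.0 - coeff) * magnitude
        level_db = 20.0 * math.log10(max(envelope, 1e-9))
        gain = 10.0 ** (gain_db(level_db, threshold_db, ratio) / 20.0)
        out.append(x * gain)
    return out
```

With a -20 dB threshold and a 2:1 ratio, an input level of -10 dB maps to
-15 dB. Fixed time constants like these are exactly what tends to produce
the audible pumping and LF modulation discussed above when pushed hard.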
Of course, this is fairly far off topic WRT speech processing, so I won't
bother you with more off topic info (unless someone is interested in
the latest version of my audio AGC code -- very old versions are used in
some free and probably commercial software.) The new stuff (developed in
the last several years) is far, far better than anything else that I have
played with (or developed myself.) It is still ugly, but could be cleaned
up if there were any demand. (It is written in sane C++, and happily uses
inline asm for P4 SSE math operations, some of which take advantage of the
SIMD capabilities.)
John