Difference between revisions of "How Audacity Noise Reduction Works"

From Audacity Wiki
(Reply to Peter about this page and the Manual page.)
Line 2: Line 2:
 
* '''Gale 19Dec14:''' There was previously perceived a need for a page like this that had greater technical depth than appropriate for the Manual, and I think the need is still there. Some of this content is still valid, the second question completely so. The content surely needs updating. Some worthwhile content is obviously in [[Proposal Noise Removal]]. If nothing else can be done about this page before 2.1.0 then we should link to that page I think.<p>I think because of the slightly unfortunate rename of the effect, this page needs to state it is nothing to do with http://en.wikipedia.org/wiki/Dolby_noise-reduction_system, or should be moved to [[How Audacity Noise Reduction Works]].</p>
 
 
*'''Peter 21Dec14:''' I never understood the rationale for this page.  The content appears more suited to civilian users rather than developers (and they should really be catered for in the Manual's documentation as you observed elsewhere). The "artifacts" section could become a FAQ perhaps.  I can probably see a need for a detailed page on the algorithm and implementation for future developers as it is obviously a fairly complicated effect (witnessed by the amount of email traffic as it was under development).  Maybe we should be encouraging Paul Licameli to write the page for us?
 
 +
**'''Paul 27Jan15:''' I have addressed the problem of dated or incomplete content.  I rewrote the page, retaining much of the old content that was still correct.  Still debatable is the purpose of the page and whether some of this is worth replicating in the user manual.
 
}}
 
}}
 
== About the Noise Reduction algorithm==
 
The noise reduction algorithm uses {{external|[http://en.wikipedia.org/wiki/Fourier_analysis Fourier analysis]}}: it finds the spectrum of pure tones that make up the background noise in the quiet sound segment that you selected - that's called the "frequency spectrum" of the sound.  That forms a fingerprint of the static background noise in your sound file.  When you reduce noise from the sound as a whole, the algorithm finds the frequency spectrum of each short segment of sound.  Any pure tones that aren't sufficiently louder than their average levels in the fingerprint are reduced in volume.  That way, (say) a guitar note or an overtone of the singer's voice are preserved, but hiss, hum, and other steady noises can be minimized. The general technique is called {{external|[http://en.wikipedia.org/wiki/Noise_gate spectral noise gating]}}.   
+
 
 +
'''Q:''' <i>How do you actually reduce noise?  What is the algorithm?</i>
 +
 
 +
'''A:''' The noise reduction algorithm uses {{external|[http://en.wikipedia.org/wiki/Fourier_analysis Fourier analysis]}}: it finds the spectrum of pure tones that make up the background noise in the quiet sound segment that you selected - that's called the "frequency spectrum" of the sound.  That forms a fingerprint of the static background noise in your sound file.  When you reduce noise from the sound as a whole, the algorithm finds the frequency spectrum of each short segment of sound.  Any pure tones that aren't sufficiently louder than their average levels in the fingerprint are reduced in volume.  That way, (say) a guitar note or an overtone of the singer's voice are preserved, but hiss, hum, and other steady noises can be minimized. The general technique is called {{external|[http://en.wikipedia.org/wiki/Noise_gate spectral noise gating]}}.   
  
 
The first pass of noise reduction is done over just noise.  For each windowed sample of the sound, we take a Fast Fourier Transform (FFT) using a Hann window and then statistics, including the mean power, are tabulated for each frequency band.
 
Line 15: Line 19:
 
{{ednote|
 
{{ednote|
 
* '''Paul 24Jan15:''' Time smoothing alludes to the attack and release.  It was lately decided to hide those controls at least for 2.1.0, but there is still some nonzero time smoothing hardcoded, so this is still worth mention.  The prior description said frequency smoothing was applied before time smoothing, which was incorrect both for 2.0.6 and for the rewritten effect.
 
** '''Gale 25Jan15:''' I included the mention in the text above. Please correct it if it is wrong.}}
+
** '''Gale 25Jan15:''' I included the mention in the text above. Please correct it if it is wrong.
 +
** '''Paul 27Jan15:''' It's good.  I never agreed with hiding attack and release (release may be the more useful) but I got overruled.  Longer release might give better results when the sound has reverb or percussive notes with decaying tails.}}
 +
 
 +
'''Q:''' <i>How many frequency bands does the noise gate use?</i>
 +
 
 +
'''A:''' In Audacity we use an FFT size of 2048, which results in 1025 frequency bands.
 +
 
  
==Frequency Bands==
+
===Artifacts===  
In Audacity we use an FFT size of 2048, which results in 1025 frequency bands.
 
  
 +
'''Q:''' <i>What causes the 'tinkling' artifacts, and what steps can and have been taken to eliminate them?</i>
  
==Artifacts==
+
'''A:''' The tinkly artifacts happen when individual pure tones are near the threshold to be preserved -- they are small pieces of the background soundscape that survived the thresholding, perhaps because the background noise is slightly different from the fingerprint or because the main sound has overtones that are imperceptible but that boost them slightly over the threshold.
The tinkly artifacts happen when individual pure tones are near the threshold to be preserved -- they are small pieces of the background soundscape that survived the thresholding, perhaps because the background noise is slightly different from the fingerprint or because the main sound has overtones that are imperceptible but that boost them slightly over the threshold.
 
  
 
Any Fourier-based noise reduction algorithm will have some artifacts like the "tinkle-bells".  They are a symptom of the problem of ''discrimination'' - deciding whether a particular analog signal is above or below a decision threshold - that is central to the fields of digital data processing and information theory.  In general the tinkle-bell artifacts are ''quieter'' than the original noise.  The real question is whether they are ''more noticeable'' than the original noise.  (For example, noise-gating the Beatles' ''Sun King'' track off the ''Abbey Road'' album is a bad idea, because the soft brushed cymbal sounds merge smoothly into the tape hiss on the original master recording, so tinkle bells and a related problem -- fluttering -- are prominent in noise-gated versions of that track.)
 
Line 29: Line 38:
 
{{ednote|
 
{{ednote|
 
* '''Paul 24Jan15:''' I almost want this paragraph in the Manual rather than here.  Just remove the talk of "discrimination."
 
** '''Gale 25Jan15:''' I assume you mean the paragraph above this ednote, Paul? The final paragraph below this note would need a lot more explanation if it was in the Manual. If you mean above, I'll see if I can work it in to the Manual if no-one else does - that page looks somewhat unfinished (still two P1's). <p>Can "Residue" still be roughly described as per the Development Manual's current description "the sound that is removed"? "Residue" probably means less to most users than the old (not working properly) "Isolate" did, so I guess few will use it. Should it have been called "Invert" or "Difference"?</p>  
+
** '''Gale 25Jan15:''' I assume you mean the paragraph above this ednote, Paul? The final paragraph below this note would need a lot more explanation if it was in the Manual. If you mean above, I'll see if I can work it in to the Manual if no-one else does - that page looks somewhat unfinished (still two P1's). <p>Can "Residue" still be roughly described as per the Development Manual's current description "the sound that is removed"? "Residue" probably means less to most users than the old (not working properly) "Isolate" did, so I guess few will use it. Should it have been called "Invert" or "Difference"?</p>
*'''Peter 26Jan15:''' [[ToDo-2]] I can see a case for moving/copying the whole of this section on Artifacts to the Manual.  The entry in the Manual would probably need to be toned down a bit in the technicalities, as Paul indicates, but the information here seems to be user-based stuff rather than just developer material - and this page is designed to be targeted at developers.
+
** '''Paul 27Jan15:''' I mean the previous paragraph.  "Residue" was Steve's preferred term.  One might want the difference of wet and dry signals (Residue), or to pass the noisy part of the signal full strength, ignoring the reduction, attack/release, and frequency smoothing (Isolate). Old Noise Removal's buggy Isolate didn't really do either.  The alpha version of Noise Reduction made both available.  I think "sound that is removed" [be careful not to say "noise"] describes not Isolate but Residue, well enough. Finally the suggestion for mixing below isn't from me. I never tried that. What I have written is still an incremental update of an original that I think was Dominic's.}}  
** '''Gale 26Jan15:''' Not many users will understand "noise gating sounds that are well separated (either in volume or frequency spectrum) from the background noise, or by mixing a small amount of the original noisy track back into the noise gated sound". If we want to recommend "Advanced noise reduction techniques" then those should be on Wiki with sufficient explanation, but not on this page. Abridged information on this page is still useful for developers. It seems fairly obvious to me that we should be linking to or including content from [[Proposal Noise Removal]] on this page.
 
}}  
 
  
 
The Frequency Smoothing slider does not affect the number of artifacts, but it can make each less evident by spreading the effects of discrimination errors among nearby frequency bands.
 

Revision as of 15:55, 27 January 2015

Peter 15Dec14: The new Noise Reduction effect is replacing Noise Removal in 2.1.0.

This page is for now and until 2.1.0 an orphan page in the Wiki linked to only from the Noise Reduction page in the Wiki.

ToDo-210 Once 2.1.0 is released this page can effectively replace How Noise Removal Works in the Wiki and that page will then only remain in the Legacy Wiki.

ToDo-1 This page needs some severe editing from someone who really understands how the effect works - all I have done so far is to copy the existing How Noise Removal Works Wiki page and make textual changes to replace "remove" with "reduce" throughout. So for all I know this could well be wildly inaccurate. We also have to ask ourselves the question: "Do we really need this page, or do we (or should we) have sufficiently adequate documentation in the Manual - or on the Wiki page Noise Reduction?"

  • Gale 19Dec14: There was previously perceived a need for a page like this that had greater technical depth than appropriate for the Manual, and I think the need is still there. Some of this content is still valid, the second question completely so. The content surely needs updating. Some worthwhile content is obviously in Proposal Noise Removal. If nothing else can be done about this page before 2.1.0 then we should link to that page I think.

    I think because of the slightly unfortunate rename of the effect, this page needs to state it is nothing to do with http://en.wikipedia.org/wiki/Dolby_noise-reduction_system, or should be moved to How Audacity Noise Reduction Works.

  • Peter 21Dec14: I never understood the rationale for this page. The content appears more suited to civilian users rather than developers (and they should really be catered for in the Manual's documentation as you observed elsewhere). The "artifacts" section could become a FAQ perhaps. I can probably see a need for a detailed page on the algorithm and implementation for future developers as it is obviously a fairly complicated effect (witnessed by the amount of email traffic as it was under development). Maybe we should be encouraging Paul Licameli to write the page for us?
    • Paul 27Jan15: I have addressed the problem of dated or incomplete content. I rewrote the page, retaining much of the old content that was still correct. Still debatable is the purpose of the page and whether some of this is worth replicating in the user manual.

About the Noise Reduction algorithm

Q: How do you actually reduce noise? What is the algorithm?

A: The noise reduction algorithm uses Fourier analysis: it finds the spectrum of pure tones that make up the background noise in the quiet sound segment that you selected - that's called the "frequency spectrum" of the sound. That forms a fingerprint of the static background noise in your sound file. When you reduce noise from the sound as a whole, the algorithm finds the frequency spectrum of each short segment of sound. Any pure tones that aren't sufficiently louder than their average levels in the fingerprint are reduced in volume. That way, (say) a guitar note or an overtone of the singer's voice is preserved, but hiss, hum, and other steady noises can be minimized. The general technique is called spectral noise gating.

The first pass of noise reduction is done over just noise. For each windowed segment of the sound, we take a Fast Fourier Transform (FFT) using a Hann window, and statistics, including the mean power, are tabulated for each frequency band.
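
As a rough illustration of this profiling pass, the sketch below averages the power in each band over Hann-windowed frames of a noise-only clip. It is not Audacity's code: the function names, the tiny window of 16 samples, the hop of 4 samples, and the naive DFT (standing in for an FFT of size 2048) are all assumptions made to keep the example short, and only the mean power is tabulated.

```python
import cmath
import math


def hann(n):
    """Periodic Hann window of length n."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]


def power_spectrum(frame):
    """Power in bands 0..n/2 of a real frame, using a slow but simple DFT."""
    n = len(frame)
    bands = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(frame))
        bands.append(abs(acc) ** 2)
    return bands


def noise_profile(noise, window_size=16, hop=4):
    """Mean power per band over all windowed frames of a noise-only clip."""
    win = hann(window_size)
    sums = [0.0] * (window_size // 2 + 1)
    count = 0
    for start in range(0, len(noise) - window_size + 1, hop):
        frame = [noise[start + i] * win[i] for i in range(window_size)]
        for k, p in enumerate(power_spectrum(frame)):
            sums[k] += p
        count += 1
    if count == 0:
        raise ValueError("noise clip is shorter than one window")
    return [s / count for s in sums]
```

Feeding it a steady tone that sits exactly on one band should give a profile that peaks at that band, with some leakage into the two neighbouring bands caused by the Hann window.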

During the noise reduction phase, those statistics and the Sensitivity setting determine a threshold for each frequency band. We start by setting a gain control for each frequency band: if the sound exceeds the threshold, the gain is set to 0 dB; otherwise the gain is lowered to the Noise Reduction slider setting (e.g. -18 dB), to suppress the noise.
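
The gating decision for one frame can be sketched as follows. How the real effect derives its threshold from the statistics and the Sensitivity slider is not spelled out here, so the sketch simply assumes the threshold is the mean noise power times a hypothetical `sensitivity_factor`; the function name and parameter names are likewise illustrative.

```python
def band_gains(frame_power, noise_mean_power,
               sensitivity_factor=4.0, reduction_db=-18.0):
    """Return one linear gain per band for a single frame.

    Assumption: threshold = mean noise power * sensitivity_factor.
    """
    # Convert the slider value in dB to a linear amplitude gain.
    floor_gain = 10 ** (reduction_db / 20)
    gains = []
    for power, mean_power in zip(frame_power, noise_mean_power):
        threshold = mean_power * sensitivity_factor
        # Above the threshold: keep the band (0 dB). Otherwise: suppress it.
        gains.append(1.0 if power > threshold else floor_gain)
    return gains
```

With the default of -18 dB, a suppressed band is multiplied by roughly 0.126 in amplitude.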

Then time-smoothing is applied (so that the gain for each frequency band moves slowly), followed by frequency-smoothing (so that a single frequency is never suppressed or boosted in isolation). Prior to the 2.1.0 release, the user could specify a single value that determined both "attack" and "decay" time-smoothing. The 2.1.0 code has provision for separate "attack" and "release" sliders but these are currently hidden, with hardcoded time-smoothing applied instead. Lookahead is employed in time-smoothing, so that if this effect were redesigned for real-time use there would be some delay.
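
One possible way to picture the two smoothing stages, in the order given above, is sketched here. The per-frame step factors, the smoothing radius, and the choice to average in the log domain are assumptions of this sketch rather than values taken from the Audacity source. The backward pass is where lookahead comes in: a frame's gain depends on frames that come after it.

```python
import math


def smooth_in_time(gains, attack_step=0.5, release_step=0.5):
    """gains[t][k] is the gain of band k in frame t. Returns a smoothed copy."""
    out = [row[:] for row in gains]
    n_frames = len(out)
    # Release: after a preserved frame, the gain may fall only by
    # release_step per frame.
    for t in range(1, n_frames):
        for k in range(len(out[t])):
            out[t][k] = max(out[t][k], out[t - 1][k] * release_step)
    # Attack: before a preserved frame, the gain ramps up in advance.
    # Walking backwards means each frame looks at the one after it.
    for t in range(n_frames - 2, -1, -1):
        for k in range(len(out[t])):
            out[t][k] = max(out[t][k], out[t + 1][k] * attack_step)
    return out


def smooth_in_frequency(row, radius=1):
    """Average each band's gain with its neighbours, in the log domain."""
    logs = [math.log(g) for g in row]
    out = []
    for k in range(len(row)):
        lo, hi = max(0, k - radius), min(len(row), k + radius + 1)
        out.append(math.exp(sum(logs[lo:hi]) / (hi - lo)))
    return out
```

In use, `smooth_in_time` would be run over the whole grid of gains first, and `smooth_in_frequency` then applied to each frame's row.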

The gain controls are then applied to the complex FFT of the signal and the inverse FFT is applied, followed by another Hann window. The output signal is then pieced together using overlap-add with a step of one-fourth the window size.
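
The resynthesis step can be sketched as below, again with a tiny window and a naive DFT in place of a real FFT. The function names and the `gain_for_band` callback are hypothetical. The division by 1.5 is a detail of this sketch: when a Hann window is applied both before and after the transform and frames advance by a quarter window, the squared windows sum to 1.5, so dividing by that value restores unity gain.

```python
import cmath
import math


def hann(n):
    """Periodic Hann window of length n."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]


def dft(x, sign=-1):
    """Naive DFT (sign=-1) or unscaled inverse DFT (sign=+1)."""
    n = len(x)
    return [sum(v * cmath.exp(sign * 2j * math.pi * k * i / n)
                for i, v in enumerate(x))
            for k in range(n)]


def reduce_frames(signal, gain_for_band, window_size=16):
    """Window, transform, scale each band, invert, window again, overlap-add."""
    hop = window_size // 4  # step of one-fourth the window size
    win = hann(window_size)
    out = [0.0] * len(signal)
    for start in range(0, len(signal) - window_size + 1, hop):
        frame = [signal[start + i] * win[i] for i in range(window_size)]
        spectrum = dft(frame)
        for k in range(window_size):
            # Bins k and window_size - k are mirror images of one band.
            band = min(k, window_size - k)
            spectrum[k] *= gain_for_band(band)
        restored = [v.real / window_size for v in dft(spectrum, sign=+1)]
        for i in range(window_size):
            # Second Hann window, then normalise the overlap sum of 1.5.
            out[start + i] += restored[i] * win[i] / 1.5
    return out
```

With a gain of 1.0 in every band, the output matches the input wherever four frames overlap (everywhere except the first and last partial windows), which is a convenient sanity check before plugging in real gains.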

  • Paul 24Jan15: Time smoothing alludes to the attack and release. It was lately decided to hide those controls at least for 2.1.0, but there is still some nonzero time smoothing hardcoded, so this is still worth mention. The prior description said frequency smoothing was applied before time smoothing, which was incorrect both for 2.0.6 and for the rewritten effect.
    • Gale 25Jan15: I included the mention in the text above. Please correct it if it is wrong.
    • Paul 27Jan15: It's good. I never agreed with hiding attack and release (release may be the more useful) but I got overruled. Longer release might give better results when the sound has reverb or percussive notes with decaying tails.

Q: How many frequency bands does the noise gate use?

A: In Audacity we use an FFT size of 2048, which results in 1025 frequency bands.
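
The band count follows from the FFT size: a real signal transformed with size N has N/2 + 1 distinct bands, from 0 Hz up to half the sample rate. A small sketch of the arithmetic follows; the 44100 Hz sample rate in the usage note is an assumption used only for illustration.

```python
def band_count(fft_size):
    """Number of distinct frequency bands for a real-input FFT."""
    return fft_size // 2 + 1


def band_spacing_hz(sample_rate, fft_size):
    """Distance in Hz between the centres of adjacent bands."""
    return sample_rate / fft_size
```

For example, `band_count(2048)` gives 1025, and at a sample rate of 44100 Hz the bands would be about 21.5 Hz apart.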


Artifacts

Q: What causes the 'tinkling' artifacts, and what steps can and have been taken to eliminate them?

A: The tinkly artifacts happen when individual pure tones are near the threshold to be preserved -- they are small pieces of the background soundscape that survived the thresholding, perhaps because the background noise is slightly different from the fingerprint or because the main sound has overtones that are imperceptible but that boost them slightly over the threshold.

Any Fourier-based noise reduction algorithm will have some artifacts like the "tinkle-bells". They are a symptom of the problem of discrimination - deciding whether a particular analog signal is above or below a decision threshold - that is central to the fields of digital data processing and information theory. In general the tinkle-bell artifacts are quieter than the original noise. The real question is whether they are more noticeable than the original noise. (For example, noise-gating the Beatles' Sun King track off the Abbey Road album is a bad idea, because the soft brushed cymbal sounds merge smoothly into the tape hiss on the original master recording, so tinkle bells and a related problem -- fluttering -- are prominent in noise-gated versions of that track.)

The Sensitivity slider biases the thresholds of all frequency bands. Higher settings will thus reduce the number of artifacts, but at the risk of introducing the opposite discrimination error, in which parts of the desired signal are misclassified as noise and so reduced. The purpose of the Residue radio button is to pass the difference between the original sound and what would result from choosing Reduce. When the Sensitivity is excessive, "tinkle-bells" will be heard in Residue rather than in Reduce, and where the original had louder sounds, rather than in the pauses between sounds. By previewing the results of Residue before applying the effect, the best Sensitivity balance can be found.

  • Paul 24Jan15: I almost want this paragraph in the Manual rather than here. Just remove the talk of "discrimination."
    • Gale 25Jan15: I assume you mean the paragraph above this ednote, Paul? The final paragraph below this note would need a lot more explanation if it was in the Manual. If you mean above, I'll see if I can work it in to the Manual if no-one else does - that page looks somewhat unfinished (still two P1's).

      Can "Residue" still be roughly described as per the Development Manual's current description "the sound that is removed"? "Residue" probably means less to most users than the old (not working properly) "Isolate" did, so I guess few will use it. Should it have been called "Invert" or "Difference"?

    • Paul 27Jan15: I mean the previous paragraph. "Residue" was Steve's preferred term. One might want the difference of wet and dry signals (Residue), or to pass the noisy part of the signal full strength, ignoring the reduction, attack/release, and frequency smoothing (Isolate). Old Noise Removal's buggy Isolate didn't really do either. The alpha version of Noise Reduction made both available. I think "sound that is removed" [be careful not to say "noise"] describes not Isolate but Residue, well enough. Finally the suggestion for mixing below isn't from me. I never tried that. What I have written is still an incremental update of an original that I think was Dominic's.

The Frequency Smoothing slider does not affect the number of artifacts, but it can make each less evident by spreading the effects of discrimination errors among nearby frequency bands.

You can reduce the effect of tinkle bells by noise gating sounds that are well separated (either in volume or frequency spectrum) from the background noise, or by mixing a small amount of the original noisy track back into the noise-gated sound. Then the muted background noise tends to mask the tinkle bells. That technique works well for (e.g.) noisy microcassette recordings, where the noise floor might only be 20 dB below the loudest sounds on the tape. You can get about 10 dB of noise reduction that way, without excessive tinkly artifacts.
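
Both the Residue output described earlier and this mixing trick come down to sample-by-sample arithmetic on the original and the noise-reduced tracks. A minimal sketch follows; the function names are hypothetical and the -20 dB mix level is only an assumed starting point, since the text does not state how much of the original to mix back.

```python
def residue(original, reduced):
    """The sound that was removed: original minus noise-reduced."""
    return [o - r for o, r in zip(original, reduced)]


def mix_back(original, reduced, dry_db=-20.0):
    """Add a small amount of the original track to the noise-reduced one."""
    dry = 10 ** (dry_db / 20)  # dB to linear amplitude
    return [r + dry * o for o, r in zip(original, reduced)]
```

In practice the mix level would be tuned by ear until the remaining background noise just masks the artifacts.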