Several times in recent years, people have handed over a "not so good" set or field recording (original dialogue, atmospheres, …) with a lot of noise or hum, together with a separate recording of just that noise or hum, made a little later. The set engineers (yes, plural) then tried to lecture the sound designers or mixing engineers on how to get rid of the unwanted part of the sound: you only need to play the noise recording along with the original, phase-reversed, and it will cancel out the noise you want to get rid of – leaving a clean recording, clean dialogue, clean atmosphere, or whatever…
Some say they learned that from other professionals, and some even said they learned it in school or in college / audio-engineering school. It's easy… if you know the math & physics & theory 😉
Well – wait a minute…
Let me explain why it's not so easy, why it will not work, and why noise cancelling is a bit different – or, to be more exact, why one thing works and the other doesn't – and what some may have missed in their thoughts about the math and physics and theory.
When you have waveforms, no matter how they look or sound, you can add them, and the result is simply that the sound gets a bit louder (1) – it's called mixing. It's just adding sounds, or, digitally speaking, adding the values of waveforms at each point in time. 1+1=2, but it also works that 1+(-1)=0. Two values that are exact opposites (numbers in the digital domain, voltages in the analog domain) sum to "0". And you get the exact opposite value when one signal is phase-reversed and the two signals are mixed.
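The addition and cancellation described above can be sketched in a few lines of numpy. The 200 Hz sine and the sample rate are invented stand-ins for any recorded waveform:

```python
import numpy as np

sr = 48000                      # sample rate in Hz (assumed)
t = np.arange(sr) / sr          # one second of sample times
signal = np.sin(2 * np.pi * 200 * t)   # stand-in for any recording

# Mixing is just sample-by-sample addition: the result is louder.
louder = signal + signal        # every sample doubles: 1 + 1 = 2

# Phase reversal flips the sign of every sample.
inverted = -signal

# Mixing the original with its exact phase-reversed copy sums to silence.
silence = signal + inverted     # every sample: 1 + (-1) = 0

print(np.max(np.abs(louder)))   # 2.0
print(np.max(np.abs(silence)))  # 0.0
```

Note that the cancellation is perfect only because both copies are sample-identical; that is exactly the condition the rest of this article is about.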
These two mono signals [fig. 2] (mono just to make it clearer) show the waveform of a refrigerator. The blue one is the original; the red one is just phase-reversed. Mixed at equal levels, these two signals cancel each other out – completely – you will not hear anything.
This [fig. 3] is the waveform of the same fridge, just a bit longer (zoomed out, one might say). What you can probably see in this and the screenshot above is that it does not seem to repeat at all. Even though the sound of a fridge (that's why I chose it) seems pretty steady and boring, if you look closely enough at the details, it never repeats exactly. It's like waves at the beach – every single one is a little bit different. So how can you find an exact match that would cancel the two out? Well – you can't!
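A quick numpy sketch makes the point: two separate "takes" of the same steady-sounding noise source are statistically alike but never sample-identical, and mixing one against the phase-reversed other cancels nothing. Here white noise is an invented stand-in for the fridge:

```python
import numpy as np

rng = np.random.default_rng(0)
sr = 48000

# Two separate "recordings" of the same steady noise source:
# same character, but the fine detail never repeats.
take_1 = rng.normal(0, 0.1, sr)
take_2 = rng.normal(0, 0.1, sr)

def rms(x):
    return np.sqrt(np.mean(x ** 2))

# Mixing take_1 with the phase-reversed take_2 does NOT cancel anything.
# Uncorrelated signals add in power, so the mix is ~3 dB LOUDER, not quieter.
mix = take_1 - take_2
print(rms(take_1))   # ≈ 0.1
print(rms(mix))      # ≈ 0.14 (sqrt(2) times louder)
```

So the later noise recording does not subtract the noise; on average it adds about 3 dB of extra noise on top.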
In that refrigerator sound there are also some very typical frequency ranges – some call it humming (many refrigerators have that) – that seem to be steady and are a bit louder than the noise around them. Maybe we could at least find a way to cancel that part out?
This is the spectrum of that sound [fig. 4]. One can clearly see a louder area at about 200 Hz and another two louder parts at about 1300 Hz and 1500 Hz (2).
Here [fig. 5] you can see four FFT analyses at four different positions within one second of the sound. If you look closely, you will see that every one is a bit different – just like the waves at a beach I mentioned before.
So with the exact frequency we might be able to cancel some areas out – but for that we also need the exact phase of that wave, and we have to hope that the areas we want to cancel do not change and stay in phase. These 3D FFTs [fig. 6] clearly show that this is not the case: the level in every area keeps changing or fluctuating all the time, on an irregular, random and unpredictable basis.
The 3D FFT analysis shows that the level of basically every aspect of this short recording is changing all the time, at least by small amounts.
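The fluctuation seen in fig. 5 and fig. 6 is easy to reproduce: take several FFT frames at different positions inside one second of a steady-sounding noise and compare them bin by bin. The frame length and positions below are arbitrary choices, and white noise again stands in for the fridge:

```python
import numpy as np

rng = np.random.default_rng(1)
sr = 48000
noise = rng.normal(0, 0.1, sr)     # one second of stand-in "fridge" noise

# Four FFT frames taken at four positions within that second (like fig. 5).
frame_len = 4096
spectra = []
for start in (0, 12000, 24000, 36000):
    frame = noise[start:start + frame_len] * np.hanning(frame_len)
    spectra.append(np.abs(np.fft.rfft(frame)))

# Compare the first frame against the others, bin by bin.
for i, spec in enumerate(spectra[1:], start=2):
    diff = np.mean(np.abs(spec - spectra[0])) / np.mean(spectra[0])
    print(f"frame 1 vs frame {i}: mean bin deviation ≈ {diff:.0%}")
```

Even though the source "sounds" completely steady, every frame's spectrum deviates substantially from every other – which is exactly why a fixed cancellation signal cannot track it.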
So maybe we will not get it perfect and cancel the unwanted parts of the sound out completely – but could we at least approximate it and make it somewhat better than it is?
No – it doesn't work at all; it actually gets worse. Even if you find the right frequencies and try to get the phase right for cancellation, all you get in the best case is cancellation for very short moments (fractions of a second), and at all other times you get a phasing/flanging that sounds worse than the unaltered, unprocessed sound.
Let's leave that for now…
But how, or why, does noise cancelling work at all?
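That "worse than before" behavior can be demonstrated with a tiny mismatch. Below, the cancellation signal is phase-reversed but detuned by just 1 Hz (200 Hz vs. 201 Hz – invented values): the sum sweeps between near-silence and double the original level, which is the audible beating/flanging:

```python
import numpy as np

sr = 48000
t = np.arange(sr) / sr

# The "hum" we want to cancel, and a cancellation attempt that is
# phase-reversed but off by only 1 Hz.
hum = np.sin(2 * np.pi * 200 * t)
attempt = -np.sin(2 * np.pi * 201 * t)

mix = hum + attempt

# Peak level in short windows: the sum beats once per second,
# sweeping from near-silence up to DOUBLE the original level.
win = sr // 50   # 20 ms windows
env = np.array([np.max(np.abs(mix[i:i + win]))
                for i in range(0, sr - win, win)])
print(env.min())   # small: brief near-cancellation
print(env.max())   # close to 2: twice as loud as the hum alone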
Here is the main reason why and how it can work: with a headphone you have to consider TWO independent and separate main sources of audio – not ONE!
First, you have the audio you want to hear, produced/played back by the headphone; second, you have the unwanted sound from the outside. These TWO sounds are separately available and can be analyzed and/or processed separately. The second sound – usually the noise you want to cancel out – can be picked up by a microphone in real time and phase-inverted fast enough for the cancellation process. In other words: it's the exact same sound, the exact same waveform you want to cancel out – not a later recording with random differences that make it unusable for this application. A sound/noise from your surroundings gets reproduced and played back phase-reversed, and thereby cancels out the noise – to put it simply.
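The two-source principle can be sketched as an idealized feedforward model (assumptions: a pure 200 Hz hum as the outside noise, a perfect reference microphone, and zero processing latency – none of which a real headphone has):

```python
import numpy as np

sr = 48000
t = np.arange(sr) / sr

# Source 1: what you want to hear, played by the headphone driver.
music = 0.3 * np.sin(2 * np.pi * 440 * t)

# Source 2: the outside noise as it arrives at the ear (assumed pure hum).
outside_noise = np.sin(2 * np.pi * 200 * t)

# The reference mic picks up the SAME waveform in real time (idealized).
mic_pickup = outside_noise

# The driver plays the phase-reversed pickup on top of the music.
at_the_ear = music + outside_noise + (-mic_pickup)

# Because both copies of the noise are the SAME signal, they cancel exactly.
print(np.allclose(at_the_ear, music))   # True
```

The crucial difference to the set-recording scenario above is the line `mic_pickup = outside_noise`: the anti-noise is derived from the very waveform it has to cancel, not from a similar-sounding recording made later.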
This is the principle of how and why it works – but there are a lot of details to consider and implement. For example, because of the travelling waves, the noise cancelling will only be of good quality at one very small place/location – and this location should be inside your ear. But ears differ, and the quite small microphone in the headphones (which brings the outside sound to the "cancellation processor") cannot be a near-perfect studio mic – yet this mic is the starting point of the whole process… That's why there are different qualities of noise cancelling, different prices, and companies with different secrets – they wouldn't tell you (or me) many details…
There are many other interesting applications for the cancellation of waveforms (it works not only with sound waves but also with light waves…), for example comparing recordings. You could compare, or listen to, compression artefacts by taking the WAV file and, say, the MP3 (or another data-compression format), phase-reversing one of them, and then adding/mixing the files at exactly the same level. Most of the sound will be cancelled out, and what you hear is the part the data-compressed file is missing.
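This "null test" is easy to sketch. As a crude model (real MP3 artefacts are far more complex), assume the codec simply dropped a 15 kHz partial from a two-tone signal – the residual of the null test is then exactly that missing partial:

```python
import numpy as np

sr = 48000
t = np.arange(sr) / sr

# Stand-in for the original WAV: two tones.
original = np.sin(2 * np.pi * 440 * t) + 0.2 * np.sin(2 * np.pi * 15000 * t)

# Stand-in for the lossy version: assume the codec removed the 15 kHz tone.
lossy = np.sin(2 * np.pi * 440 * t)

# Null test: phase-reverse one file and mix at exactly equal level.
residual = original + (-lossy)

# What survives is exactly what the "codec" removed: the 15 kHz tone.
spectrum = np.abs(np.fft.rfft(residual))
peak_hz = np.argmax(spectrum) * sr / len(residual)
print(peak_hz)   # 15000.0
```

Note that the test only works because the two files are sample-aligned and level-matched – the same strict condition that made the set-noise trick fail.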
Here is a little animation: green + blue = red
(1) I am careful here about the differences between perceived loudness (volume), objectively measured sound pressure (voltage) and theoretically calculated sound intensity (acoustic power). I am also not mentioning that the adding process can lead to distorted signals when the numbers/voltages get too big – but that's another story, about calculation limits and bit depth…
(2) x is the time axis; y is the frequency (you can see the values in the middle column at the right); brightness indicates level (you can see the level values in the right column at the right). The blue part in the middle of the graphic is just the waveform, for orientation. The screenshot was made with iZotope RX-7 Advanced, as were the pictures above.
source fig. 1: https://de.wikipedia.org/wiki/Interferenz_(Physik)