Reference-Based Recursive Least-Squares Mitigation of Real Interference in Stereo Audio Recordings
Pith reviewed 2026-06-26 20:14 UTC · model grok-4.3
The pith
Real train interference in stereo audio can be substantially attenuated using a correlated reference recording and multi-reference recursive least-squares estimation.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
With 30 taps per reference channel, 15 anti-causal taps, and a forgetting factor of 0.999, processing three 74-second real sequences reduces the maximum reference correlation from 0.386-0.832 to 0.011-0.016. This corresponds to a correlation-ratio reduction of 30.6-34.1 dB and RMS decreases of 1.8-4.8 dB. The paper takes this as evidence that real train interference, including environmental acoustic effects, can be substantially attenuated when a correlated reference recording is available.
What carries the argument
Multi-reference recursive least-squares (RLS) estimator that uses the second stereo recording to model and subtract the unknown propagation paths of the train noise.
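A minimal sketch of such an estimator for one primary channel, using the textbook exponentially weighted RLS recursion. The function name `multi_ref_rls`, the initialization constant `delta`, and the tap layout (30 causal plus 15 anti-causal lags per reference channel, following the referee's 45-tap reading) are assumptions for illustration, not the paper's code.

```python
import numpy as np

def multi_ref_rls(primary, refs, n_causal=30, n_anti=15, lam=0.999, delta=100.0):
    """Subtract from `primary` the part that is linearly predictable from `refs`.

    refs has shape (n_samples, n_refs). Each reference channel contributes
    n_causal + n_anti coefficients covering refs[t - n_causal + 1 .. t + n_anti].
    Returns the a priori error (the cleaned signal) and the final weights.
    """
    primary = np.asarray(primary, dtype=float)
    refs = np.asarray(refs, dtype=float)
    if refs.ndim == 1:
        refs = refs[:, None]
    n, n_refs = refs.shape
    span = n_causal + n_anti
    # Zero-pad so every lag window is defined at the edges.
    padded = np.vstack([np.zeros((n_causal - 1, n_refs)), refs,
                        np.zeros((n_anti, n_refs))])
    w = np.zeros(span * n_refs)
    P = delta * np.eye(span * n_refs)    # inverse-correlation estimate (assumed init)
    out = np.empty(n)
    for t in range(n):
        u = padded[t:t + span].ravel()   # stacked regressor over all references
        e = primary[t] - w @ u           # a priori error = cleaned sample
        Pu = P @ u
        k = Pu / (lam + u @ Pu)          # gain vector
        w = w + k * e
        P = (P - np.outer(k, Pu)) / lam
        out[t] = e
    return out, w
```

Because the anti-causal taps read future reference samples, this form suits offline processing or a look-ahead of `n_anti` samples. A stereo primary would run the function once per channel.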
If this is right
- The estimated interference component is subtracted from the noisy stereo audio.
- A finite-impulse-response low-pass postfilter is applied after subtraction.
- Reference correlation drops to 0.011-0.016 after processing.
- Output RMS level decreases by 1.8-4.8 dB depending on section and channel.
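The postfilter step in the list above can be sketched as a windowed-sinc FIR low-pass. The page does not give the filter length, cutoff, or window, so the 101 taps, 2 kHz cutoff, and Hamming window below are purely hypothetical placeholders, as are the function names.

```python
import numpy as np

def fir_lowpass(num_taps, cutoff_hz, fs_hz):
    """Windowed-sinc low-pass design (Hamming window), normalized to unit DC gain."""
    n = np.arange(num_taps) - (num_taps - 1) / 2.0
    fc = cutoff_hz / fs_hz                      # cutoff as a fraction of the sample rate
    h = 2.0 * fc * np.sinc(2.0 * fc * n)        # ideal low-pass impulse response
    h *= np.hamming(num_taps)                   # taper to control stopband ripple
    return h / np.sum(h)

def postfilter(x, h):
    """Apply the FIR filter; mode='same' keeps the length and centers an odd-length h."""
    return np.convolve(x, h, mode="same")

# Hypothetical settings: only the 11.025 kHz sample rate comes from the abstract.
h = fir_lowpass(num_taps=101, cutoff_hz=2000.0, fs_hz=11025.0)
```

In the pipeline described above, `postfilter` would be applied to each channel after the RLS estimate has been subtracted.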
Where Pith is reading between the lines
- This technique could extend to canceling other correlated environmental noises in field recordings if a suitable reference channel exists.
- Real-time implementations might benefit from the forgetting factor of 0.999 for tracking slowly varying noise.
- Testing on additional noise types would reveal how strongly the reference must correlate with the primary recording for cancellation to be effective.
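To put the forgetting factor in context, a common rule of thumb gives exponentially weighted RLS an effective memory of about 1/(1 - λ) samples. The small helper below (a hypothetical name, not from the paper) converts that to time at the stated sample rate.

```python
def rls_memory(lam, fs_hz):
    """Return the effective RLS memory in samples and in seconds (rule of thumb)."""
    n_eff = 1.0 / (1.0 - lam)
    return n_eff, n_eff / fs_hz

n_eff, seconds = rls_memory(0.999, 11025.0)
# About 1000 samples, i.e. roughly 0.09 s at 11.025 kHz.
```

On this reading, the estimator tracks changes in the noise path on a timescale of roughly a tenth of a second, short compared with the 74 s sequences.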
Load-bearing premise
The second stereo recording must be a sufficiently correlated filtered observation of the identical physical noise source that reaches the primary microphones.
What would settle it
If processing leaves residual normalized correlation with the reference above 0.05 or fails to reduce it by at least 20 dB, the claim of substantial attenuation would not hold.
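The two thresholds can be written as a small check. The page does not define the correlation-ratio reduction; the sketch assumes it is 20·log10 of the before/after correlation ratio, and pairing the endpoints of the reported ranges is likewise an assumption. The function names are illustrative.

```python
import math

def corr_ratio_db(corr_before, corr_after):
    """Correlation drop in dB, assuming an amplitude-style 20*log10 ratio."""
    return 20.0 * math.log10(corr_before / corr_after)

def claim_holds(corr_before, corr_after, max_residual=0.05, min_drop_db=20.0):
    """Apply the two thresholds stated above: residual <= 0.05 and drop >= 20 dB."""
    drop = corr_ratio_db(corr_before, corr_after)
    return corr_after <= max_residual and drop >= min_drop_db

low_db = corr_ratio_db(0.386, 0.011)    # roughly 31 dB
high_db = corr_ratio_db(0.832, 0.016)   # roughly 34 dB
```

Under these assumptions the endpoint pairs land close to the reported 30.6-34.1 dB range, which suggests, but does not confirm, that the paper uses a similar definition.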
Original abstract
Reference-based adaptive interference cancellation is evaluated for stereo audio recordings corrupted by real train noise and environmental background. The observed signal is modeled as a clean stereo program contaminated by an additive disturbance generated by an external acoustic source through unknown propagation paths. A second stereo recording, representing another filtered observation of the same physical noise source, is used as the reference input of a multi-reference recursive least-squares (RLS) estimator. The estimated train-interference component is subtracted from the noisy audio and followed by a finite-impulse-response low-pass postfilter. Three 74.01 s real audio sequences sampled at 11.025 kHz are processed under identical algorithmic parameters. Since clean ground truth is not available, performance is assessed with no-reference indicators: waveform behavior, Welch spectral estimates, RMS change, and residual normalized correlation with the reference. With 30 taps per reference channel, 15 anti-causal taps, and forgetting factor 0.999, the maximum reference correlation is reduced from 0.386–0.832 before processing to 0.011–0.016 after processing. The corresponding correlation-ratio reduction is approximately 30.6–34.1 dB, while the output RMS decreases by 1.8–4.8 dB depending on section and stereo channel. The results demonstrate that real train interference, including environmental acoustic effects, can be substantially attenuated when a correlated reference recording is available.
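Two of the no-reference indicators named in the abstract can be sketched as follows. The abstract does not specify the lag range, normalization, or averaging used, so `max_lag` and the global energy normalization below are assumptions, and the function names are illustrative.

```python
import numpy as np

def max_norm_corr(x, ref, max_lag=64):
    """Largest |normalized cross-correlation| between x and ref over +/- max_lag samples.

    Assumes x and ref have equal length.
    """
    x = np.asarray(x, dtype=float) - np.mean(x)
    ref = np.asarray(ref, dtype=float) - np.mean(ref)
    denom = np.sqrt(np.sum(x ** 2) * np.sum(ref ** 2))
    best = 0.0
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            c = np.dot(x[lag:], ref[:len(ref) - lag])
        else:
            c = np.dot(x[:lag], ref[-lag:])
        best = max(best, abs(c) / denom)
    return best

def rms_change_db(before, after):
    """RMS level change in dB; negative values mean the output is quieter."""
    rms = lambda s: np.sqrt(np.mean(np.square(s)))
    return 20.0 * np.log10(rms(after) / rms(before))
```

Evaluating `max_norm_corr` on the primary channel before and after cancellation, against the same reference channel, gives the kind of before/after pair the abstract reports.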
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims that a multi-reference recursive least-squares (RLS) adaptive filter, using a second stereo recording as reference input, can substantially attenuate real train noise (including environmental acoustic effects) in primary stereo audio recordings. The observed signal is modeled as clean program plus additive disturbance through unknown paths; the RLS estimator (30 taps per reference channel + 15 anti-causal taps, forgetting factor 0.999) subtracts the estimated interference, followed by an FIR low-pass postfilter. On three 74.01 s real recordings sampled at 11.025 kHz, no-reference metrics show reference correlation reduced from 0.386–0.832 to 0.011–0.016 (≈30.6–34.1 dB) and RMS reduced by 1.8–4.8 dB.
Significance. If the central empirical claim holds, the work demonstrates practical viability of reference-based RLS cancellation for real acoustic interference where ground-truth clean signals are unavailable. Credit is due for processing actual field recordings rather than simulations and for consistent use of no-reference metrics (correlation, RMS, Welch spectra) that directly measure residual disturbance.
major comments (1)
- [Abstract and method description] The claim that the approach models 'unknown propagation paths' including 'environmental acoustic effects' is undercut by the chosen FIR support. With 30 taps per channel + 15 anti-causal taps at 11.025 kHz the total support is only ~4 ms; real reverberant paths from a distant train routinely exceed this length. The reported 30 dB correlation drop is therefore consistent with cancellation of only the short-time correlated component, weakening the assertion of comprehensive attenuation of the full disturbance.
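The ~4 ms figure follows directly from the tap count and sample rate. The short calculation below also converts the span to an acoustic path length, assuming a nominal speed of sound of 343 m/s; that conversion is an illustration added here, not a number from the paper or the report.

```python
def support_ms(n_taps, fs_hz):
    """Time span covered by n_taps consecutive samples, in milliseconds."""
    return 1e3 * n_taps / fs_hz

span_ms = support_ms(30 + 15, 11025.0)      # about 4.08 ms
path_m = 343.0 * span_ms / 1e3              # about 1.4 m of acoustic path difference
```

A filter of this length can therefore only represent differences between reference and primary paths of about 1.4 m, which is consistent with the referee's point that longer reverberant paths fall outside the modeled support.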
Simulated Author's Rebuttal
We thank the referee for the constructive comment regarding filter support and its relation to the modeling of propagation paths. We respond to the major comment below.
- Referee: [Abstract and method description] The claim that the approach models 'unknown propagation paths' including 'environmental acoustic effects' is undercut by the chosen FIR support. With 30 taps per channel + 15 anti-causal taps at 11.025 kHz the total support is only ~4 ms; real reverberant paths from a distant train routinely exceed this length. The reported 30 dB correlation drop is therefore consistent with cancellation of only the short-time correlated component, weakening the assertion of comprehensive attenuation of the full disturbance.
  Authors: The referee correctly notes that 45 taps at 11.025 kHz span only ~4 ms. The multi-reference RLS estimator adapts to the effective linear mapping between reference and primary channels over this finite support, which includes the direct path and early reflections responsible for the observed short-time correlation. The 30.6–34.1 dB reduction in normalized correlation demonstrates that these correlated components, which incorporate environmental acoustic effects within the modeled length, are substantially attenuated. We do not claim to cancel infinite-length reverberation tails beyond the filter support. To avoid any overstatement, we will revise the abstract and method sections to clarify that the method targets the short-time correlated portion of the disturbance.
  Revision: yes
Circularity Check
No circularity: direct empirical measurements on real recordings
The paper applies a standard multi-reference RLS algorithm (with fixed parameters: 30 taps, 15 anti-causal taps, forgetting factor 0.999) to three real 74 s stereo recordings at 11.025 kHz and reports measured no-reference metrics (residual correlation, RMS change, spectral estimates) on the processed outputs. The central claim follows directly from these waveform-level observations rather than from any fitted parameter being renamed as a prediction or from a self-citation chain. No derivation step reduces to its own inputs by construction; the work is self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
free parameters (3)
- number of taps per reference channel
- number of anti-causal taps
- forgetting factor
axioms (1)
- domain assumption: The observed signal is a linear combination of the clean program and an additive disturbance that can be modeled as a filtered version of the reference recording.
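Written out, the axiom and the three free parameters amount to the following model. All symbols are introduced here for illustration and are not the paper's notation: x_c is a primary channel, s_c the clean program, d_c the disturbance, r_m the reference channels, and w_{c,m}[k] the filter coefficients. The lag range assumes the referee's reading of 30 causal plus 15 anti-causal taps per reference channel.

```latex
\begin{aligned}
x_c[n] &= s_c[n] + d_c[n], \qquad c \in \{\mathrm{L}, \mathrm{R}\},\\
\hat{d}_c[n] &= \sum_{m=1}^{2} \sum_{k=-15}^{29} w_{c,m}[k]\, r_m[n-k],\\
y_c[n] &= x_c[n] - \hat{d}_c[n],\\
J_c[n] &= \sum_{i=0}^{n} \lambda^{\,n-i}\, \bigl| x_c[i] - \hat{d}_c[i] \bigr|^2, \qquad \lambda = 0.999 .
\end{aligned}
```

Here J_c[n] is the standard exponentially weighted least-squares cost that RLS minimizes at each time n, with the coefficients inside \hat{d}_c[i] held at their time-n values.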