Reverberation time (RT) is a key parameter for characterizing how room acoustics affect speech intelligibility and music perception. Typically, the RT is derived from an energy decay curve (EDC) obtained by Schroeder integration of a room impulse response (RIR). Since ambient and device noise are unavoidable in experimental RIR measurements in situ, the EDC includes not only the reverberant decay of the RIR but also a noise-induced component. This noise leaves the EDC with a dynamic range that is insufficient for a reliable determination of the RT according to ISO 3382-1.
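The effect described above can be illustrated with a minimal sketch of Schroeder backward integration followed by a straight-line fit to the EDC. The synthetic RIR (exponentially decaying Gaussian noise), the sampling rate, the noise amplitude, the -5 dB to -35 dB fitting range, and the function names `schroeder_edc_db` and `rt_from_edc` are all assumptions made for illustration; they are not taken from the measurements in this work.

```python
import numpy as np

def schroeder_edc_db(rir):
    """Energy decay curve in dB via backward (Schroeder) integration."""
    energy = np.asarray(rir, dtype=float) ** 2
    edc = np.cumsum(energy[::-1])[::-1]      # integrate from each sample to the end
    return 10.0 * np.log10(edc / edc[0])     # normalise to 0 dB at t = 0

def rt_from_edc(edc_db, fs, upper_db=-5.0, lower_db=-35.0):
    """Extrapolate a line fitted between two decay levels to a 60 dB decay."""
    t = np.arange(len(edc_db)) / fs
    mask = (edc_db <= upper_db) & (edc_db >= lower_db)
    slope, _ = np.polyfit(t[mask], edc_db[mask], 1)   # slope in dB per second
    return -60.0 / slope

# Synthetic example (assumed values): RT of 0.5 s, stationary noise floor.
fs = 8000
t = np.arange(0, 2.0, 1 / fs)
rt_true = 0.5
envelope = np.exp(-3.0 * np.log(10.0) / rt_true * t)  # amplitude envelope, 60 dB in rt_true
rng = np.random.default_rng(0)
rir_clean = envelope * rng.standard_normal(len(t))
rir_noisy = rir_clean + 0.01 * rng.standard_normal(len(t))

rt_clean = rt_from_edc(schroeder_edc_db(rir_clean), fs)
rt_noisy = rt_from_edc(schroeder_edc_db(rir_noisy), fs)
print(f"RT without noise: {rt_clean:.2f} s, RT with noise: {rt_noisy:.2f} s")
```

In this toy case the integrated noise flattens the lower part of the EDC, so the fitted slope becomes shallower and the RT is overestimated.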
Techniques based on the noise compensation method (the subtraction-truncation-correction method) reduce the effect of noise on the EDC and extend its dynamic range up to a truncation time (TT), defined as the intersection of the main decay slope and the noise level. However, the performance of these methods differs significantly, which leads to a discrepancy between the determined TT and its optimal value and hence to an error in the determined RT. Fluctuations and characteristic features of the measured RIR, such as high initial peaks and early reflections, cause errors in defining the estimation range (ER) of the RIR used to calculate the decay slope and the noise level.
In this paper, the noise compensation procedure is combined with a nonlinear regression model, consisting of a multi-slope decay term and a noise term, to reduce the influence of RIR irregularities and noise uncertainties on the TT determination. The TT can be estimated easily, since the decay slope, the noise level, and the start level of the ER are parameters of the model itself. The multi-slope decay term is introduced to decrease the error in estimating the start level of the ER caused by high initial peaks and early reflections in the RIR. The differences in decay level at the TT between the EDC of the integrated model and the EDC of the integrated measured RIR are estimated, and these values are used to redefine the end level of the ER. The optimal TT is detected when the differences reach zero or meet an acceptable threshold. The model parameters are updated iteratively until the procedure converges to a minimum difference between the EDC of the integrated model and that of the measured data, under a least-squares (LS) optimization starting from the initial parameter estimates.
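To make the relation between the model parameters and the TT concrete, the following is a simplified sketch, not the procedure of this paper. It assumes a single-slope model for the energy envelope, E(t) = A exp(-k t) + N, instead of the multi-slope decay term, and it replaces the iterative LS optimization with a grid scan over candidate decay rates, solving for A and N linearly at each step. The function names, the grid, and the idealised noise-free test envelope are hypothetical.

```python
import numpy as np

def fit_decay_plus_noise(t, energy, rt_grid):
    """Fit energy(t) ~ A*exp(-k*t) + N: scan k, solve A and N by linear LS."""
    best = None
    for rt in rt_grid:
        k = 6.0 * np.log(10.0) / rt          # energy decay rate for 60 dB in rt seconds
        design = np.column_stack([np.exp(-k * t), np.ones_like(t)])
        coef, *_ = np.linalg.lstsq(design, energy, rcond=None)
        resid = np.sum((design @ coef - energy) ** 2)
        if best is None or resid < best[0]:
            best = (resid, k, coef[0], coef[1])
    _, k, amp, noise = best
    return k, amp, noise

def truncation_time(k, amp, noise):
    """Time at which the decay term equals the noise level: A*exp(-k*TT) = N."""
    return np.log(amp / noise) / k

# Idealised energy envelope (assumed values): RT = 0.5 s, noise 40 dB below the start level.
fs = 1000
t = np.arange(0, 2.0, 1 / fs)
k_true = 6.0 * np.log(10.0) / 0.5
energy = 1.0 * np.exp(-k_true * t) + 1e-4

k_est, amp_est, noise_est = fit_decay_plus_noise(t, energy, np.linspace(0.1, 3.0, 291))
tt_est = truncation_time(k_est, amp_est, noise_est)
print(f"estimated TT: {tt_est:.3f} s")       # 40 dB at 120 dB/s corresponds to about 0.333 s
```

Because the decay rate, the amplitude, and the noise level are all parameters of the fitted model, the TT follows in closed form and no separate estimate of the crossing point is needed.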
The model parameters are then applied in the noise compensation function to extend the dynamic range of the EDC to the detected TT. In this procedure, the detected TT and the decay slope are first used to calculate the correction term. A truncated EDC is then generated by truncating the RIR at the TT, subtracting the noise level, and applying the correction term before backward integration. The analysis of the RTs and of the deviation of the decay slope at the TT shows that the procedure improves the TT detection performance compared to the procedure from the literature.
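The truncation, subtraction, and correction steps can be sketched as follows. This is an illustration under stated assumptions: the decay rate `k`, amplitude `amp`, noise level `noise`, and truncation time `tt` are taken as already known (here set to the true values of an idealised envelope), the decay is single-slope, and the correction term is the energy of the modelled decay beyond the TT, added as the starting value of the backward integration. All names are hypothetical.

```python
import numpy as np

def edc_db(energy, tail_correction=0.0):
    """Backward-integrated energy in dB, optionally offset by a tail correction."""
    edc = np.cumsum(energy[::-1])[::-1] + tail_correction
    return 10.0 * np.log10(edc / edc[0])

def fit_rt(level_db, fs, upper_db=-5.0, lower_db=-35.0):
    """RT from a line fitted to the decay between two levels."""
    t = np.arange(len(level_db)) / fs
    mask = (level_db <= upper_db) & (level_db >= lower_db)
    slope, _ = np.polyfit(t[mask], level_db[mask], 1)
    return -60.0 / slope

def compensated_edc_db(energy, fs, tt, noise, k, amp):
    n_tt = int(round(tt * fs))
    r = np.exp(-k / fs)                           # per-sample decay ratio of the model
    truncated = energy[:n_tt] - noise             # truncate at TT, subtract noise level
    correction = amp * r ** n_tt / (1.0 - r)      # modelled decay energy beyond TT
    return edc_db(truncated, correction)

# Idealised squared RIR envelope (assumed values).
fs = 1000
t = np.arange(0, 2.0, 1 / fs)
k = 6.0 * np.log(10.0) / 0.5                      # RT = 0.5 s
amp, noise = 1.0, 1e-4
energy = amp * np.exp(-k * t) + noise
tt = np.log(amp / noise) / k                      # crossing of decay and noise level

rt_raw = fit_rt(edc_db(energy), fs)
rt_comp = fit_rt(compensated_edc_db(energy, fs, tt, noise, k, amp), fs)
print(f"RT uncompensated: {rt_raw:.2f} s, compensated: {rt_comp:.2f} s")
```

With exact parameters the compensated EDC is a straight line in dB up to the TT, which is why errors in the detected TT or decay slope translate directly into errors in the RT.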
Since the characteristic curvature of the EDC contains not only the noise energy but also the exponential decay signal, the truncated EDC fails to describe important non-exponential decay features of the room acoustics. Moreover, given the aim of rendering the noise perceptually inaudible to the human auditory system, a method that removes the noise from the RIR itself, rather than mathematically eliminating its effect on the EDC, is desirable.
Thus, the application of the generalized spectral subtraction (GBSS) algorithm in this study is formulated as an optimization problem: subtracting the noise level from the RIR while preserving the quality of the reverberant decay. Because the algorithm rests on the hypothesis that the noise is stationary or only slowly varying, the measurements are conducted under real-world conditions with relatively stable background noise. The optimization, performed on RIRs measured with artificial noise and with natural ambient noise, aims to find the set of factors that gives the best noise reduction in terms of the largest improvement in dynamic range. The optimal factor is made variable, depending on the estimated SNRs of the RIRs filtered in octave bands. The acoustic parameters, namely the TT, the noise level estimated at the TT, and the corresponding RT, are used as control measures and evaluation criteria to ensure the reliability of the algorithm. The acoustic parameters estimated from the EDCs of the de-noised RIRs, with an acceptable degradation of the reverberant decay slope, give somewhat more stable results in some cases than the compensation method. Moreover, the GBSS algorithm with the optimal factors significantly improves the dynamic range and decreases the estimation error in the RTs caused by noise in the RIR.
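As a rough illustration of the idea, the sketch below applies a generic frame-based spectral subtraction with an over-subtraction factor `alpha`, a spectral floor `beta`, and an exponent `gamma` to a synthetic noisy RIR. It is not necessarily the GBSS formulation used in this study: the factors are fixed rather than SNR-dependent, no octave-band filtering is applied, the noise spectrum is estimated from the last 0.5 s of the signal (assumed to be noise only), and all names and parameter values are hypothetical.

```python
import numpy as np

def spectral_subtract(x, noise_seg, frame=256, alpha=2.0, beta=0.01, gamma=2.0):
    """Subtract an averaged noise spectrum frame by frame, keeping the noisy phase."""
    hop = frame // 2
    win = np.hanning(frame + 1)[:-1]              # periodic Hann, sums to 1 at 50 % overlap
    # Average |N|^gamma over the frames of the noise-only segment.
    frames = np.array([noise_seg[i:i + frame] * win
                       for i in range(0, len(noise_seg) - frame + 1, hop)])
    noise_mag = np.mean(np.abs(np.fft.rfft(frames, axis=1)) ** gamma, axis=0)
    # Pad so that every input sample is covered by two overlapping frames.
    padded = np.concatenate([np.zeros(hop), x, np.zeros(frame)])
    out = np.zeros(len(padded))
    for start in range(0, len(x) + hop, hop):
        spec = np.fft.rfft(padded[start:start + frame] * win)
        mag = np.abs(spec) ** gamma
        # Over-subtract the noise estimate, but never go below the spectral floor.
        clean = np.maximum(mag - alpha * noise_mag, beta * noise_mag)
        spec_clean = clean ** (1.0 / gamma) * np.exp(1j * np.angle(spec))
        out[start:start + frame] += np.fft.irfft(spec_clean, n=frame)
    return out[hop:hop + len(x)]

# Synthetic noisy RIR (assumed values): RT = 0.5 s plus stationary Gaussian noise.
fs = 8000
rng = np.random.default_rng(1)
t = np.arange(0, 2.0, 1 / fs)
rir = np.exp(-3.0 * np.log(10.0) / 0.5 * t) * rng.standard_normal(len(t))
noisy = rir + 0.01 * rng.standard_normal(len(t))

tail = noisy[-fs // 2:]                           # last 0.5 s, treated as noise only
denoised = spectral_subtract(noisy, tail)
gain_db = 10.0 * np.log10(np.mean(tail ** 2) / np.mean(denoised[-fs // 2:] ** 2))
print(f"noise floor lowered by about {gain_db:.1f} dB in the tail")
```

A larger `alpha` lowers the noise floor further but also removes more of the late reverberant decay, which is the trade-off that the optimization of the factors has to balance.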