WebRTC Acoustic Echo Cancellation 3 (AEC3) gives flat output after echo cancellation

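Let me start by saying that I am completely new to WebRTC, so please bear with me if anything I mention sounds naive.

I am writing an app to compare echo cancellation performance between Speex and WebRTC AEC3. I am using the latest branch of the WebRTC AEC3 code base.

The app reads WAV files and feeds the samples to the AEC module, and a WAV writer saves the output of the echo cancellation.

I have two inputs:
1) Speaker input, also called the render signal or far-end signal
2) Mic input, also called the capture signal or near-end signal

And one output:
1) MicOutput, which is the result of the echo cancellation

Now, for the Speex module, I see well-behaved output. Please have a look at the following file; it does a good job of cancelling the render signal from the captured signal.

However, when I pass the same files through WebRTC AEC3, I get a flat signal. Below is the AEC3 result.

It seems to be cancelling the original mic signal as well.

I am using the following parameters (extracted from the WAV file reader):
Sample rate: 8000
Channels: 1
Bits per sample: 16
Number of samples: 270399
Samples fed to the AEC at a time: (10 * sampleRate) / 1000 = 80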
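The frame-size arithmetic above can be checked with a few lines of C++. This is only a sketch: the helper names `SamplesPer10Ms`, `CompleteFrames` and `LeftoverSamples` are made up for illustration and are not part of WebRTC or Speex.

```cpp
#include <cassert>
#include <cstddef>

// Number of samples per channel in one 10 ms frame at the given sample rate.
// (Hypothetical helper, named here for illustration only.)
constexpr std::size_t SamplesPer10Ms(int sample_rate_hz) {
    return static_cast<std::size_t>(sample_rate_hz) * 10 / 1000;
}

// Number of complete 10 ms frames contained in a recording.
inline std::size_t CompleteFrames(std::size_t total_samples, int sample_rate_hz) {
    assert(sample_rate_hz > 0);
    return total_samples / SamplesPer10Ms(sample_rate_hz);
}

// Samples left over after the last complete frame. They have to be dropped
// or zero-padded before being handed to a frame-based processor.
inline std::size_t LeftoverSamples(std::size_t total_samples, int sample_rate_hz) {
    assert(sample_rate_hz > 0);
    return total_samples % SamplesPer10Ms(sample_rate_hz);
}
```

For the file in question, `SamplesPer10Ms(8000)` is 80, so the 270399 samples make 3379 complete frames with 79 samples left over.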

Here is the initialization:

m_streamConfig.set_sample_rate_hz(sampleRate);
m_streamConfig.set_num_channels(CHANNEL_COUNT);

// Create a temporary buffer to convert our RTOP input audio data into the webRTC required AudioBuffer. 
m_tempBuffer[0] = static_cast<float*> (malloc(sizeof(float) * m_samplesPerBlock));

// Create AEC3. The third argument enables the high-pass filter.
m_echoCanceller3.reset(new EchoCanceller3(m_echoCanceller3Config, sampleRate, true));

// Create noise suppression.
m_noiseSuppression.reset(new NoiseSuppressionImpl(&m_criticalSection));
m_noiseSuppression->Initialize(CHANNEL_COUNT, sampleRate);

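Also, for the parameters (delay, echo model, reverb, noise floor, etc.), I am using all the default values.

Can anyone tell me what I am doing wrong? Or how I can make it better by adjusting the appropriate parameters?

Update (22 Feb 2019): I figured out why the echo canceller output was silent. It looks like WebRTC AEC3 cannot process 8 kHz and 16 kHz sample rates, although the source code suggests that it supports four different sample rates: 8k, 16k, 32k and 48k. After feeding in 32k and 48k samples, I do get output from the echo canceller. However, I do not see any echo cancellation: it just spits out exactly the samples that were fed in as the near-end/mic/capture input. So yes, I am probably missing key parameter settings. Still looking for help.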
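The two symptoms described in the update (muted output, and output identical to the capture input) can be told apart numerically rather than by looking at waveforms. The helpers below are a minimal sketch under stated assumptions: `Rms` and `MaxAbsDiff` are hypothetical names, not WebRTC or Speex functions, and the signals are assumed to be already loaded as float vectors of equal length.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Root-mean-square level of a signal; returns 0 for an empty signal.
inline double Rms(const std::vector<float>& x) {
    if (x.empty()) return 0.0;
    double sum = 0.0;
    for (float v : x) sum += static_cast<double>(v) * v;
    return std::sqrt(sum / static_cast<double>(x.size()));
}

// Largest absolute sample-by-sample difference between two signals
// of the same length.
inline double MaxAbsDiff(const std::vector<float>& a, const std::vector<float>& b) {
    assert(a.size() == b.size());
    double max_diff = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        max_diff = std::max(max_diff, std::fabs(static_cast<double>(a[i]) - b[i]));
    }
    return max_diff;
}
```

An output RMS close to zero indicates the flat (muted) case, while a `MaxAbsDiff` of zero between the mic input and the output indicates that the signal passed through unchanged.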

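AudioBuffer

The most important thing is the so-called "delay". You can find its definition in audio_processing.h:

Sets the |delay| in ms between ProcessReverseStream() receiving a far-end frame and ProcessStream() receiving a near-end frame containing the corresponding echo. On the client side this can be expressed as

delay = (t_render - t_analyze) + (t_process - t_capture)

where t_analyze, t_render, t_capture and t_process are timestamps of the render and capture frames.

2. EchoCanceller3 delay

SetAudioBufferDelay(int delay)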

auto renderAudioBuffer = CreateAudioBuffer(spkSamples);
auto capturedAudioBuffer = CreateAudioBuffer(micSamples);

// Analyze capture buffer
m_echoCanceller3->AnalyzeCapture(capturedAudioBuffer.get());

// Analyze render buffer
m_echoCanceller3->AnalyzeRender(renderAudioBuffer.get());

// Cancel echo. The second argument is false because the analog level is
// assumed not to change. To detect a change, use a gain controller and
// remember the analog level of the previously rendered audio.
m_echoCanceller3->ProcessCapture(capturedAudioBuffer.get(), false);

// Copy the Captured audio out 
capturedAudioBuffer->CopyTo(m_streamConfig, m_tempBuffer);

arrayCopy_32f(m_tempBuffer[0], micOut, m_samplesPerBlock);

The terms in the delay formula are defined as follows:

 - t_analyze is the time a frame is passed to ProcessReverseStream() and
   t_render is the time the first sample of the same frame is rendered by
   the audio hardware.
 - t_capture is the time the first sample of a frame is captured by the
   audio hardware and t_process is the time the same frame is passed to
   ProcessStream().
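As a worked example of the delay formula, the sketch below computes the delay from the four timestamps. The struct `FrameTimestampsMs` and the function `StreamDelayMs` are hypothetical names introduced for illustration; they are not WebRTC APIs, and all timestamps are assumed to come from the same millisecond clock.

```cpp
#include <cassert>
#include <cstdint>

// Timestamps in milliseconds, all taken from the same clock.
// (Hypothetical type, for illustration only.)
struct FrameTimestampsMs {
    std::int64_t t_analyze;  // render frame passed to ProcessReverseStream()
    std::int64_t t_render;   // first sample of that frame played by the hardware
    std::int64_t t_capture;  // first sample of a capture frame recorded by the hardware
    std::int64_t t_process;  // that capture frame passed to ProcessStream()
};

// delay = (t_render - t_analyze) + (t_process - t_capture)
inline int StreamDelayMs(const FrameTimestampsMs& t) {
    assert(t.t_render >= t.t_analyze);   // a frame is played after it is analyzed
    assert(t.t_process >= t.t_capture);  // a frame is processed after it is captured
    return static_cast<int>((t.t_render - t.t_analyze) +
                            (t.t_process - t.t_capture));
}
```

For instance, a render frame analyzed at 100 ms and played at 130 ms, together with a capture frame recorded at 200 ms and processed at 212 ms, gives a delay of 30 + 12 = 42 ms.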