
WebRTC Subband Splitting, Part 2 — SplittingFilter Source Code Analysis

Table of Contents

  • Preface
  • I. The Band-Splitting Process
    • 1. Creating the SplittingFilter
    • 2. Overall Band-Splitting Flow
      • 1) When splitting happens
      • 2) Splitting rules
      • 3) Core splitting code
    • 3. Band Synthesis
  • II. Algorithm Implementation
    • 1. Underlying Principles
    • 2. All-Pass QMF System Source
      • 1) Increasing precision
      • 2) Passing through the cascaded all-pass filters
      • 3) The separated low and high bands
    • 3. Filter Algorithm Implementation
      • 1) Filter coefficients
      • 2) All-pass filter cascade
  • Summary


Preface

In earlier articles, I gave an overview of what the 3A algorithms do and where they are used, and then, using the WebRTC open-source framework, explained how to enable the audio 3A modules and their options, and walked through, with code, how the near-end and far-end signals are fed into the 3A module instances.

The main job of the SplittingFilter in WebRTC is to split the signal into subbands with a filter bank, as preparation for the echo cancellation that follows. This article first walks through the overall flow of the SplittingFilter code, covering the business logic for creation, splitting, and merging, and then goes further into how the algorithm is implemented and its details.

| Copyright notice: 山河君. Reproduction without the author's permission is prohibited.

I. The Band-Splitting Process

1. Creating the SplittingFilter

Let's start at the very beginning with when the SplittingFilter is created. If you know WebRTC to some degree, you know that audio data is stored in an AudioBuffer; when the AudioBuffer is constructed, the SplittingFilter is created along with it:

AudioBuffer::AudioBuffer(size_t input_rate,
                         size_t input_num_channels,
                         size_t buffer_rate,
                         size_t buffer_num_channels,
                         size_t output_rate,
                         size_t output_num_channels)
    : input_num_frames_(static_cast<int>(input_rate) / 100),
      ......
      num_bands_(NumBandsFromFramesPerChannel(buffer_num_frames_)),  // Determine how many bands are needed.
{
  .....
  if (num_bands_ > 1) {  // Decide whether a SplittingFilter is needed.
    split_data_.reset(new ChannelBuffer<float>(buffer_num_frames_,
                                               buffer_num_channels_,
                                               num_bands_));
    splitting_filter_.reset(new SplittingFilter(buffer_num_channels_,
                                                num_bands_,
                                                buffer_num_frames_));
  }
}

As mentioned earlier, for speech signals the voice range of interest lies within 0–8 kHz, so:

  • At a 32 kHz sample rate the signal should be split into 0–8 kHz and 8–16 kHz.
  • At a 48 kHz sample rate it is split into 0–8 kHz, 8–16 kHz, and 16–24 kHz.

So at a 16 kHz sample rate there is no need to split the bands any further, and NumBandsFromFramesPerChannel(buffer_num_frames_) determines from the sample rate (via the per-channel frame count) how many bands are needed.
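As a rough illustration of that mapping, here is a minimal sketch assuming 10 ms frames and a 160-sample band size; the actual helper in audio_buffer.cc may differ in detail:

#include <cstddef>

// Hedged sketch, not the verbatim WebRTC helper: with 10 ms frames a band is
// 160 samples (16 kHz), so 160/320/480 frames per channel map to 1/2/3 bands.
size_t NumBandsFromFramesPerChannelSketch(size_t num_frames) {
  constexpr size_t kSplitBandSize = 160;           // samples per band per 10 ms
  if (num_frames == 2 * kSplitBandSize) return 2;  // 32 kHz
  if (num_frames == 3 * kSplitBandSize) return 3;  // 48 kHz
  return 1;                                        // 16 kHz or below: no split
}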

2. Overall Band-Splitting Flow

1) When splitting happens

The AudioBuffer performs the band split at the appropriate points. The near-end (capture) signal, for example, is split when conditions such as the following are met (high-pass filter enabled, noise suppression enabled, and so on):

bool AudioProcessingImpl::SubmoduleStates::CaptureMultiBandProcessingActive(
    bool ec_processing_active) const {
  return high_pass_filter_enabled_ || mobile_echo_controller_enabled_ ||
         noise_suppressor_enabled_ || adaptive_gain_controller_enabled_ ||
         (echo_controller_enabled_ && ec_processing_active);
}

if (submodule_states_.CaptureMultiBandSubModulesActive() &&
    SampleRateSupportsMultiBand(
        capture_nonlocked_.capture_processing_format.sample_rate_hz())) {
  capture_buffer->SplitIntoFrequencyBands();
}

Here SplitIntoFrequencyBands() is the entry point that kicks off the split:

void AudioBuffer::SplitIntoFrequencyBands() {
  splitting_filter_->Analysis(data_.get(), split_data_.get());
}

2) Splitting rules

Depending on the number of bands, the SplittingFilter splits the signal either into two bands (0–8 kHz, 8–16 kHz) or into three bands (0–8 kHz, 8–16 kHz, 16–24 kHz):

void SplittingFilter::Analysis(const ChannelBuffer<float>* data,
                               ChannelBuffer<float>* bands) {
  .....
  if (bands->num_bands() == 2) {
    TwoBandsAnalysis(data, bands);    // 0-8 kHz, 8-16 kHz
  } else if (bands->num_bands() == 3) {
    ThreeBandsAnalysis(data, bands);  // 0-8 kHz, 8-16 kHz, 16-24 kHz
  }
}

3) Core splitting code

If the signal is split into two bands, the core code is the following:

void SplittingFilter::TwoBandsAnalysis(const ChannelBuffer<float>* data,
                                       ChannelBuffer<float>* bands) {
  RTC_DCHECK_EQ(two_bands_states_.size(), data->num_channels());
  RTC_DCHECK_EQ(data->num_frames(), kTwoBandFilterSamplesPerFrame);
  for (size_t i = 0; i < two_bands_states_.size(); ++i) {
    std::array<std::array<int16_t, kSamplesPerBand>, 2> bands16;
    std::array<int16_t, kTwoBandFilterSamplesPerFrame> full_band16;
    FloatS16ToS16(data->channels(0)[i], full_band16.size(), full_band16.data());
    WebRtcSpl_AnalysisQMF(full_band16.data(), data->num_frames(),
                          bands16[0].data(), bands16[1].data(),
                          two_bands_states_[i].analysis_state1,
                          two_bands_states_[i].analysis_state2);
    S16ToFloatS16(bands16[0].data(), bands16[0].size(), bands->channels(0)[i]);
    S16ToFloatS16(bands16[1].data(), bands16[1].size(), bands->channels(1)[i]);
  }
}
  • The core steps:
    • FloatS16ToS16: in the multi-channel case, take one channel's data and convert it from FloatS16 to int16.
    • WebRtcSpl_AnalysisQMF performs the band split, writing the two bands' audio into bands16[0] and bands16[1]; two_bands_states_[i].analysis_state1/analysis_state2 hold that channel's filter state for the two branches.
    • S16ToFloatS16 converts the two bands' data back to float. (A sketch of what these conversions amount to follows this list.)
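The "FloatS16" format simply means float samples that already use the int16 value range. As a rough sketch of what the two conversions amount to (the real helpers live in WebRTC's audio_util and may round and clamp slightly differently, so treat this as an assumption):

#include <algorithm>
#include <cmath>
#include <cstddef>
#include <cstdint>

// Sketch only: FloatS16 samples (floats in [-32768, 32767]) are rounded and
// clamped into int16_t, and converted back to float without any rescaling.
void FloatS16ToS16Sketch(const float* src, size_t size, int16_t* dest) {
  for (size_t i = 0; i < size; ++i) {
    const float v = std::round(src[i]);
    dest[i] = static_cast<int16_t>(std::min(32767.0f, std::max(-32768.0f, v)));
  }
}

void S16ToFloatS16Sketch(const int16_t* src, size_t size, float* dest) {
  for (size_t i = 0; i < size; ++i) dest[i] = static_cast<float>(src[i]);
}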

3. Band Synthesis

Since the bands can be split apart, they can of course also be merged back into the original full-band audio; the process mirrors the analysis step:

void SplittingFilter::TwoBandsSynthesis(const ChannelBuffer<float>* bands,
                                        ChannelBuffer<float>* data) {
  RTC_DCHECK_LE(data->num_channels(), two_bands_states_.size());
  RTC_DCHECK_EQ(data->num_frames(), kTwoBandFilterSamplesPerFrame);
  for (size_t i = 0; i < data->num_channels(); ++i) {
    std::array<std::array<int16_t, kSamplesPerBand>, 2> bands16;
    std::array<int16_t, kTwoBandFilterSamplesPerFrame> full_band16;
    FloatS16ToS16(bands->channels(0)[i], bands16[0].size(), bands16[0].data());
    FloatS16ToS16(bands->channels(1)[i], bands16[1].size(), bands16[1].data());
    WebRtcSpl_SynthesisQMF(bands16[0].data(), bands16[1].data(),
                           bands->num_frames_per_band(), full_band16.data(),
                           two_bands_states_[i].synthesis_state1,
                           two_bands_states_[i].synthesis_state2);
    S16ToFloatS16(full_band16.data(), full_band16.size(), data->channels(0)[i]);
  }
}
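To make the analysis/synthesis pair concrete, here is a minimal round-trip sketch for one 10 ms channel at 32 kHz. It is a usage illustration rather than WebRTC code, and it assumes each state array holds six int32_t values (two per cascaded all-pass stage, as in the TwoBandsStates struct):

#include <cstdint>
#include "common_audio/signal_processing/include/signal_processing_library.h"

constexpr size_t kFullFrame = 320;  // full-band samples in 10 ms at 32 kHz
constexpr size_t kBandFrame = 160;  // samples per band

void SplitAndMerge(const int16_t in[kFullFrame], int16_t out[kFullFrame]) {
  int16_t low[kBandFrame];
  int16_t high[kBandFrame];
  int32_t analysis_state1[6] = {0};
  int32_t analysis_state2[6] = {0};
  int32_t synthesis_state1[6] = {0};
  int32_t synthesis_state2[6] = {0};

  // Full band -> low band (0-8 kHz) and high band (8-16 kHz).
  WebRtcSpl_AnalysisQMF(in, kFullFrame, low, high,
                        analysis_state1, analysis_state2);

  // ... per-band processing (AEC, NS, ...) would run here ...

  // Recombine the two bands into a full-band signal.
  WebRtcSpl_SynthesisQMF(low, high, kBandFrame, out,
                         synthesis_state1, synthesis_state2);
}

In real use the state arrays have to persist across frames (as two_bands_states_ does); declaring them locally as above would reset the filters every 10 ms.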

II. Algorithm Implementation

1. Underlying Principles

The WebRTC SplittingFilter source mainly builds on the following ideas:

  • QMF (quadrature mirror filter) banks: Speech Signal Processing 29 — QMF Filter Design Principles and Implementation
  • Polyphase decomposition and the noble identities: Speech Signal Processing 30 — Efficient Polyphase Decimators
  • All-pass QMF filters: WebRTC Subband Splitting, Part 1 — All-Pass QMF Filters

WebRTC implements the all-pass QMF with cascades of three first-order all-pass filters (one cascade per polyphase branch); the overall structure is as follows:

Input: x[n]
   |
   ├─ Even x[2n]   → A1(z) → A2(z) → A3(z) ──┐
   |                                          ↓ half_band_low[n]
   └─ Odd  x[2n+1] → A1(z) → A2(z) → A3(z) ──┘
                                              ↓ half_band_high[n]
                                        [used for speech processing]

Synthesis:
   x̂[2n]   = low + high
   x̂[2n+1] = low - high
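Before looking at the fixed-point code, a floating-point sketch of the structure above may help. This is my own illustration (the coefficients are simply the Q16 values shown later divided by 65536), not the WebRTC implementation:

#include <cstddef>

// One first-order all-pass stage: y[n] = x[n-1] + a * (x[n] - y[n-1]).
struct AllPass {
  double a;
  double x_prev = 0.0;
  double y_prev = 0.0;
  double Process(double x) {
    const double y = x_prev + a * (x - y_prev);
    x_prev = x;
    y_prev = y;
    return y;
  }
};

// Two-band analysis sketch: even/odd split, two cascades of three all-pass
// stages, then half the sum and half the difference of the branch outputs.
void TwoBandAnalysisSketch(const double* x, size_t n, double* low, double* high) {
  static AllPass odd_chain[3]  = {{6418 / 65536.0}, {36982 / 65536.0}, {57261 / 65536.0}};
  static AllPass even_chain[3] = {{21333 / 65536.0}, {49062 / 65536.0}, {63010 / 65536.0}};
  for (size_t i = 0; i < n / 2; ++i) {
    double odd = x[2 * i + 1];   // corresponds to half_in1 in the C code below
    double even = x[2 * i];      // corresponds to half_in2
    for (AllPass& ap : odd_chain) odd = ap.Process(odd);
    for (AllPass& ap : even_chain) even = ap.Process(even);
    low[i]  = 0.5 * (odd + even);
    high[i] = 0.5 * (odd - even);
  }
}

(The static filter state keeps the recursion continuous across calls, which is exactly what the filter_state arrays do in the fixed-point code.)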

2. All-Pass QMF System Source


void WebRtcSpl_AnalysisQMF(const int16_t* in_data, size_t in_data_length,
                           int16_t* low_band, int16_t* high_band,
                           int32_t* filter_state1, int32_t* filter_state2)
{
    size_t i;
    int16_t k;
    int32_t tmp;
    int32_t half_in1[kMaxBandFrameLength];
    int32_t half_in2[kMaxBandFrameLength];
    int32_t filter1[kMaxBandFrameLength];
    int32_t filter2[kMaxBandFrameLength];
    const size_t band_length = in_data_length / 2;
    RTC_DCHECK_EQ(0, in_data_length % 2);
    RTC_DCHECK_LE(band_length, kMaxBandFrameLength);

    // Split even and odd samples. Also shift them to Q10.
    for (i = 0, k = 0; i < band_length; i++, k += 2)
    {
        half_in2[i] = ((int32_t)in_data[k]) * (1 << 10);
        half_in1[i] = ((int32_t)in_data[k + 1]) * (1 << 10);
    }

    // All pass filter even and odd samples, independently.
    WebRtcSpl_AllPassQMF(half_in1, band_length, filter1,
                         WebRtcSpl_kAllPassFilter1, filter_state1);
    WebRtcSpl_AllPassQMF(half_in2, band_length, filter2,
                         WebRtcSpl_kAllPassFilter2, filter_state2);

    // Take the sum and difference of filtered version of odd and even
    // branches to get upper & lower band.
    for (i = 0; i < band_length; i++)
    {
        tmp = (filter1[i] + filter2[i] + 1024) >> 11;
        low_band[i] = WebRtcSpl_SatW32ToW16(tmp);

        tmp = (filter1[i] - filter2[i] + 1024) >> 11;
        high_band[i] = WebRtcSpl_SatW32ToW16(tmp);
    }
}

1) Increasing precision

Following the overall structure, the signal is first split into its even- and odd-indexed samples:

    for (i = 0, k = 0; i < band_length; i++, k += 2)
    {
        half_in2[i] = ((int32_t)in_data[k]) * (1 << 10);
        half_in1[i] = ((int32_t)in_data[k + 1]) * (1 << 10);
    }

Note that the * (1 << 10) is there to improve computational precision. WebRTC commonly works with Q10 fixed-point values (the sample left-shifted by 10 bits) so that the multiplications and accumulations inside the filter keep extra fractional bits and avoid precision loss.
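A small, purely illustrative example of what the Q10 scaling buys (not WebRTC code):

#include <cstdint>

int main() {
  const int16_t sample = 12345;
  // Q10: the value 12345.0 is stored as 12345 * 2^10; a full-scale int16
  // sample still fits easily in 32 bits (32767 * 1024 < 2^25).
  const int32_t q10 = static_cast<int32_t>(sample) * (1 << 10);

  // Multiplying by a Q16 coefficient (here 6418 ≈ 0.0979) and shifting the
  // product back by 16 keeps the result in Q10, with 10 fractional bits intact.
  const int32_t scaled_q10 =
      static_cast<int32_t>((static_cast<int64_t>(q10) * 6418) >> 16);

  // Only the final conversion back to Q0 rounds away the fractional part.
  const int16_t back = static_cast<int16_t>((scaled_q10 + 512) >> 10);  // ≈ 1209
  (void)back;
  return 0;
}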

2) Passing through the cascaded all-pass filters

    WebRtcSpl_AllPassQMF(half_in1, band_length, filter1,
                         WebRtcSpl_kAllPassFilter1, filter_state1);
    WebRtcSpl_AllPassQMF(half_in2, band_length, filter2,
                         WebRtcSpl_kAllPassFilter2, filter_state2);

Where:

  • half_in1: the odd-indexed samples in_data[2n+1] (one polyphase branch)
  • half_in2: the even-indexed samples in_data[2n] (the other polyphase branch)
  • filter1, filter2: the filtered output of each branch
  • WebRtcSpl_kAllPassFilter1: the all-pass coefficients applied to half_in1
  • WebRtcSpl_kAllPassFilter2: the all-pass coefficients applied to half_in2
  • filter_state1, filter_state2: the saved filter states, which keep the filtering continuous across consecutive frames

The internal implementation is described further below.

3) The separated low and high bands

According to the all-pass QMF filter design:
$$H_0(z)=\frac{1}{2}[A(z)+A(-z)]$$
$$H_1(z)=\frac{1}{2}[A(z)-A(-z)]$$
so after filtering, the low-band and high-band signals are obtained as:

    for (i = 0; i < band_length; i++)
    {
        tmp = (filter1[i] + filter2[i] + 1024) >> 11;
        low_band[i] = WebRtcSpl_SatW32ToW16(tmp);

        tmp = (filter1[i] - filter2[i] + 1024) >> 11;
        high_band[i] = WebRtcSpl_SatW32ToW16(tmp);
    }

The right shift here is by 11 bits rather than 10: ten of those bits undo the Q10 scaling, and the extra bit is exactly the division by 2 required by the all-pass QMF design; the +1024 is the rounding offset (half of 2^11) for the shift.
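A tiny check of that arithmetic with illustrative values (not WebRTC code):

#include <cstdint>
#include <cstdio>

int main() {
  // Two Q10 values: 51200 represents 50.0 and 30720 represents 30.0.
  const int32_t filter1 = 51200;
  const int32_t filter2 = 30720;
  // >> 11 divides by 2048 = 1024 (Q10 -> Q0) * 2 (the 1/2 in H0/H1);
  // the +1024 rounds instead of truncating.
  const int32_t low = (filter1 + filter2 + 1024) >> 11;
  std::printf("%d\n", low);  // prints 40, i.e. (50.0 + 30.0) / 2
  return 0;
}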

3. Filter Algorithm Implementation

1) Filter coefficients

From the code above, the three all-pass coefficients used for each of the two branches are:


// QMF filter coefficients in Q16.
static const uint16_t WebRtcSpl_kAllPassFilter1[3] = {6418, 36982, 57261};
static const uint16_t WebRtcSpl_kAllPassFilter2[3] = {21333, 49062, 63010};
Item                       | Description
WebRtcSpl_kAllPassFilter1  | Q16 coefficients of the first three-stage cascade of first-order all-pass filters (the half_in1 branch)
WebRtcSpl_kAllPassFilter2  | Q16 coefficients of the second three-stage cascade of first-order all-pass filters (the half_in2 branch)
Format                     | Q16, i.e. each entry represents value / 65536
Origin                     | The classic QMF structure design, used for two-way subband splitting of speech signals
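For intuition, the Q16 values translate to the following floating-point coefficients (simple arithmetic, shown only for illustration):

#include <cstdint>
#include <cstdio>

int main() {
  const uint16_t f1[3] = {6418, 36982, 57261};   // WebRtcSpl_kAllPassFilter1
  const uint16_t f2[3] = {21333, 49062, 63010};  // WebRtcSpl_kAllPassFilter2
  for (int k = 0; k < 3; ++k) {
    // Q16 -> float: divide by 2^16.
    std::printf("stage %d: a1 = %.4f, a2 = %.4f\n",
                k + 1, f1[k] / 65536.0, f2[k] / 65536.0);
  }
  // Prints roughly: 0.0979/0.3255, 0.5643/0.7486, 0.8737/0.9615.
  return 0;
}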

Plotted in MATLAB, the phase responses of A(z) and A(-z) look like the figure below:
[Figure: phase responses of A(z) and A(-z)]

2) All-pass filter cascade


void WebRtcSpl_AllPassQMF(int32_t* in_data, size_t data_length,
                          int32_t* out_data, const uint16_t* filter_coefficients,
                          int32_t* filter_state)
{
    // The procedure is to filter the input with three first order all pass
    // filters (cascade operations).
    //
    //         a_3 + q^-1    a_2 + q^-1    a_1 + q^-1
    // y[n] =  -----------   -----------   -----------   x[n]
    //         1 + a_3q^-1   1 + a_2q^-1   1 + a_1q^-1
    //
    // The input vector |filter_coefficients| includes these three filter
    // coefficients. The filter state contains the in_data state, in_data[-1],
    // followed by the out_data state, out_data[-1]. This is repeated for each
    // cascade. The first cascade filter will filter the |in_data| and store
    // the output in |out_data|. The second will take the |out_data| as input
    // and make an intermediate storage in |in_data|, to save memory. The third,
    // and final, cascade filter operation takes the |in_data| (which is the
    // output from the previous cascade filter) and stores the output in
    // |out_data|. Note that the input vector values are changed during the
    // process.
    size_t k;
    int32_t diff;

    // First all-pass cascade; filter from in_data to out_data.
    // Let y_i[n] indicate the output of cascade filter i (with filter
    // coefficient a_i) at vector position n. Then the final output will be
    // y[n] = y_3[n].

    // First loop, use the states stored in memory.
    // "diff" should be safe from wrap around since max values are 2^25.
    // diff = (x[0] - y_1[-1])
    diff = WebRtcSpl_SubSatW32(in_data[0], filter_state[1]);
    // y_1[0] = x[-1] + a_1 * (x[0] - y_1[-1])
    out_data[0] = WEBRTC_SPL_SCALEDIFF32(filter_coefficients[0], diff, filter_state[0]);

    // For the remaining loops, use previous values.
    for (k = 1; k < data_length; k++)
    {
        // diff = (x[n] - y_1[n-1])
        diff = WebRtcSpl_SubSatW32(in_data[k], out_data[k - 1]);
        // y_1[n] = x[n-1] + a_1 * (x[n] - y_1[n-1])
        out_data[k] = WEBRTC_SPL_SCALEDIFF32(filter_coefficients[0], diff, in_data[k - 1]);
    }

    // Update states.
    filter_state[0] = in_data[data_length - 1];   // x[N-1], becomes x[-1] next time
    filter_state[1] = out_data[data_length - 1];  // y_1[N-1], becomes y_1[-1] next time

    // Second all-pass cascade; filter from out_data to in_data.
    // diff = (y_1[0] - y_2[-1])
    diff = WebRtcSpl_SubSatW32(out_data[0], filter_state[3]);
    // y_2[0] = y_1[-1] + a_2 * (y_1[0] - y_2[-1])
    in_data[0] = WEBRTC_SPL_SCALEDIFF32(filter_coefficients[1], diff, filter_state[2]);
    for (k = 1; k < data_length; k++)
    {
        // diff = (y_1[n] - y_2[n-1])
        diff = WebRtcSpl_SubSatW32(out_data[k], in_data[k - 1]);
        // y_2[n] = y_1[n-1] + a_2 * (y_1[n] - y_2[n-1])
        in_data[k] = WEBRTC_SPL_SCALEDIFF32(filter_coefficients[1], diff, out_data[k - 1]);
    }

    filter_state[2] = out_data[data_length - 1];  // y_1[N-1], becomes y_1[-1] next time
    filter_state[3] = in_data[data_length - 1];   // y_2[N-1], becomes y_2[-1] next time

    // Third all-pass cascade; filter from in_data to out_data.
    // diff = (y_2[0] - y[-1])
    diff = WebRtcSpl_SubSatW32(in_data[0], filter_state[5]);
    // y[0] = y_2[-1] + a_3 * (y_2[0] - y[-1])
    out_data[0] = WEBRTC_SPL_SCALEDIFF32(filter_coefficients[2], diff, filter_state[4]);
    for (k = 1; k < data_length; k++)
    {
        // diff = (y_2[n] - y[n-1])
        diff = WebRtcSpl_SubSatW32(in_data[k], out_data[k - 1]);
        // y[n] = y_2[n-1] + a_3 * (y_2[n] - y[n-1])
        out_data[k] = WEBRTC_SPL_SCALEDIFF32(filter_coefficients[2], diff, in_data[k - 1]);
    }

    filter_state[4] = in_data[data_length - 1];   // y_2[N-1], becomes y_2[-1] next time
    filter_state[5] = out_data[data_length - 1];  // y[N-1], becomes y[-1] next time
}

The algorithm is essentially spelled out in the comments already; to summarize, there are three stages in total, each a first-order all-pass filter:

    //         a_3 + q^-1    a_2 + q^-1    a_1 + q^-1
    // y[n] =  -----------   -----------   -----------   x[n]
    //         1 + a_3q^-1   1 + a_2q^-1   1 + a_1q^-1

Starting from the transfer function of a first-order all-pass filter:
$$H(q)=\frac{a+q^{-1}}{1+aq^{-1}}$$
$$\Longrightarrow (1+aq^{-1})\,y[n]=(a+q^{-1})\,x[n]$$
$$\Longrightarrow y[n]+a\,y[n-1]=a\,x[n]+x[n-1]$$
$$\Longrightarrow y[n]=x[n-1]+a\cdot(x[n]-y[n-1])$$
This maps onto the code as follows (a floating-point sketch of one cascade stage is given after the list):

  • Each all-pass stage's initial state is the previous frame's input and output of that same stage, for example:
    • filter_state[0]: x[-1]
    • filter_state[1]: y[-1]
  • Compute x[n] - y[n-1]: diff = WebRtcSpl_SubSatW32(in_data[0], filter_state[1]);
  • Compute y[n]: out_data[0] = WEBRTC_SPL_SCALEDIFF32(filter_coefficients[0], diff, filter_state[0]);
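Putting those pieces together, one cascade rendered in floating point might look like the sketch below (a corresponds to filter_coefficients[k]/65536; this mirrors the structure of the fixed-point loop rather than reproducing it bit-exactly):

#include <cstddef>

// Sketch of one all-pass cascade of WebRtcSpl_AllPassQMF in floating point.
// state[0] holds x[-1] from the previous frame, state[1] holds y[-1].
void AllPassCascadeSketch(const double* x, size_t n, double* y,
                          double a, double state[2]) {
  // First sample uses the saved state: y[0] = x[-1] + a * (x[0] - y[-1]).
  y[0] = state[0] + a * (x[0] - state[1]);
  // Remaining samples use in-frame history: y[k] = x[k-1] + a * (x[k] - y[k-1]).
  for (size_t k = 1; k < n; ++k) {
    y[k] = x[k - 1] + a * (x[k] - y[k - 1]);
  }
  // Save x[N-1] and y[N-1] so the next frame continues seamlessly.
  state[0] = x[n - 1];
  state[1] = y[n - 1];
}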

Summary

This post explained what the SplittingFilter does and why it is used, and went through the band-splitting flow in depth by reading the WebRTC source. To understand why it is designed this way, read it together with the previous article on the all-pass QMF principle.

If this was helpful to you, please give it a like!
