
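MMR-Mamba: Multi-modal MRI reconstruction with Mamba and spatial-frequency information fusion | Literature Express: latest papers in deep-learning medical AI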

Source: original, 2025/5/30 8:30:50

Title

题目

MMR-Mamba: Multi-modal MRI reconstruction with Mamba and spatial-frequency information fusion

MMR-Mamba：基于 Mamba 和空间频率信息融合的多模态 MRI 重建

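Introduction

Magnetic resonance imaging (MRI) is an important clinical imaging technique because it is non-invasive, radiation-free, and able to provide high-resolution morphological information through different modalities (Stoja et al., 2021). In clinical practice, multi-modal MR images with complementary information are acquired together to support more accurate diagnosis and treatment planning (Feng et al., 2022). For example, in brain MR imaging, T1-weighted images (T1WI) provide detailed anatomical information, while T2-weighted images (T2WI) help detect edema, inflammation, and fluid-filled structures (Menze et al., 2014). Similarly, in knee imaging, proton density-weighted images (PDWI) reveal structural information, while fat-suppressed proton density-weighted images (FS-PDWI) suppress the fat signal and highlight cartilage and ligaments (Chen et al., 2015). However, because k-space data are acquired sequentially, MR imaging is inherently slow, which causes patient discomfort and raises operating costs (Plenge et al., 2012). Accelerating MR imaging, in particular reconstructing high-quality MR images from under-sampled k-space data, is therefore an urgent clinical need (Guo et al., 2023).

Prior work (Bilgic et al., 2011; Song et al., 2019; Lai et al., 2017; Xiang et al., 2018; Sun et al., 2019; Lyu et al., 2023; Li et al., 2022) has shown that using an easily acquired modality (a reference modality such as T1WI or PDWI) as auxiliary guidance to reconstruct a slower-to-acquire target modality (such as T2WI or FS-PDWI) is a promising approach, known as multi-modal MRI reconstruction. The main challenge of this task is to comprehensively explore long-range dependencies within each modality and to make effective use of the complementary information in the reference modality. Early methods used convolutional neural networks (CNNs) to integrate multi-modal information (Xiang et al., 2018; Lyu et al., 2020; Xuan et al., 2022), but such methods are typically locally sensitive and lack long-range dependencies, which limits their ability to integrate the relevant features of the two modalities for faithful MRI reconstruction. In contrast, Transformer-based models (Feng et al., 2022; Li et al., 2023; Huang et al., 2023), characterized by large receptive fields and global sensitivity, generally outperform CNNs at capturing broad contextual information. However, because their resource requirements grow quadratically with sequence length, these models incur a heavy computational overhead. It is therefore essential to develop an algorithm that can comprehensively explore long-range dependencies and integrate complementary information from different modalities without a significant computational burden.

Recently, Mamba (Gu and Dao, 2023), an improved structured state space sequence model with a selective scan mechanism, has emerged as a strong alternative to the Transformer because it can model long-range sequence relationships with linear complexity. In tasks that involve long-term dependency modeling, such as natural language processing (Gupta et al., 2022; Qin et al., 2023) and medical image segmentation (Xing et al., 2024; Ma et al., 2024), Mamba has outperformed Transformers. It is therefore worthwhile to explore Mamba's potential for long-range dependency modeling and complementary information fusion in multi-modal MRI reconstruction.

On the other hand, each component in the frequency domain is a combination of all pixel values in the spatial domain, which means that frequency-domain features capture overall patterns and structures and provide a global view of the entire image. In addition, prior work has shown that Fourier features are effective at recovering high-frequency signals, which are crucial for mitigating image degradation (Tancik et al., 2020). Feature fusion in the frequency domain can therefore achieve comprehensive and efficient global feature integration across modalities. The Mamba block and the frequency domain thus offer two promising routes to comprehensive and efficient fusion of multi-modal information.

Based on this analysis, we propose MMR-Mamba, a new framework for multi-modal MRI reconstruction. Built on the Mamba architecture, it jointly explores complementary information fusion in the spatial and frequency domains through a Target modality-guided Cross Mamba (TCM) module and a Selective Frequency Fusion (SFF) module, respectively. We also introduce an Adaptive Spatial-Frequency Fusion (ASFF) module so that the features of the two domains enhance each other. Specifically, Mamba blocks extract the features of each modality. We then design TCM for spatial-domain fusion, which selectively supplements the target modality with relevant features from the reference modality through target modality-guided cross Mamba. In the frequency-domain SFF module, we sum the phase spectra element-wise and selectively integrate the amplitude spectra, because the phase spectra of the two modalities mainly contain consistent structural information, whereas the amplitude spectra of different modalities contain incompatible style information. Finally, the ASFF module enhances the fused features from the two domains: less informative channels in one domain are supplemented with the corresponding channel features from the other domain. The ASFF module thereby integrates relevant information and suppresses redundant features.

Our main contributions are summarized as follows:

• We propose MMR-Mamba, an efficient framework for multi-modal MRI reconstruction. To the best of our knowledge, this is the first work to explore Mamba for integrating complementary information from multi-modal MR images.
• We design the TCM module for complementary feature fusion in the spatial domain and the SFF module for global structural information fusion in the frequency domain.
• We introduce the ASFF module for spatial-frequency information fusion, which enhances task-relevant features while suppressing irrelevant features from both domains.
• Extensive experiments on the BraTS and fastMRI knee datasets show that MMR-Mamba consistently outperforms existing methods with fewer parameters (see Fig. 1).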
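The SFF rule described in the introduction (element-wise sum of phase spectra, selective integration of amplitude spectra) can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function name `sff_sketch` and the `gate` argument are hypothetical, and a fixed convex combination stands in for whatever learned selection the real module applies to the amplitudes.

```python
import numpy as np

def sff_sketch(target_feat, ref_feat, gate):
    """Fuse two real-valued 2D feature maps in the Fourier domain.

    gate: scalar or array in [0, 1]; a hypothetical stand-in for the learned
    selection of amplitude components (1 keeps the target amplitude only).
    """
    F_t = np.fft.fft2(target_feat)
    F_r = np.fft.fft2(ref_feat)

    # Split each spectrum into amplitude and phase.
    amp_t, pha_t = np.abs(F_t), np.angle(F_t)
    amp_r, pha_r = np.abs(F_r), np.angle(F_r)

    # Phase spectra are summed element-wise, as described in the text.
    pha = pha_t + pha_r
    # Amplitude spectra are combined selectively (assumed form: convex gate).
    amp = gate * amp_t + (1.0 - gate) * amp_r

    # Recombine and return to the spatial domain.
    fused = amp * np.exp(1j * pha)
    return np.fft.ifft2(fused).real

# Usage: a reference whose spectrum has unit amplitude and zero phase
# (an impulse at the origin) leaves the target unchanged when gate = 1.
rng = np.random.default_rng(0)
target = rng.standard_normal((8, 8))
impulse = np.zeros((8, 8))
impulse[0, 0] = 1.0
restored = sff_sketch(target, impulse, gate=1.0)
```

In the actual framework the spectra would be those of learned feature maps and the selection would be learned as well; the sketch only shows where phase and amplitude are treated differently.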

Abstract

摘要

Multi-modal MRI offers valuable complementary information for diagnosis and treatment; however, its clinical utility is limited by prolonged scanning time. To accelerate the acquisition process, a practical approach is to reconstruct images of the target modality, which requires longer scanning time, from under-sampled k-space data using the fully-sampled reference modality with shorter scanning time as guidance. The primary challenge of this task lies in comprehensively and efficiently integrating complementary information from different modalities to achieve high-quality reconstruction. Existing methods struggle with this challenge: (1) convolution-based models fail to capture long-range dependencies; (2) transformer-based models, while excelling in global feature modeling, suffer from quadratic computational complexity. To address this dilemma, we propose MMR-Mamba, a novel framework that thoroughly and efficiently integrates multi-modal features for MRI reconstruction, leveraging Mamba’s capability to capture long-range dependencies with linear computational complexity while exploiting global properties of the Fourier domain. Specifically, we first design a Target modality-guided Cross Mamba (TCM) module in the spatial domain, which maximally restores the target modality information by selectively incorporating relevant information from the reference modality. Then, we introduce a Selective Frequency Fusion (SFF) module to efficiently integrate global information in the Fourier domain and recover high-frequency signals for the reconstruction of structural details. Furthermore, we devise an Adaptive Spatial-Frequency Fusion (ASFF) module, which mutually enhances the spatial and frequency domains by supplementing less informative channels from one domain with corresponding channels from the other. Extensive experiments on the BraTS and fastMRI knee datasets demonstrate the superiority of our MMR-Mamba over state-of-the-art reconstruction methods.

多模态磁共振成像（MRI）为诊断和治疗提供了有价值的互补信息，但其临床应用受限于较长的扫描时间。为加速采集过程，一种实用方法是利用扫描时间较短的全采样参考模态作为指导，从欠采样k空间数据中重建目标模态（需更长扫描时间）的图像。该任务的主要挑战在于全面且高效地整合不同模态的互补信息以实现高质量重建。现有方法面临以下难题：（1）基于卷积的模型难以捕捉长距离依赖关系；（2）基于Transformer的模型虽擅长全局特征建模，但存在二次计算复杂度问题。为解决这一困境，我们提出MMR-Mamba框架，该框架充分且高效地整合多模态特征用于MRI重建，既利用Mamba以线性计算复杂度捕捉长距离依赖的能力，又借助傅里叶域的全局特性。具体而言： 1. 目标模态引导的跨模态Mamba模块（TCM）：在空间域设计TCM模块，通过选择性融入参考模态的相关信息，最大限度恢复目标模态信息。该模块利用Mamba的序列建模优势，有效捕捉跨模态的长程空间依赖关系，避免卷积模型的局部性局限。 2. 选择性频率融合模块（SFF）：引入SFF模块在傅里叶域高效整合全局信息，恢复高频信号以重建结构细节。通过直接操作k空间数据，该模块利用低频区域的全局解剖结构先验和高频区域的局部细节特征，提升重建图像的分辨率和真实性。 3. 自适应空间-频率融合模块（ASFF）：设计ASFF模块实现空间域与频率域的交互增强，通过用一个域中信息丰富的通道补充另一个域中信息较弱的通道，实现跨域特征的互补优化，进一步提升重建精度。在BraTS脑肿瘤数据集和fastMRI膝关节数据集上的大量实验表明，MMR-Mamba显著优于现有先进重建方法。
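The ASFF idea in the abstract, where less informative channels in one domain are supplemented by the corresponding channels from the other, can be sketched as a thresholded channel exchange. The sketch assumes that a channel's informativeness is read from a per-channel scaling factor compared against a threshold (the Fig. 9 caption mentions such factors and thresholds), and that "supplementing" means plain replacement. Both are assumptions, and the names `asff_sketch`, `gamma_s`, `gamma_f`, `tau_s` and `tau_f` are hypothetical.

```python
import numpy as np

def asff_sketch(spatial, freq, gamma_s, gamma_f, tau_s, tau_f):
    """Exchange weak channels between two feature tensors of shape (C, H, W).

    gamma_s, gamma_f: per-channel scaling factors of shape (C,) (assumed to
    measure how informative each channel is).
    tau_s, tau_f: scalar thresholds below which a channel counts as weak.
    """
    weak_s = np.abs(gamma_s) < tau_s   # weak spatial-domain channels
    weak_f = np.abs(gamma_f) < tau_f   # weak frequency-domain channels

    # A weak channel is replaced by the same-index channel of the other domain.
    out_s = np.where(weak_s[:, None, None], freq, spatial)
    out_f = np.where(weak_f[:, None, None], spatial, freq)
    return out_s, out_f

# Usage with three channels: channel 1 is weak in the spatial branch,
# channel 0 is weak in the frequency branch.
spatial = np.ones((3, 2, 2))
freq = np.zeros((3, 2, 2))
out_s, out_f = asff_sketch(
    spatial, freq,
    gamma_s=np.array([0.9, 0.01, 0.5]),
    gamma_f=np.array([0.02, 0.8, 0.7]),
    tau_s=0.1, tau_f=0.1,
)
```

A soft variant (adding rather than replacing) would fit the wording equally well; the source does not say which is used.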

Method

方法

3.1. Preliminaries

State Space Models. SSMs are typically defined as linear, time-invariant systems that map an input sequence \( \mathbf{u}(t) \in \mathbb{R}^D \) to an output sequence \( \mathbf{y}(t) \in \mathbb{R}^M \) through a hidden state \( \mathbf{h}(t) \in \mathbb{R}^N \). These systems can be mathematically expressed as the following ordinary differential equation (ODE):

3.1 预备知识：状态空间模型

状态空间模型（SSMs）通常被定义为线性时不变系统，其通过隐藏状态 \( \mathbf{h}(t) \in \mathbb{R}^N \) 将输入序列 \( \mathbf{u}(t) \in \mathbb{R}^D \) 映射为输出序列 \( \mathbf{y}(t) \in \mathbb{R}^M \)。这类系统可通过以下常微分方程（ODE）进行数学描述：
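The equation itself is not reproduced in this digest. In the standard SSM literature a linear time-invariant system of this kind is written \( \mathbf{h}'(t) = \mathbf{A}\mathbf{h}(t) + \mathbf{B}\mathbf{u}(t),\ \mathbf{y}(t) = \mathbf{C}\mathbf{h}(t) \), with \( \mathbf{A} \in \mathbb{R}^{N \times N} \), \( \mathbf{B} \in \mathbb{R}^{N \times D} \), \( \mathbf{C} \in \mathbb{R}^{M \times N} \). That form is assumed here rather than quoted from the paper. The sketch below discretizes it with a zero-order hold and runs the resulting recurrence. It assumes a diagonal \( \mathbf{A} \) and fixed parameters, so it illustrates the time-invariant case only, not Mamba's input-dependent selective scan. The function names are hypothetical.

```python
import numpy as np

def discretize_zoh(a_diag, B, delta):
    """Zero-order-hold discretization for a diagonal A.

    a_diag: (N,) diagonal of A (non-zero entries), B: (N, D), delta: step size.
    Returns A_bar as its diagonal (N,) and B_bar with shape (N, D).
    """
    a_bar = np.exp(delta * a_diag)
    B_bar = ((a_bar - 1.0) / a_diag)[:, None] * B
    return a_bar, B_bar

def ssm_scan(a_diag, B, C, u, delta):
    """Run h_k = A_bar h_{k-1} + B_bar u_k, y_k = C h_k over u of shape (L, D)."""
    a_bar, B_bar = discretize_zoh(a_diag, B, delta)
    h = np.zeros(a_diag.shape[0])
    ys = []
    for u_k in u:                      # cost is linear in the sequence length L
        h = a_bar * h + B_bar @ u_k
        ys.append(C @ h)
    return np.stack(ys)                # shape (L, M)

# Usage: a stable two-state system driven by a constant input settles at
# the steady state -C A^{-1} B u = 1/1 + 1/2 = 1.5.
a_diag = np.array([-1.0, -2.0])
B = np.array([[1.0], [1.0]])
C = np.array([[1.0, 1.0]])
y = ssm_scan(a_diag, B, C, u=np.ones((200, 1)), delta=0.1)
```

The loop touches each sequence element once, which is the linear-complexity property the text contrasts with the quadratic cost of attention.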

Conclusion

结论

This study explores the comprehensive and efficient integration of complementary information across modalities for multi-modal MRI reconstruction. We present the MMR-Mamba framework, which effectively integrates information through the TCM module in the spatial domain and the SFF module in the frequency domain, along with integrating the spatial-frequency features through the ASFF module. In particular, the TCM module employs cross Mamba blocks to selectively supplement complementary information from the reference modality to the target modality, while the SFF module integrates global information in the Fourier domain and restores high-frequency signals for reconstructing structural details. We conducted extensive experiments on the BraTS and fastMRI knee datasets, demonstrating the superiority of our proposed framework in reconstructing MRI under different acceleration factors.

本研究探索了多模态MRI重建中跨模态互补信息的全面高效整合方法，提出了MMR-Mamba框架。该框架通过空间域的TCM模块和频率域的SFF模块实现信息有效融合，并通过ASFF模块完成空间-频率特征的交互集成。具体而言，TCM模块利用跨模态Mamba块有选择地将参考模态的互补信息补充到目标模态，而SFF模块则在傅里叶域整合全局信息并恢复高频信号以重建结构细节。我们在BraTS和fastMRI膝关节数据集上进行了大量实验，结果表明所提框架在不同加速因子下的MRI重建中均表现出显著优势。
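For readers unfamiliar with what an acceleration factor means in this setting, the sketch below simulates retrospective under-sampling: a fraction of k-space columns is kept and the image is reconstructed by zero-filling. The excerpt does not state which sampling pattern the authors use, so the 1D Cartesian mask with a fully sampled centre band, the `center_fraction` value, and the function names are all assumptions for illustration.

```python
import numpy as np

def cartesian_mask(width, acceleration=4, center_fraction=0.08, seed=0):
    """Boolean column mask keeping width // acceleration columns in total."""
    rng = np.random.default_rng(seed)
    num_center = int(round(width * center_fraction))
    mask = np.zeros(width, dtype=bool)
    start = (width - num_center) // 2
    mask[start:start + num_center] = True          # low frequencies, always kept

    # Fill the remaining budget with randomly chosen outer columns.
    extra = max(width // acceleration - num_center, 0)
    remaining = np.flatnonzero(~mask)
    mask[rng.choice(remaining, size=extra, replace=False)] = True
    return mask

def zero_filled(image, mask):
    """Under-sample the centred k-space of a 2D image and invert it."""
    kspace = np.fft.fftshift(np.fft.fft2(np.fft.ifftshift(image)))
    kspace = kspace * mask[None, :]                # drop unsampled columns
    return np.abs(np.fft.fftshift(np.fft.ifft2(np.fft.ifftshift(kspace))))

# Usage on a 240 x 240 image (the BraTS slice size quoted in this article).
image = np.random.default_rng(1).random((240, 240))
mask4 = cartesian_mask(240, acceleration=4)
mask8 = cartesian_mask(240, acceleration=8)
aliased = zero_filled(image, mask4)
```

The zero-filled image is the degraded input that a reconstruction network, guided by the fully sampled reference modality, is trained to improve.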

Results

结果

4.1. Dataset description

In this study, we adopt two datasets with different anatomical structures and protocols for evaluation: the BraTS dataset (Menze et al., 2014) and the fastMRI knee dataset (Knoll et al., 2020).

The BraTS dataset contains both T1WI and T2WI scans of the brain. We randomly select 100 volumes from the BraTS dataset. From these, we extract 2D slices in which the object area occupies more than 25% of the image, yielding a total of 3621 images for training and 1088 images for testing. The 2D image size is 240 × 240. In our experiments, we use T1WI as the reference modality for the reconstruction of the T2WI modality.

The fastMRI dataset is the largest public MRI dataset with raw k-space data. Following Xuan et al. (2020), we select 227 and 45 pairs of single-coil PDWI and FS-PDWI knee volumes for training and testing, respectively, resulting in a total of 8332 pairs of 2D images for training and 1665 images for testing. The 2D image size is 320 × 320. In our experiments, we use PDWI as the reference modality for the reconstruction of the FS-PDWI modality.

4.1 数据集描述本研究采用两种具有不同解剖结构和协议的数据集进行评估：BraTS数据集（Menze等，2014）和fastMRI膝关节数据集（Knoll等，2020）。 1. BraTS数据集：包含脑部的T1加权成像（T1WI）和T2加权成像（T2WI）扫描数据。我们从BraTS数据集中随机选取100例容积数据，从中提取目标区域面积占图像25%以上的2D切片，最终得到3621幅图像用于训练，1088幅图像用于测试。2D图像尺寸为240×240。在实验中，我们使用T1WI作为参考模态来重建T2WI模态。 2. fastMRI数据集：这是目前最大的包含原始k空间数据的公开MRI数据集。参照Xuan等（2020）的方法，我们选取227对单线圈质子密度加权成像（PDWI）和脂肪抑制质子密度加权成像（FS-PDWI）膝关节容积数据用于训练，45对用于测试，对应生成8332对2D图像用于训练，1665幅图像用于测试。2D图像尺寸为320×320。实验中，我们使用PDWI作为参考模态来重建FS-PDWI模态。
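The slice-selection rule for BraTS (keep 2D slices whose object area exceeds 25% of the image) can be sketched as follows. How "object area" is measured is not stated in the excerpt; the sketch assumes it means pixels above a background value of zero, and the function name `select_slices` is hypothetical.

```python
import numpy as np

def select_slices(volume, min_fraction=0.25, background=0.0):
    """Keep slices of a (num_slices, H, W) volume with enough foreground.

    A pixel counts as foreground if it is strictly above `background`
    (an assumption about how the object area is defined).
    """
    foreground = volume > background
    fraction = foreground.reshape(volume.shape[0], -1).mean(axis=1)
    return volume[fraction > min_fraction]

# Usage on a toy volume with three 4 x 4 slices:
# slice 0 is empty, slice 1 has 5/16 foreground, slice 2 has exactly 4/16.
volume = np.zeros((3, 4, 4))
volume[1, 0, :] = 1.0
volume[1, 1, 0] = 1.0
volume[2, 0, :] = 1.0
kept = select_slices(volume)
```

Only slice 1 passes, because the rule is "more than 25%" and slice 2 sits exactly on the boundary.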

Figure

图

Fig. 1. Performance and parameter count comparison of our model with existing methods for (a) BraTS 4× and (b) fastMRI 4× acceleration. The size of each circle represents the number of model parameters, with larger circles indicating more parameters. In both reconstruction scenarios, our model achieves the best PSNR and SSIM values with significantly fewer parameters compared to other methods, demonstrating its efficiency and effectiveness.

图1 本文模型与现有方法在（a）BraTS 4倍加速和（b）fastMRI 4倍加速场景下的性能与参数数量对比。每个圆圈的大小表示模型参数数量，圆圈越大代表参数越多。在两种重建场景中，我们的模型均以显著更少的参数实现了最佳PSNR和SSIM值，证明了其高效性与有效性。

Fig. 2. Overview of the proposed MMR-Mamba framework (left). It contains Mamba blocks for feature extraction, TCM for spatial domain fusion, SFF for frequency domain fusion, and ASFF for spatial-frequency information integration. Structure of Mamba block and TCM (right).

图2 （左）所提出的MMR-Mamba框架概述。该框架包含用于特征提取的Mamba模块、空间域融合的TCM模块、频率域融合的SFF模块，以及空间-频率信息集成的ASFF模块。（右）Mamba模块和TCM模块的结构细节。

Fig. 3. Illustration of the Selective Frequency Fusion (SFF) module.

图3 选择性频率融合（SFF）模块示意图

Fig. 4. Illustration of the Adaptive Spatial-Frequency Fusion module.

图4 自适应空间-频率融合（ASFF）模块示意图

Fig. 5. Qualitative evaluation of reconstruction results from different methods on the BraTS dataset under 4× and 8× acceleration. For each group, the first row shows the reconstructed images and the second row displays the error map between the results and the ground truth.

图5 BraTS数据集上不同方法在4倍和8倍加速下的重建结果定性评估。每组中，第一行为重建图像，第二行为结果与真实值之间的误差图。

Fig. 6. Qualitative evaluation of reconstruction results from different methods on fastMRI knee dataset under 4× and 8× acceleration. For each group, the first row shows the reconstructed images and the second row displays the error map between the results and the ground truth.

图6 fastMRI膝关节数据集上不同方法在4倍和8倍加速下的重建结果定性评估。每组中，第一行为重建图像，第二行为结果与真实值之间的误差图。

Fig. 7. Visualization of the results from ablation study of our proposed modules on BraTS dataset under 8× acceleration.

图7 BraTS数据集上8倍加速下所提出模块消融实验结果的可视化。

Fig. 8. Performance comparison of different settings of the hyperparameter λ in the ASFF module for (a) BraTS at 4× acceleration and (b) BraTS at 8× acceleration. The dashed lines represent the performance when the ASFF module is removed.

图8 ASFF模块中不同超参数λ设置的性能对比：（a）BraTS数据集4倍加速；（b）BraTS数据集8倍加速。虚线表示移除ASFF模块时的性能表现。

Fig. 9. Violin plots of the scaling factors \( \gamma_{s,f} \) and \( \gamma_{f,s} \) across different training epochs, accompanied by their respective threshold values \( \tau_s \) and \( \tau_f \). The plots illustrate the evolution of channel selection dynamics, highlighting how changes predominantly occur in channels with smaller scaling factors.

图9 不同训练 epoch 下缩放因子 \( \gamma_{s,f} \) 和 \( \gamma_{f,s} \) 及其对应阈值 \( \tau_s \) 和 \( \tau_f \) 的小提琴图。图中展示了通道选择动态的演化过程，突出显示变化主要发生在缩放因子较小的通道中。

Fig. 10. Comparison of Effective Receptive Fields (ERFs) for MINet, MCCA, Pan-Mamba, and MMR-Mamba. The colorbar, scaled between 0 and 1, represents the normalized intensity of the ERFs. A value close to 1 (dark green) indicates regions with the highest influence on the feature response, while values near 0 (yellow) represent areas with minimal or no influence. MMR-Mamba achieves the largest receptive field, extending to the full image size, demonstrating its ability to capture global context effectively compared to other methods.

图10 MINet、MCCA、Pan-Mamba与MMR-Mamba的有效感受野（ERF）对比。色条范围为0到1，表示归一化的ERF强度。接近1的值（深绿色）表示对特征响应影响最大的区域，接近0的值（黄色）表示影响最小或无影响的区域。MMR-Mamba实现了最大的感受野，覆盖全图尺寸，表明其相比其他方法更有效地捕捉全局上下文信息的能力。

Table

表

Table 1 Quantitative results on the BraTS and fastMRI datasets with different acceleration factors. We report mean ± std for the PSNR, SSIM, and NMSE metrics, along with network parameters. The best results are highlighted in red (see Bernstein et al., 2001; Xiang et al., 2018; Feng et al., 2021; Liang et al., 2021; Feng et al., 2022; Li et al., 2023; Huang et al., 2023; He et al., 2024; Guo et al., 2024b; Sun et al., 2025).

表1 不同加速因子下BraTS和fastMRI数据集的定量结果。我们报告了PSNR、SSIM和NMSE指标的平均值±标准差，以及网络参数数量。最佳结果以红色突出显示（参考文献：Bernstein等，2001；Xiang等，2018；Feng等，2021；Liang等，2021；Feng等，2022；Li等，2023；Huang等，2023；He等，2024；Guo等，2024b；Sun等，2025）。
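Two of the metrics reported in the tables, PSNR and NMSE, have short closed forms, sketched below together with the mean ± std aggregation. The exact conventions used in the paper (for example the PSNR data range, or whether metrics are computed per slice or per volume) are not given in this excerpt, so the choices here are assumptions. SSIM is omitted because it needs a windowed implementation from an image-processing library.

```python
import numpy as np

def nmse(pred, gt):
    """Normalized mean squared error: ||pred - gt||^2 / ||gt||^2."""
    return float(np.sum((pred - gt) ** 2) / np.sum(gt ** 2))

def psnr(pred, gt, data_range=None):
    """Peak signal-to-noise ratio in dB.

    data_range defaults to the maximum of the ground truth (an assumption).
    """
    if data_range is None:
        data_range = float(gt.max())
    mse = float(np.mean((pred - gt) ** 2))
    return float(10.0 * np.log10(data_range ** 2 / mse))

def mean_std(values):
    """Aggregate per-image scores as (mean, population std)."""
    values = np.asarray(values, dtype=float)
    return float(values.mean()), float(values.std())

# Usage: a uniform error of 0.1 on an image with peak value 1.
gt = np.array([[0.0, 1.0], [1.0, 0.0]])
pred = gt + 0.1
score_psnr = psnr(pred, gt)     # about 20 dB, since MSE = 0.01 and peak = 1
score_nmse = nmse(pred, gt)     # about 0.02, since 4 * 0.01 / 2
```

Higher PSNR and lower NMSE indicate a reconstruction closer to the ground truth.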

Table 2 Ablation study of the proposed modules on BraTS dataset. We report mean ± std for the PSNR, SSIM, and NMSE metrics.

表2 BraTS数据集上所提出模块的消融实验结果。我们报告了PSNR、SSIM和NMSE指标的平均值±标准差。

Table 3 Ablation study on spatial domain fusion. We report mean ± std for the PSNR, SSIM, and NMSE metrics, along with model parameters, on the BraTS dataset under 4× and 8× acceleration.

表3 空间域融合的消融实验。我们报告了BraTS数据集在4倍和8倍加速下的PSNR、SSIM和NMSE指标的平均值±标准差，以及模型参数数量。

Table 4 Ablation study on frequency domain fusion. We report mean ± std for the PSNR, SSIM, and NMSE metrics on the BraTS dataset under 4× and 8× acceleration.

表4 频域融合的消融实验。我们报告了BraTS数据集在4倍和8倍加速下的PSNR、SSIM和NMSE指标的平均值±标准差。