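Brain Latent Progression: Individual-Based Spatiotemporal Disease Progression on 3D Brain MRIs via Latent Diffusion | Literature Express: Deep Learning and AI for Medical Imaging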
Title
Brain Latent Progression: Individual-based spatiotemporal disease progression on 3D Brain MRIs via latent diffusion
01
Introduction
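Neurodegenerative diseases are among the most pressing challenges in modern healthcare, causing an irreversible decline in brain function and quality of life. With no effective cure available to date, patients and caregivers face prolonged suffering, while healthcare systems struggle with escalating costs and resource demands. Addressing this crisis requires a paradigm shift toward proactive strategies centered on early intervention, precision medicine, and comprehensive care. These diseases are highly complex, exhibiting wide neuropathological variation associated with distinct molecular subtypes (Tijms et al., 2024). Moreover, they manifest unevenly across brain regions and progress at different rates through multiple mechanisms, reflecting the complexity of their pathophysiology (Young et al., 2018). Tackling this problem requires advanced tools that deepen our understanding of disease mechanisms and ultimately drive the development of tailored, more effective treatment strategies.
Early approaches to disease progression modeling focused on capturing the dynamics of scalar biomarkers (Young et al., 2024; Oxtoby and Alexander, 2017). For example, Firth et al. (2019) modeled volumetric changes in specific brain regions to study the characteristic atrophy patterns observed in patients with posterior cortical atrophy. Although scalar biomarkers provide a simplified representation, they have greatly advanced our understanding of various neurodegenerative diseases, such as Alzheimer's disease (AD) (Vogel et al., 2021) and multiple sclerosis (Eshaghi et al., 2021). However, a notable limitation of these approaches is that they cannot capture spatiotemporal features that may more accurately reflect the underlying pathophysiology. For instance, patients with frontotemporal dementia show morphological changes in the thalamus before any detectable volume loss (Cury et al., 2019).
Growing recognition of spatiotemporal patterns has driven the evolution of traditional disease progression models toward spatiotemporal approaches. Spatiotemporal models (Young et al., 2024) typically use high-dimensional data, such as 3D shapes or full medical scans, to characterize disease dynamics in a more detailed and comprehensive way, making it possible to visualize and precisely localize complex structural changes over time.
Specifically, this paper focuses on spatiotemporal models applied to 3D T1-weighted (T1w) brain MRIs, with the aim of estimating, at the individual level, the structural changes the brain undergoes in pathological conditions (such as neurodegeneration) and non-pathological conditions (i.e., aging). We identify and focus on four main challenges associated with this task:
1. Individualization: disease progression is influenced by various individual factors, including demographic and clinical variables. To improve prediction accuracy, models must integrate and exploit subject-specific metadata.
2. Longitudinal data exploitation: longitudinal data provide valuable insight into individual disease trajectories, such as each patient's rate of progression. When longitudinal data are available, models should incorporate them into the inference process.
3. Spatiotemporal consistency: predictions of disease progression at multiple timepoints should evolve smoothly and consistently, in line with the underlying biological processes.
4. Memory demand: processing 3D medical images requires substantial memory resources, which can limit the applicability of models in resource-constrained settings (Blumberg et al., 2018). Enabling such models to run on consumer-grade hardware would support broader adoption.
To address these challenges, we introduce Brain Latent Progression (BrLP), a novel individual-based spatiotemporal model that predicts disease progression in 3D brain MRIs at the individual level. BrLP makes several key contributions toward these challenges. First, we propose combining a latent diffusion model (LDM) (Rombach et al., 2022) with a ControlNet (Zhang et al., 2023) to generate individualized brain MRIs conditioned on the available subject data, addressing challenge 1. Second, we integrate prior knowledge of disease progression through an auxiliary model that infers volumetric changes in different brain regions and can exploit longitudinal data when available, addressing challenge 2. Third, we introduce the Latent Average Stabilization (LAS) technique to improve the spatiotemporal consistency of the predicted progression, addressing challenge 3. Fourth, we use latent representations of brain MRIs to reduce the memory required to process 3D scans, addressing challenge 4. Finally, we show how LAS can be used to derive measures of predictive uncertainty at the global and voxel level, which can serve as reliability indicators in clinical applications.
We train BrLP to learn the structural changes occurring in the brains of subjects with different cognitive statuses: Cognitively Normal (CN), Mild Cognitive Impairment (MCI), and Alzheimer's disease (AD). To this end, we use a large dataset of 11,730 T1w MRIs from 2,805 subjects, drawn from three publicly available longitudinal studies on AD. In addition, we use an external longitudinal dataset of 2,257 T1w MRIs from 962 subjects to assess how well our method generalizes to out-of-sample data. To the best of our knowledge, we are the first to propose a 3D conditional generative model for brain MRI that incorporates prior knowledge of disease progression into the image generation process.
This work extends our article presented at MICCAI 2024 (Puglisi et al., 2024) in several ways: (1) it enriches the ablation study by analyzing the hyperparameters of the LAS algorithm; (2) it tests BrLP on an external dataset to evaluate its generalizability; (3) it evaluates the impact of cognitive status as a conditioning variable; (4) it introduces a mechanism within the BrLP framework to quantify predictive uncertainty at the global and voxel level, with statistical analysis supporting our findings; and (5) it presents an example of a potential clinical application of BrLP to patient selection in clinical trials.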
Abstract
The growing availability of longitudinal Magnetic Resonance Imaging (MRI) datasets has facilitated Artificial Intelligence (AI)-driven modeling of disease progression, making it possible to predict future medical scans for individual patients. However, despite significant advancements in AI, current methods continue to face challenges including achieving patient-specific individualization, ensuring spatiotemporal consistency, efficiently utilizing longitudinal data, and managing the substantial memory demands of 3D scans. To address these challenges, we propose Brain Latent Progression (BrLP), a novel spatiotemporal model designed to predict individual-level disease progression in 3D brain MRIs. The key contributions in BrLP are fourfold: (i) it operates in a small latent space, mitigating the computational challenges posed by high-dimensional imaging data; (ii) it explicitly integrates subject metadata to enhance the individualization of predictions; (iii) it incorporates prior knowledge of disease dynamics through an auxiliary model, facilitating the integration of longitudinal data; and (iv) it introduces the Latent Average Stabilization (LAS) algorithm, which (a) enforces spatiotemporal consistency in the predicted progression at inference time and (b) allows us to derive a measure of the uncertainty for the prediction at the global and voxel level. We train and evaluate BrLP on 11,730 T1-weighted (T1w) brain MRIs from 2,805 subjects and validate its generalizability on an external test set comprising 2,257 MRIs from 962 subjects. Our experiments compare BrLP-generated MRI scans with real follow-up MRIs, demonstrating state-of-the-art accuracy compared to existing methods.
Method
We now introduce the architecture of BrLP, comprising four key components: an LDM, a ControlNet, an auxiliary model, and a LAS block, each described in successive paragraphs. These four components, summarized in Fig. 1, collectively address the challenges outlined in the introduction. In particular, the LDM is designed to generate random 3D brain MRIs that conform to specific covariates, while ControlNet aims to specialize these MRI scans to specific anatomical structures of a subject. Additionally, the auxiliary model leverages prior knowledge of disease progression to improve the precision in predicting the volumetric changes of specific brain regions. Finally, the LAS block is used during inference to improve spatiotemporal consistency, as well as to derive a measure of uncertainty for the predictions at both the global and voxel level.
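The averaging idea behind the LAS block can be illustrated with a minimal Python sketch. It assumes two hypothetical callables that are not part of the source: `sample_latent`, standing in for one stochastic run of the conditioned LDM and ControlNet, and `decode`, standing in for the autoencoder's decoder. The uncertainty measure shown here (per-voxel standard deviation across decoded repeats, and its mean as a global score) is also an assumption, since this excerpt does not give the exact formula.

```python
import numpy as np

def las_predict(sample_latent, decode, m=4, seed=0):
    """Average m stochastic latent samples and decode the average once.

    sample_latent(rng) -> latent array (hypothetical generator stand-in)
    decode(latent)     -> image array  (hypothetical decoder stand-in)
    """
    rng = np.random.default_rng(seed)
    # Repeat the stochastic generation m times: shape (m, *latent_shape).
    latents = np.stack([sample_latent(rng) for _ in range(m)])
    # Averaging in latent space damps run-to-run variability.
    prediction = decode(latents.mean(axis=0))
    # Assumed uncertainty measure: spread of the individually decoded samples.
    decoded = np.stack([decode(z) for z in latents])
    voxel_uncertainty = decoded.std(axis=0)
    global_uncertainty = float(voxel_uncertainty.mean())
    return prediction, voxel_uncertainty, global_uncertainty

# Toy stand-ins so the sketch runs without any trained model.
def toy_sampler(rng):
    return np.ones((2, 4, 4, 4)) + 0.1 * rng.standard_normal((2, 4, 4, 4))

def toy_decoder(latent):
    return latent.sum(axis=0)  # collapses the channel axis into a 3D volume

image, voxel_unc, global_unc = las_predict(toy_sampler, toy_decoder, m=16)
```

Increasing `m` trades computation time for a more stable average, which matches the trend the ablation on 𝑚 is reported to examine (Fig. 3).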
Conclusion
This work introduces BrLP, a 3D spatiotemporal model that accurately captures neurodegenerative disease progression patterns by predicting individual 3D brain MRI evolution. While we focused on brain MRI applications, BrLP’s potential extends to other imaging modalities and progressive diseases. Moreover, the model can potentially incorporate additional covariates, such as genetic data, for enhanced individualization. Our experiments demonstrate how BrLP can be used for patient selection in clinical trials to reduce the risk of Type II errors. We believe that its application also extends to post-trial analysis, where, by generating digital twins of participants, BrLP could simulate untreated disease trajectories, enabling individualized treatment effect assessment. This approach could reduce the reliance on control groups and mitigate ethical concerns related to withholding potential therapeutic benefits.
Results
In this section, we first describe the datasets and evaluation metrics used in our study. We then present an extensive evaluation of BrLP through five distinct experiments: an ablation study examining BrLP’s components and hyperparameters, a comparative analysis against established baseline methods, an investigation of the impact of cognitive status conditioning, an assessment of our proposed uncertainty metrics at the global and voxel level, and an exploration of BrLP’s potential to reduce Type II errors in clinical trials.
Figure
Fig. 1. Overview of the BrLP training and inference process. The training process outputs an autoencoder (A) that maps 3D brain MRIs into small latent representations; an LDM (B) able to generate latent representations according to subject-specific and progression-related covariates; and a ControlNet (C) able to constrain the LDM’s generation process to a subject’s brain. During inference (E), progression-related variables at the target age are first predicted by an auxiliary model (D). These predictions, combined with subject-specific variables and the baseline MRI, condition the generation of the latent representations corresponding to the predicted brain at the target age. Finally, the LAS algorithm (F) repeats this process 𝑚 times and averages the obtained latent representations before decoding the result into the 3D MRI space.
Fig. 2. Demographic and diagnostic statistics of the internal and external datasets. Distributions include (A) age at baseline, (B) average time interval between the initial and follow-up visits, (C) sex distribution, and (D) diagnosis (CN, MCI, AD) at final visit.
Fig. 3. Effect of varying the LAS parameter 𝑚 on different performance metrics and computation time. The plots show the trends for SSIM, MSE, and MAE for different brain regions (hippocampus, amygdala, lateral ventricle, thalamus, and CSF), and for computation time, as 𝑚 increases from 1 to 64. Error bars indicate the 95% confidence intervals of the metric. Most metrics show improvements (higher SSIM, lower MSE and MAE) with increasing 𝑚.
Fig. 4. A comparison between the real progression of a 70 y.o. subject with MCI (from the internal test set) over 15 years and the predictions obtained by BrLP and the baseline methods. Each method shows a predicted MRI (left) and its deviation from the subject’s real brain MRI (right).
Fig. 5. (A) Difference in uncertainty (𝑦-axis) as a function of prediction distance (𝑥-axis) in years (divided by 100). (B) MSE (𝑦-axis) as a function of uncertainty (𝑥-axis). (C) SSIM (𝑦-axis) as a function of uncertainty (𝑥-axis). In all plots, colored lines represent trends for individual subjects, and the black line shows the overall fixed effect from a linear mixed-effects model.
Fig. 6. Voxel-level uncertainty evaluated for predictions at different timesteps for a single subject. The first and second rows show the ground truth and predicted MRIs, respectively, at each timestep. The third row presents the uncertainty maps, with lighter colors indicating higher uncertainty. The fourth row displays the voxel-level squared error between the ground truth and predicted MRIs.
Fig. 7. Comparison of patient selection methods for identifying fast progressors in clinical trials. The plot shows the efficacy (𝑦-axis) of three selection methods (Random, BrLP, and Regression) across various sample sizes (𝑥-axis) in both internal and external test sets. Efficacy is measured as the proportion of fast progressors (based on hippocampal atrophy) correctly identified by each method compared to the optimal selection.
Fig. 8. Effect of the number of DDIM inference steps on BrLP performance. SSIM (left axis, blue) and MSE (right axis, red) are reported for different numbers of denoising steps. Shaded areas indicate 95% confidence intervals.
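The efficacy measure described in the Fig. 7 caption can be sketched in Python as follows. This is only one reading of the caption, under stated assumptions: the `k` subjects with the largest true hippocampal atrophy are treated as the fast progressors (the optimal selection), and each method selects the `k` subjects it scores highest. The function name, the scoring convention, and the toy numbers are all hypothetical.

```python
import numpy as np

def selection_efficacy(scores, true_atrophy, k):
    """Fraction of the k true fast progressors recovered by a selection method.

    scores       : per-subject ranking score from a method (higher = faster)
    true_atrophy : observed hippocampal atrophy per subject (higher = faster)
    k            : trial sample size, i.e. number of subjects selected
    """
    scores = np.asarray(scores, dtype=float)
    true_atrophy = np.asarray(true_atrophy, dtype=float)
    # Indices of the k subjects picked by the method.
    picked = set(np.argsort(-scores)[:k].tolist())
    # Assumed optimal selection: the k subjects with the largest true atrophy.
    optimal = set(np.argsort(-true_atrophy)[:k].tolist())
    return len(picked & optimal) / k

# Made-up illustrative values for five subjects.
true_atrophy = [0.1, 0.5, 0.3, 0.9, 0.2]
predicted = [0.2, 0.4, 0.6, 0.8, 0.1]

efficacy = selection_efficacy(predicted, true_atrophy, k=2)
perfect = selection_efficacy(true_atrophy, true_atrophy, k=2)
```

Under this reading, a method that ranks subjects exactly as the ground truth does scores 1.0 at every sample size.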
Table
Table 1. Results from the ablation study. MAE (±SD) in predicted volumes is expressed as a percentage of total brain volume.
Table 2. Results from the comparison with baseline methods on the internal test set. MAE (±SD) in predicted volumes is expressed as a percentage of total brain volume.
Table 3. Results from the comparison with baseline methods on the external test set. MAE (±SD) in predicted volumes is expressed as a percentage of total brain volume.
Table 4. Evaluating the impact of incorrect conditioning on cognitive status in BrLP predictions. MAE (±SD) in predicted volumes is expressed as a percentage of total brain volume.
Table 5. Evaluation of BrLP performance differences between male and female subjects. MSE and regional MAE values (±SD) are reported, with the best result for each metric between the two groups highlighted in bold.
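The volumetric metric used throughout the table captions, MAE (±SD) expressed as a percentage of total brain volume, can be illustrated with a short Python sketch. It assumes each regional volume is normalized by the same subject's total brain volume before the absolute error is taken. The function name and the example volumes are hypothetical and are not values from the paper.

```python
import numpy as np

def regional_mae_percent(pred_volume, true_volume, brain_volume):
    """MAE and SD of a regional volume error, in percent of total brain volume.

    All inputs are per-subject arrays in the same unit (e.g. mm^3).
    """
    pred = np.asarray(pred_volume, dtype=float)
    true = np.asarray(true_volume, dtype=float)
    brain = np.asarray(brain_volume, dtype=float)
    # Assumption: normalize each subject by their own total brain volume.
    pred_pct = 100.0 * pred / brain
    true_pct = 100.0 * true / brain
    abs_err = np.abs(pred_pct - true_pct)
    return float(abs_err.mean()), float(abs_err.std())

# Made-up hippocampal volumes (mm^3) for two subjects.
mae, sd = regional_mae_percent(
    pred_volume=[3000.0, 3300.0],
    true_volume=[3100.0, 3100.0],
    brain_volume=[1_000_000.0, 1_100_000.0],
)
```

Normalizing by brain volume makes errors comparable across subjects with different head sizes, which is presumably why the tables report the metric this way.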