
[MIA 2025]CLIP in medical imaging: A survey

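Paper link: CLIP in medical imaging: A survey - ScienceDirect

Project page: github.com

The English below was typed entirely by hand! It summarizes and paraphrases the original paper. Some spelling and grammar mistakes may be unavoidable; if you spot any, corrections in the comments are welcome! This post leans toward personal notes, so read with caution.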

Contents

1. Takeaways

2. Section-by-section close reading

2.1. Abstract

2.2. Introduction

2.3. Background

2.3.1. Contrastive language-image pre-training

2.3.2. Variants of CLIP

2.3.3. Medical image–text dataset

2.4. CLIP in medical image–text pre-training

2.4.1. Challenges of CLIP pre-training

2.4.2. Multi-scale contrast

2.4.3. Data-efficient contrast

2.4.4. Explicit knowledge enhancement

2.4.5. Others

2.4.6. Summary

2.5. CLIP-driven applications

2.5.1. Classification

2.5.2. Dense prediction

2.5.3. Cross-modal tasks

2.5.4. Summary

2.6. Comparative analysis

2.7. Discussions and future directions

2.8. Conclusion

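1. Takeaways

(1) I will probably only record what sets this paper apart. The basics of CLIP and of medical imaging are not recorded here; see the original paper. It is simply too long to copy in full.

(2) Why does the figure style differ throughout the paper? Did each author draw one figure and stitch them together?

(3) It leans toward a catalogue: it introduces a great many different models.

2. Section-by-section close reading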

2.1. Abstract

        ①It simply says that CLIP is meaningful for medical imaging and that the paper sets out to explore it

2.2. Introduction

        ①Limitations: poor out-of-distribution performance

        ②The trend of CLIP-related papers (left) and the medical images covered in those papers (right):

        ③How CLIP is used:

2.3. Background

2.3.1. Contrastive language-image pre-training

        ①How CLIP works (if you have not read it, go find the original CLIP paper; it is very clear and easy to follow):

        ②Performance of CLIP in the medical field:
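For ① above, a minimal sketch of the symmetric contrastive objective may help. It uses plain Python lists as stand-in embeddings; the function names and toy vectors are my own, and the temperature is fixed at 0.07 here for simplicity, whereas CLIP learns it during training.

```python
import math

def l2_normalize(v):
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def log_softmax(row):
    m = max(row)
    lse = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - lse for x in row]

def clip_loss(image_embs, text_embs, temperature=0.07):
    """Symmetric InfoNCE loss over a batch of paired image/text embeddings."""
    imgs = [l2_normalize(v) for v in image_embs]
    txts = [l2_normalize(v) for v in text_embs]
    n = len(imgs)
    # logits[i][j] = cosine similarity of image i and text j, divided by temperature
    logits = [[sum(a * b for a, b in zip(imgs[i], txts[j])) / temperature
               for j in range(n)] for i in range(n)]
    # image -> text direction: the matching text for image i sits in column i
    i2t = -sum(log_softmax(logits[i])[i] for i in range(n)) / n
    # text -> image direction: same computation on the transposed matrix
    logits_t = [list(col) for col in zip(*logits)]
    t2i = -sum(log_softmax(logits_t[j])[j] for j in range(n)) / n
    return (i2t + t2i) / 2

# Toy batch (assumed values): pairs that line up vs. pairs that are swapped
aligned = clip_loss([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
swapped = clip_loss([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
```

The loss is close to zero when each image is most similar to its own text and grows when the pairs are mismatched, which is the behaviour the pre-training exploits.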

2.3.2. Variants of CLIP

        ①Introduces some variants, but with no figures it is hard to remember them or to see at a glance how they differ

2.3.3. Medical image–text dataset

        ①Open medical dataset:

2.4. CLIP in medical image–text pre-training

        ①Representative CLIP-based medical models:

2.4.1. Challenges of CLIP pre-training

        ①Challenges of CLIP in the medical imaging field:

Modality-dependent, with both local and global image/text analysis needed

Scarce data (wasn't zero-shot generalization supposed to be good already? Why is data scarcity an issue again?)

Needs professional knowledge

2.4.2. Multi-scale contrast

        ①GLoRIA matches text with image sub-regions:

        ②LoVT further assigns different weights to different sentences
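The local matching idea in ① and the weighting in ② can be sketched roughly as follows. This is a simplified illustration and not the actual GLoRIA or LoVT implementation: each text token attends over image regions, is compared with its attention-weighted region context, and the per-token scores are combined with optional weights. All names, the temperature, and the simple weighted-sum aggregation are my assumptions.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cosine(a, b):
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def local_similarity(token_embs, region_embs, token_weights=None, temperature=0.1):
    """Score one report against one image via token-to-region attention."""
    if token_weights is None:
        # uniform weights; pass weights summing to 1 to emphasise some tokens/sentences
        token_weights = [1.0 / len(token_embs)] * len(token_embs)
    dim = len(region_embs[0])
    score = 0.0
    for token, weight in zip(token_embs, token_weights):
        # attention of this token over all image regions
        attn = softmax([dot(token, r) / temperature for r in region_embs])
        # attention-weighted region context for this token
        context = [sum(a * r[d] for a, r in zip(attn, region_embs)) for d in range(dim)]
        score += weight * cosine(token, context)
    return score

# Toy example (assumed values): two image regions, one text token each
regions = [[1.0, 0.0], [0.0, 1.0]]
matched = local_similarity([[1.0, 0.0]], regions)      # token agrees with region 0
mismatched = local_similarity([[-1.0, 0.0]], regions)  # token agrees with no region
```

A token that corresponds to some region gets a score near 1, while a token with no matching region scores near 0, so the report-level score reflects fine-grained agreement rather than only the global embedding.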

2.4.3. Data-efficient contrast

        ①Blindly pushing all negative pairs apart might weaken the relatedness of similar diseases:

        ②Add descriptions or shuffle sentences

        ③Use medical imaging videos
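One generic way to address the issue in ① is to replace the one-hot targets with soft targets derived from how similar two samples are (for example, whether they share a disease label). The sketch below is a generic soft-target cross-entropy, not the formulation of any specific paper in the survey; the similarity matrix, temperatures, and names are assumptions for illustration.

```python
import math

def softmax(xs, temperature=1.0):
    scaled = [x / temperature for x in xs]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def soft_target_loss(logits, label_sim, target_temperature=0.5):
    """Cross-entropy between predicted image-to-text distributions and soft targets."""
    n = len(logits)
    loss = 0.0
    for i in range(n):
        # soft targets: samples with similar labels share probability mass
        targets = softmax(label_sim[i], target_temperature)
        probs = softmax(logits[i])
        loss -= sum(t * math.log(p) for t, p in zip(targets, probs))
    return loss / n

# Toy batch (assumed): samples 0 and 1 share a disease, sample 2 does not
label_sim = [[1.0, 1.0, 0.0],
             [1.0, 1.0, 0.0],
             [0.0, 0.0, 1.0]]
# Model A keeps the two similar samples close; model B pushes them apart
logits_a = [[5.0, 5.0, 0.0], [5.0, 5.0, 0.0], [0.0, 0.0, 5.0]]
logits_b = [[5.0, 0.0, 0.0], [0.0, 5.0, 0.0], [0.0, 0.0, 5.0]]
loss_a = soft_target_loss(logits_a, label_sim)
loss_b = soft_target_loss(logits_b, label_sim)
```

Under soft targets, model A is rewarded for keeping same-disease samples close, whereas a plain one-hot contrastive loss would prefer model B.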

2.4.4. Explicit knowledge enhancement

        ①Combined with a graph or knowledge graph (KG):

2.4.5. Others

        ~

2.4.6. Summary

        ~

2.5. CLIP-driven applications

2.5.1. Classification

        ①CLIP-based models for image classification:

(1)Zero-shot classification

        ①Diagnosis example (wow, you can do it like this: treat it as binary classification):

        ②How Xplainer works (wow, impressive; so this is how people use CLIP nowadays):
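The binary setup in ① can be sketched as follows: embed one positive prompt and one negative prompt for a finding, compare both with the image embedding, and take a softmax over the two similarities. In practice the embeddings would come from CLIP's image and text encoders; the toy vectors, the function name, and the fixed temperature below are assumptions.

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def zero_shot_probability(image_emb, pos_text_emb, neg_text_emb, temperature=0.07):
    """Probability that the finding is present, from a positive/negative prompt pair."""
    pos = cosine(image_emb, pos_text_emb) / temperature
    neg = cosine(image_emb, neg_text_emb) / temperature
    # two-way softmax, shifted by the max for numerical stability
    m = max(pos, neg)
    e_pos, e_neg = math.exp(pos - m), math.exp(neg - m)
    return e_pos / (e_pos + e_neg)

# Toy embeddings (assumed): one axis per prompt direction
pos_prompt = [1.0, 0.0]   # stands in for a prompt describing the finding
neg_prompt = [0.0, 1.0]   # stands in for a prompt negating the finding
p_likely = zero_shot_probability([0.9, 0.1], pos_prompt, neg_prompt)
p_unsure = zero_shot_probability([0.5, 0.5], pos_prompt, neg_prompt)
```

No classifier is trained here; the decision comes entirely from the choice of the two prompts, which is why prompt wording matters so much in zero-shot diagnosis.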

(2)Context optimization

        ①Example of context optimization:

There is no real explanation here, so it does not help you get started quickly, haha
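As a rough idea of what context optimization means, here is a toy sketch in the spirit of CoOp-style prompt learning: the encoders stay frozen, and only a few context vectors that are prepended to every class-name embedding are tuned on labelled samples. Everything below is a stand-in chosen so the example runs with the standard library: the mean-pooling "text encoder", the finite-difference gradients (real implementations backpropagate through the frozen text encoder), the single context vector, and all names and values.

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

def encode_prompt(context, class_emb):
    """Toy frozen text encoder: mean-pool the context vectors and the class embedding."""
    vectors = context + [class_emb]
    return [sum(v[d] for v in vectors) / len(vectors) for d in range(len(class_emb))]

def loss(context, class_embs, samples, temperature=0.1):
    """Mean cross-entropy over (image_emb, label) samples with a shared context."""
    text_embs = [encode_prompt(context, c) for c in class_embs]
    total = 0.0
    for image_emb, label in samples:
        logits = [cosine(image_emb, t) / temperature for t in text_embs]
        m = max(logits)
        lse = m + math.log(sum(math.exp(x - m) for x in logits))
        total += lse - logits[label]
    return total / len(samples)

def optimize_context(context, class_embs, samples, lr=1.0, steps=50, eps=1e-4):
    """Gradient descent on the context vectors only; class embeddings stay fixed."""
    context = [list(v) for v in context]  # work on a copy
    for _ in range(steps):
        grads = []
        for i in range(len(context)):
            row = []
            for d in range(len(context[i])):
                # central finite difference for d(loss)/d(context[i][d])
                context[i][d] += eps
                up = loss(context, class_embs, samples)
                context[i][d] -= 2 * eps
                down = loss(context, class_embs, samples)
                context[i][d] += eps
                row.append((up - down) / (2 * eps))
            grads.append(row)
        for i in range(len(context)):
            for d in range(len(context[i])):
                context[i][d] -= lr * grads[i][d]
    return context

# Toy data (assumed): two classes, one labelled image embedding per class
class_embs = [[1.0, 0.0], [0.0, 1.0]]
samples = [([1.0, 0.2], 0), ([0.2, 1.0], 1)]
initial = [[3.0, 3.0]]  # a poor starting context that blurs the two classes
learned = optimize_context(initial, class_embs, samples)
loss_before = loss(initial, class_embs, samples)
loss_after = loss(learned, class_embs, samples)
```

The point of the design is that only the handful of context parameters change, so the approach needs far fewer labelled samples than fine-tuning the whole model.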

2.5.2. Dense prediction

        ①Methods:

(1)Detection

        ①Lists relevant models

(2)2D medical image segmentation

        ①Fine-tune CLIP on 2D medical image datasets

(3)3D medical image segmentation

        ①Examples:

(4)Others

2.5.3. Cross-modal tasks

        ①Representative models:

(1)Generation

        ①Automatically generate medical report or medical image

(2)Medical visual question answering

        ①Example (this construction is rather odd):

(3)Image–text retrieval

        ①Current models focus on global image features

        ②X-TRA:

2.5.4. Summary

        ~

2.6. Comparative analysis

        ①How Multi-modality Large Language Models (MLLMs) differ from CLIP:

        ②Performance of CLIP on different image sets:

2.7. Discussions and future directions

        ①Inter-disease similarity:

        ②Challenges: inconsistency between pre-training and application, incomprehensive evaluation of refined pre-training, challenges of volumetric imaging, limited scope of refined CLIP pre-training, debiasing in CLIP models, enhancing adversarial robustness of CLIP, exploring the potential of metadata, incorporation of high-order correlations, beyond image–text alignment

2.8. Conclusion

        ~
