
Tutorial: Using CellProfiler for Pathology Images



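1. Installation

Download: https://cellprofiler.org/releases/

Just follow the installer prompts (my computer may already have had the required environment installed, so it ran right away for me).

If that does not work, see this other tutorial: https://blog.csdn.net/weixin_38594676/article/details/125034672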


2. Usage

2.1 Getting to know the interface

When I first used CellProfiler, I got a rough overview of the software from the video below:

https://mp.weixin.qq.com/s/gKlhPD_qBR9QImxykKcPSw


2.2 Example CellProfiler pipeline

CellProfiler processes images by chaining modules into a pipeline, which can be saved as a CellProfiler Project (.cpproj) file. The example pipeline in this post follows the paper below:

Machine learning-based pathomics signature of histology slides as a novel prognostic indicator in primary central nervous system lymphoma https://link.springer.com/article/10.1007/s11060-024-04665-8#MOESM4


The example pipeline can be downloaded from Baidu Netdisk:

Shared file: cellprofiler demo
Link: https://pan.baidu.com/s/17dhvhnh3VaanLRE2BmvbHQ?pwd=zmpj Extraction code: zmpj


The paper describes the procedure as follows:

First, the images were split into hematoxylin-stained and eosin-stained greyscale images by the “UnmixColors” module. The nuclei of tumor cells were identified with the “IdentifyPrimaryObjects” module. Then the “IdentifySecondaryObjects” module identified the cell body by using the nuclei as a “seed” region, growing outwards until stopped by the image threshold or by a neighbor. Thus it identified the cytoplasm by “subtracting” the nuclei objects from the cell objects using the “IdentifyTertiaryObjects” module. The quantitative features were extracted with modules including “Measure Image Quality,” “Measure Image Intensity,” “Measure Granularity,” “Measure Colocalization,” “Measure Object Intensity,” “Measure Object Neighbors,” “Measure Object Size Shape,” and “Measure Texture” (Fig. 1b).

A step-by-step explanation:
[Figure: image not available in this copy]
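To make the "subtracting" step in the quoted description concrete, here is a minimal sketch of the idea using boolean masks. It is a simplified illustration on a hypothetical 5 × 5 toy image, not CellProfiler's actual implementation of “IdentifyTertiaryObjects”.

```python
import numpy as np

# Hypothetical 5x5 masks for a single cell (True = pixel belongs to the object).
cell = np.zeros((5, 5), dtype=bool)
cell[1:5, 1:5] = True        # 4x4 cell body, as if grown outward from the nucleus

nucleus = np.zeros((5, 5), dtype=bool)
nucleus[2:4, 2:4] = True     # 2x2 nucleus lying inside the cell

# Cytoplasm = pixels that belong to the cell but not to the nucleus.
cytoplasm = cell & ~nucleus

print(int(cell.sum()), int(nucleus.sum()), int(cytoplasm.sum()))
```

The three object types therefore share the same parent nucleus, which is why their measurements can later be matched image by image.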

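Hands-on walkthrough:

Step 1: Double-click the pipeline file to open it, then work through the steps in order (this post does not cover how to tune the parameters of each step):
[Figure: screenshot of the opened pipeline; image not available in this copy]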

(1) Input: prepare a folder containing the pathology patch images (png/jpg)

(2) Processing:
[Figures: screenshots of the run; images not available in this copy]

(3) Output files:

[Figure: list of output files; image not available in this copy]

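  • MyExpt_Image: image-level global features, such as mean intensity, standard deviation, texture features, and the number of objects identified in each image (e.g., total cells, total nuclei).
  • MyExpt_Experiment: summary statistics and metadata for the whole experiment or batch run.
  • MyExpt_IdentifyPrimaryObjects: detailed measurements for each identified nucleus.
  • MyExpt_IdentifySecondaryObjects: detailed measurements for each whole cell. Cells are usually identified by expanding outward from the primary objects (nuclei) to the cell membrane boundary.
  • MyExpt_Cytoplasm: detailed measurements for the cytoplasm region of each cell. The cytoplasm is usually defined by subtracting the primary object (nucleus) from the secondary object (whole cell).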
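As a rough illustration of how these tables relate, the sketch below builds two tiny in-memory stand-ins (all values are made up) and counts the nuclei in each patch. The column names ImageNumber, ObjectNumber and FileName_HEslide match this pipeline's output; AreaShape_Area is used only as an example measurement.

```python
import io
import pandas as pd

# Tiny synthetic stand-ins for two of the exported tables (values are made up).
image_csv = io.StringIO(
    "ImageNumber,FileName_HEslide\n"
    "1,caseA_001.png\n"
    "2,caseA_002.png\n"
)
nuclei_csv = io.StringIO(
    "ImageNumber,ObjectNumber,AreaShape_Area\n"
    "1,1,120\n"
    "1,2,95\n"
    "2,1,130\n"
)

df_image = pd.read_csv(image_csv)
df_nuclei = pd.read_csv(nuclei_csv)

# The image table has one row per patch; the object table has one row per object.
nuclei_per_patch = (
    df_nuclei.groupby("ImageNumber")["ObjectNumber"]
    .count()
    .rename("n_nuclei")
    .reset_index()
)
overview = df_image.merge(nuclei_per_patch, on="ImageNumber", how="left")
print(overview)
```

ImageNumber is the key shared by all tables, so any per-object table can be summarised per patch and joined back to the image table in this way.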

Step 2: Merge the outputs into a single file for downstream analysis

import os
import pandas as pd

# Directory containing the CellProfiler output CSV files
indir = "./00.raw_data/"
print(indir)

# Image-level features
infile = os.path.join(indir, "MyExpt_Image.csv")
print(infile)
df_image = pd.read_csv(infile)
prefixes = ['Correlation', 'Granularity', 'ImageQuality', 'Intensity', 'Texture', 'Threshold', 'FileName_HEslide', 'ImageNumber']
# '^' anchors the match at the start of the column name
# '|' means OR
# Join all prefixes with '|' and wrap them in parentheses so they act as one alternation
regex_pattern = '^(' + '|'.join(prefixes) + ')'
print(f"Generated regex: {regex_pattern}")
df_image = df_image.filter(regex=regex_pattern)
df_image = df_image.groupby(['FileName_HEslide', 'ImageNumber']).agg("mean").reset_index()
df_image.columns = ["image_" + i for i in df_image.columns]
df_image = df_image.rename({'image_ImageNumber': "ImageNumber", 'image_FileName_HEslide': 'FileName_HEslide'}, axis=1)
# case_id is the part of the file name before the first underscore
df_image["case_id"] = df_image["FileName_HEslide"].str.split("_").str[0]
df_image.head(1)

# Nuclei (primary objects): average all nuclei within each patch
infile = os.path.join(indir, "MyExpt_IdentifyPrimaryObjects.csv")
print(infile)
df_Primary = pd.read_csv(infile)
prefixes = ['Texture', 'Neighbors', 'Location', 'Intensity', 'AreaShape', 'ObjectNumber', 'ImageNumber']
regex_pattern = '^(' + '|'.join(prefixes) + ')'
print(f"Generated regex: {regex_pattern}")
df_Primary = df_Primary.filter(regex=regex_pattern)
df_Primary = df_Primary.groupby(["ImageNumber"]).agg("mean").reset_index()
df_Primary.columns = ["nucl_" + i for i in df_Primary.columns]
df_Primary = df_Primary.rename({'nucl_ImageNumber': "ImageNumber"}, axis=1)
df_Primary.head(1)

# Whole cells (secondary objects)
infile = os.path.join(indir, "MyExpt_IdentifySecondaryObjects.csv")
print(infile)
df_Sec = pd.read_csv(infile)
prefixes = ['Texture', 'Location', 'Intensity', 'AreaShape', 'ObjectNumber', 'ImageNumber']
regex_pattern = '^(' + '|'.join(prefixes) + ')'
print(f"Generated regex: {regex_pattern}")
df_Sec = df_Sec.filter(regex=regex_pattern)
df_Sec = df_Sec.groupby(["ImageNumber"]).agg("mean").reset_index()
df_Sec.columns = ["cell_" + i for i in df_Sec.columns]
df_Sec = df_Sec.rename({'cell_ImageNumber': "ImageNumber"}, axis=1)
df_Sec.head(1)

# Cytoplasm (tertiary objects)
infile = os.path.join(indir, "MyExpt_Cytoplasm.csv")
print(infile)
df_Cyto = pd.read_csv(infile)
prefixes = ['Texture', 'Location', 'Intensity', 'AreaShape', 'ObjectNumber', 'ImageNumber']
regex_pattern = '^(' + '|'.join(prefixes) + ')'
print(f"Generated regex: {regex_pattern}")
df_Cyto = df_Cyto.filter(regex=regex_pattern)
df_Cyto = df_Cyto.groupby(["ImageNumber"]).agg("mean").reset_index()
df_Cyto.columns = ["cyto_" + i for i in df_Cyto.columns]
df_Cyto = df_Cyto.rename({'cyto_ImageNumber': "ImageNumber"}, axis=1)
df_Cyto.head(1)

# Merge the features
# ImageNumber identifies a patch; ObjectNumber identifies each object
# (nucleus, cell, cytoplasm) segmented within that patch
df_m = pd.merge(df_image, df_Primary, how="inner", on="ImageNumber")
df_m = pd.merge(df_m, df_Sec, how="inner", on="ImageNumber")
df_m = pd.merge(df_m, df_Cyto, how="inner", on="ImageNumber")
print(df_m.shape)
df_m.to_csv("Feature_byCellprofilers_BRAF.csv", index=False)

The final output file looks like this:

[Figure: preview of the merged feature table; image not available in this copy]
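The nucleus, cell and cytoplasm tables in Step 2 all go through the same filter, average and rename steps, so the pattern can be captured in one helper. The following is only a sketch: `summarize_objects` is a hypothetical name, and the data are synthetic stand-ins rather than real CellProfiler output.

```python
import pandas as pd


def summarize_objects(df, prefixes, tag, key="ImageNumber"):
    """Average the selected per-object columns within each image.

    prefixes: column-name prefixes to keep.
    tag: short label prepended to every feature column, e.g. "nucl".
    """
    pattern = "^(" + "|".join(prefixes) + ")"
    features = df.filter(regex=pattern)
    # The key is used for grouping only, so it is never averaged or renamed.
    features = features.drop(columns=[key], errors="ignore")
    out = features.groupby(df[key]).mean()
    out.columns = [f"{tag}_{c}" for c in out.columns]
    return out.reset_index()


# Synthetic per-nucleus table (values are made up).
nuclei = pd.DataFrame({
    "ImageNumber": [1, 1, 2],
    "ObjectNumber": [1, 2, 1],
    "AreaShape_Area": [100.0, 200.0, 50.0],
    "Metadata_Note": ["a", "b", "c"],  # not selected, so it is dropped
})
nucl = summarize_objects(nuclei, ["AreaShape"], "nucl")

# Synthetic image-level table; patch 3 has no nuclei at all.
images = pd.DataFrame({
    "ImageNumber": [1, 2, 3],
    "image_Intensity_MeanIntensity_HE": [0.4, 0.5, 0.6],
})
merged = images.merge(nucl, on="ImageNumber", how="inner", validate="one_to_one")
print(merged)
```

Note that with `how="inner"` a patch in which no object was segmented (patch 3 here) disappears from the merged table. Whether that is desirable depends on the analysis; `how="left"` would keep it with missing values instead.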

Step 3: Post-processing after feature extraction

Post-processing workflow 1: a pathologist annotates the ROI -> ROI patches (512 × 512 pixels) were tiled using OpenSlide -> color-normalized using the Vahadane method -> 50 non-overlapping representative patches that contained more tumor cells from each patient were selected for feature extraction -> The final value of each feature was averaged over 50 patches for each slide.
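The last two steps of workflow 1 (keep the patches with the most tumor cells, then average each feature per case) might look like the sketch below. It assumes that a per-patch nucleus count is the ranking criterion, which the paper does not state explicitly; the column names `n_nuclei` and `nucl_AreaShape_Area` and all values are hypothetical.

```python
import pandas as pd

# Synthetic patch-level table (values are made up).
patches = pd.DataFrame({
    "case_id": ["A", "A", "A", "B", "B"],
    "n_nuclei": [30, 80, 55, 10, 40],
    "nucl_AreaShape_Area": [100.0, 120.0, 110.0, 90.0, 70.0],
})

N_PATCHES = 2  # the paper uses 50; 2 keeps this toy example readable

# Keep the N patches with the most nuclei in each case ...
top = (
    patches.sort_values("n_nuclei", ascending=False)
    .groupby("case_id")
    .head(N_PATCHES)
)
# ... then average every numeric feature over the kept patches.
per_case = top.groupby("case_id").mean(numeric_only=True).reset_index()
print(per_case)
```

Cases with fewer than `N_PATCHES` patches are simply averaged over whatever they have, so it is worth checking the patch count per case before relying on the result.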

Post-processing workflow 2: The features extracted on the CellProfiler platform were aggregated by mean, median, SD, 25-quantiles, and 75-quantiles of the values for the ROI in each slide. In total, 525 pathomics nucleus features (pNUC) were generated for each patient.
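A minimal sketch of the five-statistic aggregation in workflow 2, using pandas on synthetic per-nucleus data. The column names `slide_id` and `AreaShape_Area` are placeholders, and pandas' default sample standard deviation (ddof=1) is an assumption, since the paper does not say which SD definition it used.

```python
import pandas as pd

# Synthetic per-nucleus measurements for two slides (values are made up).
nuclei = pd.DataFrame({
    "slide_id": ["S1"] * 4 + ["S2"] * 4,
    "AreaShape_Area": [10.0, 20.0, 30.0, 40.0, 5.0, 5.0, 15.0, 15.0],
})


def q25(s):
    """25% quantile (linear interpolation, the pandas default)."""
    return s.quantile(0.25)


def q75(s):
    """75% quantile (linear interpolation, the pandas default)."""
    return s.quantile(0.75)


# Each feature column becomes five slide-level columns.
stats = nuclei.groupby("slide_id").agg(["mean", "median", "std", q25, q75])
stats.columns = [f"{feature}_{stat}" for feature, stat in stats.columns]
stats = stats.reset_index()
print(stats)
```

With this layout every per-nucleus feature contributes five columns per slide, one for each summary statistic.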


3. Other pipelines for reference (from the literature)

Paper 1: Development and interpretation of a pathomics-driven ensemble model for predicting the response to immunotherapy in gastric cancer

  • CellProfiler Pipeline: First, pathomics tumor nucleus features were extracted. After segmenting tumor nuclei using a HoVer-Net model for each ROI, we extracted three categories of pathomics nucleus features, including nuclear intensity, morphology, and texture features, using the “MeasureObjectIntensity”, “MeasureObjectSizeShape”, and “MeasureTexture” modules in the CellProfiler platform.

Paper 2: Clinical use of machine learning-based pathomics signature for diagnosis and survival prediction of bladder cancer https://onlinelibrary.wiley.com/doi/full/10.1111/cas.14927

  • Patch selection strategy: patches of 1000 × 1000 pixels

  • CellProfiler Pipeline: We built an image processing pipeline (Document S1) for segmentation and feature extraction using multiple modules in CellProfiler. H&E-stained images were firstly unmixed with 1000 × 1000 pixels via the ‘UnmixColors’ module. Afterwards, unmixed images were automatically segmented via an ‘IdentifyPrimaryObjects’ module and an ‘IdentifySecondaryObjects’ module to identify the cell nuclei and cell cytoplasm. Quantitative image features of object shape, size, texture, and pixel intensity distribution were further extracted via multiple modules, including measure models of ‘Object Intensity Distribution’, ‘Object Intensity’, ‘Texture’, and ‘Object Size Shape’. After eliminating unnecessary image features, 345 available quantitative image features (Document S2) were finally selected for further analysis, which were also listed in Table S1.


Paper 3: Prognostic and predictive value of a pathomics signature in gastric cancer (IF: 17) https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9653436

  • Patch selection strategy: Ten nonoverlapping representative tiles of each case containing the greatest number of tumour cells with a field of view of 1000 × 1000 pixels (one pixel is equal to 0.504 μm) were selected by a pathologist and then confirmed by the other pathologist.

  • CellProfiler Pipeline: The quantitative pathomics features of the selected tiles were extracted by using CellProfiler (version 4.0.7), an open-source image analysis software developed by the Broad Institute (Cambridge, MA). The H&E-stained images were split into haematoxylin-stained and eosin-stained greyscale images using the “UnmixColors” module. The H&E-stained images were also converted to greyscale images using the “ColorToGray” module based on the “Combine” method for further analysis. First, the features that indicated the image quality of the greyscale H&E, haematoxylin and eosin images were assessed by using the “MeasureImageQuality” and “MeasureImageIntensity” modules with three types of features, including blurred features, intensity features and threshold features. The threshold features were extracted by automatically calculating the threshold for each image to identify the tissue foreground from the unstained background with the Otsu algorithm. Subsequently, the colocalization and correlation between intensities in each haematoxylin-stained image and eosin-stained image were calculated on a pixel-by-pixel basis across an entire image by using the “MeasureColocalization” module. In addition, the granularity features of each image were assessed using the “MeasureGranularity” module, which outputted spectra of size measurements of the textures in the image, with a granular spectrum range of 16. Further description of the pipeline for feature extraction is described in the Supplementary Methods. A summary of the pathomics features is presented in Supplementary Table 15.
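To give a feel for the pixel-by-pixel correlation mentioned in the description above, here is a small sketch that computes the Pearson correlation between two greyscale images with NumPy. It only illustrates the basic idea on random synthetic images; it is not the “MeasureColocalization” implementation, and `pixel_correlation` is a hypothetical helper name.

```python
import numpy as np


def pixel_correlation(a, b):
    """Pearson correlation between two same-sized greyscale images."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    if a.shape != b.shape:
        raise ValueError("images must have the same shape")
    # Flatten both images so that each pixel is one paired observation.
    return float(np.corrcoef(a.ravel(), b.ravel())[0, 1])


# Synthetic stand-ins for the unmixed haematoxylin and eosin channels.
rng = np.random.default_rng(0)
haematoxylin = rng.random((64, 64))
eosin = 0.5 * haematoxylin + 0.5 * rng.random((64, 64))  # partly correlated

r = pixel_correlation(haematoxylin, eosin)
print(round(r, 3))
```

A value near 1 means the two stains tend to be strong in the same pixels, a value near 0 means their intensities are unrelated.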
