当前位置：首页 > wzjs >正文

运河网站制作怎样开自己的网站

wzjs 2025/8/26 17:41:18

运河网站制作,怎样开自己的网站,沈阳便宜做网站的,深圳网络优化原文链接：如何使用bedtools、convert2bed、gff2bed提取基因序列前言我们的使用基因组注释文件gtf或gff文件从基因组fa文件中提取transcript的方式很多,相对用的比较多的是使用gffread软件。但是gffread软件，提取的序列一般都是transcript序列&#xf…

原文链接：如何使用bedtools、convert2bed、gff2bed提取基因序列

前言

我们的使用基因组注释文件gtf或gff文件从基因组fa文件中提取transcript的方式很多,相对用的比较多的是使用gffread软件。但是gffread软件，提取的序列一般都是transcript序列，若是，我们的想提取的gene序列，那么不能直接使用基因组gtf文件。

本次，我们介绍其中的一种方法。使用bedtools getfasta提取gene全长序列。

注意：这仅仅只是其中的一种方法而已。

软件安装

软件安装，使用mamba安装bedtools,convert2bed,gff2bed。

mamba install -y bedtools convert2bed gff2bed

软件测试

bedtools -h

$ bedtools -hbedtools is a powerful toolset for genome arithmetic.Version:   v2.31.1
About:     developed in the quinlanlab.org and by many contributors worldwide.
Docs:      http://bedtools.readthedocs.io/
Code:      https://github.com/arq5x/bedtools2
Mail:      https://groups.google.com/forum/#!forum/bedtools-discussUsage:     bedtools <subcommand> [options]The bedtools sub-commands include:[ Genome arithmetic ]intersect     Find overlapping intervals in various ways.window        Find overlapping intervals within a window around an interval.closest       Find the closest, potentially non-overlapping interval.coverage      Compute the coverage over defined intervals.map           Apply a function to a column for each overlapping interval.genomecov     Compute the coverage over an entire genome.merge         Combine overlapping/nearby intervals into a single interval.cluster       Cluster (but don't merge) overlapping/nearby intervals.complement    Extract intervals _not_ represented by an interval file.shift         Adjust the position of intervals.subtract      Remove intervals based on overlaps b/w two files.

制作bed文件

使用bedtools提取序列，需要制作bed文件。格式如下所示：

我们可以直接使用awk命令进行提取对应的信息。我们使用Cucumber的gff注释文件提取。

cat Cucumber.CLv4.gff3 | awk '{if($3 == "gene") print $0}' | awk '{print $1"\t"$4"\t"$5"\t"$9"\t"$7}' | head

提取对应的列信息即可。

使用convert2bed或gff2bed结合起来提取信息。

cat Cucumber.CLv4.gff3 |awk '{if($3~/^gene$/)print }' > 01.gene.gff && convert2bed --input=gff --output=bed < 01.gene.gff > 02.gene.bed

获得如下信息

我们在此基础上加上awk的命令即可，同上。

cat Cucumber.CLv4.gff3 |awk '{if($3~/^gene$/)print }' > 01.gff && convert2bed --input=gff --output=bed < 01.gff > 02.bed && awk '{print $1"\t"$2"\t"$3"\t"$10"\t"$6}' <02.bed>03.bed

使用gff2bed软件结合。

awk '{if($3~/^gene$/)print}' Cucumber.CLv4.gff3  > 01.genes.gff && gff2bed <01.genes.gff> 02.genes.bed

使用bedtools getfasta提取序列。

bedtools getfasta -fi Cucumber.geome.fa -bed 02.gene.bed -fo cucumber.gene.fa -name -s

-name参数是必须加的，若是不加，你的cucumber.gene.fa文件中无基因名。

$ bedtools getfasta -hTool:    bedtools getfasta (aka fastaFromBed)
Version: v2.31.1
Summary: Extract DNA sequences from a fasta file based on feature coordinates.Usage:   bedtools getfasta [OPTIONS] -fi <fasta> -bed <bed/gff/vcf>Options: -fi		Input FASTA file-fo		Output file (opt., default is STDOUT-bed		BED/GFF/VCF file of ranges to extract from -fi-name		Use the name field and coordinates for the FASTA header-name+		(deprecated) Use the name field and coordinates for the FASTA header-nameOnly	Use the name field for the FASTA header-split		Given BED12 fmt., extract and concatenate the sequencesfrom the BED "blocks" (e.g., exons)-tab		Write output in TAB delimited format.-bedOut		Report extract sequences in a tab-delimited BED format instead of in FASTA format.- Default is FASTA format.-s		Force strandedness. If the feature occupies the antisense,strand, the sequence will be reverse complemented.- By default, strand information is ignored.-fullHeader	Use full fasta header.- By default, only the word before the first space or tab is used.-rna	The FASTA is RNA not DNA. Reverse complementation handled accordingly.

输出的结果基因ID如下所示。

>CsaV4_1G000004::chr1:1088370-1092905
>CsaV4_1G000005::chr1:1095847-1098019
>CsaV4_1G000003::chr1:1084077-1087157

使用sed命令进行批量处理：

sed 's/::.*//' input.fa > output.fa

若我们的教程对你有所帮助，请点赞+收藏+转发，大家的支持是我们更新的动力！！

2024已离你我而去，2025加油！！

2024年推文汇总 (点击后访问)

2023年推文汇总 (点击后访问)

2022年推文汇总 (点击后访问)

往期部分文章

1. 最全WGCNA教程（替换数据即可出全部结果与图形）

WGCNA分析代码六

推荐大家购买最新的教程，若是已经购买以前WGNCA教程的同学，可以在对应教程留言，即可获得最新的教程。（注：此教程也仅基于自己理解，不仅局限于此，难免有不恰当地方，请结合自己需求，进行改动。）

2. 精美图形绘制教程

精美图形绘制教程
《R语言绘图专栏–50+图形绘制教程》

3. 转录组分析教程

转录组上游分析教程[零基础]
一个转录组上游分析流程 | Hisat2-Stringtie
Samll RNA上游分析

4. 转录组下游分析

批量做差异分析及图形绘制 | 基于DESeq2差异分析
GO和KEGG富集分析
单基因GSEA富集分析
全基因集GSEA富集分析

BioinfoR生信筆記 ，注于分享生物信息学相关知识和R语言绘图教程。

查看全文

http://www.dtcms.com/wzjs/496730.html

网站开发微信支付详细教程百度推广助手怎么用

建设网站需要多久到账网站提交入口百度

做网站英文网站点击量统计

山东青岛网站建设seo优化大数据培训包就业靠谱吗

焦作网站建设策划软文案例大全300字

网站开发常用开发语言餐饮营销方案100例

网站推广基本预算自建站平台

网页设计师资格证培训班免费的关键词优化软件

襄阳市做网站的公司肇庆seo排名

东莞做网站乐云seo百度推广营销方案

网站推荐男生正能量2021郑州seo优化外包热狗网

网站的图片怎么做显示和隐藏怎么建立网站?

建设部网站1667号百度指数查询app

家居企业网站建设效果如何优化关键词排名快速首页

做网站合肥网站规划与设计

网站主页建设格式天津百度推广开户

做网站注意福建百度代理公司

网站建设hairongsoft小程序开发公司排行榜

做网站烧钱吗关键词推广价格

打电话说帮忙做网站长春网站建设定制

个人网站设计及实现论文怎么做推广

wordpress自定义新页面链接抚州seo外包

营销团队名称防城港网站seo

企业网站的开发公司网站优化最为重要的内容是

钓鱼网站的域名怎么不稳定网络安全培训最强的机构

大庆建设中专网站东莞哪种网站推广好

大学生作业代做网站女生读网络营销与电商直播

手机网站应该怎么做百度识图网页版入口

厦门房产网惠州seo博客

乐陵人力资源网站天津网站推广

原文链接：如何使用bedtools、convert2bed、gff2bed提取基因序列

前言

软件安装

往期部分文章

相关文章：