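The _make_msa_df function in AlphaFold3's msa_pairing module converts multiple sequence alignment (MSA) data into a Pandas DataFrame, keeping only the per-sequence features needed for MSA pairing.
Source code: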
from typing import Mapping

import numpy as np
import pandas as pd


def _make_msa_df(chain_features: Mapping[str, np.ndarray]) -> pd.DataFrame:
    """Makes dataframe with msa features needed for msa pairing."""
    chain_msa = chain_features['msa_all_seq']
    query_seq = chain_msa[0]
    # Fraction of positions in each MSA row that match the query sequence.
    per_seq_similarity = np.sum(
        query_seq[None] == chain_msa, axis=-1) / float(len(query_seq))
    # Fraction of positions in each MSA row equal to 21 (reported as 'gap').
    per_seq_gap = np.sum(chain_msa == 21, axis=-1) / float(len(query_seq))
    msa_df = pd.DataFrame({
        'msa_species_identifiers':
            chain_features['msa_species_identifiers_all_seq'],
        'msa_row':
            np.arange(len(chain_features['msa_species_identifiers_all_seq'])),
        'msa_similarity': per_seq_similarity,
        'gap': per_seq_gap
    })
    return msa_df
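To make the four output columns concrete, here is a minimal sketch that runs the function on a toy chain. The function body is repeated so the block runs on its own. The toy residue indices and the species labels ('HUMAN', 'MOUSE') are made up for illustration and are not real AlphaFold inputs. The only value taken from the source is 21 as the gap index.

```python
from typing import Mapping

import numpy as np
import pandas as pd

GAP_ID = 21  # the index the source code compares against for the 'gap' column


def _make_msa_df(chain_features: Mapping[str, np.ndarray]) -> pd.DataFrame:
    """Same logic as the function above, repeated so this block is runnable."""
    chain_msa = chain_features['msa_all_seq']
    query_seq = chain_msa[0]
    per_seq_similarity = np.sum(
        query_seq[None] == chain_msa, axis=-1) / float(len(query_seq))
    per_seq_gap = np.sum(chain_msa == GAP_ID, axis=-1) / float(len(query_seq))
    return pd.DataFrame({
        'msa_species_identifiers':
            chain_features['msa_species_identifiers_all_seq'],
        'msa_row':
            np.arange(len(chain_features['msa_species_identifiers_all_seq'])),
        'msa_similarity': per_seq_similarity,
        'gap': per_seq_gap,
    })


# Toy chain (illustrative values only): 3 aligned sequences of length 4.
toy_chain = {
    'msa_all_seq': np.array([
        [0, 1, 2, 3],     # row 0: the query sequence
        [0, 1, 2, 21],    # 3 of 4 positions match the query, 1 gap
        [21, 21, 2, 5],   # 1 of 4 positions matches the query, 2 gaps
    ]),
    # Hypothetical species labels, one per MSA row.
    'msa_species_identifiers_all_seq': np.array(['', 'HUMAN', 'MOUSE']),
}

msa_df = _make_msa_df(toy_chain)
print(msa_df)
# msa_similarity is [1.0, 0.75, 0.25] and gap is [0.0, 0.25, 0.5]
```

Note that both fractions are divided by the query length, so a gap position lowers msa_similarity and raises gap at the same time.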
Code walkthrough:
Function input
def _make_msa_df(chain_features: Mapping[str, np.ndarray]) -> pd.DataFrame:
chain_features: a dictionary holding the MSA information for a single chain (A or B).
- Main keys used:
'msa_all_seq': a 2D array containing the query sequence (row 0) and the rest of the MSA
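As a rough sketch of what this input looks like, the snippet below builds a chain_features dictionary from a few aligned strings. The encoder encode_alignment and the table TOY_AA_TO_ID are hypothetical helpers invented for this illustration; they are not AlphaFold's real residue-to-index mapping. The only assumption carried over from the source code is that index 21 marks a gap.

```python
import numpy as np

GAP_ID = 21  # the value the source code compares against when counting gaps

# Hypothetical toy alphabet for illustration; NOT AlphaFold's real mapping.
TOY_AA_TO_ID = {'A': 0, 'C': 1, 'D': 2, 'E': 3, '-': GAP_ID}


def encode_alignment(aligned_seqs):
    """Encodes equal-length aligned strings as a (num_seqs, seq_len) int array."""
    lengths = {len(s) for s in aligned_seqs}
    if len(lengths) != 1:
        raise ValueError('all aligned sequences must have the same length')
    return np.array(
        [[TOY_AA_TO_ID[c] for c in s] for s in aligned_seqs], dtype=np.int32)


# Row 0 is the query; the remaining rows are its aligned homologs.
aligned = ['ACDE', 'AC-E', '--DE']

chain_features = {
    'msa_all_seq': encode_alignment(aligned),
    # Made-up species labels, one entry per MSA row.
    'msa_species_identifiers_all_seq': np.array(['', 'speciesX', 'speciesY']),
}

msa = chain_features['msa_all_seq']
print(msa.shape)  # (3, 4): 3 sequences, 4 aligned positions
print(msa[0])     # [0 1 2 3]: the query sequence
```

The two arrays must have the same number of rows, since _make_msa_df places the species identifiers and the per-row statistics side by side in one DataFrame.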