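This is my current ASR (ancestral sequence reconstruction) workflow:
1. Collect the relevant protein sequences from RefSeq and remove redundancy with CD-HIT at a 0.95 identity threshold, which leaves roughly 400 to 500 sequences.
2. Run a multiple sequence alignment with MAFFT (--auto), then trim the alignment with trimAl.
3. Build the phylogenetic tree with IQ-TREE.
4. Run the ASR analysis with codeml from PAML.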
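
For reference, steps 1 to 3 could be sketched roughly as follows. This only builds the command strings and does not run anything. The file names (`refseq_hits.fa`, the `asr` prefix), the trimAl mode (`-automated1`), the PHYLIP output flag, and the executable name `iqtree` are assumptions for illustration, since the post does not state them.

```python
import shlex


def build_pipeline(raw_fasta="refseq_hits.fa", prefix="asr"):
    """Return shell command strings for steps 1-3; nothing is executed here."""
    q = shlex.quote
    nr = f"{prefix}.nr95.fa"        # non-redundant sequences after CD-HIT
    aln = f"{prefix}.aln.fa"        # raw MAFFT alignment
    trimmed = f"{prefix}.trim.phy"  # trimmed alignment for IQ-TREE and codeml
    return [
        # Step 1: cluster at 95% identity and keep one representative per cluster.
        f"cd-hit -i {q(raw_fasta)} -o {q(nr)} -c 0.95",
        # Step 2a: with --auto MAFFT picks a strategy itself; output goes to stdout.
        f"mafft --auto {q(nr)} > {q(aln)}",
        # Step 2b: the trimming mode is an assumption; the post does not name one.
        f"trimal -in {q(aln)} -out {q(trimmed)} -automated1 -phylip",
        # Step 3: ModelFinder (-m MFP) selects the model before the tree search.
        f"iqtree -s {q(trimmed)} -m MFP",
    ]


if __name__ == "__main__":
    for cmd in build_pipeline():
        print(cmd)
```

The printed lines can be pasted into a shell one at a time, which makes it easy to check each intermediate file before moving on.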
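I am stuck on step 4. The Linux version of codeml requires a codeml.ctl control file, and its aaRatefile option expects an amino acid substitution model file. IQ-TREE selected LG+F+R6 as the best-fit model, but none of the .dat amino acid substitution files bundled with PAML corresponds to this combination. How should I generate a suitable .dat file? My control file is below (the aligned sequences are amino acid sequences). I would appreciate it if anyone could check whether it has any problems.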
seqfile = supergene.phy
treefile = ff.fa.treefile
outfile = mlc
getSE = 0
noisy = 9
verbose = 1
seqtype = 2
runmode = 0
CodonFreq = 2
clock = 0
aaDist = 0
model = 2
aaRatefile =
icode = 0
Mgene = 0
fix_kappa = 0
kappa = 2
fix_omega = 0
omega = .4
fix_alpha = 1
alpha = 0
Malpha = 0
ncatG = 10
RateAncestor = 1
Small_Diff = .5e-6
method = 0
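
One possible way around the missing model, offered as an untested sketch rather than a confirmed fix: PAML ships an LG matrix as `lg.dat` in its `dat` directory, and for amino acid data `model = 3` (Empirical+F) supplies the "+F" part, so LG+F may not need a new .dat file at all. As far as I can tell codeml has no FreeRate (+R) option, so the sketch below swaps in a discrete gamma distribution (`fix_alpha = 0`) as an approximation of R6. The function name `lg_f_gamma_ctl`, the default of 4 gamma categories, the starting `alpha` of 0.5, and the assumption that `lg.dat` has been copied next to the control file are all illustrative choices, not something from the original setup.

```python
def lg_f_gamma_ctl(seqfile="supergene.phy", treefile="ff.fa.treefile",
                   aa_rate_file="lg.dat", ncat=4):
    """Build codeml.ctl text for amino acid ASR under LG+F with gamma rates."""
    settings = [
        ("seqfile", seqfile),
        ("treefile", treefile),
        ("outfile", "mlc"),
        ("noisy", 9),
        ("verbose", 1),
        ("runmode", 0),                # 0 = evaluate the user-supplied tree
        ("seqtype", 2),                # 2 = amino acid sequences
        ("aaRatefile", aa_rate_file),  # assumed: lg.dat copied from PAML's dat/
        ("model", 3),                  # 3 = Empirical+F (the "+F" in LG+F)
        ("fix_alpha", 0),              # 0 = estimate the gamma shape parameter
        ("alpha", 0.5),                # assumed starting value for alpha
        ("ncatG", ncat),               # number of discrete gamma categories
        ("clock", 0),
        ("getSE", 0),
        ("RateAncestor", 1),           # 1 = reconstruct ancestral sequences
        ("Small_Diff", ".5e-6"),
        ("method", 0),
    ]
    # One "key = value" pair per line, the layout codeml.ctl files use.
    return "\n".join(f"{key:>12} = {value}" for key, value in settings) + "\n"


if __name__ == "__main__":
    print(lg_f_gamma_ctl(), end="")
```

Compared with the control file above, the differences are `aaRatefile`, `model`, `fix_alpha`, `alpha`, and `ncatG`; the codon-only options (`CodonFreq`, `kappa`, `omega`, and so on) are left out because the data are amino acids.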