XP-CLR is a method that uses allele frequency differentiation at linked loci between two populations to
detect selective sweeps. Each chromosome was analyzed using the program XPCLR (v1.0) with parameters
-w1 0.0005 200 2000 1 -p1 0.9.
The run parameters correspond to: XPCLR -xpclr genofile1 genofile2 mapfile outputFile -w1 snpWin gridSize chrN -p corrLevel
The output has the following format, with the columns being chr# grid# #ofSNPs_in_window physical_pos genetic_pos XPCLR_score max_s:
0 5148 6 10303058.000000 26.478879 0.000000 0.000200
0 5149 16 10305058.000000 26.483995 0.015511 0.001000
0 5150 11 10307058.000000 26.489143 0.000000 0.000400
0 5151 10 10309058.000000 26.494233 2.980613 0.003000
The average XP-CLR scores were calculated for each 20 kb sliding window with a step size of 2 kb. The
windows with the highest 1% of XP-CLR scores were considered as candidate
selective regions.
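The paper describes the average XP-CLR as using a 20 kb sliding window with a 2 kb step, averaging the XP-CLR scores of all SNPs within each window. How exactly is this step done? Does the 20 kb window size refer to the SNP count in the third column, the physical distance in the fourth column, or simply the number of rows?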
2 Answers
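This is a typical sliding-window problem, which is very common in bioinformatics.
Say you have a region spanning 0 ~ 1000 Kbp (that is, 1 Mbp), and you slide across it with a window size of 20 Kbp and a step of 2 Kbp.
You should then get the following intervals: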
0 ~ 20Kbp
2 ~ 22Kbp
4 ~ 24Kbp
6 ~ 26Kbp
……
976 ~ 996Kbp
978 ~ 998Kbp
980 ~ 1000Kbp
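As a minimal sketch, the interval list above could be generated like this in Python. The function name `sliding_windows` is hypothetical, and the coordinates are in Kbp to match the example:

```python
def sliding_windows(start, end, size, step):
    """Yield (win_start, win_end) pairs that fit entirely inside [start, end]."""
    pos = start
    while pos + size <= end:
        yield pos, pos + size
        pos += step


# 0 ~ 1000 Kbp region, 20 Kbp window, 2 Kbp step (values from the example above)
windows = list(sliding_windows(0, 1000, 20, 2))
print(windows[:3])   # first few intervals
print(windows[-1])   # last interval
```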
Then, for each of these intervals, compute the average of the values that fall inside it and meet your criteria.
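Applied to the XPCLR output in the question, this could look roughly like the sketch below. It rests on several assumptions that are not confirmed by the paper: the window is measured on physical_pos (the fourth column), which matches the Kbp intervals above; windows start at the first grid position on the chromosome and are half-open; trailing partial windows are kept; and the top 1% is taken by a simple rank cut. The function names `read_xpclr`, `window_means` and `top_fraction` are hypothetical.

```python
import io

# The four sample rows from the question (one chromosome).
SAMPLE = """\
0 5148 6 10303058.000000 26.478879 0.000000 0.000200
0 5149 16 10305058.000000 26.483995 0.015511 0.001000
0 5150 11 10307058.000000 26.489143 0.000000 0.000400
0 5151 10 10309058.000000 26.494233 2.980613 0.003000
"""


def read_xpclr(handle):
    """Parse XPCLR output lines into (physical_pos, XPCLR_score) tuples."""
    records = []
    for line in handle:
        fields = line.split()
        if len(fields) < 7:
            continue  # skip blank or malformed lines
        records.append((float(fields[3]), float(fields[5])))
    return records


def window_means(records, size=20_000, step=2_000):
    """Mean score of grid points with win_start <= pos < win_start + size."""
    if not records:
        return []
    records = sorted(records)
    last = records[-1][0]
    win_start = records[0][0]  # assumption: first window starts at first grid point
    out = []
    while win_start <= last:
        scores = [s for p, s in records if win_start <= p < win_start + size]
        if scores:
            out.append((win_start, win_start + size, sum(scores) / len(scores)))
        win_start += step
    return out


def top_fraction(windows, fraction=0.01):
    """Return the highest-scoring `fraction` of windows (at least one)."""
    ranked = sorted(windows, key=lambda w: w[2], reverse=True)
    keep = max(1, int(len(ranked) * fraction))
    return ranked[:keep]


records = read_xpclr(io.StringIO(SAMPLE))
wins = window_means(records)
for start, end, mean in wins:
    print(f"{start:.0f}\t{end:.0f}\t{mean:.6f}")
candidates = top_fraction(wins)
```

The inner list comprehension rescans all records for every window, which is fine for a demonstration but slow on a whole chromosome; a sorted position list with `bisect` would be the usual replacement. For real data, run this per chromosome and apply the 1% cut across all windows genome-wide.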