How to calculate the percentage of overlap in R
I am trying to calculate the percentage of overlap between two datasets of genomic coordinates, subject to certain criteria.
seg2
ID chrom loc.start loc.end num.mark seg.mean
AB 1 3010000 173490000 8430 0.0039
AB 1 173510000 173590000 5 -17.738
AB 1 173610000 173830000 12 0.011
AB 1 173850000 173970000 6 -16.121
AB 2 3090000 181990000 8434 0.011
BB 12 3090000 68990000 2950 -0.2022
BB 12 69010000 87790000 889 0.0267
BB 12 88010000 98550000 507 -0.3337
BB 12 98570000 115090000 800 0.0586
BB 12 115110000 119350000 197 -0.2031
BB 12 119370000 119430000 4 -20.671
over
chr start end CNA sample.ID
1 68580000 68640000 loss 1-68580000-68640000
3 15360000 16000000 loss 3-15360000-16000000
4 122660000 123500000 gain 4-122660000-123500000
7 48320000 48400000 loss 7-48320000-48400000
12 115860000 115980000 loss 12-115860000-115980000
12 113560000 114920000 gain 12-113560000-114920000
Expected output
ID chrom loc.start loc.end num.mark seg.mean lm(percentage of overlap)
AB 1 3010000 173490000 8430 0.0039 %
AB 1 173510000 173590000 5 -17.738
AB 1 173610000 173830000 12 0.011
AB 1 173850000 173970000 6 -16.121
AB 2 3090000 181990000 8434 0.011
BB 12 3090000 68990000 2950 -0.2022
BB 12 69010000 87790000 889 0.0267
BB 12 88010000 98550000 507 -0.3337
BB 12 98570000 115090000 800 0.0586
BB 12 115110000 119350000 197 -0.2031
BB 12 119370000 119430000 4 -20.671
I tried this script, but it does not work:
for (i in 1:nrow(seg2)) {
  seg2$lm <- if((seg2$chrom[i] == over$chr[i]) |
    (seg2$loc.start[i] <= over$start[i] & seg2$loc.end[i] >= over$end[i]) |
    (over$seg.mean[i] >= 0.459 & seg2$CNA[i] == "gain") |
    (over$seg.mean[i] <= -0.678 & seg2$CNA[i] == "loss"),
    (over$end[i]-over$start[i])/(seg2$loc.end[i]-seg2$loc.start[i])*100)
}
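I strongly recommend using Bioconductor's GenomicFeatures, which works with GRanges objects from the GenomicRanges package, to do this efficiently. If you already know how to create your own GRanges objects, you need the following two steps to get the overlap lengths: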
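# find the overlapping pairs
overlapping.index <- findOverlaps(object1, object2)
# get the length of each overlap
# (older IRanges releases spelled this ranges(overlapping.index, ranges(object1), ranges(object2)))
width(overlapsRanges(ranges(object1), ranges(object2), hits = overlapping.index))
where "object1" and "object2" are GRanges objects holding the coordinates, and "overlapping.index" is the Hits object that indexes the overlapping pairs.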
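For readers who have not built GRanges objects before, the two steps above can be sketched end to end as follows. This is only an illustration under stated assumptions: it uses two of the chromosome 12 rows from each table in the question, it builds the objects with the plain GRanges() constructor, and it measures the overlaps with pintersect() on the paired ranges, which is a substitute for the ranges-based call and should give the same widths.

```r
library(GenomicRanges)

# Two segments from seg2 and two regions from over (chromosome 12 only)
seg2 <- data.frame(
  ID        = c("BB", "BB"),
  chrom     = c("12", "12"),
  loc.start = c(98570000, 115110000),
  loc.end   = c(115090000, 119350000)
)
over <- data.frame(
  chr   = c("12", "12"),
  start = c(115860000, 113560000),
  end   = c(115980000, 114920000),
  CNA   = c("loss", "gain")
)

# Build one GRanges object per table
object1 <- GRanges(seqnames = seg2$chrom,
                   ranges   = IRanges(start = seg2$loc.start, end = seg2$loc.end))
object2 <- GRanges(seqnames = over$chr,
                   ranges   = IRanges(start = over$start, end = over$end))

# Step 1: which segment overlaps which region
hits <- findOverlaps(object1, object2)

# Step 2: width of each overlapping piece, one value per hit
overlap.width <- width(pintersect(object1[queryHits(hits)],
                                  object2[subjectHits(hits)]))
```

Note that GRanges widths are inclusive (end - start + 1), whereas the loop in the question uses end - start; the difference is negligible at this scale but worth keeping consistent.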
Once you have the lengths, you can easily compute the percentages.
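One way to turn the lengths into percentages is to sum the overlap widths per segment and divide by each segment's own width. The sketch below uses base R only; `query.index` and `overlap.width` are hypothetical stand-ins for `queryHits()` of the overlap index and the widths from the second step, filled in by hand for three chromosome 12 segments from the question. It assumes the regions in "over" do not overlap one another, otherwise the sum would count shared bases twice.

```r
# Hypothetical stand-ins for queryHits(<overlap index>) and the overlap widths
query.index   <- c(1L, 2L)
overlap.width <- c(1360001, 120001)

# Three segments from seg2 (chromosome 12), with inclusive widths
seg.start <- c(98570000, 115110000, 119370000)
seg.end   <- c(115090000, 119350000, 119430000)
seg.width <- seg.end - seg.start + 1

# Total overlapped bases per segment; segments without any hit stay at 0
overlap.per.seg <- numeric(length(seg.width))
summed <- tapply(overlap.width, query.index, sum)
overlap.per.seg[as.integer(names(summed))] <- summed

# Percentage of each segment covered by the regions
percent.overlap <- overlap.per.seg / seg.width * 100
```

The result can then be stored as the extra column, for example `seg2$lm <- percent.overlap`. The seg.mean and gain/loss criteria from the question could be applied by dropping the unwanted hits before the summing step.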