Awk: how to find matching rows in 2 files based on column 3 and create an extra file with rank values

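I have 2 files that I need to merge on column 3 (Pos). I then want to find the matching positions and use awk to produce the desired output shown below. The output should have an additional last column (Chip2 below) that gives, for each position common to both files, its rank number, and 0 otherwise.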

File1.txt:

SNP-ID          Chr     Pos
rs62637813      1       52058
rs150021059     1       52238
rs4477212       1       52356
kgp15717912     1       53424
rs140052487     1       54353 
rs9701779       1       56537
kgp7727307      1       56962
kgp15297216     1       72391
rs3094315       1       75256
rs3131972       1       75272
kgp6703048      1       75406
kgp22792200     1       75665
kgp15557302     1       75769

File2.txt:

SNP-ID          Chr     Pos     Chip1
rs58108140      1       10583   1
rs189107123     1       10611   2
rs180734498     1       13302   3
rs144762171     1       13327   4
rs201747181     1       13957   5
rs151276478     1       13980   6
rs140337953     1       30923   7
rs199681827     1       46402   8
rs200430748     1       47190   9
rs187298206     1       51476   10
rs116400033     1       51479   11
rs190452223     1       51914   12
rs181754315     1       51935   13
rs185832753     1       51954   14
rs62637813      1       52058   15
rs190291950     1       52144   16
rs201374420     1       52185   17
rs150021059     1       52238   18
rs199502715     1       53234   19
rs140052487     1       54353   20

Desired output:

SNP-ID          Chr     Pos     Chip1   Chip2
rs58108140      1       10583   1   0
rs189107123     1       10611   2   0
rs180734498     1       13302   3   0
rs144762171     1       13327   4   0
rs201747181     1       13957   5   0
rs151276478     1       13980   6   0
rs140337953     1       30923   7   0
rs199681827     1       46402   8   0
rs200430748     1       47190   9   0
rs187298206     1       51476   10  0
rs116400033     1       51479   11  0
rs190452223     1       51914   12  0
rs181754315     1       51935   13  0
rs185832753     1       51954   14  0
rs62637813      1       52058   15  1
rs190291950     1       52144   16  0
rs201374420     1       52185   17  0
rs150021059     1       52238   18  2
rs199502715     1       53234   19  0
rs140052487     1       54353   20  3

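I'm not entirely sure what you mean by "rank", but this numbers the matching positions in the order they appear in File2.txt: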

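awk '
    # First file (File1.txt): remember every Pos value (column 3)
    NR==FNR {pos[$3]=1; next}
    # Second file, header line: append the new column name
    FNR==1 {print $0, "Chip2"; next}
    # Second file, data lines: running rank if Pos is in File1.txt, else 0
    {print $0, ($3 in pos ? ++rank : 0)}
' File1.txt File2.txt | column -t

SNP-ID       Chr  Pos    Chip1  Chip2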
rs58108140   1    10583  1      0
rs189107123  1    10611  2      0
rs180734498  1    13302  3      0
rs144762171  1    13327  4      0
rs201747181  1    13957  5      0
rs151276478  1    13980  6      0
rs140337953  1    30923  7      0
rs199681827  1    46402  8      0
rs200430748  1    47190  9      0
rs187298206  1    51476  10     0
rs116400033  1    51479  11     0
rs190452223  1    51914  12     0
rs181754315  1    51935  13     0
rs185832753  1    51954  14     0
rs62637813   1    52058  15     1
rs190291950  1    52144  16     0
rs201374420  1    52185  17     0
rs150021059  1    52238  18     2
rs199502715  1    53234  19     0
rs140052487  1    54353  20     3
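
If the real files cover more than one chromosome, the same position number can occur on different chromosomes, and keying on Pos alone would then count false matches. Below is a minimal sketch of the same idea, assuming that a match should require both Chr and Pos to agree (the question does not state this). The sample rows are a trimmed copy of the data above, and `snpX` on chromosome 2 is a hypothetical row added only to show the difference.

```shell
#!/bin/sh
# Sketch: match on Chr and Pos together (an assumption, not in the question).
tmpdir=$(mktemp -d)

# Trimmed sample of File1.txt.
printf '%s\n' \
  'SNP-ID Chr Pos' \
  'rs62637813 1 52058' \
  'rs150021059 1 52238' \
  'rs140052487 1 54353' > "$tmpdir/File1.txt"

# Trimmed sample of File2.txt; snpX is hypothetical and reuses position 52058.
printf '%s\n' \
  'SNP-ID Chr Pos Chip1' \
  'rs190452223 1 51914 1' \
  'rs62637813 1 52058 2' \
  'rs150021059 1 52238 3' \
  'snpX 2 52058 4' \
  'rs140052487 1 54353 5' > "$tmpdir/File2.txt"

result=$(awk '
    # First file: remember each (Chr, Pos) pair and skip the header line.
    NR==FNR { if (FNR > 1) seen[$2, $3] = 1; next }
    # Second file, header line: append the new column name.
    FNR==1 { print $0, "Chip2"; next }
    # Second file, data lines: running rank for shared pairs, 0 otherwise.
    { print $0, (($2, $3) in seen ? ++rank : 0) }
' "$tmpdir/File1.txt" "$tmpdir/File2.txt")

printf '%s\n' "$result"
rm -rf "$tmpdir"
```

With these samples the `snpX` row gets 0 in Chip2 even though 52058 is present in File1.txt, because that entry is on chromosome 1. Pipe the result through `column -t` as above if aligned columns are wanted.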