Linux: Merging two txt files on a common column when they have different numbers of rows

linux awk merge


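I want to merge two space-delimited files without first sorting them on the "phenotype" column. File 1 contains the same phenotype many times, whereas file 2 lists each phenotype only once. I need to match the "phenotype" in file 1 to the "Category" in file 2.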

File 1:

chr pos pval_EAS phenotype FDR
1 1902906 0.234 biomarkers-30600-both_sexes-irnt.tsv.gz 1
2 1475898 0.221 biomarkers-30600-both_sexes-irnt.tsv.gz 1
2 568899 0.433 continuous-4566-both_sexes-irnt.tsv.gz 1
2 2435478 0.113 continuous-4566-both_sexes-irnt.tsv.gz 1
4 1223446 0.112 phecode-554-both_sexes-irnt.tsv.gz 0.345
4 3456573 0.0003 phecode-554-both_sexes-irnt.tsv.gz 0.989

File 2:

phenotype Category
biomarkers-30600-both_sexes-irnt.tsv.bgz Metabolic
continuous-4566-both_sexes-irnt.tsv.gz Neoplasms
phecode-554-both_sexes-irnt.tsv.gz Immunological

I tried the following, but did not get the desired output:

awk -F' ' 'FNR==NR{a[$1]=$4; next} {print $0 a[$6]}' file2 file1 > file3

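With the samples you have shown, please try the following. (Your attempt stores $4 from file2, which has only two fields, and looks up $6 in file1, which has only five, so nothing is ever appended to the line.)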

awk 'FNR==NR{arr[$1]=$2;next} ($4 in arr){print $0,arr[$4]}' file2 file1


Explanation: a detailed breakdown of the above.

awk '                ##Start of the awk program.
FNR==NR{             ##True only while the first file on the command line (file2) is being read.
  arr[$1]=$2         ##Store the category ($2) in arr, keyed by the phenotype ($1).
  next               ##Skip the remaining rules for this line.
}
($4 in arr){         ##For file1 lines: check whether the 4th field is a key in arr.
  print $0,arr[$4]   ##Print the current line followed by the category stored for its 4th field.
}
' file2 file1        ##Input files: the lookup file (file2) first, then file1.
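
To see what the command does on the data from the question, here is a small self-contained sketch. It recreates both sample files exactly as they are shown above in a scratch directory (the use of mktemp and the output name file3 are assumptions for illustration) and runs the same one-liner.

```shell
#!/bin/sh
# Work in a throwaway directory so no existing files are overwritten.
dir=$(mktemp -d)
cd "$dir" || exit 1

# file1: many rows per phenotype (copied from the question).
cat > file1 <<'EOF'
chr pos pval_EAS phenotype FDR
1 1902906 0.234 biomarkers-30600-both_sexes-irnt.tsv.gz 1
2 1475898 0.221 biomarkers-30600-both_sexes-irnt.tsv.gz 1
2 568899 0.433 continuous-4566-both_sexes-irnt.tsv.gz 1
2 2435478 0.113 continuous-4566-both_sexes-irnt.tsv.gz 1
4 1223446 0.112 phecode-554-both_sexes-irnt.tsv.gz 0.345
4 3456573 0.0003 phecode-554-both_sexes-irnt.tsv.gz 0.989
EOF

# file2: one row per phenotype (copied from the question, including the .bgz suffix).
cat > file2 <<'EOF'
phenotype Category
biomarkers-30600-both_sexes-irnt.tsv.bgz Metabolic
continuous-4566-both_sexes-irnt.tsv.gz Neoplasms
phecode-554-both_sexes-irnt.tsv.gz Immunological
EOF

# Load file2 into arr, then print only the file1 lines whose 4th field is a key in arr.
awk 'FNR==NR{arr[$1]=$2;next} ($4 in arr){print $0,arr[$4]}' file2 file1 > file3
cat file3
```

Two things are worth noting when this runs. The header line is kept and gains a Category column, because the header of file2 is stored like any other key. The two biomarkers rows are dropped, because in the sample as shown the file2 key ends in .tsv.bgz while file1 has .tsv.gz; if that difference is only a typo in the sample, those rows would match on the real data.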

Bonus solution: to also print the lines that have no match, with N/A in place of the category, use the following:


awk 'FNR==NR{arr[$1]=$2;next} {print $0,(($4 in arr)?arr[$4]:"N/A")}' file2 file1
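
As a quick check of the N/A variant, the following sketch rebuilds the two sample files from the question in a scratch directory (mktemp and the output name file3 are assumptions for illustration) and runs the command on them.

```shell
#!/bin/sh
# Work in a throwaway directory so no existing files are overwritten.
dir=$(mktemp -d)
cd "$dir" || exit 1

# Sample data copied from the question.
cat > file1 <<'EOF'
chr pos pval_EAS phenotype FDR
1 1902906 0.234 biomarkers-30600-both_sexes-irnt.tsv.gz 1
2 1475898 0.221 biomarkers-30600-both_sexes-irnt.tsv.gz 1
2 568899 0.433 continuous-4566-both_sexes-irnt.tsv.gz 1
2 2435478 0.113 continuous-4566-both_sexes-irnt.tsv.gz 1
4 1223446 0.112 phecode-554-both_sexes-irnt.tsv.gz 0.345
4 3456573 0.0003 phecode-554-both_sexes-irnt.tsv.gz 0.989
EOF

cat > file2 <<'EOF'
phenotype Category
biomarkers-30600-both_sexes-irnt.tsv.bgz Metabolic
continuous-4566-both_sexes-irnt.tsv.gz Neoplasms
phecode-554-both_sexes-irnt.tsv.gz Immunological
EOF

# Every file1 line is printed; unmatched phenotypes get N/A instead of a category.
awk 'FNR==NR{arr[$1]=$2;next} {print $0,(($4 in arr)?arr[$4]:"N/A")}' file2 file1 > file3
cat file3
```

Here all seven lines of file1 come through. With the sample exactly as shown, the two biomarkers rows end in N/A, since their phenotype (.tsv.gz) does not equal the file2 key (.tsv.bgz); the remaining rows get Neoplasms or Immunological, and the header gets Category.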