bash - Append a column if two columns match
I am trying to add a column when two fields of a row have already been seen elsewhere in the same file. I have a comma-separated file with a large number of entries, and I need to find all rows that match on two columns (the second and the seventh). If both values are found together in more than one row, an eighth column containing "shared" should be added. File contents:
WPC PROD LINUX O,1808,4194304000,10,3G,4G,66314
WPC PROD LINUX O,1809,3145728000,10,3G,4G,66314
WPC PROD LINUX O,1812,4194304000,10,3G,4G,66314
WPC PROD LINUX,1808,4194304000,10,1D,2D,66314
WPC PROD LINUX,1809,3145728000,10,1D,2D,66314
WPC PROD LINUX,1812,4194304000,10,1D,2D,66314
WPCESXCS40BP01_0,1808,4194304000,10,1D,2D,66314
WPCESXCS40BP01_0,1809,3145728000,10,1D,2D,66314
WPCESXCS40BP01_0,1812,4194304000,10,1D,2D,66314
Desired output:
WPC PROD LINUX O,1808,4194304000,10,3G,4G,66314,shared
WPC PROD LINUX O,1809,3145728000,10,3G,4G,66314,shared
WPC PROD LINUX O,1812,4194304000,10,3G,4G,66314,shared
WPC PROD LINUX,1808,4194304000,10,1D,2D,66314,shared
WPC PROD LINUX,1809,3145728000,10,1D,2D,66314,shared
WPC PROD LINUX,1812,4194304000,10,1D,2D,66314,shared
WPCESXCS40BP01_0,1808,4194304000,10,1D,2D,66314,shared
WPCESXCS40BP01_0,1809,3145728000,10,1D,2D,66314,shared
WPCESXCS40BP01_0,1812,4194304000,10,1D,2D,66314,shared
I searched and found this link, but it does not quite do what I need; it only matches against the following line.
I can do it like this:
while IFS=',' read -r host device blk poolnum porta portb serial
do
    # Count the rows that mention both this device and this serial (whole-word, case-insensitive).
    ldev_count=$(cat outputtest.txt | grep -iw "$device" | grep -iw "$serial" | wc -l)
    if [[ $ldev_count -gt 1 ]]; then
        echo "$host, $device, $blk, $poolnum, $porta, $portb, $serial, SHARED" >> semifinal.txt
    else
        echo "$host, $device, $blk, $poolnum, $porta, $portb, $serial" >> semifinal.txt
    fi
done < outputtest.txt
But it is very slow. I am hoping to find a better solution.
Thanks for your help.
Could you please try the following and let me know if it helps.
awk -F, 'FNR==NR{a[$2,$7]++;next} a[$2,$7]>1{print $0",shared"}' Input_file Input_file
The output will be as follows:
WPC PROD LINUX O,1808,4194304000,10,3G,4G,66314,shared
WPC PROD LINUX O,1809,3145728000,10,3G,4G,66314,shared
WPC PROD LINUX O,1812,4194304000,10,3G,4G,66314,shared
WPC PROD LINUX,1808,4194304000,10,1D,2D,66314,shared
WPC PROD LINUX,1809,3145728000,10,1D,2D,66314,shared
WPC PROD LINUX,1812,4194304000,10,1D,2D,66314,shared
WPCESXCS40BP01_0,1808,4194304000,10,1D,2D,66314,shared
WPCESXCS40BP01_0,1809,3145728000,10,1D,2D,66314,shared
WPCESXCS40BP01_0,1812,4194304000,10,1D,2D,66314,shared
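EDIT: If you want to print the matching lines with the string "shared" appended and also print the non-matching lines unchanged, the following may help: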
awk -F, ' ##Creating field delimiter as comma.
FNR==NR{ ##FNR==NR is a condition which will be TRUE when first Input_file is being read.
a[$2,$7]++; ##creating an array named a whose index is $2,$7(second and 7th field) and incrementing its value with 1 each time same elements come.
next ##Using next keyword will skip all further statements.
}
a[$2,$7]>1{ ##This condition will be TRUE only when 2nd Input_file is being read, check if array a value in index of $2,$7 is greater than 1.
print $0",shared" ##Printing the current line with keyword shared at last of line.
next;
}
1
' Input_file Input_file ##Mentioning the Input_file twice here.
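As a quick check of the two-pass version above, here is a minimal sketch run against a small made-up sample. The file contents and the `sample`, `count`, and `result` names are assumptions for illustration, not data from the question. The fourth row has a field 2 / field 7 pair that occurs only once, so it should pass through unchanged.

```shell
# Hypothetical sample: rows 1-3 share field 2 (1808) and field 7 (66314); row 4 is unique.
sample=$(mktemp)
cat > "$sample" <<'EOF'
HOST A,1808,4194304000,10,3G,4G,66314
HOST B,1808,4194304000,10,1D,2D,66314
HOST C,1808,4194304000,10,1D,2D,66314
HOST D,1900,1048576000,10,1D,2D,66314
EOF

# Pass 1 (FNR==NR) counts each (field 2, field 7) pair.
# Pass 2 appends ",shared" to rows whose pair was seen more than once
# and prints every other row unchanged via the bare 1 pattern.
result=$(awk -F, '
FNR==NR { count[$2,$7]++; next }
count[$2,$7] > 1 { print $0 ",shared"; next }
1
' "$sample" "$sample")

printf '%s\n' "$result"
rm -f "$sample"
```

Because the file is read twice, this approach needs a real file name; it would not work as written on input piped through stdin.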
You may want:
awk -F\, 'NR==FNR{a[$2]++;b[$7]++;next}
a[$2]>1 && b[$7]>1{$(NF+1)="shared"}1' OFS=',' file file
Result:
WPC PROD LINUX O,1808,4194304000,10,3G,4G,66314,shared
WPC PROD LINUX O,1809,3145728000,10,3G,4G,66314,shared
WPC PROD LINUX O,1812,4194304000,10,3G,4G,66314,shared
WPC PROD LINUX,1808,4194304000,10,1D,2D,66314,shared
WPC PROD LINUX,1809,3145728000,10,1D,2D,66314,shared
WPC PROD LINUX,1812,4194304000,10,1D,2D,66314,shared
WPCESXCS40BP01_0,1808,4194304000,10,1D,2D,66314,shared
WPCESXCS40BP01_0,1809,3145728000,10,1D,2D,66314,shared
WPCESXCS40BP01_0,1812,4194304000,10,1D,2D,66314,shared
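Explanation
We iterate over the file twice:
First: NR==FNR{a[$2]++;b[$7]++;next}
We count how many times each value occurs in column 2 and in column 7, and store the counts in the a and b arrays.
Second: a[$2]>1 && b[$7]>1{$(NF+1)="shared"}1
To select the rows that meet the expected repetition count, the count for both columns must be greater than 1; only then is the new final column added: $(NF+1)="shared"
Note: the trailing 1 is just a shortcut that avoids writing an explicit print statement.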
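One point that may be worth checking before choosing between the answers: counting `$2` and `$7` in separate arrays is not the same test as counting the pair `$2,$7`. The sketch below uses a made-up three-row file (an assumption, not the question's data) in which 1808 and 111 each occur twice but never together in more than one row. The `sample`, `separate`, `paired`, and `pair` names are hypothetical.

```shell
# Hypothetical sample: 1808 appears twice in field 2 and 111 twice in field 7,
# but no (field 2, field 7) pair is repeated.
sample=$(mktemp)
cat > "$sample" <<'EOF'
H1,1808,0,10,1D,2D,111
H2,1808,0,10,1D,2D,222
H3,1809,0,10,1D,2D,111
EOF

# Separate counters: a row is tagged when each of its two values is repeated somewhere.
separate=$(awk -F, -v OFS=, '
NR==FNR { a[$2]++; b[$7]++; next }
a[$2] > 1 && b[$7] > 1 { $(NF+1) = "shared" }
1
' "$sample" "$sample")

# Pair counter: a row is tagged only when the same pair occurs in more than one row.
paired=$(awk -F, -v OFS=, '
NR==FNR { pair[$2,$7]++; next }
pair[$2,$7] > 1 { $(NF+1) = "shared" }
1
' "$sample" "$sample")

printf 'separate counters:\n%s\n\npair counter:\n%s\n' "$separate" "$paired"
rm -f "$sample"
```

With this sample the separate-counter version tags the H1 row even though no other row has both 1808 and 111, while the pair-counter version tags nothing. On the question's data both give the same result, so which behaviour is wanted depends on whether "shared" is meant per pair or per column.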
Comments:
Could you please highlight columns 2 and 7 here? There may be some confusion, because I cannot see that these two columns are the same in your question.
Edited the formatting for readability.
OK, so you are saying that if columns 2 and 7 (for example 1808 and 66314) are shared between any two rows, you want to append "shared" to the end of both rows?
Exactly, thank you!
This is exactly what I asked for. Is there a way to also print the non-matching lines?
@LukeFowler, please check my edited solution and let me know if it helps.
This is perfect! Many thanks!