Sum the first column when all other columns are identical (AWK)
I have a file containing the following data:
25 POSIX shell script, ASCII text executable
25 POSIX shell script, ASCII text executable
3 PostScript document text conforming DSC level 3.0, type EPS, Level 2
2 PostScript document text conforming DSC level 3.0, type EPS, Level 2
23 PostScript document text conforming DSC level 3.0, type EPS, Level 2
4 SVG Scalable Vector Graphics image
4 SVG Scalable Vector Graphics image
I want to sum the first field whenever all the other fields are identical, so the output should be:
50 POSIX shell script, ASCII text executable
28 PostScript document text conforming DSC level 3.0, type EPS, Level 2
8 SVG Scalable Vector Graphics image
I tried this awk command:
awk '{ a[$2]+=$1 }END{ for(i in a) print a[i],i }' inputfile
which prints:
25 POSIX
28 PostScript
8 SVG
But I can't find a way to print the rest of the line.
Here is one way:
$ awk '{v=$1;$1="";s[$0]+=v}END{for(i in s)print s[i] i}' file
8 SVG Scalable Vector Graphics image
50 POSIX shell script, ASCII text executable
28 PostScript document text conforming DSC level 3.0, type EPS, Level 2
Explanation:
$ awk '{
    v=$1               # save the value of $1
    $1=""              # empty $1; the record ($0) gets rebuilt
    s[$0]+=v           # sum, indexed on the record without $1
}
END {                  # at the end
    for(i in s)        # loop over all keys
        print s[i] i   # ... and output the sum followed by the key
}' file
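Note that for (i in s) visits the keys in an unspecified order, which is why the output above does not follow the input order. A minimal sketch, assuming the original order matters, that records each key the first time it is seen; sum_by_rest, order and n are hypothetical names chosen for illustration, not part of the answer:

```shell
#!/bin/sh
# Hypothetical helper: reads "<count> <rest of line>" records on stdin.
sum_by_rest() {
    awk '{
        v = $1                 # save the count
        $1 = ""                # blank it; $0 is rebuilt with a leading space
        if (!($0 in s))        # first time this key appears
            order[++n] = $0    # remember its position
        s[$0] += v             # accumulate per key
    }
    END {
        for (k = 1; k <= n; k++)        # replay keys in first-seen order
            print s[order[k]] order[k]  # key already starts with a space
    }'
}

# Usage with a small made-up sample:
printf '%s\n' '25 POSIX shell script' '3 PostScript document' \
              '25 POSIX shell script' '2 PostScript document' | sum_by_rest
```

The extra array costs one entry per distinct key, which is usually negligible next to avoiding a separate sort step.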
Another awk solution, this time combined with sort:
$ sort -k2 sergio.txt | awk '{ t=$1; $1=""; c=$0; if(c==p) { s+=b } else { if(NR>1) print (s+b) p; s=0 } p=c; b=t } END { print (s+b) p }'
50 POSIX shell script, ASCII text executable
28 PostScript document text conforming DSC level 3.0, type EPS, Level 2
8 SVG Scalable Vector Graphics image
$
Input file:
$ cat sergio.txt
25 POSIX shell script, ASCII text executable
25 POSIX shell script, ASCII text executable
3 PostScript document text conforming DSC level 3.0, type EPS, Level 2
2 PostScript document text conforming DSC level 3.0, type EPS, Level 2
23 PostScript document text conforming DSC level 3.0, type EPS, Level 2
4 SVG Scalable Vector Graphics image
4 SVG Scalable Vector Graphics image
$
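$1="" will change the remaining whitespace in the line, I think. If that could cause problems, you could use something like sub(/[0-9]+[ \t]+/,"") instead.
In that sub, is the whitespace between the number and the first letter a space character or a tab?
Those are whitespace characters.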
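A minimal sketch of the sub() idea from the comment above, assuming every line starts with a number followed by at least one space or tab. Because $1 is never assigned, awk does not rebuild the record, so runs of spaces or tabs inside the rest of the line are left as they were. sum_keep_spacing is a hypothetical name, and the regex is anchored with ^ as an extra precaution that the comment did not include:

```shell
#!/bin/sh
# Hypothetical helper: reads "<count><blanks><rest of line>" records on stdin.
sum_keep_spacing() {
    awk '{
        v = $1 + 0                       # numeric value of the first field
        sub(/^[ \t]*[0-9]+[ \t]+/, "")   # strip the count and the blanks after it
        s[$0] += v                       # key is the untouched remainder
    }
    END {
        for (i in s)                     # order of keys is unspecified
            print s[i], i
    }'
}

# Usage: the double space inside the key survives.
printf '4 SVG  Scalable\n4 SVG  Scalable\n' | sum_keep_spacing
```

With the $1="" approach the same two lines would still be summed together, but the key would be printed with its internal spacing collapsed to single spaces.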