Sum the first column if all other columns are the same (AWK)


I have a file containing the following data:

25  POSIX shell script, ASCII text executable
25  POSIX shell script, ASCII text executable
3   PostScript document text conforming DSC level 3.0, type EPS, Level 2
2   PostScript document text conforming DSC level 3.0, type EPS, Level 2
23  PostScript document text conforming DSC level 3.0, type EPS, Level 2
4   SVG Scalable Vector Graphics image
4   SVG Scalable Vector Graphics image
I want to sum the first field whenever all the other fields are identical, so the output should be:

50  POSIX shell script, ASCII text executable
28  PostScript document text conforming DSC level 3.0, type EPS, Level 2
8   SVG Scalable Vector Graphics image
I tried this awk command:

awk '{ a[$2]+=$1 }END{ for(i in a) print a[i],i }' inputfile
which prints:

25 POSIX
28 PostScript
8 SVG
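but I can't find a way to print the rest of the line. (The array is keyed on $2 alone, so only the second field survives.)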

Here is one way:

$ awk '{v=$1;$1="";s[$0]+=v}END{for(i in s)print s[i] i}' file
8 SVG Scalable Vector Graphics image
50 POSIX shell script, ASCII text executable
28 PostScript document text conforming DSC level 3.0, type EPS, Level 2
Explanation:

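$ awk '{
    v=$1              # save the count held in $1
    $1=""             # empty $1; awk rebuilds the record ($0)
    s[$0]+=v          # sum, indexed on the record without $1
}
END {                 # after the last line
    for(i in s)       # loop over all keys (order is unspecified)
        print s[i] i  # ... and output the sum plus the rest of the line
}' file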
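
Because for(i in s) visits keys in an unspecified order, the result above does not follow the input order. If that matters, a minimal sketch along the following lines may help. It assumes the same input layout; the function name sum_by_rest and the arrays order and n are hypothetical names introduced here for illustration, not part of the original answer.

```shell
# Hypothetical helper: sum column 1 per distinct "rest of line",
# printing groups in the order they first appear in the input.
sum_by_rest() {
    awk '{
        v = $1                    # remember the count
        $1 = ""                   # blank field 1; $0 is rebuilt with single spaces
        if (!($0 in s))           # first time this remainder is seen
            order[++n] = $0       # record its position
        s[$0] += v                # accumulate the count
    }
    END {
        for (k = 1; k <= n; k++)  # replay keys in first-seen order
            print s[order[k]] order[k]
    }' "$@"
}
```

It can be called as sum_by_rest file, or fed from a pipe with no argument. The rebuilt $0 starts with a single space, which is why the sum and the key are printed without a separator.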

Another awk solution, this time combined with "sort":

$  sort -b -k2 sergio.txt | awk  ' { t=$1; $1=""; c=$0;if(c==p) { s+=b} else { if(NR>1) print s+b,p; s=0} p=c;b=t} END { print s+b,p } '
50  POSIX shell script, ASCII text executable
28  PostScript document text conforming DSC level 3.0, type EPS, Level 2
8  SVG Scalable Vector Graphics image
$
Input file:

$ cat sergio.txt
25  POSIX shell script, ASCII text executable
25  POSIX shell script, ASCII text executable
3   PostScript document text conforming DSC level 3.0, type EPS, Level 2
2   PostScript document text conforming DSC level 3.0, type EPS, Level 2
23  PostScript document text conforming DSC level 3.0, type EPS, Level 2
4   SVG Scalable Vector Graphics image
4   SVG Scalable Vector Graphics image
$

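$1="" will change the whitespace in the rest of the line, I think. If that could cause problems, you could use something like sub(/[0-9]+[ \t]+/, "") instead.
Is the gap between the number and the first letter made of spaces or tabs? Those are space characters.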
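
The sub() idea from the comment above could be sketched roughly as follows. It assumes every line starts with an integer followed by spaces or tabs; the function name sum_keep_spacing and the variable rest are hypothetical, and the regex is anchored with ^ here so that only the leading number is removed.

```shell
# Hypothetical helper: sum column 1 per distinct remainder,
# leaving the whitespace inside the remainder untouched.
sum_keep_spacing() {
    awk '{
        v = $1                                  # count in the first field
        rest = $0                               # work on a copy of the line
        sub(/^[ \t]*[0-9]+[ \t]+/, "", rest)    # drop the leading number and the blanks after it
        s[rest] += v                            # accumulate per remainder
    }
    END {
        for (r in s)                            # key order is unspecified
            print s[r], r
    }' "$@"
}
```

Since $0 is never modified, awk does not rebuild the record, so runs of spaces inside the description are kept as they appear in the input.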