Awk and grep with filters?


This is the log file I want to filter:

xxxyyy.com/plugins/status.gif?type=videoprogress;status=first;sid=6941c712-ca83-4aa1-a69a-931ca66df655;vid=606829;vrid=61478182;pid=1545;cid=IN;cpid=1545
xxxyyy.com/plugins/status.gif?type=videoprogress;status=mid;sid=6941c712-ca83-4aa1-a69a-931ca66df655;vid=606829;vrid=61478182;pid=1545;cid=US;cpid=1545
xxxyyy.com/plugins/status.gif?type=videoprogress;status=third;sid=6941c712-ca83-4aa1-a69a-931ca66df655;vid=606829;vrid=61478182;pid=1545;cid=US;cpid=1545
xxxyyy.com/plugins/status.gif?type=videoprogress;status=complete;sid=6941c712-ca83-4aa1-a69a-931ca66df655;vid=606829;vrid=61478182;pid=1545;cid=IN;cpid=1545
xxxyyy.com/plugins/status.gif?type=videoothers;status=pause;sid=6941c712-ca83-4aa1-a69a-931ca66df655;vid=606829;vrid=61478182;pid=1545;cid=IN;cpid=1545
xxxyyy.com/plugins/status.gif?type=videoothers;status=mute;sid=6941c712-ca83-4aa1-a69a-931ca66df655;vid=606829;vrid=61478182;pid=1547;cid=IN;cpid=1547
xxxyyy.com/plugins/status.gif?type=videoothers;status=unmute;sid=6941c712-ca83-4aa1-a69a-931ca66df655;vid=606829;vrid=61478182;pid=1545;cid=IN;cpid=1545
xxxyyy.com/plugins/status.gif?type=videoothers;status=error;sid=6941c712-ca83-4aa1-a69a-931ca66df656;vid=606829;vrid=61478182;pid=1546;cid=IN;cpid=1546
I need output like this:

pid  cid cpid Count  
1545 IN  1545   4  
1545 US  1545   2  
1546 IN  1546   1    
1547 IN  1547   1  
Can anyone please help me?

Quick and dirty:

kent$  awk -F';' '{a[$(NF-2) OFS $(NF-1) OFS $NF]++}
                   END{for(x in a)print x, a[x]}' file
pid=1547 cid=IN cpid=1547 1
pid=1545 cid=US cpid=1545 2
pid=1546 cid=IN cpid=1546 1
pid=1545 cid=IN cpid=1545 4

Now you can adjust the output to fit the required format.
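
One possible adjustment, as a minimal sketch rather than the answerer's own code: strip the `name=` prefixes from the key before counting, print the header separately, and pipe the counts through `sort`, because `for (x in a)` visits keys in an unspecified order. The temporary file and the variable names `log`, `key`, `count` and `result` are assumptions made for illustration.

```shell
# Sample input: the eight log lines from the question, written to a temporary file.
log=$(mktemp)
cat > "$log" <<'EOF'
xxxyyy.com/plugins/status.gif?type=videoprogress;status=first;sid=6941c712-ca83-4aa1-a69a-931ca66df655;vid=606829;vrid=61478182;pid=1545;cid=IN;cpid=1545
xxxyyy.com/plugins/status.gif?type=videoprogress;status=mid;sid=6941c712-ca83-4aa1-a69a-931ca66df655;vid=606829;vrid=61478182;pid=1545;cid=US;cpid=1545
xxxyyy.com/plugins/status.gif?type=videoprogress;status=third;sid=6941c712-ca83-4aa1-a69a-931ca66df655;vid=606829;vrid=61478182;pid=1545;cid=US;cpid=1545
xxxyyy.com/plugins/status.gif?type=videoprogress;status=complete;sid=6941c712-ca83-4aa1-a69a-931ca66df655;vid=606829;vrid=61478182;pid=1545;cid=IN;cpid=1545
xxxyyy.com/plugins/status.gif?type=videoothers;status=pause;sid=6941c712-ca83-4aa1-a69a-931ca66df655;vid=606829;vrid=61478182;pid=1545;cid=IN;cpid=1545
xxxyyy.com/plugins/status.gif?type=videoothers;status=mute;sid=6941c712-ca83-4aa1-a69a-931ca66df655;vid=606829;vrid=61478182;pid=1547;cid=IN;cpid=1547
xxxyyy.com/plugins/status.gif?type=videoothers;status=unmute;sid=6941c712-ca83-4aa1-a69a-931ca66df655;vid=606829;vrid=61478182;pid=1545;cid=IN;cpid=1545
xxxyyy.com/plugins/status.gif?type=videoothers;status=error;sid=6941c712-ca83-4aa1-a69a-931ca66df656;vid=606829;vrid=61478182;pid=1546;cid=IN;cpid=1546
EOF

# Count each pid/cid/cpid combination with the "name=" prefixes removed.
result=$(awk -F';' '
{
    key = $(NF-2) OFS $(NF-1) OFS $NF   # e.g. "pid=1545 cid=IN cpid=1545"
    gsub(/[a-z]+=/, "", key)            # becomes "1545 IN 1545"
    count[key]++
}
END {
    for (k in count) print k, count[k]  # iteration order is unspecified
}' "$log" | sort)                       # sort makes the order predictable

# Header first, then the sorted counts.
printf '%s\n' "pid cid cpid Count" "$result"
rm -f "$log"
```

The header is printed outside awk so that `sort` only ever sees the data rows.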

Slightly different from Kent's one:

awk -F';' '{ split($6,pid,"="); split($7,cid,"="); split($8,cpid,"="); n[pid[2] OFS cid[2] OFS cpid[2]]++; } END { print "pid","cid","cpid","count"; for (p in n) { print p,n[p] } }' input.txt
Gives:

pid cid cpid count
1545 IN 1545 4
1545 US 1545 2
1546 IN 1546 1
1547 IN 1547 1
Just the code, with comments:

{ 
  split($6,pid,"="); split($7,cid,"="); split($8,cpid,"="); # Get the numbers from each pair in an array
  n[pid[2] OFS cid[2] OFS cpid[2]]++; # count the tuples from the numbers (create an array with the tuples as key and increment it)
} 
END { 
  print "pid","cid","cpid","count"; # print the header
  for (p in n) { print p,n[p] } # print the key (tuples) and the count of it
}
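
The commented code reads `$6`, `$7` and `$8`, so it relies on `pid`, `cid` and `cpid` always sitting in those positions. If that cannot be guaranteed, a possible variant looks the values up by name instead. This is only a sketch under assumptions: the shortened sample lines (two of them deliberately reordered) and the names `kv`, `eq` and `by_name` are made up for illustration and do not come from the question.

```shell
# Hypothetical, shortened sample: same keys as the question, but in varying order.
log=$(mktemp)
cat > "$log" <<'EOF'
xxxyyy.com/plugins/status.gif?type=videoprogress;status=first;pid=1545;cid=IN;cpid=1545
xxxyyy.com/plugins/status.gif?type=videoprogress;status=mid;cid=US;pid=1545;cpid=1545
xxxyyy.com/plugins/status.gif?type=videoothers;status=mute;cpid=1547;pid=1547;cid=IN
xxxyyy.com/plugins/status.gif?type=videoothers;status=pause;pid=1545;cid=IN;cpid=1545
EOF

by_name=$(awk -F';' '
{
    split("", kv)                            # empty the array (portable idiom)
    for (i = 1; i <= NF; i++) {
        eq = index($i, "=")                  # position of the first "=" in the field
        if (eq) kv[substr($i, 1, eq - 1)] = substr($i, eq + 1)
    }
    n[kv["pid"] OFS kv["cid"] OFS kv["cpid"]]++   # count the tuple, whatever the field order
}
END {
    for (p in n) print p, n[p]
}' "$log" | sort)

printf '%s\n' "pid cid cpid count" "$by_name"
rm -f "$log"
```

The first field also contains an `=`, so it is stored under a long, unused key; that is harmless here because only `pid`, `cid` and `cpid` are read back.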

Another awk way, similar to the others:

awk -F';' '{for(i=0;i<3;i++){split($(NF-i),a,"=");x=a[2]" "x;NR==1&&y=a[1]" "y}
            b[x]++;x=z}END{print y "count";for(i in b)print i b[i]}' file

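On what basis do you need to increment the count? Based on pid/cid/cpid/all? By the way, I didn't downvote.

Nice one. I didn't take the time to factor out the header and the split calls, though it may be complex for a newcomer to understand. As far as I understand, the reason for using a space instead of OFS is that the assignment can't use it as is and would need a call to sprintf? (Not sure about that last point.)

@Tensibai Thanks :) No, you can use OFS if you want, it's just personal preference. As long as the separator doesn't appear in the text, you can use whatever you like :)

Thanks for the answer. I'm used to OFS, so when a space is too narrow, adding -vOFS="\t" on the command line is very handy for columnar output :)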
pid cid cpid count
1547 IN 1547 1
1545 IN 1545 4
1546 IN 1546 1
1545 US 1545 2
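
The OFS remark in the comments can be illustrated with a minimal sketch. It assumes the key is joined with `OFS` (as in the `split` answer above) rather than a literal space, so that `-v OFS='\t'` changes both the key and the `print` output. The three shortened input lines and the variable name `tabbed` are made up for illustration.

```shell
# Three hypothetical, shortened log lines fed to awk on standard input.
tabbed=$(printf '%s\n' \
    'type=videoprogress;status=first;sid=a;vid=1;vrid=2;pid=1545;cid=IN;cpid=1545' \
    'type=videoprogress;status=mid;sid=a;vid=1;vrid=2;pid=1545;cid=US;cpid=1545' \
    'type=videoothers;status=pause;sid=a;vid=1;vrid=2;pid=1545;cid=IN;cpid=1545' |
awk -F';' -v OFS='\t' '
{
    split($6, pid, "="); split($7, cid, "="); split($8, cpid, "=")
    n[pid[2] OFS cid[2] OFS cpid[2]]++      # key parts are now joined with tabs
}
END {
    print "pid", "cid", "cpid", "count"     # print puts OFS between its arguments
    for (p in n) print p, n[p]              # row order is unspecified
}')

printf '%s\n' "$tabbed"
```

With a space hard-coded in the key, as in the last answer, changing `OFS` would have no effect on the columns inside the key.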