Grep: counting words and the output


I have the following lines:

123;123;#rss
123;123;#site #design #rss
123;123;#rss
123;123;#rss
123;123;#site #design
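I need to count how many times each tag occurs. I do the following:

grep -Eo '#[a-z].*' ./1.txt | tr ' ' '\n' | uniq -c

i.e. first select only the tags from each line, then split them apart and count them.

Output: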

   1 #rss
   1 #site
   1 #design
   3 #rss
   1 #site
   1 #design
instead of the expected:

   2 #site
   4 #rss
   2 #design
The problem seems to be non-printable characters, which make the count incorrect. Or is it something else? Can anyone suggest a correct solution?

Prints

  2 #design
  4 #rss
  2 #site

as expected.
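The command that belongs to this answer is not shown above. As a minimal sketch, assuming the five sample lines are stored in a file (a temporary file is created here purely for illustration), one pipeline that yields the same sorted counts is `grep -o` followed by `sort` and `uniq -c`:

```shell
# Recreate the sample input in a temporary file (hypothetical location).
input=$(mktemp)
printf '%s\n' \
  '123;123;#rss' \
  '123;123;#site #design #rss' \
  '123;123;#rss' \
  '123;123;#rss' \
  '123;123;#site #design' > "$input"

# grep -o prints every match on its own line, so no tr step is needed.
# sort brings identical tags together, which uniq -c needs in order to count them.
counts=$(grep -o '#[a-z]*' "$input" | sort | uniq -c)
printf '%s\n' "$counts"

rm -f "$input"
```

The counts are left-padded by `uniq -c`; the padding width may differ between implementations.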

Using awk as an alternative:

awk -F [" "\;] '{ for(i=3;i<=NF;i++) {  map[$i]++ } } END { for (i in map) { print map[i]" "i} }' file

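Set the field separator to a space or a semicolon, then loop from the third field to the last field (NF), adding each field to the array map with the field as the index and an incrementing counter as the value. At the end of file processing, loop over the map array and print each count followed by its tag.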
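The field loop described above can be tried on the sample data as follows. This is only a sketch: the temporary file, the quoted separator `-F'[ ;]'`, the array name `cnt`, and the final `sort -k2` (added because awk's `for (... in ...)` order is unspecified) are choices made for this illustration, not part of the answer.

```shell
# Recreate the sample input in a temporary file (hypothetical location).
input=$(mktemp)
printf '%s\n' \
  '123;123;#rss' \
  '123;123;#site #design #rss' \
  '123;123;#rss' \
  '123;123;#rss' \
  '123;123;#site #design' > "$input"

# Fields 1 and 2 are the numeric columns; fields 3..NF are the tags.
# cnt[tag] is incremented once per occurrence, then dumped in the END block.
counts=$(awk -F'[ ;]' '
  { for (i = 3; i <= NF; i++) cnt[$i]++ }
  END { for (tag in cnt) print cnt[tag], tag }
' "$input" | sort -k2)
printf '%s\n' "$counts"

rm -f "$input"
```

Because awk keeps the counts in memory, the input does not need to be sorted first; sorting here only makes the output order predictable.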

This can be done in a single GNU awk command:

awk -v RS='#[a-zA-Z]+' 'RT{++freq[RT]} END{for (i in freq) print freq[i], i}' file
2 #site
2 #design
4 #rss

Or with a grep + awk solution:

grep -iEo '#[a-z]+' file | awk '{++freq[$1]} END{for (i in freq) print freq[i], i}'
2 #site
2 #design
4 #rss
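Based only on the samples shown, please try the awk program given below with its explanation. Written and tested in GNU awk.

The output will be as follows:

2 #site
2 #design
4 #rss
Explanation: a detailed explanation of the above follows.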

awk '                                     ##Start the awk program here.
{
  while(match($0,/#[^ ]*/)){              ##Loop while the regex #[^ ]* still matches in the current line (stops cleanly when no tag is left).
    count[substr($0,RSTART,RLENGTH)]++    ##Use the matched substring as the index of array count and increase its value by 1.
    $0=substr($0,RSTART+RLENGTH)          ##Put the rest of the line after the match back into the current line.
  }
}
END{                                      ##Start the END block of this program here.
  for(key in count){                      ##Use a for loop to go through count.
    print count[key],key                  ##Print the value of count for this key, followed by the key itself.
  }
}
' Input_file                              ##Mention the Input_file name here.

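uniq requires its input to be sorted; a quick fix is ... | sort | uniq -c. Also, .* matches zero or more of any character, including spaces and non-printing characters; try '[a-z]+' to restrict the match to lowercase letters.

Note that -F [" "\;] should be -F'[ ;]'. Your array holds a count rather than providing a mapping, so cnt[] or a similar name would be more useful for it than map[]. Also, print map[i]" "i = print map[i], i, and I have given OFS a reason to live :-).

@KarlsD: Did this work?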
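The first comment's point can be checked directly. The sketch below feeds the tags in the order they appear in the sample input (written out by hand here as an assumption, rather than extracted from the file) to `uniq -c` with and without a preceding `sort`:

```shell
# Tags in input order, one per line, as the grep | tr stage would emit them.
tags=$(printf '%s\n' '#rss' '#site' '#design' '#rss' '#rss' '#rss' '#site' '#design')

# uniq only merges ADJACENT equal lines, so unsorted input yields six groups.
unsorted=$(printf '%s\n' "$tags" | uniq -c)

# Sorting first makes equal tags adjacent, giving one group per tag.
sorted=$(printf '%s\n' "$tags" | sort | uniq -c)

printf 'without sort:\n%s\n' "$unsorted"
printf 'with sort:\n%s\n' "$sorted"
```

The unsorted run reproduces the six-line output from the question, which suggests that ordering, not non-printable characters, explains the wrong counts.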
$ cut -d';' -f3 file | tr ' ' '\n' | sort | uniq -c
      2 #design
      4 #rss
      2 #site