Grep: count words and output the result (Grep, Tr, Uniq)

I have the following lines:

123;123;#rss
123;123;#site #design #rss
123;123;#rss
123;123;#rss
123;123;#site #design

I need to count how many times each tag occurs, so I run the following:

grep -Eo '#[a-z].*' ./1.txt | tr ' ' '\n' | uniq -c

That is, I first select only the tags from each line, then split them onto separate lines and count them.

Output:

   1 #rss
   1 #site
   1 #design
   3 #rss
   1 #site
   1 #design

instead of the expected:

   2 #site
   4 #rss
   2 #design

The problem seems to be non-printable characters that make the count incorrect. Or is it something else? Can anyone suggest a correct solution?

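Since uniq only collapses adjacent duplicate lines, sort the tags before counting them. A pipeline with a sort step before uniq -c prints

  2 #design
  4 #rss
  2 #site

as expected.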
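As a minimal sketch of why the sort step matters, the snippet below feeds the same tag stream to uniq -c with and without sorting. The helper names tags, count_unsorted and count_sorted are made up for this illustration, and the trailing awk only strips the column padding that uniq -c adds.

```shell
#!/bin/sh
# tags (hypothetical helper): the sample tags, one per line,
# in the order the question's pipeline emits them.
tags() {
  printf '%s\n' '#rss' '#site' '#design' '#rss' '#rss' '#rss' '#site' '#design'
}

# uniq -c only merges *adjacent* identical lines.
# The awk step normalises the output to "count tag".
count_unsorted() { uniq -c | awk '{ print $1, $2 }'; }

# Sorting first makes identical tags adjacent, so each tag is counted once.
count_sorted() { sort | uniq -c | awk '{ print $1, $2 }'; }

echo 'without sort:'
tags | count_unsorted
echo 'with sort:'
tags | count_sorted
```

Without sort, the run of three consecutive #rss lines is counted separately from the first #rss, which is exactly the split count shown in the question.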

Using awk as an alternative:

awk -F [" "\;] '{ for(i=3;i<=NF;i++) {  map[$i]++ } } END { for (i in map) { print map[i]" "i} }' file

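Set the field separator to a space or a semicolon, then loop from the third field to the last field (NF), adding each field to the map array with the field as the index and an incrementing counter as the value. At the end of file processing, loop over the map array and print each count followed by its index.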

This can be done in a single GNU awk command:

awk -v RS='[a-zA-Z]+' 'RT{++freq[RT]} END{for (i in freq) print freq[i], i}' file

2 site
2 design
4 rss

Or a grep + awk solution:

grep -iEo '[a-z]+' file | awk '{++freq[$1]} END{for (i in freq) print freq[i], i}'

2 site
2 design
4 rss

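Going by the shown samples only, please try the awk program given in the explanation below. It was written and tested in GNU awk.

The output is as follows: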

2 #site
2 #design
4 #rss

Explanation: the same program with a detailed comment on each step.

awk '                                     ##Start of the awk program.
{
  while(match($0,/#[^ ]*/)){              ##Loop while the current line still contains a match for the regex #[^ ]*.
    count[substr($0,RSTART,RLENGTH)]++    ##Use the matched substring as the index of the count array and increase its value by 1.
    $0=substr($0,RSTART+RLENGTH)          ##Keep only the rest of the line after the match as the current line.
  }
}
END{                                      ##Start of the END block of this program.
  for(key in count){                      ##Loop through the count array.
    print count[key],key                  ##Print the count stored under key, followed by the key itself.
  }
}
' Input_file                              ##Input_file name goes here.

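Comments:

uniq requires its input to be sorted; a quick fix is ... | sort | uniq -c. Also, .* matches zero or more of any character, including spaces and non-printing characters. Try '[a-z]+' to restrict the match to lowercase letters.

Please note that -F [" "\;] should be -F'[ ;]'. Your array holds a count rather than providing a mapping, so cnt[] or a similar name would be more useful for it than map[]. Also, print map[i]" "i can be written as print map[i], i, which gives OFS a reason to exist :-).

@KarlsD: Did this work for you?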
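Putting the suggestions from the comments together, the earlier awk answer could be sketched as follows. The function name count_tags, the empty-field check and the trailing sort are assumptions added for this illustration and are not part of any answer above.

```shell
#!/bin/sh
# count_tags (hypothetical helper name): count the tags found in
# fields 3..NF of each input line (files given as arguments, or stdin).
count_tags() {
  # -F'[ ;]' is quoted so the shell passes the bracket expression to awk unchanged.
  # "print cnt[t], t" joins both values with OFS (a single space by default).
  awk -F'[ ;]' '
    { for (i = 3; i <= NF; i++) if ($i != "") cnt[$i]++ }  # skip empty fields
    END { for (t in cnt) print cnt[t], t }
  ' "$@" | sort -k2   # for-in order is unspecified in awk, so sort by tag
}

count_tags <<'EOF'
123;123;#rss
123;123;#site #design #rss
123;123;#rss
123;123;#rss
123;123;#site #design
EOF
```

On the sample lines from the question this prints 2 #design, 4 #rss and 2 #site, one per line.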

$ cut -d';' -f3 file | tr ' ' '\n' | sort | uniq -c
      2 #design
      4 #rss
      2 #site