Grep word count and its output
I have the following lines:
123;123;#rss
123;123;#site #design #rss
123;123;#rss
123;123;#rss
123;123;#site #design
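I need to count how many times each tag occurs, so I do the following:
grep -Eo '#[a-z].*' /1.txt | tr ' ' '\n' | uniq -c
i.e. first select only the tags from each line, then split them onto separate lines and count them.
Output: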
1 #rss
1 #site
1 #design
3 #rss
1 #site
1 #design
Instead of the expected:
2 #site
4 #rss
2 #design
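The problem seems to be with non-printable characters, which makes the count incorrect. Or is it something else? Can anyone suggest a correct solution?

2 #design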
4 #rss
2 #site
as expected. Using awk as an alternative:
awk -F [" "\;] '{ for(i=3;i<=NF;i++) { map[$i]++ } } END { for (i in map) { print map[i]" "i} }' file
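Set the field separator to a space or a semicolon, then loop from the third field to the last field (NF), adding each field to the array map with the field as the index and an incrementing counter as the value. At the end of file processing, loop over the map array and print each value/index pair.

This can be done in a single GNU awk command:
awk -v RS='[a-zA-Z]+' 'RT{++freq[RT]} END{for (i in freq) print freq[i], i}' file
2 site
2 design
4 rss
Or a grep+awk solution:
grep -iEo '[a-z]+' file | awk '{++freq[$1]} END{for (i in freq) print freq[i], i}'
2 site
2 design
4 rss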
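One detail worth keeping in mind with the awk answers above: awk does not guarantee the order in which for (i in array) visits the keys, so the result lines can come out in a different order depending on the awk implementation. A minimal sketch, assuming the sample from the question is saved in a scratch file (the file is created with mktemp here purely for the demo, and the array name cnt is just an illustration), that pipes the counts through sort to get a stable order:

```shell
# Recreate the sample input from the question in a temporary file.
sample=$(mktemp)
printf '%s\n' \
  '123;123;#rss' \
  '123;123;#site #design #rss' \
  '123;123;#rss' \
  '123;123;#rss' \
  '123;123;#site #design' > "$sample"

# Split on spaces and semicolons, count every field from the third onwards,
# then sort on the tag (second column) so the order is deterministic.
sorted=$(awk -F '[ ;]' '
  { for (i = 3; i <= NF; i++) cnt[$i]++ }
  END { for (tag in cnt) print cnt[tag], tag }
' "$sample" | sort -k2)

printf '%s\n' "$sorted"
rm -f "$sample"
```

With the sample data this should print 2 #design, 4 #rss and 2 #site, one per line, regardless of the order awk iterates the array in.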
With only the shown samples, please try the following. Written and tested in GNU awk. The output is as follows:
2 #site
2 #design
4 #rss
Explanation: adding a detailed explanation of the above.
awk ' ##Start the awk program.
{
while(match($0,/#[^ ]*/)){ ##Loop while the current line still contains a match for the regex #[^ ]*; this also stops cleanly when no tag is left.
count[substr($0,RSTART,RLENGTH)]++ ##Use the matched substring as the index of the count array and increase its value by 1.
$0=substr($0,RSTART+RLENGTH) ##Put the rest of the line after the match back into the current line.
}
}
END{ ##Start the END block of the program.
for(key in count){ ##Loop through the count array.
print count[key],key ##Print the count followed by its key (the tag).
}
}
' Input_file ##Input_file is the name of the input file.
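uniq requires its input to be sorted; a quick fix is ... | sort | uniq -c. Also, .* means match zero or more characters, including spaces and non-printing characters... try '[a-z]+' to restrict the match to lowercase letters.
Please note that -F [" "\;] should be -F '[ ;]'. Your array holds a count rather than providing a mapping, so cnt[] or a similar name would be more useful for it than map[]. Also, print map[i]" "i = print map[i], i; let OFS have a reason to exist :-).
@KarlsD: Does this work?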
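A minimal sketch of the point made in the first comment, assuming a grep that supports -o (GNU and BSD grep both do) and a scratch file created with mktemp just for the demo. uniq -c only merges adjacent identical lines, so the tags have to be sorted before they are counted:

```shell
# Recreate the sample input from the question in a temporary file.
sample=$(mktemp)
printf '%s\n' \
  '123;123;#rss' \
  '123;123;#site #design #rss' \
  '123;123;#rss' \
  '123;123;#rss' \
  '123;123;#site #design' > "$sample"

# -o prints every match on its own line, so no tr step is needed here.
# sort places identical tags next to each other before uniq -c counts them.
# The trailing awk only strips the padding that uniq -c puts before the count.
counts=$(grep -Eo '#[a-z]+' "$sample" | sort | uniq -c | awk '{print $1, $2}')

printf '%s\n' "$counts"
rm -f "$sample"
```

With the sample data this should print 2 #design, 4 #rss and 2 #site, one per line. Dropping the sort step reproduces the split counts shown in the question, because the #rss lines are then no longer all adjacent.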
$ cut -d';' -f3 file | tr ' ' '\n' | sort | uniq -c
2 #design
4 #rss
2 #site