bash: capture lines that share the same first n characters

bash awk grep


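I want to find lines that start with the same first n characters and output only one line from each such group, no matter what follows those first n characters. If a line has fewer than n characters, it should be sent to the output as is.

I tried using grep to capture a fixed number of leading characters, but it strips the rest of each line:

cat myfile.txt | grep -o -P '^.{0,41}'

or

cat myfile.txt | grep -o -P '.{0,0}http.{0,41}'

Here is a file in which I want to find the lines whose first 41 characters are identical and show only one of them: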

https://example.com/first/second/blahblah/?alsda=asldfaalafowiorie
https://example.com/first/second/blahblah/?oriwo=asldkjalkdjf2kasd
https://example.com/first/second/blahblah/some/more/dir
https://example.com/another/one
https://example.com/third/fourth/something/?cldl=aosijfoiret
https://example.com/third/fourth/something/?cldl=5145652
https://example.com/third/fourth/something/?hfdg=156569&wuew=8428
https://example.com/first/second/blahblah/

Desired output:

https://example.com/first/second/blahblah/?alsda=asldfaalafowiorie
https://example.com/another/one
https://example.com/third/fourth/something/?cldl=aosijfoiret

Thanks.

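It's the usual sort and uniq pair (the -w option is specific to GNU uniq):

sort file | uniq -w40

You may want to use something like

sort -s -k1.1,1.40 file

so that the sort is stable and lines with the same prefix keep their original relative order.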
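
As a rough sketch of how that pair behaves on the sample from the question: this assumes GNU sort and uniq, uses a width of 41 to match the question, and writes to a scratch file named sample.txt that is made up for this illustration. Sorting only on the first 41 characters with -s keeps equal-prefix lines in input order, so uniq keeps the first one of each group. The result comes out in sorted order, not in the order of the file.

```shell
# Hypothetical scratch file holding the sample lines from the question.
cat > sample.txt <<'EOF'
https://example.com/first/second/blahblah/?alsda=asldfaalafowiorie
https://example.com/first/second/blahblah/?oriwo=asldkjalkdjf2kasd
https://example.com/first/second/blahblah/some/more/dir
https://example.com/another/one
https://example.com/third/fourth/something/?cldl=aosijfoiret
https://example.com/third/fourth/something/?cldl=5145652
https://example.com/third/fourth/something/?hfdg=156569&wuew=8428
https://example.com/first/second/blahblah/
EOF

# Stable sort keyed on characters 1..41 only, then keep the first line of
# each run whose first 41 characters match (uniq -w is a GNU extension).
# LC_ALL=C makes the byte-wise ordering predictable.
result=$(LC_ALL=C sort -s -k1.1,1.41 sample.txt | LC_ALL=C uniq -w41)
printf '%s\n' "$result"
```

Lines shorter than 41 characters are compared on their whole text, so they pass through unchanged.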

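"Output only one of them, no matter what follows the first n characters. If a line has fewer than n characters, send it to the output as is."
Beyond the above, there is always the all-purpose awk: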
awk -v N=41 '
   # Remember the first line seen for each N-character prefix
   length($0) >= N { i = substr($0,1,N); if (!(i in a)) a[i] = $0 }
   # Print lines shorter than N characters right away, unchanged
   length($0) < N {print}
   # Print the remembered lines (for-in order is unspecified in awk)
   END{ for (i in a) print a[i] } ' file

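"If a line has fewer than n characters, send it to the output as is", even if it is repeated? It would be nice if there were a way to make those unique too.
It doesn't work that way.
I hadn't seen your update, let me try it.
It works the way I asked, but @rowboat's solution is simpler and shorter. Thanks, it works well.
Thank you. What resources would you recommend for learning awk? Any particular sites or books?
sort -uk1,1.41 file. Sorry, I don't know about resources. Asking here is good. unix.stackexchange has a lot about awk (e.g. + linked questions)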
awk '!seen[substr($0,1,41)]++' file
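
The one-liner above relies on a common awk idiom: seen[key]++ evaluates to 0 the first time a key appears, so !seen[key]++ is true only for the first line with a given prefix, and awk's default action prints that line. Below is a minimal sketch that takes the prefix length as a parameter; the function name first_by_prefix and the variable n are made up for this illustration and do not come from the thread.

```shell
# first_by_prefix N [FILE...]: print the first line seen for each distinct
# N-character prefix, in input order. A line shorter than N is keyed by its
# whole text, so a repeated short line is printed only once.
first_by_prefix() {
    n=$1
    shift
    awk -v n="$n" '!seen[substr($0, 1, n)]++' "$@"
}

# Usage with the sample from the question, read from standard input.
result=$(first_by_prefix 41 <<'EOF'
https://example.com/first/second/blahblah/?alsda=asldfaalafowiorie
https://example.com/first/second/blahblah/?oriwo=asldkjalkdjf2kasd
https://example.com/first/second/blahblah/some/more/dir
https://example.com/another/one
https://example.com/third/fourth/something/?cldl=aosijfoiret
https://example.com/third/fourth/something/?cldl=5145652
https://example.com/third/fourth/something/?hfdg=156569&wuew=8428
https://example.com/first/second/blahblah/
EOF
)
printf '%s\n' "$result"
```

Unlike the sort and uniq approach, this keeps the lines in their original order and needs no sorting, at the cost of holding one array entry per distinct prefix in memory.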