Bash awk: skip subdomain lines if the domain has already matched


Assume there is an already sorted list of domains, such as:

tld.aa.
tld.aa.do.notshowup.0
tld.aa.do.notshowup.0.1
tld.aa.do.notshowup.0.1.1
tld.aa.do.notshowup.too
tld.bb.showup
tld.aaaaa.showup
tld.xxxxx.
tld.xxxxx.donotshowup
tld.yougettheidea.dontyou
tld.yougettheidea.dontyou.thankyou

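which is later used as a blacklist.

As per a specific requirement, every line with a trailing "." indicates that none of the deeper subdomains of that domain should show up in the blacklist. So the desired output for the example above would be: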

tld.aa.
tld.bb.showup
tld.aaaaa.showup
tld.xxxxx.
tld.yougettheidea.dontyou
tld.yougettheidea.dontyou.thankyou

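I currently run this in a loop (pure bash, with heavy use of bash builtins to speed things up), but as the list grows it now takes quite a long time to process about 562k entries.

Shouldn't this be easy to do with awk (or maybe sed)? Any help is really appreciated. (I have already tried a few things in awk, but somehow could not get it to print what I want.)

Thanks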

If the lines ending in "." always come before the lines to ignore, then this awk should do:

$ awk '{for (i in a) if (index($0,i) == 1) next}/\.$/{a[$0]=1}1' file
tld.aa.
tld.bb.showup
tld.aaaaa.showup
tld.xxxxx.
tld.yougettheidea.dontyou
tld.yougettheidea.dontyou.thankyou

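```
/\.$/{a[$0]=1}
```
adds each line with a trailing dot to the array as an index.
```
{for (i in a) if (index($0,i) == 1) next}
```
checks whether any stored index is a prefix of the current line (index() returning 1 means the match starts at the first character); if one is, next skips all further processing of that line, so it is never printed.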
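For readability, the same logic can be spread over several lines with comments. This is only a sketch of the one-liner above: the function name filter_subdomains and the variable names parent and seen are made up for illustration, and the three input lines are taken from the question's sample.

```shell
# Hypothetical wrapper name; reads the domain list on stdin.
filter_subdomains() {
  awk '
    # Skip the current line if any stored parent domain is a prefix of it.
    {
      for (parent in seen)
        if (index($0, parent) == 1)
          next
    }
    # Remember every line ending in a dot as a parent for later lines.
    /\.$/ { seen[$0] = 1 }
    # Print whatever was not skipped.
    { print }
  '
}

printf '%s\n' tld.aa. tld.aa.do.notshowup.0 tld.aaaaa.showup | filter_subdomains
```

Note that tld.aaaaa.showup survives: the stored key is tld.aa. including the trailing dot, so it is not a prefix of tld.aaaaa.showup. Every input line is compared against every stored parent, so the work grows with the number of dotted lines.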

If the file is sorted alphabetically and no subdomain ends with a dot, you do not even need an array, as @Corentin Limier suggested:

awk 'a{if (index($0,a) == 1) next}/\.$/{a=$0}1' file
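A commented sketch of this single-variable variant follows, run against the full sample from the question. The function name filter_sorted_subdomains and the variable name parent are illustrative only. It assumes, as stated above, that every subdomain line comes directly after its dotted parent.

```shell
# Hypothetical wrapper name; expects subdomains to follow their parent line.
filter_sorted_subdomains() {
  awk '
    # Skip lines that start with the most recent dotted parent.
    parent != "" && index($0, parent) == 1 { next }
    # A line ending in a dot becomes the new parent.
    /\.$/ { parent = $0 }
    # Print everything that was not skipped.
    { print }
  '
}

filter_sorted_subdomains <<'EOF'
tld.aa.
tld.aa.do.notshowup.0
tld.aa.do.notshowup.0.1
tld.aa.do.notshowup.0.1.1
tld.aa.do.notshowup.too
tld.bb.showup
tld.aaaaa.showup
tld.xxxxx.
tld.xxxxx.donotshowup
tld.yougettheidea.dontyou
tld.yougettheidea.dontyou.thankyou
EOF
```

Only one string is compared per line here, so the cost per line no longer depends on how many dotted parents have been seen.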

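Is it sorted?

@Corentin Limier "[...] there is an already sorted list of domains, such as:"

If the file is sorted, you do not need to keep all the values in an array. Something like this should work: awk '{if (index($0,a) == 1 && a != "") {next}} /\.$/{a=$0} 1' file

Great, thanks, this is exactly what I wanted. And it is very fast (about 2-3 seconds to run through 562k lines).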