Awk: splitting a file based on a criterion
I have a file containing the following data:
.domain bag
.set bag1
bag1
abc1
.set bag2
bag2
abc2
.domain cat
.set bag1:cat
bag1:cat
abc1:cat
.set bag2:cat
bag2:cat
abc2:cat
I want to split this file into two parts (bag1.txt and bag2.txt) based on the set value.
bag1.txt should look like this:
.domain bag
.set bag1
bag1
abc1
.domain cat
.set bag1:cat
bag1:cat
abc1:cat
bag2.txt should look like this:
.domain bag
.set bag2
bag2
abc2
.domain cat
.set bag2:cat
bag2:cat
abc2:cat
The .domain lines are common to both files.
I tried the command below, but it does not work:
nawk '{if($0~/.set/){split($2,a,":");filename=a[1]".text"}if(filename=".text"){print|"tee *.text"}else{print >filename}}' file.txt
One way:
awk '
    BEGIN {
        ## Split fields with spaces and colon.
        FS = "[ :]+";
        ## Extension of output files.
        ext = ".txt";
    }

    ## Write lines that begin with ".domain" to all known output files (saved
    ## in "processed_bags"). Also save them in the "domain" array to copy them
    ## later to all files not processed yet.
    $1 == ".domain" {
        for ( b in processed_bags ) {
            print $0 >> sprintf( "%s%s", b, ext );
        }
        domain[ i++ ] = $0;
        next;
    }

    ## Select output file to write. If not found previously, copy all
    ## domains saved until now.
    $1 == ".set" {
        bag = $2;
        if ( ! (bag in processed_bags) ) {
            for ( j = 0; j < i; j++ ) {
                print domain[j] >> sprintf( "%s%s", bag, ext );
            }
            processed_bags[ bag ] = 1;
        }
    }

    ## A normal line of data (neither ".domain" nor ".set"). Copy
    ## to the file saved in "bag" variable.
    bag {
        print $0 >> sprintf( "%s%s", bag, ext );
    }
' file.txt
Output:
==> bag1.txt <==
.domain bag
.set bag1
bag1
abc1
.domain cat
.set bag1:cat
bag1:cat
abc1:cat
==> bag2.txt <==
.domain bag
.set bag2
bag2
abc2
.domain cat
.set bag2:cat
bag2:cat
abc2:cat
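This works fine for bag1.txt and bag2.txt. But can the part that handles the common lines be generalized for many bags, say bag1 … bag1000? How would I do that? My actual file has many bags, from bag1 to bag1000. Could we use print > *.txt instead of print >> bag1 (the directory already contains many empty files, bag1.txt through bag1000.txt)?
@peter: I have edited the answer to generalize it. It is fully commented, so you can check whether it fits your needs, because I don't understand what you mean by print >> *.txt.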
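Regarding the bag1 … bag1000 case: the script in the answer keeps every output file open until awk exits, and some awk implementations limit how many files can be open at once. A minimal sketch of one possible variant is below. It assumes the same input layout as in the question and closes each file right after writing to it. The helper name emit and the array name seen are made up for this illustration and are not part of the original answer. Because it appends with >>, the output files must be empty or absent before the run.

```shell
# Recreate the sample input from the question.
cat > file.txt <<'EOF'
.domain bag
.set bag1
bag1
abc1
.set bag2
bag2
abc2
.domain cat
.set bag1:cat
bag1:cat
abc1:cat
.set bag2:cat
bag2:cat
abc2:cat
EOF

# Start from clean output files, because the script appends with ">>".
rm -f bag1.txt bag2.txt

awk '
BEGIN { FS = "[ :]+"; ext = ".txt" }

# Hypothetical helper: append one line to a file and close it at once,
# so that only one output file is open at any time.
function emit(line, file) {
    print line >> file
    close(file)
}

# ".domain" lines go to every bag seen so far and are remembered
# for bags that show up later.
$1 == ".domain" {
    for (b in seen) emit($0, b ext)
    domain[n++] = $0
    next
}

# ".set" lines select the current bag; a new bag first receives
# all ".domain" lines saved until now.
$1 == ".set" {
    bag = $2
    if (!(bag in seen)) {
        for (j = 0; j < n; j++) emit(domain[j], bag ext)
        seen[bag] = 1
    }
}

# Every remaining line (including the ".set" line itself) goes to
# the current bag.
bag { emit($0, bag ext) }
' file.txt
```

With the sample input, bag1.txt and bag2.txt should come out the same as in the output shown above. Closing after every write is slower than keeping files open, so this trade-off is probably only worth it when the number of bags is large.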