Can awk replace fields based on a separate specification file?
I have an input file like this:
SomeSection.Foo
OtherSection.Foo
OtherSection.Goo
...and another file that describes which objects belong to each section:
[SomeSection]
Blah
Foo
[OtherSection]
Foo
Goo
The desired output would be:
SomeSection.2 // that's because Foo appears 2nd in SomeSection
OtherSection.1 // that's because Foo appears 1st in OtherSection
OtherSection.2 // that's because Goo appears 2nd in OtherSection
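The number and names of the sections and objects are variable.
How would you do something like this in awk?
Thanks in advance,
Adrian.
One possibility:
Contents of script.awk, with comments: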
## While 'FNR == NR', the first input file is being processed.
## If the line begins with '[', extract the section name and reset the
## position counter for its objects.
FNR == NR && $0 ~ /^\[/ {
    object = substr( $0, 2, length($0) - 2 )
    pos = 0
    next
}
## Still in the first file: store each object of the current section in
## an array. 'pos' is incremented for each object processed.
FNR == NR {
    arr_obj[object, $0] = ++pos
    next
}
## Second file: split the line on '.', look up the second part in the
## array, and print the section name with the stored position.
FNR < NR {
    ret = split( $0, obj, /\./ )
    if ( ret != 2 ) {
        next
    }
    printf "%s.%d\n", obj[1], arr_obj[ obj[1] SUBSEP obj[2] ]
}
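The command used to run the script is not shown here. The following is a sketch of one way to invoke it, assuming the section file is called spec.txt and the input file input.txt (both names are hypothetical, as is the run_demo wrapper). The section file has to come first, because the 'FNR == NR' rules only apply to the first file named on the command line. The awk program is a condensed copy of the script's logic so that the example runs on its own.

```shell
# Hypothetical wrapper: builds the two sample files and runs the lookup.
run_demo() {
    dir=$(mktemp -d)

    # Section file (assumed name spec.txt): must be the FIRST argument to awk.
    printf '%s\n' '[SomeSection]' 'Blah' 'Foo' '[OtherSection]' 'Foo' 'Goo' \
        > "$dir/spec.txt"

    # Input file (assumed name input.txt): must be the SECOND argument.
    printf '%s\n' 'SomeSection.Foo' 'OtherSection.Foo' 'OtherSection.Goo' \
        > "$dir/input.txt"

    # Condensed version of script.awk, inlined to keep the sketch self-contained.
    awk '
        FNR == NR && /^\[/ { section = substr($0, 2, length($0) - 2); pos = 0; next }
        FNR == NR          { arr_obj[section, $0] = ++pos; next }
        split($0, obj, /\./) == 2 {
            printf "%s.%d\n", obj[1], arr_obj[obj[1], obj[2]]
        }
    ' "$dir/spec.txt" "$dir/input.txt"

    rm -rf "$dir"
}

run_demo
```

With the script saved as script.awk, the equivalent call would be: awk -f script.awk spec.txt input.txt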
Result:
SomeSection.2
OtherSection.1
OtherSection.2
EDIT, in response to the question in the comments:
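I'm not an expert, but I'll try to explain how I understand it:
SUBSEP is a character that separates the indices of an array when you want to use several values as the key. By default it is \034, although you can change it just like RS or FS.
In the instruction arr_obj[object, $0] = ++pos, the comma joins the values with the value of SUBSEP, so in this case it results in: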
arr_obj[SomeSection\034Blah] = 1
At the end of the script I access the index as arr_obj[ obj[1] SUBSEP obj[2] ], but it means the same as arr_obj[object, $0] in the earlier section.
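The equivalence of the comma form and explicit SUBSEP concatenation can be checked directly. This is a minimal sketch, not part of the original script; the function name subsep_demo is made up for illustration.

```shell
# Hypothetical helper: shows that arr[a, b] and arr[a SUBSEP b] name the same element.
subsep_demo() {
    awk 'BEGIN {
        arr_obj["SomeSection", "Blah"] = 1      # comma form, as used when filling the array
        key = "SomeSection" SUBSEP "Blah"       # explicit concatenation, as used in the lookup
        if (key in arr_obj) print "same element, value " arr_obj[key]
        if (SUBSEP == "\034") print "SUBSEP has its default value"
    }'
}

subsep_demo
```

Because both forms build the same string, either one can be used for storing or for looking up, whichever reads better at that point in the script.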
You can also get at each part of this index by splitting it on the SUBSEP variable, like this:
for (key in arr_obj) { ## Assign 'string\034string' to 'key' variable
    split( key, key_parts, SUBSEP ) ## Split 'key' with the content of SUBSEP variable.
    ...
}
Result:
key_parts[1] -> SomeSection
key_parts[2] -> Blah
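A runnable version of that loop might look like the sketch below. The array contents are sample values taken from the example files, and the list_keys name is hypothetical. The order in which for-in visits keys is unspecified in awk, so the output is piped through sort to make it stable.

```shell
# Hypothetical helper: lists every (section, object) pair stored in the array.
list_keys() {
    awk 'BEGIN {
        # Sample entries, as the script would store them for [SomeSection].
        arr_obj["SomeSection", "Blah"] = 1
        arr_obj["SomeSection", "Foo"]  = 2
        for (key in arr_obj) {
            split(key, key_parts, SUBSEP)       # key_parts[1] = section, key_parts[2] = object
            print key_parts[1], key_parts[2], arr_obj[key]
        }
    }' | sort    # for-in order is unspecified, so sort for stable output
}

list_keys
```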
This awk line should do the job:
awk 'BEGIN{FS="[\\.\\]\\[]"}
NR==FNR{ if(NF>1){ i=1; idx=$2; }else{ s[idx"."$1]=i; i++; } next; }
{ if($0 in s) print $1"."s[$0] } ' f2 input
See the test below:
kent$ head input f2
==> input <==
SomeSection.Foo
OtherSection.Foo
OtherSection.Goo
==> f2 <==
[SomeSection]
Blah
Foo
[OtherSection]
Foo
Goo
kent$ awk 'BEGIN{FS="[\\.\\]\\[]"}
NR==FNR{ if(NF>1){ i=1; idx=$2; }else{ s[idx"."$1]=i; i++; } next; }
{ if($0 in s) print $1"."s[$0] } ' f2 input
SomeSection.2
OtherSection.1
OtherSection.2
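The one-liner relies on how its FS splits the three kinds of line, which is why it can tell section headers from objects with NF>1. The sketch below prints the field count and fields for one line of each kind. The show_fields helper is hypothetical, and the example assumes an awk that, like the one in the transcript, accepts backslash escapes inside a bracket expression.

```shell
# Hypothetical helper: prints NF and each field for one line, using the one-liner's FS.
show_fields() {
    printf '%s\n' "$1" | awk 'BEGIN { FS = "[\\.\\]\\[]" }
    {
        printf "NF=%d", NF
        for (i = 1; i <= NF; i++) printf " <%s>", $i
        print ""
    }'
}

show_fields '[SomeSection]'      # section header from the spec file
show_fields 'Blah'               # object line from the spec file
show_fields 'SomeSection.Foo'    # line from the input file
```

A header such as [SomeSection] splits into more than one field with the name in $2, an object line stays a single field, and an input line splits at the dot so that $1 is the section name.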
Can you explain the last part of your script? I always have the hardest time with arrays.
@JaypalSingh: Updated the answer to explain it.