Awk 我有一个可以工作的脚本代码,但是如何使这个脚本代码更“优雅”?
一些背景。我有两个文件A和B,其中包含我需要提取的数据 对于文件A,我只需要最后两行,如下所示:Awk 我有一个可以工作的脚本代码,但是如何使这个脚本代码更“优雅”?,awk,Awk,一些背景。我有两个文件A和B,其中包含我需要提取的数据 对于文件A,我只需要最后两行,如下所示: RMM: 17 -0.221674395053E+01 0.59892E-04 0.00000E+00 31 0.259E-03 1 F= -.22167440E+01 E0= -.22167440E+01 d E =-.398708E-10 mag= 2.0000 Total CPU time used (se
RMM: 17 -0.221674395053E+01 0.59892E-04 0.00000E+00 31 0.259E-03
1 F= -.22167440E+01 E0= -.22167440E+01 d E =-.398708E-10 mag= 2.0000
Total CPU time used (sec): 0.364
User time (sec): 0.355
System time (sec): 0.009
Elapsed time (sec): 1.423
Maximum memory used (kb): 9896.
Average memory used (kb): 0.
Minor page faults: 2761
Major page faults: 4
Voluntary context switches: 24
mainfolder1[tab/space]subfolder1[tab/space][all the extracted info separated by tab]
mainfolder2[tab/space]subfolder2[tab/space][all the extracted info separated by tab]
mainfolder3[tab/space]subfolder3[tab/space][all the extracted info separated by tab]
...
mainfoldern[tab/space]subfoldern[tab/space][all the extracted info separated by tab]
我需要提取以下数字:
-1st Line, 2nd field (17)
-1st Line 4th field (0.59892E-04)
-2nd Line, 1st field (1)
-2nd Line, 3rd field (-.22167440E+01)
-2nd Line, 5th field (-.22167440E+01)
-2nd Line, 8th field (-.398708E-10)
-2nd Line, 10th field (2.0000)
-1st line, 6th field (0.364)
-2nd line, 4th field (0.355)
-3rd line, 4th field (0.009)
-4th line, 4th field (1.423)
-6th line, 5th field (9896.)
-7th line, 5th field (0.)
对于文件B,我只需要最后11行,如下所示:
RMM: 17 -0.221674395053E+01 0.59892E-04 0.00000E+00 31 0.259E-03
1 F= -.22167440E+01 E0= -.22167440E+01 d E =-.398708E-10 mag= 2.0000
Total CPU time used (sec): 0.364
User time (sec): 0.355
System time (sec): 0.009
Elapsed time (sec): 1.423
Maximum memory used (kb): 9896.
Average memory used (kb): 0.
Minor page faults: 2761
Major page faults: 4
Voluntary context switches: 24
mainfolder1[tab/space]subfolder1[tab/space][all the extracted info separated by tab]
mainfolder2[tab/space]subfolder2[tab/space][all the extracted info separated by tab]
mainfolder3[tab/space]subfolder3[tab/space][all the extracted info separated by tab]
...
mainfoldern[tab/space]subfoldern[tab/space][all the extracted info separated by tab]
我需要提取以下数字:
-1st Line, 2nd field (17)
-1st Line 4th field (0.59892E-04)
-2nd Line, 1st field (1)
-2nd Line, 3rd field (-.22167440E+01)
-2nd Line, 5th field (-.22167440E+01)
-2nd Line, 8th field (-.398708E-10)
-2nd Line, 10th field (2.0000)
-1st line, 6th field (0.364)
-2nd line, 4th field (0.355)
-3rd line, 4th field (0.009)
-4th line, 4th field (1.423)
-6th line, 5th field (9896.)
-7th line, 5th field (0.)
我的输出应该是这样的:
RMM: 17 -0.221674395053E+01 0.59892E-04 0.00000E+00 31 0.259E-03
1 F= -.22167440E+01 E0= -.22167440E+01 d E =-.398708E-10 mag= 2.0000
Total CPU time used (sec): 0.364
User time (sec): 0.355
System time (sec): 0.009
Elapsed time (sec): 1.423
Maximum memory used (kb): 9896.
Average memory used (kb): 0.
Minor page faults: 2761
Major page faults: 4
Voluntary context switches: 24
mainfolder1[tab/space]subfolder1[tab/space][all the extracted info separated by tab]
mainfolder2[tab/space]subfolder2[tab/space][all the extracted info separated by tab]
mainfolder3[tab/space]subfolder3[tab/space][all the extracted info separated by tab]
...
mainfoldern[tab/space]subfoldern[tab/space][all the extracted info separated by tab]
下面是我的脚本代码:
for m in ./*/; do
main=$(basename "$m")
for s in "$m"*/; do
sub=$(basename "$s")
vdata=$(tail -n2 ./$main/$sub/A | awk -F'[ =]+' NR==1'{a=$2;b=$4;next}{print s,a,$2,$4,$6,$9, $11}')
ctime=$(tail -n11 ./$main/$sub/B |head -n1|awk '{print $6}')
utime=$(tail -n10 ./$main/$sub/B |head -n1|awk '{print $4}')
stime=$(tail -n9 ./$main/$sub/B |head -n1|awk '{print $4}')
etime=$(tail -n8 ./$main/$sub/B |head -n1|awk '{print $4}')
maxmem=$(tail -n6 ./$main/$sub/B |head -n1|awk '{print $5}')
avemem=$(tail -n5 ./$main/$sub/B |head -n1|awk '{print $5}')
c=$(echo $sub| cut -c 2-)
echo "$m $c $vdata $ctime $utime $stime $etime $maxmem $avemem"
done
done > output
现在,第四行,vdata部分,实际上是前一个论坛问题中的循环行。我不完全明白。我希望我的文件B代码和文件a的awk代码一样优雅。我该怎么做?谢谢 对于文件B,请尝试以下操作:
tail -n11 B | awk -F':' '{ print $2 }'
array=($(tail -n11 B | awk -F':' '{ print $2 }'))
for value in "${array[@]}"
do
echo $value
done
for m in ./*/; do
main=$(basename "$m")
for s in "$m"*/; do
sub=$(basename "$s")
fileA="${main}/${sub}/A"
fileB="${main}/${sub}/B"
awk -v sizeA=$(wc -l < "$fileA") -v sizeB=$(wc -l < "$fileB") '
NR==FNR {
if ( FNR == (sizeA-1) ) { split($0,p) }
if ( FNR == sizeA ) { split($0,a) }
next
}
{ b[sizeB + 1 - FNR] = $NF }
END {
split(FILENAME,f,"/")
print f[1], f[2], p[2], p[4], a[1], a[3], a[5], a[8], a[10], b[11], b[10], b[9], b[8], b[6], b[5]
}
' "$fileA" "$fileB"
done
done > output
如果需要保留这些值,然后进行回显,可以执行以下操作:
tail -n11 B | awk -F':' '{ print $2 }'
array=($(tail -n11 B | awk -F':' '{ print $2 }'))
for value in "${array[@]}"
do
echo $value
done
for m in ./*/; do
main=$(basename "$m")
for s in "$m"*/; do
sub=$(basename "$s")
fileA="${main}/${sub}/A"
fileB="${main}/${sub}/B"
awk -v sizeA=$(wc -l < "$fileA") -v sizeB=$(wc -l < "$fileB") '
NR==FNR {
if ( FNR == (sizeA-1) ) { split($0,p) }
if ( FNR == sizeA ) { split($0,a) }
next
}
{ b[sizeB + 1 - FNR] = $NF }
END {
split(FILENAME,f,"/")
print f[1], f[2], p[2], p[4], a[1], a[3], a[5], a[8], a[10], b[11], b[10], b[9], b[8], b[6], b[5]
}
' "$fileA" "$fileB"
done
done > output
您应该研究find和xargs,因为每次您在shell中编写循环只是为了操纵文本,您的方法是错误的,但为了保持简单并保留原始结构,听起来您可以使用以下方法:
tail -n11 B | awk -F':' '{ print $2 }'
array=($(tail -n11 B | awk -F':' '{ print $2 }'))
for value in "${array[@]}"
do
echo $value
done
for m in ./*/; do
main=$(basename "$m")
for s in "$m"*/; do
sub=$(basename "$s")
fileA="${main}/${sub}/A"
fileB="${main}/${sub}/B"
awk -v sizeA=$(wc -l < "$fileA") -v sizeB=$(wc -l < "$fileB") '
NR==FNR {
if ( FNR == (sizeA-1) ) { split($0,p) }
if ( FNR == sizeA ) { split($0,a) }
next
}
{ b[sizeB + 1 - FNR] = $NF }
END {
split(FILENAME,f,"/")
print f[1], f[2], p[2], p[4], a[1], a[3], a[5], a[8], a[10], b[11], b[10], b[9], b[8], b[6], b[5]
}
' "$fileA" "$fileB"
done
done > output
请注意,上面的命令只会打开每个B文件1次,而不是6次
awk 'NR==1{print $6} NR==2{print $4} NR==3{print $4} ...'
您可以通过以下方式简化:
NR==2 || NR==3 || NR==4
但这似乎很难维持。或者可以使用数组:
awk 'BEGIN{a[1]=6;a[2]=4...} NR in a{ print $a[NR]}'
但我想你真的只是想:
awk '{print $NF}' ORS=\\t
您不需要第1行的第6个字段。您需要最后一个字段
不要试图将输出收集到变量中以进行回显,而是添加ORS=\\t以获得制表符分隔的输出,并将其打印到脚本的标准输出。-vdata行上的F'[=]+'将=添加到awk的FS字段拆分值中,以便字段$8不包含前导=。NR==1{…}块正在存储第一行的值,以便在第二行的打印输出中使用。谢谢!我将调查find和xargs:谢谢你的回答很简洁,我想我会用你的