Processing a text file with awk, sed and grep
My input file:
20110512075615 Constanta 1.0041 1013.41 9999.0 0 0.0 0
20110512075630 Constanta 1.0021 1013.45 9999.0 0 0.0 0
20110512075645 Constanta 1.0031 1013.47 9999.0 0 0.0 0
20110512075700 Constanta 1.0018 1013.47 9999.0 0 0.0 0
20110512075730 Constanta 1.0038 1013.48 9999.0 0 0.0 0
20110512075745 Constanta 1.0023 1013.48 9999.0 0 0.0 0
20110512075800 Constanta 9999.0000 1013.46 13.2 0 0.0 0
20110512075815 Constanta 1.0038 1013.45 13.2 0 0.0 0
20110512075830 Constanta 1.0040 1013.50 13.2 0 0.0 0
20110512075845 Constanta 1.0034 1013.50 13.2 0 0.0 0
20110512075900 Constanta 1.0050 1013.45 13.2 0 0.0 0
20110512075915 Constanta 1.0060 1013.48 13.2 0 0.0 0
20110512075930 Constanta 1.0056 1013.45 13.2 0 0.0 0
20110512080000 Constanta 1.0066 1013.50 13.2 0 0.0 0
20110512080015 Constanta 1.0067 1013.49 13.2 0 0.0 0
20110512080100 Constanta 1.0065 1013.48 13.2 0 0.0 0
20110512080115 Constanta 9999.0000 1013.51 13.2 0 0.0 0
20110512080130 Constanta 1.0065 1013.51 13.2 0 0.0 0
20110512080145 Constanta 1.0079 1013.49 13.2 0 0.0 0
20110512080200 Constanta 1.0072 1013.51 13.2 0 0.0 0
20110512080215 Constanta 1.0084 1013.51 13.2 0 0.0 0
My output file:
YY/MM/DD HH -Level- Atm.Prs -Tw-
201105120757 1.0018 1013.47 9999.0 0 0.0 0
201105120759 1.0050 1013.45 13.2 0 0.0 0
201105120800 9999.0000 1.0066 1013.50 13.2 0 0.0 0
201105120801 1.0065 1013.48 13.2 0 0.0 0
201105120802 9999.0000 1.0072 1013.51 13.2 0 0.0 0
My code:
#! /bin/bash
FILE="Constanta20110513.txt"
# 1) remove column two(='Constanta')
awk '{$2="";print}' $FILE | column -t > tmpfile
# 2) remove lines with '9999.0000'
cat tmpfile | sed -e '/9999.[0-9]/d' >> final.tmp
# 3) remove first three lines
awk 'NR>3' final.tmp >> myfile.tmp
# 4) count lines between '....00' and '....00':
# if >= 3, keep only the line with '...00' and delete the other lines
# if < 3, do the same, and put '9999' on column two
output=$(grep -n '00\s*$' myfile.tmp | sed 's/\s*$/ /')
array=($output $(cat myfile.tmp | wc -l))
for (( i=0; i<${#array[@]}-1; i++ )); do
    index1=$(echo "${array[$i]}" | grep -o '^[0-9]*')
    index2=$(echo "${array[$i+1]}" | grep -o '^[0-9]*')
    if [ $(( index2 - index1 )) -ge 3 ]; then
        echo $(echo "${array[$i]}" | grep -o '[0-9]*$') >> temp.tmp
    else
        echo $(echo "${array[$i]}" | grep -o '[0-9]*$') 9999.0000 >> temp.tmp
    fi
done
# 5) delete last two characters from first column(=00)
awk '{sub(/..$/,"",$1)} 1' temp.tmp >> output.tmp
# 6) insert header
echo 'YY/MM/DD HH -Level- Atm.Prs -Tw-' | cat - output.tmp >> output2.tmp
#save
mv output2.tmp $FILE
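My problem is in step 4: it does not work, and the temporary file temp.tmp is not created.
I think the problem is here: grep -n '00\s*$' myfile.tmp | sed 's/\s*$/ /'
Thanks a lot in advance.
Here is a way to do steps 1 to 3 in one go: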
awk '{$2="";sub(/  /," ")} !/9999.[0-9]/ && t++>2' $FILE
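One thing to watch with that pattern, as a small sketch rather than a fix: `/9999.[0-9]/` is tested against the whole line, so it also matches `9999.0` in the temperature column and drops those rows too. If only the level column should be checked, comparing the field itself is one option. The three sample rows are copied from the input above; the variable names `sample`, `by_regex` and `by_field` are made up for the illustration.

```shell
#!/bin/bash
# Three rows copied from the input file in the question.
sample='20110512075700 Constanta 1.0018 1013.47 9999.0 0 0.0 0
20110512075800 Constanta 9999.0000 1013.46 13.2 0 0.0 0
20110512075815 Constanta 1.0038 1013.45 13.2 0 0.0 0'

# Regex against the whole line: the first row is dropped as well,
# because "9999.0" in column 5 matches /9999.[0-9]/.
by_regex=$(printf '%s\n' "$sample" | awk '!/9999.[0-9]/ { print $1 }')

# String comparison on the level column only
# (it is $3 here because column two has not been removed yet).
by_field=$(printf '%s\n' "$sample" | awk '$3 != "9999.0000" { print $1 }')

echo "whole-line regex keeps:"
echo "$by_regex"
echo "field comparison keeps:"
echo "$by_field"
```

Which of the two behaviours is wanted depends on whether a missing Tw value should also discard the row; the desired output in the question suggests it should not.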
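I'm not sure what you want to count in step 4; can you be clearer?
I built on Jotne's work for steps 1-3 and added a function to handle step 4. Put the following into an executable file (I called it awko) and run it like awko Constanta20110513.txt: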
#!/usr/bin/awk -f
BEGIN { print "YY/MM/DD HH -Level- Atm.Prs -Tw-" }
# absorb jotne's work for #1-3 more or less
{$2="";sub(/  /," ")}
/9999.0000/ || NR<=3 { next }
/^[0-9]{12}00/ { output_line() }  # deal with the "00" lines
END { output_line() }             # output the final "00" stored in last
function output_line() {
    if( last_nr != 0 ) {
        if( NR-last_nr < 3 ) {
            temp = $0            # save off the current line
            $0 = last            # reset it to the last "00" line
            $2 = "9999.0000"     # make $2 what you want
            print $0
            $0 = temp            # restore $0 from temp
        }
        if( NR-last_nr >= 3 ) { print last }
    }
    $1 = substr( $1, 1, 12 )     # drop the "00" from $1
    last = $0; last_nr = NR;     # store some variables
}
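A side note on portability, offered as a sketch only: the `{12}` interval expression in the script above is not enabled by default in every awk (older gawk releases, for instance, need `--re-interval`). If that turns out to be a problem, the "00" rows can be picked out with `length` and `substr` instead. This assumes the first column is always a 14-digit timestamp; the sample rows are cut down from the question's data and the variable names are illustrative.

```shell
#!/bin/bash
# Timestamp and level columns only, taken from the question's data.
sample='20110512075645 1.0031
20110512075700 1.0018
20110512075730 1.0038
20110512075815 1.0038'

# Select rows whose 14-digit timestamp ends in "00" without using {12},
# and print the timestamp with the trailing seconds removed.
minute_marks=$(printf '%s\n' "$sample" |
    awk 'length($1) == 14 && substr($1, 13, 2) == "00" { print substr($1, 1, 12) }')

echo "$minute_marks"
```

The same condition could replace `/^[0-9]{12}00/` in front of `output_line()`; the rest of the script would stay as it is.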
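Can you show us the output you want?
In step 4, I want to count the lines between the lines whose first column ends in '00'; if the result is >= 3: keep only the line ending in '00' and delete the others; if the result is < 3: do the same, and put '9999' in the second column.
@N0741337, it works perfectly! Thank you very, very much!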
YY/MM/DD HH -Level- Atm.Prs -Tw-
201105120757 1.0018 1013.47 9999.0 0 0.0 0
201105120759 1.0050 1013.45 13.2 0 0.0 0
201105120800 9999.0000 1013.50 13.2 0 0.0 0
201105120801 1.0065 1013.48 13.2 0 0.0 0
201105120802 9999.0000 1013.51 13.2 0 0.0 0
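Coming back to the original step 4, a likely reason temp.tmp was never created: every data line ends in ` 0`, so `grep -n '00\s*$'` finds nothing, the array holds only the line count, and the loop body never runs. The `00` to look for sits at the end of the first column, not at the end of the line. A minimal sketch, assuming the first column is always the timestamp and column two has already been removed (the sample rows and the names `end_hits` and `col_hits` are illustrative):

```shell
#!/bin/bash
# Three rows as they look after column two has been removed.
sample='20110512075845 1.0034 1013.50 13.2 0 0.0 0
20110512075900 1.0050 1013.45 13.2 0 0.0 0
20110512075915 1.0060 1013.48 13.2 0 0.0 0'

# Pattern anchored at the end of the line: no row ends in "00".
# "|| true" keeps the assignment from failing when grep finds nothing.
end_hits=$(printf '%s\n' "$sample" | grep -c '00[[:space:]]*$' || true)

# Pattern anchored on the first column: digits ending in "00", then a space.
col_hits=$(printf '%s\n' "$sample" | grep -n '^[0-9]*00 ')

echo "end-of-line matches: $end_hits"
echo "first-column matches:"
echo "$col_hits"
```

Even with a matching pattern, `array=($output ...)` splits the grep output on every space, so each row would become several array elements; reading the line numbers with `grep -n ... | cut -d: -f1` would probably be closer to what the loop expects.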