Regex: I want to read a file and store some variables with AWK
I have a file with the following content. It is the result of equipment queries, so some inputs may not be found in the database. The example below shows the results of a successful and an unsuccessful query. The second query does not return all the information I want to capture into variables, so I want to ignore that result and set the variables to null/empty values.
<INTLPO:ISV=PORTAB NTL="6130290095" VEM=NAO;
VECTURA - SS BSA002 2020-09-12 09-32
INTLPO:ISV=PORTAB NTL="6130290095" VEM=NAO;
INTERROGACAO DE NUMERO TELEFONICO PARA PORTABILIDADE NUMERICA
TIPO DE ENCAMINHAMENTO POR ASSINANTE
NTL = 6130290095 OPC = S_INF RNP = 551 CSP = 25
EIP = S_INF
CDO = 00961
CNL = 61000 NUF = S_INF TPB = PREST
CPT = NAO CRE = 125 NUE = S_INF
DAT = 2014-04-16 HOR = 10:30:20.798609
TBR = 25
RST MAN RST MAN RST MAN
2% 934 3% 934 4% 934
5% 934 6% 934 7% 934
8% 934 9% 934 9090% 934
0??% 934 90??% 934 0?0% 934
TOTAL DE NUMEROS ASSOCIADOS AO SERVICO: 1
Result
2020-09-12,BSA002,6160150536,,,,,,,,,,,,,,,,
2020-09-12,BSA002,6160150536,,,,,,,,,,,,,,,,
2020-09-12,BSA002,6130290095,,,,,,,,,,,,,,,,
2020-09-12,BSA002,6130290095,,,,,,,,,,,,,,,,
2020-09-12,BSA002,6130290095,,,,,,,,,,,,,,,,
2020-09-12,BSA002,6130290095,,,,,,,,,,,,,,,,
2020-09-12,BSA002,6130290095,6130290095,S_INF,551,25,,,,,,,,,,,,
2020-09-12,BSA002,6130290095,6130290095,S_INF,551,25,S_INF,,,,,,,,,,,
2020-09-12,BSA002,6130290095,6130290095,S_INF,551,25,S_INF,00961,,,,,,,,,,
2020-09-12,BSA002,6130290095,6130290095,S_INF,551,25,S_INF,00961,61000,S_INF,PREST,,,,,,,
2020-09-12,BSA002,6130290095,6130290095,S_INF,551,25,S_INF,00961,61000,S_INF,PREST,NAO,125,S_INF,,,,
2020-09-12,BSA002,6130290095,6130290095,S_INF,551,25,S_INF,00961,61000,S_INF,PREST,NAO,125,S_INF,2014-04-16,10:30:20.798609,,
2020-09-12,BSA002,6130290095,6130290095,S_INF,551,25,S_INF,00961,61000,S_INF,PREST,NAO,125,S_INF,2014-04-16,10:30:20.798609,25,
2020-09-12,BSA002,6130290095,6130290095,S_INF,551,25,S_INF,00961,61000,S_INF,PREST,NAO,125,S_INF,2014-04-16,10:30:20.798609,25,MAN
2020-09-12,BSA002,6130290095,6130290095,S_INF,551,25,S_INF,00961,61000,S_INF,PREST,NAO,125,S_INF,2014-04-16,10:30:20.798609,25,MAN
2020-09-12,BSA002,6130290095,6130290095,S_INF,551,25,S_INF,00961,61000,S_INF,PREST,NAO,125,S_INF,2014-04-16,10:30:20.798609,25,MAN
2020-09-12,BSA002,6130290095,6130290095,S_INF,551,25,S_INF,00961,61000,S_INF,PREST,NAO,125,S_INF,2014-04-16,10:30:20.798609,25,MAN
2020-09-12,BSA002,6130290095,6130290095,S_INF,551,25,S_INF,00961,61000,S_INF,PREST,NAO,125,S_INF,2014-04-16,10:30:20.798609,25,MAN
2020-09-12,BSA002,6130290095,6130290095,S_INF,551,25,S_INF,00961,61000,S_INF,PREST,NAO,125,S_INF,2014-04-16,10:30:20.798609,25,MAN
2020-09-12,BSA002,6130290095,6130290095,S_INF,551,25,S_INF,00961,61000,S_INF,PREST,NAO,125,S_INF,2014-04-16,10:30:20.798609,25,MAN
2020-09-12,BSA002,6130290095,6130290095,S_INF,551,25,S_INF,00961,61000,S_INF,PREST,NAO,125,S_INF,2014-04-16,10:30:20.798609,25,MAN
2020-09-12,BSA002,6130290095,6130290095,S_INF,551,25,S_INF,00961,61000,S_INF,PREST,NAO,125,S_INF,2014-04-16,10:30:20.798609,25,MAN
2020-09-12,BSA002,6130290095,6130290095,S_INF,551,25,S_INF,00961,61000,S_INF,PREST,NAO,125,S_INF,2014-04-16,10:30:20.798609,25,MAN
2020-09-12,BSA002,6130290095,6130290095,S_INF,551,25,S_INF,00961,61000,S_INF,PREST,NAO,125,S_INF,2014-04-16,10:30:20.798609,25,MAN
2020-09-12,BSA002,6130290095,6130290095,S_INF,551,25,S_INF,00961,61000,S_INF,PREST,NAO,125,S_INF,2014-04-16,10:30:20.798609,25,MAN
2020-09-12,BSA002,6160150178,6130290095,S_INF,551,25,S_INF,00961,61000,S_INF,PREST,NAO,125,S_INF,2014-04-16,10:30:20.798609,25,MAN
2020-09-12,BSA002,6160150178,6130290095,S_INF,551,25,S_INF,00961,61000,S_INF,PREST,NAO,125,S_INF,2014-04-16,10:30:20.798609,25,MAN
2020-09-12,BSA002,6160150178,6130290095,S_INF,551,25,S_INF,00961,61000,S_INF,PREST,NAO,125,S_INF,2014-04-16,10:30:20.798609,25,MAN
2020-09-12,BSA002,6160150178,6130290095,S_INF,551,25,S_INF,00961,61000,S_INF,PREST,NAO,125,S_INF,2014-04-16,10:30:20.798609,25,MAN
2020-09-12,BSA002,6160150178,6130290095,S_INF,551,25,S_INF,00961,61000,S_INF,PREST,NAO,125,S_INF,2014-04-16,10:30:20.798609,25,MAN
2020-09-12,BSA002,6160150178,6130290095,S_INF,551,25,S_INF,00961,61000,S_INF,PREST,NAO,125,S_INF,2014-04-16,10:30:20.798609,25,MAN
2020-09-12,BSA002,6160150178,6130290095,S_INF,551,25,S_INF,00961,61000,S_INF,PREST,NAO,125,S_INF,2014-04-16,10:30:20.798609,25,MAN
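Note that the record 6130290095 (variable NTL) is wrongly associated with the "numero" record 6160150178 (the last lines of the example above).
How can I overcome this? I tried some AWK conditional statements, but without success.
As output, I only need one line per record, like some of the lines shown in the output example.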
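Thank you very much.
If you only want to change the value of numero when it is not yet set, add a test like numero |
After reading your comments, I changed my solution. As I now understand it, you don't want a single record holding the combined results of all blocks; you want one result line per processed block. Each new block starts with a line beginning with < (the /^</ test in the script). The best way to handle any problem involving tag = value pairs is to first populate an array with that mapping (tag2val[] below); you can then access whichever values you like, by tag name, in whatever order you like.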
$ cat tst.awk
BEGIN {
    OFS = ","
    # Output columns, in the order they should be printed.
    numTags = split("\
        EQUIPMENT \
        DATE \
        NUMERO \
        NTL \
        OPC \
        RNP \
        CSP \
        EIP \
        CDO \
        CNL \
        NUF \
        TPB \
        CPT \
        CRE \
        NUE \
        DAT \
        HOR \
        TBR \
        MAN \
    ",tags)
    # Print the header line.
    for (tagNr=1; tagNr<=numTags; tagNr++) {
        tag = tags[tagNr]
        printf "%s%s", tag, (tagNr<numTags ? OFS : ORS)
    }
}

# A line starting with "<" begins a new block:
# print the previous block and forget its values.
/^</ && (NR > 1) {
    prt()
    delete tag2val
}

$1 == "VECTURA" {
    tag2val["EQUIPMENT"] = $4
    tag2val["DATE"]      = $5
}

# Echoed command line: strip NTL=" and the closing quote from $2.
/^INTLPO/ {
    gsub(/^[^"]+"|"$/,"",$2)
    tag2val["NUMERO"] = $2
}

# Lines made up only of "tag = value" groups.
/^([[:space:]]*[^[:space:]]+ = [^[:space:]]+)+$/ {
    for (i=1; i<NF; i+=3) {
        tag2val[$i] = $(i+2)
    }
}

# The line after the "RST MAN" header carries the MAN value in $2.
nextLineTag != "" {
    tag2val[nextLineTag] = $2
    nextLineTag = ""
}

/^[[:space:]]*RST[[:space:]]+MAN/ {
    nextLineTag = "MAN"
}

END { prt() }

# Print one CSV line for the current block; missing tags print as empty.
function prt(   tagNr, tag, val) {
    for (tagNr=1; tagNr<=numTags; tagNr++) {
        tag = tags[tagNr]
        val = tag2val[tag]
        printf "%s%s", val, (tagNr<numTags ? OFS : ORS)
    }
}
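If you only want to print the result once, use END{print…}.
Please include the expected output for the sample input file. Is it a file? What does it look like? (Or a bash array?)
I don't want to print only once. I want to print once per record. I have a list of 50 queries; some of them have all the information I need and some don't. The output should be a list of 50 "numero" records: comma-separated, with all the information for those that have it (like the last line in the example provided earlier), and printed in the form "6130300000,BSA2,,,,,,,,,,,,,,,,,,,,,,,,,,… for those that don't, as in the script (input.txt >> output.txt). Thanks again.
What do you want to extract from the RST lines?
Hello. Thank you for your answer. I tried it, but without success. I have a file containing a set of results like the example in the question. Some of them carry no result information (because those records do not exist in the equipment database). But I want to write an output that contains all records: one line per input record, with all the available information. When a block does not have all the information, I want to print it with the minimum information it does have (e.g. 2020-09-12,BSA2613030020,,,,,,,,,,,,,,,)…
Thanks a lot, you got it. Your solution is elegant and simple, and it works exactly as needed. Thank you.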
awk 'function newrecord() {
    recordnumber++;
    data=equipment=numero=ntl=opc=rnp=csp=eip=cdo="";
    cnl=nuf=tpb=cpt=cre=nue=dat=hor=tbr=man="";
}
function printrecord() {
    print data, equipment, numero, ntl, opc, rnp, csp, eip,
          cdo, cnl, nuf, tpb, cpt, cre, nue, dat, hor, tbr, man;
}
BEGIN { OFS="," }
/^<INTLPO/ { if (recordnumber) printrecord(); newrecord(); }
/^VECTURA/ { equipment = $4; data = $5 }
/^INTLPO/  { numero = $2}
/^\s*NTL/  { ntl = $3 ; opc = $6; rnp = $9; csp = $12}
/^\s*EIP/  { eip = $3}
/^\s*CDO/  { cdo = $3}
/^\s*CNL/  { cnl = $3; nuf = $6; tpb = $9}
/^\s*CPT/  { cpt = $3; cre = $6; nue = $9}
/^\s*DAT/  { dat = $3; hor = $6}
/^\s*TBR/  { tbr = $3}
/^\s*RST/  { man = $2; next}
END { printrecord(); }
' input.tx
$ awk -f tst.awk file
EQUIPMENT,DATE,NUMERO,NTL,OPC,RNP,CSP,EIP,CDO,CNL,NUF,TPB,CPT,CRE,NUE,DAT,HOR,TBR,MAN
BSA002,2020-09-12,6130290095,6130290095,S_INF,551,25,S_INF,00961,61000,S_INF,PREST,NAO,125,S_INF,2014-04-16,10:30:20.798609,25,934
BSA002,2020-09-12,6160150178,,,,,,,,,,,,,,,,
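The wrong association described in the question is consistent with values from one block surviving into the next, although the original script is not shown, so that is an assumption. A minimal sketch of the effect follows; the two-block input is hypothetical and much shorter than the real query output, and sample, summarize, emit, seen and reset are names invented for this example.

```shell
# Hypothetical two-block input; the second query has no "NTL = ..." line.
sample='<INTLPO:ISV=PORTAB NTL="6130290095" VEM=NAO;
  NTL = 6130290095   OPC = S_INF
<INTLPO:ISV=PORTAB NTL="6160150178" VEM=NAO;'

# summarize RESET: print "numero,ntl" once per block.
# With RESET=1 the ntl value is cleared whenever a new block starts.
summarize() {
    awk -v reset="$1" '
        function emit() { print numero, ntl }
        BEGIN { OFS = "," }
        /^</ {
            if (seen++) emit()          # flush the previous block, if any
            if (reset == 1) ntl = ""    # forget values from the previous block
            split($2, q, "\"")          # NTL="6130290095" -> q[2] holds the number
            numero = q[2]
        }
        $1 == "NTL" && $2 == "=" { ntl = $3 }
        END { if (seen) emit() }
    '
}

printf '%s\n' "$sample" | summarize 0   # second line keeps the stale NTL
printf '%s\n' "$sample" | summarize 1   # second line has an empty NTL field
```

Without the reset, the second block is printed as 6160150178,6130290095; with it, the line becomes 6160150178, followed by an empty field. The delete tag2val statement in tst.awk plays the same role for the full set of tags.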