AWK/SED: extract the strings between tags in a huge line

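I have a huge line that is the response from a web service, and I need to get all the strings between <asunto> and </asunto>. The file looks like this: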

Content-Type: application/xop+xml; charset=UTF-8; type="application/soap+xml";
Content-Transfer-Encoding: binary
Content-ID: <root.message@cxf.apache.org>

<soap:Envelope xmlns:soap="http://www.w3.org/2003/05/soap-envelope"><soap:Body><ns1:consultarComunicacionesResponse xmlns:ns1="http://ve.tecno.afip.gov.ar/domain/service/ws"><ns2:RespuestaPaginada xmlns:ns2="http://ve.tecno.afip.gov.ar/domain/service/ws" xmlns:ns3="http://core.tecno.afip.gov.ar/model/ws/types" xmlns:ns4="http://ve.tecno.afip.gov.ar/domain/service/ws/types"><pagina>1</pagina><totalPaginas>1</totalPaginas><itemsPorPagina>100</itemsPorPagina><totalItems>2</totalItems><ns4:items><ns4:ComunicacionSimplificada><idComunicacion>sdfgsfdgsfdgsd</idComunicacion><cuitDestinatario>sdfgsdfgsdfgsfdg</cuitDestinatario><fechaPublicacion>sdfgsdfg</fechaPublicacion><fechaVencimiento>sdfgsdfgsdfg</fechaVencimiento><sistemaPublicador>sdfgsdfgsfg</sistemaPublicador><sistemaPublicadorDesc>sdfgsdfggf</sistemaPublicadorDesc><estado>2</estado><estadoDesc>sdfgsdfgsgf</estadoDesc><asunto>EXAMPLEEEEEEEEEEEEEEEE1</asunto><prioridad>3</prioridad><tieneAdjunto>sdfgfdg</tieneAdjunto></ns4:ComunicacionSimplificada><ns4:ComunicacionSimplificada><idComunicacion>sdfgsdfgdfg</idComunicacion><cuitDestinatario>sdfgdfsg</cuitDestinatario><fechaPublicacion>sdfgsdfg</fechaPublicacion><fechaVencimiento>sdfgdsfg</fechaVencimiento><sistemaPublicador>sdfgsdfg</sistemaPublicador><sistemaPublicadorDesc>sdfgsdfgdsfggsdf</sistemaPublicadorDesc><estado>1</estado><estadoDesc>dsfgsdfgsgd</estadoDesc><asunto>EXAMPLEEEEEEEEEEEEEEEE2</asunto><prioridad>asdfdsf</prioridad><tieneAdjunto>asdfasdf</tieneAdjunto></ns4:ComunicacionSimplificada></ns4:items></ns2:RespuestaPaginada></ns1:consultarComunicacionesResponse></soap:Body></soap:Envelope>    

The output I need is:

EXAMPLEEEEEEEEEEEEEEEE1
EXAMPLEEEEEEEEEEEEEEEE2

There may be many occurrences, anywhere from 0 to hundreds.


Thanks.

awk to the rescue:

$ awk -v RS='[<>]' '/\/asunto/{f=0;next} f; /asunto/{f=1}' file

EXAMPLEEEEEEEEEEEEEEEE1
EXAMPLEEEEEEEEEEEEEEEE2

Update: based on the comments, if the tag name may also appear elsewhere, you can anchor the pattern at both ends of the opening and closing tag names:

$ awk -v RS='[<>]' '/^\/asunto$/{f=0;next} f; /^asunto$/{f=1}' file
EXAMPLEEEEEEEEEEEEEEEE1
EXAMPLEEEEEEEEEEEEEEEE2

Or, equivalently, test for an exact string match:

$ awk -v RS='[<>]' '$0=="/asunto"{f=0;next} f; $0=="asunto"{f=1}' file
EXAMPLEEEEEEEEEEEEEEEE1
EXAMPLEEEEEEEEEEEEEEEE2

Also note that not all awk variants support a multi-character RS; POSIX leaves the behavior unspecified when RS holds more than one character.
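
For an awk that treats RS as a single character, one possible workaround is to do the splitting with tr first and keep the same flag logic. This is only a sketch: the function name extract_asunto and the sample data are made up for illustration and are not part of the original answer.

```shell
# extract_asunto is a hypothetical helper name used only for this sketch.
extract_asunto() {
    # Turn every '<' and '>' into a newline so each tag name and each
    # piece of text lands on its own line, whatever awk is installed.
    tr '<>' '\n\n' < "$1" |
        awk '$0 == "/asunto" { f = 0; next }  # closing tag: stop printing
             f                                # between the tags: print the line
             $0 == "asunto"  { f = 1 }'       # opening tag: start printing
}

# Small made-up sample standing in for the real response file.
sample=$(mktemp)
printf '%s\n' '<items><asunto>FIRST</asunto><estado>2</estado><asunto>SECOND</asunto></items>' > "$sample"
extract_asunto "$sample"
rm -f "$sample"
```

Like the exact-match one-liner, this is fooled by element text that is literally the word asunto, and text containing newlines is printed as several lines.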

You can also use GNU grep:

$ time grep -oP '(?…
real    0m0.277s
user    0m0.254s
sys     0m0.022s

With GNU awk for a multi-character RS:

$ awk -v RS='</?asunto>' '!(NR%2)' file
EXAMPLEEEEEEEEEEEEEEEE1
EXAMPLEEEEEEEEEEEEEEEE2

Using an XML parser (and awk to strip the headers)

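This might work for you (GNU sed):

sed -nr '/([^…

As pointed out elsewhere, XML-aware tools are safer in principle, but if the "asunto" tags are not nested, the following GNU grep incantation may be useful. It works even if the string between <asunto> and </asunto> is empty or contains other tags:
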
grep -oP '(?<=<asunto>).*?(?=</asunto>)'
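
Where grep was built without -P (PCRE) support, a rough fallback is to match the whole element with a basic regular expression and strip the tags with sed. This is a sketch under a stricter assumption than the command above: the text inside the tags contains no other tags. The function name extract_asunto_plain and the sample data are hypothetical.

```shell
# extract_asunto_plain is a hypothetical helper name for this sketch.
extract_asunto_plain() {
    # grep -o prints each <asunto>...</asunto> element on its own line;
    # [^<]* keeps a match from running past the next tag.
    grep -o '<asunto>[^<]*</asunto>' "$1" |
        sed -e 's/^<asunto>//' -e 's/<\/asunto>$//'   # drop the surrounding tags
}

# Small made-up sample standing in for the real response file.
sample=$(mktemp)
printf '%s\n' '<items><asunto>FIRST</asunto><estado>2</estado><asunto>SECOND</asunto></items>' > "$sample"
extract_asunto_plain "$sample"
rm -f "$sample"
```

An empty element comes out as an empty line here, because grep still matches the surrounding tags.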

Comments:

You should mention that it is gawk-specific because of the multi-character RS, and that it will fail if asunto appears in any context other than as a tag.

@karakfa - to address Ed's second point, could you change your one-liner slightly to: awk -v RS='[<>]' '/\/asunto*$/{f=0;next} f; /^asunto/{f=1}' file ?