Bash: extract words from a log file
I am trying to extract the job IDs from a log file, but I'm having trouble doing it in bash. I tried using sed. Here is what my log file looks like:
> 2018-06-16 02:39:39,331 INFO org.apache.flink.client.cli.CliFrontend
> - Running 'list' command.
> 2018-06-16 02:39:39,641 INFO org.apache.flink.runtime.rest.RestClient
> - Rest client endpoint started.
> 2018-06-16 02:39:39,741 INFO org.apache.flink.client.cli.CliFrontend
> - Waiting for response...
> Waiting for response...
> 2018-06-16 02:39:39,953 INFO org.apache.flink.client.cli.CliFrontend
> - Successfully retrieved list of jobs
> ------------------ Running/Restarting Jobs -------------------
> 15.06.2018 18:49:44 : 1280dfd7b1de4c74cacf9515f371844b : jETTY HTTP Server -> servlet with content decompress -> pull from
> collections -> CSV to Avro encode -> Kafka publish (RUNNING)
> 16.06.2018 02:37:07 : aa7a691fa6c3f1ad619b6c0c4425ba1e : jETTY HTTP Server -> servlet with content decompress -> pull from
> collections -> CSV to Avro encode -> Kafka publish (RUNNING)
> --------------------------------------------------------------
> 2018-06-16 02:39:39,956 INFO org.apache.flink.runtime.rest.RestClient
> - Shutting down rest endpoint.
> 2018-06-16 02:39:39,957 INFO org.apache.flink.runtime.rest.RestClient
> - Rest endpoint shutdown complete.
I use the following code to extract the lines containing the job IDs:
extractRestResponse=`cat logFile.txt`
echo "extractRestResponse: "$extractRestResponse
w1="------------------ Running/Restarting Jobs -------------------"
w2="--------------------------------------------------------------"
extractRunningJobs="sed -e 's/.*'"$w1"'\(.*\)'"$w2"'.*/\1/' <<< $extractRestResponse"
runningJobs=`eval $extractRunningJobs`
echo "running jobs :"$runningJobs
awk to the rescue!
awk '/^-+$/{f=0} f; /^-+ Running\/Restarting Jobs -+$/{f=1}' logfile
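As a runnable sketch of that awk range filter (the file name logFile.txt and the one-line-per-job simplification are assumptions for the demo; in the real log each job wraps onto a second line), piping the filtered lines into a second awk prints only the job-id field:

```shell
#!/bin/sh
# Simplified sample of the delimited region from the log above.
cat > logFile.txt <<'EOF'
2018-06-16 02:39:39,953 INFO org.apache.flink.client.cli.CliFrontend
------------------ Running/Restarting Jobs -------------------
15.06.2018 18:49:44 : 1280dfd7b1de4c74cacf9515f371844b : jETTY HTTP Server (RUNNING)
16.06.2018 02:37:07 : aa7a691fa6c3f1ad619b6c0c4425ba1e : jETTY HTTP Server (RUNNING)
--------------------------------------------------------------
2018-06-16 02:39:39,956 INFO org.apache.flink.runtime.rest.RestClient
EOF

# First awk: a closing all-dash line clears the flag, 'f' prints while the
# flag is set, and the "Running/Restarting Jobs" header sets the flag (after
# the print, so the header itself is not emitted).
# Second awk: split on " : " and print the second field, the job id.
awk '/^-+$/{f=0} f; /^-+ Running\/Restarting Jobs -+$/{f=1}' logFile.txt |
  awk -F' : ' '{print $2}'
```

This prints the two ids, one per line.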
For sed:
sed -n '/^-* Running\/Restarting Jobs -*/,/^--*/{//!p;}' logFile.txt
Explanation:
- -n: by default, sed echoes each input line to standard output after applying the commands; this flag suppresses that behavior.
- /^-* Running\/Restarting Jobs -*/,/^--*/: matches the range of lines from the one matching ^-* Running\/Restarting Jobs -* to the one matching ^--*, inclusive.
- //!p: prints the lines in that range other than those matching the addresses themselves.
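A minimal, self-contained sketch of that sed range command (the simplified one-line-per-job logFile.txt is an assumption; the real log wraps each job across two lines):

```shell
#!/bin/sh
# Simplified sample of the delimited region from the log above.
cat > logFile.txt <<'EOF'
2018-06-16 02:39:39,953 INFO org.apache.flink.client.cli.CliFrontend
------------------ Running/Restarting Jobs -------------------
15.06.2018 18:49:44 : 1280dfd7b1de4c74cacf9515f371844b : jETTY HTTP Server (RUNNING)
16.06.2018 02:37:07 : aa7a691fa6c3f1ad619b6c0c4425ba1e : jETTY HTTP Server (RUNNING)
--------------------------------------------------------------
2018-06-16 02:39:39,956 INFO org.apache.flink.runtime.rest.RestClient
EOF

# The empty regex // reuses the last matched address, so the two delimiter
# lines themselves are excluded; only the job lines in between are printed.
sed -n '/^-* Running\/Restarting Jobs -*/,/^--*/{//!p;}' logFile.txt
```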
You can improve on the original substitution:
sed -e 's/.*'"$w1"'\(.*\)'"$w2"'.*/\1/' <<< $extractRestResponse
The output is the text between $w1 and $w2:
> 15.06.2018 18:49:44 : 1280dfd7b1de4c74cacf9515f371844b : jETTY HTTP Server -> servlet with content decompress -> pull from > collections -> CSV to Avro encode -> Kafka publish (RUNNING) > 16.06.2018 02:37:07 : aa7a691fa6c3f1ad619b6c0c4425ba1e : jETTY HTTP Server -> servlet with content decompress -> pull from > collections -> CSV to Avro encode -> Kafka publish (RUNNING) >
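If only the ids themselves are needed from that extracted region, grep can pull them out directly; the 32-hex-digit pattern here is an assumption inferred from the ids in the sample log:

```shell
#!/bin/sh
# The text between the delimiters, as extracted by the substitution above.
runningJobs='15.06.2018 18:49:44 : 1280dfd7b1de4c74cacf9515f371844b : jETTY (RUNNING) 16.06.2018 02:37:07 : aa7a691fa6c3f1ad619b6c0c4425ba1e : jETTY (RUNNING)'

# Assumption: Flink job ids are runs of exactly 32 lowercase hex digits.
# grep -o prints each match on its own line.
printf '%s\n' "$runningJobs" | grep -oE '[0-9a-f]{32}'
```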
"I noticed all the newlines are lost": you need to double-quote your "$extractRestResponse" variable (most likely). Good luck.
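A quick sketch of why that comment matters (the variable content is a made-up two-line example): when the expansion is unquoted, word splitting collapses the newlines into single spaces.

```shell
#!/bin/sh
# Two-line value standing in for the log file contents.
extractRestResponse='line one
line two'

# Unquoted: the shell splits on whitespace, so echo receives separate
# words and joins them with single spaces -> "line one line two".
echo $extractRestResponse

# Quoted: the value is passed through intact, newline included.
echo "$extractRestResponse"
```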