Bash: finding a string in a large log file

I have a huge log file that contains more than a million lines. It has 19 columns:

time | date | host | user | domain | category   | source | port | URL | etc

For example:

time    date    host    user    domain  category    source  port    URL etc
2:10:21 18.11.2014  192.168.56.101  %username1% %domainname%    "many words"    stackoverflow.com   "80"    http://stackoverflow.com/   
2:10:22 18.11.2014  192.168.56.101  %username2% %domainname%    "done"  stackoverflow.com   "80"    http://stackoverflow.com/   
2:10:23 18.11.2014  192.168.56.101  %username3% %domainname%    "denied site"   stackoverflow.com   "80"    http://stackoverflow.com/   
2:10:24 18.11.2014  192.168.56.101  %username4% %domainname%    "suspicious"    stackoverflow.com   "80"    http://stackoverflow.com/   
2:10:25 18.11.2014  192.168.56.101  %username5% %domainname%    "uncategorized" stackoverflow.com   "80"    http://stackoverflow.com/   
2:10:26 18.11.2014  192.168.56.101  %username6% %domainname%    "denied site"   stackoverflow.com   "80"    http://stackoverflow.com/   
2:10:27 18.11.2014  192.168.56.101  %username7% %domainname%    "many words"    stackoverflow.com   "80"    http://stackoverflow.com/
When I try to pull a string out of a column, the result sometimes comes out wrong:

user@stand-01:~/folder$cat file |awk '{FS=" ";print$6}'
category
"many
"done"
"denied
"suspicious"
"uncategorized"
"denied
"many
So when I try column 7, it contains data that belongs to another column:

user@stand-01:~/folder$cat file |awk '{FS=" ";print$7}'
source
words"
stackoverflow.com
site"
stackoverflow.com
stackoverflow.com
site"
words"
How can I use the space delimiter without splitting the text inside the quotes?


Thank you.

Something like this may work:

$ awk '$6 ~ /^"[^"]+"$/{print $6;next} $6 ~ /^"/{print $6, $7}' input
"many words"
"done"
"denied site"
"suspicious"
"uncategorized"
"denied site"
"many words"

Here is another approach with awk, splitting on the double quote:

awk -F\" 'NR>1{print $2}' file
many words
done
denied site
suspicious
uncategorized
denied site
many words
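
Since this splits on the quote character itself, the odd-numbered fields are the unquoted stretches of the line and the even-numbered ones are the quoted values: $2 is the category and $4 is the port. A small extension of the same idea (my own illustration, assuming the quoted columns are always the category and the port) pulls out both at once:

awk -F\" 'NR > 1 { print $2 " -> " $4 }' file

which for the sample above would print lines like many words -> 80.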


Rather than hunting for a complicated regex for this, it would be better to change the way this file is written so that it is comma-separated (CSV), tab-separated, or similar, i.e. delimited by something that can never appear inside a field. Otherwise it is likely to cause you more problems in the future.

Do you mean this: awk -v FS="\"" '{print $2}' file ?

Your file is tab-delimited, not space-delimited. Check it with the command head -1 logFile | cat -vte.

Although this does work, note that the OP asked "How can I use the space delimiter and not split the text in quotes?" @fedorqui Uff, so I did. I need a coffee, thanks.
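
To illustrate that check (the output below is hypothetical, assuming the file really is tab-separated): cat -vte prints tabs as ^I and marks each line end with $, so a tab-delimited header would look something like

$ head -1 file | cat -vte
time^Idate^Ihost^Iuser^Idomain^Icategory^Isource^Iport^IURL^Ietc$

and in that case the quoting problem goes away entirely, because you can split on the tab directly:

awk -F'\t' 'NR > 1 { print $6 }' file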
awk -F\" 'NR>1{print FS$2FS}' file
"many words"
"done"
"denied site"
"suspicious"
"uncategorized"
"denied site"
"many words"