Regex: access_log processing in Hive (Regex, Logging, Hadoop, Hive) - Fatal编程技术网


I have an access_log file of about 500 MB. Here is a sample:

10.223.157.186 - - [15/Jul/2009:14:58:59 -0700] "GET / HTTP/1.1" 403 15779
10.223.157.186 - - [15/Jul/2009:14:58:59 -0700] "GET /favicon.ico HTTP/1.1" 404 5397
10.216.113.172 - - [29/Apr/2010:07:19:48 -0700] "GET / HTTP/1.1" 200 68831
How can I extract the month from the timestamp?

Expected output:

year   month    day    event occurrence

2009   jul      15     GET /favicon.ico HTTP/1.1

2010   apr      29     GET / HTTP/1.1
I tried this:

add jar /usr/lib/hive/lib/hive-contrib-0.7.1-cdh3u2.jar;

create table log(ip string, gt string, gt1 string, timestamp string, id1 string, s1 string, s2 string)
row format serde 'org.apache.hadoop.hive.contrib.serde2.RegexSerDe'
with serdeproperties('input.regex' = '^(\\S+) (\\S+) (\\S+) \\[([^\\]]+)\\] "(.+?)" (\\S+) (\\S+)')
location '/path';
If I understand correctly, string functions will not work in this case. I am new to regex and Hive.

Please help. Thanks in advance.

I'm not familiar with Hadoop/Hive, but as far as the regex goes, here is how I would do it in Ruby:

log_file = %Q[
  10.223.157.186 - - [15/Jul/2009:14:58:59 -0700] "GET / HTTP/1.1" 403 15779
  10.223.157.186 - - [15/Jul/2009:14:58:59 -0700] "GET /favicon.ico HTTP/1.1" 404 5397
  10.216.113.172 - - [29/Apr/2010:07:19:48 -0700] "GET / HTTP/1.1" 200 68831
]

# Groups: 1 = day, 2 = month name, 3 = year, 4 = request line (without the quotes)
regex = /^.*? - - \[(\d+)\/(\w+)\/(\d{4}).*?\] "(.*?)"/

converted_lines = log_file.split("\n").map do |line|
  matches = regex.match(line)
  next if matches.nil? # skip blank or malformed lines instead of raising on nil
  [
    [:year, matches[3]],
    [:month, matches[2].downcase], # "Jul" -> "jul", as in the expected output
    [:day, matches[1]],
    [:event_occurrence, matches[4]],
  ]
end.compact
Hope that helps.
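
To get from the captured fields to the tab-separated year / month / day / event layout the question asks for, something like the following might work. It is only a sketch in plain Ruby, not Hive: `LOG_LINE`, `parse_log_line` and `format_rows` are hypothetical names made up for illustration, and it assumes every entry follows the same format as the sample lines in the question.

```ruby
# Hypothetical pattern: day, month abbreviation, year, then the quoted request.
LOG_LINE = /\[(\d{2})\/(\w{3})\/(\d{4}):[^\]]*\] "([^"]*)"/

# Returns [year, month, day, request] for one log line,
# or nil when the line does not look like an access-log entry.
def parse_log_line(line)
  m = LOG_LINE.match(line)
  return nil unless m

  day, month, year, request = m.captures
  [year, month.downcase, day, request]
end

# Turns an array of raw lines into tab-separated rows, dropping unparsable lines.
def format_rows(lines)
  lines.map { |line| parse_log_line(line) }
       .compact
       .map { |row| row.join("\t") }
end

sample = [
  '10.223.157.186 - - [15/Jul/2009:14:58:59 -0700] "GET /favicon.ico HTTP/1.1" 404 5397',
  '10.216.113.172 - - [29/Apr/2010:07:19:48 -0700] "GET / HTTP/1.1" 200 68831',
  ''
]

puts format_rows(sample)
```

The month is lowercased only because the expected output in the question shows it that way. Blank or malformed lines are dropped rather than raising an error, which seems the safer default for a 500 MB log.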