Datetime: How to find missing datetime sequences in a file by iterating over it
Actually, I want to find breaks in the datetime sequence in a file and add blank lines where entries are missing. Sample file:
2017-09-07 01:00:00 10 0
2017-09-07 01:15:00 10 0
2017-09-07 01:30:00 10 0
2017-09-07 01:45:00 10 0
2017-09-07 03:00:00 10 0
2017-09-07 03:15:00 10 0
2017-09-07 03:30:00 10 0
2017-09-07 03:45:00 10 0
2017-09-07 05:00:00 10 0
2017-09-07 05:15:00 10 0
2017-09-07 05:30:00 10 0
2017-09-07 05:45:00 10 0
The output should be:
2017-09-07 01:00:00 10 0
2017-09-07 01:15:00 10 0
2017-09-07 01:30:00 10 0
2017-09-07 01:45:00 10 0
(four blank lines for the 02:00 hour)
2017-09-07 03:00:00 10 0
2017-09-07 03:15:00 10 0
2017-09-07 03:30:00 10 0
2017-09-07 03:45:00 10 0
(four blank lines for the 04:00 hour)
2017-09-07 05:00:00 10 0
2017-09-07 05:15:00 10 0
2017-09-07 05:30:00 10 0
2017-09-07 05:45:00 10 0
In GNU awk:
awk '
function foo(str) {            # converts $1 $2 to epoch time
    gsub(/[-:]/, " ", str)
    return mktime(str)
}
NR==1 {                        # set initial time
    p = foo($1 " " $2)
    print                      # the first record must be printed too
    next
}
{
    q = foo($1 " " $2)         # current time
    while (q > p + 900) {      # current should be previous + 900 s
        print ""               # if not, print empty record
        p += 900               # and increase p by 15 mins
    }
    print
    p = q                      # current is new previous
}' file
Output:
2017-09-07 01:00:00 10 0
2017-09-07 01:15:00 10 0
2017-09-07 01:30:00 10 0
2017-09-07 01:45:00 10 0




2017-09-07 03:00:00 10 0
...
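`mktime` is not part of POSIX awk, and in gawk it interprets its argument as local time, so the result can depend on the machine's time zone and DST rules. If that matters, a minimal sketch that does the date arithmetic by hand might look like this. The names `fill_gaps`, `days`, `epoch` and the `STEP` environment variable are made up for illustration, and the script assumes every record starts with `YYYY-MM-DD HH:MM:SS`.

```shell
#!/bin/sh
# Sketch only: fill_gaps, days, epoch and STEP are illustrative names.
fill_gaps() {
  awk -v step="${STEP:-900}" '
    # Days since 1970-01-01 for a Gregorian calendar date.
    function days(y, m, d) {
      if (m <= 2) { y--; m += 12 }   # count Jan/Feb as months 13/14 of the previous year
      return 365*y + int(y/4) - int(y/100) + int(y/400) + int((153*(m-3) + 2)/5) + d - 719469
    }
    # Seconds since the epoch, treating the timestamp as UTC (no DST).
    function epoch(date, time,    d, t) {
      split(date, d, "-"); split(time, t, ":")
      return days(d[1]+0, d[2]+0, d[3]+0)*86400 + t[1]*3600 + t[2]*60 + t[3]
    }
    {
      q = epoch($1, $2)
      # One blank line per missing slot; ">" keeps the loop finite
      # if a record is duplicated or out of order.
      if (NR > 1) while (q > p + step) { print ""; p += step }
      print
      p = q
    }' "$@"
}

# Demo: 02:00 and 02:15 are missing, so two blank lines are inserted.
printf '%s\n' '2017-09-07 01:45:00 10 0' '2017-09-07 02:30:00 10 0' | fill_gaps
```

Call it as `fill_gaps file`; setting `STEP` to another number of seconds (3600 for hourly data, say) changes the expected interval.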
Using gawk:
awk '
function get_dt(v)
{
    gsub(/[-:]/, " ", v);
    return strftime("%F %T", 900 + mktime(v))
}
{
    current_dt = $1 " " $2
}
next_dt != "" && current_dt != next_dt {
    while (current_dt != next_dt)
    {
        # print next_dt, "this is new"
        # here is your blank line
        print ""
        next_dt = get_dt(next_dt)
    }
}
{
    next_dt = get_dt($1 " " $2)
}1
' file
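A one-line version of the same script is shown further down with its output; it prints the missing timestamps instead of blank lines, in case you need them.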
Shortest awk solution:
awk -F'[[:space:]:]' '!a[$1,$2]++ && h && $2-h>1{ print "\n\n\n" }{ h=$2 }1' file
Output:
2017-09-07 01:00:00 10 0
2017-09-07 01:15:00 10 0
2017-09-07 01:30:00 10 0
2017-09-07 01:45:00 10 0




2017-09-07 03:00:00 10 0
2017-09-07 03:15:00 10 0
2017-09-07 03:30:00 10 0
2017-09-07 03:45:00 10 0




2017-09-07 05:00:00 10 0
2017-09-07 05:15:00 10 0
2017-09-07 05:30:00 10 0
2017-09-07 05:45:00 10 0
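The one-liner compares only the hour field, so a single missing record such as 2017-09-07 05:15:00 would pass unnoticed. A short sketch that numbers the 15-minute slots instead is below. It assumes all records fall on the same day, sit on 15-minute boundaries and use single spaces as separators; `fill_slots`, `slot` and `prev` are names chosen here, not taken from the answer.

```shell
#!/bin/sh
# Sketch only: fill_slots is an illustrative name.
fill_slots() {
  awk -F'[ :]' '
    # With this separator $2 is the hour and $3 the minute.
    { slot = int(($2*60 + $3) / 15) }                       # slot index within the day: 00:00 -> 0, 00:15 -> 1, ...
    NR > 1 { for (i = prev + 1; i < slot; i++) print "" }   # one blank line per skipped slot
    { prev = slot; print }
  ' "$@"
}

# Demo: 05:15:00 is missing, so one blank line appears after 05:00:00.
printf '%s\n' '2017-09-07 05:00:00 10 0' '2017-09-07 05:30:00 10 0' | fill_slots
```

Because the slot index restarts at midnight, this does not detect gaps that span two dates.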
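Yes, so what is your specific question? Also, which language are you actually using?
ksh -> awk
@prisoner Maybe you could tag them in the question; that would help you get answers quickly.
What if 2017-09-07 05:15:00 is missing?
I think OP wants to print new lines where the datetime sequence has gaps; it looks like a 15-minute interval is needed.
@AkshayHegde, you say "I think OP wants", but you cannot read the OP's mind; that is your subjective view. So if the OP says "actually, I meant a different logic, this does not give me the expected result", I will adjust or delete my answer.
Sure dear, sorry, that is what I understood from his/her post.
How can I apply a regex pattern to pick/process only the lines that match the date pattern, like /^[0-9][0-9][0-9][0-9]:[0-9][0-9]:[0-9][0-9]/ ? @AkshayHegde Roman Perekhrest
What if the file has entries that do not match the regex, such as 0 0.00 2017-09-25 01:30:00 13 0.00 2017-09-09-25 02:30:00 13 0.00 2017-09-25 03:30:00 13 0 0.00 @Akshay Hegde
How can I apply a regex pattern to pick/process only the lines that match the date pattern, like /^[0-9][0-9][0-9][0-9]:[0-9]:[0-9][0-9]:[0-9]/ ? @Akshay Hedge
NF&&/^[0-9][0-9][0-9][0-9]:[0-9][0-9]:[0-9][0-9]/{next}
Put this line at the beginning so that any other pattern is skipped: if the line has fields and does not match the pattern you mentioned above, it is skipped, because ^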
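For the filtering question in the comments, a rough sketch. The pattern as quoted there contains only colons, so it would not match lines that begin with a date like 2017-09-25. Assuming the records of interest always start with `YYYY-MM-DD HH:MM:SS`, a filter along these lines keeps them and drops the rest (`only_datetime_lines` is a name invented here). Written inside one of the scripts instead, the same test would need a negation, e.g. `!/pattern/ { next }`.

```shell
#!/bin/sh
# Sketch only: only_datetime_lines is an illustrative name.
only_datetime_lines() {
  # Keep a line only if it begins with YYYY-MM-DD HH:MM:SS; drop everything else.
  awk '/^[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9] [0-9][0-9]:[0-9][0-9]:[0-9][0-9]/' "$@"
}

# Demo with stray lines like the ones mentioned in the comments.
printf '%s\n' '0 0.00' '2017-09-25 01:30:00 13 0.00' '2017-09-09-25 02:30:00 13 0.00' '2017-09-25 03:30:00 13 0.00' | only_datetime_lines
```

The filtered stream can then be piped into any of the gap-filling scripts, which read standard input when no file name is given.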
One-liner that prints the missing timestamps instead of blank lines:
$ awk 'function get_dt(v){gsub(/[-:]/," ",v); return strftime("%F %T",900 + mktime(v))}{current_dt=$1" "$2}next_dt != "" && current_dt != next_dt{while(current_dt!=next_dt){ print next_dt" this is new"; next_dt=get_dt(next_dt)}}{next_dt = get_dt($1" "$2)}1' infile
2017-09-07 01:00:00 10 0
2017-09-07 01:15:00 10 0
2017-09-07 01:30:00 10 0
2017-09-07 01:45:00 10 0
2017-09-07 02:00:00 this is new
2017-09-07 02:15:00 this is new
2017-09-07 02:30:00 this is new
2017-09-07 02:45:00 this is new
2017-09-07 03:00:00 10 0
2017-09-07 03:15:00 10 0
2017-09-07 03:30:00 10 0
2017-09-07 03:45:00 10 0
2017-09-07 04:00:00 this is new
2017-09-07 04:15:00 this is new
2017-09-07 04:30:00 this is new
2017-09-07 04:45:00 this is new
2017-09-07 05:00:00 10 0
2017-09-07 05:15:00 10 0
2017-09-07 05:30:00 10 0
2017-09-07 05:45:00 10 0