Datetime: how to find missing datetime sequence entries in a file by iterating over it


I want to find breaks in the datetime sequence in a file and add blank lines where entries are missing.

Sample file:

2017-09-07 01:00:00 10 0
2017-09-07 01:15:00 10 0
2017-09-07 01:30:00 10 0
2017-09-07 01:45:00 10 0
2017-09-07 03:00:00 10 0
2017-09-07 03:15:00 10 0
2017-09-07 03:30:00 10 0
2017-09-07 03:45:00 10 0
2017-09-07 05:00:00 10 0
2017-09-07 05:15:00 10 0
2017-09-07 05:30:00 10 0
2017-09-07 05:45:00 10 0

The output should be:

2017-09-07 01:00:00 10 0
2017-09-07 01:15:00 10 0
2017-09-07 01:30:00 10 0
2017-09-07 01:45:00 10 0

(four blank lines for hour 02)

2017-09-07 03:00:00 10 0
2017-09-07 03:15:00 10 0
2017-09-07 03:30:00 10 0
2017-09-07 03:45:00 10 0

(four blank lines for hour 04)

2017-09-07 05:00:00 10 0
2017-09-07 05:15:00 10 0
2017-09-07 05:30:00 10 0
2017-09-07 05:45:00 10 0

In GNU awk:

awk '
function foo(str) {       # converts $1 $2 to epoch time
    gsub(/[-:]/," ",str)
    return mktime(str)
}
NR==1 {                   # set initial time
    p=foo($1 " " $2)
    print                 # output the first record as well
    next
}
{
    q=foo($1 " " $2)      # current time
    while(q>p+900) {      # current should be previous + 900 s
        print ""          # if it is later, print an empty record
        p=p+900           # and increase p by 15 mins
    }
    print
    p=q                   # current is new previous
}' file

Output:

2017-09-07 01:00:00 10 0
2017-09-07 01:15:00 10 0
2017-09-07 01:30:00 10 0
2017-09-07 01:45:00 10 0




2017-09-07 03:00:00 10 0
...
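
The mktime-based approach needs gawk (or another awk that provides mktime) and interprets the timestamps in the local time zone, so a daylight-saving switch may distort the 900-second step. As a hedged alternative, here is a minimal sketch in plain POSIX awk that does its own date arithmetic. The names fill_gaps, days and secs and the optional step argument are my own and not part of the answer above; the sketch assumes the input is sorted and that $1 and $2 hold the date and the time.

```shell
# fill_gaps [step]: read "YYYY-MM-DD HH:MM:SS ..." records on stdin and print
# one empty line for every missing slot (step in seconds, default 900).
# Hypothetical helper, assuming sorted input.
fill_gaps() {
    awk -v step="${1:-900}" '
    # Day number of a Gregorian date; only differences between days matter.
    function days(y, m, d) {
        if (m <= 2) { y--; m += 12 }       # count January/February as months 13/14
        return 365*y + int(y/4) - int(y/100) + int(y/400) + int((153*(m-3) + 2)/5) + d
    }
    # Seconds since an arbitrary fixed origin, ignoring time zones and DST.
    function secs(date, time,    a, b) {
        split(date, a, "-"); split(time, b, ":")
        return days(a[1]+0, a[2]+0, a[3]+0)*86400 + b[1]*3600 + b[2]*60 + b[3]
    }
    {
        cur = secs($1, $2)
        if (NR > 1)                        # pad every slot between prev and cur
            for (t = prev + step; t < cur; t += step)
                print ""
        print
        prev = cur
    }'
}
```

Usage would be fill_gaps < file, or fill_gaps 3600 < file for hourly data. Because the loop only runs while t < cur, duplicate or unaligned timestamps cannot make it spin forever.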

Using gawk:

awk '
    function get_dt(v)
    {
          gsub(/[-:]/," ",v); 
          return strftime("%F %T",900 + mktime(v))
    }
    {
        current_dt=$1" "$2
    }
    next_dt != "" && current_dt != next_dt{
        while(current_dt!=next_dt)
        { 
            # print next_dt, "this is new"
            # here is your blank line
            print ""

            next_dt=get_dt(next_dt)
        }
    }
    {
        next_dt = get_dt($1" "$2)
    }1
  ' file
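If you need the missing timestamps rather than blank lines, print next_dt in place of the empty string, as in the commented-out line above.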

Shortest awk solution:

awk -F'[[:space:]:]' '!a[$1,$2]++ && h && $2-h>1{ print "\n\n\n" }{ h=$2 }1' file

Output:

2017-09-07 01:00:00 10 0
2017-09-07 01:15:00 10 0
2017-09-07 01:30:00 10 0
2017-09-07 01:45:00 10 0




2017-09-07 03:00:00 10 0
2017-09-07 03:15:00 10 0
2017-09-07 03:30:00 10 0
2017-09-07 03:45:00 10 0




2017-09-07 05:00:00 10 0
2017-09-07 05:15:00 10 0
2017-09-07 05:30:00 10 0
2017-09-07 05:45:00 10 0
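
The one-liner above works per hour and always emits four blank lines, however many hours were skipped. Here is a minimal sketch of the same hour-based idea that pads four lines for each skipped hour. The function name blank_missing_hours is my own; it assumes four records per hour and does not pad gaps that cross midnight.

```shell
# blank_missing_hours: hypothetical helper; prints four empty lines for every
# whole hour missing between consecutive records of the same day.
blank_missing_hours() {
    awk -F'[[:space:]:]' '
    $2 != h || $1 != d {                  # first record of a new date or hour
        if (d == $1)                      # same day: pad each skipped hour
            for (i = h + 1; i < $2; i++)
                printf "\n\n\n\n"
        d = $1; h = $2                    # remember the current date and hour
    }
    1'                                    # print every input record
}
```

Usage would be blank_missing_hours < file.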

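Comments:

Yes, so what exactly is your question? Also, which language are you actually using?

ksh -> awk

@prisoner Maybe you could tag them in the question; that will help you get answers quickly.

If 2017-09-07 05:15:00 is missing, I think the OP wants a new line printed wherever there is a gap in the datetimes; it looks like a 15-minute interval is needed.

@AkshayHegde, you say "I think the OP wants"; you cannot read the OP's mind, that is your subjective view. So if the OP says "actually, I meant different logic, and this does not give me the expected result", I will adjust or delete my answer.

Sure, sorry, that is what I understood from his/her post.

How can I apply a regex pattern so that only lines matching the date pattern are picked up and processed, e.g. /^[0-9][0-9][0-9][0-9]:[0-9][0-9]:[0-9][0-9]/ @AkshayHegde Roman Perekhrest

What if the file contains entries that do not match the regex, such as 0 0.00 2017-09-25 01:30:00 13 0.00 2017-09-09-25 02:30:00 13 0.00 2017-09-25 03:30:00 13 0 0.00 @Akshay Hegde

NF && !/^[0-9][0-9][0-9][0-9]:[0-9][0-9]:[0-9][0-9]/{next}
Put this line at the beginning so that anything else is skipped: if a line has fields and does not match the pattern, it is skipped. Note the ^ in the pattern you mentioned above.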
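
The comment above asks how to process only the lines that carry a timestamp. Here is a minimal sketch of such a pre-filter, assuming the records start with YYYY-MM-DD HH:MM:SS as in the sample file; the function name dated_only is my own. The pattern is anchored on the date followed by the time, so lines such as "0 0.00" are dropped.

```shell
# dated_only: hypothetical pre-filter that passes through only the lines
# starting with "YYYY-MM-DD HH:MM:SS".
dated_only() {
    awk '/^[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9] [0-9][0-9]:[0-9][0-9]:[0-9][0-9]/'
}
```

The filtered stream can then be piped into any of the gap-filling commands from the answers, for example dated_only < file | awk '...'.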
One-liner version of the gawk answer that prints the missing timestamps instead of blank lines:

$ awk 'function get_dt(v){gsub(/[-:]/," ",v); return strftime("%F %T",900 + mktime(v))}{current_dt=$1" "$2}next_dt != "" && current_dt != next_dt{while(current_dt!=next_dt){ print next_dt" this is new"; next_dt=get_dt(next_dt)}}{next_dt = get_dt($1" "$2)}1' infile
2017-09-07 01:00:00 10 0
2017-09-07 01:15:00 10 0
2017-09-07 01:30:00 10 0
2017-09-07 01:45:00 10 0
2017-09-07 02:00:00 this is new
2017-09-07 02:15:00 this is new
2017-09-07 02:30:00 this is new
2017-09-07 02:45:00 this is new
2017-09-07 03:00:00 10 0
2017-09-07 03:15:00 10 0
2017-09-07 03:30:00 10 0
2017-09-07 03:45:00 10 0
2017-09-07 04:00:00 this is new
2017-09-07 04:15:00 this is new
2017-09-07 04:30:00 this is new
2017-09-07 04:45:00 this is new
2017-09-07 05:00:00 10 0
2017-09-07 05:15:00 10 0
2017-09-07 05:30:00 10 0
2017-09-07 05:45:00 10 0