Perl Date::Parse - How to correctly parse dates between 1901 and 1969

Background


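I am using Perl to parse dates and datetimes entered by users who are not very careful about formatting. The Perl module Date::Parse looked great, because it handles most of the cases I need to deal with.

The exception, which I discovered today, is datetimes between 1901-01-01 00:00:00 and 1968-12-31 23:59:59. For these, Date::Parse's str2time adds an extra 100 years when it converts the datetime to epoch time.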

Code

Here is the code I use to parse the datetimes:

#!/usr/bin/perl
#---------------------------------------------------------------------
# format_date.pl
#
# format variable date inputs
#---------------------------------------------------------------------

use strict;
use warnings;

use Date::Parse;
use DateTime;

my $DEFAULT_TIME_ZONE = "GMT";

my @dates = (
    "1899-06-24 09:44:00",
    "1900-12-31 23:59:59",
    "1901-01-01 00:00:00",
    "1960-12-31 23:59:59",
    "1966-06-24 09:44:00",
    "1968-12-31 23:59:59",
    "1969-01-01 00:00:00",
    "1969-12-31 23:59:59",
    "1970-01-01 00:00:01",
    "2000-01-01 00:00:00",
    "2017-06-24 23:59:59",
    "2018-06-24 09:44:00",
    "2238-06-24 09:44:00"

);

foreach my $string (@dates) {

    # format datetime field from any valid datetime input
    # default time zone is used if timezone is not included in string
    my $epoch = str2time( $string, $DEFAULT_TIME_ZONE );

    # error if date is not correctly parsed
    if ( !$epoch ) {
        die("ERROR ====> invalid datetime ($string), "
        . "datetime format should be YYYY-MM-DD HH:MM:SS");
    }

    my $date = DateTime->from_epoch( epoch => $epoch );

    printf( "formatting datetime: value = %20s, epoch = %20u, "
            . "date = %20s\n", $string, $epoch, $date );

}

exit 0;
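
Side note: I need to improve the error handling, because the valid date 1970-01-01 00:00:00 throws an error (its epoch is 0, which the !$epoch test treats as a failure).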
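
One way to tighten the error handling mentioned in the side note is to test whether str2time returned a defined value rather than a true one. The sketch below assumes Date::Parse is installed; parse_epoch is a hypothetical helper name, not part of the module.

```perl
use strict;
use warnings;

use Date::Parse qw(str2time);

# Hypothetical helper: return the epoch for $string, or die if
# str2time could not parse it.
sub parse_epoch {
    my ( $string, $zone ) = @_;

    my $epoch = str2time( $string, $zone );

    # str2time returns undef on failure; testing with defined() keeps
    # a legitimate epoch of 0 from being mistaken for an error
    die "ERROR ====> invalid datetime ($string)\n" unless defined $epoch;

    return $epoch;
}

printf "epoch = %d\n", parse_epoch( "1970-01-01 00:00:00", "GMT" );
```

The only change from the question's check is defined $epoch in place of !$epoch, so the first second of 1970 is no longer rejected.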

Output

The extra 100 years for dates from 1901 through 1968 can be seen in the output:

formatting datetime: value =  1899-06-24 09:44:00, epoch = 18446744071484095456, date =  1899-06-24T09:44:00
formatting datetime: value =  1900-12-31 23:59:59, epoch = 18446744071532098815, date =  1900-12-31T23:59:59
formatting datetime: value =  1901-01-01 00:00:00, epoch =            978307200, date =  2001-01-01T00:00:00
formatting datetime: value =  1960-12-31 23:59:59, epoch =           2871763199, date =  2060-12-31T23:59:59
formatting datetime: value =  1966-06-24 09:44:00, epoch =           3044598240, date =  2066-06-24T09:44:00
formatting datetime: value =  1968-12-31 23:59:59, epoch =           3124223999, date =  2068-12-31T23:59:59
formatting datetime: value =  1969-01-01 00:00:00, epoch = 18446744073678015616, date =  1969-01-01T00:00:00
formatting datetime: value =  1969-12-31 23:59:59, epoch = 18446744073709551615, date =  1969-12-31T23:59:59
formatting datetime: value =  1970-01-01 00:00:01, epoch =                    1, date =  1970-01-01T00:00:01
formatting datetime: value =  2000-01-01 00:00:00, epoch =            946684800, date =  2000-01-01T00:00:00
formatting datetime: value =  2017-06-24 23:59:59, epoch =           1498348799, date =  2017-06-24T23:59:59
formatting datetime: value =  2018-06-24 09:44:00, epoch =           1529833440, date =  2018-06-24T09:44:00
formatting datetime: value =  2238-06-24 09:44:00, epoch =           8472332640, date =  2238-06-24T09:44:00
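
Additional notes

The documentation indicates that it can handle dates at least as far back as 1901-01-01, and suggests that it should be able to handle even older dates.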

Question

How do I deal with this quirk? Is there a better way to parse variable input formats in Perl?

Edit: examples of multiple date formats

The input can arrive in many formats. Here is a sample:

my @dates = (
    "2018-02-20 00:00:00",
    "20180220",
    "02/20/2018",
    "02/20/18",    # interpreted as 1918-02-20
    "2018-02-20"
);

tangent's answer addresses the root problem.

The problem lies in Date::Parse itself.
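
A likely mechanism, offered here as an assumption rather than something the thread states: Time::Local, which ships with Perl, takes a year above 999 literally but reads a year from 0 to 99 as a two-digit year in a rolling century around the present. If a parser hands it the year 1901 reduced to 1, the result lands in 2001, which matches the epoch 978307200 shown for 1901-01-01 in the question's output. A core-only sketch of that behaviour:

```perl
use strict;
use warnings;

use Time::Local qw(timegm);

# Arguments: sec, min, hour, day of month, month (0-based), year

# A four-digit year is taken literally, giving a negative epoch
my $literal = timegm( 0, 0, 0, 1, 0, 1901 );

# The same date with the year reduced to 1901 - 1900 = 1 is read as a
# two-digit year and resolved to the century nearest the current date
my $reduced = timegm( 0, 0, 0, 1, 0, 1 );

printf "year 1901 => %d\n", $literal;
printf "year    1 => %d\n", $reduced;
```

Because the window is centred on the current year, which two-digit years map to which century shifts over time; that would also explain why 1969 came out right in the question while 1968 did not.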

Solution 1

My solution is to use Date::Parse's strptime instead of str2time.

strptime parses the date into a list ($ss, $mm, $hh, $day, $month, $year, $zone), which lets the year be converted back to a four-digit year with:

if ( $year < 1000 ) { $year += 1900; }
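
Putting Solution 1 together might look like the sketch below. It assumes Date::Parse and DateTime are installed; parse_to_datetime is a hypothetical helper name, and the handling of the zone offset (subtracting it to reach UTC) is an assumption about what the caller wants rather than something taken from the original answer.

```perl
use strict;
use warnings;

use Date::Parse qw(strptime);
use DateTime;

# Hypothetical helper: parse $string with strptime and build a UTC
# DateTime object, restoring the four-digit year along the way
sub parse_to_datetime {
    my ( $string, $default_zone ) = @_;

    my ( $ss, $mm, $hh, $day, $month, $year, $zone )
        = strptime( $string, $default_zone );

    # give up if the date part is incomplete
    return unless defined $year && defined $month && defined $day;

    # strptime returns years after 1900 as an offset from 1900
    if ( $year < 1000 ) { $year += 1900; }

    my $dt = DateTime->new(
        year      => $year,
        month     => $month + 1,        # strptime months are 0-based
        day       => $day,
        hour      => $hh // 0,
        minute    => $mm // 0,
        second    => int( $ss // 0 ),   # drop fractional seconds
        time_zone => 'UTC',
    );

    # $zone is the offset from UTC in seconds; remove it to get UTC
    $dt->subtract( seconds => $zone ) if $zone;

    return $dt;
}

for my $string ( "1901-01-01 00:00:00", "1966-06-24 09:44:00" ) {
    my $dt = parse_to_datetime( $string, "GMT" )
        or die "ERROR ====> invalid datetime ($string)\n";
    printf "%-19s => %s\n", $string, $dt;
}
```

With this rule a two-digit input year such as 18 still becomes 1918, as noted in the question's list of formats, so two-digit years need a separate policy if they should land in the current century.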

The original answer linked to an example script and its output.

Adding another possible solution that relies on a single module.

Hope this helps, BR.

Epoch time is the number of seconds since 1970-01-01T00:00:00Z. The dates you are trying to convert to epoch time are earlier than that.
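
Dates before 1970 are still representable, as negative epoch values. The very large numbers in the question's output (for example 18446744073709551615 for 1969-12-31 23:59:59) are what a negative epoch looks like when printed through the unsigned %u conversion on a 64-bit Perl: that value is 2^64 - 1, which is -1 read as unsigned. A core-only sketch, in which epoch_to_iso is a hypothetical helper name:

```perl
use strict;
use warnings;

use POSIX qw(strftime);

# Hypothetical helper: format an epoch (possibly negative) as a UTC
# ISO 8601 timestamp using only core modules
sub epoch_to_iso {
    my ($epoch) = @_;
    return strftime( "%Y-%m-%dT%H:%M:%S", gmtime($epoch) );
}

# One second before the epoch; %d keeps the sign that %u would hide
my $epoch = -1;
printf "epoch = %d, date = %s\n", $epoch, epoch_to_iso($epoch);
```

Printing the epoch with %d instead of %u in the question's script would make the pre-1970 rows easier to read.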

Why are you using two different date-time libraries? If you need a DateTime object, use a DateTime module.

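use DateTime::Format::DateParse qw( );

# @dates and $DEFAULT_TIME_ZONE are the ones defined in the question
for my $dt_str (@dates) {
    my $dt = DateTime::Format::DateParse->parse_datetime($dt_str, $DEFAULT_TIME_ZONE)
       or die("ERROR ====> invalid datetime ($dt_str)");

    # print each input next to the DateTime it was parsed into
    printf("%-19s => %s\n", $dt_str, $dt);
}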

produces:

1899-06-24 09:44:00 => 3799-06-24T09:44:00  <- doh!
1900-12-31 23:59:59 => 3800-12-31T23:59:59  <- doh!
1901-01-01 00:00:00 => 1901-01-01T00:00:00
1960-12-31 23:59:59 => 1960-12-31T23:59:59
1966-06-24 09:44:00 => 1966-06-24T09:44:00
1968-12-31 23:59:59 => 1968-12-31T23:59:59
1969-01-01 00:00:00 => 1969-01-01T00:00:00
1969-12-31 23:59:59 => 1969-12-31T23:59:59
1970-01-01 00:00:01 => 1970-01-01T00:00:01
2000-01-01 00:00:00 => 2000-01-01T00:00:00
2017-06-24 23:59:59 => 2017-06-24T23:59:59
2018-06-24 09:44:00 => 2018-06-24T09:44:00
2238-06-24 09:44:00 => 2238-06-24T09:44:00
2018-02-20 00:00:00 => 2018-02-20T00:00:00
20180220            => 2018-02-20T00:00:00
02/20/2018          => 2018-02-20T00:00:00
02/20/18            => 1918-02-20T00:00:00
2018-02-20          => 2018-02-20T00:00:00
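
A second output listing, in which the pre-1901 dates come out correctly and 02/20/18 resolves to 2018:

1899-06-24 09:44:00 => 1899-06-24T09:44:00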
1900-12-31 23:59:59 => 1900-12-31T23:59:59
1901-01-01 00:00:00 => 1901-01-01T00:00:00
1960-12-31 23:59:59 => 1960-12-31T23:59:59
1966-06-24 09:44:00 => 1966-06-24T09:44:00
1968-12-31 23:59:59 => 1968-12-31T23:59:59
1969-01-01 00:00:00 => 1969-01-01T00:00:00
1969-12-31 23:59:59 => 1969-12-31T23:59:59
1970-01-01 00:00:01 => 1970-01-01T00:00:01
2000-01-01 00:00:00 => 2000-01-01T00:00:00
2017-06-24 23:59:59 => 2017-06-24T23:59:59
2018-06-24 09:44:00 => 2018-06-24T09:44:00
2238-06-24 09:44:00 => 2238-06-24T09:44:00
2018-02-20 00:00:00 => 2018-02-20T00:00:00
20180220            => 2018-02-20T00:00:00
02/20/2018          => 2018-02-20T00:00:00
02/20/18            => 2018-02-20T00:00:00
2018-02-20          => 2018-02-20T00:00:00

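Comments:

"Variable input formats"? All the inputs are YYYY-MM-DD hh:mm:ss. What am I missing? Please show at least one good example of these variable formats.

The problem lies in Date::Parse itself.

I have added an example of the variable formats. I left them out at first because I was trying to isolate the extra-100-years problem.

The OP needs to handle dates in the many formats that Date::Parse supports. The sample code in the question is not a good illustration, because all of its dates are ISO 8601.

Here is an array of the formats I might receive:
my @dates = ( "2018-02-20 00:00:00", "20180220", "02/20/2018", "02/20/18", "2018-02-20" );

Using Date::Manip greatly simplified my code (see my answer). Thanks.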
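
The Date::Manip answer mentioned in the last comment is not reproduced in this thread. As a rough idea of what such an approach can look like, here is a minimal sketch using Date::Manip's functional interface (ParseDate and UnixDate). It assumes Date::Manip is installed; normalise_date is a hypothetical helper name and this is not the OP's actual code.

```perl
use strict;
use warnings;

use Date::Manip;    # exports ParseDate and UnixDate

# Hypothetical helper: normalise a free-form date string to
# YYYY-MM-DDTHH:MM:SS, or return undef if it cannot be parsed
sub normalise_date {
    my ($string) = @_;

    # ParseDate returns an empty string when parsing fails
    my $date = ParseDate($string);
    return unless $date;

    return UnixDate( $date, "%Y-%m-%dT%H:%M:%S" );
}

for my $string ( "1901-01-01 00:00:00", "20180220", "02/20/2018" ) {
    my $iso = normalise_date($string);
    printf "%-19s => %s\n", $string, defined $iso ? $iso : "(unparsed)";
}
```

This keeps parsing and formatting inside one module and never passes through an epoch value, so the pre-1970 range needs no special handling.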