How to convert date/time strings with an hour value of 24 to datetime in Pandas?


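I am importing emails as text files from the standard Mail application (Mac OS X). Unfortunately, many of the dates on the emails have times like "24:01:01", which is an invalid time (it should be "00:01:01").

Is there a simple way to convert these?

Normal date/time strings work fine: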

>>> pd.to_datetime("March 23, 2011 at 23:42:46  PDT")
Timestamp('2011-03-23 23:42:46-0700', tz='pytz.FixedOffset(-420)')
Abnormal date string:

>>> pd.to_datetime("March 23, 2011 at 24:42:46  PDT")
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
~/anaconda/envs/pyqt/lib/python3.6/site-packages/pandas/core/arrays/datetimes.py in objects_to_datetime64ns(data, dayfirst, yearfirst, utc, errors, require_iso8601, allow_object)
   1860         try:
-> 1861             values, tz_parsed = conversion.datetime_to_datetime64(data)
   1862             # If tzaware, these values represent unix timestamps, so we

pandas/_libs/tslibs/conversion.pyx in pandas._libs.tslibs.conversion.datetime_to_datetime64()

TypeError: Unrecognized value type: <class 'str'>

During handling of the above exception, another exception occurred:

ValueError                                Traceback (most recent call last)
<ipython-input-38-4cb009b21802> in <module>
----> 1 pd.to_datetime("March 23, 2011 at 24:42:46  PDT")

~/anaconda/envs/pyqt/lib/python3.6/site-packages/pandas/core/tools/datetimes.py in to_datetime(arg, errors, dayfirst, yearfirst, utc, box, format, exact, unit, infer_datetime_format, origin, cache)
    609             result = convert_listlike(arg, box, format)
    610     else:
--> 611         result = convert_listlike(np.array([arg]), box, format)[0]
    612 
    613     return result

~/anaconda/envs/pyqt/lib/python3.6/site-packages/pandas/core/tools/datetimes.py in _convert_listlike_datetimes(arg, box, format, name, tz, unit, errors, infer_datetime_format, dayfirst, yearfirst, exact)
    300             arg, dayfirst=dayfirst, yearfirst=yearfirst,
    301             utc=utc, errors=errors, require_iso8601=require_iso8601,
--> 302             allow_object=True)
    303 
    304     if tz_parsed is not None:

~/anaconda/envs/pyqt/lib/python3.6/site-packages/pandas/core/arrays/datetimes.py in objects_to_datetime64ns(data, dayfirst, yearfirst, utc, errors, require_iso8601, allow_object)
   1864             return values.view('i8'), tz_parsed
   1865         except (ValueError, TypeError):
-> 1866             raise e
   1867 
   1868     if tz_parsed is not None:

~/anaconda/envs/pyqt/lib/python3.6/site-packages/pandas/core/arrays/datetimes.py in objects_to_datetime64ns(data, dayfirst, yearfirst, utc, errors, require_iso8601, allow_object)
   1855             dayfirst=dayfirst,
   1856             yearfirst=yearfirst,
-> 1857             require_iso8601=require_iso8601
   1858         )
   1859     except ValueError as e:

pandas/_libs/tslib.pyx in pandas._libs.tslib.array_to_datetime()

pandas/_libs/tslib.pyx in pandas._libs.tslib.array_to_datetime()

pandas/_libs/tslib.pyx in pandas._libs.tslib.array_to_datetime_object()

pandas/_libs/tslib.pyx in pandas._libs.tslib.array_to_datetime_object()

pandas/_libs/tslibs/parsing.pyx in pandas._libs.tslibs.parsing.parse_datetime_string()

~/anaconda/envs/pyqt/lib/python3.6/site-packages/dateutil/parser/_parser.py in parse(timestr, parserinfo, **kwargs)
   1354         return parser(parserinfo).parse(timestr, **kwargs)
   1355     else:
-> 1356         return DEFAULTPARSER.parse(timestr, **kwargs)
   1357 
   1358 

~/anaconda/envs/pyqt/lib/python3.6/site-packages/dateutil/parser/_parser.py in parse(self, timestr, default, ignoretz, tzinfos, **kwargs)
    651             raise ValueError("String does not contain a date:", timestr)
    652 
--> 653         ret = self._build_naive(res, default)
    654 
    655         if not ignoretz:

~/anaconda/envs/pyqt/lib/python3.6/site-packages/dateutil/parser/_parser.py in _build_naive(self, res, default)
   1225                 repl['day'] = monthrange(cyear, cmonth)[1]
   1226 
-> 1227         naive = default.replace(**repl)
   1228 
   1229         if res.weekday is not None and not res.day:

ValueError: hour must be in 0..23
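
The last frames of the traceback show where the failure originates: dateutil builds its result with `datetime.replace`, and the standard library's `datetime` rejects an hour of 24. A minimal reproduction using only the standard library, with the date values taken from the example above, might look like this:

```python
from datetime import datetime

# Midnight on the date from the failing example.
base = datetime(2011, 3, 23)

try:
    # Same kind of call dateutil makes internally: replace the time fields.
    base.replace(hour=24, minute=42, second=46)
    message = None
except ValueError as exc:
    # datetime only accepts hours from 0 to 23.
    message = str(exc)

print(message)
```

The call raises a `ValueError` about the hour range, so the string has to be repaired before (or instead of) handing it to the parser.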

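First convert the good datetimes with `errors='coerce'`, which yields `NaT` for the bad values. Then filter the bad rows, replace `24` with `00`, parse again, and add one day. The last step fills the missing values with the repaired ones: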

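import pandas as pd

d = ["March 23, 2011 at 24:42:46  PDT",
     "March 23, 2011 at 23:42:46  PDT"]

s = pd.Series(d)

# Parse everything; strings with an hour of 24 become NaT.
s1 = pd.to_datetime(s, errors='coerce')
m = s1.isna()

# For the failed rows only: rewrite hour 24 as 00, parse again, add one day.
s2 = (pd.to_datetime(s[m].replace('at 24:', 'at 00:', regex=True), errors='coerce') +
      pd.Timedelta(1, unit='d'))

# Fill the NaT positions with the repaired values (aligned on the index).
s = s1.fillna(s2)
print(s)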
0   2011-03-24 00:42:46
1   2011-03-23 23:42:46
dtype: datetime64[ns]
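
The same replace-then-shift idea can be sketched for a single string with only the standard library. This is an illustration under assumptions, not part of the answer: the helper name `parse_hour24` is hypothetical, the format string is inferred from the two sample strings, and the trailing timezone abbreviation is dropped rather than interpreted.

```python
import re
from datetime import datetime, timedelta

# Assumed layout of the sample strings, e.g. "March 23, 2011 at 23:42:46".
FORMAT = "%B %d, %Y at %H:%M:%S"


def parse_hour24(text):
    """Parse a date string, rolling an hour of 24 over into the next day."""
    # Drop a trailing timezone abbreviation such as "PDT"; it is ignored here.
    stamp = re.sub(r"\s+[A-Z]{2,5}$", "", text.strip())
    # Rewrite hour 24 as 00 and count the rewrite (0 or 1).
    fixed, count = re.subn(r"\bat 24:", "at 00:", stamp)
    parsed = datetime.strptime(fixed, FORMAT)
    # Add one day only when the hour was rewritten.
    return parsed + timedelta(days=count)


print(parse_hour24("March 23, 2011 at 24:42:46  PDT"))
print(parse_hour24("March 23, 2011 at 23:42:46  PDT"))
```

The result is timezone-naive, like the output shown above. Note that `%B` depends on the active locale, so this sketch assumes English month names.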
Another idea: extract the date and the time into separate columns, convert the time with `to_timedelta`, and add the two together:

s1 = pd.to_datetime(s, errors='coerce')
m = s1.isna()

df2 = s[m].str.split(' at ', expand=True)
df2.columns = ['date','time']
df2['date'] = pd.to_datetime(df2['date'], errors='coerce')
df2['time'] = pd.to_timedelta(df2['time'].str.extract(r'(\d+:\d+:\d+)', expand=False))
df2['date1'] = df2['date'] + df2['time']
print (df2)
        date            time               date1
0 2011-03-23 1 days 00:42:46 2011-03-24 00:42:46

s = s1.fillna(df2['date1'])
print (s)
0   2011-03-24 00:42:46
1   2011-03-23 23:42:46
dtype: datetime64[ns]
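
The split-and-add idea also works on a single string without pandas. The following is a rough sketch under assumptions: the helper name `parse_date_plus_offset` is hypothetical, the date part is assumed to look like "March 23, 2011", and the timezone abbreviation after the time is ignored.

```python
from datetime import datetime, timedelta


def parse_date_plus_offset(text):
    """Parse the date part, then add the time part as an offset from midnight."""
    # "March 23, 2011 at 24:42:46  PDT" -> "March 23, 2011", "24:42:46  PDT"
    date_part, time_part = text.split(" at ", 1)
    day = datetime.strptime(date_part.strip(), "%B %d, %Y")
    # Keep only the clock value; whatever follows (e.g. "PDT") is ignored.
    clock = time_part.split()[0]
    hours, minutes, seconds = (int(part) for part in clock.split(":"))
    # timedelta accepts hours >= 24, so "24:42:46" lands on the next day.
    return day + timedelta(hours=hours, minutes=minutes, seconds=seconds)


print(parse_date_plus_offset("March 23, 2011 at 24:42:46  PDT"))
print(parse_date_plus_offset("March 23, 2011 at 23:42:46  PDT"))
```

Because the time is treated as a duration rather than a clock time, an hour of 24 needs no special case here.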

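What value do you expect from this? 0:42:46 on the next day?

@Paul Yes, I expect 00:42:46 on the next day. I don't know where these timestamps come from. It could be Apple Mail on the Mac, or it could be the mail service used to create the original email. Here is how they appear in the email text file:
Date: February 24, 2011 at 24:48:03 PST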