Python: adding a day to a date based on another column

Tags: python, pandas


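I have a problem: I am merging two datasets that define midnight differently. I therefore want to add one day to every midnight entry in one dataset so that both follow the same date convention.

I formatted my dates and times as follows: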

df['Date'] = pd.to_datetime(df['Date']).dt.strftime('%d/%m/%Y')
df['Hour'] = pd.to_datetime(df['Hour']).dt.strftime('%H:%M')
Then I tried to fix every 00:00:00 entry so that it falls on the following day:

df.loc['Hour' == '00:00:00', 'Date'] = pd.DatetimeIndex(df.Date) + timedelta(days=1)
However, I keep getting a KeyError:

raise KeyError("cannot use a single bool to index into setitem")
KeyError: 'cannot use a single bool to index into setitem'
Any help would be greatly appreciated.

Goal:

Input:

Output:

    Date        |  Hour
    ---------------------
    19/06/2016  |  23:30
    19/06/2016  |  23:45
    20/06/2016  |  00:00
    20/06/2016  |  00:15
    20/06/2016  |  00:30
You can use Series.mask to check for midnight and add one day:

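# Parse the dd/mm/YYYY strings explicitly so day and month are never swapped
f['Date'] = pd.to_datetime(f['Date'], format='%d/%m/%Y')
# True for rows whose time is midnight
m = f['Hour'] == '00:00'
# Add one day where m is True, then format back to strings
f['Date'] = f['Date'].mask(m, f['Date'] + pd.Timedelta(1, unit='d')).dt.strftime('%d/%m/%Y')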
Solution with loc:

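m = f['Hour'] == '00:00'
# Parse with an explicit format; f['Date'] holds dd/mm/YYYY strings
dates = pd.to_datetime(f['Date'], format='%d/%m/%Y')
# Only the rows selected by m are overwritten (the assignment aligns on the index)
f.loc[m, 'Date'] = (dates + pd.Timedelta(1, unit='d')).dt.strftime('%d/%m/%Y')
# Alternative: shift only the midnight rows
# f.loc[m, 'Date'] = (dates[m] + pd.Timedelta(1, unit='d')).dt.strftime('%d/%m/%Y')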
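For context, the KeyError in the question arises because 'Hour' == '00:00:00' compares two Python string literals and evaluates to the single value False, so df.loc receives one bool instead of a per-row mask. The Hour column was also formatted with %H:%M, so it would contain '00:00' rather than '00:00:00'. The following is a minimal sketch of the loc approach; the sample rows are made up for illustration and are not the asker's data.

```python
import pandas as pd

# Hypothetical sample rows in the question's dd/mm/YYYY and HH:MM string formats
df = pd.DataFrame({
    'Date': ['19/06/2016', '19/06/2016', '19/06/2016', '20/06/2016'],
    'Hour': ['23:30', '23:45', '00:00', '00:15'],
})

# Comparing two string literals yields a single bool, not a Series
print('Hour' == '00:00:00')

# Comparing the column yields one bool per row, which loc accepts as a mask
m = df['Hour'] == '00:00'

# Parse with an explicit format, shift only the midnight rows, format back
dates = pd.to_datetime(df['Date'], format='%d/%m/%Y')
df.loc[m, 'Date'] = (dates[m] + pd.Timedelta(days=1)).dt.strftime('%d/%m/%Y')

print(df)
```

Only the 00:00 row is moved to the next day; every other row keeps its original date string.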
A solution with np.where, placed inside your code:

import glob

import numpy as np
import pandas as pd

# path, a and f_2016 are defined elsewhere in the asker's script (not shown)
for fname in glob.glob(path):
    fname = fname.replace(r'\2016', '/2016')
    f = pd.read_csv(fname)
    # Treat 24:00:00 as midnight
    f = f.replace({'Hour': {'24:00:00': '00:00'}})
    f['Date'] = pd.to_datetime(f['Date']).dt.strftime('%d/%m/%Y')
    f['Hour'] = pd.to_datetime(f['Hour']).dt.strftime('%H:%M')

    # Move midnight rows to the following day; parse the dd/mm/YYYY
    # strings with an explicit format so day and month are not swapped
    m = f['Hour'] == '00:00'
    dates = (pd.to_datetime(f['Date'], format='%d/%m/%Y') + pd.Timedelta(1, unit='d')).dt.strftime('%d/%m/%Y')
    f['Date'] = np.where(m, dates, f['Date'])

    print(fname)
    if a == 0:
        f_2016['Date'] = f['Date']
        f_2016['Hour'] = f['Hour']
        a = 1
    f_2016 = pd.merge(f_2016, f, on=['Date', 'Hour'])
    print(f_2016.head(100))

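My advice is not to split the date and time unless you have to.

You can test whether a time is midnight by comparing the datetime column with its normalized version: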

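import pandas as pd

f = pd.DataFrame({'Date': ['2018/01/01 15:00', '2018/01/02 00:00']})

f['Date'] = pd.to_datetime(f['Date'])
# A timestamp is at midnight exactly when it equals its normalized (time-stripped) value
mask = f['Date'] == f['Date'].dt.normalize()
# Vectorized addition on the Date column only, instead of apply(pd.DateOffset(1))
f.loc[mask, 'Date'] = f.loc[mask, 'Date'] + pd.Timedelta(days=1)

#                  Date
# 0 2018-01-01 15:00:00
# 1 2018-01-03 00:00:00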
If you really must keep the time separate, you can adapt this solution:

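f = pd.DataFrame({'Date': ['2018/01/01', '2018/01/02'],
                  'Hour': ['15:00', '00:00']})

f['Date'] = pd.to_datetime(f['Date'])
# Recombine date and time; a row is at midnight when the result equals the bare date
mask = pd.to_datetime(f['Date'].astype(str)+' '+f['Hour']) == f['Date']

# Vectorized addition instead of apply(pd.DateOffset(1))
f.loc[mask, 'Date'] = f.loc[mask, 'Date'] + pd.Timedelta(days=1)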
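To illustrate the advice about not splitting date and time, here is a minimal sketch that builds one timestamp column from dd/mm/YYYY and HH:MM strings, shifts the midnight rows, and only then writes the two string columns back. The sample rows and the variable names ts and midnight are assumptions for illustration.

```python
import pandas as pd

# Hypothetical sample rows around a month boundary
f = pd.DataFrame({
    'Date': ['31/01/2016', '31/01/2016', '01/02/2016'],
    'Hour': ['23:45', '00:00', '00:15'],
})

# Combine both columns into one timestamp, parsed with an explicit format
ts = pd.to_datetime(f['Date'] + ' ' + f['Hour'], format='%d/%m/%Y %H:%M')

# A timestamp is at midnight when it equals its normalized value
midnight = ts == ts.dt.normalize()

# Keep non-midnight rows as they are, add one day to midnight rows
ts = ts.where(~midnight, ts + pd.Timedelta(days=1))

# Split back into strings only at the end, if the merge needs them
f['Date'] = ts.dt.strftime('%d/%m/%Y')
f['Hour'] = ts.dt.strftime('%H:%M')

print(f)
```

Working on real timestamps until the last step means the month rollover from 31 January to 1 February is handled by pandas rather than by string manipulation.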

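Can you add a data sample?

The mask solution throws an error saying it needs to be str, not Timedelta. The loc solution gives incorrect dates:
31/01/2016 23:15 31/01/2016 23:30 31/01/2016 23:45 02/01/2016 00:00 01/02/2016 00:15 01/02/2016 00:30

@MaskedMonkey - I changed the second solution. What is your pandas version?

The pandas version is 0.20.3.

@MaskedMonkey - I tested it in pandas 0.22.0 and it works fine.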
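The swapped day and month reported in the comments (02/01/2016 where 01/02/2016 was expected) is consistent with a dd/mm/YYYY string being re-parsed month-first. This is a guess, since the commenter's data is not shown. A short sketch of the ambiguity, using a made-up date string:

```python
import pandas as pd

# Intended as 1 February 2016, written day-first
s = '01/02/2016'

# Without a format, pandas reads an ambiguous string month-first
ambiguous = pd.to_datetime(s)

# With an explicit format the day and month are read as intended
explicit = pd.to_datetime(s, format='%d/%m/%Y')

print(ambiguous, explicit)
```

Passing format='%d/%m/%Y' (or dayfirst=True) whenever the strings were produced with strftime('%d/%m/%Y') avoids this kind of mismatch.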