Calculating the number of hours in a date range in a DataFrame with Python

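I want to calculate the total number of hours over a date range. Standard opening hours are 16 hours per day Monday through Friday, and 24 hours on Saturday and Sunday.

I have written code that works for two specific dates: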

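import datetime
from datetime import date

date1 = date(2017, 4, 13)
date2 = date(2017, 4, 17)

def daterange(d1, d2):
    # yield every date from d1 to d2, both endpoints included
    return (d1 + datetime.timedelta(days=i) for i in range((d2 - d1).days + 1))

total = 0
for n in daterange(date1, date2):
    if n.weekday() < 5:   # Monday to Friday
        total += 16
    else:                 # Saturday and Sunday
        total += 24
print(total)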
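The date columns in my DataFrame are of type datetime64[ns].

The error is: TypeError: cannot convert the series to <class 'int'>

Is there a way to compute this value for the time-series columns? It can go in a new column or just be in the result.

Thanks in advance.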

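You need to use the apply function for this. The error is just telling you that you are not calling the function correctly.

In this case, with axis=1 the apply method applies the function to each row of the DataFrame (row by row).

Change the DataFrame function call to:

df['new_column'] = df.apply(lambda x: daterange(x['Start'], x['End']), axis=1)

Let me know if you need further help.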
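Note that daterange from the question only yields the dates, so calling it once per row is not quite enough by itself: the hours still have to be summed. A minimal sketch, assuming the DataFrame has datetime64[ns] columns named Start and End (the names used in the outputs further down) and using a hypothetical helper called open_hours:

```python
import datetime

import pandas as pd


def daterange(d1, d2):
    # yield every day from d1 to d2, both endpoints included
    return (d1 + datetime.timedelta(days=i) for i in range((d2 - d1).days + 1))


def open_hours(row):
    # hypothetical helper: 16 hours on Monday-Friday, 24 on Saturday and Sunday
    return sum(16 if day.weekday() < 5 else 24
               for day in daterange(row["Start"], row["End"]))


# sample rows taken from the outputs shown later in this thread
df = pd.DataFrame({
    "Start": pd.to_datetime(["2017-02-03", "2017-02-05", "2017-02-06", "2017-02-10"]),
    "End": pd.to_datetime(["2017-03-15", "2017-03-16", "2017-03-17", "2017-03-18"]),
})

# axis=1 passes each row to open_hours as a Series
df["new_column"] = df.apply(open_hours, axis=1)
print(df)
```

Without axis=1, apply would pass whole columns to the function rather than rows, so the lookups by column label would fail.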

IIUC you can use the following simple mapping:

Sample series:

In [110]: s = pd.date_range('2017-01-01', periods=10).to_series()

In [111]: s
Out[111]:
2017-01-01   2017-01-01
2017-01-02   2017-01-02
2017-01-03   2017-01-03
2017-01-04   2017-01-04
2017-01-05   2017-01-05
2017-01-06   2017-01-06
2017-01-07   2017-01-07
2017-01-08   2017-01-08
2017-01-09   2017-01-09
2017-01-10   2017-01-10
Freq: D, dtype: datetime64[ns]
Mapping:

# DateLikeSeries.dt.weekday returns the day of the week with Monday=0, Sunday=6
In [94]: mapping = {i:16 if i<5 else 24 for i in range(7)}

In [95]: mapping
Out[95]: {0: 16, 1: 16, 2: 16, 3: 16, 4: 16, 5: 24, 6: 24}

In [112]: s.dt.weekday.map(mapping)
Out[112]:
2017-01-01    24
2017-01-02    16
2017-01-03    16
2017-01-04    16
2017-01-05    16
2017-01-06    16
2017-01-07    24
2017-01-08    24
2017-01-09    16
2017-01-10    16
Freq: D, dtype: int64


In [113]: s.dt.weekday.map(mapping).sum()
Out[113]: 184
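The same mapping can be checked against the two dates from the question. This is only a sketch of the session above rewritten as a script; the variable names days and total are illustrative:

```python
import pandas as pd

# weekday number (Monday=0 ... Sunday=6) -> open hours for that day
mapping = {i: 16 if i < 5 else 24 for i in range(7)}

# the two dates from the question; date_range includes both endpoints
days = pd.date_range("2017-04-13", "2017-04-17").to_series()

total = days.dt.weekday.map(mapping).sum()
print(total)  # 96: Thursday, Friday and Monday at 16 hours, Saturday and Sunday at 24
```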

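You can use apply with a custom function; the lambda and def f(x) versions are listed after the timings below.

Another solution, though I think it is more complicated: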

#reshape df
df1 = df.stack().reset_index()
df1.columns = ['i','c','date']
#groupby by index and resample to days, forward fill NaNs
#(parentheses allow the method chain to continue on the next line)
df1 = (df1.set_index('date').groupby('i').resample('D').ffill()
          .reset_index(level=0, drop=True).reset_index())
#get hours
df1['tot'] = np.where(df1['date'].dt.weekday < 5, 16, 24)
#sum by index
s = df1.groupby('i')['tot'].sum()
#join to original
df = df.join(s)
print (df.head(10))
       Start        End  tot
0 2017-02-03 2017-03-15  752
1 2017-02-05 2017-03-16  728
2 2017-02-06 2017-03-17  720
3 2017-02-10 2017-03-18  680
Timings:

df = pd.concat([df]*100).reset_index(drop=True) 
print (df)

def f(df):
    df1 = df.stack().reset_index()
    df1.columns = ['i','c','date']
    df1 = df1.set_index('date').groupby('i').resample('D').ffill().reset_index(level=0, drop=True).reset_index()
    df1['tot'] = np.where(df1['date'].dt.weekday < 5, 16, 24)
    s = df1.groupby('i')['tot'].sum()
    return df.join(s)

print (f(df))
mapping = {i:16 if i<5 else 24 for i in range(7)}

In [190]: %timeit (f(df))
1 loop, best of 3: 482 ms per loop

#MaxU solution
In [191]: %timeit df['oncall_hours'] =  df.apply(lambda x: pd.date_range(x['Start'], x['End']).to_series().dt.weekday.map(mapping).sum(), axis=1)
1 loop, best of 3: 531 ms per loop

In [192]: %timeit df['new'] = df.apply(lambda x : np.where(pd.date_range(x['Start'], x['End']).weekday < 5, 16, 24).sum(), axis=1)
10 loops, best of 3: 166 ms per loop
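Not taken from the answers above: a vectorized sketch that avoids building a date range for every row. It assumes Start and End are datetime64[ns] columns and relies on NumPy's np.busday_count, which counts Monday-to-Friday days in a half-open interval. It has not been timed against the solutions above.

```python
import numpy as np
import pandas as pd

# sample rows taken from the outputs shown in this thread
df = pd.DataFrame({
    "Start": pd.to_datetime(["2017-02-03", "2017-02-05", "2017-02-06", "2017-02-10"]),
    "End": pd.to_datetime(["2017-03-15", "2017-03-16", "2017-03-17", "2017-03-18"]),
})

# work at day resolution; add one day so the End date itself is counted
start = df["Start"].to_numpy().astype("datetime64[D]")
end = df["End"].to_numpy().astype("datetime64[D]") + np.timedelta64(1, "D")

n_days = (end - start).astype("int64")    # all days in [Start, End]
n_weekdays = np.busday_count(start, end)  # Monday-Friday days in [start, end)

# 16 hours per weekday, 24 hours per weekend day
df["tot"] = 16 * n_weekdays + 24 * (n_days - n_weekdays)
print(df)
```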

If my answer or another one was helpful, don't forget to accept it. Thanks.
df['new'] = df.apply(lambda x : np.where(pd.date_range(x['Start'], x['End']).weekday < 5, 16, 24).sum(), axis=1)
print (df)
       Start        End  new
0 2017-02-03 2017-03-15  752
1 2017-02-05 2017-03-16  728
2 2017-02-06 2017-03-17  720
3 2017-02-10 2017-03-18  680
def f(x):
    b = pd.date_range(x['Start'], x['End']).weekday
    return np.where(b < 5, 16, 24).sum()

df['new'] = df.apply(f, axis=1)
print (df)
       Start        End  new
0 2017-02-03 2017-03-15  752
1 2017-02-05 2017-03-16  728
2 2017-02-06 2017-03-17  720
3 2017-02-10 2017-03-18  680