How to count the days in a date range that meet certain conditions using pandas in Python

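I currently have a pandas DataFrame in which each row holds a date range, and I want to count the days within that range that meet certain conditions: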

Item | Date Start | Date End
----------------------
A    | 02/01/2019 | 03/02/2019
B    | 04/02/2019 | 08/02/2019
For example, the number of days that fall in January 2019, in the year 2019, or in Q1 2019:

Item | Date Start | Date End    | Days in Jan-2019 | Days in 2019 | Days in Q1 - 2019
------------------------------------------------------------------------------------
A    | 02/01/2019 | 03/02/2019  | 30               | 33           | 33
B    | 04/04/2019 | 08/04/2019  | 0                | 5            | 0
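Ideally I would not have to create a row for every date in the range to do this calculation, and could keep the row structure as it is, but I can't work out how to do this or what the most efficient approach would be.

Thanks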

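Here you need to create the date range for each start/end pair; if there are only a few possible conditions, use sum on each condition: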

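import pandas as pd

# Sample data from the question; the dates are day-first strings (dd/mm/yyyy).
df = pd.DataFrame({'Item': ['A', 'B'],
                   'Date Start': ['02/01/2019', '04/02/2019'],
                   'Date End': ['03/02/2019', '08/02/2019']})

df['Date Start'] = pd.to_datetime(df['Date Start'], dayfirst=True)
df['Date End'] = pd.to_datetime(df['Date End'], dayfirst=True)

# One DatetimeIndex per row, covering every day from start to end inclusive.
s = df.apply(lambda x: pd.date_range(x['Date Start'], x['Date End']), axis=1)
# Each condition yields a boolean array; its sum is the number of matching days.
df['Days in Jan-2019'] = s.apply(lambda x: ((x.year == 2019) & (x.month == 1)).sum())
df['Days in 2019'] = s.apply(lambda x: (x.year == 2019).sum())
df['Days in Q1 2019'] = s.apply(lambda x: ((x.year == 2019) & (x.quarter == 1)).sum())

print(df)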
  Item Date Start   Date End  Days in Jan-2019  Days in 2019  Days in Q1 2019
0    A 2019-01-02 2019-02-03                30            33               33
1    B 2019-02-04 2019-02-08                 0             5                5
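If the list of conditions grows, the same per-row counting could be driven from a mapping of column names to predicates. The following is only a sketch: the `conditions` dictionary and the `ranges` list are illustrative names that do not appear in the answer, and the sample frame is rebuilt from the question's data.

```python
import pandas as pd

# Sample data from the question, parsed as day-first dates.
df = pd.DataFrame({
    'Item': ['A', 'B'],
    'Date Start': pd.to_datetime(['02/01/2019', '04/02/2019'], dayfirst=True),
    'Date End': pd.to_datetime(['03/02/2019', '08/02/2019'], dayfirst=True),
})

# Hypothetical mapping: output column name -> predicate on a DatetimeIndex.
conditions = {
    'Days in Jan-2019': lambda d: (d.year == 2019) & (d.month == 1),
    'Days in 2019': lambda d: d.year == 2019,
    'Days in Q1 2019': lambda d: (d.year == 2019) & (d.quarter == 1),
}

# One inclusive daily range per row, built from the two date columns.
ranges = [pd.date_range(start, end)
          for start, end in zip(df['Date Start'], df['Date End'])]

# Apply every predicate to every range and count the matching days.
for name, predicate in conditions.items():
    df[name] = [int(predicate(r).sum()) for r in ranges]

print(df)
```

Adding a new count then only means adding one entry to the dictionary; the ranges are built once and reused for all conditions.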
Another idea is to flatten the date ranges with explode and then aggregate with sum over the index level:

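import pandas as pd

# df is the question's frame with day-first date strings in 'Date Start' and 'Date End'.
df['Date Start'] = pd.to_datetime(df['Date Start'], dayfirst=True)
df['Date End'] = pd.to_datetime(df['Date End'], dayfirst=True)

# Build one date range per row, then explode to one row per day (the original index repeats).
df['r'] = df.apply(lambda x: pd.date_range(x['Date Start'], x['Date End']), axis=1)
df1 = df.explode('r')
# explode can leave an object column, so convert it back before using .dt.
df1['r'] = pd.to_datetime(df1['r'])
df1['Days in Jan-2019'] = (df1['r'].dt.year == 2019) & (df1['r'].dt.month == 1)
df1['Days in 2019'] = df1['r'].dt.year == 2019
df1['Days in Q1 2019'] = (df1['r'].dt.year == 2019) & (df1['r'].dt.quarter == 1)

# Sum the boolean flags per original row; sum(level=0) was removed in pandas 2.0.
cols = ['Days in Jan-2019', 'Days in 2019', 'Days in Q1 2019']
df = df.drop('r', axis=1).join(df1[cols].groupby(level=0).sum())
print(df)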
  Item Date Start   Date End  Days in Jan-2019  Days in 2019  Days in Q1 2019
0    A 2019-01-02 2019-02-03                30            33               33
1    B 2019-02-04 2019-02-08                 0             5                5
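When every condition is a contiguous calendar window (a month, a quarter, a year), the counts can also be computed from the overlap of two intervals, without materializing each day. This is a minimal sketch rather than part of the answer above: `days_in_window` and the `windows` dictionary are hypothetical names, and it assumes both date columns hold midnight timestamps and that both ends of each range are inclusive.

```python
import pandas as pd

# Sample data from the question, parsed as day-first dates.
df = pd.DataFrame({
    'Item': ['A', 'B'],
    'Date Start': pd.to_datetime(['02/01/2019', '04/02/2019'], dayfirst=True),
    'Date End': pd.to_datetime(['03/02/2019', '08/02/2019'], dayfirst=True),
})

def days_in_window(start, end, window_start, window_end):
    """Hypothetical helper: days of [start, end] inside [window_start, window_end]."""
    ws, we = pd.Timestamp(window_start), pd.Timestamp(window_end)
    lo = start.where(start >= ws, ws)   # later of range start and window start
    hi = end.where(end <= we, we)       # earlier of range end and window end
    # +1 because both ends are inclusive; no overlap gives a negative value, clipped to 0.
    return ((hi - lo).dt.days + 1).clip(lower=0)

# Assumed windows matching the three columns requested in the question.
windows = {
    'Days in Jan-2019': ('2019-01-01', '2019-01-31'),
    'Days in 2019': ('2019-01-01', '2019-12-31'),
    'Days in Q1 2019': ('2019-01-01', '2019-03-31'),
}
for name, (ws, we) in windows.items():
    df[name] = days_in_window(df['Date Start'], df['Date End'], ws, we)

print(df)
```

Because this only does arithmetic on the two date columns, the frame keeps one row per item and the cost does not depend on how long each range is. Conditions that are not a single contiguous window (weekdays only, for example) would still need the expanded ranges.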
