Summing weekly totals of values in one DataFrame based on dates in another DataFrame in Python (tags: python, pandas, numpy, datetime, dataframe)


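I want to sum the values in one column of a DataFrame over specific date ranges defined by another DataFrame.

My first DataFrame, which holds the dates, looks like this: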

import numpy as np
import pandas as pd

start_date = ["2-22-16 00:00:00", "2-29-16 00:00:00", "3-7-16 00:00:00", "3-14-16 00:00:00", "3-21-16 00:00:00", "3-28-16 00:00:00", "4-4-16 00:00:00", "4-11-16 00:00:00", "4-18-16 00:00:00", "4-25-16 00:00:00", "5-2-16 00:00:00", "5-9-16 00:00:00", "5-16-16 00:00:00", "5-23-16 00:00:00", "5-30-16 00:00:00", "6-6-16 00:00:00", "6-13-16 00:00:00", "6-20-16 00:00:00", "6-27-16 00:00:00", "7-4-16 00:00:00", "7-11-16 00:00:00", "7-18-16 00:00:00", "7-25-16 00:00:00", "8-08-16 00:00:00", "8-22-16 00:00:00", "8-29-16 00:00:00", "9-5-16 00:00:00", "9-12-16 00:00:00", "9-19-16 00:00:00", "9-26-16 00:00:00", "10-3-16 00:00:00", "10-10-16 00:00:00", "10-17-16 00:00:00", "10-24-16 00:00:00", "10-31-16 00:00:00", "11-7-16 00:00:00", "11-14-16 00:00:00", "11-21-16 00:00:00", "1-23-17 00:00:00", "1-30-17 00:00:00", "2-06-17 00:00:00", "3-13-17 00:00:00", "3-27-17 00:00:00", "6-19-17 00:00:00", "6-26-17 00:00:00"]
start_date = [pd.to_datetime(d) for d in start_date]
end_date = pd.DatetimeIndex(start_date) + pd.DateOffset(7)
ndf = pd.DataFrame({'start': pd.to_datetime(start_date), 'end': end_date})
ndf.head()
What I want is the values from another DataFrame that fall within the weeks defined in ndf. My other DataFrame looks like this:

dates = ["4-17-16 04:00:00", "4-16-16 19:30:00", "4-16-16 19:00:00", "2-24-16 09:00:00", "3-21-16 02:00:00", "3-18-16 10:00:00", "3-24-16 05:00:00", "3-11-16 00:00:00"]
df = pd.DataFrame(
    {'timestamp': pd.to_datetime(dates),  # parse the strings so they can be compared with the datetimes in ndf
     'value': np.random.randint(1, 25, size=(8,))})
Now I want to create a new DataFrame that sums all of the value entries in df that fall between the dates in ndf. So I wrote this function:

def get_dates(x):
    # Select the df values between start and ending datetime. 
    n = df[(df['timestamp']>ndf['start'])&(df['timestamp']<ndf['end'])]
    # Return sum of values
    return n.values[0],n['value'].sum()
Use resample when you want to group data by evenly spaced time intervals:

df.set_index('timestamp').resample('W-MON', label='left').sum().reset_index()
Returns:

   timestamp  value
0 2016-02-22   22.0
1 2016-02-29    NaN
2 2016-03-07   13.0
3 2016-03-14   20.0
4 2016-03-21    9.0
5 2016-03-28    NaN
6 2016-04-04    NaN
7 2016-04-11   34.0
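
Because the question fills value with random integers, the totals above cannot be reproduced exactly. The following is a minimal sketch of the same resample idea with fixed, made-up values (1 through 8 are an assumption for illustration, not data from the question). It also passes closed='left' so that a timestamp at exactly Monday 00:00 counts toward the week it starts, and min_count=1 so that empty weeks show NaN, since recent pandas versions otherwise report 0 for an empty sum.

```python
import pandas as pd

# Timestamps from the question; the values are fixed stand-ins for the random ones.
dates = ["4-17-16 04:00:00", "4-16-16 19:30:00", "4-16-16 19:00:00",
         "2-24-16 09:00:00", "3-21-16 02:00:00", "3-18-16 10:00:00",
         "3-24-16 05:00:00", "3-11-16 00:00:00"]
df = pd.DataFrame({
    "timestamp": pd.to_datetime(dates, format="%m-%d-%y %H:%M:%S"),
    "value": [1, 2, 3, 4, 5, 6, 7, 8],  # assumed values, for illustration only
})

weekly = (
    df.set_index("timestamp")
      # Weekly bins that start on Monday, labelled by their start date.
      .resample("W-MON", label="left", closed="left")
      # min_count=1 leaves weeks without any rows as NaN instead of 0.
      .sum(min_count=1)
      .reset_index()
)
print(weekly)
```

With these stand-in values, the week starting 2016-04-11 collects the three April timestamps, and the weeks with no rows remain NaN.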

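Are you basically trying to group by 7-day/weekly intervals, or do you need to group by unequal date ranges (periods whose start and end dates are different lengths apart)?
@Jarad The grouping is always 7 days, but the weeks do not cover every week of 2016 and 2017. Note that the first range of dates runs from April 7, 2016 to November 21, 2016, then jumps to January 23, 2017 and runs to March 27, 2017, then jumps again to June 19, 2017 and runs to June 26, 2017. The interval is always 7 days, though.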
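
Since the weeks in ndf are always 7 days long but not contiguous, one possible way to sum only within the listed weeks is to treat each row of ndf as a half-open interval and look up which interval each timestamp falls into. The sketch below is an assumption-laden illustration rather than the accepted method: it uses a small hypothetical subset of the week starts, fixed made-up values, and a new column named total that does not appear in the question.

```python
import pandas as pd

# Hypothetical subset of the week starts, including a gap before 2017.
start = pd.to_datetime(["2016-02-22", "2016-03-14", "2016-03-21",
                        "2016-04-11", "2017-01-23"])
ndf = pd.DataFrame({"start": start, "end": start + pd.Timedelta(days=7)})

# Made-up values; 2016-03-11 falls in a week that ndf does not list.
df = pd.DataFrame({
    "timestamp": pd.to_datetime(["2016-04-17 04:00", "2016-04-16 19:30",
                                 "2016-02-24 09:00", "2016-03-21 02:00",
                                 "2016-03-11 00:00"]),
    "value": [1, 2, 4, 5, 8],
})

# One interval [start, end) per row of ndf; the intervals must not overlap.
weeks = pd.IntervalIndex.from_arrays(ndf["start"], ndf["end"], closed="left")

# Position of the matching week for each timestamp, or -1 if none matches.
pos = weeks.get_indexer(df["timestamp"])

# Sum the values per matched week, then align the sums with the rows of ndf.
in_week = pos >= 0
totals = df.loc[in_week, "value"].groupby(pos[in_week]).sum()
ndf["total"] = totals.reindex(range(len(ndf)), fill_value=0).to_numpy()
print(ndf)
```

Rows of df that fall outside every listed week are dropped, and weeks without any matching rows get a total of 0. If the intervals could overlap, get_indexer would not apply and a different lookup would be needed.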