Python: adding and subtracting values based on non-annual date conditions

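I have two dataframes, both containing dates. The first dataframe repeats the dates for every type and every state, because it is a cumulative-sum frame. It looks like this: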

Date          State     Type      Value
2010-01-01    AK        NUC       10
2010-02-01    AK        NUC       10
2010-03-01    AK        NUC       10
.
.
2010-01-01    CO        NUC       2
2010-02-01    CO        NUC       2
.
.
2010-01-01    AK        WND       20
2010-02-01    AK        WND       21
.
.
2018-08-01   .......
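
What I need to do is take the second dataframe and, for each "Type" and "State", add its value starting at the "Operating Date" and subtract it again at the "Retirement Date", all relative to the "Date" column of the original frame. The second dataframe looks like this: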

Operating Date   Retirement Date   Type    State       Value
2010-02-01       2010-04-01        NUC     AK          1
2011-02-01       2014-02-01        NUC     AK          2
2011-03-01       2016-03-01        NUC     AK          10
.
.

.
2018-08-01   .......

For example, for AK the output would apply the addition and subtraction like this:

if AK(Date) == AK(Operating Date):
      AK(Value, Date) = AK(Value, Date) + AK(Value, Operating Date)

elif AK(Date) == AK(Retirement Date):
      AK(Value, Date) = AK(Value, Date) - AK(Value, Retirement Date)
else:
      continue

The actual output dataframe (for AK 'NUC' only) would be:

Date          State     Type      Value
2010-01-01    AK        NUC       10
2010-02-01    AK        NUC       11
2010-03-01    AK        NUC       11
2010-04-01    AK        NUC       10
.
.
2011-01-01    AK        NUC       10
2011-02-01    AK        NUC       12
2011-03-01    AK        NUC       22
2011-04-01    AK        NUC       22
.
.
2016-01-01    AK        NUC       22
2016-02-01    AK        NUC       22
2016-03-01    AK        NUC       12
2016-04-01    AK        NUC       12
.
.

How can I perform this kind of operation?

The main dataframe used in the code below:

df

Date        State   Type    Value
2010-01-01  AK      NUC     10
2010-02-01  AK      NUC     10
2010-03-01  AK      NUC     10
2010-01-01  CO      NUC     2
2010-02-01  CO      NUC     2
2010-01-01  AK      WND     20
2010-02-01  AK      WND     21

The changes to apply to the main frame (note that I replaced the spaces in the column names with _):

delta

Operating_Date  Retirement_Date Type    State   Value
2010-02-01      2010-04-01      NUC     AK      1
2011-02-01      2014-02-01      NUC     AK      2
2011-03-01      2016-03-01      NUC     AK      10
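
To follow along, the two sample frames above can be built as shown here. This is a minimal sketch: parsing the dates with `pd.to_datetime` is an assumption (the original does not show how the frames were loaded), but real datetimes in both frames keep the later merge and sort well behaved.

```python
import pandas as pd

# Main frame: cumulative value per Date/State/Type (sample rows from above)
df = pd.DataFrame({
    "Date": pd.to_datetime([
        "2010-01-01", "2010-02-01", "2010-03-01",
        "2010-01-01", "2010-02-01",
        "2010-01-01", "2010-02-01",
    ]),
    "State": ["AK", "AK", "AK", "CO", "CO", "AK", "AK"],
    "Type": ["NUC", "NUC", "NUC", "NUC", "NUC", "WND", "WND"],
    "Value": [10, 10, 10, 2, 2, 20, 21],
})

# Changes: each row adds Value at Operating_Date and removes it at Retirement_Date
delta = pd.DataFrame({
    "Operating_Date": pd.to_datetime(["2010-02-01", "2011-02-01", "2011-03-01"]),
    "Retirement_Date": pd.to_datetime(["2010-04-01", "2014-02-01", "2016-03-01"]),
    "Type": ["NUC", "NUC", "NUC"],
    "State": ["AK", "AK", "AK"],
    "Value": [1, 2, 10],
})
```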
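
The plan of attack is to work with a single date column. To get there, we combine the retirement date and the operating date into one column: rows that use the retirement date get a negative value, and rows that use the operating date keep their positive value.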

import pandas as pd

# First make a copy of delta. These rows are the cancellations: they use
# Retirement_Date as the date and carry the value as a negative number.
cx = delta.copy()
cx['Date'] = cx['Retirement_Date']
cx.drop(['Operating_Date', 'Retirement_Date'], axis=1, inplace=True)
cx['Value'] *= -1

# In the original delta, use the operating date as the date value
delta['Date'] = delta['Operating_Date']
delta.drop(['Operating_Date', 'Retirement_Date'], axis=1, inplace=True)

# Append the cancellations to the main delta frame (DataFrame.append was
# removed in pandas 2.0, so use pd.concat) and rename the Value column to Delta
delta = pd.concat([delta, cx])
delta.rename(columns={'Value': 'Delta'}, inplace=True)

We now have a dataframe with a single date column that holds every positive and negative change to track for each date:

delta

Type    State   Delta   Date
NUC     AK      1       2010-02-01
NUC     AK      2       2011-02-01
NUC     AK      10      2011-03-01
NUC     AK      -1      2010-04-01
NUC     AK      -2      2014-02-01
NUC     AK      -10     2016-03-01
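
As an aside, the same long-format table can be built in one pass with `DataFrame.melt`, without modifying the original frame. This is an alternative sketch, not the approach used in this answer; the names `raw`, `delta_long`, and `Event` are made up for illustration.

```python
import pandas as pd

# Same sample changes as above, kept under a separate (hypothetical) name
raw = pd.DataFrame({
    "Operating_Date": pd.to_datetime(["2010-02-01", "2011-02-01", "2011-03-01"]),
    "Retirement_Date": pd.to_datetime(["2010-04-01", "2014-02-01", "2016-03-01"]),
    "Type": ["NUC", "NUC", "NUC"],
    "State": ["AK", "AK", "AK"],
    "Value": [1, 2, 10],
})

# Stack the two date columns into one; "Event" records which column each row came from
delta_long = raw.melt(
    id_vars=["Type", "State", "Value"],
    value_vars=["Operating_Date", "Retirement_Date"],
    var_name="Event",
    value_name="Date",
)

# Operating dates add the value, retirement dates subtract it
sign = delta_long["Event"].map({"Operating_Date": 1, "Retirement_Date": -1})
delta_long["Delta"] = delta_long["Value"] * sign
delta_long = delta_long[["Type", "State", "Delta", "Date"]]
```

The result has the same columns and rows as the `delta` table shown above.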

All that is left is to add the cumulative sum of those changes to the main dataframe:

# Start by merging the frames. The shared columns (Date, State, Type) are the
# merge keys by default, so we only need to ask for an outer join.
df = df.merge(delta, how='outer')

# Dates that appear only in delta have no Value yet, so carry the last known
# value forward. Sort by date first so the forward fill and the cumulative
# sum run in chronological order within each group.
df.sort_values(['Type', 'State', 'Date'], inplace=True)
df['Value'] = df.groupby(['State', 'Type'])['Value'].ffill()

# Dates with no change get a delta of zero
df['Delta'] = df['Delta'].fillna(0)

# Add the cumulative sum of the deltas to the Value column
df['Value'] += df.groupby(['State', 'Type'])['Delta'].cumsum().astype(int)

# Finally, drop the helper Delta column and we're done
del df['Delta']

The final dataframe, which is hopefully what you are after:

df

Date        State   Type    Value
2010-01-01  AK      NUC     10
2010-02-01  AK      NUC     11
2010-03-01  AK      NUC     11
2010-04-01  AK      NUC     10
2011-02-01  AK      NUC     12
2011-03-01  AK      NUC     22
2014-02-01  AK      NUC     20
2016-03-01  AK      NUC     10
2010-01-01  CO      NUC     2
2010-02-01  CO      NUC     2
2010-01-01  AK      WND     20
2010-02-01  AK      WND     21

Thank you for such a detailed reply. It not only solves the problem, it is also thoroughly commented. A++ :)