Python: How to merge two dataframes on a timeseries

What I'm looking to create is a dataframe which looks like:

    amount  months      category
0   6460    2018-01-31  budgeted
1   7905    2018-01-31  actual
2   11509   2018-02-28  budgeted
3   21502   2018-02-28  actual
...
...
The sample code I have, and the basic data I'm working with, is:

import pandas as pd
import string
import altair as alt

from random import randint

# 
# This is the general form of my 'real' dataframe. It is not subject to change.
#
months                  = [ 'Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec' ]
monthyAmounts           = [ "actual", "budgeted", "difference" ]

summary = []

summary.append( [ randint( -1000, 15000 ) for x in range( 0, len( months ) * len( monthyAmounts ) ) ] )
summary.append( [ randint( -1000, 15000 ) for x in range( 0, len( months ) * len( monthyAmounts ) ) ]  )
summary.append( [ randint( -1000, 15000 ) for x in range( 0, len( months ) * len( monthyAmounts ) ) ]  )

index   = pd.Index( [ 'Income', 'Expenses', 'Difference' ], name = 'type' )
columns = pd.MultiIndex.from_product( [months, monthyAmounts], names=['month', 'category'] )

summaryDF = pd.DataFrame( summary, index = index, columns = columns )

#
# From this point, I am trying to transform the summaryDF into something 
# I can use in a different context...
#

budgetMonths = pd.date_range( "January, 2018", periods = 12, freq = 'BM' )

idx = pd.IndexSlice
budgeted = summaryDF.loc[ 'Difference', idx[:, 'budgeted' ] ].cumsum()
actual   = summaryDF.loc[ 'Difference', idx[:, 'actual' ] ].cumsum()

budgeted.index = budgetMonths
actual.index = budgetMonths

budgetedDF = pd.DataFrame( { 'amount': budgeted, 'months': budgetMonths, 'category': 'budgeted' })
actualDF   = pd.DataFrame( { 'amount': actual, 'months': budgetMonths, 'category': 'actual' })

print( budgetedDF )
print( actualDF )

df3 = pd.merge( budgetedDF, actualDF, on = 'months' )
df3 
df3 looks like:

    amount_x    months      category_x  amount_y    category_y
0   6460        2018-01-31  budgeted    7905        actual
1   11509       2018-02-28  budgeted    21502       actual
...
...
I think I'm close to getting what I want... I just need the final merge step.

Use pd.concat to "merge" these dataframes:

df3 = (pd.concat([budgetedDF, actualDF])
         .sort_index()
         .reset_index(drop=True)
)
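As a minimal sketch of what this chain produces, here it is applied to two small hand-made frames. The amounts are copied from the desired output in the question; the way the frames are built, and the `kind="mergesort"` argument (added so that rows sharing a date keep their concat order), are assumptions for illustration and not part of the original answer.

```python
import pandas as pd

# Two tiny stand-ins for budgetedDF and actualDF (illustrative data only).
months = pd.to_datetime(["2018-01-31", "2018-02-28"])

budgetedDF = pd.DataFrame(
    {"amount": [6460, 11509], "months": months, "category": "budgeted"},
    index=months,
)
actualDF = pd.DataFrame(
    {"amount": [7905, 21502], "months": months, "category": "actual"},
    index=months,
)

# Stack the frames vertically, then interleave them by their date index.
# mergesort is stable, so "budgeted" stays ahead of "actual" for each date.
df3 = (
    pd.concat([budgetedDF, actualDF])
    .sort_index(kind="mergesort")
    .reset_index(drop=True)
)

print(df3)
```

The result has one row per (month, category) pair, which is the long layout asked for at the top of the question.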

However, you might prefer this representation:

df3 = (pd.concat([budgetedDF, actualDF])
         .drop(columns='months')
         .set_index('category', append=True)
         .unstack()
)

df3
           amount         
category   actual budgeted
2018-01-31   3612     2183
2018-02-28   3357     8902
2018-03-30   2828     9956
2018-04-30   2990    14475
2018-05-31   4446    25385
2018-06-29  19119    29119
2018-07-31  27296    40869
2018-08-31  38443    43400
2018-09-28  47978    52686
2018-10-31  49612    63384
2018-11-30  49272    74107
2018-12-31  56907    83653
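If the wide layout is useful for inspection but a long layout is needed later, the two can be converted back and forth. The sketch below is only an illustration on hand-made data (amounts taken from the question's example); the variable names `wide` and `long_df` are hypothetical.

```python
import pandas as pd

# Illustrative stand-ins for budgetedDF and actualDF.
months = pd.to_datetime(["2018-01-31", "2018-02-28"])
budgetedDF = pd.DataFrame(
    {"amount": [6460, 11509], "months": months, "category": "budgeted"},
    index=months,
)
actualDF = pd.DataFrame(
    {"amount": [7905, 21502], "months": months, "category": "actual"},
    index=months,
)

# Wide form: one row per date, one column per category.
wide = (
    pd.concat([budgetedDF, actualDF])
    .drop(columns="months")
    .set_index("category", append=True)
    .unstack()
)

# Back to long form: one row per (date, category) pair.
long_df = (
    wide["amount"]
    .rename_axis("months")
    .reset_index()
    .melt(id_vars="months", var_name="category", value_name="amount")
)

print(wide)
print(long_df)
```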

It sounds like what you want is

pd.concat([budgetedDF, actualDF]).sort_values('months').reset_index(drop=True)
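Sorting on 'months' alone leaves the order of the two rows within each month up to the sort algorithm. If the order within a month matters, one option is to sort on both columns with an explicit category order. This is a sketch on illustrative data; the ordered categorical is an assumption added here, not part of the answer above.

```python
import pandas as pd

# Illustrative stand-ins for budgetedDF and actualDF.
months = pd.to_datetime(["2018-01-31", "2018-02-28"])
budgetedDF = pd.DataFrame(
    {"amount": [6460, 11509], "months": months, "category": "budgeted"}
)
actualDF = pd.DataFrame(
    {"amount": [7905, 21502], "months": months, "category": "actual"}
)

long_df = pd.concat([budgetedDF, actualDF], ignore_index=True)

# Declare the desired within-month order: budgeted first, then actual.
long_df["category"] = pd.Categorical(
    long_df["category"], categories=["budgeted", "actual"], ordered=True
)

# Sort by month, breaking ties with the category order declared above.
long_df = long_df.sort_values(["months", "category"]).reset_index(drop=True)

print(long_df)
```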

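How does this merge on common month values?

@coldspeed: When I read the desired behavior at the top of the post, they want to keep separate values for each value of months.

No, they merge on months at the bottom of their script via pd.merge(budgetedDF, actualDF, on='months').

After that merge, the result has one row combining both dataframes. That is exactly the behavior that, as I read it, they are trying to avoid.

No. They want to merge on common months and then have the merged results exist as separate rows. They even use an explicit indicator column.

The concat line after "you might prefer this representation" is not the representation I ultimately need, as I noted at the top of the post. The reason is that I need a representation that is friendly to the way altair wants to generate graphs from a dataframe.

@Eric Sure, that is your choice. I just wanted you to know these are your options. I don't know altair, so I couldn't say.

Thanks, much appreciated. I'm still not sure how your answer differs from the other answer.

@Eric They are effectively the same, but I cover more bases in my answer than fuglede does in theirs.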
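The difference the comments argue about can be seen by comparing the shapes of the two results. The following is a small sketch on illustrative data (amounts from the question's example), not the asker's real frames.

```python
import pandas as pd

# Illustrative stand-ins for budgetedDF and actualDF.
months = pd.to_datetime(["2018-01-31", "2018-02-28"])
budgetedDF = pd.DataFrame(
    {"amount": [6460, 11509], "months": months, "category": "budgeted"}
)
actualDF = pd.DataFrame(
    {"amount": [7905, 21502], "months": months, "category": "actual"}
)

# merge joins side by side: one row per month, with _x/_y suffixes
# on the columns that exist in both frames.
merged = pd.merge(budgetedDF, actualDF, on="months")

# concat stacks vertically: one row per (month, category) pair.
stacked = pd.concat([budgetedDF, actualDF], ignore_index=True)

print(merged.shape, list(merged.columns))
print(stacked.shape, list(stacked.columns))
```

The merged frame is the wide layout shown for df3 in the question, while the stacked frame keeps a single amount column plus a category column.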