Python pandas: merging DataFrames on a datetime index


I have the following two DataFrames. I set the date as a DatetimeIndex with
df.set_index(pd.to_datetime(df['date']), inplace=True)
and want to merge or join them on the date:

df.head(5)
        catcode_amt type    feccandid_amt   amount
date                
1915-12-31  A5000   24K     H6TX08100   1000
1916-12-31  T6100   24K     H8CA52052   500
1954-12-31  H3100   24K     S8AK00090   1000
1985-12-31  J7120   24E     H8OH18088   36
1997-12-31  z9600   24K     S6ND00058   2000


d.head(5)
         catcode_disp disposition   feccandid_disp  bills
date                
2007-12-31  A0000   support     S4HI00011               1
2007-12-31  A1000   oppose      S4IA00020', 'P20000741  1
2007-12-31  A1000   support     S8MT00010               1
2007-12-31  A1500   support     S6WI00061               2
2007-12-31  A1600   support     S4IA00020', 'P20000741  3
I tried the following two approaches, but both raise a MemoryError:

df.join(d, how='right')
I used the code below on the DataFrames without the date set as the index:

merge=pd.merge(df,d, how='inner', on='date')

It looks like the dates are the index, in which case you want to merge on the index rather than on a column. If you have two DataFrames, df_1 and df_2:

df_1.merge(df_2,left_index=True,right_index=True,how='inner')
If you need to merge on the index with the merge function, add the parameters left_index=True and right_index=True.
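The index merge described above can be tried on a small scale. The following is a minimal sketch, assuming two tiny hypothetical frames (the values are made up for illustration and only loosely follow the question's columns); it also shows that DataFrame.join, which aligns on the index by default, gives the same result here.

```python
import pandas as pd

# Hypothetical miniature versions of the two frames from the question.
df_1 = pd.DataFrame(
    {"amount": [1000, 500, 2000]},
    index=pd.to_datetime(["1915-12-31", "1916-12-31", "1997-12-31"]),
)
df_2 = pd.DataFrame(
    {"bills": [1, 2]},
    index=pd.to_datetime(["1997-12-31", "2007-12-31"]),
)
df_1.index.name = "date"
df_2.index.name = "date"

# Merge on the index of both frames; only dates present in both survive.
merged = df_1.merge(df_2, left_index=True, right_index=True, how="inner")

# DataFrame.join aligns on the index by default, so this is equivalent here.
joined = df_1.join(df_2, how="inner")

print(merged)
```

Only 1997-12-31 occurs in both indexes, so the inner merge keeps a single row.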

Sample (the first index value in d was changed so that there is a match):

Or you can use concat:

EDIT: the comment is right:

I added duplicates to the DataFrame df (the last 2 values in the index):


I ran into a similar problem. You most likely have a lot of NaT values. I removed all of my NaT values, then ran the join, and it worked.

# Keep only the rows whose date is not missing (NaT), then index by date.
df = df[df['date'].notnull()].set_index('date')
d = d[d['date'].notnull()].set_index('date')
df.join(d, how='right')
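To check whether missing dates are a factor before joining, the NaT values can be counted first. This is a minimal sketch with hypothetical data; dropna(subset=...) is used as an equivalent way to filter the rows. The pandas merge documentation notes that null keys present on both sides are matched against each other, so a large number of NaT rows may multiply like any other repeated key.

```python
import pandas as pd

# Hypothetical frames with a 'date' column that contains missing values (NaT).
df = pd.DataFrame(
    {"date": pd.to_datetime(["1997-12-31", None, None]), "amount": [2000, 36, 500]}
)
d = pd.DataFrame(
    {"date": pd.to_datetime(["1997-12-31", None]), "bills": [1, 2]}
)

# Count missing dates on each side before joining.
print(df["date"].isna().sum(), d["date"].isna().sum())

# Drop rows whose date is missing, then use the date as the index.
df_clean = df.dropna(subset=["date"]).set_index("date")
d_clean = d.dropna(subset=["date"]).set_index("date")

joined = df_clean.join(d_clean, how="right")
print(joined)
```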

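@jezrael: I just tried the code you recommended, and I still get a MemoryError. Do you have any other ideas?
How much memory do you have? And what are the shapes of the DataFrames: print(df.shape) and print(d.shape)?
My df.shape is (389194, 4) and my d.shape is (2910, 4).
Hmm, maybe the concat function will help; see the edit to my answer.
Thanks for the suggestion. I just tried it, but I still get a MemoryError. Do you have any other ideas?
Please try it with two DataFrames that are a small part of your data, e.g. the last 100 rows of each.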
print(df)
           catcode_amt type feccandid_amt  amount
date                                             
1915-12-31       A5000  24K     H6TX08100    1000
1916-12-31       T6100  24K     H8CA52052     500
1954-12-31       H3100  24K     S8AK00090    1000
1985-12-31       J7120  24E     H8OH18088      36
1997-12-31       z9600  24K     S6ND00058    2000

print(d)
           catcode_disp disposition            feccandid_disp  bills
date                                                                
1997-12-31        A0000     support                 S4HI00011    1.0
2007-12-31        A1000      oppose  S4IA00020', 'P20000741 1    NaN
2007-12-31        A1000     support                 S8MT00010    1.0
2007-12-31        A1500     support                 S6WI00061    2.0
2007-12-31        A1600     support  S4IA00020', 'P20000741 3    NaN

merge = pd.merge(df, d, how='inner', left_index=True, right_index=True)
print(merge)
           catcode_amt type feccandid_amt  amount catcode_disp disposition  \
date                                                                         
1997-12-31       z9600  24K     S6ND00058    2000        A0000     support   

           feccandid_disp  bills  
date                              
1997-12-31      S4HI00011    1.0  
print(pd.concat([df, d], join='inner', axis=1))
           catcode_amt type feccandid_amt  amount catcode_disp disposition  \
date                                                                         
1997-12-31       z9600  24K     S6ND00058    2000        A0000     support   

           feccandid_disp  bills  
date                              
1997-12-31      S4HI00011    1.0  
print(df)
           catcode_amt type feccandid_amt  amount
date                                             
1915-12-31       A5000  24K     H6TX08100    1000
1916-12-31       T6100  24K     H8CA52052     500
1954-12-31       H3100  24K     S8AK00090    1000
2007-12-31       J7120  24E     H8OH18088      36
2007-12-31       z9600  24K     S6ND00058    2000

print(d)
           catcode_disp disposition            feccandid_disp  bills
date                                                                
1997-12-31        A0000     support                 S4HI00011    1.0
2007-12-31        A1000      oppose  S4IA00020', 'P20000741 1    NaN
2007-12-31        A1000     support                 S8MT00010    1.0
2007-12-31        A1500     support                 S6WI00061    2.0
2007-12-31        A1600     support  S4IA00020', 'P20000741 3    NaN

merge = pd.merge(df, d, how='inner', left_index=True, right_index=True)
print(merge)
           catcode_amt type feccandid_amt  amount catcode_disp disposition  \
date                                                                         
2007-12-31       J7120  24E     H8OH18088      36        A1000      oppose   
2007-12-31       J7120  24E     H8OH18088      36        A1000     support   
2007-12-31       J7120  24E     H8OH18088      36        A1500     support   
2007-12-31       J7120  24E     H8OH18088      36        A1600     support   
2007-12-31       z9600  24K     S6ND00058    2000        A1000      oppose   
2007-12-31       z9600  24K     S6ND00058    2000        A1000     support   
2007-12-31       z9600  24K     S6ND00058    2000        A1500     support   
2007-12-31       z9600  24K     S6ND00058    2000        A1600     support   

                      feccandid_disp  bills  
date                                         
2007-12-31  S4IA00020', 'P20000741 1    NaN  
2007-12-31                 S8MT00010    1.0  
2007-12-31                 S6WI00061    2.0  
2007-12-31  S4IA00020', 'P20000741 3    NaN  
2007-12-31  S4IA00020', 'P20000741 1    NaN  
2007-12-31                 S8MT00010    1.0  
2007-12-31                 S6WI00061    2.0  
2007-12-31  S4IA00020', 'P20000741 3    NaN  
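The output above shows 2 rows in df times 4 rows in d giving 8 merged rows for the one shared date. The size of an inner index merge can be estimated before running it. This is a rough sketch with hypothetical frames that mimic the duplicated dates in the sample; the variable names are illustrative.

```python
import pandas as pd

# Hypothetical frames that mimic the duplicated dates in the sample above.
df = pd.DataFrame(
    {"amount": [1000, 36, 2000]},
    index=pd.to_datetime(["1954-12-31", "2007-12-31", "2007-12-31"]),
)
d = pd.DataFrame(
    {"bills": [1.0, 1.0, 2.0, 3.0, 1.0]},
    index=pd.to_datetime(
        ["1997-12-31", "2007-12-31", "2007-12-31", "2007-12-31", "2007-12-31"]
    ),
)

# Count how often each date occurs on each side.
left_counts = df.index.value_counts()
right_counts = d.index.value_counts()

# For an inner join, each shared date yields left_count * right_count rows.
# Series multiplication aligns on the date; dates missing from one side
# become NaN and are dropped.
expected_rows = int((left_counts * right_counts).dropna().sum())

merged = df.merge(d, left_index=True, right_index=True, how="inner")
print(expected_rows, len(merged))
```

If the real frames, with shapes (389194, 4) and (2910, 4), share only a handful of distinct dates, this product can become very large, which would be consistent with the MemoryError reported in the question.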