Python pandas: merging two DataFrames of different sizes on one column
My first DataFrame (df1) looks like this:
pvalue trend time
0 0.000065 0.000076 2019-03-18 04:00:04
1 0.000087 0.000098 2019-03-18 04:00:06
2 0.000000 0.000000 2019-03-18 04:00:22
3 0.000000 0.000087 2019-03-18 04:02:29
4 0.000000 0.000000 2019-03-18 04:03:04
5 0.000000 0.000023 2019-03-18 04:03:05
6 0.000000 0.000000 2019-03-18 04:03:18
7 0.000000 0.000067 2019-03-18 04:18:55
8 0.000000 0.000000 2019-03-18 04:18:56
9 0.000000 0.000000 2019-03-18 04:20:41
My second DataFrame (df2) looks like this:
time price
0 2019-03-18 04:00:00 0.00190633
1 2019-03-18 04:00:01 0.00190633
2 2019-03-18 04:00:02 0.00190633
3 2019-03-18 04:00:03 0.00190633
4 2019-03-18 04:00:04 0.00190633
5 2019-03-18 04:00:05 0.00190633
6 2019-03-18 04:00:06 0.00190800
7 2019-03-18 04:00:07 0.00190800
8 2019-03-18 04:00:08 0.00190800
9 2019-03-18 04:00:09 0.00190800
df2['time'] advances by one second per row. In df1, however, there are gaps of several seconds between consecutive df1['time'] values. What I want is:
time price pvalue trend
0 2019-03-18 04:00:00 0.00190633 0.000000 0.000000
1 2019-03-18 04:00:01 0.00190633 0.000000 0.000000
2 2019-03-18 04:00:02 0.00190633 0.000000 0.000000
3 2019-03-18 04:00:03 0.00190633 0.000000 0.000000
4 2019-03-18 04:00:04 0.00190633 0.000065 0.000076
5 2019-03-18 04:00:05 0.00190633 0.000000 0.000000
6 2019-03-18 04:00:06 0.00190800 0.000087 0.000098
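So basically every second should be present, and wherever df1 has pvalue and trend data for that second, those values should go into the new DataFrame. Here is what I tried:
df_all = df_pvalue_trade.merge(df_check, on='time', left_index=True)
But I only get the rows of df1, not one row per second as in my example... Any ideas? Thanks
The result of my test with the code above is: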
pvalue trend time mkt_result price
6 0.000000 0.000000 2019-03-18 04:00:06 reject Ha := upward OR downward trend 0.00190800
21 0.000000 0.000000 2019-03-18 04:00:21 reject Ha := upward OR downward trend 0.00190800
22 0.000000 0.000000 2019-03-18 04:00:22 reject Ha := upward OR downward trend 0.00190800
149 0.000000 0.000000 2019-03-18 04:02:29 reject Ha := upward OR downward trend 0.00190594
184 0.000000 0.000000 2019-03-18 04:03:04 reject Ha := upward OR downward trend 0.00190594
185 0.000000 0.000000 2019-03-18 04:03:05 reject Ha := upward OR downward trend 0.00190594
198 0.000000 0.000000 2019-03-18 04:03:18 reject Ha := upward OR downward trend 0.00190594
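This is not what I want.
Use merge with a left join and fillna.
Also, if needed, fill the NaNs only in the columns of df1.columns that are not in df2.columns: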
d = dict.fromkeys(df1.columns.difference(df2.columns), 0)
print (d)
{'pvalue': 0, 'trend': 0}
df = pd.merge(df2, df1, on='time', how='left').fillna(d)
print (df)
time price pvalue trend
0 2019-03-18 04:00:00 0.001906 0.000000 0.000000
1 2019-03-18 04:00:01 0.001906 0.000000 0.000000
2 2019-03-18 04:00:02 0.001906 0.000000 0.000000
3 2019-03-18 04:00:03 0.001906 0.000000 0.000000
4 2019-03-18 04:00:04 0.001906 0.000065 0.000076
5 2019-03-18 04:00:05 0.001906 0.000000 0.000000
6 2019-03-18 04:00:06 0.001908 0.000087 0.000098
7 2019-03-18 04:00:07 0.001908 0.000000 0.000000
8 2019-03-18 04:00:08 0.001908 0.000000 0.000000
9 2019-03-18 04:00:09 0.001908 0.000000 0.000000
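For anyone who wants to run the answer end to end, here is a minimal self-contained sketch. The two frames are cut-down, hand-typed stand-ins for the question's df1 and df2 (only the first two df1 rows and the first ten seconds of df2), and the name fill_values is just illustrative. It assumes both time columns are already datetimes of the same dtype.

```python
import pandas as pd

# Hypothetical miniature versions of df1 and df2 from the question.
df1 = pd.DataFrame({
    "pvalue": [0.000065, 0.000087],
    "trend": [0.000076, 0.000098],
    "time": pd.to_datetime(["2019-03-18 04:00:04", "2019-03-18 04:00:06"]),
})
df2 = pd.DataFrame({
    "time": pd.to_datetime([f"2019-03-18 04:00:0{s}" for s in range(10)]),
    "price": [0.00190633] * 6 + [0.00190800] * 4,
})

# Columns that exist only in df1; the left join leaves these as NaN
# for every second that has no matching row in df1.
fill_values = dict.fromkeys(df1.columns.difference(df2.columns), 0)

# how='left' keeps every row of df2 (one per second) and attaches
# the df1 values wherever the time matches.
df = pd.merge(df2, df1, on="time", how="left").fillna(fill_values)
print(df)
```

Passing a dict to fillna restricts the fill to the named columns, so a NaN that was already present in price would be left alone rather than silently turned into 0.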
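df_pvalue_trade.merge(df_check, on='time', left_index=True, how='left')
@Wen Ben, no, it gives the same result as my line of code... Could you please remove the duplicate mark? Thanks
Change how='left' to 'right'?
@Viktor.w - for me it works as you expect:
df = pd.merge(df2, df1, on='time', how='left').fillna(0)
@jezrael, the first one was the right one; your code works fine, thanks!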
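The back-and-forth above about 'left' versus 'right' comes down to which frame's rows the join keeps. A rough sketch with made-up toy frames (sparse standing in for df1, dense for df2; both names are hypothetical) shows the row counts for each choice:

```python
import pandas as pd

# Hypothetical toy frames: 'sparse' plays the role of df1, 'dense' of df2.
sparse = pd.DataFrame({
    "time": pd.to_datetime(["2019-03-18 04:00:01", "2019-03-18 04:00:03"]),
    "pvalue": [0.000065, 0.000087],
})
dense = pd.DataFrame({
    "time": pd.to_datetime([f"2019-03-18 04:00:0{s}" for s in range(5)]),
    "price": [0.00190633] * 5,
})

# Default how='inner': only timestamps present in both frames survive.
inner = sparse.merge(dense, on="time")

# how='left' with the dense frame on the left keeps every second.
left = dense.merge(sparse, on="time", how="left")

# how='right' with the dense frame on the right keeps the same rows,
# with the columns of 'sparse' listed first.
right = sparse.merge(dense, on="time", how="right")

print(len(inner), len(left), len(right))
```

This prints 2 5 5: the default inner join returns only the matching rows, which fits the symptom described in the question, while either outer variant that preserves the per-second frame returns one row per second with NaN where the sparse frame has no data.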