Python: merging two DataFrames in pandas
I have two DataFrames. GPS coordinates:
Time X Y Z
2013-06-01 00:00:00 13512.466575 -12220.845913 19279.970720
2013-06-01 00:00:00 -13529.778408 -14013.560399 -18060.112972
2013-06-01 00:00:00 25108.907276 8764.536182 1594.215305
2013-06-01 00:00:00 -8436.586675 -22468.562354 -11354.726511
2013-06-01 00:05:00 13559.288748 -11476.738832 19702.063737
2013-06-01 00:05:00 -13500.120049 -14702.564328 -17548.488127
2013-06-01 00:05:00 25128.357948 8883.802142 664.732379
2013-06-01 00:05:00 -8346.854582 -22878.993160 -10544.640975
and Glonass coordinates:
Time X Y Z
2013-06-01 00:00:00 0.248752905273E+05 -0.557450976562E+04 -0.726176757812E+03
2013-06-01 00:15:00 0.148314306641E+05 0.510153710938E+04 0.201156157227E+05
2013-06-01 00:15:00 0.242346674805E+05 -0.562089208984E+04 0.561714257812E+04
2013-06-01 00:15:00 0.195601284180E+05 -0.122148081055E+05 -0.108823476562E+05
2013-06-01 00:15:00 0.336192968750E+04 -0.122589394531E+05 -0.220986958008E+05
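I need to merge them on the Time column so that I only get satellite coordinates for the same time (I need all GPS coordinates and all Glonass coordinates for a given time). For the example above, the result should look like this: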
Time X_gps Y_gps Z_gps X_glonass Y_glonass Z_glonass
0 2013-06-01 00:00:00 13512.466575 -12220.845913 19279.970720 0.248752905273E+05 -0.557450976562E+04 -0.726176757812E+03
1 2013-06-01 00:00:00 -13529.778408 -14013.560399 -18060.112972
2 2013-06-01 00:00:00 25108.907276 8764.536182 1594.215305
3 2013-06-01 00:00:00 -8436.586675 -22468.562354 -11354.726511
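What I ended up doing is
coord = pd.merge(d_gps, d_glonass, on='Time', how='inner', suffixes=('_gps', '_glonass'))
but it duplicates the Glonass coordinates to fill the gaps in the DataFrame. What should I change to get the result I want? I'm new to pandas, so I really need your help.

After the merge (I took the liberty of renaming the columns first), you can iterate over the columns, test for duplicates, and set them to NaN. You can't set them to blank, because the dtype of the columns is float and assigning an empty string will raise an invalid literal error: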
In [272]:
df1 = df1.rename(columns={'X':'X_glonass', 'Y':'Y_glonass', 'Z':'Z_glonass'})
df = df.rename(columns={'X':'X_gps', 'Y':'Y_gps', 'Z':'Z_gps'})
merged = df.merge(df1, on='Time')
In [278]:
import numpy as np
for col in merged.columns[1:]:
    # overwrite repeated values with NaN so each column stays float
    merged.loc[merged[col].duplicated(), col] = np.nan
merged
Out[278]:
Time X_gps Y_gps Z_gps X_glonass \
0 2013-06-01 13512.466575 -12220.845913 19279.970720 24875.290527
1 2013-06-01 -13529.778408 -14013.560399 -18060.112972 NaN
2 2013-06-01 25108.907276 8764.536182 1594.215305 NaN
3 2013-06-01 -8436.586675 -22468.562354 -11354.726511 NaN
Y_glonass Z_glonass
0 -5574.509766 -726.176758
1 NaN NaN
2 NaN NaN
3 NaN NaN
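As an alternative to blanking duplicates after the merge, one could pair rows by their position within each timestamp, so that the merge itself never repeats a Glonass row. The sketch below is not the method from the answer above. It assumes the frames are called `gps` and `glonass` (hypothetical names), keeps only the X column of the sample data for brevity, and adds a helper column `n` (also hypothetical) built with `groupby(...).cumcount()`.

```python
import pandas as pd

# Sample data taken from the question; only X is kept for brevity.
gps = pd.DataFrame({
    "Time": pd.to_datetime(["2013-06-01 00:00:00"] * 4 + ["2013-06-01 00:05:00"] * 4),
    "X": [13512.466575, -13529.778408, 25108.907276, -8436.586675,
          13559.288748, -13500.120049, 25128.357948, -8346.854582],
})
glonass = pd.DataFrame({
    "Time": pd.to_datetime(["2013-06-01 00:00:00"] + ["2013-06-01 00:15:00"] * 4),
    "X": [24875.2905273, 14831.4306641, 24234.6674805, 19560.1284180, 3361.92968750],
})

# Number the rows inside each timestamp: 0, 1, 2, ...
gps["n"] = gps.groupby("Time").cumcount()
glonass["n"] = glonass.groupby("Time").cumcount()

# Keep only timestamps that occur in both frames.
gps_common = gps[gps["Time"].isin(glonass["Time"])]
glonass_common = glonass[glonass["Time"].isin(gps["Time"])]

# Merging on (Time, n) pairs the k-th GPS row with the k-th Glonass row;
# the outer join leaves NaN where one system has fewer rows at that time.
merged = (
    gps_common
    .merge(glonass_common, on=["Time", "n"], how="outer", suffixes=("_gps", "_glonass"))
    .sort_values(["Time", "n"])
    .drop(columns="n")
    .reset_index(drop=True)
)
print(merged)
```

With this layout a value that legitimately repeats within a column is left alone, whereas the `duplicated()` loop would overwrite it with NaN.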
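Have you tried merging with how=left? By the way, shouldn't the Glonass coordinates appear on all rows in the resulting coord df?
Yes, it doesn't work. It returns all GPS coordinates and duplicated Glonass coordinates.
What's wrong with the duplicated values? Aren't they semantically correct?
In this example I have only one Glonass coordinate at time 00:00:00, so I need only one entry in the df.
You have 4 entries for "2013-06-01 00:00:00" in your GPS df and 1 entry in the Glonass df. If you merge these, why wouldn't you expect to see your Glonass values duplicated for each of those time entries?
This answers the question, @Patricia, but having NaN everywhere doesn't make any sense to me; not duplicating data that clearly should be duplicated makes no sense.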
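The point made in the comments, that 4 GPS rows and 1 Glonass row sharing a timestamp produce 4 merged rows, can be checked with a small sketch. The frame names `gps` and `glonass` are assumptions, and only the X values for 00:00:00 from the question are used. The `validate` argument of `pd.merge` is shown as one way to make pandas raise an error when the key is not unique, rather than silently repeating rows.

```python
import pandas as pd

gps = pd.DataFrame({
    "Time": ["2013-06-01 00:00:00"] * 4,
    "X": [13512.466575, -13529.778408, 25108.907276, -8436.586675],
})
glonass = pd.DataFrame({
    "Time": ["2013-06-01 00:00:00"],
    "X": [24875.2905273],
})

# Every GPS row matches the single Glonass row, so the join yields 4 x 1 rows
# and the Glonass value is repeated on each of them.
coord = pd.merge(gps, glonass, on="Time", how="inner", suffixes=("_gps", "_glonass"))
print(len(coord))                    # 4
print(coord["X_glonass"].nunique())  # 1

# Asking pandas to check that the key is unique on both sides exposes the
# many-to-one relationship as an error.
try:
    pd.merge(gps, glonass, on="Time", validate="one_to_one")
    raised = False
except pd.errors.MergeError:
    raised = True
print(raised)  # True
```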