Python: intersecting two DataFrames without iteration
So I have two DataFrames. The first:
Latitude Longitude Area
0 -25.66026 28.0914 HappyPlace
1 -25.67923 28.10525 SadPlace
2 -30.68456 19.21694 AveragePlace
3 -30.12345 22.34256 CoolPlace
4 -15.12546 17.12365 BadPlace
The second:
Latitude Longitude Population
0 -25.66026 28.0914 5000
1 -25.14568 28.10525 1750
2 -30.68456 19.21694 6000
3 -30.65375 22.34256 8000
4 -15.90458 17.12365 5600
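I want to get the rows where latitude/longitude are the same in both, so that I know the population. Most importantly, for my real project I only need the intersection.
Resulting df: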
Latitude Longitude Area
0 -25.66026 28.0914 HappyPlace
2 -30.68456 19.21694 AveragePlace
I tried:
pd.merge(df1, df2, on=['LATITUDE'], how='inner')
This doesn't work; the results are weird.
set(df1['LATITUDE']).intersection(set(df2['LATITUDE']))
df1[(df1['LATITUDE'] == df2['LATITUDE'])]
df1.where(df1.LATITUDE == df2.LATITUDE)
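All of these return ValueError: Can only compare identically-labeled Series objects
(The actual DataFrames are very large, and both columns are floats.)
pd.merge() fails with a KeyError because LATITUDE is the wrong key; in the data shown, the column is named Latitude.
The following MCVE works as expected: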
import pandas as pd
import numpy as np
print(pd.__version__)
df1_string = """-25.66026 28.0914 HappyPlace
-25.67923 28.10525 SadPlace
-30.68456 19.21694 AveragePlace
-30.12345 22.34256 CoolPlace
-15.12546 17.12365 BadPlace"""
df2_string = """-25.66026 28.0914 5000
-25.14568 28.10525 1750
-30.68456 19.21694 6000
-30.65375 22.34256 8000
-15.90458 17.12365 5600"""
df1 = pd.DataFrame([x.split() for x in df1_string.split('\n')], columns=['Latitude', 'Longitude', 'Population'])
df2 = pd.DataFrame([x.split() for x in df2_string.split('\n')], columns=['Latitude', 'Longitude', 'Population'])
result = pd.merge(df1, df2, on=['Latitude'], how='inner')
print(set(df1['Latitude']).intersection(set(df2['Latitude'])))
print(df1[(df1['Latitude'] == df2['Latitude'])])
print(df1.where(df1.Latitude == df2.Latitude))
print(result)
which produces:
0.24.2
{'-25.66026', '-30.68456'}
Latitude Longitude Population
0 -25.66026 28.0914 HappyPlace
2 -30.68456 19.21694 AveragePlace
Latitude Longitude Population
0 -25.66026 28.0914 HappyPlace
1 NaN NaN NaN
2 -30.68456 19.21694 AveragePlace
3 NaN NaN NaN
4 NaN NaN NaN
Latitude Longitude_x Population_x Longitude_y Population_y
0 -25.66026 28.0914 HappyPlace 28.0914 5000
1 -30.68456 19.21694 AveragePlace 19.21694 6000
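The desired result in the question keeps only rows where both Latitude and Longitude match, while the merge above uses Latitude alone. A minimal sketch of merging on both keys follows, assuming the columns are named as in the question's tables and hold floats (the MCVE above stores them as strings). Float keys are matched by exact equality here, so coordinates from different sources might need rounding first.

```python
import pandas as pd

# Sample data copied from the question, stored as floats
df1 = pd.DataFrame({
    "Latitude": [-25.66026, -25.67923, -30.68456, -30.12345, -15.12546],
    "Longitude": [28.0914, 28.10525, 19.21694, 22.34256, 17.12365],
    "Area": ["HappyPlace", "SadPlace", "AveragePlace", "CoolPlace", "BadPlace"],
})
df2 = pd.DataFrame({
    "Latitude": [-25.66026, -25.14568, -30.68456, -30.65375, -15.90458],
    "Longitude": [28.0914, 28.10525, 19.21694, 22.34256, 17.12365],
    "Population": [5000, 1750, 6000, 8000, 5600],
})

# Inner merge on both coordinate columns keeps only rows present in both frames
both = pd.merge(df1, df2, on=["Latitude", "Longitude"], how="inner")
print(both)
```

Because both key columns are listed in `on`, no `_x`/`_y` suffixes are needed and the result carries Area and Population side by side.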
You need:
df1[df1['LATITUDE'].isin(df2['LATITUDE'])]
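Note that this filter checks latitude only. As a sketch, assuming the column is actually named Latitude (as in the question's tables) and that both coordinates should match, the same `isin` idea can be applied to (Latitude, Longitude) pairs through a MultiIndex, still without an explicit loop:

```python
import pandas as pd

# Sample data copied from the question, stored as floats
df1 = pd.DataFrame({
    "Latitude": [-25.66026, -25.67923, -30.68456, -30.12345, -15.12546],
    "Longitude": [28.0914, 28.10525, 19.21694, 22.34256, 17.12365],
    "Area": ["HappyPlace", "SadPlace", "AveragePlace", "CoolPlace", "BadPlace"],
})
df2 = pd.DataFrame({
    "Latitude": [-25.66026, -25.14568, -30.68456, -30.65375, -15.90458],
    "Longitude": [28.0914, 28.10525, 19.21694, 22.34256, 17.12365],
    "Population": [5000, 1750, 6000, 8000, 5600],
})

# Latitude-only filter, as in the answer above
lat_only = df1[df1["Latitude"].isin(df2["Latitude"])]

# Pairwise filter: build (Latitude, Longitude) keys and test membership
keys1 = pd.MultiIndex.from_frame(df1[["Latitude", "Longitude"]])
keys2 = pd.MultiIndex.from_frame(df2[["Latitude", "Longitude"]])
both = df1[keys1.isin(keys2)]

print(lat_only)
print(both)
```

With the sample data both filters select the same two rows; they would differ if a latitude appeared in both frames with different longitudes.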
Thanks for your help, but can someone explain why pd.merge() didn't work?
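A small illustration of the two errors mentioned in the thread, offered as a sketch rather than a diagnosis of the real data: column labels are case-sensitive, so `on=['LATITUDE']` raises a KeyError when the column is named Latitude, and `==` between two Series raises a ValueError when their indexes differ. The differing lengths below are an assumption made only to reproduce that second error.

```python
import pandas as pd

df1 = pd.DataFrame({"Latitude": [-25.66026, -25.67923, -30.68456]})
# Assumed shorter frame, so its index differs from df1's
df2 = pd.DataFrame({"Latitude": [-25.66026, -30.68456]})

# Merging on a label that does not exist in the frames
try:
    pd.merge(df1, df2, on=["LATITUDE"], how="inner")
    merge_error = None
except KeyError as exc:
    merge_error = exc

# Element-wise comparison of Series with different indexes
try:
    df1["Latitude"] == df2["Latitude"]
    compare_error = None
except ValueError as exc:
    compare_error = exc

print(type(merge_error).__name__, merge_error)
print(type(compare_error).__name__, compare_error)
```

Using the exact column name, as in `on=['Latitude']`, avoids the KeyError; `isin` or `merge` avoid the index-alignment issue because they match on values rather than on row position.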