Merging a Python DataFrame and a nested list

python pandas dataframe

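I can't combine/merge/cross join a DataFrame and a nested list using a where-style condition (if a nearest zip in the nested list equals the actual zip, it should not be shown in the nearest-zip field) to get the desired output.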

The code I have so far:

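# `search` is assumed to be created earlier (the call matches uszipcode's SearchEngine.by_coordinates)
print(test_df)
print(type(test_df))
n_zip = []
for x in range(5):
    # columns 1 and 2 hold BEGIN_LAT and BEGIN_LON
    nearest_result = search.by_coordinates(test_df.iloc[x, 1], test_df.iloc[x, 2], radius=30, returns=3)
    # collect the zip codes of the nearest results for this row, giving a nested list
    n_zip.append([res.zipcode for res in nearest_result])
print(n_zip)
print(type(n_zip))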

DataFrame and nested list:

Desired output:

There may be a simpler way, but as a first pass, start by dropping the 'NEAREST_ZIP' column:

>>> print(test_df)  # /!\ dropped 'NEAREST_ZIP'
   ID  BEGIN_LAT  BEGIN_LON  ZIP_CODE
0   0    30.9958   -87.2388     36441
1   1    42.5589   -92.5000     50613
2   2    42.6800   -91.9000     50662
3   3    37.0800   -97.8800     67018
4   4    37.8200   -96.8200     67042
>>> # used nzip:
>>> nzip = [[36441, 32535, 36426],
...         [50613, 50624, 50613],  # I guess there was a typo in your code here
...         [50662, 50641, 50671],
...         [67018, 67003, 67049],
...         [67042, 67144, 67074]]

>>> import pandas as pd
>>> # build a `closest` dataframe:
>>> closest = pd.DataFrame(data={k: (v1, v2) for k, v1, v2 in nzip}).T.stack().reset_index().drop(columns=['level_1'])
>>> closest.columns = ['ZIP_CODE', 'NEAREST_ZIP']
>>> # merging
>>> test_df.merge(closest)
   ID  BEGIN_LAT  BEGIN_LON  ZIP_CODE  NEAREST_ZIP
0   0    30.9958   -87.2388     36441        32535
1   0    30.9958   -87.2388     36441        36426
2   1    42.5589   -92.5000     50613        50624
3   1    42.5589   -92.5000     50613        50613
4   2    42.6800   -91.9000     50662        50641
5   2    42.6800   -91.9000     50662        50671
6   3    37.0800   -97.8800     67018        67003
7   3    37.0800   -97.8800     67018        67049
8   4    37.8200   -96.8200     67042        67144
9   4    37.8200   -96.8200     67042        67074
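
The merged output above still contains the row where NEAREST_ZIP equals ZIP_CODE (50613 / 50613), which the question asks to hide. A minimal sketch of filtering such rows out with a boolean mask; the small `merged` frame here is hand-built for illustration and only mirrors a few rows of the output above:

```python
import pandas as pd

# Hand-built stand-in for the first rows of the merged result (illustrative only)
merged = pd.DataFrame({
    "ID": [0, 0, 1, 1],
    "ZIP_CODE": [36441, 36441, 50613, 50613],
    "NEAREST_ZIP": [32535, 36426, 50624, 50613],
})

# Keep only rows whose nearest zip differs from the row's own zip
filtered = merged[merged["NEAREST_ZIP"] != merged["ZIP_CODE"]].reset_index(drop=True)
print(filtered)
```

The same mask could presumably be applied directly to the result of `test_df.merge(closest)`.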

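My nzip data can contain up to 30,000 elements. Is there any other way to do this, rather than writing it as "closest = pd.DataFrame(data={k: (v1, v2) for k, v1, v2 in nzip}).T.stack().reset_index().drop(columns=['level_1'])"?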
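
One possible alternative, offered as a sketch rather than a definitive answer: build the lookup frame from two plain lists and let `DataFrame.explode` (available in pandas 0.25 and later) produce one row per nearest zip. This assumes, as in the answer above, that the first element of every inner list is the row's own zip code; the inner lists may then have any length. The shortened `test_df` and `nzip` below are illustrative stand-ins.

```python
import pandas as pd

# Shortened stand-ins for the question's data (illustrative only)
test_df = pd.DataFrame({
    "ID": [0, 1, 2],
    "ZIP_CODE": [36441, 50613, 50662],
})
nzip = [[36441, 32535, 36426],
        [50613, 50624, 50613],
        [50662, 50641, 50671]]

# Assumption: row[0] is the actual zip, row[1:] are the nearest candidates
closest = pd.DataFrame({
    "ZIP_CODE": [row[0] for row in nzip],
    "NEAREST_ZIP": [row[1:] for row in nzip],
})

# One row per candidate; explode yields object dtype, so cast back to integers
closest = closest.explode("NEAREST_ZIP").reset_index(drop=True)
closest = closest.astype({"NEAREST_ZIP": "int64"})

# Drop candidates that are identical to the row's own zip
closest = closest[closest["NEAREST_ZIP"] != closest["ZIP_CODE"]]

result = test_df.merge(closest, on="ZIP_CODE")
print(result)
```

Because the merge key is ZIP_CODE, two rows of `test_df` sharing a zip code would each be matched against every candidate for that zip, so it may be worth checking the key for duplicates first.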