Python SettingWithCopyWarning issue: how do I create a copy of a df in a for loop?
I am trying to run the following code:
for x in range(len(df10)):
    try:
        time.sleep(1)  # to add delay in case of large DFs
        geocode_result = gmaps.geocode(df10['Address'][x])
        df10['lat'][x] = geocode_result[0]['geometry']['location']['lat']
        df10['long'][x] = geocode_result[0]['geometry']['location']['lng']
    except IndexError:
        print("Address was wrong...")
    except Exception as e:
        print("Unexpected error occurred.", e)
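I want the for loop to iterate over a list of addresses, currently stored in the df10['Address'] column of a pandas DataFrame, apply the Google geocoding service to extract the latitude and longitude for each row, and save them as columns of the original DataFrame.
When I try to do this, I get the following warning: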
SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame
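I understand this happens because I am trying to overwrite the original DataFrame, but I am really struggling to find alternative code.
I hope someone can help.
Assuming df10 is not itself a slice of another DataFrame: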
df10.loc[df10.index[x], 'lat'] = geocode_result[0]['geometry']['location']['lat']
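A minimal sketch of that loop, in case it helps. It assumes df10 was filtered out of a larger frame, so it takes an explicit .copy() first and then writes with a single .loc call per cell. The names source, keep, FAKE_RESULTS and fake_geocode are made up for illustration; fake_geocode only mimics the nested structure used in the question so the example runs without the Google client.

```python
import pandas as pd

# Hypothetical stand-in for gmaps.geocode so the sketch runs offline.
FAKE_RESULTS = {
    "1 Main St": [{"geometry": {"location": {"lat": 10.0, "lng": 20.0}}}],
    "2 Side Rd": [{"geometry": {"location": {"lat": 30.0, "lng": 40.0}}}],
}


def fake_geocode(address):
    # Unknown addresses return an empty list, so result[0] raises IndexError.
    return FAKE_RESULTS.get(address, [])


source = pd.DataFrame({
    "Address": ["1 Main St", "nowhere", "skip me", "2 Side Rd"],
    "keep": [True, True, False, True],
})

# An explicit copy makes df10 independent of source.
df10 = source[source["keep"]].copy()

# Create the target columns up front, filled with NaN.
df10["lat"] = float("nan")
df10["long"] = float("nan")

for x in range(len(df10)):
    label = df10.index[x]  # translate the position into an index label
    try:
        result = fake_geocode(df10.loc[label, "Address"])
        location = result[0]["geometry"]["location"]
        # One .loc call per assignment, no chained indexing.
        df10.loc[label, "lat"] = location["lat"]
        df10.loc[label, "long"] = location["lng"]
    except IndexError:
        print("Address was wrong...")
```

Rows that fail to geocode keep NaN in both columns, and source is left untouched.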
Create new columns by applying a function that returns multiple values.
Fake data:
import pandas as pd
df = pd.DataFrame({'a':[1,2,3,4,5,6],'b':list('abcdef')})
>>> df
   a  b
0  1  a
1  2  b
2  3  c
3  4  d
4  5  e
5  6  f
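Write a function that takes a row as its argument and operates on one of its columns:
def f(row):
    lat = row['a'] * 2
    lon = row['a'] % 2
    return lat, lon
Apply the function to the DataFrame and assign the result to new columns: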
>>> df[['lat','lon']] = df.apply(f,axis=1,result_type='expand')
>>> df
   a  b  lat  lon
0  1  a    2    1
1  2  b    4    0
2  3  c    6    1
3  4  d    8    0
4  5  e   10    1
5  6  f   12    0
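The expand option (result_type='expand') turns list-like results from the function into columns.
You did not provide any sample data and I do not have gmaps installed, but I imagine your code should look something like this: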
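To see what the option changes, here is a small comparison on made-up data: without result_type the row-wise apply returns a single Series whose values are tuples, while with 'expand' each tuple element becomes its own column. The variable names as_tuples and as_columns are only for illustration.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3]})


def f(row):
    # Return two values per row.
    return row["a"] * 2, row["a"] % 2


# Default: one Series, each value is the whole tuple.
as_tuples = df.apply(f, axis=1)

# 'expand': a DataFrame with one column per tuple element (labelled 0 and 1).
as_columns = df.apply(f, axis=1, result_type="expand")
```

Only the expanded form can be assigned to two new columns in one statement.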
def g(row):
    g_result = gmaps.geocode(row['Address'])
    lat = g_result[0]['geometry']['location']['lat']
    lon = g_result[0]['geometry']['location']['lng']
    return lat, lon
df[['lat','lon']] = df.apply(g, axis=1, result_type='expand')
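The apply version drops the delay and the error handling from the original loop. A possible way to keep both is sketched below. This is an assumption about what you need, not tested against the real service: fake_geocode, FAKE_RESULTS, geocode_row and the delay parameter are hypothetical names, and the stub only imitates the response structure from the question.

```python
import time

import pandas as pd

# Hypothetical stand-in for gmaps.geocode so the sketch runs offline.
FAKE_RESULTS = {
    "1 Main St": [{"geometry": {"location": {"lat": 10.0, "lng": 20.0}}}],
    "2 Side Rd": [{"geometry": {"location": {"lat": 30.0, "lng": 40.0}}}],
}


def fake_geocode(address):
    return FAKE_RESULTS.get(address, [])


def geocode_row(row, delay=0.0):
    time.sleep(delay)  # pause between requests, as in the original loop
    try:
        location = fake_geocode(row["Address"])[0]["geometry"]["location"]
        return location["lat"], location["lng"]
    except IndexError:
        # No match for this address: leave both values missing.
        return float("nan"), float("nan")


df = pd.DataFrame({"Address": ["1 Main St", "nowhere", "2 Side Rd"]})

# Extra keyword arguments to apply are forwarded to the function.
df[["lat", "lon"]] = df.apply(
    geocode_row, axis=1, result_type="expand", delay=0.0
)
```

With the real client you would swap fake_geocode for gmaps.geocode and pass something like delay=1.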
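Try using the loc/iloc accessors, for example df10.iloc[x, 'lat'] = … . I am using iloc here because you loop over the length of df10, but the column label is then a problem: iloc only accepts integer positions, so you would have to pass the column's position instead (for instance df10.columns.get_loc('lat')). You are better off looping over the index of df10 with loc, using .apply(), or trying to avoid the loop entirely.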
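Both loop styles side by side, on made-up data with a non-default index. Treat it as a sketch: the values written are placeholders, not geocoding results.

```python
import pandas as pd

df10 = pd.DataFrame(
    {"Address": ["a", "b", "c"], "lat": [0.0, 0.0, 0.0]},
    index=[10, 20, 30],
)

# Option 1: positional loop. iloc needs integer positions for rows AND
# columns, so look up the column position once.
lat_pos = df10.columns.get_loc("lat")
for x in range(len(df10)):
    df10.iloc[x, lat_pos] = float(x)

# Option 2: loop over the index labels and use loc with the column name.
for label in df10.index:
    df10.loc[label, "lat"] = df10.loc[label, "lat"] + 100.0
```

Either way the assignment is a single indexing call on df10, which is what avoids the warning.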