Python TypeError: string indices must be integers
I am trying to create a .csv file that contains parsed information from geocoded data returned by the Google Geocoding API. I want to parse the address information into separate columns. My script works fine until I reach the location data, where I get a TypeError. Can someone help me modify my script so that I can include this data in my table?
Here is my script:
import pandas as pd
import requests
import geocoder
import time
import json
df = pd.read_csv('/Users/albertgonzalobautista/Desktop/workingbook.csv') # define the CSV to be read and geocoded
# create new columns for the output CSV
df['geocode_data'] = ''
df['address']=''
df['street_number']=''
df['street_name']=''
df['postalcode']=''
df['city']=''
df['st_pr_mn']=''
df['country']=''
df['location_lat']=''
df['location_lon']=''
# Create function that handles the geocoding requests
average = 0
def reverseGeocode(latlng): # defines reverse geocoding function
    # Set parameters
    start = time.time()
    result = {} # create empty dict
    url = 'https://maps.googleapis.com/maps/api/geocode/json?latlng={0}&key={1}' # Access URL for Google Geocoder API
    apikey = 'XXX' # Set your API key taken from the Google API website and your Google Developers account
    request = url.format(latlng, apikey)
    #delays responses so that it does not over
    data = json.loads(requests.get(request).text)
    if len(data['results']) > 0:
        result = data['results'][0]
    #global average #if not work delete first char(uncomment)
    average = time.time() - start
    return result
for i, row in df.iterrows():
    if average < 0.3 : time.sleep(0.3 - average) #0.3 is period time (min= 0.2 max = free)
    df['geocode_data'][i] = reverseGeocode(df['lat'][i].astype(str) + ',' + df['lon'][i].astype(str))
for i, row in df.iterrows():
    if 'address_components' in row['geocode_data']:
        for component in row['geocode_data']['address_components']:
            df['address'][i] = row['geocode_data']['formatted_address']
        for component in row['geocode_data']['address_components']:
            if 'street_number' in component['types']:
                df['street_number'][i] = component['long_name']
        for component in row['geocode_data']['address_components']:
            if 'route' in component['types']:
                df['street_name'][i] = component['long_name']
                break
        for component in row['geocode_data']['address_components']:
            if 'route' in component['types']:
                df['street_name'][i] = component['long_name']
        for component in row['geocode_data']['address_components']:
            if 'postal_code' in component['types']:
                df['postalcode'][i] = component['short_name']
                break
        for component in row['geocode_data']['address_components']:
            if 'locality' in component['types']:
                df['city'][i] = component['short_name']
                break
        for component in row['geocode_data']['address_components']:
            if 'administrative_area_level_1' in component['types']:
                df['st_pr_mn'][i] = component['long_name']
                break
        for component in row['geocode_data']['address_components']:
            if 'country' in component['types']:
                df['country'][i] = component['long_name']
                break
        for component in row['geocode_data']['geometry']:
            if component['location']:
                df['location_lng'][i] = int(component['location']['lng'])
                df['location_lat'][i] = int(component['location']['lat'])
df.to_csv('test10.csv', encoding='utf-8', index=False)
I found a possible typo in your code:
for component in row['geocode_data']['geometry']:
    if 'location' in component ['lat']:
        df['location_lng'][i] = component['lng']
I believe you want to look up component['lng'] there.
If that does not solve the problem, can you print df['location_lng'] so that we can see exactly what type it is?

Your 'location' path seems to be incorrect. Try this:
for component in row['geocode_data']['geometry']:
    if component['location']: # Note: NOT: if 'location' in component ['...']
        df['location_lng'][i] = component['location']['lng']
        df['location_lat'][i] = component['location']['lat']
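Update:
It looks like the actual problem is the output "array" being used.
You initialize the output column like this:
df['location_lat']=''
Then you assign:
df['location_lat'][i]=...
which is effectively:
''[i] = ...
That would cause the TypeError you reported. [I got "KeyError: 'location_lat'", but hey...]
Perhaps try something along these lines: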
results = []
...
for ...
    df = {}
    ...
    df['location_lng'] = component['location']['lng']
    ...
    results.append(df)
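The outline above can be fleshed out roughly as follows. This is a minimal sketch, not the asker's actual script: `geocode_results` is a hand-written stand-in for real API responses, `build_rows` is a hypothetical helper, and the standard-library `csv` module is used instead of pandas so the example has no dependencies.

```python
import csv
import io

# Hypothetical stand-ins for per-row geocoder results (not real API output).
geocode_results = [
    {"geometry": {"location": {"lat": 40.7, "lng": -74.0}}},
    {},  # a row for which the geocoder returned nothing
]

def build_rows(results_in):
    """Build one fresh dict per input row and collect them in a list."""
    rows = []
    for result in results_in:
        record = {"location_lat": "", "location_lng": ""}
        location = result.get("geometry", {}).get("location")
        if location:
            record["location_lat"] = location["lat"]
            record["location_lng"] = location["lng"]
        rows.append(record)
    return rows

rows = build_rows(geocode_results)

# Write the collected rows out as CSV text.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["location_lat", "location_lng"])
writer.writeheader()
writer.writerows(rows)
csv_text = buffer.getvalue()
```

With pandas, the same list of dicts could instead be passed to `pd.DataFrame(rows)` and combined with the original frame.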
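Update:
I have not used pandas, so I am not sure how the df CSV format is reflected; however, I suspect that if you remove the "# create new columns for the output CSV" section, your dataframe structure will not be corrupted.
That is, remove:
df['geocode_data'] = ''
df['address']=''
and so on.
I may be barking up the wrong tree, though.
Another update:
My simple test case is: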
js = ... (json string as above)
data = json.loads(js)
component = data['geometry']
if component['location']:
    val = component['location']['lat']
    print(val)
This extracts the latitude just fine.
So the problem should not be the "if" part.
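One likely reason the loop version fails while this direct access works: iterating over a dict yields its keys, which are strings, so `component['location']` ends up indexing a string with a string. A small sketch, using a made-up `geometry` dict shaped like the geocoder output discussed in this thread:

```python
# Hypothetical sample shaped like result['geometry'] (values are made up).
geometry = {
    "location": {"lat": 40.7, "lng": -74.0},
    "location_type": "ROOFTOP",
}

# Iterating over a dict yields its keys, so each component is a str.
components = [component for component in geometry]

error_message = None
try:
    for component in geometry:
        component["location"]  # a str indexed with a str
except TypeError as exc:
    error_message = str(exc)

# Direct access, with no loop, reaches the nested dict.
lat = geometry["location"]["lat"]
lng = geometry["location"]["lng"]
```

Here `error_message` holds the same "string indices must be integers" text as the traceback in the question (newer Python versions append a short suffix to it).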
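Update (again...)
OK, one more try. Instead of:
df['geocode_data'][i] = reverseGeocode(...)
assign the result directly to a dictionary variable for data extraction, i.e.:
data = reverseGeocode(...)
Then extract the data as described above and assign it to the dataframe as needed.

Comments:
So I still get the same error.
It gives you the same error? What error? Can you post the full stack trace?
Line 99, at if 'location' in component['lat']: builtins.TypeError: string indices must be integers. The loop is for component in row['geocode_data']['geometry']: and the failing line is if 'location' in component['lat']:
So the location path has been corrected! Thanks, but I still get the same error, "TypeError: string indices must be integers".
In the "if" check, do you still have the "in" part? That needs to be removed.
I copied it directly and checked that I had not missed anything, but I still get the error builtins.TypeError: string indices must be integers on the line if component['location']:
I am a bit confused about how to embed the new changes into my script. Can you show me how? Thanks for your help!
I tried working with it a bit and added it in. I still get the same error on the line if component['locations'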
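As a rough illustration of the last suggestion (pull values out of the returned dictionary first, then write them to the dataframe), here is a minimal sketch. `first_component` and `extract_fields` are hypothetical helpers, and `sample` is a hand-written stand-in for what `reverseGeocode(...)` returns, not real API output.

```python
def first_component(result, wanted_type, name_key="long_name"):
    """Return the name of the first address component with the given type."""
    for component in result.get("address_components", []):
        if wanted_type in component["types"]:
            return component[name_key]
    return ""

def extract_fields(result):
    """Flatten one geocode result dict into a dict of column values."""
    # 'geometry' is a dict, so index it directly instead of looping over it.
    location = result.get("geometry", {}).get("location", {})
    return {
        "address": result.get("formatted_address", ""),
        "street_number": first_component(result, "street_number"),
        "street_name": first_component(result, "route"),
        "postalcode": first_component(result, "postal_code", "short_name"),
        "city": first_component(result, "locality", "short_name"),
        "st_pr_mn": first_component(result, "administrative_area_level_1"),
        "country": first_component(result, "country"),
        "location_lat": location.get("lat", ""),
        "location_lon": location.get("lng", ""),
    }

# Hand-written sample (assumed shape, made-up values).
sample = {
    "formatted_address": "1 Example St, Sampleville, EX 00000, Exampleland",
    "address_components": [
        {"long_name": "1", "short_name": "1", "types": ["street_number"]},
        {"long_name": "Example Street", "short_name": "Example St", "types": ["route"]},
        {"long_name": "Sampleville", "short_name": "Sampleville", "types": ["locality", "political"]},
        {"long_name": "Exampleland", "short_name": "EL", "types": ["country", "political"]},
    ],
    "geometry": {"location": {"lat": 12.5, "lng": -45.25}},
}

fields = extract_fields(sample)
empty_fields = extract_fields({})  # e.g. when the geocoder returned nothing
```

Each value could then be written back with something like `df.at[i, 'city'] = fields['city']`, which sets a single cell without chained indexing.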