Python Try/Except... AttributeError: Can only use .str accessor with string values


python pandas selenium web-scraping


I'm running a function that scrapes some data from a website and writes it to a database. I'm using Selenium and geckodriver.

...code...

first_names = driver.find_elements_by_class_name('first-name')
first_names = [name.text for name in first_names]

last_names = driver.find_elements_by_class_name('last-name')
last_names = [name.text for name in last_names]

commit_status = driver.find_elements_by_class_name('school-name')
commit_status = [commit.text for commit in commit_status]

#error is happening below

athlete['commit_school'] = athlete['commit'].str.replace('\d+', '').str.replace('/', 
'').str.replace('VERBAL', '').str.replace('SIGNED', '')

athlete['first'] = athlete['first'].str.title()
athlete['last'] = athlete['last'].str.title()

...code...

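I then loop this function over similar data on the web pages for different states. Sometimes it returns the page's data normally and moves on to the next state, while other times I get: AttributeError: Can only use .str accessor with string values! ...and the code breaks. What confuses me is that the number of iterations before the error appears seems arbitrary. Sometimes I get through 1/4 of the loop, sometimes 3/4.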
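One way the intermittent failure described above can arise is sketched below. It assumes, hypothetically, that some state pages yield no commit text at all, so the whole column is missing values; the sample rows are invented for illustration and are not the asker's data. pandas stores an all-missing column as float64, and the .str accessor refuses non-string dtypes.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for one state's scrape where no commit text was found.
athlete = pd.DataFrame({
    "first": ["jane", "JOHN"],
    "commit": [np.nan, np.nan],  # all missing, so the column dtype is float64
})

print(athlete.dtypes)

try:
    # Accessing .str on a float64 column raises before replace() even runs.
    athlete["commit"].str.replace("VERBAL", "")
    raised = False
except AttributeError as exc:
    raised = True
    print("AttributeError:", exc)

# A column that really holds strings works as expected.
titled = athlete["first"].str.title()
print(titled.tolist())
```

Whether the error shows up would then depend on what each page happens to return, which is consistent with the loop failing at a different point on each run.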

My first attempt at a fix was a try/except, but I'm not sure whether I'm doing it right or whether it's the best approach:

athlete['state'] = state_all[:length:]
athlete['hs'] = hs_all[:length:]
athlete['commit'] = commit_status[:length:]

try:
    athlete['commit_school'] = athlete['commit'].str.replace('\d+', '').str.replace('/', 
    '').str.replace('VERBAL', '').str.replace('SIGNED', '')
    athlete['first'] = athlete['first'].str.title()
    athlete['last'] = athlete['last'].str.title()
except AttributeError:
    pass

athlete['list'] = 'Rivals_' + year + '_' + list_state
athlete['home'] = profile_page[:length:]

The error happens inside the try/except statement, but I figured that if it hit an error, it would just skip all of it.
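A minimal sketch of how try/except behaves here, using plain dictionaries instead of a DataFrame (the clean helper and its sample records are hypothetical). As soon as one statement in the try block raises, the remaining statements in that block are skipped, and except ... pass hides the failure, so none of the later values get written.

```python
def clean(record):
    """Mimic the structure of the try block with plain Python objects."""
    result = {}
    try:
        result["commit_school"] = record["commit"].replace("VERBAL", "")
        # The two lines below never run if the line above raises.
        result["first"] = record["first"].title()
        result["last"] = record["last"].title()
    except AttributeError:
        pass  # the error is swallowed silently
    return result


good = clean({"commit": "VERBAL Ohio", "first": "jane", "last": "doe"})
bad = clean({"commit": None, "first": "jane", "last": "doe"})

print(good)
print(bad)  # empty: the first statement raised, so nothing was stored
```

Giving each column its own try/except, or logging inside the except branch, would at least make the skipped work visible.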

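Does adding .astype('str') to the code in the middle, as I have below, solve the issue for each column? You could have columns with mixed data types of string and object.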

athlete['commit_school'] = athlete['commit'].astype('str').str.replace(r'\d+', '', regex=True).str.replace('/', '').str.replace('VERBAL', '').str.replace('SIGNED', '')

athlete['first'] = athlete['first'].astype('str').str.title()
athlete['last'] = athlete['last'].astype('str').str.title()
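A variation on the cast above, offered only as a sketch: the as_text helper, the sample rows, and the trailing .str.strip() are assumptions added for illustration and are not part of the original answer. It replaces missing values with empty strings before casting, so they cannot end up as placeholder text, and it passes regex=True explicitly because newer pandas releases no longer treat the pattern as a regular expression by default.

```python
import numpy as np
import pandas as pd


def as_text(col: pd.Series) -> pd.Series:
    """Return the column as strings, with missing values as empty strings."""
    return col.astype(object).where(col.notna(), "").astype(str)


# Hypothetical sample rows; the real values come from the scraper.
athlete = pd.DataFrame({
    "first": ["jane", "JOHN"],
    "commit": ["Alabama SIGNED 12/18", np.nan],
})

athlete["commit_school"] = (
    as_text(athlete["commit"])
    # One pattern removes digits, slashes, and both status words.
    .str.replace(r"\d+|/|VERBAL|SIGNED", "", regex=True)
    .str.strip()
)
athlete["first"] = as_text(athlete["first"]).str.title()

print(athlete)
```

Combining the four replacements into a single alternation is a design choice for brevity; chaining separate replace calls as in the answer does the same job.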

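It successfully looped through the code, but a lot of values are still missing from the output. If some of these columns are mixed types and are left blank when the output can't be obtained, why wouldn't it pull the rest of the columns? I am getting more results than with my previous attempts, though.

I'm not sure, since this isn't the complete code. Do you have some .loc or drop_duplicates (or any logic that filters rows) in your script?

It works, and you answered my question, thank you. For some reason, when converting 'first' and 'last' with the title() method, some rows don't show up in the spreadsheet. I took those calls out and got it working, except that my first/last names are all in capital letters.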
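To narrow down where values go missing, a small diagnostic along these lines might help. It is a sketch with hypothetical names (non_string_counts and the sample rows are not from the thread). It counts the non-string entries in each column and checks the row count before and after title-casing, since .str.title() keeps every row and leaves missing entries as missing.

```python
import pandas as pd


def non_string_counts(df: pd.DataFrame, columns: list) -> dict:
    """Count entries in each column that are not Python strings."""
    return {
        col: int(df[col].map(lambda v: not isinstance(v, str)).sum())
        for col in columns
    }


# Hypothetical sample rows with one missing first name.
athlete = pd.DataFrame({
    "first": ["JANE", None, "AL"],
    "last": ["DOE", "ROE", "LEE"],
})

counts = non_string_counts(athlete, ["first", "last"])
print(counts)

rows_before = len(athlete)
athlete["first"] = athlete["first"].str.title()
rows_after = len(athlete)

# Title-casing changes values, not the number of rows.
print(rows_before, rows_after)
```

If the row count only drops later in the script, the cause is more likely a filtering step such as the .loc or drop_duplicates logic asked about above.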