Python: trying to loop through a list of profiles using Selenium
I am trying to loop through all the profiles and store each person's name, job profile, and location in lists. I am on a LinkedIn people-search results page, and this is the li HTML tag I have to loop over:
<li class="reusable-search__result-container ">
<div class="entity-result ">
<div class="entity-result__item">
<div class="entity-result__image">
<div class="display-flex align-items-center">
<a class="app-aware-link" aria-hidden="true" href="https://www.linkedin.com/search/results/people/headless?geoUrn=%5B103644278%5D&origin=FACETED_SEARCH&keywords=python%20developer">
<div id="ember522" class="ivm-image-view-model ember-view"> <div class="
ivm-view-attr__img-wrapper ivm-view-attr__img-wrapper--use-img-tag display-flex
">
<div class="EntityPhoto-circle-3-ghost-person ivm-view-attr__ghost-entity ">
<!----> </div>
</div>
</div>
</a>
</div>
</div>
<div class="entity-result__content entity-result__divider pt3 pb3 t-12 t-black--light">
<div class="mb1">
<div class="linked-area flex-1 cursor-pointer">
<div class="t-roman t-sans">
<span class="entity-result__title">
<div class="display-flex">
<span class="entity-result__title-line flex-shrink-1 entity-result__title-text--black ">
<span class="entity-result__title-text t-16">
<a class="app-aware-link" href="https://www.linkedin.com/search/results/people/headless?geoUrn=%5B103644278%5D&origin=FACETED_SEARCH&keywords=python%20developer">
<!---->LinkedIn Member<!---->
</a>
<!----> </span>
</span>
<!----></div>
</span>
</div>
<div>
<div class="entity-result__primary-subtitle t-14 t-black">
<!---->Software Developer<!---->
</div>
<div class="entity-result__secondary-subtitle t-14">
<!---->United States<!---->
</div>
</div>
</div>
</div>
<div class="linked-area flex-1 cursor-pointer">
<p class="entity-result__summary entity-result__summary--2-lines t-12 t-black--light ">
<!---->Current: Full Stack Software<span class="white-space-pre"> </span><strong><!---->Developer<!----></strong><span class="white-space-pre"> </span>at GE Healthcare<!---->
</p>
</div>
<!----> </div>
<div class="entity-result__actions entity-result__divider entity-result__actions--empty">
<!----> <!---->
</div>
</div>
</div>
</li>
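However, I cannot get the job location and the job profile. Can someone show me the code for this?
I tried something like the following, but it raised an error: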
profile_names = []
job_profiles = []
linkedin_members = browser.find_elements_by_xpath('//div[@class="linked-area flex-1 cursor-pointer"]')
for linkedin_member in linkedin_members:
    name = linkedin_member.find_element_by_xpath('.//a[@class="app-aware-link"]').get_attribute('text').strip()
    job_profile = linkedin_member.find_element_by_xpath('.//div[@class="entity-result__primary-subtitle"]').text
    profile_names.append(name)
    job_profiles.append(job_profiles)
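You just need to identify these elements (I think you can identify them by class with CSS selectors), then loop over them and append each element's text to the appropriate list: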
profile_names = []
linkedin_members = browser.find_elements_by_xpath('//span[@class="entity-result__title"]')
for linkedin_member in linkedin_members:
    name = linkedin_member.find_element_by_xpath('.//a[@class="app-aware-link"]').get_attribute('text').strip()
    profile_names.append(name)

user_positions = []
positions = browser.find_elements_by_css_selector('div.entity-result__primary-subtitle')
for position in positions:
    user_positions.append(position.text.strip())

user_locations = []
locations = browser.find_elements_by_css_selector('div.entity-result__secondary-subtitle')
for location in locations:
    user_locations.append(location.text.strip())
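The three loops above can also be folded into a single pass over each result card, which is what the asker requests further down in the comments. The sketch below is only one possible way to do that, under some assumptions: it uses the Selenium 4 style `find_elements(by, value)` call (the `find_element_by_*` helpers used in this thread were removed in Selenium 4.3), it assumes the class names shown in the question still match the page, and the names `Profile`, `collect_profiles`, `_text`, and the selector constants are made up for illustration.

```python
from dataclasses import dataclass

# Locator strategy string. selenium's By.CSS_SELECTOR has this same value, so
# the helpers below do not need to import selenium themselves.
CSS = "css selector"

# Selectors built from the class names in the question (assumed still valid).
CARD = "div.entity-result__item"
NAME = "span.entity-result__title-text a.app-aware-link"
JOB = "div.entity-result__primary-subtitle"
LOCATION = "div.entity-result__secondary-subtitle"


@dataclass
class Profile:
    name: str
    job_title: str
    location: str


def _text(card, selector):
    """Return the stripped text of the first match inside card, or '' if none."""
    matches = card.find_elements(CSS, selector)
    return matches[0].text.strip() if matches else ""


def collect_profiles(driver):
    """Visit every result card once and read all three fields from it."""
    return [
        Profile(_text(card, NAME), _text(card, JOB), _text(card, LOCATION))
        for card in driver.find_elements(CSS, CARD)
    ]
```

With a Selenium 4 driver already on the results page, `collect_profiles(browser)` would return one `Profile` per card. Using `find_elements` and checking for an empty list avoids a `NoSuchElementException` when a card lacks one of the fields.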
Another approach:
members_serach_results_xpath = '//div[@class="entity-result__item"]'
member_name_xpath = '//span[contains(@class,"entity-result__title-text")]//span[@dir]'
member_location_xpath = '//div[contains(@class,"entity-result__secondary-subtitle")]'
# Relative to each result item, so the item itself is not repeated in the path.
member_job_title_xpath = '//div[contains(@class,"entity-result__primary-subtitle")]'
profile_names = []
profile_addresses = []
profile_job_titles = []
linkedin_members = browser.find_elements_by_xpath(members_serach_results_xpath)
for linkedin_member in linkedin_members:
    # The leading '.' scopes each XPath to the current result item.
    # .text is used because span and div elements have no 'text' property.
    name = linkedin_member.find_element_by_xpath('.' + member_name_xpath).text.strip()
    profile_names.append(name)
    address = linkedin_member.find_element_by_xpath('.' + member_location_xpath).text.strip()
    profile_addresses.append(address)
    job_title = linkedin_member.find_element_by_xpath('.' + member_job_title_xpath).text.strip()
    profile_job_titles.append(job_title)
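Here I keep the locators outside the code, as parameters. One good practice is not to hard-code locators into the methods that use them.

Comments:

Can you specify the error raised when you run your code?

It raised this error: selenium.common.exceptions.NoSuchElementException: Message: no such element: Unable to locate element: {"method":"xpath","selector":".//a[@class="app-aware-link"]"}. But it is sorted now.

Thanks, it worked! Can you tell me why the code I tried did not work?

What error did you get?

I got the NoSuchElementException quoted above. I just want to use one for loop to get the data for all three things. Your solution works perfectly fine and I will go with it, but I want to understand where my mistake was.

The problem with your code is //div[@class="linked-area flex-1 cursor-pointer"]. The div this finds has no descendant with the tag a, so .//a[@class="app-aware-link"] finds nothing.

I will definitely look at this code later. Thanks for posting :)

I like this way! Just pointing out that the XPaths in lines 2 and 3 used single quotes inside single quotes; class names like entity-result__title-text should be wrapped in double quotes.

@C.Peck Thanks for the correction! That is all because of Java :) There we always use " for the outer string and ' for the inner one.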
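To make the explanation in the comments concrete, here is a small offline sketch that needs no browser. It parses a simplified, well-formed stand-in for one search result with the standard library's `xml.etree.ElementTree`; the snippet is trimmed by hand from the HTML in the question and is an assumption, not LinkedIn's real markup. It shows two things: only one of the two `linked-area` divs contains the anchor, and an exact `@class="..."` comparison does not match an element that carries extra classes.

```python
import xml.etree.ElementTree as ET

# Simplified stand-in for one result; class names copied from the question,
# nesting trimmed for the sketch.
SNIPPET = """
<li class="reusable-search__result-container">
  <div class="linked-area flex-1 cursor-pointer">
    <a class="app-aware-link">LinkedIn Member</a>
    <div class="entity-result__primary-subtitle t-14 t-black">Software Developer</div>
    <div class="entity-result__secondary-subtitle t-14">United States</div>
  </div>
  <div class="linked-area flex-1 cursor-pointer">
    <p class="entity-result__summary">Current: Full Stack Software Developer</p>
  </div>
</li>
"""

root = ET.fromstring(SNIPPET)
areas = root.findall('.//div[@class="linked-area flex-1 cursor-pointer"]')

# For each matched div: does it contain the anchor at all?
has_link = [area.find('.//a[@class="app-aware-link"]') is not None for area in areas]

# Exact match compares the whole attribute string, so it misses
# "entity-result__primary-subtitle t-14 t-black".
exact = root.findall('.//div[@class="entity-result__primary-subtitle"]')

# Token-based match, similar in spirit to the CSS selector
# div.entity-result__primary-subtitle.
by_token = [
    div for div in root.iter("div")
    if "entity-result__primary-subtitle" in div.get("class", "").split()
]

print(has_link)                      # [True, False]
print(len(exact), by_token[0].text)  # 0 Software Developer
```

In Selenium, the second `linked-area` div is where `find_element_by_xpath` raises `NoSuchElementException`, and the exact-match subtitle XPath would fail for the same reason as `exact` above; `contains(@class, ...)` or a CSS class selector avoids that.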