Python: how to find the affiliations associated with each professor for a particular journal/paper

Tags: python, selenium, selenium-webdriver, web-scraping, beautifulsoup

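The page I want to scrape is the ScienceDirect article whose URL appears in the code below. The affiliations become available after clicking the "Show more" button. I can click the button, but I cannot scrape the affiliations that load after the click. Here is the code. The for loop does not print the dl tag that contains the affiliations.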

import time
from selenium import webdriver
from selenium.common.exceptions import NoSuchElementException
from bs4 import BeautifulSoup

driver = webdriver.Firefox()
driver.get('https://www.sciencedirect.com/science/article/pii/S1571065308000656')
soup = BeautifulSoup(driver.page_source, 'html.parser')
time.sleep(7)
try:
    element = driver.find_element_by_css_selector('.show-hide-details.u-font-sans')
    element.click()
    time.sleep(15)
    for data in soup.find(id='author-group'):
        print(data)
        print('--')
except NoSuchElementException:
    pass
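I think you need to move the instantiation of the soup down, to after the click on the "Show more" button. driver.page_source returns the HTML as it is at the moment it is read, so a soup built before the click does not contain the affiliations.

If I run the following code: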

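import time

from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.common.exceptions import NoSuchElementException

driver = webdriver.Firefox()
driver.get('https://www.sciencedirect.com/science/article/pii/S1571065308000656')
time.sleep(3)  # give the page time to load

try:
    # Selenium 4.3+ removed this helper; there, use
    # driver.find_element(By.CSS_SELECTOR, ...) with By from selenium.webdriver.common.by
    element = driver.find_element_by_css_selector('.show-hide-details.u-font-sans')
    element.click()
    time.sleep(9)  # wait for the affiliations to render
    # Build the soup only after the click so it sees the new content
    soup = BeautifulSoup(driver.page_source, 'html.parser')

    for data in soup.find(id='author-group'):
        print(data)
        print('---')

except NoSuchElementException:
    pass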
My output is:

<span class="sr-only">Author links open overlay panel</span>
---
<a class="author size-m workspace-trigger" href="#!" name="baep-author-id6"><span class="content"><span class="text given-name">Ignaz</span><span class="text surname">Rutter</span><span class="author-ref" id="bfn001"><sup>1</sup></span><svg class="icon icon-envelope" focusable="false" height="24" viewbox="0 0 102 128" width="19.125"><path d="m55.8 57.2c-1.78 1.31-5.14 1.31-6.9 0l-31.32-23.2h69.54l-31.32 23.19zm-55.8-24.78l42.94 32.62c2.64 1.95 6.02 2.93 9.4 2.93s6.78-0.98 9.42-2.93l40.24-30.7v-10.34h-102zm92 56.48l-18.06-22.74-8.04 5.95 17.38 21.89h-64.54l18.38-23.12-8.04-5.96-19.08 24.02v-37.58l-1e1 -8.46v61.1h102v-59.18l-1e1 8.46v35.62"></path></svg></span></a>
---
<dl class="affiliation"><dd>Fakultät für Informatik, Universität Karlsruhe, Germany</dd></dl>
---
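The raw children printed above still carry markup such as the screen-reader span and the envelope icon. If only the author name and affiliation text are wanted, they can be picked out by class name. The following is a minimal sketch that uses only the standard library's html.parser, so it runs without a browser. AuthorGroupParser and sample_html are hypothetical names, and the sample is a shortened copy of the output above with the svg icon omitted. It assumes the page keeps the given-name, surname and affiliation class names seen in that output.

```python
from html.parser import HTMLParser


class AuthorGroupParser(HTMLParser):
    """Collect author names and affiliation text from author-group HTML."""

    def __init__(self):
        super().__init__()
        self.names = []         # one "Given Surname" string per author
        self.affiliations = []  # text of each <dd> inside <dl class="affiliation">
        self._field = None      # kind of text currently being read
        self._in_affiliation = False

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if tag == "span" and "given-name" in classes:
            self._field = "given"
        elif tag == "span" and "surname" in classes:
            self._field = "surname"
        elif tag == "dl" and "affiliation" in classes:
            self._in_affiliation = True
        elif tag == "dd" and self._in_affiliation:
            self._field = "affiliation"

    def handle_endtag(self, tag):
        if tag == "dl":
            self._in_affiliation = False
        if tag in ("span", "dd"):
            self._field = None

    def handle_data(self, data):
        if self._field == "given":
            self.names.append(data)
        elif self._field == "surname" and self.names:
            self.names[-1] += " " + data
        elif self._field == "affiliation":
            self.affiliations.append(data)


# Shortened copy of the printed output (hypothetical sample, svg icon omitted)
sample_html = """
<span class="sr-only">Author links open overlay panel</span>
<a class="author size-m workspace-trigger" href="#!" name="baep-author-id6"><span class="content"><span class="text given-name">Ignaz</span><span class="text surname">Rutter</span><span class="author-ref" id="bfn001"><sup>1</sup></span></span></a>
<dl class="affiliation"><dd>Fakultät für Informatik, Universität Karlsruhe, Germany</dd></dl>
"""

parser = AuthorGroupParser()
parser.feed(sample_html)
print(parser.names)
print(parser.affiliations)
```

With Selenium, the string passed to feed would be the HTML of the author-group element taken after the click. The same class names can also be used as BeautifulSoup selectors if you prefer to stay with bs4.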

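The data is loaded from a script tag, which means you can use requests alone: fetch the page, extract the script content, and parse it with the json library.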

import json

import requests
from bs4 import BeautifulSoup as bs

headers = {'User-Agent': 'Mozilla/5.0'}
url = 'https://www.sciencedirect.com/science/article/pii/S1571065308000656'
r = requests.get(url, headers=headers)
soup = bs(r.content, 'lxml')

# The article data is embedded as JSON in a script tag of type application/json
data = json.loads(soup.select_one('[type="application/json"]').text)

for author in data['authors']['content']:
    # Join the first two text parts of the first child (the author's name)
    print(' '.join([author['$$'][0]['$$'][0]['_'], author['$$'][0]['$$'][1]['_']]))
    # Text of the second child (the affiliation)
    print(author['$$'][1]['$$'][0]['_'])
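The fixed index paths in the loop above raise KeyError or IndexError as soon as an entry lacks a name part or an affiliation. A minimal sketch of a more forgiving lookup follows. name_and_affiliation is a hypothetical helper, and sample_entry is a hypothetical entry shaped only after the indices used in the loop above. The real payload may contain more keys and a different layout.

```python
def name_and_affiliation(entry):
    """Return (full_name, affiliation) for one item of data['authors']['content'].

    Follows the same index paths as the loop above, but returns None for
    any part that is missing instead of raising.
    """
    try:
        name_parts = entry['$$'][0]['$$']
        full_name = ' '.join([name_parts[0]['_'], name_parts[1]['_']])
    except (KeyError, IndexError, TypeError):
        full_name = None
    try:
        affiliation = entry['$$'][1]['$$'][0]['_']
    except (KeyError, IndexError, TypeError):
        affiliation = None
    return full_name, affiliation


# Hypothetical entry mirroring only the '$$' / '_' nesting used above
sample_entry = {
    '$$': [
        {'$$': [{'_': 'Ignaz'}, {'_': 'Rutter'}]},
        {'$$': [{'_': 'Fakultät für Informatik, Universität Karlsruhe, Germany'}]},
    ]
}

print(name_and_affiliation(sample_entry))
print(name_and_affiliation({'$$': []}))
```

In the loop above, the two print calls could then be replaced by a single call to this helper per entry, skipping entries where either value is None.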