Python Selenium: accessing elements with a for loop

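Basically, I want to get all the information under "Others", from the first page to the last page. This is what I tried:

The website is a bit odd. I want to get every issuer and the other details listed under "Post-Issuance".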

driver.get('https://www.chinabondconnect.com/en/Primary/Primary-Information/Onshore.html')
wait = WebDriverWait(driver, 30)
driver.find_element_by_link_text('Others').click()
for i in range(1,20):
        pg = "tb2tr pg" + str(i)
        allitems = driver.find_element_by_xpath('//*[@id="td7"]/tbody/tr[@class=pg])')
        for i in range(len(allitems)):
            issuer = driver.find_element_by_xpath('(//tr[@class=pg]//td[1]//div[2]//div)').text
            print(issuer)
It says this is not a valid XPath expression.

Can anyone help?

Thanks

Try removing the parentheses from the XPath, so that the final XPath looks like this:

issuer = driver.find_element_by_xpath('//tr[@class=pg]//td[1]//div[2]//div').text
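
One more thing to watch in these XPath strings: `pg` inside the quotes is not the Python variable. XPath reads `@class=pg` as a comparison against a child element named `pg`, so the class value has to be formatted into the string and quoted. Below is a minimal sketch of that, assuming the table id `tb7` and the `tb2tr pg<N>` row classes that come up later in this thread. The helper names `page_rows_xpath` and `issuer_xpath` are made up for illustration.

```python
def page_rows_xpath(page: int, table_id: str = "tb7") -> str:
    """Build an XPath selecting the rows whose class list contains pg<page>.

    Matching the class as a whitespace-separated token means "pg1" does not
    also match "pg10", and extra classes on the row do not break the match.
    """
    token = f"pg{page}"
    return (
        f"//table[@id='{table_id}']"
        f"//tr[contains(concat(' ', normalize-space(@class), ' '), ' {token} ')]"
    )


def issuer_xpath(page: int, table_id: str = "tb7") -> str:
    """Append the issuer cell path used in this thread to the row XPath."""
    return page_rows_xpath(page, table_id) + "/td[1]/div[2]/div"


# Print the expression for page 1 so it can be pasted into the browser console.
print(issuer_xpath(1))
```

The resulting string can be passed to whichever `find_elements` call you are using. Whether token matching or `starts-with` fits better depends on the actual class values on the page, which I have not verified.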

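Correct me if I'm wrong. As I understand it, you want to crawl the whole site, which means that clicking loads a new page. The Selenium WebDriver does not pick up the new page by itself; it stays focused on the first one. You have to tell it to switch. Here is one way to do that: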

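from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC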

# Start the driver
with webdriver.Firefox() as driver:
    # Open URL
    driver.get("https://seleniumhq.github.io")

    # Setup wait for later
    wait = WebDriverWait(driver, 10)

    # Store the ID of the original window
    original_window = driver.current_window_handle

    # Check we don't have other windows open already
    assert len(driver.window_handles) == 1

    # Click the link which opens in a new window
    driver.find_element(By.LINK_TEXT, "new window").click()

    # Wait for the new window or tab
    wait.until(EC.number_of_windows_to_be(2))

    # Loop through until we find a new window handle
    for window_handle in driver.window_handles:
        if window_handle != original_window:
            driver.switch_to.window(window_handle)
            break

    # Wait for the new tab to finish loading content

Try this XPath for allitems. It gets every tr in td7 whose class value is "tb2tr pg" + str(i).

Then you can use:

for item in allitems:
    issuer = item.find_element_by_xpath('./td[1]/div[2]/div').get_attribute('textContent')
    print(issuer)
Use find_elements() to get all the records, and use get_attribute("textContent") to read the values of hidden nodes.

for item in driver.find_elements_by_xpath("//table[@id='tb7']//tr[starts-with(@class,'tb2tr pg')]//td[1]/div[2]/div"):
    print(item.get_attribute("textContent"))
Output:

Central Huijin Investment Ltd.
Dongguan Rural Commercial Bank Co., Ltd.
Gemdale (Group) Co., Ltd.
Everbright Securities
China securities co ltd
Bank of China 
Jinan Rail Transit Group Co., Ltd.
Ping An Bank Co., Ltd.
Shaanxi Financial Holding Group Co., Ltd.
Bank of Suzhou Co., Ltd.
Chongqing Expressway Group Co., Ltd.
Shanghai World Expo Land Holdings Co., Ltd.
Beijing Capital Tourism Group Co., Ltd.
CMB Financial Leasing Co., Ltd.
Shaanxi Coal Industry Chemical Group Co., Ltd.
China Securities Co., Ltd.
Guangdong Electric Power Development Co., Ltd.
China Construction Bank 
Industrial and Commercial Bank of China
Industrial and Commercial Bank of China Limited
China Securities Co., Ltd.
China Securities Co., Ltd.
China Bohai Bank
Shangrao Investment Holding Group SCP
China Securities Co., Ltd
Everbright Securities
Guangzhou Kaide Renewable Publicly Issued Corporate Bond
SCP/Guangzhou Development Zone Business Development Group
Qingdao City Investment Financial Holding Group Renewable Publicly Issued Corporate Bond
China Railway Construction Investment Group MTN
Qingdao Guoxin Development (Group) Co., Ltd.
China Securities Co., Ltd.
China Orient Asset Management Co., Ltd
    Datang International Power Generation Co.,Ltd.
Bank of China
Bank of China 
Datang International Power Generation Co.,Ltd. 
Hangzhou City Construction Investment Group Limited
YIBIN STATE OWNED ASSETS MANAGEMENT CO.,LTD.
China Railway Construction Investment Corporation
ABC Financial Leasing
Guangzhou Metro
Aluminum Corporation of China Limited
Fubon Bank
China Securities Co., Ltd.
Ganzhou Development Investment Holding Group
Shanghai rural Commercial Bank
Everbright Securities
ICBC Financial Leasing Co., Ltd
Shanghai Pudong Development Bank
China State Railway Group Co., Ltd.
China State Railway Group Co., Ltd.
CMB Financial Leasing
CMB Financial Leasing Co., Ltd.
Bank of China
Bank of China 
Industrial and Commercial Bank of China
Industrial and Commercial Bank of China
Industrial and Commercial Bank of China Limited
Industrial and Commercial Bank of China Limited
Bank of Communications Co.,Ltd.
Zhejiang State-owned Capital Operation Co., Ltd.
China Merchant Bank
China Merchants Bank
Bank of Communications Financial Leasing Co., Ltd.
CCB Financial Leasing Co., Ltd
Central Huijin Investment Ltd.
Central Huijin Investment Ltd.
China Securities Co., Ltd
Everbright Securities
Beijing Infrastructure Investment Co., LTD
Huishang Bank Corporation
Bank of Communication
China Nonferrous Metal Mining (Group) Co., Ltd
Everbright Securities
Industrial and Commercial Bank of China
Industrial and Commercial Bank of China Limited
China Securities Co., Ltd
China Everbright Bank Co., Ltd
Bank of China...so on
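
For anyone on a recent Selenium 4 release, where the `find_elements_by_xpath` helper is no longer available, the same idea can be written with `find_elements(By.XPATH, ...)`. The sketch below reuses the XPath from this answer and strips the whitespace that `textContent` tends to carry. The function name `collect_issuers` and the stand-in classes are assumptions added for illustration, so that the snippet runs without a browser; with a real driver you would pass the driver object instead.

```python
# XPath from the answer above; the table id and class prefix come from this thread.
ISSUER_XPATH = (
    "//table[@id='tb7']//tr[starts-with(@class,'tb2tr pg')]"
    "//td[1]/div[2]/div"
)


def collect_issuers(driver):
    """Return the non-empty, stripped textContent of every issuer cell."""
    # "xpath" is the value of selenium.webdriver.common.by.By.XPATH
    cells = driver.find_elements("xpath", ISSUER_XPATH)
    names = []
    for cell in cells:
        raw = cell.get_attribute("textContent") or ""
        name = raw.strip()
        if name:
            names.append(name)
    return names


# Stand-in objects (hypothetical) so the sketch can run without a browser.
class FakeCell:
    def __init__(self, text_content):
        self._text_content = text_content

    def get_attribute(self, name):
        return self._text_content if name == "textContent" else None


class FakeDriver:
    def __init__(self, cells):
        self._cells = cells

    def find_elements(self, by, value):
        return list(self._cells)


demo_driver = FakeDriver(
    [FakeCell("  Bank of China \n"), FakeCell(""), FakeCell("Guangzhou Metro")]
)
demo_names = collect_issuers(demo_driver)
print(demo_names)
```

With a real session, `collect_issuers(driver)` would be called after `driver.get(...)` and the click on "Others".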


Hi, thanks, but it already throws an error at the line: allitems

Hi, thanks, but the site puts all the entries on a single page, so I don't need to click through to a next page.

I see. Have you tried printing by element? elements = driver.find_elements(By.TAG_NAME, "a")

Hi, Selenium is better.

Hi, thanks for your help! But when I use for i in range(1,20): pg = "tb2tr pg" + str(i) allitems = driver.find_element_by_xpath("//table[@id='td7']/tbody//tr[starts-with(@class,'{}')]".format(pg)), it says it cannot be found. Not sure why.

I think there is an extra "on" in tb2tr pg1 on, so I changed it to for i in range(2,20), but it still cannot be found.

It is tb7, not td7.

It works now, thanks! But it returns blank. Does that mean the site cannot be scraped? Also, may I ask why you use tr[starts-with(@class,'{}')] rather than tr[@class='{}']?

Hey, I tried an example. There is nothing wrong with the site, but somehow I cannot get the text back; everything comes back empty: for i in range(2,20): pg = "tb2tr pg" + str(i) driver.implicitly_wait(10) allitems = driver.find_elements_by_xpath("//table[@id='tb7']/tbody//tr[starts-with(@class,'{}')]".format(pg)) for item in allitems: issuer = item.find_elements_by_xpath('./td[2]/div[2]/span').text print(issuer)

Hi, it works, thanks for your help! May I ask why textContent is used rather than .text?

@Joyce: .text works when the element is visible on the page. The application you are using requires scrolling the page to make the elements visible, which is why it gives an empty value. textContent retrieves the value even when it is hidden, as long as it exists in the DOM. Hope I have answered your query. Please mark this as accepted and upvote. Thanks.
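
To make the `.text` versus `textContent` point concrete, here is a small sketch of a helper that prefers the rendered text and falls back to `textContent` when `.text` comes back empty. The name `element_text` and the stand-in `FakeElement` class are assumptions for illustration. A real Selenium `WebElement` exposes the same `.text` property and `get_attribute()` method used here.

```python
def element_text(element):
    """Prefer the rendered text; fall back to textContent from the DOM."""
    visible = (element.text or "").strip()
    if visible:
        return visible
    # Elements that are not displayed report empty .text, but their
    # textContent is still readable as long as the node exists in the DOM.
    return (element.get_attribute("textContent") or "").strip()


# Stand-in element (hypothetical) so the sketch can run without a browser.
class FakeElement:
    def __init__(self, text, text_content):
        self.text = text
        self._text_content = text_content

    def get_attribute(self, name):
        return self._text_content if name == "textContent" else None


shown = FakeElement("Bank of China", "Bank of China ")
hidden = FakeElement("", "\n  China Bohai Bank  ")

print(element_text(shown))
print(element_text(hidden))
```

The fallback keeps `.text` for elements that are on screen, which matters when the markup contains hidden helper nodes you do not want in the output.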