
Python BeautifulSoup not returning values in the first column of an HTML table


I have a script that opens an http page, finds the HTML table, and stores all of that table's values to a csv.

A copy of the same script, pointed at a second website I want to pull information from, is not working. It fails to produce values for column 1, but still produces the correct values for columns 2 and 3.

I believe this is because on site one, the first column looks like: 4216278

whereas on the second site, the first column looks like: 4268023

I believe the leading whitespace before '4268023' may be the problem.

In the code below, col[0] comes back empty, while col[1] and col[2] return the correct values:

html = driver.page_source
soup = BeautifulSoup(html, "html.parser")

table = soup.find("table", border=1)
header = ['Claim_Number', 'Township_Area', 'Date Cancelled_Forfeited']

records = []
for row in table.findAll('tr')[1:]:   # skip the header row
    col = row.findAll('td')

    Claim_Number = col[0].string      # comes back as None on the second site
    Township_Area = col[1].string
    Date_Cancelled_Forfeited = col[2].string
    Claim_Cancel = (Claim_Number, Township_Area, Date_Cancelled_Forfeited)
    records.append(Claim_Cancel)
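In BeautifulSoup, tag.string is None whenever a tag contains more than one child, which is exactly what a cell holding leading whitespace plus a link produces. A small demonstration with hypothetical markup mimicking the two cells:

from bs4 import BeautifulSoup

# Hypothetical cells: bare text on site one, whitespace plus a link on site two.
plain = BeautifulSoup("<td>4216278</td>", "html.parser").td
linked = BeautifulSoup("<td>\n    <a href='#'>4268023</a></td>", "html.parser").td

print(plain.string)                  # '4216278': one text child, so .string works
print(linked.string)                 # None: whitespace node plus <a> is two children
print(linked.get_text(strip=True))   # '4268023': get_text() still recovers the value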
The script runs to completion, but it does not produce a complete csv: column 1 is left blank.

To help with testing, could you provide the two URLs, one that works and one that doesn't? See my edited post for my URLs. I managed to solve the problem I was having: instead of only pulling from the 'td' tags, I added a line of code to search for the 'a' tags, then used link = row.findAll('a') and Claim_Number = link[0].string.strip() and wrote out the values I was looking for.
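A minimal sketch of that fix applied to the failing loop above (variable names taken from the scripts below):

records = []
for row in table.findAll('tr')[1:]:
    col = row.findAll('td')
    link = row.findAll('a')                 # the claim number sits inside an <a> tag

    Claim_Number = link[0].string.strip()   # strip() drops the surrounding whitespace
    Township_Area = col[1].string
    Date_Cancelled_Forfeited = col[2].string
    records.append((Claim_Number, Township_Area, Date_Cancelled_Forfeited))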
Working script:

import csv
import selenium
from selenium import webdriver
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support.ui import Select
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import NoSuchElementException
from bs4 import BeautifulSoup


# Create a new instance of the Firefox driver
driver = webdriver.Firefox()

# go to the MCI Main Menu KENORA Mining Division page
driver.get("http://www.mci.mndm.gov.on.ca/Claims/Cf_Claims/clm_css.cfm?Div=10")

# find first text box
inputElement = driver.find_element_by_name("Claim_View__Claim_Number")

# type nothing in the text box (a claim number could go here for a specific search)
inputElement.send_keys("")

# submit the form (click on the search button)
inputElement.submit()

# Read the HTML table with Client info, store the info, and print to a file.
html = driver.page_source
soup = BeautifulSoup(html, "html.parser")

table = soup.find("table", border=1)
header = ['Claim_Number', 'Township_Area', 'Recorded_Holder', 'Due_Date']

records = []
for row in table.findAll('tr')[1:]:
    col = row.findAll('td')

    Claim_Number = col[0].string
    Township_Area = col[1].string
    Recorded_Holder = col[2].string
    Due_Date = col[3].string
    Claim_Att = (Claim_Number, Township_Area, Recorded_Holder, Due_Date)
    records.append(Claim_Att)
with open('MCItest2.csv', 'wb') as f:
    writer = csv.writer(f)
    writer.writerow(header)
    writer.writerows(row for row in records if row)
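One aside on both scripts: WebDriverWait and expected_conditions are imported but never used, so page_source is read immediately after submit() and can race the page load. A hedged sketch of waiting for the results table first (the locator is an assumption):

from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By

# Wait up to 10 seconds for the results table before reading page_source.
WebDriverWait(driver, 10).until(
    EC.presence_of_element_located((By.XPATH, "//table[@border='1']"))
)
html = driver.page_source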
Non-working script:

import csv
import selenium
from selenium import webdriver
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support.ui import Select
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import NoSuchElementException
from bs4 import BeautifulSoup


# Create a new instance of the Firefox driver
driver = webdriver.Firefox()

# go to the MCI Main Menu KENORA Mining Division page
driver.get("http://www.mci.mndm.gov.on.ca/Claims/Cf_Claims/clm_cas.cfm?Div=10")

# find the cancellation-period radio buttons and select the 90-day option
inputElement = driver.find_element_by_name("Claim_Cancellation__Canc_Period")
driver.find_element_by_xpath("//input[@name='Claim_Cancellation__Canc_Period' and @value='90']").click()

# type nothing in the text box (a claim number could go here for a specific search)
inputElement.send_keys("")

# submit the form (click on the search button)
inputElement.submit()

# Read the HTML table with Client info, store the info, and print to a file.
html = driver.page_source
soup = BeautifulSoup(html, "html.parser")

table = soup.find("table", border=1)
header = ['Claim_Number', 'Township_Area', 'Date Cancelled_Forfeited']

records = []
for row in table.findAll('tr')[1:]:
    col = row.findAll('td')

    Claim_Number = col[0].string      # returns None here: the cell holds whitespace plus a link
    Township_Area = col[1].string
    Date_Cancelled_Forfeited = col[2].string
    Claim_Cancel = (Claim_Number, Township_Area, Date_Cancelled_Forfeited)
    records.append(Claim_Cancel)

with open('MCICancel2.csv', 'wb') as f:
    writer = csv.writer(f)
    writer.writerow(header)
    writer.writerows(row for row in records if row)
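A final note: open('...', 'wb') together with csv.writer is a Python 2 idiom. Under Python 3 the equivalent write would look like this (a sketch, assuming the same header and records):

# Python 3: open in text mode with newline='' to avoid blank rows on Windows.
with open('MCICancel2.csv', 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerow(header)
    writer.writerows(row for row in records if row)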