Python: getting past a 404 with mechanize
I'm writing a Python script that reads a file of URLs, but I know that not all of the URLs will work. I'm trying to figure out how to get around this so that the script moves on to the next line of the file instead of raising the error I've posted below. I know I need some kind of if statement, but I can't quite work it out.
from mechanize import Browser
from BeautifulSoup import BeautifulSoup
import csv

# raw string so the backslashes in the Windows path are taken literally
me = open(r'C:\Python27\myfile.csv')
reader = csv.reader(me)
mech = Browser()

for url in me:
    response = mech.open(url)  # this is the call that fails on a 404
    html = response.read()
    soup = BeautifulSoup(html)
    table = soup.find("table", border=3)
    # skip the first two rows of the table
    for row in table.findAll('tr')[2:]:
        col = row.findAll('td')
        BusinessName = col[0].string
        Phone = col[1].string
        Address = col[2].string
        City = col[3].string
        State = col[4].string
        Zip = col[5].string
        Restaurantinfo = (BusinessName, Phone, Address, City, State)
        print "|".join(Restaurantinfo)
When I run that block of code, it raises the following error:
httperror_seek_wrapper: HTTP Error 404: Not Found
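Basically, what I'm asking is how to get Python to ignore this error and try the next URL.

If the file contains nothing but URLs, it is probably simpler to put one URL on each line and use the following code: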
from mechanize import Browser
from BeautifulSoup import BeautifulSoup

# raw string so the backslashes in the Windows path are taken literally
me = open(r'C:\Python27\myfile.csv')
mech = Browser()
for url in me.readlines():
    ...
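If you want to keep your code as it is, with csv.reader, you have to loop over the reader rather than the file:
for row in reader:
    # each row is a list of fields; this assumes the URL is in the first column
    url = row[0]
    ...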
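
Neither loop above skips a failing URL by itself. One common way to do that is to wrap the call that opens the URL in try/except and carry on with the next line. The following is only a minimal sketch, not code from the original answer: `open_all` and its `open_url` parameter are hypothetical names introduced here, and it assumes that the error mechanize raises for a 404 is a subclass of the standard library's `HTTPError`, so that catching that class is enough.

```python
try:
    from urllib.error import HTTPError   # Python 3
except ImportError:
    from urllib2 import HTTPError        # Python 2.7, as in the question


def open_all(urls, open_url):
    """Call open_url on each URL and skip any that raise HTTPError.

    urls is any iterable of lines, such as an open file.
    open_url is any callable that returns a response object; the
    assumption here is that mech.open from a mechanize Browser is passed in.
    Returns a list of (url, response) pairs for the URLs that opened.
    """
    opened = []
    for line in urls:
        url = line.strip()      # drop the trailing newline read from the file
        if not url:
            continue            # ignore blank lines
        try:
            response = open_url(url)
        except HTTPError as err:
            # 404s and other HTTP errors end up here: report and move on
            print("Skipping %s (HTTP %s)" % (url, err.code))
            continue
        opened.append((url, response))
    return opened
```

With the names from the question this would be called as `open_all(me, mech.open)`, after which each returned response can be passed through `response.read()` and BeautifulSoup exactly as before. Catching only `HTTPError` is a deliberate choice in this sketch: other problems, such as a malformed URL, still raise and stay visible.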