Python: how can I bypass or catch a socket.timeout error when checking whether websites are up?
I have been developing a program that checks whether websites are up. I read the URLs from an Excel sheet and write the results back to the same Excel sheet as True or False, but for some URLs I get a socket.timeout error and the code stops working after that. The code is as follows:
import http.client as httpc
from urllib.parse import urlparse
import pandas as pd
import xlwings as xw
import smtplib
from xlsxwriter import Workbook
import socket

x = []
df = pd.read_excel(r'xyz.xlsx')
df1 = pd.DataFrame(df, columns=['URL'])
print(df1)
url_list = df["URL"].tolist()
print(url_list)

for i in url_list:
    def checkUrl(i):
        if 'http' not in i:
            i = 'https://' + i
        p = urlparse(i)
        conn = httpc.HTTPConnection(p.netloc, timeout=4)
        conn.request('HEAD', p.path)
        try:
            resp = conn.getresponse()
            return resp.status < 400
        except requests.exceptions.RequestException:
            return False
    print(checkUrl(i))
    x.append(checkUrl(i))

workbook = Workbook('abc.xlsx')
Report_Sheet = workbook.add_worksheet()
Report_Sheet.write(0, 1, 'Value')
Report_Sheet.write_column(1, 1, x)
workbook.close()
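
My first guess is that
resp = conn.getresponse()
should be inside the try clause. If that doesn't work, please add the output of your program.

There are a number of problems in this code:
- You unconditionally use HTTP, even when the URL asks for HTTPS.
- You send the request outside the try: block.
- The except clause expects a requests.exceptions.RequestException, which this code cannot raise.
Since you are using the low-level http.client rather than the requests library, you should only see errors from the socket library, and these are all subclasses of OSError.
Your code could become (note: untested):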
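def checkUrl(i):
    if 'http' not in i:
        i = 'https://' + i
    p = urlparse(i)
    if p.scheme == 'http':
        conn = httpc.HTTPConnection(p.netloc, timeout=4)
    else:
        conn = httpc.HTTPSConnection(p.netloc, timeout=4)
    try:
        conn.request('HEAD', p.path)
        resp = conn.getresponse()
        return resp.status < 400
    except OSError:
        # socket.timeout and the other socket errors are subclasses of OSError
        return False

In my experience, this error occurs when the hostname resolves to a valid IP address, but the server is no longer configured to serve that hostname. As a result, the server ignores your attempts to connect to it.
To handle this, return False when the timeout error occurs: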
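import socket

try:
    # The connection is opened while sending the request, so the timeout
    # can be raised here (move this line here from above the try)
    conn.request('HEAD', p.path)
    resp = conn.getresponse()
    return resp.status < 400
except httpc.HTTPException:
    return False
except socket.timeout as err:
    return False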
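
To see the timeout being caught without depending on a real website, the situation can be reproduced locally. This is only a sketch under assumptions: is_up is a name made up for this example, and a listening socket on 127.0.0.1 that never replies stands in for the unresponsive server. It relies on the point made earlier that socket.timeout is a subclass of OSError.

```python
import http.client
import socket


def is_up(host, port, path="/", timeout=4):
    """Return True if a HEAD request gets a status below 400, else False."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        # The TCP connection is opened here, so connect timeouts surface here
        conn.request("HEAD", path)
        return conn.getresponse().status < 400
    except (http.client.HTTPException, OSError):
        # socket.timeout is an OSError subclass, so it is handled here too
        return False
    finally:
        conn.close()


# Stand-in for an unresponsive site: a socket that listens but never answers
silent = socket.socket()
silent.bind(("127.0.0.1", 0))   # port 0 lets the OS pick a free port
silent.listen(1)
port = silent.getsockname()[1]

print(is_up("127.0.0.1", port, timeout=0.5))
silent.close()
```

With the request inside the try block, the loop over the URL list can continue with the next site instead of stopping at the first timeout.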
It checks 8 websites, and the 9th one returns: sock.connect(sa) socket.timeout: timed out
Thank you very much, it worked. Is there a way to identify these kinds of sites as non-working?
import socket
import ssl
import http.client

try:
    # Name resolution, connecting and the SSL handshake all happen while
    # the request is sent, so this call belongs inside the try block
    conn.request('HEAD', p.path)
    resp = conn.getresponse()
    return resp.status < 400
except http.client.HTTPException as err:
    # A connection was established, but the request failed
    return False
except socket.timeout as err:
    # No answer within the timeout, e.g. the website no longer exists on the server
    return False
except socket.gaierror as err:
    # Could not resolve the hostname to an IP address
    return False
except ssl.CertificateError as err:
    # The SSL certificate was never configured, or it cannot be trusted
    return False
except ssl.SSLError as err:
    # Other SSL errors not covered by ssl.CertificateError
    return False
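
To mark these kinds of sites in the report, one option is to record why a check failed instead of only False. The following is a minimal sketch, not a tested solution for your sheet: classify_failure and check_url are names made up for this example, and the label strings are arbitrary choices.

```python
import http.client
import socket
import ssl
from urllib.parse import urlparse


def classify_failure(exc):
    """Map an exception raised while probing a URL to a short label."""
    if isinstance(exc, socket.timeout):
        return "timeout"
    if isinstance(exc, socket.gaierror):
        return "dns failure"
    # CertificateError is a subclass of SSLError, so it is tested first
    if isinstance(exc, ssl.CertificateError):
        return "bad certificate"
    if isinstance(exc, ssl.SSLError):
        return "ssl error"
    if isinstance(exc, http.client.HTTPException):
        return "http error"
    return "connection error"


def check_url(url, timeout=4):
    """Return (is_up, reason); reason is 'ok' or a failure label."""
    if not url.startswith(("http://", "https://")):
        url = "https://" + url
    p = urlparse(url)
    if p.scheme == "http":
        conn = http.client.HTTPConnection(p.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPSConnection(p.netloc, timeout=timeout)
    try:
        conn.request("HEAD", p.path or "/")
        status = conn.getresponse().status
    except (http.client.HTTPException, OSError) as exc:
        return False, classify_failure(exc)
    finally:
        conn.close()
    if status < 400:
        return True, "ok"
    return False, "http status %d" % status
```

The reason strings could then be collected in a second list and written next to the True/False values with another Report_Sheet.write_column call.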