Python 3.x: How to ignore HTTP errors when making requests in a for loop?


This is my code. It checks several URLs for a specific keyword and writes to an output file whether the keyword was found:

import requests
import pandas as pd
from bs4 import BeautifulSoup

df = pd.read_csv('/path/to/input.csv')
urls = df.T.values.tolist()[2]
myList= []

for url in urls:
    url_1 = url
    keyword ='myKeyword'
    res = requests.get(url_1)
    finalresult= print(keyword in res.text)

    if finalresult == False:
        myList.append("NOT OK")
    else:
        myList.append("OK")

df["myList"] = pd.DataFrame(myList, columns=['myList'])

df.to_csv('/path/to/output.csv', index=False)
However, as soon as any one of my URLs is down and an HTTP error occurs, the script stops and shows the following error:

    raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPSConnectionPool(host='argos-yoga.com', port=443): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x122582d90>: Failed to establish a new connection: [Errno 8] nodename nor servname provided, or not known'))

How can I ignore these errors and let the script keep scanning? Can anyone help me? Thanks.

You can simply use a try/except block.

For example:

import requests
import pandas as pd
from bs4 import BeautifulSoup

df = pd.read_csv('/path/to/input.csv')
urls = df.T.values.tolist()[2]
myList= []

for url in urls:
    url_1 = url
    keyword ='myKeyword'
    try:
        res = requests.get(url_1)
        finalresult = keyword in res.text
        print(finalresult)
        if finalresult == False:
            myList.append("NOT OK")
        else:
            myList.append("OK")
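    except Exception as e:
        # Report the failure and still record a result,
        # so that myList keeps one entry per row of df
        print(f"There was an error, error = {e}")
        myList.append("NOT OK")

# Build the column and write the file once, after the loop has finished
df["myList"] = pd.DataFrame(myList, columns=['myList'])
df.to_csv('/path/to/output.csv', index=False)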

Try putting the try..except only around requests.get() and res.text.

For example:

import requests
import pandas as pd
from bs4 import BeautifulSoup

df = pd.read_csv('/path/to/input.csv')
urls = df.T.values.tolist()[2]
myList= []

for url in urls:
    url_1 = url
    keyword ='myKeyword'
    try:                                    # <-- put try..except here
        res = requests.get(url_1)
        finalresult = keyword in res.text   # <-- remove print()
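    except requests.exceptions.RequestException:   # <-- catch request failures only, so Ctrl+C still stops the script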
        finalresult = False

    if finalresult == False:
        myList.append("NOT OK")
    else:
        myList.append("OK")

df["myList"] = pd.DataFrame(myList, columns=['myList'])

df.to_csv('/path/to/output.csv', index=False)
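
The `# <-- remove print()` comment matters because `print()` always returns `None`. In the question's code `finalresult == False` was therefore never true, and every URL was marked "OK". Here is a minimal sketch of the difference; the `page_text` string is a made-up stand-in for `res.text`, used only for illustration:

```python
# Hypothetical page text standing in for res.text (assumption for illustration)
page_text = "some page content"
keyword = "myKeyword"

# print() writes the boolean to the terminal but returns None
broken_result = print(keyword in page_text)

# None is not equal to False, so the "NOT OK" branch is never taken
broken_label = "NOT OK" if broken_result == False else "OK"

# Storing the boolean itself gives the intended label
fixed_result = keyword in page_text
fixed_label = "NOT OK" if fixed_result == False else "OK"
```

If you still want to see each result in the terminal, assign the boolean first and call `print(finalresult)` on a separate line.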

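Thank you, Ahmed! I tried the code above, but it does not append 'NOT OK' when finalresult == False. I get "OK" for every URL. Do you know how I can fix this?

I'm not sure. So it says OK even when finalresult equals False?

OK, yes, I found it: you need finalresult = keyword in res.text, because you had assigned it to the result of a print statement. I edited the code. Try it and tell me whether it works.

This works! Thank you, Andrej. I was just wondering how I could add a flag for HTTP errors. For example, if my input file contains the URL "argos-yoga.com", I would like it to be marked "Down" instead of "OK", because the page is not working. Can I add something like this somewhere in the code:
except Exception as e: print(f"There was an error, error = {e}") myList("Down") pass
The reason I ask is that when I run this script from the terminal, it would be good to also know which URLs throw HTTP errors and to save them in my errorLog.txt. Many thanks.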
for url in urls:
    url_1 = url
    keyword ='myKeyword'
    try:                                    # <-- put try..except here
        res = requests.get(url_1)

        if keyword in res.text:
            myList.append("OK")
        else:
            myList.append("NOT OK")
    except requests.exceptions.RequestException:   # connection errors, timeouts, invalid URLs
        myList.append("Down")
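
To also save the failing URLs in errorLog.txt, as asked in the comment, the loop body could be moved into a small helper. This is only a sketch under a few assumptions: the function name `check_url`, the `log_path` and `timeout` parameters, and the tab-separated log format are my own choices. Calling `raise_for_status()` so that 4xx/5xx responses also count as "Down" is an assumption about what you want, not something your original code did.

```python
import requests


def check_url(url, keyword, log_path="errorLog.txt", timeout=10):
    """Return "OK", "NOT OK" or "Down" for one URL and log failures.

    check_url, log_path and timeout are hypothetical names chosen
    for this sketch; they are not part of the original script.
    """
    try:
        res = requests.get(url, timeout=timeout)
        # Assumption: treat 4xx/5xx responses as failures too.
        # raise_for_status() raises requests.exceptions.HTTPError for them.
        res.raise_for_status()
    except requests.exceptions.RequestException as e:
        # Append the failing URL and the reason to the log file
        with open(log_path, "a", encoding="utf-8") as log:
            log.write(f"{url}\t{e}\n")
        return "Down"

    # The request succeeded, so only the keyword check is left
    return "OK" if keyword in res.text else "NOT OK"
```

With this helper the loop shrinks to `myList = [check_url(url, 'myKeyword') for url in urls]`, and the rest of the script stays the same. Drop the `raise_for_status()` line if a page that answers with an error status should still be searched for the keyword.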