Python 3.x: how do I ignore HTTP errors when making requests in a for loop?
Here is my code. It checks a list of URLs for a specific keyword and writes to an output file whether or not the keyword was found:
import requests
import pandas as pd
from bs4 import BeautifulSoup

df = pd.read_csv('/path/to/input.csv')
urls = df.T.values.tolist()[2]
myList = []

for url in urls:
    url_1 = url
    keyword = 'myKeyword'
    res = requests.get(url_1)
    finalresult = print(keyword in res.text)
    if finalresult == False:
        myList.append("NOT OK")
    else:
        myList.append("OK")

df["myList"] = pd.DataFrame(myList, columns=['myList'])
df.to_csv('/path/to/output.csv', index=False)
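As an aside, `df.T.values.tolist()[2]` transposes the whole frame just to pull out the third column; `df.iloc[:, 2].tolist()` reads that column directly. A quick sketch with made-up data (the column names are illustrative, not from the original input.csv) to show the two match:

```python
import pandas as pd

# Made-up three-column frame standing in for input.csv.
df = pd.DataFrame({"name": ["a", "b"], "id": [1, 2], "url": ["http://x", "http://y"]})

urls_via_transpose = df.T.values.tolist()[2]  # third row of the transpose
urls_direct = df.iloc[:, 2].tolist()          # third column, read directly
print(urls_via_transpose)  # ['http://x', 'http://y']
```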
However, as soon as any one of the URLs is down and raises an HTTP error, the script stops with the following error:
raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPSConnectionPool(host='argos-yoga.com', port=443): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x122582d90>: Failed to establish a new connection: [Errno 8] nodename nor servname provided, or not known'))
How can I ignore these errors and let the script continue scanning? Can anyone help? Thanks.

You can simply use try..except for this. For example:
import requests
import pandas as pd
from bs4 import BeautifulSoup

df = pd.read_csv('/path/to/input.csv')
urls = df.T.values.tolist()[2]
myList = []

for url in urls:
    url_1 = url
    keyword = 'myKeyword'
    try:
        res = requests.get(url_1)
        finalresult = keyword in res.text
        print(finalresult)
        if finalresult == False:
            myList.append("NOT OK")
        else:
            myList.append("OK")
    except Exception as e:
        print(f"There was an error, error = {e}")
        pass

df["myList"] = pd.DataFrame(myList, columns=['myList'])
df.to_csv('/path/to/output.csv', index=False)
Try putting the try..except only around requests.get() and res.text. For example:
import requests
import pandas as pd
from bs4 import BeautifulSoup

df = pd.read_csv('/path/to/input.csv')
urls = df.T.values.tolist()[2]
myList = []

for url in urls:
    url_1 = url
    keyword = 'myKeyword'
    try:  # <-- put try..except here
        res = requests.get(url_1)
        finalresult = keyword in res.text  # <-- remove print()
    except:
        finalresult = False
    if finalresult == False:
        myList.append("NOT OK")
    else:
        myList.append("OK")

df["myList"] = pd.DataFrame(myList, columns=['myList'])
df.to_csv('/path/to/output.csv', index=False)
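One refinement worth noting (my own suggestion, not from the answers above): a bare `except:` also silences interrupts like KeyboardInterrupt. requests groups all of its network errors under requests.exceptions.RequestException, so catching that keeps the handler scoped to request failures. A minimal sketch with a hypothetical check_url helper; the ".invalid" hostname is reserved and never resolves, so it reliably exercises the error path:

```python
import requests

def check_url(url, keyword, timeout=10):
    # Hypothetical helper: returns "OK", "NOT OK", or "Down" for one URL.
    try:
        res = requests.get(url, timeout=timeout)
        return "OK" if keyword in res.text else "NOT OK"
    except requests.exceptions.RequestException:
        # Covers ConnectionError, Timeout, HTTPError, etc.,
        # but lets KeyboardInterrupt and friends propagate.
        return "Down"

print(check_url("https://no-such-host.invalid/", "myKeyword"))  # Down
```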
Thank you, Ahmed! I tried the code above, but it doesn't append "NOT OK" when finalresult == False; I get "OK" for every URL. Any idea how I can fix this?

I'm not sure I follow: so when finalresult equals False, it says OK?

OK, yes, I see it: you need to add finalresult = keyword in res.text, because you had only assigned it to a print statement. I've edited the code. Try it and let me know if it works.

That works! Thank you, André. I was just wondering how I could add a flag for HTTP errors. For example, if "argos-yoga.com" is one of the URLs in my input file, I'd want it flagged as "Down" rather than "OK", since the page isn't working. Could I add something like this somewhere in the code: except Exception as e: print(f"There was an error, error = {e}") myList.append("Down") pass
The reason I ask is that when I run this script from the terminal, it would be good to know which URLs throw HTTP errors too, and to save them in my errorLog.txt. Thanks a lot.
for url in urls:
    url_1 = url
    keyword = 'myKeyword'
    try:  # <-- put try..except here
        res = requests.get(url_1)
        if keyword in res.text:
            myList.append("OK")
        else:
            myList.append("NOT OK")
    except:
        myList.append("Down")
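To also record the failing URLs in errorLog.txt, as asked in the comments, the loop above can be extended to write one line per failure. A sketch under the same assumptions (the scan_urls name and the tab-separated log format are mine, not from the thread):

```python
import requests

def scan_urls(urls, keyword, log_path="errorLog.txt", timeout=10):
    # Returns one flag per URL and logs each failure as "url<TAB>error".
    flags = []
    with open(log_path, "w") as log:
        for url in urls:
            try:
                res = requests.get(url, timeout=timeout)
                flags.append("OK" if keyword in res.text else "NOT OK")
            except requests.exceptions.RequestException as e:
                flags.append("Down")
                log.write(f"{url}\t{e}\n")  # keep the URL and the reason it failed
    return flags
```

The returned list stays the same length as urls, so it can still be assigned to a DataFrame column as in the earlier snippets.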