Python: how to loop over a list of URLs and print the <p> tag with BeautifulSoup


python python-3.x


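I just discovered Beautiful Soup 4. I have a lot of links, and I want to print the <p> tag of several websites at once, but I don't know how to do it because I'm a beginner. I couldn't find anything on Stack Overflow that fits my case either.
Something like this does not work: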

from bs4 import BeautifulSoup
import requests
import warnings

warnings.filterwarnings("ignore", category=UserWarning, module='bs4')
url = ["http://fc.lc/api?api=9053290fd05b5e5eb091b550078fa1e30935c92c&url=https://wow-ht.ml?s=https://cutlinks.pro/api?api=e6a8809e51daedcf30d9d6270fd0bfeba73c1dcb&url=https://google.com=text&format=text", "http://fc.lc/api?api=9053290fd05b5e5eb091b550078fa1e30935c92c&url=https://wow-ht.ml?s=https://cutlinks.pro/api?api=e6a8809e51daedcf30d9d6270fd0bfeba73c1dcb&url=https://example.com&format=text&format=text"]

# add header
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.71 Safari/537.36'}
r = requests.get(url, headers=headers)
soup = BeautifulSoup(r.content, "lxml")
print( soup.find('p').text )

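This is the error I get (I did not expect it to work; giving me a wrong "possible duplicate" answer does not help me, so please read the question in the title first):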

Traceback (most recent call last):
  File "C:\Users\Gebruiker\Desktop\apitosortened.py", line 10, in <module>
    r = requests.get(url, headers=headers)
  File "C:\Users\Gebruiker\AppData\Local\Programs\Python37-32\lib\site-packages\requests\api.py", line 75, in get
    return request('get', url, params=params, **kwargs)
  File "C:\Users\Gebruiker\AppData\Local\Programs\Python37-32\lib\site-packages\requests\api.py", line 60, in request
    return session.request(method=method, url=url, **kwargs)
  File "C:\Users\Gebruiker\AppData\Local\Programs\Python37-32\lib\site-packages\requests\sessions.py", line 533, in request
    resp = self.send(prep, **send_kwargs)
  File "C:\Users\Gebruiker\AppData\Local\Programs\Python37-32\lib\site-packages\requests\sessions.py", line 640, in send
    adapter = self.get_adapter(url=request.url)
  File "C:\Users\Gebruiker\AppData\Local\Programs\Python37-32\lib\site-packages\requests\sessions.py", line 731, in get_adapter
    raise InvalidSchema("No connection adapters were found for '%s'" % url)
requests.exceptions.InvalidSchema: No connection adapters were found for '['http://fc.lc/api?api=9053290fd05b5e5eb091b550078fa1e30935c92c&url=https://wow-ht.ml?s=https://cutlinks.pro/api?api=e6a8809e51daedcf30d9d6270fd0bfeba73c1dcb&url=https://google.com=text&format=text', 'http://fc.lc/api?api=9053290fd05b5e5eb091b550078fa1e30935c92c&url=https://wow-ht.ml?s=https://cutlinks.pro/api?api=e6a8809e51daedcf30d9d6270fd0bfeba73c1dcb&url=https://example.com&format=text&format=text']'

I really did not expect this to be so hard. Any help would be appreciated!

If you have a list, use a for loop and pass one URL at a time to requests.get():
for item in url:
    r = requests.get(item, headers=headers)
    soup = BeautifulSoup(r.content, "lxml")
    print(soup.find('p').text)
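
The error message in the question shows the whole list, brackets included, where a single URL was expected. Below is a minimal sketch of why that fails, assuming requests picks a connection adapter by matching the start of the URL string against prefixes such as http:// and https://. The helper looks_like_http_url and the example.com URLs are made up for illustration and are not part of requests or of the question.

```python
# Hypothetical placeholder URLs, not the ones from the question.
urls = ["http://example.com/a", "http://example.com/b"]


def looks_like_http_url(value):
    """Return True if the string form of value starts with an HTTP(S) scheme."""
    return str(value).lower().startswith(("http://", "https://"))


# The whole list: its string form starts with "[" rather than "http".
print(looks_like_http_url(urls))

# Each element on its own is an ordinary URL string.
for item in urls:
    print(item, looks_like_http_url(item))
```

This only imitates the prefix check; it makes no network request. It illustrates why iterating over the list and passing each item separately avoids the InvalidSchema error.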


BTW: your URLs do not return any HTML, only some text containing a link, so the code cannot find a <p> tag.

Print the returned text to see this:
for item in url:
    r = requests.get(item, headers=headers)
    print(r.text)    

Result:
https://fc.lc/C4FNiXbY
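
To illustrate the point about plain-text responses, here is a small offline sketch. It assumes Python's built-in "html.parser" backend rather than lxml (parsers differ in how they treat bare text, so results with lxml may not match), and the helper name first_paragraph_text is hypothetical. find('p') returns None when there is no <p> tag, so calling .text on the result directly would raise an AttributeError.

```python
import warnings

from bs4 import BeautifulSoup

# bs4 warns when the markup looks like a URL; silence it as the question does.
warnings.filterwarnings("ignore", category=UserWarning, module="bs4")


def first_paragraph_text(markup):
    """Return the text of the first <p> in markup, or None if there is none."""
    soup = BeautifulSoup(markup, "html.parser")
    tag = soup.find("p")
    return tag.get_text(strip=True) if tag is not None else None


html_response = "<html><body><p>Hello</p></body></html>"
text_response = "https://fc.lc/C4FNiXbY"  # plain text, like the result above

print(first_paragraph_text(html_response))
print(first_paragraph_text(text_response))
```

For a response that is just a link, printing r.text as shown above is probably the simpler route.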

Use a for loop, then check whether a <p> tag exists. If it does, print its text:
from bs4 import BeautifulSoup
import requests
import warnings

warnings.filterwarnings("ignore", category=UserWarning, module='bs4')
urls = ["http://fc.lc/api?api=9053290fd05b5e5eb091b550078fa1e30935c92c&url=https://wow-ht.ml?s=https://cutlinks.pro/api?api=e6a8809e51daedcf30d9d6270fd0bfeba73c1dcb&url=https://google.com=text&format=text", "http://fc.lc/api?api=9053290fd05b5e5eb091b550078fa1e30935c92c&url=https://wow-ht.ml?s=https://cutlinks.pro/api?api=e6a8809e51daedcf30d9d6270fd0bfeba73c1dcb&url=https://example.com&format=text&format=text"]

# add header (defined once, outside the loop)
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.71 Safari/537.36'}

for url in urls:
    r = requests.get(url, headers=headers)
    soup = BeautifulSoup(r.content, "lxml")
    # find('p') returns None when the page has no <p> tag
    p = soup.find('p')
    if p is not None:
        print(p.text)

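@Trenton_M Did you read what I wrote? I have already fixed that. My problem is that I cannot print the p of multiple URLs; this has nothing to do with the tag, which is not mentioned either!
Use a for loop to work through a list of links (or any other list).
Possible duplicate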