Extracting text from HTML in a .txt file with Python BeautifulSoup


python html


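I have just started programming for my job and I am stuck on something. I have searched online, but none of the answers I found worked. I am using BeautifulSoup, but I am open to using something else. Thank you very much.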

I am trying to extract the names.

So far I have:

from bs4 import BeautifulSoup
with open("C:\\Users\\marle\\Desktop\\texte.txt", 'rb') as fp:
    soup = BeautifulSoup(fp)
soup = BeautifulSoup("data")
name_list = soup.find_all"(class=\"single_liste_exposant_name\">

You can find the div elements and then get their text:
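from bs4 import BeautifulSoup

# my_html is the HTML source as a string
soup = BeautifulSoup(my_html, "html.parser")
# Collect the stripped text of every div with the target class
names = [s.text.strip() for s in soup.find_all("div", attrs={"class": "single_liste_exposant_name"})]
print(names)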

You just need to put your HTML string into my_html.
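To see the approach above run end to end, here is a minimal sketch. The markup in `my_html` is invented for illustration, since the asker's real HTML is not preserved here; only the class name comes from the question, and the exhibitor names are placeholders. `"html.parser"` is Python's built-in parser, passed explicitly so that BeautifulSoup does not have to pick one.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup: only the class name comes from the question.
my_html = """
<div class="single_liste_exposant">
  <div class="single_liste_exposant_name"> Exhibitor A </div>
  <div class="single_liste_exposant_name">Exhibitor B</div>
  <div class="other">not a name</div>
</div>
"""

soup = BeautifulSoup(my_html, "html.parser")

# class_ (with a trailing underscore) filters on the CSS class.
names = [
    div.get_text(strip=True)
    for div in soup.find_all("div", class_="single_liste_exposant_name")
]

# The same query written as a CSS selector.
names_css = [
    div.get_text(strip=True)
    for div in soup.select("div.single_liste_exposant_name")
]

print(names)
```

If this prints an empty list on real data, it is worth checking that the variable passed to `BeautifulSoup` actually holds the document and that the class name matches the markup exactly.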
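Please clarify exactly what the problem is.

Thanks for the quick answer. I just tried it and, as with many of the things I have tried, I get no results. Only [].

When you read the contents of the file, you have to open it as text ("r" instead of "rb") and call BeautifulSoup(fp.read()).

Thank you so much! I had been stuck on this for 6 hours, haha. For some reason I had to keep "rb"; it does not work with "r", and I do not know why. The final version is:
from bs4 import BeautifulSoup
with open("C:\\Users\\marle\\Desktop\\texte.txt", "rb") as fp:
    soup = BeautifulSoup(fp.read())
names = [s.text.strip() for s in soup.findAll("div", attrs={"class": "single_liste_exposant_name"})]
print(names)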
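The comments above turn on how the file is opened, so a small sketch may help. BeautifulSoup accepts either `bytes` or `str`; when given bytes it tries to work out the encoding itself. One plausible but unconfirmed reason that "r" failed for the asker is that the file's encoding differs from the default that Windows uses in text mode, in which case passing `encoding=` explicitly would be the thing to try. The temporary file, the sample markup, and the `extract_names` helper below are all hypothetical.

```python
import os
import tempfile

from bs4 import BeautifulSoup


def extract_names(markup):
    """Return the stripped text of each div carrying the class from the question."""
    soup = BeautifulSoup(markup, "html.parser")
    return [
        div.get_text(strip=True)
        for div in soup.find_all("div", class_="single_liste_exposant_name")
    ]


# Write a small sample file so the example is self-contained (hypothetical content).
sample = '<div class="single_liste_exposant_name">Exhibitor A</div>'
path = os.path.join(tempfile.mkdtemp(), "texte.txt")
with open(path, "w", encoding="utf-8") as fp:
    fp.write(sample)

# Option 1: binary mode; BeautifulSoup receives bytes and decodes them itself.
with open(path, "rb") as fp:
    names_from_bytes = extract_names(fp.read())

# Option 2: text mode with an explicit encoding (assumes the file is UTF-8).
with open(path, "r", encoding="utf-8") as fp:
    names_from_text = extract_names(fp.read())

print(names_from_bytes, names_from_text)
```

Both options give the same list for this sample. Which one suits a real file depends on whether its encoding is known in advance.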