Scraping HTML font tags with Python


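I am very new to programming, and to Python in particular. I can't extract the text of the font tags from my HTML. Here is my code. I need to extract and count all the text between the font tags. I don't know what I'm overlooking, because the response I get when I run the program is empty.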

from bs4 import BeautifulSoup

html = """<P STYLE="margin-bottom: 0in">&quot;amy in marketing press one amanda in groups press two to repeat this menu press star&quot;</P>
<P STYLE="margin-bottom: 0in"><BR>
</P>
<P STYLE="margin-bottom: 0in">Labels:<FONT COLOR="#ff0000">Machine-Message,In-House-Alternative,Company-Alternative;</FONT></P>
<P STYLE="margin-bottom: 0in"><FONT COLOR="#00b050">Machine-Message,</FONT><FONT COLOR="#00b050">Greetings-Other;</FONT></P>
<P STYLE="margin-bottom: 0in"><FONT COLOR="#0070c0">Machine-Message,</FONT>
<FONT COLOR="#0070c0">Personal-Information;</FONT></P>
<P STYLE="margin-bottom: 0in"><BR>
</P>"""

soup = BeautifulSoup(html)
print(soup.find('FONT', COLOR="#ff0000"))


You are missing a quotation mark ("). Also, use a lowercase tag name and attribute name in soup.find, or use find_all to get all occurrences.

from bs4 import BeautifulSoup

html = """<P STYLE="margin-bottom: 0in">&quot;amy in marketing press one amanda in groups press two to repeat this menu press star&quot;</P>
<P STYLE="margin-bottom: 0in"><BR>
</P>
<P STYLE="margin-bottom: 0in">Labels:<FONT COLOR="#ff0000">Machine-Message,In-House-Alternative,Company-Alternative;</FONT></P>
<P STYLE="margin-bottom: 0in"><FONT COLOR="#00b050">Machine-Message,</FONT><FONT COLOR="#00b050">Greetings-Other;</FONT></P>
<P STYLE="margin-bottom: 0in"><FONT COLOR="#0070c0">Machine-Message,</FONT>
<FONT COLOR="#0070c0">Personal-Information;</FONT></P>
<P STYLE="margin-bottom: 0in"><BR>
</P>"""
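# Name the parser explicitly; html.parser lowercases tag and attribute names
soup = BeautifulSoup(html, "html.parser")
# find() returns only the first matching tag; .text gives its text content
print(soup.find("font", color="#ff0000").text)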
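To get every occurrence rather than only the first, a minimal sketch with find_all might look like the following. The shortened html sample and the name font_texts are only for illustration, and html.parser is an assumed parser choice.

```python
from bs4 import BeautifulSoup

# Shortened sample taken from the question's HTML (illustration only)
html = """<P>Labels:<FONT COLOR="#ff0000">Machine-Message,In-House-Alternative,Company-Alternative;</FONT></P>
<P><FONT COLOR="#00b050">Machine-Message,</FONT><FONT COLOR="#00b050">Greetings-Other;</FONT></P>"""

soup = BeautifulSoup(html, "html.parser")

# find_all returns every matching tag; get_text(strip=True) drops surrounding whitespace
font_texts = [tag.get_text(strip=True) for tag in soup.find_all("font")]
print(font_texts)
```

Passing color="#ff0000" to find_all as well would restrict the result to the red tags only.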


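I dropped the " while writing this post, but the result is the same: an empty [].

Rakesh, much better now, I'm getting something. Now I just need the text without the tag itself: Machine-Message, In-House-Alternative, Company-Alternative.

@TiriTaka I've tried a lot of variations, and thank you very much for the progress. But this is still not the whole goal of my task. I'm going to extend the HTML. I need to find out how many times "Machine-Message" is repeated in this document; in fact, I need to count how many times each label is repeated in the document. Add this to the old HTML: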

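@37713616300006,

"thank you so much for calling quick help, grill press one for private party information press two for to-go orders press three for reservations"

Labels: Machine-Message, Company-Type;

Machine-Message, Company-Type;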
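For the follow-up about counting how often each label appears, one possible approach is to collect the text of every font tag, split it on the separators, and tally the pieces with collections.Counter. This is only a sketch: the helper name count_labels is hypothetical, the sample is cut down from the question's HTML, and it assumes labels are separated by "," and terminated by ";".

```python
from collections import Counter

from bs4 import BeautifulSoup


def count_labels(html):
    """Count each label found inside <font> tags (hypothetical helper)."""
    soup = BeautifulSoup(html, "html.parser")
    counts = Counter()
    for tag in soup.find_all("font"):
        # Assumption: labels are comma-separated and a group ends with ";"
        for label in tag.get_text().replace(";", ",").split(","):
            label = label.strip()
            if label:  # skip empty pieces left by trailing separators
                counts[label] += 1
    return counts


# Cut-down sample from the question's HTML (illustration only)
sample = """<P>Labels:<FONT COLOR="#ff0000">Machine-Message,In-House-Alternative,Company-Alternative;</FONT></P>
<P><FONT COLOR="#00b050">Machine-Message,</FONT><FONT COLOR="#00b050">Greetings-Other;</FONT></P>
<P><FONT COLOR="#0070c0">Machine-Message,</FONT>
<FONT COLOR="#0070c0">Personal-Information;</FONT></P>"""

label_counts = count_labels(sample)
print(label_counts["Machine-Message"])  # 3
print(label_counts.most_common())
```

Because the count is taken only from text inside font tags, the "Labels:" prefix and the transcript line are left out automatically.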

Please don't put large amounts of code in comments. It belongs in the question itself, and it is painful to read in a comment.