Python: splitting text based on whitespace


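My goal is to parse the text above so that the dialogue is separated from the description. My file contains multiple instances like this. The output should be two separate strings, x and y, where x = "No time. Not today… right here!" and y = "The cab screeches… his business card".

How can I achieve this with regular-expression matching? Or is there a better way to solve this problem? I am using Python.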

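BeautifulSoup is used for parsing content from web pages. Extracting content by the tags you need is easier that way. Parsing HTML with regular expressions is not a good idea.

Demo:

Output:

You seem to have misunderstood the string "Little help?". The x and y to be extracted are strings in the same block, separated by a line break.

You can try this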

from bs4 import BeautifulSoup
s = """<b>                          DEADPOOL (CONT'D) </b>                Little help?

    The cabbie grabs Deadpool's hand and pulls him through to the
    front. Deadpool's head rests upside down on the bench seat
    as he maneuvers his legs through. The cabbie turns the
    helping hand into a HANDSHAKE, then turns down the Juice.

<b>                            CABBIE </b>"""

soup = BeautifulSoup(s, "html.parser")
print(soup.text)

Edited for the second, additional question about the input

Little help?
The cabbie grabs Deadpool's hand and pulls him through to the
front. Deadpool's head rests upside down on the bench seat
as he maneuvers his legs through. The cabbie turns the
helping hand into a HANDSHAKE, then turns down the Juice.
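
As a dependency-free variation on the BeautifulSoup demo, the sketch below uses the standard library's html.parser instead (a swapped-in technique, not what the answer uses) to keep only the text that sits outside the <b> cues. The class name OutsideBoldText is made up for illustration, and the sample string is a shortened copy of the one in the demo.

```python
from html.parser import HTMLParser


class OutsideBoldText(HTMLParser):
    """Collect text that is not inside a <b> element (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.depth = 0     # number of <b> tags currently open
        self.chunks = []   # text found outside <b>...</b>

    def handle_starttag(self, tag, attrs):
        if tag == "b":
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == "b" and self.depth:
            self.depth -= 1

    def handle_data(self, data):
        # Keep only non-blank text that is outside every <b> element.
        if self.depth == 0 and data.strip():
            self.chunks.append(data.strip())


s = """<b>      DEADPOOL (CONT'D) </b>      Little help?

    The cabbie grabs Deadpool's hand and pulls him through to the
    front.

<b>      CABBIE </b>"""

parser = OutsideBoldText()
parser.feed(s)
parser.close()
print(parser.chunks[0])
```

The speaker names are dropped because they live inside <b>, so what remains is the dialogue followed by the description, still separated by a blank line.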

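Try BeautifulSoup?
Hey, thanks for your reply. I have updated my question to make it clearer.
Hey, thank you, but my question is a little different from what you understood. I have updated the question. Thanks again!
I edited my answer for the second question you added. Please try again, thanks :-)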

import re

ss = """<b>                          DEADPOOL (CONT'D) </b>                Little help?

The cabbie grabs Deadpool's hand and pulls him through to the
front. Deadpool's head rests upside down on the bench seat
as he maneuvers his legs through. The cabbie turns the
helping hand into a HANDSHAKE, then turns down the Juice.

<b>                            CABBIE </b>"""

# Grab every run of text that sits between a '>' and the next '<'.
regx = re.compile(r'(?<=>)[^<>]*(?=<)')
lst = [m.strip() for m in regx.findall(ss)]
# lst[1] is the text between the two <b>...</b> cues; a blank line splits x from y.
xy = [m.strip() for m in re.split(r'\n{2}', lst[1])]
for i in xy:
    print(i + "\n")     # x = xy[0], y = xy[1]

Little help?
The cabbie grabs Deadpool's hand and pulls him through to the
front. Deadpool's head rests upside down on the bench seat
as he maneuvers his legs through. The cabbie turns the
helping hand into a HANDSHAKE, then turns down the Juice.

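import re

# If the input is already plain text without tags, split on the blank line directly.
ss = """copy&paste_Your_Input_string_Here"""
xy = [m.strip() for m in re.split(r'\n{2}', ss)]
for i in xy:
    print(i + "\n")     # x = xy[0], y = xy[1]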
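
The question mentions that the file contains multiple such instances. Here is a minimal sketch of looping over all of them, assuming every cue is wrapped in <b>...</b> and that a blank line separates the dialogue from the description. split_blocks is a hypothetical helper, and the second block in the sample is invented purely for illustration.

```python
import re

# Sample input: the first block is shortened from the question's excerpt;
# the second block (CABBIE) is invented for illustration only.
script = """<b>      DEADPOOL (CONT'D) </b>      Little help?

    The cabbie grabs Deadpool's hand and pulls him through to the
    front.

<b>      CABBIE </b>      Sure thing.

    He turns the key.
"""

# A cue is <b>speaker</b>; its body runs up to the next <b> or the end of input.
CUE = re.compile(r"<b>(.*?)</b>(.*?)(?=<b>|\Z)", re.DOTALL)


def split_blocks(text):
    """Yield (speaker, dialogue, description) for every <b>...</b> cue."""
    for speaker, body in CUE.findall(text):
        # The first blank line separates the dialogue (x) from the description (y).
        parts = re.split(r"\n\s*\n", body.strip(), maxsplit=1)
        dialogue = parts[0].strip()
        description = parts[1] if len(parts) > 1 else ""
        # Join the wrapped, indented description lines into one string.
        description = " ".join(line.strip() for line in description.splitlines())
        yield speaker.strip(), dialogue, description


blocks = list(split_blocks(script))
for speaker, x, y in blocks:
    print(speaker, "|", x, "|", y)
```

A cue with no description simply yields an empty string for y, so the loop does not need a special case for dialogue-only blocks.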