Python regex: using lookbehinds to parse a Buffy script


I'm having trouble parsing this page:

I'm trying to get the character names and their associated dialogue. The text looks like this:

<p>BUFFY: Wait!
<p>She stands there panting, watching the truck turn a corner.
<p>BUFFY: (whining) Don't you want your garbage?
<p>She sighs, pouts, turns and walks back toward the house.
<p>Cut to the kitchen. Buffy enters through the back door, holding a pile of
mail. She begins looking through it. We see Dawn standing by the island.
<p>DAWN: Hey Buffy. Oh, don't forget, today's trash day.<br>BUFFY: (sourly)
Thanks.
<p>Dawn piles her books into her school bag. Buffy opens a letter.
<p>Close shot of the letter.
<p>
<p>Dawn smiles, and she and Willow exit. Buffy picks up the still-wrapped
sandwich and stares at it.
<p>BUFFY: (to herself) Somebody should.
<p>She sighs, puts the sandwich back in the bag.
<p>Cut to the Bronze. Pan across various people drinking and dancing,
bartender serving. Reveal Xander and Anya sitting at the bar eating chips from
several bags. A notebook sits in front of them bearing the wedding seating
chart.
<p>ANYA: See ... this seating chart makes no sense. We have to do it again.
(Xander nodding) We can't do it again. You do it.<br>XANDER: The seating
chart's fine. Let's get back to the table arrangements. I'm starting to have
dreams of gardenia bouquets. (winces) I am so glad my manly coworkers didn't
just hear me say that. (eating chips)
Ideally, I would match from a <p> or <br> to the next <p> or <br>. I tried to do this with lookaheads and lookbehinds:

reg = "((?<=<p>)|(?<=<br>))(?P<character>.+):(?P<dialogue>.+)((?=<p>)|(?=<br>))"
script = re.findall(reg, html_text)

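Work around the dot. By default . does not match a newline, so .+ cannot reach a <p> or <br> that sits on a following line, and the lookaheads fail for most of the dialogue. Use explicit character classes instead: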

re.findall('((?<=<p>)|(?<=<br>))([A-Z]+):([^<]+)', text)
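As written, the pattern above also captures the lookbehind group, so each result tuple starts with an empty string. The following is a minimal sketch of one way to tidy that up, not a definitive parser: it makes the lookbehind group non-capturing and collapses the line breaks inside each piece of dialogue. The sample text is copied from the excerpt in the question; the helper name parse_dialogue is made up for illustration, and the sketch assumes speaker names are all uppercase A-Z and that dialogue never contains a < character.

```python
import re

# A few lines copied from the excerpt in the question.
html_text = """<p>BUFFY: Wait!
<p>She stands there panting, watching the truck turn a corner.
<p>DAWN: Hey Buffy. Oh, don't forget, today's trash day.<br>BUFFY: (sourly)
Thanks.
<p>Dawn piles her books into her school bag. Buffy opens a letter.
"""

# (?: ... ) keeps the lookbehind alternation out of the captured groups.
# [A-Z]+ only accepts all-caps speaker names; [^<]+ runs up to the next tag,
# and unlike the dot it also crosses line breaks.
pattern = re.compile(
    r"(?:(?<=<p>)|(?<=<br>))(?P<character>[A-Z]+):(?P<dialogue>[^<]+)"
)


def parse_dialogue(text):
    """Return (character, dialogue) pairs with whitespace collapsed.

    Hypothetical helper, named here only for illustration.
    """
    return [
        (m.group("character"), " ".join(m.group("dialogue").split()))
        for m in pattern.finditer(text)
    ]


script = parse_dialogue(html_text)
for character, dialogue in script:
    print(f"{character}: {dialogue}")
```

Stage directions such as "She stands there panting" are skipped because [A-Z]+ must be followed directly by a colon. A speaker name containing spaces or punctuation would also be skipped, so the character class may need widening for the full script.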


Which Python version? For me, the first version does match something.

Thanks for your help!