How can I get the lines of dialogue inside quotation marks using Python or regex?


I have tried a few answers from this site, but without success. Here is an example of the type of text I am working with:

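"But if you have got them to-day," said Elizabeth, "my mother's purpose will be answered."

She did at last extort from her father an acknowledgment that the horses were engaged. Jane was therefore obliged to go on horseback, and her mother attended her to the door with many cheerful prognostics of a bad day. Her hopes were answered; Jane had not been gone long before it rained hard. Her sisters were uneasy for her, but her mother was delighted. The rain continued the whole evening without intermission; Jane certainly could not come back.

"This was a lucky idea of mine, indeed!" said Mrs. Bennet more than once, as if the credit of making it rain were all her own. Till the next morning, however, she was not aware of all the felicity of her contrivance. Breakfast was scarcely over when a servant from Netherfield brought the following note for Elizabeth:

"MY DEAREST LIZZY,--

"I find myself very unwell this morning, which, I suppose, is to be imputed to my getting wet through yesterday. My kind friends will not hear of my returning till I am better. They insist also on my seeing Mr. Jones--therefore do not be alarmed if you should hear of his having been to me--and, excepting a sore throat and headache, there is not much the matter with me.--Yours, etc."

"Well, my dear," said Mr. Bennet, when Elizabeth had read the note aloud, "if your daughter should have a dangerous fit of illness--if she should die, it would be a comfort to know that it was all in pursuit of Mr. Bingley, and under your orders."

"Oh! I am not afraid of her dying. People do not die of little trifling colds. She will be taken good care of. As long as she stays there, it is all very well. I would go and see her if I could have the carriage."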

From this example I would like to extract:

"But if you have got them to-day, my mother's purpose will be answered"
"This was a lucky idea of mine, indeed!" 
"MY DEAREST LIZZY,-- I find myself very unwell this morning, which, I suppose, is to be imputed to my getting wet through yesterday. My kind friends will not hear of my returning till I am better. They insist also on my seeing Mr. Jones--therefore do not be alarmed if you should hear of his having been to me--and, excepting a sore throat and headache, there is not much the matter with me.--Yours, etc." 
"Well, my dear,"
…and so on. The rules I would like the regex to follow are:

1. get all strings within a " " (there can be multiple on the same line)
2. if the line ends with a \n before finding a second ", continue grabbing the next line so long as it also begins with a "
The following pattern may help you achieve that. It splits each match into three groups:

(\")(.*)(\")

If you want it to match \n as well, just add it to the second group with the alternation operator |, which gives:

(\")(.*|\n)(\")
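One thing worth checking with a group such as (.*|\n): it matches either a run of non-newline characters or a single newline, not a mix of the two, and .* is greedy. Below is a minimal sketch of an alternative, assuming Python's `re` module and straight ASCII quotes. It uses a lazy group with `re.DOTALL` so that `.` can also cross newlines. The sample string and the variable names are made up for illustration.

```python
import re

sample = '"Well, my dear," said Mr. Bennet, "if she\nshould die, it would be a comfort."'

# Greedy three-group pattern from above: the first quote is matched up to
# the LAST quote on the same line, swallowing the narration in between.
greedy = re.findall(r'(\")(.*)(\")', sample)

# Lazy version with DOTALL: each quotation stops at the nearest closing
# quote, and "." is allowed to match newlines.
lazy = re.findall(r'"(.*?)"', sample, flags=re.DOTALL)

print(greedy)
print(lazy)
```

The lazy version pairs quotes strictly in order, so it still mispairs when a paragraph opens with a quote that is never closed, as the salutation of the letter does.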


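This may not be exactly what you want, but you can try re.findall with the following pattern, which ends each match at either a closing quote or a -- followed by a newline:

\"([^\"]+?)(\"|\-\-\n)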


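For the sample data, you could use an alternation of two patterns:

  • "[^\n"]*"
    Match from an opening to a closing double quote without crossing a newline
  • |
  • "[^\n"]*\n+"[^"]*"
    Match from an opening quote across one or more newlines to a closing quote, but only when the text after the newline(s) starts with a double quote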
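As a quick check of the two alternatives described above, here is a minimal sketch using Python's `re` module. The shortened sample text and the variable names are assumptions for illustration only.

```python
import re

# The two alternatives described above, joined with "|".
pattern = re.compile(r'"[^\n"]*"|"[^\n"]*\n+"[^"]*"')

sample = (
    '"This was a lucky idea of mine, indeed!" said Mrs. Bennet.\n'
    '\n'
    '"MY DEAREST LIZZY,--\n'
    '\n'
    '"I find myself very unwell this morning.--Yours, etc."\n'
)

matches = pattern.findall(sample)
for m in matches:
    print(repr(m))
```

Each match keeps its quote marks, including the one that opens the continued paragraph, so they still need to be stripped afterwards. A quotation that wraps onto a line that does not start with a quote is not matched by either alternative.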

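This doesn't satisfy the second condition, mate, for this line:
"MY DEAREST LIZZY,-- I find myself…..

@Emma thanks, but this also includes text like: " said Mr. Bennet, when Elizabeth had read the note aloud, "

@neptune do you want
MY DEAREST LIZZY,-- I find my…
in the output as well? If not, please update the question with that information.

@Emma ideally, yes.

So you want extra spaces or newlines in the output replaced with a single space?

@CodeManiac ideally that set of text would be on the same line, as shown.

What I suggest is
"(?:(^\s*")|([^"])+?"
This will give you all the matches; then collapse multiple spaces in each match to a single space to get the output on one line.

This certainly matches the example! Thank you very much. However, when I tried it on the full text, it ran into a problem with passages like this: Mr. Darcy's letter to Lady Catherine was in a different style; and still different from either was what Mr. Bennet sent to Mr. Collins, in reply to his last. "DEAR SIR, "I must trouble you once more for congratulations. Elizabeth will soon be the wife of Mr. Darcy. Console Lady Catherine as well as you can. But, if I were you, I would stand by the nephew. He has more to give. "Yours sincerely, etc."

Show me the text that gives you the error. I built the regex from your sample. Please provide only the section that gives you the error.

I think it is quite complicated because the text is inconsistent. But if the problem occurs only once, you can make an exception. Change the regex to: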
\”([^\“]+?)(\“\\-\-\-\n\\-\-\-您的任何时候)
 (\")(.*|\n)(\")
import re

text = '''
"But if you have got them to-day," said Elizabeth, "my mother's purpose will be answered."

She did at last extort from her father an acknowledgment that the horses were engaged. Jane was therefore obliged to go on horseback, and her mother attended her to the door with many cheerful prognostics of a bad day. Her hopes were answered; Jane had not been gone long before it rained hard. Her sisters were uneasy for her, but her mother was delighted. The rain continued the whole evening without intermission; Jane certainly could not come back.

"This was a lucky idea of mine, indeed!" said Mrs. Bennet more than once, as if the credit of making it rain were all her own. Till the next morning, however, she was not aware of all the felicity of her contrivance. Breakfast was scarcely over when a servant from Netherfield brought the following note for Elizabeth:

"MY DEAREST LIZZY,--

"I find myself very unwell this morning, which, I suppose, is to be imputed to my getting wet through yesterday. My kind friends will not hear of my returning till I am better. They insist also on my seeing Mr. Jones--therefore do not be alarmed if you should hear of his having been to me--and, excepting a sore throat and headache, there is not much the matter with me.--Yours, etc."

"Well, my dear," said Mr. Bennet, when Elizabeth had read the note aloud, "if your daughter should have a dangerous fit of illness--if she should die, it would be a comfort to know that it was all in pursuit of Mr. Bingley, and under your orders."

"Oh! I am not afraid of her dying. People do not die of little trifling colds. She will be taken good care of. As long as she stays there, it is all very well. I would go and see her if I could have the carriage."
'''

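# Each match ends at a closing quote or at "--" followed by a newline,
# so the unclosed salutation line of the letter is captured on its own.
talk = re.findall(r'\"([^\"]+?)(\"|\-\-\n)', text)
for t in talk:
    print(t[0])  # group 1 is the quoted text without the delimiters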
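The snippet above returns the salutation and the body of the letter as separate matches. Below is a minimal sketch of one way to follow both rules from the question. It assumes straight ASCII quotes, and it assumes that a quote mark directly after a newline inside an open quotation marks a continued paragraph rather than a closing quote. The names QUOTE and extract_dialogue and the shortened sample are hypothetical, chosen for illustration.

```python
import re

# A quotation runs from an opening quote to the next closing quote.
# Inside it, a quote mark is only accepted directly after a newline,
# which is how a continued paragraph is marked in this kind of text.
QUOTE = re.compile(r'"((?:[^"\n]|\n(?!")|\n")*)"')

def extract_dialogue(text):
    """Return each quotation on one line, without the quote marks."""
    results = []
    for body in QUOTE.findall(text):
        body = body.replace('"', '')            # drop continuation quotes
        results.append(' '.join(body.split()))  # collapse whitespace runs
    return results

sample = '''"But if you have got them to-day," said Elizabeth, "my mother's
purpose will be answered."

"MY DEAREST LIZZY,--

"I find myself very unwell this morning.--Yours, etc."

"Well, my dear," said Mr. Bennet.
'''

for line in extract_dialogue(sample):
    print(line)
```

The three alternatives inside the group are mutually exclusive, so the pattern does not backtrack between them. It does not merge two quotations that are split by narration, such as the first sentence of the sample; those come back as separate items.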
"[^\n"]*"|"[^\n"]*\n+"[^"]*"