
Python does not detect punctuation between words without a space


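How can I split a sentence when a punctuation mark (. ? !) is detected, including when it appears between two words with no space in between?

For example:

Expected:

['This is an example.',
 'Not working as expected.',
 "Because there isn't a space after dot."]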
+ means one or more of something; * means zero or more.
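To make the difference concrete, here is a small sketch comparing \s+ and \s* after a punctuation mark. The variable names are mine, and the sample sentence is the one used elsewhere in this thread; treat it as an illustration rather than the asker's original code.

```python
import re

text = ("This is an example. Not working as expected."
        "Because there isn't a space after dot.")

# \s+ requires at least one whitespace character after the mark,
# so only the first full stop (the one followed by a space) matches.
with_plus = re.findall(r"[.?!]\s+", text)

# \s* also accepts zero whitespace characters, so every mark matches.
with_star = re.findall(r"[.?!]\s*", text)

print(with_plus)  # ['. ']
print(with_star)  # ['. ', '.', '.']
```

A pattern that insists on \s+ therefore never sees the boundary in "expected.Because".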

If you need to keep the punctuation mark, you probably do not want to split at all, but you can do this instead:

import re
splitText = re.findall(r".*?[.?!]", "This is an example. Not working as expected.Because there isn't a space after dot.")

['This is an example.',
 ' Not working as expected.',
 "Because there isn't a space after dot."]
You can trim them by adjusting the regex, e.g. "\s*(.*?[.?!])", or by simply calling .strip() on each result (Python strings have no .trim).
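Both trimming options can be sketched roughly as follows. The capture-group pattern is my reading of the regex hinted at above, and in Python the relevant string method is strip() rather than trim; the variable names are illustrative.

```python
import re

text = ("This is an example. Not working as expected."
        "Because there isn't a space after dot.")

# Option 1: consume leading whitespace outside the capture group.
# With exactly one group, findall returns only the group's text.
trimmed_by_regex = re.findall(r"\s*(.*?[.?!])", text)

# Option 2: keep the simple pattern and strip each match afterwards.
trimmed_by_strip = [s.strip() for s in re.findall(r".*?[.?!]", text)]

print(trimmed_by_regex)
print(trimmed_by_strip)
```

Either way, the leading space in front of "Not working as expected." is gone.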

Note: .*? is a lazy (non-greedy) quantifier, as opposed to .*, which is greedy.
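A quick way to see why the lazy form matters here is to run both on the same input. This is only a sketch with names of my own choosing:

```python
import re

text = "This is an example! Working as expected?Because."

# Lazy: stop at the first punctuation mark that completes a match.
lazy = re.findall(r".*?[.?!]", text)

# Greedy: grab as much as possible, then back off to the last mark,
# so the whole string comes back as a single match.
greedy = re.findall(r".*[.?!]", text)

print(lazy)    # ['This is an example!', ' Working as expected?', 'Because.']
print(greedy)  # ['This is an example! Working as expected?Because.']
```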

Another solution:

import re
from pprint import pprint

split_text = re.split("([?.!])", "This is an example! Working as "
    "expected?Because.")

pprint(split_text)
Output:

['This is an example',
 '!',
 ' Working as expected',
 '?',
 'Because',
 '.',
 '']

With re.findall(".*?[?.!]", ...) on the same string, each mark stays attached to its sentence:

['This is an example!',
 ' Working as expected?',
 'Because.']
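If you would rather stay with re.split, the captured marks can be glued back onto the text that precedes them. This is a minimal sketch of that idea, not part of the original answer; the names parts and sentences are mine.

```python
import re

text = "This is an example! Working as expected?Because."

# With a capture group, re.split alternates text and delimiter:
# [text, mark, text, mark, ..., trailing text]
parts = re.split(r"([?.!])", text)

# Pair each text chunk with the mark that follows it and trim whitespace.
# Note: zip stops at the shorter list, so any trailing text that has no
# closing mark is dropped by this sketch.
sentences = [(body + mark).strip()
             for body, mark in zip(parts[0::2], parts[1::2])]

print(sentences)  # ['This is an example!', 'Working as expected?', 'Because.']
```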

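@Zppingto Yes, I just noticed that and edited the answer.
If you look at the documentation, there is a note mentioning that split does not work on empty matches. Is that the reason?
@SebastianProske Oh, that's bad. Is there a solution using some other approach?
@Zppingto Use findall instead of split, as in my answer.
This solution will not work if the sentence contains common abbreviations such as Mr., etc.
It works, but I lose the punctuation, which is not a solution for me.
@SebastianProske already pointed this out... this cannot be done with split.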
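On the point about empty matches: since Python 3.7, re.split does split on patterns that can match an empty string, so a lookbehind-based split is possible on newer interpreters. A minimal sketch, assuming Python 3.7 or later (the variable names are illustrative):

```python
import re

text = ("This is an example. Not working as expected."
        "Because there isn't a space after dot.")

# Split right after a punctuation mark, consuming any whitespace that
# follows. The pattern can match an empty string, which re.split only
# honours on Python 3.7+ (assumption: a 3.7+ interpreter).
pieces = re.split(r"(?<=[.?!])\s*", text)

# Drop empty strings, such as the one produced after the final mark.
sentences = [p for p in pieces if p]

print(sentences)
```

Like the findall approach, this still splits inside abbreviations such as Mr.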