Regex: split a paragraph into sentences with Python 3
Tags: regex, string, python-3.x, split
I am writing a Telegram bot to help me learn German. Rather than translating a whole paragraph at once, I want to go through it step by step, with each sentence immediately followed by its translation, so that I can face the words and learn instead of scrolling up and down all the time. I am a regex newbie, and I am wondering whether there is a regex for this. My text might look like this:
This is a sentence.
This is another. And here one another, same line, starting with space.
this sentence starts with lowercase letter.
Here is a site you may know: google.com.
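I would like to get an array containing the following (each line you see here is one array element):
This is a sentence.
This is another.
And here one another, same line, starting with space.
this sentence starts with lowercase letter.
Here is a site you may know: google.com.
Thanks a lot in advance.
Surely this has been asked before. Does that help? Even natural-language analysis has a hard time finding sentence boundaries, so this is not something a regular expression can do reliably. The reason: a regular expression matches characters, not words, phrases, or sentence structure, and it knows nothing about language or usage.
Using nltk may handle this problem better: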
from nltk.tokenize import sent_tokenize
string = "This is a sentence. This is another. And here one another, same line, starting with space. this sentence starts with lowercase letter. Here is a site you may know: google.com."
sent_tokenize_list = sent_tokenize(string)
print(sent_tokenize_list)
# ['This is a sentence.', 'This is another.', 'And here one another, same line, starting with space.', 'this sentence starts with lowercase letter.', 'Here is a site you may know: google.com.']
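For comparison with the NLTK result above, here is a minimal sketch of what a regex-only split might look like, since the question asked for one. The helper name `naive_split` and the German sample sentence are assumptions made for illustration; they are not part of the original question or answer. The pattern splits at whitespace that follows `.`, `!` or `?`, which happens to work for the example text but breaks on abbreviations.

```python
import re

def naive_split(text):
    # Hypothetical helper: split at any whitespace run preceded by '.', '!' or '?'.
    # The lookbehind keeps the punctuation attached to the sentence before it.
    return re.split(r'(?<=[.!?])\s+', text.strip())

text = ("This is a sentence. This is another. And here one another, same line, "
        "starting with space. this sentence starts with lowercase letter. "
        "Here is a site you may know: google.com.")

# "google.com" survives because its inner period is not followed by whitespace.
sentences = naive_split(text)
for sentence in sentences:
    print(sentence)

# Made-up German sample: abbreviations such as "Dr." and "z. B." are
# wrongly treated as sentence ends by this pattern.
tricky = naive_split("Dr. Meier kommt um 9 Uhr. Er bringt z. B. Kaffee mit.")
print(tricky)
```

On the example text this yields the five desired elements, but the German sample is cut into five fragments instead of two sentences. That is the limitation the comment above describes, and it is why a trained tokenizer is usually the safer choice for real text.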