Python: replacing substrings from one tuple with the matching entries from another tuple
I am looking for a fast solution. The code should loop over a very long list of sentences (one per line) and replace each substring from one tuple (or list) with its counterpart from another tuple. The (pseudo)code should look like this:
# an example of one line sentence:
a = "I was thinking to begin this journey."
# tuples: targets and replacements
verbs = ("to begin", "I begin", "you begin", "we begin")
verbs_fixed = ("toXXbegin", "IXXbegin", "youXXbegin", "weXXbegin")
with open(<INPUT FILE NAME>) as inf:
    for line in inf:
        line = ????
Given that the list of sentences is very long, I am hoping for the fastest possible solution.
I was thinking of re.compile followed by a list comprehension. Is there a better way?

If you zip the two tuples, a simple replace is all you need:
for original_value, target_value in zip(verbs, verbs_fixed):
    line = line.replace(original_value, target_value)
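As a minimal sketch of how the zip-and-replace loop could be applied to every line of the input, the block below wraps it in two small helpers. The names `replace_all` and `replace_in_lines` and the sample sentences are assumptions made for illustration; an open file object could be passed in place of the sample list.

```python
verbs = ("to begin", "I begin", "you begin", "we begin")
verbs_fixed = ("toXXbegin", "IXXbegin", "youXXbegin", "weXXbegin")


def replace_all(line, targets=verbs, replacements=verbs_fixed):
    """Apply each (target, replacement) pair in turn with str.replace."""
    for original_value, target_value in zip(targets, replacements):
        line = line.replace(original_value, target_value)
    return line


def replace_in_lines(lines):
    """Yield every line with all targets replaced.

    `lines` can be any iterable of strings, including an open file.
    """
    for line in lines:
        yield replace_all(line)


# Hypothetical sample input standing in for the real file.
sample = [
    "I was thinking to begin this journey.\n",
    "Then we begin, and you begin too.\n",
]
result = list(replace_in_lines(sample))
print("".join(result), end="")
```

Each call to `str.replace` scans the whole line once, so the cost per line grows with the number of pairs. With only four pairs that is likely negligible, but it is worth measuring on the real data.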
Using regex
import re

def regex_mapping(sentence):
    "Function to do the replacements based upon mapping of verbs to verbs fixed"
    return regex_pattern.sub(lambda m: mapping[m.group(0)], sentence)
# Setup code
verbs = ("to begin", "I begin", "you begin", "we begin")
verbs_fixed = ("toXXbegin", "IXXbegin", "youXXbegin", "weXXbegin")
# Dictionary mapping
mapping = {x:y for x, y in zip(verbs, verbs_fixed)}
# Regex pattern (pre-compile for speed)
regex_pattern = re.compile('|'.join(map(re.escape, verbs)))
Usage
a = "I was thinking to begin this journey."
print(regex_mapping(a))
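One thing to watch with an alternation pattern is that the regex engine tries alternatives from left to right, so if one target is a prefix of another, the shorter one can win. The sketch below is a variation on the approach above, not the answer's original code: it sorts the targets longest-first and escapes them. The helper name `build_replacer` and the extra pair "to begin again" are hypothetical, added only to show the effect.

```python
import re


def build_replacer(targets, replacements):
    """Return a function that replaces every target in one regex pass."""
    mapping = dict(zip(targets, replacements))
    # Longest targets first, so "to begin again" is tried before "to begin".
    ordered = sorted(mapping, key=len, reverse=True)
    # re.escape guards against targets containing regex metacharacters.
    pattern = re.compile("|".join(re.escape(t) for t in ordered))

    def replace(sentence):
        return pattern.sub(lambda m: mapping[m.group(0)], sentence)

    return replace


# The last pair in each tuple is a hypothetical addition for illustration.
verbs = ("to begin", "I begin", "you begin", "we begin", "to begin again")
verbs_fixed = ("toXXbegin", "IXXbegin", "youXXbegin", "weXXbegin",
               "toXXbeginXXagain")

fix = build_replacer(verbs, verbs_fixed)
out = fix("I was thinking to begin again, so we begin.")
print(out)  # I was thinking toXXbeginXXagain, so weXXbegin.
```

Building the pattern once and reusing the returned function keeps the per-line work to a single `sub` call.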
Addendum
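If your keyword list runs to hundreds of entries, you should look into an approach based on building a Trie dictionary.

Can you add the output? Is it always wordXXword? Could you simply replace the space with XX?
Just zip them and replace; what is wrong with that?
@SimonFink Well, it is obviously a tuple, but she really meant "list".
@BUFU No, this is definitely a "tuple".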
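The Trie suggestion in the addendum could be sketched roughly as follows. This is a plain-Python illustration under stated assumptions, not a tuned implementation: `build_trie` and `trie_replace` are hypothetical names, the trie is a nested dict, and matching is greedy (the longest target starting at each position wins). Whether it actually beats the compiled regex would need to be measured on the real data.

```python
_END = object()  # sentinel key marking the end of a complete target


def build_trie(mapping):
    """Build a nested-dict trie from a {target: replacement} mapping."""
    root = {}
    for target, replacement in mapping.items():
        node = root
        for ch in target:
            node = node.setdefault(ch, {})
        node[_END] = replacement
    return root


def trie_replace(sentence, root):
    """Replace the longest target found at each position of the sentence."""
    out = []
    i, n = 0, len(sentence)
    while i < n:
        node, j = root, i
        best_end, best_repl = -1, None
        # Walk the trie as far as the sentence allows from position i.
        while j < n and sentence[j] in node:
            node = node[sentence[j]]
            j += 1
            if _END in node:  # a complete target ends here
                best_end, best_repl = j, node[_END]
        if best_repl is not None:
            out.append(best_repl)
            i = best_end
        else:
            out.append(sentence[i])
            i += 1
    return "".join(out)


verbs = ("to begin", "I begin", "you begin", "we begin")
verbs_fixed = ("toXXbegin", "IXXbegin", "youXXbegin", "weXXbegin")
trie = build_trie(dict(zip(verbs, verbs_fixed)))
fixed = trie_replace("I was thinking to begin this journey.", trie)
print(fixed)  # I was thinking toXXbegin this journey.
```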