Efficiently replacing strings among 20k potential matches (Python)
I want to replace substrings of a string and need to check 20k+ candidates. Is there a more efficient way than splitting the 20k candidates into subgroups of 900 and looping over them? This is my function:
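import re
import numpy as np

def replaceNames(mailString, nameList, replacement=" Nachname"):
    anzNames = len(nameList)
    # Chunk boundaries: 0, 900, 1800, ..., anzNames
    seq = np.arange(start=0, stop=anzNames, step=900).tolist()
    seq.append(anzNames)
    for i in range(0, len(seq) - 1):
        # Join up to 900 names into one alternation pattern.
        # Names are treated as regex patterns here, not as literal text.
        tempNamesString = "|".join(nameList[seq[i]:seq[i + 1]])
        mailString = re.sub(tempNamesString, replacement, mailString)
    return mailString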
Thanks.

My suggestion is to use plain str operations instead of re (regex), because they are faster for a fixed substring:
# Sample list of 1 million copies of "my_rand_str"
In [9]: x = ["my_rand_str"] * 1000000
In [10]: %%time
...: replaced = [a.replace("str", "replaced") for a in x]
...:
...:
Wall time: 219 ms
In [11]: %%time
...: replaced = [re.sub("str", "replaced", a) for a in x]
...:
...:
Wall time: 1.33 s
In [25]: tobe_replaced = re.compile("str")
In [28]: %%time
...: replaced = [tobe_replaced.sub("replaced", a) for a in x]
...:
...:
...:
Wall time: 1.02 s
In [29]: %%time
...: replaced = tobe_replaced.sub("replaced", "\n".join(x)).split("\n")
...:
...:
Wall time: 291 ms
In [30]: %%time
...: replaced = "\n".join(x).replace("str", "replaced").split("\n")
...:
...:
...:
Wall time: 132 ms
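Applied to the function from the question, the same idea could look roughly like the sketch below. It is only an illustration: replace_names_plain is a made-up name, and it assumes the entries in nameList are literal names rather than regex patterns, so the result can differ from the re.sub version whenever a name contains regex syntax.

```python
def replace_names_plain(mail_string, name_list, replacement=" Nachname"):
    """Replace every literal occurrence of each name using str.replace."""
    for name in name_list:
        # Skip empty names: "".replace(...) would insert the replacement
        # between every character. The `in` check avoids building a new
        # string for names that do not occur at all.
        if name and name in mail_string:
            mail_string = mail_string.replace(name, replacement)
    return mail_string


print(replace_names_plain("Hallo Herr Meier und Frau Schmidt",
                          ["Meier", "Schmidt"], replacement="<NAME>"))
```

This still scans the text once per name, so for 20k names it is worth timing against the chunked regex on real data before switching.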
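Hope this helps.

Can you provide example input and output for this function? I'm having a hard time understanding what you are trying to do.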
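If it turns out that every candidate is a single whole word, another option is to walk the text once and look each word up in a set, instead of searching for each of the 20k names separately. The sketch below assumes exactly that. It is not equivalent to the original function, which also replaces names that appear inside longer words, and replace_names_by_lookup and WORD are names invented for the example.

```python
import re

# One generic pattern that matches runs of word characters.
WORD = re.compile(r"\w+")


def replace_names_by_lookup(mail_string, name_list, replacement=" Nachname"):
    """Replace whole words that appear in name_list, in a single pass."""
    names = set(name_list)  # set membership tests are O(1) on average

    def swap(match):
        word = match.group(0)
        # Keep the word unless it is one of the known names.
        return replacement if word in names else word

    return WORD.sub(swap, mail_string)


print(replace_names_by_lookup("Hallo Herr Meier, hallo Meierhof",
                              ["Meier"], replacement="<NAME>"))
```

With this approach the cost depends mainly on the length of the text rather than on the number of names, but it only works under the whole-word assumption stated above.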