What is the best way to remove a common substring prefix in Python 3?


python string python-3.x


Suppose we have a string and a list of strings.

String:

```
str1=
```

List of strings:

[<common-part>-<random-text-a>, <common-part>-<random-text-b>]

In terms of readability and code cleanliness, what is the best way to get a list like this:

[<random-text-a>, <random-text-b>]

You can use a list comprehension, which is very Pythonic:

[newstr.replace(str1, '', 1) for newstr in list_of_strings]

`newstr.replace(str1, '', 1)` replaces only the first occurrence of `str1`. Thanks to @ev kounis for the suggestion.
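To illustrate why the count argument matters, here is a small sketch with made-up sample strings (the values of `str1` and `list_of_strings` are assumptions, not taken from the question). Note that `str.replace` is not anchored to the start of the string, so it removes the first occurrence wherever it appears.

```python
# Hypothetical sample data, chosen to show the difference.
str1 = "abc-"
list_of_strings = ["abc-one", "abc-abc-two", "x-abc-three"]

# Without a count, every occurrence of str1 is removed.
all_removed = [s.replace(str1, "") for s in list_of_strings]

# With count=1, only the first occurrence is removed,
# even when that occurrence is not at the start of the string.
first_removed = [s.replace(str1, "", 1) for s in list_of_strings]

print(all_removed)    # ['one', 'two', 'x-three']
print(first_removed)  # ['one', 'abc-two', 'x-three']
```

The last element shows the limitation: `"x-abc-three"` does not start with the prefix, yet the text is still modified.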

prefix = "xxx-"  # not defined in the original snippet; assumed from the output below
MyList = ["xxx-56", "xxx-57", "xxx-58"]
MyList = [x[len(prefix):] for x in MyList]  # for each x, keep everything after the first len(prefix) characters

print(MyList)

---> ['56', '57', '58']

This is the simplest way.
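Plain slicing assumes that every string really starts with the prefix. If that is not guaranteed, a guarded variant along the following lines may be safer. The helper name `strip_prefix` and the sample data are hypothetical.

```python
def strip_prefix(text, prefix):
    """Return text without prefix if text starts with it, else text unchanged."""
    if text.startswith(prefix):
        return text[len(prefix):]
    return text


prefix = "xxx-"
items = ["xxx-56", "xxx-57", "yyy-58"]  # the last item lacks the prefix on purpose

print([strip_prefix(x, prefix) for x in items])  # ['56', '57', 'yyy-58']
```

On Python 3.9 and later, the built-in `str.removeprefix` behaves the same way.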

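I would use `os.path.commonprefix` to compute the common prefix of all the strings, then slice each string to remove that prefix. (The function lives in the `os.path` module, but it does not check path separators, so it can be used in a general context.)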
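A minimal sketch of this approach, using hypothetical sample strings; the same two lines appear inside `test_fabre` in the benchmark further down.

```python
import os

# Hypothetical input; any list of strings sharing a prefix works.
list_of_strings = ["common-part-a", "common-part-b", "common-part-c"]

# commonprefix compares character by character, not path component by component.
common_prefix = os.path.commonprefix(list_of_strings)

# Slice off exactly len(common_prefix) characters from the start of each string.
stripped = [s[len(common_prefix):] for s in list_of_strings]

print(repr(common_prefix))  # 'common-part-'
print(stripped)             # ['a', 'b', 'c']
```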

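Notes:

This method allows a fully dynamic prefix that is not known in advance. By reversing the strings, a common suffix can be removed as well.

It is better to slice the result using `len` than to use `str.replace()`: slicing is faster, it removes only the start of the string, and it is safer, since we know that all strings start with this prefix.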
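The suffix remark in the notes can be sketched as follows. The function name `strip_common_suffix` and the file names are made up for illustration.

```python
import os


def strip_common_suffix(strings):
    """Remove the longest suffix shared by all strings."""
    # The common suffix is the common prefix of the reversed strings.
    reversed_strings = [s[::-1] for s in strings]
    suffix_len = len(os.path.commonprefix(reversed_strings))
    # Use len(s) - suffix_len rather than s[:-suffix_len], because
    # s[:-0] would return an empty string when there is no common suffix.
    return [s[:len(s) - suffix_len] for s in strings]


print(strip_common_suffix(["report.txt", "notes.txt", "todo.txt"]))
# ['report', 'notes', 'todo']
```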

I would have done:

common = "Hello_"
lines = ["Hello_1 !", "Hello_2 !", "Hello_3 !"]

new_lines = []
for line in lines:
    # Finding first occurrence of the word we want to remove.
    startIndex = line.find(common) + len(common)
    new_lines.append(line[startIndex:])

print(new_lines)

Testing performance against Jean-François Fabre's solution:

from timeit import timeit
import os

def test_fabre(lines):
    # import os

    commonprefix = os.path.commonprefix(lines)
    return [x[len(commonprefix):] for x in lines]

def test_insert(common, lines):
    new_lines = []
    for line in lines:
        startIndex = line.find(common) + len(common)
        new_lines.append(line[startIndex:])
    return new_lines

print(timeit("test_insert(common, lines)", 'from __main__ import test_insert; common="Hello_";lines = ["Hello_1 !", "Hello_2 !", "Hello_3 !"]'))
print(timeit("test_fabre(lines)", 'from __main__ import test_fabre; lines = ["Hello_1 !", "Hello_2 !", "Hello_3 !"]'))

# test_insert outputs : 2.92963575145
# test_fabre outputs : 4.23027790484 (with import os OUTside func)
# test_fabre outputs : 5.86552750264 (with import os INside func)
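For readers who want to repeat a comparison like this on Python 3, here is a hedged sketch using the `globals` parameter of `timeit`. The function names and the iteration count are arbitrary choices for illustration, and no timings are quoted because they depend on the machine and interpreter.

```python
import os
from timeit import timeit

lines = ["Hello_1 !", "Hello_2 !", "Hello_3 !"]
common = "Hello_"


def with_commonprefix(lines):
    # Prefix is discovered at run time.
    prefix = os.path.commonprefix(lines)
    return [x[len(prefix):] for x in lines]


def with_known_prefix(common, lines):
    # Prefix is known in advance, so only slicing is needed.
    return [x[len(common):] for x in lines]


# Both approaches should agree before their speed is compared.
assert with_commonprefix(lines) == with_known_prefix(common, lines)

t_dynamic = timeit("with_commonprefix(lines)", globals=globals(), number=10_000)
t_known = timeit("with_known_prefix(common, lines)", globals=globals(), number=10_000)

print(f"commonprefix + slice: {t_dynamic:.4f} s")
print(f"known prefix + slice: {t_known:.4f} s")
```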

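Slice it off... replace it... extract it with a regex... Have you tried anything? It looks like you want us to write some code for you. While many users are willing to produce code for a coder in distress, they usually only help when the poster has already tried to solve the problem on their own. A good way to demonstrate this effort is to include the code you have written so far, example input (if there is any), the expected output, and the output you actually get (console output, tracebacks, etc.). The more detail you provide, the more answers you are likely to receive.

@MooingRawr Actually, no. There are many solutions, such as regexp replacement, or using the `len()` function and cutting off the start of the string. That is why I posted the question with a note about readability and code cleanliness.

@KamilZabielski What does your comment have to do with my asking you to show what you have tried so far?

Replacing the string with an empty string is a fine solution. Thanks.

This is correct if and only if it is unique in every element of the list, but in this case it actually is unique.

This will leave the hyphen after the common part, by the way.

I also believe the hyphen here would be included in the common part.

Maybe, but what if `str1` occurs twice in the string?

`newstr.replace(str1, '', 1)` will solve the problem @KamilZabielski is referring to. Note the `1` in there: it tells Python to replace only the 1st instance.

I was considering using `len()`, but I thought there was some cleaner solution.

@Kamil, `len(prefix)` is a perfect solution if the value in the string contains the prefix twice.

It is better to write `len(prefix)` than to hard-code the length like that.

Is there any approach other than `len()`?

`len()` is much faster than `str.replace`.

Doesn't the separator matter here? The documentation says it is only meant for path strings.

No. It is slightly abusing the tool, but the function does not care about path separators and the like. When you use it to compute a common directory, you have to remove whatever follows the last separator if all the files start with the same prefix! Of course it could be recoded, but if that were the case I would not have answered.

@Jean-François Fabre Faster from a performance standpoint?

Should this work on `hellohello` and return only `hello`?

@cricket_007 It will work and give [''].

As mentioned in the question, we have to remove str1, but it does not say how many times that should happen.

The question only asks to replace a common prefix. It is basically unclear, since the user did not explain in detail whether to remove it once or several times, but his example shows once. So basically we cannot say whether it should be once or several times.

I am not saying it is wrong, just pointing out an edge case.

Note that your solution removes the word, but it also removes whatever comes before it. Well, maybe there is nothing before it.

You are right, that is a side effect I had not considered. Given that, and given what I wrote, `len` alone would be enough. I will fix it when I am back at my computer. :) Thanks.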
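The side effect discussed in the last comments can be reproduced with a short sketch. The wrapper name `strip_with_find` and the input strings are hypothetical; the body mirrors the `find`-based loop from the answer.

```python
def strip_with_find(common, line):
    # Same logic as the find-based answer: skip to the end of the first match.
    start_index = line.find(common) + len(common)
    return line[start_index:]


common = "Hello_"

# Text before the match is dropped together with the prefix.
print(repr(strip_with_find(common, "Say Hello_1 !")))  # '1 !'

# If the prefix is absent, find() returns -1, so start_index becomes
# len(common) - 1 and unrelated characters are cut off.
print(repr(strip_with_find(common, "Bye_1 !")))  # ' !'
```

Slicing with `len(common)` alone avoids the first surprise, and a `startswith` check avoids the second.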
