Python: remove hyphens from a block of string

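I want to remove hyphens from a block of string, e.g.:

This is what my string looks like. I need a generic way to remove the trailing - from the end of every line.

Note: this is the complete string, copied from Excel.

I tried the following:

if data.endswith('-'):
    data = seq[:-1]

I want my output/result to look like this: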

(CB)-year-(3F)-year
(56)-ADDR(01)-DATA(06)-(00)-ADDR5-PBX-CHX-[address0]-(CA)-new
(56)-ADDR(01)-DATA(03)-(00)-ADDR5-PBX-CHX-[address0]-(CA)-new
(05)-ADDR5-[address0]-(E0)-tWHR2-nintK
(56)-ADDR(01)-DATA(05)-(00)-ADDR5-PBX-CHX-[address0]-(CA)-new

Use str.rstrip.

Example:

s = """"(CB)-year-(3F)-year-
(56)-ADDR(01)-DATA(06)-(00)-ADDR5-PBX-CHX-[address0]-(CA)-new-
(56)-ADDR(01)-DATA(03)-(00)-ADDR5-PBX-CHX-[address0]-(CA)-new-
(05)-ADDR5-[address0]-(E0)-tWHR2-nintK-
(56)-ADDR(01)-DATA(05)-(00)-ADDR5-PBX-CHX-[address0]-(CA)-new-:"""

# Strip surrounding whitespace, then any trailing hyphens, from each line.
r = "\n".join(line.strip().rstrip("-") for line in s.split("\n"))
print(r)

Output:

"(CB)-year-(3F)-year
(56)-ADDR(01)-DATA(06)-(00)-ADDR5-PBX-CHX-[address0]-(CA)-new
(56)-ADDR(01)-DATA(03)-(00)-ADDR5-PBX-CHX-[address0]-(CA)-new
(05)-ADDR5-[address0]-(E0)-tWHR2-nintK
(56)-ADDR(01)-DATA(05)-(00)-ADDR5-PBX-CHX-[address0]-(CA)-new-:
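
The last line of that output still ends in -: because rstrip("-") stops at the colon. A minimal sketch of one way to cover that case, assuming a trailing colon should be dropped as well (the name strip_line_endings and its chars parameter are made up for illustration and are not part of the answer):

```python
def strip_line_endings(text, chars="-:"):
    """Remove any run of trailing `chars` from every line of `text`."""
    # str.rstrip treats its argument as a set of characters, so "-:" strips
    # any mix of trailing hyphens and colons, not only the literal "-:".
    return "\n".join(line.rstrip(chars) for line in text.split("\n"))


block = "(CB)-year-(3F)-year-\n(05)-ADDR5-[address0]-(E0)-tWHR2-nintK-\n(CA)-new-:"
print(strip_line_endings(block))
```

Because rstrip removes every trailing character in the set, a line ending in several hyphens loses all of them; if only one hyphen should go, a slice guarded by endswith is the safer choice.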
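The string above ends with -, but in Python the complete block is treated as a single string, not as several separate strings, so .endswith() only looks at the very end of the whole block.

The lines are separated only by the newline character \n, so you need to split the string first and then join the pieces back together, like this: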

In [12]: print('\n'.join([i[:-1] if i[-1] == '-' else i for i in string.split('\n')]))
"(CB)-year-(3F)-year
(56)-ADDR(01)-DATA(06)-(00)-ADDR5-PBX-CHX-[address0]-(CA)-new
(56)-ADDR(01)-DATA(03)-(00)-ADDR5-PBX-CHX-[address0]-(CA)-new
(05)-ADDR5-[address0]-(E0)-tWHR2-nintK
(56)-ADDR(01)-DATA(05)-(00)-ADDR5-PBX-CHX-[address0]-(CA)-new

Logic:

'\n'.join(…) joins the list of strings back together with \n

i[:-1] gives the string without its last character

i[-1] == '-' checks whether the last character of the string is a hyphen

string.split('\n') splits the string on the delimiter \n, producing a list of strings that the list comprehension iterates over
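
One caveat with i[-1]: indexing an empty line raises IndexError. A hedged variant of the same split-and-join idea that uses str.endswith instead, so blank lines pass through untouched (the helper name drop_trailing_hyphen is made up for illustration):

```python
def drop_trailing_hyphen(text):
    """Remove a single trailing hyphen from each line, if present."""
    lines = text.split("\n")
    # endswith returns False for an empty string, so blank lines are kept as-is.
    cleaned = [line[:-1] if line.endswith("-") else line for line in lines]
    return "\n".join(cleaned)


print(drop_trailing_hyphen("ab-\n\ncd--\nef"))
```

Only one hyphen is removed per line here, matching the behavior of the original comprehension.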


Time comparison:

In [18]: %timeit re.sub('-:?$', '', string, flags=re.MULTILINE)
2.74 µs ± 91.1 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

In [19]: %timeit '\n'.join([i[:-1] if i[-1] == '-' else i for i in string.split('\n')])
1.56 µs ± 24.1 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
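
Outside IPython, a similar comparison can be approximated with the standard timeit module. The sketch below assumes a small two-line sample string and wraps each approach in a function named for illustration only; absolute numbers depend on the machine and the input size, so none are shown.

```python
import re
import timeit

sample = "(CB)-year-(3F)-year-\n(05)-ADDR5-[address0]-(E0)-tWHR2-nintK-"


def with_regex(text):
    # Remove one trailing hyphen (and an optional colon) at each line end.
    return re.sub(r"-:?$", "", text, flags=re.MULTILINE)


def with_split_join(text):
    # Split into lines, drop one trailing hyphen per line, and rejoin.
    return "\n".join(
        line[:-1] if line.endswith("-") else line for line in text.split("\n")
    )


# Both approaches should agree on this sample before they are timed.
assert with_regex(sample) == with_split_join(sample)

regex_time = timeit.timeit(lambda: with_regex(sample), number=10_000)
split_time = timeit.timeit(lambda: with_split_join(sample), number=10_000)
print(f"regex:      {regex_time:.4f} s for 10,000 runs")
print(f"split/join: {split_time:.4f} s for 10,000 runs")
```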

With re you don't have to split the string and rebuild it:

>>> import re
>>> s = '''abc
... de-f
... ghi-
... jkl--
... mno-:'''
>>> print(re.sub('-:?$', '', s, flags=re.MULTILINE))
abc
de-f
ghi
jkl-
mno
>>> print(re.sub('-+:?$', '', s, flags=re.MULTILINE))
abc
de-f
ghi
jkl
mno
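
If the substitution runs many times, the pattern can be compiled once and wrapped in a function. This is only a sketch: the names TRAILING_HYPHENS and strip_trailing_hyphens are illustrative, and it uses the second pattern above, which removes a whole run of hyphens.

```python
import re

# One or more hyphens, optionally followed by a colon, at the end of a line.
TRAILING_HYPHENS = re.compile(r"-+:?$", flags=re.MULTILINE)


def strip_trailing_hyphens(text):
    """Return `text` with trailing hyphens (and an optional colon) removed per line."""
    return TRAILING_HYPHENS.sub("", text)


print(strip_trailing_hyphens("abc\nde-f\nghi-\njkl--\nmno-:"))
```

With re.MULTILINE, $ matches before each newline as well as at the end of the string, which is why no explicit split is needed.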

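What should happen with the last line?

Try splitting the string so that each line is an item in a list; then you can loop over the list and apply your current if endswith('-'): condition to each line.

Hi, can you post the result you want?

Aaron F, thanks for your comment. But I want the data not to be split.

It looks like you want to replace every -\n sequence in the multi-line string with \n. Is that right? By the way, if you read this data in text mode there shouldn't be any \r characters in it.

Rakesh, thank you very much. This is what I was looking for.

@PythonBang I thought you said you didn't want to split the input string?

PM 2Ring, I meant that I don't want to split it on "-".

It will work even without the : in the string. If you remove the : from the end of the string, the string then ends with a hyphen -, so that gets removed as well.

Updated the answer, thanks.

This also serves my purpose, without splitting.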