Removing the characters before and after a specific substring in a string in Python
I am new to Python. Perhaps this can be done with a regular expression. I want to search for a specific substring in a string and remove the characters before and after it.
Example 1
Input:"This is the consignment no 1234578TP43789"
Output:"This is the consignment no TP"
Example 2
Input:"Consignment no 1234578TP43789 is on its way on vehicle no 3456MP567890"
Output:"Consignment no TP is on its way on vehicle no MP"
I have a list of these acronyms (MP, TP) to search for in the string.
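You can use re.sub with the pattern \d+(TP|MP)\d+ and the replacement \1.
What does it do?
\d+ matches one or more digits.
(TP|MP) matches either TP or MP and captures it in group 1.
\1 refers to the captured string, which we use to replace the entire matched string.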
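The steps above can be put together in a short sketch. The pattern \d+(TP|MP)\d+ is reconstructed from the explanation, and the function name strip_digits_around is a hypothetical name used only for illustration.

```python
import re

def strip_digits_around(text):
    # \d+      one or more digits before the acronym
    # (TP|MP)  the acronym itself, captured as group 1
    # \d+      one or more digits after the acronym
    # The whole match is replaced by the captured group (\1).
    return re.sub(r'\d+(TP|MP)\d+', r'\1', text)

print(strip_digits_around("This is the consignment no 1234578TP43789"))
# This is the consignment no TP
```

Because \d+ requires at least one digit on each side, an acronym that stands alone or has digits on only one side is left untouched.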
If any character can appear before and after TP/MP, we can use \S to match any character other than whitespace. For example,
>>> import re
>>> string="Consignment no 1234578TP43789 is on its way on vehicle no 3456MP567890"
>>> re.sub(r'\S+(TP|MP)\S+', r'\1', string)
'Consignment no TP is on its way on vehicle no MP'
Edit
Using a list comprehension, you can iterate over the list and replace all the strings, as follows:
>>> list_1=["TP","MP","DCT"]
>>> list_2=["This is the consignment no 1234578TP43789","Consignment no 1234578TP43789 is on its way on vehicle no 3456MP567890"]
>>> [ re.sub(r'\d+(' + '|'.join(list_1) + r')\d+', r'\1', string) for string in list_2 ]
['This is the consignment no TP', 'Consignment no TP is on its way on vehicle no MP']
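If the acronym list grows or comes from user input, it may be worth building the pattern once and escaping each entry. The following is only a sketch under those assumptions; build_pattern and replace_all are hypothetical helper names, not part of the answer above.

```python
import re

def build_pattern(acronyms):
    # re.escape guards against acronyms that contain regex metacharacters.
    # Sorting longest first lets a longer acronym win over a shorter one
    # that is a prefix of it.
    ordered = sorted(acronyms, key=len, reverse=True)
    alternatives = '|'.join(re.escape(a) for a in ordered)
    return re.compile(r'\d+(' + alternatives + r')\d+')

def replace_all(acronyms, strings):
    # Compile once, then reuse the pattern for every string.
    pattern = build_pattern(acronyms)
    return [pattern.sub(r'\1', s) for s in strings]

list_1 = ["TP", "MP", "DCT"]
list_2 = ["This is the consignment no 1234578TP43789",
          "Consignment no 1234578TP43789 is on its way on vehicle no 3456MP567890"]
print(replace_all(list_1, list_2))
# ['This is the consignment no TP', 'Consignment no TP is on its way on vehicle no MP']
```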
You can use strip to remove characters from the beginning and end of each word in the string.
strg="Consignment no 1234578TP43789 is on its way on vehicle no 3456MP567890"
strg=' '.join([word.strip('0123456789') for word in strg.split()])
print(strg) # Consignment no TP is on its way on vehicle no MP
If only the words that contain a reserved word should be stripped, put it in a loop:
strg="Consignment no 1234578TP43789 is on its way on vehicle no 3456MP567890 200DG"
reserved=['MP','TP']
for res in reserved:
    strg=' '.join([word.strip('0123456789') if (res in word) else word for word in strg.split()])
print(strg) # Consignment no TP is on its way on vehicle no MP 200DG
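The loop above splits and rejoins the string once per reserved word. A single pass is also possible; the sketch below assumes the same behaviour is wanted, and strip_reserved is a hypothetical helper name introduced for illustration.

```python
def strip_reserved(text, reserved, chars='0123456789'):
    words = []
    for word in text.split():
        # Strip leading/trailing digits only from words that contain
        # at least one reserved acronym; leave every other word as is.
        if any(res in word for res in reserved):
            word = word.strip(chars)
        words.append(word)
    # Note: split() followed by join() collapses runs of whitespace.
    return ' '.join(words)

strg = "Consignment no 1234578TP43789 is on its way on vehicle no 3456MP567890 200DG"
print(strip_reserved(strg, ['MP', 'TP']))
# Consignment no TP is on its way on vehicle no MP 200DG
```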
Comments:
Have a look at the substitution function of the regex module.
Anything before and after TP. It can contain both digits and characters. This thing, 1234578TP43789, should be replaced by TP in the output.
@NU11P01N73R One more thing: list_1=["TP","MP","DCT"] list_2=["This is the consignment no 1234578TP43789","Consignment no 1234578TP43789 is on its way on vehicle no 3456MP567890"] Now I have to take TP and MP from list_1, search for them in the strings in list_2, and replace them. How do I do that?
@SalmanBaqri You can use join, as "|".join(["TP", "MP", "DCT"]), to generate the regex, and use it while iterating over list_2 to produce the required output.
Could you explain a bit more?
I would add a word boundary and change \d+ to \w+. +1 anyway.
@nu11p01n73R Can you share resources from which I can learn more about regular expressions?
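The word-boundary suggestion from the comments could be sketched as follows. The commenter did not spell out the exact pattern, so \b\w+(TP|MP)\w+\b is an assumption, and strip_word_around is a hypothetical name used only for illustration.

```python
import re

def strip_word_around(text):
    # \b   word boundary, so the match must cover a whole token
    # \w+  one or more letters, digits, or underscores on each side
    # The acronym is captured as group 1 and replaces the whole token.
    return re.sub(r'\b\w+(TP|MP)\w+\b', r'\1', text)

print(strip_word_around("Consignment no AB12TPxy9 is on vehicle no 3456MP567890"))
# Consignment no TP is on vehicle no MP
```

One caveat of this variant: since \w also matches letters, any ordinary word that contains an uppercase TP or MP with characters on both sides would be reduced to the acronym as well.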