Python: get the complete string before and after a specific pattern


I want to clean up noisy text that follows a specific pattern:

text = "this is some text lskdfmd&@kjansdl and some more text sldkf&@lsakjd and some other stuff"
I want to remove every chunk of this sentence that sits between two spaces and contains &@, so that I get:

result = "this is some text and some more text and some other stuff"
I have been trying:

re.compile(r'([\s]&@.*?([\s])).sub(" ", text)

But I can't seem to work out the first part.

You can use this regex to capture the noise strings

\s+\S*&@\S*\s+
and replace each match with a single space.

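Here, the leading \s+ matches one or more whitespace characters, \S* matches zero or more non-whitespace characters, &@ matches the marker literally, the second \S* again matches zero or more non-whitespace characters, and the trailing \s+ matches one or more whitespace characters. Together they cover the whole noise token plus the whitespace on both sides; replacing each match with a single space gives you the string you want.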

Also, if the noise string can appear at the very beginning or the very end of the string, change \s+ to \s*.
Python code

import re

s = 'this is some text lskdfmd&@kjansdl and some more text sldkf&@lsakjd and some other stuff'
print(re.sub(r'\s+\S*&@\S*\s+', ' ', s))
Prints

this is some text and some more text and some other stuff
Try this:

import re
result = re.findall(r"[a-zA-Z]+\&\@[a-zA-Z]+", text)
print(result)
['lskdfmd&@kjansdl', 'sldkf&@lsakjd']
Now remove the items in the result list from the list of all words in the text.
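That removal step is not spelled out above. One way it might look is sketched here, assuming the text is split on whitespace; the names noise and cleaned are illustrative, and the character class is written as [a-zA-Z].

```python
import re

text = "this is some text lskdfmd&@kjansdl and some more text sldkf&@lsakjd and some other stuff"

# Collect the noise tokens (letters on both sides of the &@ marker).
result = re.findall(r"[a-zA-Z]+&@[a-zA-Z]+", text)

# Drop every whitespace-separated word that appears in result, then rejoin.
noise = set(result)
cleaned = " ".join(word for word in text.split() if word not in noise)
print(cleaned)
```

Because the words are rejoined with single spaces, this approach does not leave doubled spaces behind, though it also collapses any other runs of whitespace in the input.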

Edit 1, suggested by @Jan:

re.sub(r"[a-zA-Z]+\&\@[a-zA-Z]+", '', text)
output: 'this is some text  and some more text  and some other stuff'
Edit 2, suggested by @Pushpesh Kumar Rajwanshi:

re.sub(r" [a-zA-Z]+\&\@[a-zA-Z]+ ", " ", text)
output: 'this is some text and some more text and some other stuff'
You can use

\S+&@\S+\s*
In Python:

import re
text = "this is some text lskdfmd&@kjansdl and some more text sldkf&@lsakjd and some other stuff"
rx = re.compile(r'\S+&@\S+\s*')
text = rx.sub('', text)
print(text)
This yields

this is some text and some more text and some other stuff

Yes, no problem. @Jan, using it this way:
re.sub(r"[a-zA-Z]+\&\@[a-zA-Z]+", '', text) output: 'this is some text  and some more text  and some other stuff'
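This replacement leaves two spaces instead of the desired single space. Also, you don't need to escape & and @; the backslashes are redundant.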
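Both points in that comment can be checked with a short sketch. The variable names are illustrative, and the last step (collapsing runs of whitespace) is one possible fix rather than something proposed in the thread.

```python
import re

text = "this is some text lskdfmd&@kjansdl and some more text sldkf&@lsakjd and some other stuff"

# The escaped and unescaped patterns behave the same: & and @ are not
# regex metacharacters, so the backslashes add nothing.
escaped = re.sub(r"[a-zA-Z]+\&\@[a-zA-Z]+", "", text)
plain = re.sub(r"[a-zA-Z]+&@[a-zA-Z]+", "", text)
print(escaped == plain)

# Removing only the token leaves the spaces on both sides of it behind.
print("  " in plain)

# One possible cleanup: collapse each run of whitespace to a single space.
collapsed = re.sub(r"\s{2,}", " ", plain)
print(collapsed)
```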