Python: get the complete string before and after a specific pattern

python regex python-3.x pandas

I want to strip noisy text that follows a specific pattern:

text = "this is some text lskdfmd&@kjansdl and some more text sldkf&@lsakjd and some other stuff"

I want to remove every chunk of this sentence that sits between two spaces and contains &@:

result = "this is some text and some more text and some other stuff"

I have been trying:

re.compile(r'([\s]&@.*?([\s])).sub(" ", text)

But I can't seem to get the first part right.

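You can match each noise token with this regex

\s+\S*&@\S*\s+

and replace it with a single space.

Here, the leading \s+ matches one or more whitespace characters, each \S* matches zero or more non-whitespace characters on either side of the literal &@, and the trailing \s+ matches the whitespace after the token. Replacing the whole match with a single space gives you the string you want.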

Also, if the noise token can appear at the very start or very end of the string, change each \s+ to \s*.

Python code:

import re

s = 'this is some text lskdfmd&@kjansdl and some more text sldkf&@lsakjd and some other stuff'
print(re.sub(r'\s+\S*&@\S*\s+', ' ', s))

Prints:

this is some text and some more text and some other stuff
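
The start/end variant mentioned above can be sketched as follows. The input string here is a made-up example (not from the question) with a noisy token at each edge; because the replacement is a single space, a final strip() is assumed to be wanted to drop the space left at the edges.

```python
import re

# Hypothetical input with noisy tokens at the very start and the very end
s = 'abc&@def this is text ghi&@jkl'

# \s* lets the match succeed even when no whitespace precedes or follows the token
replaced = re.sub(r'\s*\S*&@\S*\s*', ' ', s)

# The replacement space can land at the edges of the string, so strip it
cleaned = replaced.strip()
print(cleaned)
```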

Try this:

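import re
text = "this is some text lskdfmd&@kjansdl and some more text sldkf&@lsakjd and some other stuff"
# Collect every token of the form letters&@letters
result = re.findall(r"[a-zA-Z]+\&\@[a-zA-Z]+", text)
print(result)
# ['lskdfmd&@kjansdl', 'sldkf&@lsakjd']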

Now remove the items in the result list from the list of all words in the text.
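
That removal step is not spelled out in the answer. One way it might look is sketched below, assuming the text is split on whitespace and the surviving words are rejoined with single spaces. The names noise and cleaned are introduced here for illustration only.

```python
import re

text = "this is some text lskdfmd&@kjansdl and some more text sldkf&@lsakjd and some other stuff"

# Tokens of the form letters&@letters, stored in a set for fast membership tests
noise = set(re.findall(r"[a-zA-Z]+&@[a-zA-Z]+", text))

# Keep every whitespace-separated word that is not a noise token
cleaned = " ".join(word for word in text.split() if word not in noise)
print(cleaned)
```

Note that rejoining with single spaces also collapses any repeated whitespace in the original text, which may or may not be desirable.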

Edit 1, suggested by @Jan:

re.sub(r"[a-zA-Z]+\&\@[a-zA-Z]+", '', text)
output: 'this is some text  and some more text  and some other stuff'

Edit 2, suggested by @Pushpesh Kumar Rajwanshi:

re.sub(r" [a-zA-Z]+\&\@[a-zA-Z]+ ", " ", text)
output: 'this is some text and some more text and some other stuff'

You could use

\S+&@\S+\s*

In Python:

import re
text = "this is some text lskdfmd&@kjansdl and some more text sldkf&@lsakjd and some other stuff"
rx = re.compile(r'\S+&@\S+\s*')
text = rx.sub('', text)
print(text)

This yields

this is some text and some more text and some other stuff
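
The approaches in this thread differ mainly in how they treat the whitespace around a noisy token. A minimal sketch that sidesteps the question is to drop the tokens first and normalise whitespace afterwards. The helper name strip_noise and its marker parameter are hypothetical, introduced only for this illustration.

```python
import re

def strip_noise(text, marker="&@"):
    """Remove whitespace-delimited tokens containing `marker` (hypothetical helper)."""
    # re.escape keeps the marker literal even if it contains regex metacharacters
    pattern = re.compile(r"\S*" + re.escape(marker) + r"\S*")
    # Blank out the noisy tokens, then collapse the leftover whitespace runs
    return " ".join(pattern.sub(" ", text).split())

text = "this is some text lskdfmd&@kjansdl and some more text sldkf&@lsakjd and some other stuff"
print(strip_noise(text))
```

Because the whitespace is rebuilt by split() and join(), noisy tokens at the start or end of the string need no special handling. The trade-off is that any intentional runs of whitespace in the input are collapsed as well.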

Yes, no problem. @Jan, used this way:

re.sub(r"[a-zA-Z]+\&\@[a-zA-Z]+", '', text) output: 'this is some text  and some more text  and some other stuff'

This replacement leaves two spaces where a single space is wanted. Also, there is no need to escape & and @.