Python regex to negate a pattern of three @, digits, and three trailing @

python regex

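I need to build a regex that negates the following pattern: three @ symbols at the start, followed by a number between 1 and 12 digits long, and ending with three @ symbols. Anything apart from this should be selected.

Basically, my challenge is that I have a data frame containing a text corpus, with values in the pattern @@@0-9@@@. I want to remove everything except this pattern. I have been able to develop the regex as

[@][@][@]]\d{1,12}[@][@][@][@]

but I want to negate this pattern, since I want to find and replace. For example,

my name is x and i work at @@@12354@@@ and i am happy with my job. what is your company name? is it @@@42334@@@? you look happy as well!!

should return

@@@12354@@@@@@42334@@@

It would be great to have a space delimiter between the individual elements. Any help?

I will be using this regex with the str.replace function on a Python data frame.

I have tried this myself and have gotten this far.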

**Edit:** Here is the data

SNo details
1   account @@@0000082569@@@ / department stores uk & ie credit control operations
2   academic @@@0000060910@@@ , administrative, and @@@0000039198@@@ liaison coordinator
3   account executive, financial @@@0000060910@@@ , enterprise and partner group
4   2015-nasa summer internship- space power system @@@0000129849@@@ and testing
5   account technical @@@0000185187@@@ , technical presales, systems engineer
6   account @@@0000082569@@@ for car, van & 4x4 products in the east of england
7   account @@@0000082569@@@ for mikro segment and owners of the enterprises
8   account @@@0000082569@@@ - affinity digital display, mobile & publishing
9   account @@@0000082569@@@ @@@0000060905@@@ -energy and commodities @@@0000086889@@@ candidate
10  account @@@0000082569@@@ for companies department of external relevance

Instead of using replace with a complex regex, you can use join together with findall and a simpler regex, like this:

>>> import re
>>> text = 'my name is x and i work at @@@12354@@@ and i am happy with my job. what is your company name? is it @@@42334@@@? you look happy as well!!'
>>> ' '.join(re.findall(r'@{3}\d{1,12}@{3}', text))
'@@@12354@@@ @@@42334@@@'
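For comparison, the replace-based route the question asks about can be sketched with an alternation: capture a whole token, or else consume one other character, and substitute only the captured group. This is an alternative technique, not something proposed in the thread, and the function names below are made up for illustration. It reproduces the concatenated output from the question but cannot insert a separator, which is one reason the findall and join route is simpler.

```python
import re

# Pattern from the answer above: three @, 1 to 12 digits, three @.
TOKEN = r'@{3}\d{1,12}@{3}'


def strip_all_but_tokens(text):
    """Delete every character that is not part of a @@@digits@@@ token."""
    # Either capture a whole token (group 1) or consume one other character.
    # Replacing with \1 keeps tokens and drops the rest; an unmatched group
    # expands to an empty string on Python 3.5 and later.
    return re.sub(r'(' + TOKEN + r')|.', r'\1', text, flags=re.DOTALL)


def join_tokens(text, sep=' '):
    """The findall + join route, which allows a separator between tokens."""
    return sep.join(re.findall(TOKEN, text))


sample = ('my name is x and i work at @@@12354@@@ and i am happy with my job. '
          'what is your company name? is it @@@42334@@@? you look happy as well!!')

print(strip_all_but_tokens(sample))  # tokens run together, as in the question
print(join_tokens(sample))           # tokens separated by a space
```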

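To explain the pieces:

@{3}\d+@{3} will match any run of one or more digits enclosed by three @ symbols on each side,

.findall will extract all matches, and

.apply(' '.join) will join the values with a space.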
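The two answers use slightly different quantifiers, \d{1,12} and \d+. A small sketch of where they differ, assuming a made-up token with 13 digits (the variable names are illustrative only):

```python
import re

BOUNDED = r'@{3}\d{1,12}@{3}'   # at most 12 digits between the @ runs
UNBOUNDED = r'@{3}\d+@{3}'      # any number of digits, one or more

# Hypothetical token with 13 digits, one more than the question allows.
long_token = '@@@' + '1' * 13 + '@@@'

bounded_hits = re.findall(BOUNDED, long_token)
unbounded_hits = re.findall(UNBOUNDED, long_token)

print(bounded_hits)    # the bounded pattern rejects the 13-digit token
print(unbounded_hits)  # the unbounded pattern accepts it
```

For the sample data in the question, where every number has 10 digits, both patterns give the same matches.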

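A simpler regex would be @{3}\d+@{3}. You can use .findall with r'@{3}\d+@{3}' and join the matches found.

If I negate it in the form [^@{3}\d+@{3}], it negates not only the pattern but also any other digits outside the pattern.

I am getting a TypeError: "TypeError: can only join an iterable"

@familier: please post your exact input data. You can see that my example includes the data initialization.

I have added sample data.

But how was it initialized? I mean, please post it in a reproducible way, like df=pd.DataFrame({…})

OK, I realized it after almost 3 hours of debugging. In some rows it was a silly NaN value. I replaced it with a blank " ", and your code is correct. I accept this as a valid answer. Thanks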
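The remark about [^@{3}\d+@{3}] can be illustrated with a short sketch. Inside square brackets the quantifiers lose their meaning, so the class simply excludes the individual characters @, {, }, + and any digit. The input string here is made up for illustration:

```python
import re

# A negated character class works per character, not per pattern:
# it keeps every @, {, }, + and digit, wherever they appear.
char_class = r'[^@{3}\d+@{3}]'

text = 'i work at @@@123@@@ since 2015'
leftover = re.sub(char_class, '', text)

print(leftover)  # the stray year survives next to the token
```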

>>> import pandas as pd
>>> df = pd.DataFrame({'col1':['at @@@12354@@@ and i am happy with my job. what is your company name? is it @@@42334@@@? you look happy as well!!', 'at @@@222@@@ and t @@@888888@@@?' ]})
>>> df['col1'].str.findall(r'@{3}\d+@{3}').apply(' '.join)
0    @@@12354@@@ @@@42334@@@
1     @@@222@@@ @@@888888@@@
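The TypeError reported in the comments came from NaN rows: .str.findall leaves NaN in place, and ' '.join cannot join a float. A minimal sketch of the fix the asker describes, filling missing values before extracting. The frame below is hypothetical, built from one row of the sample data plus a NaN row, and an empty string is used as the fill value here as an assumption:

```python
import pandas as pd

PATTERN = r'@{3}\d+@{3}'

# Hypothetical frame: one row from the sample data plus a missing value.
df = pd.DataFrame({'details': [
    'academic @@@0000060910@@@ , administrative, and @@@0000039198@@@ liaison coordinator',
    float('nan'),
]})

# Fill NaN first so that findall returns a list (possibly empty) for every row.
tokens = df['details'].fillna('').str.findall(PATTERN).apply(' '.join)

print(tokens.tolist())
```

Rows without any token, including the filled NaN rows, come out as empty strings.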