Python: remove the characters inside brackets

python regex pandas

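I want to remove [] and the characters between them. This is what I am currently doing:

df['Text'] = df['Text'].str.replace(r"\[.*\]","")

But the output is not what I want. Before the replacement the text is [image]This document, and afterwards it is *******This document, where * stands for a whitespace character.

How do I get rid of this whitespace?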

Edit 1

The Text column of df looks like this:

ID    Text
0     REAL ESTATE LEASE THIS INDUSTRIAL REAL ESTAT...
5     Lease AureementMade and signed on the \ of Aug...
6     FIRST AMENDMENT OF LEASEDATE: August 31, 2001L...
8     [image: image0.jpg] Jack[image: image1.jb2] ...
9     [image: image0.jpg] ABC SALES Meeting 97...
14    FIRST AMENDMENT OF LEASETHIS FIRST AMENDMENT O...
17    [image: image0.tif] Deep ML LEASE SERVI...
22    [image: image0.jpg] F 15 083 EX [image: image1...
26    LEASE AGREEMENT—GROSS LEASEBASIC LEASE PROVISI...
28    [image: image0.jpg] 17. Medical VERIFICATION...
31    [image: image0.jpg]  [image: image1.jb2] PLL 3...
32    SUBLEASETHIS SUBLEASE this “Sublease” made as ...
34    [image: image0.tif] Lease Agreement May 10, 20...
35    13057968.3  1 Initials:  _____  _____  SECOND ...
42    [image: image0.jpg] Jack Dowson Buy Real MI...
46     Deep – Machine Learning LEASE   B...

I would like to see:

ID    Text
0     REAL ESTATE LEASE THIS INDUSTRIAL REAL ESTAT...
5     Lease AureementMade and signed on the \ of Aug...
6     FIRST AMENDMENT OF LEASEDATE: August 31, 2001L...
8     Jack ...
9     ABC SALES Meeting 97...
14    FIRST AMENDMENT OF LEASETHIS FIRST AMENDMENT O...
17    Deep ML LEASE SERVI...
22    F 15 083 EX ...
26    LEASE AGREEMENT—GROSS LEASEBASIC LEASE PROVISI...
28    17. Medical VERIFICATION...
31    PLL 3...
32    SUBLEASETHIS SUBLEASE this “Sublease” made as ...
34    Lease Agreement May 10, 20...
35    13057968.3  1 Initials:  _____  _____  SECOND ...
42    Jack Dowson Buy Real MI...
46    Deep – Machine Learning LEASE   B...

It looks like you need .str.strip()

Ex:

import pandas as pd

df = pd.DataFrame({"ID": [1,2,3], "Text": ["[image: 123.jpg] This document", "[image: image.jpg] Readers of the article", "The agreement between [image: image.jpg] two parties"]})
# regex=True is required on recent pandas versions, where str.replace treats the pattern literally by default
df["Text"] = df["Text"].str.replace(r"(\s*\[.*?\]\s*)", " ", regex=True).str.strip()
print(df["Text"])

Output:

0                        This document
1               Readers of the article
2    The agreement between two parties
Name: Text, dtype: object

Add an optional space ( ?) to the regex, so that the whole regex (the matching part) becomes:

r'\[.*\] ?'
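
One caveat with the pattern above: `.*` is greedy, so on a row that contains more than one tag it runs from the first `[` to the last `]` and takes the text in between with it. The sketch below compares it with a lazy `.*?` using the standard `re` module. The sample string is hypothetical, modeled on row 22 of the question, and the same patterns should behave the same way in `Series.str.replace(..., regex=True)`.

```python
import re

# Hypothetical sample with two tags, modeled on row 22 of the question.
text = "[image: image0.jpg] F 15 083 EX [image: image1.jpg] more text"

# Greedy: .* extends to the LAST ']' in the string, so the text between the tags is lost.
greedy = re.sub(r"\[.*\] ?", "", text)

# Lazy: .*? stops at the first ']' after each '[', so each tag is removed on its own.
lazy = re.sub(r"\[.*?\] ?", "", text)

print(repr(greedy))  # 'more text'
print(repr(lazy))    # 'F 15 083 EX more text'
```

This may also be why a greedy pattern appears to wipe out most of a row when the text contains several tags.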

Another hint: the regex is wrapped in parentheses (a capturing group). They are not needed, so remove them.
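
As a quick check of the hint above, here is a minimal sketch using the standard `re` module. It assumes the replacement string does not refer back to the group (it is a plain space here), in which case the parentheses should make no difference to the result.

```python
import re

text = "The agreement between [image: image.jpg] two parties"

# Pattern wrapped in a capturing group, as in the earlier answer.
with_group = re.sub(r"(\s*\[.*?\]\s*)", " ", text).strip()

# Same pattern without the group; the replacement never uses the captured text.
without_group = re.sub(r"\s*\[.*?\]\s*", " ", text).strip()

print(with_group == without_group)  # True
print(without_group)                # The agreement between two parties
```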

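Please take the time to read the guidance on how to provide a good example, and revise your question accordingly. The tips on how to ask a good question may also be useful.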

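df['Text'] = df['Text'].str.replace(r"\[.*\]", "").str.strip()?

If I use @Rakesh's solution, it deletes the whole line.

Note that there are two spaces between the words, so this proposal does not work. str.strip() removes leading and trailing whitespace from the whole text, not before and after each match.

@Valdi_-Bo. Thanks, I didn't see that.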
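
The point about `str.strip()` in the comments can be illustrated with a small sketch. The sample rows are hypothetical, loosely based on the data in the question, and `regex=True` is passed explicitly because recent pandas versions treat the pattern as a literal string by default.

```python
import pandas as pd

# Hypothetical rows: tags at the start, in the middle, and a row that is only a tag.
s = pd.Series([
    "[image: image0.jpg] Jack[image: image1.jb2] Dowson",
    "The agreement between [image: image.jpg] two parties",
    "[image: image0.jpg]",
])

# Removing only the tag and stripping the ends leaves a double space mid-string.
ends_only = s.str.replace(r"\[.*?\]", "", regex=True).str.strip()

# Absorbing the whitespace around each tag and putting one space back avoids that.
cleaned = s.str.replace(r"\s*\[.*?\]\s*", " ", regex=True).str.strip()

print(ends_only.tolist())
print(cleaned.tolist())
```

In this sketch the second row keeps two spaces between "between" and "two" in `ends_only`, while `cleaned` has a single space. A row that consists only of a tag ends up as an empty string in both cases.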