Python: remove square brackets from a DataFrame
I have the following dataset in DataFrame format, and I need to remove the square brackets from the data. How can I do this? Can anyone help?
Tags: python, pandas, replace
From TO
[wrestle] engage in a wrestling match
[write] communicate or express by writing
[write] publish
[spell] write
[compose] write music
Expected output:
From TO
wrestle engage in a wrestling match
write communicate or express by writing
write publish
spell write
If the values are strings:
print (type(df.loc[0, 'From']))
<class 'str'>
df['From'] = df['From'].str.strip('[]')
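As a minimal, self-contained sketch of the string case, the snippet below builds a small sample frame (the data is a hypothetical subset of the question's table) and strips the brackets. Note that `str.strip` treats its argument as a set of characters and only removes them from the two ends of each value, so brackets in the middle of a string would be left alone.

```python
import pandas as pd

# Hypothetical sample data mirroring part of the question's table.
df = pd.DataFrame({
    "From": ["[wrestle]", "[write]", "[spell]"],
    "TO": ["engage in a wrestling match", "publish", "write"],
})

# Remove any leading or trailing '[' and ']' characters from each value.
df["From"] = df["From"].str.strip("[]")

result_from = df["From"].tolist()
print(result_from)
```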
Thanks to @juanpa.arrivillaga: if the values are one-item lists:
df['From'] = df['From'].str[0]
You can check which case applies with:
print (type(df.loc[0, 'From']))
<class 'list'>
print (df['From'].str.len().eq(1).all())
True
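The list case can be sketched the same way. The sample frame below is hypothetical; it assumes every cell in `From` is a Python list, and it only unwraps the lists after confirming that each one holds exactly one element.

```python
import pandas as pd

# Hypothetical sample data: each cell in 'From' is a one-item list.
df = pd.DataFrame({
    "From": [["wrestle"], ["write"], ["spell"]],
    "TO": ["engage in a wrestling match", "publish", "write"],
})

# .str.len() also works on lists and returns the number of elements.
all_single = bool(df["From"].str.len().eq(1).all())

if all_single:
    # .str[0] picks the first element of each list.
    df["From"] = df["From"].str[0]

unwrapped = df["From"].tolist()
print(all_single, unwrapped)
```

If the check fails, `.str[0]` would silently drop the extra elements, so it is worth running the check first.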
Suppose you have the following DataFrame:
df = pd.DataFrame({'Region':['New York','Los Angeles','Chicago'], 'State': ['NY [new york]', '[California]', 'IL']})
It looks like this:
Region State
0 New York NY [new york]
1 Los Angeles [California]
2 Chicago IL
To remove only the square brackets, you need the following lines (regex=True is required in pandas 2.0 and later, where str.replace treats the pattern as a literal string by default):
df['State'] = df['State'].str.replace(r"\[", "", regex=True)
df['State'] = df['State'].str.replace(r"\]", "", regex=True)
The result is:
Region State
0 New York NY new york
1 Los Angeles California
2 Chicago IL
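The two bracket replacements can also be combined into one pass. The sketch below is a variation rather than the answer's exact code: it uses a character class that matches either bracket and passes `regex=True` explicitly, since pandas 2.0 and later treat the pattern as a literal string by default.

```python
import pandas as pd

df = pd.DataFrame({
    "Region": ["New York", "Los Angeles", "Chicago"],
    "State": ["NY [new york]", "[California]", "IL"],
})

# The character class [\[\]] matches a single '[' or ']' anywhere in the string.
df["State"] = df["State"].str.replace(r"[\[\]]", "", regex=True)

states = df["State"].tolist()
print(states)
```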
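If you want to remove the square brackets together with everything between them:
df['State'] = df['State'].str.replace(r" \[.*\]", "", regex=True)
df['State'] = df['State'].str.replace(r"\[.*\]", "", regex=True)
The first line removes a bracketed group together with the space in front of it; the second removes any bracketed group that has no leading space. To be safe, run both lines, in this order, so that no stray space is left behind.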
Applying these two lines to the original df gives:
Region State
0 New York NY
1 Los Angeles
2 Chicago IL
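As an alternative sketch (not the answer's own code), a single pattern can handle both the bracketed text and any whitespace in front of it. The pattern below is an assumption about what "everything between the brackets" should mean: it matches each bracket pair separately by excluding `]` from the inner text, which behaves differently from the greedy `.*` when a value contains several bracketed groups.

```python
import pandas as pd

df = pd.DataFrame({
    "Region": ["New York", "Los Angeles", "Chicago"],
    "State": ["NY [new york]", "[California]", "IL"],
})

# \s*       optional whitespace before the bracket
# \[[^\]]*\] an opening bracket, any characters except ']', a closing bracket
pattern = r"\s*\[[^\]]*\]"
df["State"] = df["State"].str.replace(pattern, "", regex=True).str.strip()

states_without_groups = df["State"].tolist()
print(states_without_groups)
```

A value that consisted only of a bracketed group, such as `[California]`, ends up as an empty string, matching the result shown above.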
If all of the lists contain exactly one value, you can also use df.From.str[0].
@juanpa.arrivillaga - Thanks for the suggestion.
@jezrael I have a question: can df['From'] = df['From'].str.strip('[]') be applied to the whole DataFrame at once, without going column by column?
@1muflon1 - Yes, use
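The reply above is cut off in the source, so the commenter's actual suggestion is not known. As one possible approach, and purely as an assumption, the sketch below picks out the string-valued columns and applies `str.strip('[]')` to each of them through `DataFrame.apply`. The sample data and the numeric column `n` are hypothetical.

```python
import pandas as pd

# Hypothetical frame with two text columns and one numeric column.
df = pd.DataFrame({
    "From": ["[wrestle]", "[write]"],
    "TO": ["[publish]", "write"],
    "n": [1, 2],
})

# Select only columns that hold strings; the numeric column is left untouched.
text_cols = [c for c in df.columns if pd.api.types.is_string_dtype(df[c])]

# DataFrame.apply passes each selected column (a Series) to the function.
df[text_cols] = df[text_cols].apply(lambda col: col.str.strip("[]"))

print(df)
```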