Python: replace whitespace with NaN in a subset of DataFrame columns
For the following DataFrame:
id words A B C D E
1 new a 1 1
2 good v 1
3 star c 1
4 never
5 final
I tried to replace the whitespace with NaN using the following code:
df1.loc[:, ["A", "B", "C", "E", "D"]].replace(r'\s+', np.nan, regex=True, inplace=True)
But it did not work. I also tried:
df1[["A", "B", "C", "E", "D"]].replace(r'\s+', np.nan, regex=True, inplace=True)
That did not work either.
However, the following code did work:
df1.A.replace(r'\s+', np.nan, regex=True, inplace=True)
df1.B.replace(r'\s+', np.nan, regex=True, inplace=True)
df1.C.replace(r'\s+', np.nan, regex=True, inplace=True)
df1.D.replace(r'\s+', np.nan, regex=True, inplace=True)
df1.E.replace(r'\s+', np.nan, regex=True, inplace=True)
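Does anyone know why? Thanks.

When you select a list of columns from a DataFrame, the returned object is a copy. If you call a method on that copy, the inplace argument acts on the copy, not on the original DataFrame.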
df1.loc[:, ["A", "B", "C", "E", "D"]].replace(r'\s+', np.nan, regex=True, inplace=True)
This line really does modify a DataFrame, but because that DataFrame is never assigned to a name, you cannot see the result.
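A minimal sketch of that behaviour, assuming a small made-up frame (the names demo and subset and their values are hypothetical, chosen only for illustration):

```python
import numpy as np
import pandas as pd

# Hypothetical two-column frame; each column holds one whitespace-only cell.
demo = pd.DataFrame({"A": ["a", " "], "B": [" ", "b"]})

# Selecting with a list of columns builds a new DataFrame.
subset = demo.loc[:, ["A", "B"]]
subset.replace(r"\s+", np.nan, regex=True, inplace=True)

subset_changed = bool(subset.isna().any().any())   # the copy now contains NaN
original_changed = bool(demo.isna().any().any())   # the original is untouched
print(subset_changed, original_changed)
```

Keeping a name for the selection, as here, makes the modified copy visible; in the one-line version from the question that copy is discarded immediately.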
Using a sample DataFrame:
import numpy as np
import pandas as pd

df = pd.DataFrame()
df['words'] = ['x', 'y', 'z', 't']
df['A'] = [1, 1, '', '']
df['B'] = ['', '', '', '']
df['C'] = [1, '', 1, '']
df['D'] = ['', ' ', ' ', ' ']
df['E'] = [' ', ' ', '', '']
df
Out:
  words  A  B  C  D  E
0     x  1     1
1     y  1
2     z        1
3     t
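You need to assign the result back:
cols = ["A", "B", "C", "E", "D"]
df.loc[:, cols] = df.loc[:, cols].replace(r'\s+', np.nan, regex=True)
Note that \s+ only matches cells containing one or more whitespace characters, so empty strings are left alone. To replace empty strings as well, use an anchored pattern; an unanchored \s* would match every string cell, because it can match zero characters:
df.loc[:, cols] = df.loc[:, cols].replace(r'^\s*$', np.nan, regex=True)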
df
Out:
words A B C D E
0 x 1 NaN 1 NaN NaN
1 y 1 NaN NaN NaN NaN
2 z NaN NaN 1 NaN NaN
3 t NaN NaN NaN NaN NaN
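One caveat is worth a quick check. As far as I can tell, when the replacement value is not a string, pandas swaps out the whole cell as soon as the pattern matches anywhere inside it. A pattern anchored as ^\s*$ therefore only touches cells that are empty or whitespace-only. The sketch below assumes a hypothetical frame named sample whose column A contains ordinary text with an inner space:

```python
import numpy as np
import pandas as pd

# Hypothetical data, not taken from the question.
sample = pd.DataFrame({
    "words": ["new", "good"],
    "A": ["a b", " "],   # text with an inner space, then a whitespace-only cell
    "B": ["", "v"],      # empty string, then ordinary text
})
cols = ["A", "B"]

cleaned = sample.copy()
# Assign the result back; the anchors restrict matches to blank cells.
cleaned[cols] = sample[cols].replace(r"^\s*$", np.nan, regex=True)

kept = cleaned.loc[0, "A"]                          # "a b" survives
n_missing = int(cleaned[cols].isna().sum().sum())   # the two blank cells
print(kept, n_missing)
```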
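
The answer given by @ayhan is much better, but I think this is a quick and dirty way to replace a bunch of blanks with NaN (it matches only cells that are exactly the empty string):
df1.replace("", np.nan, inplace=True)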
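If single-space cells should be caught as well, one possible extension (my assumption, not part of the answer above) is to pass a list of literal values as to_replace. The frame quick below is hypothetical:

```python
import numpy as np
import pandas as pd

# Hypothetical frame mixing empty, single-space and double-space cells.
quick = pd.DataFrame({"A": ["a", "", " "], "B": ["  ", "b", ""]})

# Each listed literal is matched against the whole cell, without regex.
result = quick.replace(["", " "], np.nan)

quick_missing = int(result.isna().sum().sum())
print(quick_missing)
```

Only exact matches are replaced, so the double-space cell in B is left alone; cells with arbitrary amounts of whitespace still call for the regex approach.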

Comments:
You are modifying a copy. df1[["ADR", "WD", "EF", "INF", "SS"]] and df1.loc[:, ["ADR", "WD", "EF", "INF", "SS"]] return copies, so the inplace argument does nothing useful on them. It is better to assign the result back.
@ayhan, thanks. How do I do that?
The column names in the sample code do not match the column names in the sample data.
@ChuHo, thanks. I edited it.