Is there a Python function to automatically fill null values in numeric and object columns?
What is an efficient way to automatically fill null values in both the categorical and the numeric columns of a DataFrame in Python? So far I have been using a function of my own to fill the nulls in categorical and numeric columns. Sample data:
import pandas as pd
import numpy as np
dictdf = {"Num1": [111, 222, 444, np.nan, 666, 222],
"Obj1": ["a", np.nan, "b", "c", "d", np.nan],
"Num2": [np.nan, 247, 321, np.nan, 654, 212],
"Obj2": ["cdb", np.nan, np.nan, "kbc", "kdd", "np"]}
df = pd.DataFrame(dictdf)
df
Output:
     Num1     Obj1   Num2     Obj2
0  111.0        a -999.0      cdb
1  222.0  Missing  247.0  Missing
2  444.0        b  321.0  Missing
3 -999.0        c -999.0      kbc
4  666.0        d  654.0      kdd
5  222.0  Missing  212.0       np
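I am looking for a more efficient and cleaner way to do this that works for both the sample and large datasets. Any suggestions or references would be appreciated.

You can build a dictionary that maps every column to its replacement value: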
d1 = dict.fromkeys(df.select_dtypes(np.number).columns, -999)
d2 = dict.fromkeys(df.columns.difference(d1.keys()), 'Missing')
#merge both dicts
d = {**d1, **d2}
df = df.fillna(d)
#or with one dict
#df = df.fillna(d1).fillna('Missing')
print (df)
     Num1     Obj1   Num2     Obj2
0  111.0        a -999.0      cdb
1  222.0  Missing  247.0  Missing
2  444.0        b  321.0  Missing
3 -999.0        c -999.0      kbc
4  666.0        d  654.0      kdd
5  222.0  Missing  212.0       np
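If you need this in more than one place, the dictionary approach can be wrapped in a small helper. The following is only a minimal sketch: the name fill_by_dtype and its parameters num_fill and obj_fill are hypothetical and not part of pandas, and it assumes every non-numeric column should receive the same placeholder.

```python
import numpy as np
import pandas as pd


def fill_by_dtype(df, num_fill=-999, obj_fill="Missing"):
    """Hypothetical helper: return a copy of df with NaNs filled per column type."""
    # Numeric columns get num_fill, every other column gets obj_fill
    num_cols = df.select_dtypes(include=np.number).columns
    other_cols = df.columns.difference(num_cols)
    fill_map = {**dict.fromkeys(num_cols, num_fill),
                **dict.fromkeys(other_cols, obj_fill)}
    # fillna accepts a {column: value} mapping and returns a new DataFrame
    return df.fillna(fill_map)


sample = pd.DataFrame({
    "Num1": [111, 222, 444, np.nan, 666, 222],
    "Obj1": ["a", np.nan, "b", "c", "d", np.nan],
    "Num2": [np.nan, 247, 321, np.nan, 654, 212],
    "Obj2": ["cdb", np.nan, np.nan, "kbc", "kdd", "np"],
})
filled = fill_by_dtype(sample)
print(filled)
```

Because fillna returns a new object here, sample itself is left unchanged, which makes it easy to compare before and after.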
Also, if you want to consider only the columns that actually contain missing values:
df1 = df.loc[:, df.isna().sum().gt(0)]
d1 = dict.fromkeys(df1.select_dtypes(np.number).columns, -999)
d2 = dict.fromkeys(df1.columns.difference(d1.keys()), 'Missing')
d = {**d1, **d2}
df = df.fillna(d)
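To see what restricting to columns with missing values changes, here is a small sketch on a frame that also has a complete column. The column Full is a made-up addition for illustration, and isna().any() is used as an equivalent way to write the isna().sum().gt(0) mask.

```python
import numpy as np
import pandas as pd

raw = pd.DataFrame({
    "Num1": [111, 222, 444, np.nan, 666, 222],
    "Obj1": ["a", np.nan, "b", "c", "d", np.nan],
    "Full": [1, 2, 3, 4, 5, 6],  # hypothetical column without any gaps
})

# Boolean mask: True for columns holding at least one NaN
has_na = raw.isna().any()
subset = raw.loc[:, has_na]

num_cols = subset.select_dtypes(include=np.number).columns
fill_map = {**dict.fromkeys(num_cols, -999),
            **dict.fromkeys(subset.columns.difference(num_cols), "Missing")}

result = raw.fillna(fill_map)
print(sorted(fill_map))  # only the columns that had NaNs appear as keys
```

The complete column never enters the dictionary, so fillna has nothing to do for it.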
An idea similar to your solution:
c = df.select_dtypes(np.number).columns
df[c] = df[c].fillna(-999)
df = df.fillna('Missing')
print (df)
     Num1     Obj1   Num2     Obj2
0  111.0        a -999.0      cdb
1  222.0  Missing  247.0  Missing
2  444.0        b  321.0  Missing
3 -999.0        c -999.0      kbc
4  666.0        d  654.0      kdd
5  222.0  Missing  212.0       np
Try calling fillna on each column:
df['Num1'] = df['Num1'].fillna(-999)
df['Num2'] = df['Num2'].fillna(-999)
df['Obj1'] = df['Obj1'].fillna("Missing")
df['Obj2'] = df['Obj2'].fillna("Missing")
First of all, thanks for the quick, simple and accurate answer! But what if I have 100 columns?
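For the 100-column case, selecting columns by dtype avoids naming any of them. This is only a sketch on synthetic data: the column names num_0 ... num_49 and obj_0 ... obj_49, the row count and the 20% missing rate are all assumptions made up for the demonstration.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n_rows = 20

# Hypothetical wide frame: 50 numeric and 50 object columns with NaNs sprinkled in
data = {}
for i in range(50):
    nums = rng.normal(size=n_rows)
    nums[rng.random(n_rows) < 0.2] = np.nan
    data[f"num_{i}"] = nums

    objs = rng.choice(["x", "y", "z"], size=n_rows).astype(object)
    objs[rng.random(n_rows) < 0.2] = np.nan
    data[f"obj_{i}"] = objs

wide = pd.DataFrame(data)

# Same three lines as for the four-column sample; no column is named explicitly
num_cols = wide.select_dtypes(include=np.number).columns
wide[num_cols] = wide[num_cols].fillna(-999)
wide = wide.fillna("Missing")

print(wide.shape, int(wide.isna().sum().sum()))
```

The filling code is the same regardless of how many columns the frame has, so it does not grow with the column count the way one fillna line per column does.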