Python Pandas: fill missing column values using the corresponding id column value


python json pandas


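I want to fill missing values in the name column using the corresponding code key from a JSON file. With the code below, I get TypeError: 'list' object is not callable. Here is my code for reading the data and filling the missing values: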

import json
import pandas as pd

with open('world_bank_projects.json') as f:
    data = json.load(f)

themecodes = pd.json_normalize(data, 'mjtheme_namecode')
# Build a code -> name dictionary, then fill rows whose name is null
d = themecodes.sort_values('name', na_position='last').set_index('code')['name'].to_dict()
themecodes.loc[themecodes['name'].isnull(), 'name'] = themecodes['code'].map(d)
themecodes.head(20)
    code                                          name
0      8                             Human development
1     11
2      1                           Economic management
3      6         Social protection and risk management
4      5                         Trade and integration
5      2                      Public sector governance
6     11  Environment and natural resources management
7      6         Social protection and risk management
8      7                   Social dev/gender/inclusion
9      7                   Social dev/gender/inclusion
10     5                         Trade and integration
11     4      Financial and private sector development
12     6         Social protection and risk management
13     6
14     2                      Public sector governance
15     4      Financial and private sector development
16    11  Environment and natural resources management
17     8
18    10                             Rural development
19     7

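I think that if the empty values are None or NaN, you need to fill each missing name from the other rows that share its code; the groupby and transform snippet further down, shown with its output, does this.

If you instead need to replace empty strings or whitespace-only strings: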

d = themecodes.sort_values('name', na_position='first').set_index('code')['name'].to_dict()
themecodes.loc[themecodes['name'].str.strip() == '', 'name'] = themecodes['code'].map(d)
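The difference between the two masks can be seen on a small example. The following is only a sketch: the four sample rows are made up for illustration and are not read from world_bank_projects.json, and it assumes the blanks are empty strings rather than NaN.

```python
import pandas as pd

# Hypothetical sample rows; blanks are empty strings, not NaN
themecodes = pd.DataFrame({
    'code': ['8', '11', '11', '8'],
    'name': ['Human development', '',
             'Environment and natural resources management', ''],
})

# isnull() matches nothing here, because '' is an ordinary string value
print(themecodes['name'].isnull().sum())  # 0

# Mask the rows whose name is empty or whitespace-only
blank = themecodes['name'].str.strip() == ''

# Empty strings sort before real names, so for each code the real name
# is the last one written into the dictionary and wins
d = (themecodes.sort_values('name', na_position='first')
               .set_index('code')['name']
               .to_dict())

themecodes.loc[blank, 'name'] = themecodes['code'].map(d)
print(themecodes)
```

Sorting matters because `to_dict()` keeps the last value seen for a duplicated key; if the blanks came last, they would overwrite the real names in `d`.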

d holds the sorted list of codes and names, but using map does not fill the missing name values.

@Mr.Jibz - It is a dictionary. Can you explain why it does not work?

I think using d as a sorted dictionary is clean; ideally, I would fill the missing values in the name column with the value for the corresponding code key in d.

It looks like your empty values are not NaN or None, so I added a solution for that case.
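Which solution applies depends on what the blanks actually are. One way to check, sketched here on a hypothetical Series rather than the real data, is to count NaN/None values and blank strings separately. Converting the blanks to NaN with `mask` is an optional extra step, not something from the original answer.

```python
import pandas as pd

# Hypothetical values: one real name, one empty string, one None, one whitespace string
names = pd.Series(['Human development', '', None, '  '])

# True only for None/NaN
is_nan = names.isnull()

# True only for empty or whitespace-only strings (NaN compares as False)
is_blank = names.str.strip() == ''

print(is_nan.sum(), is_blank.sum())

# Optional: turn blanks into NaN so that NaN-based fills also cover them
normalized = names.mask(is_blank)
print(normalized.isnull().sum())
```

If `is_nan` is all False and `is_blank` is not, an `isnull()` mask will never select the rows that need filling.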

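# If the empty values are None or NaN: sort so that NaN comes last,
# then fill each code group with the first (non-null) name in that group
themecodes['name'] = (themecodes.sort_values('name', na_position='last')
                                .groupby('code')['name']
                                .transform(lambda x: x.fillna(x.iat[0]))
                                .sort_index())

print(themecodes)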
    code                                          name
0      8                             Human development
1     11  Environment and natural resources management
2      1                           Economic management
3      6         Social protection and risk management
4      5                         Trade and integration
5      2                      Public sector governance
6     11  Environment and natural resources management
7      6         Social protection and risk management
8      7                   Social dev/gender/inclusion
9      7                   Social dev/gender/inclusion
10     5                         Trade and integration
11     4      Financial and private sector development
12     6         Social protection and risk management
13     6         Social protection and risk management
14     2                      Public sector governance
15     4      Financial and private sector development
16    11  Environment and natural resources management
17     8                             Human development
18    10                             Rural development
19     7                   Social dev/gender/inclusion
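For NaN blanks, a lookup built with `map` is a possible alternative to groupby and transform. This is a sketch under assumptions: the sample rows are hypothetical, and the name `lookup` is introduced here for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample rows; blanks are NaN
themecodes = pd.DataFrame({
    'code': ['8', '11', '11', '8'],
    'name': ['Human development', np.nan,
             'Environment and natural resources management', np.nan],
})

# One name per code, built only from rows that actually have a name
lookup = (themecodes.dropna(subset=['name'])
                    .drop_duplicates('code')
                    .set_index('code')['name'])

# Fill only the missing names; existing names are left untouched
themecodes['name'] = themecodes['name'].fillna(themecodes['code'].map(lookup))
print(themecodes)
```

Dropping the NaN rows before building the lookup removes the dependence on sort order, since no missing value can end up in the mapping.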
