Python: How do I iterate over the rows of a pandas DataFrame to build a dictionary of members and parents?
I am trying to loop over every row of a pandas DataFrame to build a dictionary that maps each member to its parent. Every value in the DataFrame is just a member. If a member has no parent, its parent becomes 'none'. For example:
import pandas as pd

df = pd.DataFrame({'level 5': {0: 'a', 1: 'b', 2: 'c', 3: 'd', 4: 'e', 5: 'f', 6: 'g', 7: 'h', 8: 'i'},
                   'level 4': {0: 'g1', 1: 'g1', 2: 'g1', 3: 'g1', 4: 'g1', 5: 'g1', 6: 'g2', 7: 'g2', 8: 'g3'},
                   'level 3': {0: 'g4', 1: 'g4', 2: 'g4', 3: 'g4', 4: 'g4', 5: 'g4', 6: 'g4', 7: 'g4', 8: 'g6'},
                   'level 2': {0: 'g4', 1: 'g4', 2: 'g4', 3: 'g4', 4: 'g4', 5: 'g4', 6: 'g4', 7: 'g4', 8: 'g4'},
                   'level 1': {0: 'g5', 1: 'g5', 2: 'g5', 3: 'g5', 4: 'g5', 5: 'g5', 6: 'g5', 7: 'g5', 8: 'g5'}})
which looks like this:
  level 5 level 4 level 3 level 2 level 1
0       a      g1      g4      g4      g5
1       b      g1      g4      g4      g5
2       c      g1      g4      g4      g5
3       d      g1      g4      g4      g5
4       e      g1      g4      g4      g5
5       f      g1      g4      g4      g5
6       g      g2      g4      g4      g5
7       h      g2      g4      g4      g5
8       i      g3      g6      g4      g5
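Note that in every row except the last, level 3 and level 2 hold the same value, so g4 appears twice in a row.
I want to build a dictionary that looks like this: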
output = {'a': 'g1', 'g1': 'g4', 'g4': 'g5', 'g5': 'none', 'b': 'g1', 'c': 'g1', 'd': 'g1', 'e': 'g1', 'f': 'g1', 'g': 'g2', 'g2': 'g4', 'h': 'g2', 'i': 'g3', 'g3': 'g6', 'g6': 'g4'}
I applied a function to each row of df, but I could not make it handle the ragged hierarchy.
Any help would be greatly appreciated.

One approach:
cols = df.columns
my_dict = {}
# Pair each column with the column to its right (child level, parent level).
for key, value in zip(cols[:-1], cols[1:]):
    my_dict.update(dict(zip(df[key], df[value])))
print(my_dict)
{'a': 'g1',
'b': 'g1',
'c': 'g1',
'd': 'g1',
'e': 'g1',
'f': 'g1',
'g': 'g2',
'h': 'g2',
'i': 'g3',
'g1': 'g4',
'g2': 'g4',
'g3': 'g6',
'g4': 'g5',
'g6': 'g4'}
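One detail worth noting about the loop above: the pass from level 3 to level 2 briefly records g4 as its own parent, and the result is correct only because the later pass from level 2 to level 1 overwrites that entry. The following is a minimal sketch, assuming a member should never be recorded as its own parent, that filters such pairs out explicitly. The names `parents`, `child_col`, and `parent_col` are illustrative and not part of the original answer.

```python
import pandas as pd

# Same data as in the question, written more compactly.
df = pd.DataFrame({
    'level 5': list('abcdefghi'),
    'level 4': ['g1'] * 6 + ['g2', 'g2', 'g3'],
    'level 3': ['g4'] * 8 + ['g6'],
    'level 2': ['g4'] * 9,
    'level 1': ['g5'] * 9,
})

parents = {}
cols = df.columns
for child_col, parent_col in zip(cols[:-1], cols[1:]):
    for child, parent in zip(df[child_col], df[parent_col]):
        if child != parent:  # skip repeated levels such as g4 -> g4
            parents[child] = parent

print(parents)
```

With the sample data this yields the same 14 entries as the loop above, but it no longer depends on a later column pass to correct a self-reference.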
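If you want the 'none' value, you can add this after the loop:
# After the loop, `value` still names the last column ('level 1').
my_dict.update(dict(zip(df[value], ['none']*len(df))))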
print(my_dict)
{'a': 'g1', 'b': 'g1', 'c': 'g1', 'd': 'g1', 'e': 'g1',
'f': 'g1', 'g': 'g2', 'h': 'g2', 'i': 'g3', 'g1': 'g4', 'g2': 'g4',
'g3': 'g6', 'g4': 'g5', 'g6': 'g4', 'g5': 'none'}
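Since the question asks about going row by row, here is a row-wise sketch as an alternative to the column-pair approach. It assumes each row reads from the lowest level on the left to the top level on the right, and that a value repeated in adjacent columns means a skipped level. The function name `build_parent_map` and the parameter `root_parent` are hypothetical names chosen for this example.

```python
import pandas as pd

df = pd.DataFrame({
    'level 5': list('abcdefghi'),
    'level 4': ['g1'] * 6 + ['g2', 'g2', 'g3'],
    'level 3': ['g4'] * 8 + ['g6'],
    'level 2': ['g4'] * 9,
    'level 1': ['g5'] * 9,
})

def build_parent_map(frame, root_parent='none'):
    """Map each member to its parent, reading every row left to right."""
    parent_map = {}
    for row in frame.itertuples(index=False, name=None):
        # Collapse consecutive repeats so that g4, g4 counts as one level.
        chain = [m for i, m in enumerate(row) if i == 0 or m != row[i - 1]]
        # Each member's parent is the next member in the collapsed chain.
        for child, parent in zip(chain, chain[1:]):
            parent_map[child] = parent
        # The last member of the chain has no parent in this row.
        parent_map.setdefault(chain[-1], root_parent)
    return parent_map

output = build_parent_map(df)
print(output)
```

On the sample data this produces the dictionary requested in the question, including 'g5': 'none'. Using `setdefault` for the top member means a parent found in another row is never replaced by 'none'.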