Python: How do I iterate over the rows of a pandas DataFrame to build a dictionary of members and parents?

python pandas

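I am trying to loop over every row of a pandas DataFrame to build a dictionary that maps each member to its parent.

Every value in the DataFrame is simply a member. If a member has no parent, its parent becomes 'none'.

For example: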

import pandas as pd

df = pd.DataFrame({'level 5': {0: 'a', 1: 'b', 2: 'c', 3: 'd', 4: 'e', 5: 'f', 6: 'g', 7: 'h', 8: 'i'},
                   'level 4': {0: 'g1', 1: 'g1', 2: 'g1', 3: 'g1', 4: 'g1', 5: 'g1', 6: 'g2', 7: 'g2', 8: 'g3'},
                   'level 3': {0: 'g4', 1: 'g4', 2: 'g4', 3: 'g4', 4: 'g4', 5: 'g4', 6: 'g4', 7: 'g4', 8: 'g6'},
                   'level 2': {0: 'g4', 1: 'g4', 2: 'g4', 3: 'g4', 4: 'g4', 5: 'g4', 6: 'g4', 7: 'g4', 8: 'g4'},
                   'level 1': {0: 'g5', 1: 'g5', 2: 'g5', 3: 'g5', 4: 'g5', 5: 'g5', 6: 'g5', 7: 'g5', 8: 'g5'}})

which looks like this:

  level 5 level 4 level 3 level 2 level 1
0       a      g1      g4      g4      g5
1       b      g1      g4      g4      g5
2       c      g1      g4      g4      g5
3       d      g1      g4      g4      g5
4       e      g1      g4      g4      g5
5       f      g1      g4      g4      g5
6       g      g2      g4      g4      g5
7       h      g2      g4      g4      g5
8       i      g3      g6      g4      g5

Note that in every row except the last, level 3 and level 2 hold two consecutive g4 values.

I want to build a dictionary that looks like this:

output = {'a': 'g1', 'g1': 'g4', 'g4': 'g5', 'g5': 'none', 'b': 'g1', 'c': 'g1', 'd': 'g1', 'e': 'g1', 'f': 'g1', 'g': 'g2', 'g2': 'g4', 'h': 'g2', 'i': 'g3', 'g3': 'g6', 'g6': 'g4'}

I tried applying a function to each row of df, but I could not make it handle the ragged hierarchy.

Any help would be appreciated.

One approach:

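cols = df.columns
my_dict = {}
# Pair each column with the one to its right: (child level, parent level).
for key, value in zip(cols[:-1], cols[1:]):
    # Later pairs overwrite earlier ones, so 'g4': 'g4' ends up as 'g4': 'g5'.
    my_dict.update(dict(zip(df[key], df[value])))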

print(my_dict)

{'a': 'g1',
 'b': 'g1',
 'c': 'g1',
 'd': 'g1',
 'e': 'g1',
 'f': 'g1',
 'g': 'g2',
 'h': 'g2',
 'i': 'g3',
 'g1': 'g4',
 'g2': 'g4',
 'g3': 'g6',
 'g4': 'g5',
 'g6': 'g4'}
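
It may help to see why the repeated g4 in level 3 and level 2 does not leave 'g4' mapped to itself. The sketch below reruns the same loop on the question's data and keeps a snapshot after each column pair; the `snapshots` list is a name introduced here purely for illustration. It suggests that the self-mapping does appear briefly and is then overwritten by the last pair, so the result depends on the column order.

```python
import pandas as pd

# Same data as in the question, written more compactly.
df = pd.DataFrame({
    'level 5': list('abcdefghi'),
    'level 4': ['g1'] * 6 + ['g2', 'g2', 'g3'],
    'level 3': ['g4'] * 8 + ['g6'],
    'level 2': ['g4'] * 9,
    'level 1': ['g5'] * 9,
})

cols = df.columns
my_dict = {}
snapshots = []  # hypothetical helper: state of my_dict after each column pair
for key, value in zip(cols[:-1], cols[1:]):
    my_dict.update(zip(df[key], df[value]))
    snapshots.append(dict(my_dict))

# After the (level 3, level 2) pair, g4 is temporarily its own parent.
print(snapshots[2]['g4'])
# After the (level 2, level 1) pair, the entry has been overwritten.
print(snapshots[3]['g4'])
```

If the repeated value sat in the last two columns instead, nothing would overwrite it, which is worth keeping in mind before reusing the loop on other hierarchies.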

If you also want the 'none' value, you can add this at the end:

# Members of the last (top-level) column have no parent.
my_dict.update(dict.fromkeys(df[cols[-1]], 'none'))
print(my_dict)


{'a': 'g1', 'b': 'g1', 'c': 'g1', 'd': 'g1', 'e': 'g1',
 'f': 'g1', 'g': 'g2', 'h': 'g2', 'i': 'g3', 'g1': 'g4', 'g2': 'g4',
 'g3': 'g6', 'g4': 'g5', 'g6': 'g4', 'g5': 'none'}
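
Since the question asks about going row by row, here is a minimal row-wise sketch as an alternative. It is not the answer's method: the function name `build_parent_map` and the `root_parent` parameter are made up for illustration, and it assumes each row is ordered from child (left) to root (right) with no missing values. It collapses consecutive repeats inside a row, so the ragged hierarchy is handled without relying on overwrite order.

```python
import pandas as pd

# Same data as in the question, written more compactly.
df = pd.DataFrame({
    'level 5': list('abcdefghi'),
    'level 4': ['g1'] * 6 + ['g2', 'g2', 'g3'],
    'level 3': ['g4'] * 8 + ['g6'],
    'level 2': ['g4'] * 9,
    'level 1': ['g5'] * 9,
})


def build_parent_map(frame, root_parent='none'):
    """Map each member to its parent by walking every row from left to right."""
    parents = {}
    # name=None yields plain tuples, which avoids issues with spaces in column names.
    for row in frame.itertuples(index=False, name=None):
        # Drop consecutive repeats so a value such as g4, g4 is not its own parent.
        chain = [row[0]]
        for member in row[1:]:
            if member != chain[-1]:
                chain.append(member)
        # Each member's parent is the next entry in the collapsed chain.
        for child, parent in zip(chain, chain[1:]):
            parents.setdefault(child, parent)
        # The last entry of the chain has no parent.
        parents.setdefault(chain[-1], root_parent)
    return parents


parent_map = build_parent_map(df)
print(parent_map)
```

Using `setdefault` keeps the first parent seen for each member; if the data could assign conflicting parents to the same member, that case would need explicit handling.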
