Python 3.x: Adding a column to a DataFrame with pandas, conditional on NaN (tags: python-3.x, pandas)


I'm new to Python and still learning. I have a DataFrame with two columns, id and parent:

id   | parent
1    | A
2    | B
3    | C
4    | A
5    | A
6    | C
A    | NaN
B    | NaN
C    | NaN
The expected output is shown in the table below:

id   | parent | child
1    | A      | NaN
2    | B      | NaN
3    | C      | NaN
4    | A      | NaN
5    | A      | NaN
6    | C      | NaN
A    | NaN    | 1 ; 4 ; 5
B    | NaN    | 2 
C    | NaN    | 3 ; 6

I tried using the fillna() function on it, but did not get the expected result.

I think you should use groupby and merge here. Start from the original DataFrame:

print(df1)

  id parent
0  1      A
1  2      B
2  3      C
3  4      A
4  5      A
5  6      C
6  A    NaN
7  B    NaN
8  C    NaN
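The post does not show how df1 was built. A minimal sketch of one way to reproduce it, assuming the child ids are integers and the parent ids are strings in the same column (this construction is a reconstruction, not taken from the original post):

```python
import numpy as np
import pandas as pd

# Assumed reconstruction of df1: integer ids for the children,
# letter ids for the parents, and NaN where a row has no parent.
df1 = pd.DataFrame({
    "id": [1, 2, 3, 4, 5, 6, "A", "B", "C"],
    "parent": ["A", "B", "C", "A", "A", "C", np.nan, np.nan, np.nan],
})
print(df1)
```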
Then collect the children of each parent:

# Gather the ids that share each parent into a list (groupby skips NaN parents by default)
df2 = df1.groupby('parent').agg({'id': lambda x: x.tolist()}).reset_index()
print(df2)

  parent         id
0      A  [1, 4, 5]
1      B        [2]
2      C     [3, 6]
Finally, merge them:

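# Rename so the former parent labels can be matched against df1['id']
df2.columns = ['id', 'child']
# A left merge keeps every row of df1; ids that are not parents get NaN in 'child'
df3 = pd.merge(df1, df2, on='id', how='left')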
print(df3)
  id parent      child
0  1      A        NaN
1  2      B        NaN
2  3      C        NaN
3  4      A        NaN
4  5      A        NaN
5  6      C        NaN
6  A    NaN  [1, 4, 5]
7  B    NaN        [2]
8  C    NaN     [3, 6]
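
The question asks for the children as a single string such as `1 ; 4 ; 5`, while the merge above yields lists. A minimal sketch of one way to get the string form, assuming df1 has the shape printed earlier (the df1 construction here is a reconstruction, not from the original post). It swaps the merge step for `Series.map`, which looks up each id in the grouped result:

```python
import numpy as np
import pandas as pd

# Assumed reconstruction of df1 (mixed integer and string ids).
df1 = pd.DataFrame({
    "id": [1, 2, 3, 4, 5, 6, "A", "B", "C"],
    "parent": ["A", "B", "C", "A", "A", "C", np.nan, np.nan, np.nan],
})

# For each parent, join the ids of its children into one " ; "-separated string.
children = (
    df1.dropna(subset=["parent"])
       .groupby("parent")["id"]
       .agg(lambda ids: " ; ".join(str(i) for i in ids))
)

# Look up every id in the grouped result; ids that are not parents stay NaN.
df1["child"] = df1["id"].map(children)
print(df1)
```

Using `map` avoids renaming columns for the merge, at the cost of relying on the grouped Series having unique index labels, which holds here because groupby produces one row per parent.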