Python: Fill NaN by adding consecutive numbers within groups in pandas

Tags: python, python-3.x, pandas

I have a DataFrame such as

Groups NAME Number
G1     A    1
G1     B    2
G1     D    NaN
G1     D    NaN 
G1     I    3
G1     H    NaN 
G2     E    1 
G2     E    1
G2     F    NaN
G2     J    2
G3     K    NaN
G3     L    1
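To reproduce the example, the frame above can be built as follows. This is a minimal sketch: the variable name `df` and the use of `np.nan` for the missing entries are assumptions, chosen to match the code used later in the thread.

```python
import numpy as np
import pandas as pd

# Rebuild the example table; NaN marks the numbers that still need to be assigned.
df = pd.DataFrame({
    "Groups": ["G1"] * 6 + ["G2"] * 4 + ["G3"] * 2,
    "NAME": list("ABDDIH") + list("EEFJ") + list("KL"),
    "Number": [1, 2, np.nan, np.nan, 3, np.nan,
               1, 1, np.nan, 2,
               np.nan, 1],
})
```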
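I want to fill the NaN values within each group by continuing the numbering.

For example, D in G1 gets the number 4, because 1, 2 and 3 already exist. Then H in G1 gets the number 5.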

In the end, I should get

Groups NAME Number
G1     A    1
G1     B    2
G1     D    4
G1     D    4
G1     I    3
G1     H    5
G2     E    1 
G2     E    1
G2     F    3
G2     J    2
G3     K    2
G3     L    1

Does anyone have an idea?

Here is one way, using pd.factorize()

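You can use groupby + ngroup to label the null rows with an increasing integer for each Groups/NAME pair. We then subtract the minimum ngroup within each group and add 1 (to determine how much to add), and then add the maximum number already present in the group.

We then fillna with this Series.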
s = df[df['Number'].isnull()].groupby(['Groups', 'NAME']).ngroup()
#2     0      #<- G1/D  (Series index is DataFrame index)
#3     0      #<- G1/D  
#5     1      #<- G1/H 
#8     2      #<- G2/F
#10    3      #<- G3/K

to_fill = (s - s.groupby(df['Groups']).transform('min') + 1
           + df.groupby('Groups')['Number'].transform('max'))
#0     NaN
#1     NaN
#2     4.0
#3     4.0
#4     NaN
#5     5.0
#6     NaN
#7     NaN
#8     3.0
#9     NaN
#10    2.0
#11    NaN

df['Number'] = df['Number'].fillna(to_fill).astype(int)  # the downcast= keyword is deprecated in recent pandas
#   Groups NAME  Number
#0      G1    A       1
#1      G1    B       2
#2      G1    D       4
#3      G1    D       4
#4      G1    I       3
#5      G1    H       5
#6      G2    E       1
#7      G2    E       1
#8      G2    F       3
#9      G2    J       2
#10     G3    K       2
#11     G3    L       1
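For reuse, the same steps can be wrapped in a function. This is a sketch under assumptions rather than part of the answer: the name `fill_by_ngroup` and its parameters are hypothetical, it uses `astype(int)` in place of the `downcast` keyword, and it treats a group with no existing numbers as starting from 0 (a case the question does not cover).

```python
import numpy as np
import pandas as pd

def fill_by_ngroup(df, group_col="Groups", name_col="NAME", value_col="Number"):
    """Hypothetical wrapper around the groupby + ngroup approach."""
    missing = df[value_col].isna()
    # Global counter over the (group, name) pairs that contain missing values.
    s = df[missing].groupby([group_col, name_col]).ngroup()
    # Re-base the counter so it starts at 1 inside every group.
    offset = s - s.groupby(df.loc[s.index, group_col]).transform("min") + 1
    # Largest number already present per group; 0 if the group has none (assumption).
    base = df.groupby(group_col)[value_col].transform("max").fillna(0)
    # Index alignment leaves NaN on rows that were not missing, so fillna skips them.
    return df[value_col].fillna(offset + base).astype(int)

df = pd.DataFrame({
    "Groups": ["G1"] * 6 + ["G2"] * 4 + ["G3"] * 2,
    "NAME": list("ABDDIH") + list("EEFJ") + list("KL"),
    "Number": [1, 2, np.nan, np.nan, 3, np.nan, 1, 1, np.nan, 2, np.nan, 1],
})
result = fill_by_ngroup(df)
print(result.tolist())  # [1, 2, 4, 4, 3, 5, 1, 1, 3, 2, 2, 1]
```

Because the default sort=True orders the keys by (Groups, NAME), the counter values within one group are contiguous, which is what makes subtracting the per-group minimum sufficient.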