Python: counting the number of consecutive missing values
I am trying to find a way to count how many values have been randomly removed from a DataFrame, and how many of them were removed directly one after another. The code I have so far is:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
# Sample data
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
df = pd.DataFrame({'col_1': y, 'col_2': x})
drop_indices = np.random.choice(df.index, 5, replace=False)
df_subset = df.drop(drop_indices)
print(df_subset)
print(df)
This randomly drops 5 rows from the DataFrame and gives the following output:
col_1 col_2
0 1 1
1 2 2
2 3 3
5 6 6
8 9 9
col_1 col_2
0 1 1
1 2 2
2 3 3
3 4 4
4 5 5
5 6 6
6 7 7
7 8 8
8 9 9
9 10 10
I would like to turn this into the following DataFrame:
   col_1  col_2  col_2  N_removedvalues  N_consecutive
0      1      1      1                0              0
1      2      2      2                0              0
2      3      3      3                0              0
3      4      4                       1              1
4      5      5                       2              2
5      6      6      6                2              0
6      7      7                       3              1
7      8      8                       4              2
8      9      9      9                4              0
9     10     10                       5              1
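# Left-merge so that rows missing from df_subset show up as NaN in col_2
res = df.merge(df_subset, on='col_1', suffixes=('_1', ''), how='left')
# Running count of removed rows, carried forward over the rows that were kept
res["N_removedvalues"] = np.where(res['col_2'].isna(), res.groupby(res['col_2'].isna()).cumcount().add(1), np.nan)
res["N_removedvalues"] = res["N_removedvalues"].ffill().fillna(0)
# Flag the first row of each run of removed rows (cast to float so NaN can be stored)
res['N_consecutive'] = np.logical_and(res['col_2'].isna(), np.logical_or(~res['col_2'].shift().isna(), res.index == res.index[0])).astype(float)
res.loc[np.logical_and(res['N_consecutive'] == 0, res['col_2'].isna()), 'N_consecutive'] = np.nan
# Number the runs, then count the position of each removed row within its run
res['N_consecutive'] = res.groupby('N_consecutive')['N_consecutive'].cumsum().ffill()
res.loc[res['N_consecutive'].gt(0), 'N_consecutive'] = res.loc[res['N_consecutive'].gt(0)].groupby('N_consecutive').cumcount().add(1)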
Output:
   col_1  col_2_1  col_2  N_removedvalues  N_consecutive
0      1        1    1.0              0.0            0.0
1      2        2    2.0              0.0            0.0
2      3        3    NaN              1.0            1.0
3      4        4    4.0              1.0            0.0
4      5        5    NaN              2.0            1.0
5      6        6    NaN              3.0            2.0
6      7        7    7.0              3.0            0.0
7      8        8    8.0              3.0            0.0
8      9        9    NaN              4.0            1.0
9     10       10    NaN              5.0            2.0
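For comparison, the same two counters can probably be obtained more compactly from a boolean mask and cumulative sums. The following is only a sketch and not the method used in the answer above. It assumes that the index of df_subset still matches the index of df, and it uses fixed drop positions (chosen here to reproduce the first printout in the question) instead of random ones, so that the result is repeatable. The names removed and block are illustrative.

```python
import pandas as pd

df = pd.DataFrame({"col_1": range(1, 11), "col_2": range(1, 11)})

# Assumption: fixed drop positions instead of np.random.choice, so the
# result is repeatable. They match the first printout in the question.
drop_indices = [3, 4, 6, 7, 9]
df_subset = df.drop(drop_indices)

# 1 where a row of df is absent from df_subset, 0 where it was kept
removed = pd.Series(~df.index.isin(df_subset.index), index=df.index).astype(int)

res = df.copy()
# Running total of removed rows seen so far
res["N_removedvalues"] = removed.cumsum()
# Every kept row opens a new block, so a run of removed rows shares the
# block id of the kept row just before it
block = (1 - removed).cumsum()
# Cumulative count of removed rows inside each block = position in the run
res["N_consecutive"] = removed.groupby(block).cumsum()

print(res)
```

Because every kept row starts a new block, the counter in N_consecutive resets to 0 on each kept row and counts 1, 2, ... along each run of removed rows, while N_removedvalues keeps the overall total.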