Python: counting the number of consecutive missing values in a row

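I am trying to find a way to count the number of values that were randomly removed from a DataFrame, as well as the number of values that were removed consecutively, one right after another.

The code I have so far is: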

import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

# Sample data
x=[1,2,3,4,5,6,7,8,9,10]
y=[1,2,3,4,5,6,7,8,9,10]

df = pd.DataFrame({'col_1':y,'col_2':x})

drop_indices = np.random.choice(df.index, 5,replace=False )
df_subset = df.drop(drop_indices)

print(df_subset)
print(df)
This randomly drops 5 rows from the DataFrame and produces the following output:

   col_1  col_2
0      1      1
1      2      2
2      3      3
5      6      6
8      9      9
   col_1  col_2
0      1      1
1      2      2
2      3      3
3      4      4
4      5      5
5      6      6
6      7      7
7      8      8
8      9      9
9     10     10
I would like to transform this into the following DataFrame:

   col_1  col_2  col_2  N_removedvalues  N_consecutive
0      1      1      1                0              0
1      2      2      2                0              0
2      3      3      3                0              0
3      4      4                       1              1
4      5      5                       2              2
5      6      6      6                2              0
6      7      7                       3              1
7      8      8                       4              2
8      9      9      9                4              0
9     10     10                       5              1
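# Left-join the subset onto the full frame; removed rows get NaN in col_2
res = df.merge(df_subset, on='col_1', suffixes=('_1', ''), how='left')
# Running count of removed rows
res["N_removedvalues"] = np.where(res['col_2'].isna(), res.groupby(res['col_2'].isna()).cumcount().add(1), np.nan)
res["N_removedvalues"] = res["N_removedvalues"].ffill().fillna(0)
# Flag the first row of each run of removed rows (cast to float so NaN can be assigned next)
res['N_consecutive'] = np.logical_and(res['col_2'].isna(), np.logical_or(~res['col_2'].shift().isna(), res.index == res.index[0])).astype(float)
# Later rows of a run become NaN, then inherit the run number through cumsum and ffill
res.loc[np.logical_and(res['N_consecutive'] == 0, res['col_2'].isna()), 'N_consecutive'] = np.nan
res['N_consecutive'] = res.groupby('N_consecutive')['N_consecutive'].cumsum().ffill()
# Number the rows within each run, starting at 1
res.loc[res['N_consecutive'].gt(0), 'N_consecutive'] = res.loc[res['N_consecutive'].gt(0)].groupby('N_consecutive').cumcount().add(1)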
Output:

   col_1  col_2_1  col_2  N_removedvalues  N_consecutive
0      1        1    1.0              0.0            0.0
1      2        2    2.0              0.0            0.0
2      3        3    NaN              1.0            1.0
3      4        4    4.0              1.0            0.0
4      5        5    NaN              2.0            1.0
5      6        6    NaN              3.0            2.0
6      7        7    7.0              3.0            0.0
7      8        8    8.0              3.0            0.0
8      9        9    NaN              4.0            1.0
9     10       10    NaN              5.0            2.0
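
For comparison, here is a shorter sketch of the same two counters. It is not the method used in the answer above. It works from a boolean mask instead of a merge. The names `removed`, `removed_int` and `run_id` are made up for illustration, and the dropped indices are fixed (matching the rows missing in the output above) instead of drawn at random, so the result is reproducible.

```python
import pandas as pd

df = pd.DataFrame({"col_1": range(1, 11), "col_2": range(1, 11)})

# Assumption: fixed indices instead of np.random.choice, for reproducibility
drop_indices = [2, 4, 5, 8, 9]
df_subset = df.drop(drop_indices)

# True where a row of df is absent from df_subset
removed = pd.Series(~df.index.isin(df_subset.index), index=df.index)
removed_int = removed.astype(int)

res = df.copy()
# Running total of removed rows
res["N_removedvalues"] = removed_int.cumsum()

# Every kept row starts a new group, so a cumulative sum inside each
# group counts the removed rows that directly follow that kept row
run_id = (~removed).cumsum()
res["N_consecutive"] = removed_int.groupby(run_id).cumsum()

print(res)
```

The key idea is that `(~removed).cumsum()` increases only on kept rows, so each run of removed rows shares a group label with the kept row before it, and the per-group cumulative sum resets to 0 at every kept row.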