Python: efficiently filling gaps in a DataFrame
My DataFrame looks like this:
Time Conc. Flux
0 0.220000 0.000000e+00 NaN
1 0.536800 0.000000e+00 NaN
2 0.992992 0.000000e+00 NaN
3 1.000000 0.000000e+00 -0.009888
4 1.220000 0.000000e+00 NaN
5 1.536800 0.000000e+00 NaN
6 1.992992 0.000000e+00 NaN
7 2.649909 0.000000e+00 NaN
8 3.595869 0.000000e+00 NaN
9 4.958052 0.000000e+00 NaN
10 6.919595 0.000000e+00 NaN
11 9.744217 0.000000e+00 NaN
12 13.811673 0.000000e+00 NaN
13 19.668812 0.000000e+00 NaN
14 28.103090 0.000000e+00 NaN
15 31.000000 0.000000e+00 -0.009729
16 31.220001 0.000000e+00 NaN
17 31.536800 0.000000e+00 NaN
18 31.992992 0.000000e+00 NaN
19 32.649910 0.000000e+00 NaN
20 33.595871 0.000000e+00 NaN
21 34.958054 0.000000e+00 NaN
22 36.919594 0.000000e+00 NaN
23 39.744217 0.000000e+00 NaN
24 43.811672 0.000000e+00 NaN
25 49.668808 0.000000e+00 NaN
26 58.103088 0.000000e+00 NaN
27 61.000000 0.000000e+00 -0.009751
. . . .
. . . .
. . . .
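I am trying to replace each NaN with the number that follows it, in an efficient (fast) way. The Flux column is essentially a step function that stays constant until the next value. I have the fluxes on their own in a separate DataFrame: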
Time [day] Flux
0 1.0 -0.009888
1 31.0 -0.009729
2 61.0 -0.009751
3 91.0 -0.009727
4 121.0 -0.009723
5 151.0 -0.016197
6 181.0 -0.015375
7 211.0 -0.014224
8 241.0 -0.019393
9 271.0 -0.012164
. . .
. . .
. . .
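I tried using nested loops over the two DataFrames and overwriting the NaNs one by one, but that is very slow. I have about 100 DataFrames, each with about 4000 rows.
Note: the concentration is not necessarily zero, and the spacing between the numbers that appear among the NaNs may vary.
Try bfill: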
df=df.bfill()
df
Out[96]:
Time Conc. Flux
0 0.220000 0.0 -0.009888
1 0.536800 0.0 -0.009888
2 0.992992 0.0 -0.009888
3 1.000000 0.0 -0.009888
4 1.220000 0.0 -0.009729
5 1.536800 0.0 -0.009729
6 1.992992 0.0 -0.009729
7 2.649909 0.0 -0.009729
8 3.595869 0.0 -0.009729
9 4.958052 0.0 -0.009729
10 6.919595 0.0 -0.009729
11 9.744217 0.0 -0.009729
12 13.811673 0.0 -0.009729
13 19.668812 0.0 -0.009729
14 28.103090 0.0 -0.009729
15 31.000000 0.0 -0.009729
16 31.220001 0.0 -0.009751
17 31.536800 0.0 -0.009751
18 31.992992 0.0 -0.009751
19 32.649910 0.0 -0.009751
20 33.595871 0.0 -0.009751
21 34.958054 0.0 -0.009751
22 36.919594 0.0 -0.009751
23 39.744217 0.0 -0.009751
24 43.811672 0.0 -0.009751
25 49.668808 0.0 -0.009751
26 58.103088 0.0 -0.009751
27 61.000000 0.0 -0.009751
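Note that `df.bfill()` fills NaNs in every column. If only Flux should be filled (the question says Conc. is not always zero), a minimal sketch that restricts the backward fill to that one column could look like the following. The small sample frame is an assumption, built from the first rows shown in the question:

```python
import numpy as np
import pandas as pd

# Small sample in the shape of the question's data (values copied from its first rows).
df = pd.DataFrame({
    "Time": [0.22, 0.5368, 0.992992, 1.0, 1.22, 1.5368, 31.0],
    "Conc.": [0.0] * 7,
    "Flux": [np.nan, np.nan, np.nan, -0.009888, np.nan, np.nan, -0.009729],
})

# Backward-fill only the Flux column; NaNs in any other column are left untouched.
df["Flux"] = df["Flux"].bfill()
print(df)
```

Rows after the last known flux value have nothing to fill from and would stay NaN.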
The columns were originally numpy arrays; I converted them to pandas DataFrames.
@Bob yw, happy coding
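Since the columns started out as numpy arrays, the same backward-looking lookup can probably be done without pandas at all. The sketch below is one possible approach, not the method from the answer: it assumes the flux times are sorted in ascending order, and the array names (`time`, `flux_time`, `flux_value`) and the final sample time of 70.0 are hypothetical, chosen only for illustration.

```python
import numpy as np

# Sample times from the question, plus one hypothetical time (70.0)
# that lies past the last flux entry used here.
time = np.array([0.22, 0.5368, 0.992992, 1.0, 1.22, 28.10309,
                 31.0, 31.220001, 61.0, 70.0])

# The separate flux table (first three rows), assumed sorted by time.
flux_time = np.array([1.0, 31.0, 61.0])
flux_value = np.array([-0.009888, -0.009729, -0.009751])

# For each time, find the index of the first flux time >= that time.
# side="left" means an exact match maps to its own flux value.
idx = np.searchsorted(flux_time, time, side="left")

# Times after the last flux time have no following value; keep them as NaN.
filled = np.full(time.shape, np.nan)
valid = idx < len(flux_time)
filled[valid] = flux_value[idx[valid]]
print(filled)
```

This gives the same values as the backward fill and does a single vectorized lookup per array, which should matter little at 4000 rows but avoids the nested loops entirely.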