Python: How to replace missing data in a DataFrame

python pandas dataframe

Suppose I have the following DataFrame:

import numpy as np
import pandas as pd

df = pd.DataFrame({'col1': [241, 123, 423], 'col2':[977, 78, np.nan], 'col3':[76, 432, np.nan], 'col4':[234, 321, 987]}, index=pd.date_range('2019-1-1', periods=3, freq="D")).rename_axis('Date')

which produces:

            col1   col2   col3  col4
Date                                
2019-01-01   241  977.0   76.0   234
2019-01-02   123   78.0  432.0   321
2019-01-03   423    NaN    NaN   987

and another DataFrame, or even a Series, that holds the values missing from col2 and col3. How can I replace the NaN values with the values from df2?

df2 = pd.DataFrame({'col2': 111, 'col3': 222}, index=[pd.to_datetime('2019-1-3')]).rename_axis('Date')

This looks like:

            col2  col3
Date                  
2019-01-03   111   222

The final DataFrame I want should look like this:

            col1   col2   col3  col4
Date                                
2019-01-01   241  977.0   76.0   234
2019-01-02   123   78.0  432.0   321
2019-01-03   423    111    222   987

We can use DataFrame.fillna. If you have a Series indexed by column name, such as the one obtained with df2.iloc[0], we can do:

my_serie=df2.iloc[0]
print(my_serie)
col2    111
col3    222
Name: 2019-01-03 00:00:00, dtype: int64

print(df.fillna(my_serie))
            col1   col2   col3  col4
Date                                
2019-01-01   241  977.0   76.0   234
2019-01-02   123   78.0  432.0   321
2019-01-03   423  111.0  222.0   987
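
fillna also accepts a DataFrame, in which case the fill values are aligned on both the row index and the column labels. The following is a minimal sketch that rebuilds df and df2 from the question and passes df2 directly; the name filled is only illustrative.

```python
import numpy as np
import pandas as pd

# Rebuild the two frames from the question.
df = pd.DataFrame(
    {"col1": [241, 123, 423],
     "col2": [977, 78, np.nan],
     "col3": [76, 432, np.nan],
     "col4": [234, 321, 987]},
    index=pd.date_range("2019-01-01", periods=3, freq="D"),
).rename_axis("Date")

df2 = pd.DataFrame(
    {"col2": 111, "col3": 222},
    index=[pd.to_datetime("2019-01-03")],
).rename_axis("Date")

# Passing a DataFrame aligns on index and columns, so only the
# NaN cells that have a matching (Date, column) in df2 are filled.
filled = df.fillna(df2)
print(filled)
```

Unlike the Series version, this only fills rows whose Date also appears in df2, which may be preferable when different rows need different replacement values.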

An alternative is combine_first:

df2.combine_first(df)
Out[8]: 
             col1   col2   col3   col4
Date                                  
2019-01-01  241.0  977.0   76.0  234.0
2019-01-02  123.0   78.0  432.0  321.0
2019-01-03  423.0  111.0  222.0  987.0
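
The direction of the call matters: the calling frame's non-null values take priority, and the argument only fills what is missing. With the data from the question both directions give the same values, so the sketch below uses a hypothetical df2_extra (not part of the original question) that also carries a row conflicting with df, to make the difference visible.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {"col1": [241, 123, 423],
     "col2": [977, 78, np.nan],
     "col3": [76, 432, np.nan],
     "col4": [234, 321, 987]},
    index=pd.date_range("2019-01-01", periods=3, freq="D"),
).rename_axis("Date")

# Hypothetical replacement frame: the 2019-01-02 row conflicts with
# values that already exist in df.
df2_extra = pd.DataFrame(
    {"col2": [999, 111], "col3": [888, 222]},
    index=pd.to_datetime(["2019-01-02", "2019-01-03"]),
).rename_axis("Date")

# df is the caller: existing values are kept, only NaN is filled.
keep_df = df.combine_first(df2_extra)

# df2_extra is the caller: its values win wherever it has data.
prefer_df2 = df2_extra.combine_first(df)

print(keep_df)
print(prefer_df2)
```

If the goal is strictly to fill NaN without touching existing data, calling combine_first on df is the safer direction.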

Or update, which modifies df in place:

df.update(df2)
df
Out[10]: 
            col1   col2   col3  col4
Date                                
2019-01-01   241  977.0   76.0   234
2019-01-02   123   78.0  432.0   321
2019-01-03   423  111.0  222.0   987
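
By default update overwrites every cell for which the other frame has a non-NaN value, not only the missing ones. Passing overwrite=False restricts it to cells that are NaN in the original. The sketch below works on copies so that df itself is left unchanged, and again uses a hypothetical df2_extra with one conflicting row to show the difference.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {"col1": [241, 123, 423],
     "col2": [977, 78, np.nan],
     "col3": [76, 432, np.nan],
     "col4": [234, 321, 987]},
    index=pd.date_range("2019-01-01", periods=3, freq="D"),
).rename_axis("Date")

# Hypothetical replacement frame with a row that conflicts with df.
df2_extra = pd.DataFrame(
    {"col2": [999, 111], "col3": [888, 222]},
    index=pd.to_datetime(["2019-01-02", "2019-01-03"]),
).rename_axis("Date")

# Default behaviour: existing values are overwritten too.
overwritten = df.copy()
overwritten.update(df2_extra)

# overwrite=False: only cells that are NaN in the original change.
only_missing = df.copy()
only_missing.update(df2_extra, overwrite=False)

print(overwritten)
print(only_missing)
```

Note that update returns None, so the result has to be read from the frame it was called on.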
