Python: Replace part of a pandas data.frame with a given array

python pandas dataframe

I have a pandas.DataFrame named fake_num:

    import numpy as np
    import pandas as pd

    fake_num = pd.DataFrame([[1, 2, 3, 4, np.nan, np.nan, np.nan],
                             [1.1, 1.2, 1.3, 1.4, 1.6, 1.8, 2.5]]).T
    fake_num
    Out[4]:
         0    1
    0  1.0  1.1
    1  2.0  1.2
    2  3.0  1.3
    3  4.0  1.4
    4  NaN  1.6
    5  NaN  1.8
    6  NaN  2.5

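I am trying to fill the NaN values using linear regression:

    from sklearn.linear_model import LinearRegression

    # Fit on the complete rows only: column 1 is the predictor, column 0 the target.
    fdrop = fake_num.dropna(axis=0, how='any')
    lr = LinearRegression()
    lr.fit(np.array(fdrop.iloc[:, 1]).reshape(-1, 1), np.array(fdrop.iloc[:, 0]))
    # Predict column 0 for the rows where it is NaN.
    lr.predict(np.array(fake_num[np.isnan(fake_num[0])][1]).reshape(-1, 1))
    Out[5]: array([ 6.,  8., 15.])

The part I want to replace is fake_num[np.isnan(fake_num[0])][0], so what I want is: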

    Out[6]:
          0    1
    0   1.0  1.1
    1   2.0  1.2
    2   3.0  1.3
    3   4.0  1.4
    4   6.0  1.6
    5   8.0  1.8
    6  15.0  2.5

When I try:

    fake_num[np.isnan(fake_num[0])][0] = lr.predict(
        np.array(fake_num[np.isnan(fake_num.iloc[:, 0])].iloc[:, 1]).reshape(-1, 1))
    fake_num
    __main__:1: SettingWithCopyWarning:
    A value is trying to be set on a copy of a slice from a DataFrame.
    Try using .loc[row_indexer,col_indexer] = value instead

    See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
    Out[11]:
         0    1
    0  1.0  1.1
    1  2.0  1.2
    2  3.0  1.3
    3  4.0  1.4
    4  NaN  1.6
    5  NaN  1.8
    6  NaN  2.5

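How can I replace part of a data frame with some values, given the position of that part?

By the way, since I will need something more refined later: is there a good tool that fills NA values with a simple predictive model, using all the other non-NA rows and the other columns as input? Something like missForest in R.

Just call fit, then use loc to assign the predictions back: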

    v = fake_num.dropna()                  # complete rows only
    lr.fit(v[[1]], v[[0]])                 # double brackets keep X and y two-dimensional
    m = fake_num[0].isna()                 # boolean mask of the rows to fill
    fake_num.loc[m, [0]] = lr.predict(fake_num.loc[m, [1]])
    fake_num
          0    1
    0   1.0  1.1
    1   2.0  1.2
    2   3.0  1.3
    3   4.0  1.4
    4   6.0  1.6
    5   8.0  1.8
    6  15.0  2.5
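The difference between the failing chained form and the loc form can be sketched without scikit-learn. The snippet below is only an illustration under stated assumptions: the array named replacement is a stand-in for the lr.predict(...) output, and the names subset, by_label and by_position are made up for this example. It shows that a boolean selection is a separate object, so writing into it never reaches the original frame, while a single loc (label-based) or iloc (position-based) call writes in place.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({0: [1, 2, 3, 4, np.nan, np.nan, np.nan],
                   1: [1.1, 1.2, 1.3, 1.4, 1.6, 1.8, 2.5]})
replacement = np.array([6.0, 8.0, 15.0])  # assumed stand-in for lr.predict(...)

mask = df[0].isna()  # True for the rows whose column 0 should be overwritten

# Boolean selection produces a new object; the explicit .copy() makes that visible.
# Writing into it changes the selection only, never df.
subset = df[mask].copy()
subset[0] = replacement
missing_after_subset_write = int(df[0].isna().sum())

# Label-based: one .loc call picks rows and column and assigns in place.
by_label = df.copy()
by_label.loc[mask, 0] = replacement

# Position-based: integer row positions plus column position 0.
by_position = df.copy()
rows = np.flatnonzero(mask.to_numpy())
by_position.iloc[rows, 0] = replacement
```

The position-based variant is probably the closest match to "given its position" in the question; the mask-based loc call is what the answer above uses.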
The second attempt from the question, which uses iloc on both sides, gives the same warning and leaves the frame unchanged:

    fake_num[np.isnan(fake_num.iloc[:, 0])].iloc[:, 0] = lr.predict(
        np.array(fake_num[np.isnan(fake_num.iloc[:, 0])].iloc[:, 1]).reshape(-1, 1))
    fake_num
    D:\Users\shan xu\Anaconda3\lib\site-packages\pandas\core\indexing.py:630: SettingWithCopyWarning:
    A value is trying to be set on a copy of a slice from a DataFrame.
    Try using .loc[row_indexer,col_indexer] = value instead

    See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
      self.obj[item_labels[indexer[info_axis]]] = value
    Out[12]:
         0    1
    0  1.0  1.1
    1  2.0  1.2
    2  3.0  1.3
    3  4.0  1.4
    4  NaN  1.6
    5  NaN  1.8
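On the side question about a missForest-like tool, one option that may fit is scikit-learn's IterativeImputer, which models each column with missing values as a function of the other columns. The sketch below is not part of the answer above and rests on assumptions: the choice of RandomForestRegressor as the estimator and the parameter values are illustrative only, and IterativeImputer is still marked experimental, hence the explicit enable import. Note that a tree-based estimator cannot extrapolate beyond the observed target range, so on this toy frame its fills will differ from the linear-regression values 6, 8 and 15.

```python
import numpy as np
import pandas as pd
from sklearn.experimental import enable_iterative_imputer  # noqa: F401, must come before the next import
from sklearn.impute import IterativeImputer
from sklearn.ensemble import RandomForestRegressor

fake_num = pd.DataFrame({0: [1, 2, 3, 4, np.nan, np.nan, np.nan],
                         1: [1.1, 1.2, 1.3, 1.4, 1.6, 1.8, 2.5]})

# Illustrative settings; tune the estimator and iteration count for real data.
imputer = IterativeImputer(
    estimator=RandomForestRegressor(n_estimators=50, random_state=0),
    max_iter=10,
    random_state=0,
)

# fit_transform returns a NumPy array, so rebuild the frame with the original labels.
filled = pd.DataFrame(imputer.fit_transform(fake_num.to_numpy()),
                      index=fake_num.index, columns=fake_num.columns)
```

Swapping the estimator for LinearRegression (or leaving the default) gives a linear fill closer to the one in the question.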
