Python: Why am I losing 9 values in this DataFrame assignment?

python pandas dataframe

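I am trying to update some numeric columns in my original DataFrame (df) with new standardized values (ndf2). ndf2 has 333 rows, all non-null. After the assignment, 9 of my values are NaN. I suspect something is wrong with my assignment operation or with the index. How do I do this correctly?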

ndf2.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 333 entries, 0 to 332
Data columns (total 4 columns):
 #   Column             Non-Null Count  Dtype  
---  ------             --------------  -----  
 0   bill_length_mm     333 non-null    float64
 1   bill_depth_mm      333 non-null    float64
 2   flipper_length_mm  333 non-null    float64
 3   body_mass_g        333 non-null    float64
dtypes: float64(4)
memory usage: 10.5 KB

After the assignment:

df.info(), df.shape

<class 'pandas.core.frame.DataFrame'>
Int64Index: 333 entries, 0 to 343
Data columns (total 7 columns):
 #   Column             Non-Null Count  Dtype  
---  ------             --------------  -----  
 0   species            333 non-null    object 
 1   island             333 non-null    object 
 2   bill_length_mm     324 non-null    float64
 3   bill_depth_mm      324 non-null    float64
 4   flipper_length_mm  324 non-null    float64
 5   body_mass_g        324 non-null    float64
 6   sex                333 non-null    object 
dtypes: float64(4), object(3)
memory usage: 20.8+ KB
(None, (333, 7))

The row count is down to 333, as I expected.

Update: if I do

df.reset_index(drop=True, inplace=True)

that solves the problem.

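Your indexes are probably not aligned. When you assign one DataFrame to another, pandas matches rows by index label, not by position. Here df keeps its original labels (0 to 343, with gaps where rows were dropped) while ndf2 has a fresh RangeIndex (0 to 332), so every df row labelled above 332 has no match in ndf2 and becomes NaN. You can check with:

df.index.equals(ndf2.index)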
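
To make the alignment behaviour concrete, here is a minimal sketch with made-up data. The two small frames are hypothetical stand-ins for df and ndf2, not the asker's actual data: one has gaps in its index after dropna(), the other has a fresh RangeIndex.

```python
import pandas as pd

# Hypothetical stand-in for df: rows 1 and 3 are dropped, so the remaining
# index labels are 0, 2, 4, 5, 6 rather than 0..4.
df = pd.DataFrame(
    {
        "species": ["a", "b", "c", "d", "e", "f", "g"],
        "body_mass_g": [10.0, None, 30.0, None, 50.0, 60.0, 70.0],
    }
).dropna()

# Hypothetical stand-in for ndf2: same row count, fresh RangeIndex 0..4.
ndf2 = pd.DataFrame({"body_mass_g": [0.1, 0.3, 0.5, 0.6, 0.7]})

aligned = df.index.equals(ndf2.index)  # False: the labels differ

# The assignment aligns on index labels. Labels 5 and 6 do not exist in
# ndf2, so those two rows of df end up as NaN.
df[["body_mass_g"]] = ndf2[["body_mass_g"]]
n_missing = int(df["body_mass_g"].isna().sum())
print(aligned, n_missing)
```

Note that in this sketch the rows that do find a match are not necessarily correct either: label 2 in df is its second row, but it receives the value from the third row of ndf2. Checking the index before assigning guards against both problems.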

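If that returns False, you can reset the indexes. Pass drop=True so that the old labels are discarded rather than added as a new column:

df.reset_index(drop=True, inplace=True)
ndf2.reset_index(drop=True, inplace=True)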

Then assign the values:

df[['bill_length_mm', 'bill_depth_mm', 'flipper_length_mm', 'body_mass_g']] = \
ndf2[['bill_length_mm', 'bill_depth_mm', 'flipper_length_mm', 'body_mass_g']]
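
The reset-then-assign steps can be sketched end to end as follows. The data is invented for illustration; the sketch also shows what reset_index does when drop=True is omitted, which is easy to overlook.

```python
import pandas as pd

# Hypothetical frames: df has a gapped index, ndf2 a fresh RangeIndex.
df = pd.DataFrame(
    {"sex": ["m", "f", "m"], "body_mass_g": [10.0, 20.0, 30.0]},
    index=[0, 2, 5],
)
ndf2 = pd.DataFrame({"body_mass_g": [-1.0, 0.0, 1.0]})

# Without drop=True, the old labels are kept as a new column named "index".
with_index_col = df.reset_index()

# With drop=True, the old labels are discarded and df gets labels 0..2.
df = df.reset_index(drop=True)

# The labels now match, so the assignment lines up row for row.
df[["body_mass_g"]] = ndf2[["body_mass_g"]]
print(df)
```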

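Alternatively, if both DataFrames have the same number of rows in the same order, you can skip index alignment altogether by assigning a NumPy array: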

df[['bill_length_mm', 'bill_depth_mm', 'flipper_length_mm', 'body_mass_g']] = \
ndf2[['bill_length_mm', 'bill_depth_mm', 'flipper_length_mm', 'body_mass_g']].to_numpy()
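
A small sketch of the to_numpy() route, again with invented values. Because a NumPy array carries no index, pandas assigns it purely by position, and df keeps its original labels.

```python
import pandas as pd

cols = ["bill_length_mm", "body_mass_g"]

# Hypothetical frames with mismatched index labels but equal length.
df = pd.DataFrame(
    {"bill_length_mm": [1.0, 2.0, 3.0], "body_mass_g": [4.0, 5.0, 6.0]},
    index=[0, 2, 5],
)
ndf2 = pd.DataFrame(
    {"bill_length_mm": [0.1, 0.2, 0.3], "body_mass_g": [0.4, 0.5, 0.6]}
)

# Positional assignment is only meaningful if the row counts agree.
assert len(df) == len(ndf2)

# Selecting ndf2[cols] fixes the column order before dropping to an array.
df[cols] = ndf2[cols].to_numpy()
print(df)
```

The trade-off is that nothing verifies the rows correspond: if the two frames were in a different order, the values would be assigned to the wrong rows without any warning.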

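Thanks. If I do

df.reset_index(drop=True, inplace=True)

that solves the problem, since I am only dropping rows from this DataFrame. Is there anything wrong with the assignment I showed above? It seems more concise than your version. Useful tip about checking the index values.

No, you are right: the iloc assignment is perfectly fine, and it is indeed more concise.

@Levon Try

df.update(ndf2)

to update in place, or

df.assign(**ndf2)

to get a copy of the DataFrame, so that you don't have to worry about column positions. Do this after you have reset the index.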
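
The two suggestions from this comment can be compared side by side in a short sketch. The frames are hypothetical and the index has already been reset, as the comment requires.

```python
import pandas as pd

# Hypothetical frames; df's index has already been reset to 0..2.
df = pd.DataFrame(
    {"sex": ["m", "f", "m"], "body_mass_g": [10.0, 20.0, 30.0]},
    index=[0, 2, 5],
).reset_index(drop=True)
ndf2 = pd.DataFrame({"body_mass_g": [-1.0, 0.0, 1.0]})

# assign returns a new DataFrame; each column of ndf2 is passed as a
# keyword argument and matched to df by column name. df is left untouched.
updated_copy = df.assign(**ndf2)
original_values = df["body_mass_g"].tolist()

# update modifies df in place, matching on index labels and column names.
df.update(ndf2)
print(updated_copy)
print(df)
```

Both calls still align on the index, so they do not remove the need to reset it first. One difference to keep in mind is that update skips NaN values in the frame passed to it, whereas assign copies them over.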
