How do I handle NaN values in data in Python?
I have a large dataset with many NaN values in several columns. I tried the following code, but it did not remove the NaN values from the dataset:
df = pd.read_excel('sec3_data.xlsx')
df.dropna(subset=["Deviation from Partisanship"])
df['Deviation from Partisanship'].unique()
Output:
array([nan, 'Vote for opposing party', 'Vote for own party'], dtype=object)
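This clearly shows that some NaN values are still present. How can I remove them?
dropna returns a new DataFrame and leaves df unchanged, so you need to assign the result back: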
df = df.dropna(subset=["Deviation from Partisanship"])
You need to assign the result to a new DataFrame:
df2 = df.dropna(subset=["Deviation from Partisanship"])
Or drop the rows in place by passing inplace=True to dropna.
More details are in the pandas documentation for DataFrame.dropna.
import pandas as pd

# Method 1: drop the rows in place (dropna returns None when inplace=True)
df = pd.read_excel('sec3_data.xlsx')
df.dropna(subset=["Deviation from Partisanship"], inplace=True)
df['Deviation from Partisanship'].unique()

# Method 2: keep df intact and store the filtered rows in df2
df = pd.read_excel('sec3_data.xlsx')
df2 = df.dropna(subset=["Deviation from Partisanship"])
df2['Deviation from Partisanship'].unique()
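
Since the spreadsheet from the question is not available, here is a minimal sketch that reproduces the behaviour on a small in-memory DataFrame. The data and the respondent_id column are made up for illustration; only the "Deviation from Partisanship" column name comes from the question. It shows that a bare dropna call does not change df, while assigning the result does remove the NaN rows.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the spreadsheet in the question.
df = pd.DataFrame({
    "Deviation from Partisanship": [
        np.nan, "Vote for opposing party", "Vote for own party", np.nan,
    ],
    "respondent_id": [1, 2, 3, 4],  # illustrative column, not from the original data
})

col = "Deviation from Partisanship"

# Calling dropna without keeping its return value leaves df unchanged.
df.dropna(subset=[col])
rows_after_discarded_call = len(df)

# Assigning the result keeps only the rows where the column is not NaN.
cleaned = df.dropna(subset=[col])
rows_after_assignment = len(cleaned)
nan_left = int(cleaned[col].isna().sum())

print(rows_after_discarded_call, rows_after_assignment, nan_left)  # prints: 4 2 0
```

Assigning to a new name, as in cleaned here, keeps the original rows around for comparison; reassigning to df itself works just as well if the unfiltered data is no longer needed.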