Python: Why does a pandas DataFrame show only NaN values after I drop selected rows?
Using pandas 0.17.1, I am trying to remove the rows where parName == 'rt' from a large (882504 rows) DataFrame named productDataNat, but all the other rows then become NaN:
import numpy as np
import pandas as pd

# Read the semicolon-separated file, parsing 'value' as float64
productDataNat = pd.read_csv('https://lobianco.org/temp/productData_P0-Mi-Ei.csv',sep=';', dtype={'value': np.float64})
# Drop the spurious 'Unnamed: 8' column
productDataNat = productDataNat.drop(['Unnamed: 8'],axis=1)
productDataNat.set_index(['scen','country','region','prod','freeDim','year','parName'], inplace=True)
productDataNat.head()
In contrast, when I use a sample DataFrame, it works as expected:
midx = pd.MultiIndex(levels=[['one', 'two'], ['x','y']], labels=[[1,1,1,0],[1,0,1,0]])
dfmix = pd.DataFrame({'A' : [1, 2, 3, 4], 'B': [5, 6, 7, 8]}, index=midx)
dfmix
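For reference, here is a minimal sketch of the kind of level-based drop the question describes, applied to an equivalent small frame. It builds the same four index entries with MultiIndex.from_arrays, because the labels keyword used above was later renamed to codes. The level names 'num' and 'letter' are hypothetical and were added only so the level can be addressed by name.

```python
import pandas as pd

# Same index entries as the sample above, built from explicit arrays.
# The level names 'num' and 'letter' are assumptions for illustration.
midx = pd.MultiIndex.from_arrays(
    [["two", "two", "two", "one"], ["y", "x", "y", "x"]],
    names=["num", "letter"],
)
dfmix = pd.DataFrame({"A": [1, 2, 3, 4], "B": [5, 6, 7, 8]}, index=midx)

# Drop every row whose 'letter' level equals 'x'
result = dfmix.drop("x", level="letter", axis=0)
print(result)
```

On a frame like this, with no NaN in any index level, the remaining rows keep their original values.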
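Is this a bug in pandas, or is something wrong with my DataFrame (and if so, what)?
UPDATE:
It behaves exactly the same for me (I'm using Pandas v0.18.0).
As a workaround, you can remove the rt rows before setting the MultiIndex: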
cols = 'scen;parName;country;region;prod;freeDim;year;value'.split(';')
url = 'https://lobianco.org/temp/productData_P0-Mi-Ei.csv'
productDataNat = pd.read_csv(url, sep=';', dtype={'value': np.float64}, usecols=cols)
# Keep only the rows whose parName is not 'rt' (boolean mask with .loc; .ix was removed in later pandas versions)
df = productDataNat.loc[productDataNat.parName != 'rt']
# Build the MultiIndex only after the unwanted rows are gone
df = df.set_index(['scen','country','region','prod','freeDim','year','parName'])
Yes, dropping the 'rt' rows before setting the index is what I ended up doing. I then assumed this was a bug in 0.17 that had been fixed in 0.18. Thanks.
@Antonello, sorry, I had removed all the rt rows before testing. I have now updated my answer.
So this has been filed as a bug report.
The issue: the freeDim level contains NaNs, and when labels are dropped from another level, the remaining index has only NaNs on that level. The way the drop is done internally causes a reindex, but nothing matches, hence the NaNs.
In [4]: df.drop('rt', level='parName', axis=0)
Out[4]:
                                                          value
scen     country region prod        freeDim year parName
P0-Mi-Ei 11000   11042  hardWRoundW NaN     2005 dl         NaN
                        softWRoundW NaN     2005 dl         NaN
                        pulpWFuelW  NaN     2005 dl         NaN
                        ashRoundW   NaN     2005 dl         NaN
                        fuelW       NaN     2005 dl         NaN
                        hardWSawnW  NaN     2005 dl         NaN
                        softWSawnW  NaN     2005 dl         NaN
                        plyW        NaN     2005 dl         NaN
                        pulpW       NaN     2005 dl         NaN
                        pannels     NaN     2005 dl         NaN
                        ashSawnW    NaN     2005 dl         NaN
                        ashPlyW     NaN     2005 dl         NaN
                 11061  hardWRoundW NaN     2005 dl         NaN
                        softWRoundW NaN     2005 dl         NaN
                        pulpWFuelW  NaN     2005 dl         NaN
                        ashRoundW   NaN     2005 dl         NaN
                        fuelW       NaN     2005 dl         NaN
                        hardWSawnW  NaN     2005 dl         NaN
                        softWSawnW  NaN     2005 dl         NaN
                        plyW        NaN     2005 dl         NaN
                        pulpW       NaN     2005 dl         NaN
                        pannels     NaN     2005 dl         NaN
                        ashSawnW    NaN     2005 dl         NaN
                        ashPlyW     NaN     2005 dl         NaN
                 11072  hardWRoundW NaN     2005 dl         NaN
                        softWRoundW NaN     2005 dl         NaN
                        pulpWFuelW  NaN     2005 dl         NaN
                        ashRoundW   NaN     2005 dl         NaN
                        fuelW       NaN     2005 dl         NaN
                        hardWSawnW  NaN     2005 dl         NaN
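If the index has already been set, one option that may avoid the label-based reindex described above is to select rows with a boolean mask computed from the index level, since boolean selection picks rows by position. The sketch below uses a small made-up frame rather than the real file: the column names follow the question, but the four rows and the freeDim value 'a' are hypothetical.

```python
import numpy as np
import pandas as pd

# Tiny stand-in for the real data; the rows and the value 'a' are invented.
raw = pd.DataFrame({
    "prod": ["hardWRoundW", "hardWRoundW", "softWRoundW", "softWRoundW"],
    "freeDim": [np.nan, np.nan, "a", "a"],   # a level that contains NaN
    "parName": ["dl", "rt", "dl", "rt"],
    "value": [1.0, 2.0, 3.0, 4.0],
})
df = raw.set_index(["prod", "freeDim", "parName"])

# True for every row whose 'parName' level is not 'rt'
keep = df.index.get_level_values("parName") != "rt"

# Boolean selection keeps the matching rows without looking labels up again
filtered = df[keep]
print(filtered)
```

This is an alternative to filtering before set_index, not a replacement for it; whether it avoids the problem on a given pandas version is worth checking on a small sample first.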