Python KeyError: 'final_result' when using boxplot
I want to make a boxplot using two features, "date_submitted" and "final_result". I have checked the documentation, but I don't understand why this error occurs. I used the following code:
import pandas as pd
import numpy as np
df = pd.read_csv('/home/user/Documents/MOOC dataset cleaned/assessments.csv')
df.boxplot(column ='date_submitted', by='final_result')
Here is the description of my dataset:
       date_submitted  date_registration  date_unregistration     sum_click  \
count    28785.000000       28785.000000         28785.000000  28785.000000
mean        26.414139         -69.139552           321.657426      2.660066
std         15.890933          49.305239           188.935462      5.177789
min        -11.000000        -322.000000          -317.000000      1.000000
25%         18.000000        -100.000000           130.000000      1.000000
50%         24.000000         -56.000000           445.000000      1.000000
75%         30.000000         -29.000000           445.000000      3.000000
max        241.000000         124.000000           445.000000    511.000000
       num_of_prev_attempts      age_band        region  highest_education  \
count          28785.000000  28785.000000  25559.000000       28785.000000
mean               0.121278      1.693660      5.041981           1.280111
std                0.420666      0.474206      3.689341           0.769604
min                0.000000      0.000000      0.000000           0.000000
25%                0.000000      1.000000      2.000000           1.000000
50%                0.000000      2.000000      5.000000           1.000000
75%                0.000000      2.000000      9.000000           2.000000
max                6.000000      2.000000     11.000000           4.000000
       studied_credits         score  final_result
count     28785.000000  28785.000000  28785.000000
mean         78.691506     75.453431      1.029703
std          40.617665     19.968919      0.884043
min          30.000000      0.000000      0.000000
25%          60.000000     68.000000      0.000000
50%          60.000000     82.000000      1.000000
75%         120.000000     87.000000      2.000000
max         655.000000    100.000000      2.000000
Error traceback:
Traceback (most recent call last):
File "/home/user/Documents/outliers.py", line 6, in <module>
df.boxplot(column ='date_submitted', by='final_result')
File "/usr/lib/python2.7/dist-packages/pandas/core/frame.py", line 5516, in boxplot
**kwds)
File "/usr/lib/python2.7/dist-packages/pandas/tools/plotting.py", line 2689, in boxplot
return_type=return_type)
File "/usr/lib/python2.7/dist-packages/pandas/tools/plotting.py", line 3077, in _grouped_plot_by_column
grouped = data.groupby(by)
File "/usr/lib/python2.7/dist-packages/pandas/core/generic.py", line 3436, in groupby
sort=sort, group_keys=group_keys, squeeze=squeeze)
File "/usr/lib/python2.7/dist-packages/pandas/core/groupby.py", line 1311, in groupby
return klass(obj, by, **kwds)
File "/usr/lib/python2.7/dist-packages/pandas/core/groupby.py", line 418, in __init__
level=level, sort=sort)
File "/usr/lib/python2.7/dist-packages/pandas/core/groupby.py", line 2264, in _get_grouper
in_axis, name, gpr = True, gpr, obj[gpr]
File "/usr/lib/python2.7/dist-packages/pandas/core/frame.py", line 1969, in __getitem__
return self._getitem_column(key)
File "/usr/lib/python2.7/dist-packages/pandas/core/frame.py", line 1976, in _getitem_column
return self._get_item_cache(key)
File "/usr/lib/python2.7/dist-packages/pandas/core/generic.py", line 1091, in _get_item_cache
values = self._data.get(item)
File "/usr/lib/python2.7/dist-packages/pandas/core/internals.py", line 3211, in get
loc = self.items.get_loc(item)
File "/usr/lib/python2.7/dist-packages/pandas/core/index.py", line 1759, in get_loc
return self._engine.get_loc(key)
File "pandas/index.pyx", line 137, in pandas.index.IndexEngine.get_loc (pandas/index.c:3979)
File "pandas/index.pyx", line 157, in pandas.index.IndexEngine.get_loc (pandas/index.c:3843)
File "pandas/hashtable.pyx", line 668, in pandas.hashtable.PyObjectHashTable.get_item (pandas/hashtable.c:12265)
File "pandas/hashtable.pyx", line 676, in pandas.hashtable.PyObjectHashTable.get_item (pandas/hashtable.c:12216)
KeyError: 'final_result'
[Finished in 0.292s]
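I don't understand why this error occurs.
What does print(df.columns.tolist()) show?
Where is that in my post?
It is not in your code. He is asking you to run it and post the result here.
I think there is some pitfall in the column names, which print(df.columns.tolist()) would reveal.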
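The check suggested in the comments can be sketched as follows. This is only an illustration under an assumption: the post never shows the actual column list, so the stray trailing space in the header below is a guess at one common cause of this kind of KeyError, not a confirmed diagnosis. The inline CSV text is a hypothetical stand-in for assessments.csv.

```python
import io

import pandas as pd

# Hypothetical stand-in for assessments.csv. Note the trailing space
# after "final_result" in the header row (an assumed cause, not confirmed).
csv_text = "date_submitted,final_result \n18,0\n24,1\n30,2\n"
df = pd.read_csv(io.StringIO(csv_text))

# Printing the list shows each name as a quoted string, so hidden
# whitespace becomes visible, which describe() output does not reveal.
raw_columns = df.columns.tolist()
print(raw_columns)

# Strip surrounding whitespace from every column name.
df.columns = df.columns.str.strip()

# The grouping key now resolves; the same name can then be passed to
# df.boxplot(column='date_submitted', by='final_result').
medians = df.groupby("final_result")["date_submitted"].median()
print(medians)
```

If the printed list already shows exactly 'final_result', the column names are not the problem and the cause lies elsewhere, for example the script reading a different file than the one that was described.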