Python: how do I use .at in pandas?
I am often confused by pandas slicing operations. For example:
import pandas as pd
raw_data = {'regiment': ['Nighthawks', 'Nighthawks', 'Nighthawks', 'Nighthawks', 'Dragoons', 'Dragoons', 'Dragoons', 'Dragoons', 'Scouts', 'Scouts', 'Scouts', 'Scouts'],
            'company': ['1st', '1st', '2nd', '2nd', '1st', '1st', '2nd', '2nd', '1st', '1st', '2nd', '2nd'],
            'name': ['Miller', 'Jacobson', 'Ali', 'Milner', 'Cooze', 'Jacon', 'Ryaner', 'Sone', 'Sloan', 'Piger', 'Riani', 'Ali'],
            'preTestScore': [4, 24, 31, 2, 3, 4, 24, 31, 2, 3, 2, 3],
            'postTestScore': [25, 94, 57, 62, 70, 25, 94, 57, 62, 70, 62, 70]}
df = pd.DataFrame(raw_data, columns=['regiment', 'company', 'name', 'preTestScore', 'postTestScore'])
def get_stats(group):
    # Summary statistics for one group of scores
    return {'min': group.min(), 'max': group.max(), 'count': group.count(), 'mean': group.mean()}
bins = [0, 25, 50, 75, 100]
group_names = ['Low', 'Okay', 'Good', 'Great']
df['categories'] = pd.cut(df['postTestScore'], bins, labels=group_names)
des = df['postTestScore'].groupby(df['categories']).apply(get_stats).unstack()
des.at['Good','mean']
I get:
TypeError                                 Traceback (most recent call last)
pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.Int64HashTable.get_item()
TypeError: an integer is required
During handling of the above exception, another exception occurred:
KeyError                                  Traceback (most recent call last)
 in ()
----> 1 des.at['Good','mean']
C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\index.py in __getitem__(self, key)
   1867
   1868         key = self._convert_key(key)
-> 1869         return self.obj._get_value(*key, takeable=self._takeable)
   1870
   1871     def __setitem__(self, key, value):
C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\frame.py in _get_value(self, index, col, takeable)
   1983
   1984         try:
-> 1985             return engine.get_value(series._values, index)
   1986         except (TypeError, ValueError):
   1987
pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_value()
pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_value()
pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()
KeyError: 'Good'
What should I do?
Thanks in advance.

The problem is in this line:
des = df['postTestScore'].groupby(df['categories']).apply(get_stats).unstack()
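After grouping 'postTestScore' and applying get_stats, you get a Series rather than a DataFrame, and unstack() then pivots it into the table shown below.
When you then try scalar label access on des, the label 'Good' is not recognized as a key in the index lookup: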
des.at['Good','mean']
Just print des and you will see the result:
            count   max   mean   min
categories
Low           2.0  25.0  25.00  25.0
Okay          0.0   NaN    NaN   NaN
Good          8.0  70.0  63.75  57.0
Great         2.0  94.0  94.00  94.0
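
If it helps to check what kind of object des actually is, here is a minimal sketch. It is an assumption-laden stand-in, not the code from the question: it uses agg([...]) in place of get_stats(...).unstack() to build a table of the same shape, and it passes observed=False explicitly so the empty 'Okay' bin is kept. The names kind and index_kind are only there for illustration.

```python
import pandas as pd

# Same postTestScore values and bins as in the question
df = pd.DataFrame({"postTestScore": [25, 94, 57, 62, 70, 25, 94, 57, 62, 70, 62, 70]})
df["categories"] = pd.cut(df["postTestScore"], [0, 25, 50, 75, 100],
                          labels=["Low", "Okay", "Good", "Great"])

# Assumption: agg([...]) stands in for apply(get_stats).unstack()
des = (df["postTestScore"]
       .groupby(df["categories"], observed=False)
       .agg(["min", "max", "count", "mean"]))

kind = type(des).__name__              # container type of the result
index_kind = type(des.index).__name__  # type of the row index
print(kind, index_kind)
```

The row labels come from pd.cut, so the row index is categorical rather than a plain index of strings.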

It does not work because of the CategoricalIndex:
des.index
# Out[322]: CategoricalIndex(['Low', 'Okay', 'Good', 'Great'], categories=['Low', 'Okay', 'Good', 'Great'], ordered=True, name='categories', dtype='category')
Try changing it like this:
des.index = des.index.tolist()
des.at['Good','mean']
# Out[326]: 63.75
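
For reference, a self-contained sketch of that workaround. Assumptions: agg([...]) is used here as a stand-in for get_stats(...).unstack(), observed=False is passed explicitly to keep the empty 'Okay' bin, and good_mean and good_row are illustrative names. Whether the original CategoricalIndex lookup fails may depend on the pandas version, so the sketch only shows the lookup after the index has been converted.

```python
import pandas as pd

df = pd.DataFrame({"postTestScore": [25, 94, 57, 62, 70, 25, 94, 57, 62, 70, 62, 70]})
df["categories"] = pd.cut(df["postTestScore"], [0, 25, 50, 75, 100],
                          labels=["Low", "Okay", "Good", "Great"])

# Assumption: agg([...]) stands in for apply(get_stats).unstack()
des = (df["postTestScore"]
       .groupby(df["categories"], observed=False)
       .agg(["min", "max", "count", "mean"]))

# Replace the CategoricalIndex with an index built from plain Python strings
des.index = des.index.tolist()

good_mean = des.at["Good", "mean"]  # scalar lookup by row and column label
good_row = des.loc["Good"]          # the whole 'Good' row as a Series
print(good_mean)  # 63.75
```

tolist() turns the categorical labels into ordinary strings, so the category ordering is lost but the labels can be looked up like any other string index.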

What exactly does this do?