Python 2.7 pandas: pandas._libs.hashtable.Int64HashTable.get_item
I have the following code running on a DataFrame df:
print df
categories = df['my_classification'].unique()
for c in categories:
    print c
    win = df[df.result == 'Won'][df['my_classification'] == c]['prob'][0]
    print type(win)
    lost = df[df.result == 'Lost'][df['my_classification'] == c]['prob'][0]
    print type(lost)
I then get the following output:
result my_classification prob
0 Won ENTERPRISE 0.657895
1 Won COMMERCIAL 0.342105
2 Lost ENTERPRISE 0.611842
3 Lost COMMERCIAL 0.388158
ENTERPRISE
<type 'numpy.float64'>
and this error:
There was a problem running this cell
KeyError 0
KeyErrorTraceback (most recent call last)
<ipython-input-4-38a901f9868a> in <module>()
38
39 print type(win)
---> 40 lost = df[df.result == 'Lost'][df['my_classification'] == c]['prob'][0]
41
42 print type(lost)
/opt/conda/envs/python2/lib/python2.7/site-packages/pandas/core/series.pyc in __getitem__(self, key)
599 key = com._apply_if_callable(key, self)
600 try:
--> 601 result = self.index.get_value(self, key)
602
603 if not is_scalar(result):
/opt/conda/envs/python2/lib/python2.7/site-packages/pandas/core/indexes/base.pyc in get_value(self, series, key)
2426 try:
2427 return self._engine.get_value(s, k,
-> 2428 tz=getattr(series.dtype, 'tz', None))
2429 except KeyError as e1:
2430 if len(self) > 0 and self.inferred_type in ['integer', 'boolean']:
pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_value (pandas/_libs/index.c:4363)()
pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_value (pandas/_libs/index.c:4046)()
pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc (pandas/_libs/index.c:5085)()
pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.Int64HashTable.get_item (pandas/_libs/hashtable.c:13913)()
pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.Int64HashTable.get_item (pandas/_libs/hashtable.c:13857)()
KeyError: 0
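What I don't understand is this: win and lost are written in exactly the same way, so why does win work while lost raises an error? Thanks.

Because you take the categories from the whole DataFrame, but for Won and Lost you filter a subset, and sometimes a category does not exist in that subset.
For example: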
result my_classification prob
0 Won ENTERPRISE 0.657895
1 Won COMMERCIAL 0.342105
2 Lost ENTERPRISE 0.611842
When you then run
df[df.result == 'Lost'][df['my_classification'] == 'COMMERCIAL']['prob'][0]
it raises an error, because there is no Lost row for COMMERCIAL.
Use groupby instead:
df.groupby(['result','my_classification']).head(1)
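To illustrate the missing-combination case described above, here is a minimal sketch that rebuilds the three-row example and checks whether each (result, my_classification) pair exists before reading its prob. The names first_prob and found are hypothetical, and taking the first value per group with .first() is an assumption about what is wanted, not something the question states.

```python
import pandas as pd

# The three-row example from the answer: there is no Lost/COMMERCIAL row.
df = pd.DataFrame({
    'result': ['Won', 'Won', 'Lost'],
    'my_classification': ['ENTERPRISE', 'COMMERCIAL', 'ENTERPRISE'],
    'prob': [0.657895, 0.342105, 0.611842],
})

# First prob per (result, my_classification) pair, indexed by that pair.
first_prob = df.groupby(['result', 'my_classification'])['prob'].first()

# Look up each pair only if it actually exists; record None otherwise.
found = {}
for c in df['my_classification'].unique():
    for r in ['Won', 'Lost']:
        key = (r, c)
        found[key] = first_prob.loc[key] if key in first_prob.index else None

print(found)
```

Checking membership first avoids a KeyError for a pair that has no rows.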
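But in my case the error occurs even though no category is missing from the Lost group. I also tried df.groupby(['result','my_classification']).head(1) and still get the same error. I noticed that if I replace df['my_classification'] == c with df['my_classification'] == 'ENTERPRISE', hard-coding the category value makes the error go away. Why is that?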
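One reading consistent with the traceback (the lookup ends in Int64HashTable.get_item, which is a label lookup on an integer index) is that ['prob'][0] asks for the row labelled 0, not the first row. In the printed DataFrame the Lost rows carry the labels 2 and 3, so label 0 is absent after filtering even when the category is present. The sketch below rebuilds the four-row frame from the question and compares the label lookup with a positional one using .iloc[0]. The variable names are hypothetical, and this is offered as a likely explanation rather than a confirmed diagnosis.

```python
import pandas as pd

# The four rows printed in the question, with the default 0..3 index.
df = pd.DataFrame({
    'result': ['Won', 'Won', 'Lost', 'Lost'],
    'my_classification': ['ENTERPRISE', 'COMMERCIAL', 'ENTERPRISE', 'COMMERCIAL'],
    'prob': [0.657895, 0.342105, 0.611842, 0.388158],
})

# Combine both conditions into a single boolean mask.
mask = (df['result'] == 'Lost') & (df['my_classification'] == 'ENTERPRISE')
lost = df.loc[mask, 'prob']

# Filtering keeps the original row labels.
print(list(lost.index))

# lost[0] looks for the label 0, which is not in the filtered index.
try:
    lost[0]
    label_lookup_failed = False
except KeyError:
    label_lookup_failed = True

# .iloc[0] takes the first row by position, whatever its label is.
first_lost = lost.iloc[0]
print(label_lookup_failed, first_lost)
```

If the filtered Series can be empty, .iloc[0] raises an IndexError, so a length check is still needed in that case.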