Python: groupby not working on a pandas Series
I am trying to do a groupby on a pandas Series and then rank the values within each group. Oddly, this worked in earlier versions of pandas, but it stopped working once we upgraded pandas to 0.14.0. Here is an example Series:
i1 = pd.MultiIndex(levels=[[0, 1, 2, 3], [u'A', u'B'], [u'Spar', u'PnP', 'Checkers', 'Woolworths']],
labels=[[0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3],
[0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1],
[0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3]],
names=[u'respondent', u'survey', u'brand'])
s1 = pd.Series.from_array([1, 2, 3, 4, 2, 3, 4, 1, 3, 4, 2, 1, 4, 1, 2, 3, 1, 2, 3, 4, 3, 2, 1, 4, 2, 3, 4, 1, 1, 4, 3, 2], index = i1, name='usage')
s1
respondent survey brand
0 A Spar 1
PnP 2
Checkers 3
Woolworths 4
B Spar 2
PnP 3
Checkers 4
Woolworths 1
1 A Spar 3
PnP 4
Checkers 2
Woolworths 1
B Spar 4
PnP 1
Checkers 2
Woolworths 3
2 A Spar 1
PnP 2
Checkers 3
Woolworths 4
B Spar 3
PnP 2
Checkers 1
Woolworths 4
3 A Spar 2
PnP 3
Checkers 4
Woolworths 1
B Spar 1
PnP 4
Checkers 3
Woolworths 2
Name: usage, dtype: int64
When I try to do the groupby as follows:
s1.groupby(['respondent']).rank()
I get the following error:
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
<ipython-input-41-14bf5be195e8> in <module>()
----> 1 s1.groupby(['respondent']).mean()
/Users/donovanthomson/anaconda/lib/python2.7/site-packages/pandas/core/generic.pyc in groupby(self, by, axis, level, as_index, sort, group_keys, squeeze)
2727 axis = self._get_axis_number(axis)
2728 return groupby(self, by, axis=axis, level=level, as_index=as_index,
-> 2729 sort=sort, group_keys=group_keys, squeeze=squeeze)
2730
2731 def asfreq(self, freq, method=None, how=None, normalize=False):
/Users/donovanthomson/anaconda/lib/python2.7/site-packages/pandas/core/groupby.pyc in groupby(obj, by, **kwds)
1098 raise TypeError('invalid type: %s' % type(obj))
1099
-> 1100 return klass(obj, by, **kwds)
1101
1102
/Users/donovanthomson/anaconda/lib/python2.7/site-packages/pandas/core/groupby.pyc in __init__(self, obj, keys, axis, level, grouper, exclusions, selection, as_index, sort, group_keys, squeeze)
384 if grouper is None:
385 grouper, exclusions, obj = _get_grouper(obj, keys, axis=axis,
--> 386 level=level, sort=sort)
387
388 self.obj = obj
/Users/donovanthomson/anaconda/lib/python2.7/site-packages/pandas/core/groupby.pyc in _get_grouper(obj, key, axis, level, sort)
1978 exclusions.append(gpr)
1979 name = gpr
-> 1980 gpr = obj[gpr]
1981
1982 if isinstance(gpr, Categorical) and len(gpr) != len(obj):
/Users/donovanthomson/anaconda/lib/python2.7/site-packages/pandas/core/series.pyc in __getitem__(self, key)
477 def __getitem__(self, key):
478 try:
--> 479 result = self.index.get_value(self, key)
480
481 if not np.isscalar(result):
/Users/donovanthomson/anaconda/lib/python2.7/site-packages/pandas/core/index.pyc in get_value(self, series, key)
2554 raise InvalidIndexError(key)
2555 else:
-> 2556 raise e1
2557 except Exception: # pragma: no cover
2558 raise e1
KeyError: 'respondent'
You need to group by the index level, not by a column that does not exist:
In [218]:
s1.groupby(level=0).rank()
Out[218]:
respondent survey brand
0 A Spar 1.5
PnP 3.5
Checkers 5.5
Woolworths 7.5
B Spar 3.5
PnP 5.5
Checkers 7.5
Woolworths 1.5
1 A Spar 5.5
PnP 7.5
Checkers 3.5
Woolworths 1.5
B Spar 7.5
PnP 1.5
Checkers 3.5
Woolworths 5.5
2 A Spar 1.5
PnP 3.5
Checkers 5.5
Woolworths 7.5
B Spar 5.5
PnP 3.5
Checkers 1.5
Woolworths 7.5
3 A Spar 3.5
PnP 5.5
Checkers 7.5
Woolworths 1.5
B Spar 1.5
PnP 7.5
Checkers 5.5
Woolworths 3.5
dtype: float64
If you prefer, you can also refer to the level by name: s1.groupby(level='respondent').rank()
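To make the two spellings concrete, here is a minimal sketch on a smaller Series than the one in the question. The data and the variable names (`index`, `s`, `by_position`, `by_name`, `by_column`) are made up for illustration, and the Series is built with `pd.MultiIndex.from_product` and the plain `pd.Series` constructor, which assumes a reasonably recent pandas rather than 0.14.0. The last part turns the index levels into columns with `reset_index()`, which is the case where grouping by a column label applies.

```python
import pandas as pd

# Hypothetical, smaller version of the Series from the question:
# three index levels named respondent / survey / brand.
index = pd.MultiIndex.from_product(
    [[0, 1], ["A", "B"], ["Spar", "PnP"]],
    names=["respondent", "survey", "brand"],
)
s = pd.Series([1, 2, 2, 1, 3, 4, 4, 3], index=index, name="usage")

# Group by the first index level, by position and by name.
by_position = s.groupby(level=0).rank()
by_name = s.groupby(level="respondent").rank()

# Grouping by a column label only applies once the levels are columns.
df = s.reset_index()
by_column = df.groupby("respondent")["usage"].rank()

print(by_name)
```

Both `level=` forms should give the same ranks; the `reset_index()` variant gives the same values but on a plain integer index, since the MultiIndex has been moved into columns.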
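Technically, I don't think it should have worked in previous versions, since grouping on an index level and grouping on a column are semantically different.
Thank you very much for the quick feedback! For bonus points, do you know whether this changed recently? I couldn't find anything in the pandas changelog.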