Python: How do I pass a column name to the level parameter of the groupby function in Pandas?

python pandas

I am having trouble passing a level name to the groupby function in Pandas. My DataFrame is very large, with 34 columns.

Shpr_Resi_Ratio = (
    data[data.Resi == 'Y'].groupby(level='Shpr_ID').count() /
    data.groupby(level='Shpr_ID').count()
)

This raises an error.

I am trying to compute a ratio from two columns:

Resi    Shpr_ID Shpr_ID_Ratio
Y   577030944   0.933333333
N   577030944   0.933333333
Y   577030944   0.933333333
Y   577030944   0.933333333
Y   577030944   0.933333333
Y   577030944   0.933333333
Y   577030944   0.933333333
Y   577030944   0.933333333
Y   577030944   0.933333333
Y   577030944   0.933333333
Y   577030944   0.933333333
Y   577030944   0.933333333
Y   577030944   0.933333333
Y   577030944   0.933333333
Y   577030944   0.933333333

Are you trying to group by the column "Shpr_ID"?

In that case, change your code to:

Shpr_Resi_Ratio = (
    data[data.Resi == 'Y'].groupby(['Shpr_ID']).count() /
    float(data.groupby(['Shpr_ID']).count())
)

This should take care of it.
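A note on the float() call above: float() accepts a single number or string, so passing it the DataFrame returned by count() raises a TypeError. The following is only a minimal sketch of one way to get per-shipper ratios without it, assuming Resi and Shpr_ID are ordinary columns. The sample rows are made up for illustration; only the column names come from the question.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
data = pd.DataFrame({
    "Resi": ["Y", "N", "Y", "Y", "N", "Y", "N"],
    "Shpr_ID": [577030944, 577030944, 577030944, 100, 100, 200, 300],
})

# Rows per shipper, and rows per shipper where Resi == 'Y'.
total = data.groupby("Shpr_ID").size()
resi_y = data[data["Resi"] == "Y"].groupby("Shpr_ID").size()

# Dividing two Series aligns them on the Shpr_ID index and already yields
# floating-point results, so no float() conversion is needed.
# Shippers without any 'Y' rows are filled with 0 before dividing.
Shpr_Resi_Ratio = resi_y.reindex(total.index, fill_value=0) / total

print(Shpr_Resi_Ratio)
```

Using size() gives one count per shipper, whereas count() on the whole DataFrame returns a count for every one of the 34 columns.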


I still get an error; it is not computing the ratio.

Could this be caused by the counts not being floats? I have edited my answer to account for that.

It gives an error again. TypeError: float() argument must be a string or a number, not 'DataFrame'.

Can you add some sample data, such as the output of

Shpr_Resi_Ratio.head().iloc[:, :5]

Is your index a MultiIndex? If not, try

level=0

@bharthshetty, the output is: top_Type Resi Co_Name Lat Lng Shpr_ID 1.0 NaN NaN NaN NaN 30.0 NaN NaN NaN NaN 132.0 NaN NaN NaN NaN NaN 148.0 NaN NaN NaN NaN NaN NaN NaN 156.0 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN

Sorry, that was supposed to be

data.head(10).iloc[:, :5]

Please add this to your question.

@Bharathshetty, I have added the output.

What are you trying to achieve? This is certainly not the way to do it.
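To illustrate the point about level raised in the comments: the level argument of groupby refers to levels of the index, not to columns, so a column name is passed as the first argument instead. The sketch below shows both forms on made-up data, assuming Shpr_ID starts out as an ordinary column. The variable names are hypothetical.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
data = pd.DataFrame({
    "Resi": ["Y", "N", "Y", "Y", "N"],
    "Shpr_ID": [577030944, 577030944, 577030944, 100, 100],
})

# Shpr_ID is a column here, so it is passed as the grouping key (by=).
by_column = data.groupby("Shpr_ID").size()

# Once Shpr_ID is moved into the index, it can be addressed as a level,
# either by name or by position.
indexed = data.set_index("Shpr_ID")
by_level_name = indexed.groupby(level="Shpr_ID").size()
by_level_pos = indexed.groupby(level=0).size()

print(by_column)
```

All three calls produce the same per-shipper row counts; they differ only in where pandas looks up the grouping key.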

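# Count rows per shipper, overall and for rows where Resi == 'Y'.
Shpr_ID_total = data.groupby(['Shpr_ID']).agg({'Shpr_ID': 'count'})
Shpr_ID_Y = data[data['Resi'] == 'Y'].groupby(['Shpr_ID']).agg({'Shpr_ID': 'count'})

def computeResi(Shpr_ID):
    """Return the share of rows with Resi == 'Y' for one Shpr_ID (0 if none)."""
    ratio = 0

    try:
        ratio = Shpr_ID_Y.Shpr_ID[Shpr_ID] / Shpr_ID_total.Shpr_ID[Shpr_ID]
    except KeyError:
        # The shipper has no 'Y' rows, or does not appear in the data.
        pass

    return ratio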
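As a possible alternative to calling a helper once per shipper, the same ratio can be computed in one pass, because the mean of a boolean column is the fraction of True values. This is only a sketch under the assumption that Resi holds 'Y'/'N' strings. The sample rows mirror the 15 rows shown in the question, and the names is_resi and ratio_per_shipper are made up for illustration.

```python
import pandas as pd

# Sample rows mirroring the question's table: one 'N' and fourteen 'Y'.
data = pd.DataFrame({
    "Resi": ["Y", "N"] + ["Y"] * 13,
    "Shpr_ID": [577030944] * 15,
})

# True where the row is residential.
is_resi = data["Resi"].eq("Y")

# One ratio per shipper: the mean of a boolean Series is the share of True.
ratio_per_shipper = is_resi.groupby(data["Shpr_ID"]).mean()

# The same ratio broadcast back onto every row, as in the question's table.
data["Shpr_ID_Ratio"] = is_resi.groupby(data["Shpr_ID"]).transform("mean")

print(ratio_per_shipper)
```

With these rows the ratio for 577030944 is 14/15, which matches the 0.933333333 shown in the question.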