Python: generating frequencies (sum and count) based on GROUPBY


I am trying to replicate results similar to SAS PROC SUMMARY in Python, and I am using the functions below, which I found on Stack Overflow:

import numpy as np
import pandas as pd

def wmean_grouped2(group, var_name_in, var_name_weight):
    # Weighted mean of one column within a single group.
    d = group[var_name_in]
    w = group[var_name_weight]
    return (d * w).sum() / w.sum()

# Unweighted aggregations, looked up by name.
FUNCS = {"mean": np.mean,
         "sum": np.sum,
         "count": np.count_nonzero}  # counts the non-zero values in a group

def my_summary2(
        data,
        var_names_in,
        var_names_out,
        var_functions,
        var_name_weight=None,
        var_names_group=None):

    result = pd.DataFrame()

    # With no grouping variables, put every row into one group.
    if var_names_group is None:
        grouped = data.groupby(lambda x: True)
    else:
        grouped = data.groupby(var_names_group)

    # One output column per (input column, output name, function) triple.
    for var_name_in, var_name_out, var_function in \
            zip(var_names_in, var_names_out, var_functions):
        if var_function == "wsum":
            # "wsum" dispatches to the weighted-mean helper above.
            func = lambda x: wmean_grouped2(x, var_name_in, var_name_weight)
            result[var_name_out] = pd.Series(grouped.apply(func))
        else:
            func = FUNCS[var_function]
            result[var_name_out] = grouped[var_name_in].apply(func)

    return result
I called the function as follows:

print(my_summary2 (
        data=df,
        var_names_in=["sal","sal","age"] ,
        var_names_out=[
            "COUNT","SAL","age"
        ] ,
        var_functions=["count","sum","sum"] ,
        var_name_weight="val_1" ,
        var_names_group=["name"]
))
and got the following output:

        COUNT  SAL  age
name                  
Arik       1  100   32
David      2  260   88
John       2  500   67
Peter      1  100   33
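Could you please help me generate the following output: (i) a new line after the "name" column header, and (ii) the total of each variable, shown after a row of hyphens (-).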
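One way to get that layout is to post-process the text that DataFrame.to_string() produces. The following is only a sketch: the helper name format_with_total is hypothetical, the table is rebuilt by hand from the output shown above, and it assumes a single-level, named index (so the rendered text has exactly two header lines).

```python
import pandas as pd

# Summary table rebuilt by hand from the output shown in the question.
result = pd.DataFrame(
    {"COUNT": [1, 2, 2, 1], "SAL": [100, 260, 500, 100], "age": [32, 88, 67, 33]},
    index=pd.Index(["Arik", "David", "John", "Peter"], name="name"),
)


def format_with_total(table, label="Total"):
    """Hypothetical helper: render `table` as text with a blank line after
    the header, a rule of hyphens, and a final row of column totals."""
    with_total = table.copy()  # leave the caller's frame untouched
    with_total.loc[label] = table.sum(numeric_only=True)

    lines = with_total.to_string().splitlines()
    # Assumption: a named index, so line 0 holds the column names
    # and line 1 holds the index name.
    header, body, total = lines[:2], lines[2:-1], lines[-1]
    rule = "-" * max(len(line) for line in lines)
    return "\n".join(header + [""] + body + [rule, total])


print(format_with_total(result))
```

Because the helper returns a string instead of changing the DataFrame, the numeric table stays available for further calculations.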


I was able to generate the total of each column by adding the following line just before returning result:

result.loc['Total'] = result.select_dtypes(np.number).sum()
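For the unweighted case, the same table plus a Total row can also be sketched end to end with pandas named aggregation. The input rows here are made up (an assumption; the original df is not shown) and were chosen only so that the group sums match the output in the question. Note that pd.np was removed in pandas 2.0, so the sketch passes the string "number" to select_dtypes instead.

```python
import pandas as pd

# Made-up input rows (an assumption): chosen so the per-name sums
# match the output shown in the question.
df = pd.DataFrame({
    "name": ["Arik", "David", "David", "John", "John", "Peter"],
    "sal": [100, 120, 140, 200, 300, 100],
    "age": [32, 40, 48, 30, 37, 33],
})

# Named aggregation: output column = (input column, aggregation name).
result = df.groupby("name").agg(
    COUNT=("sal", "count"),
    SAL=("sal", "sum"),
    age=("age", "sum"),
)

# Append the column totals as an extra row labelled "Total".
# "number" selects every numeric column.
result.loc["Total"] = result.select_dtypes(include="number").sum()

print(result)
```

Unlike np.count_nonzero, the "count" aggregation counts non-missing values, so the two differ when a group contains zeros or NaN.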