Python: What is an efficient way to get the top N numeric columns with the highest frequency of a particular number?

python pandas numpy

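I am trying to get the top N numeric columns with the highest frequency of 1s (the only other value is 0). I know the simplest approach is to sum all the numeric columns and sort the results, but what is the simplest and most efficient way to do that?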

A sample of the dataframe:

Non-NumericCol1 Non-NumericCol2   Col1   Col2   Col3   ...   Coln
      ABC             PQR            1      0       1           0
      XYZ             LMN            0      0       0           1
      ABC             LMN            0      1       1           0

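I would like to get, say, the top 3 column names.

Example: d = {'Col3': 2000, 'Col10200': 1500, 'Col4900': 1000}

I am fine with output in any other format (for example, a pandas dataframe). There are roughly 10,000 columns and 6,000 rows in total.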

Try this:

In [113]: df
Out[113]:
  Non-NumericCol1 Non-NumericCol2  Col1  Col2  Col3  Col4  Coln
0             ABC             PQR     1     0     1     0     0
1             XYZ             LMN     0     0     0     0     1
2             ABC             LMN     0     1     1     0     0

In [114]: df.select_dtypes(['number']).sum().nlargest(3)
Out[114]:
Col3    2
Col1    1
Col2    1
dtype: int64
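
For readers who want to run this outside an IPython session, here is a minimal self-contained sketch of the same idea. The sample frame is rebuilt by hand from the output shown above, and the helper name top_n_columns is made up for illustration; it is not part of pandas.

```python
import pandas as pd

# Hypothetical sample data, rebuilt from the frame printed above.
df = pd.DataFrame({
    "Non-NumericCol1": ["ABC", "XYZ", "ABC"],
    "Non-NumericCol2": ["PQR", "LMN", "LMN"],
    "Col1": [1, 0, 0],
    "Col2": [0, 0, 1],
    "Col3": [1, 0, 1],
    "Col4": [0, 0, 0],
    "Coln": [0, 1, 0],
})


def top_n_columns(frame, n):
    """Return the n numeric columns with the largest column sums.

    Assumes every numeric column holds only 0s and 1s, so the
    column sum equals the number of 1s in that column.
    """
    # Drop the non-numeric columns, then sum each remaining column.
    counts = frame.select_dtypes(include="number").sum()
    # nlargest avoids sorting all columns when only n are needed.
    return counts.nlargest(n)


top3 = top_n_columns(df, 3)
print(top3)
```

With the three-row sample, Col3 comes first with a count of 2; the remaining two slots go to columns tied at 1.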

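This should give you what you want. Use a list comprehension to select the numeric columns, df.sum to compute the frequencies, and nlargest (called on the resulting Series) to pick the top three columns: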

In [1002]: df[['Col%d' %d for d in range(1, 4)]].sum().nlargest(3)
Out[1002]: 
Col3    2
Col2    1
Col1    1
dtype: int64

If you want the result as a dictionary, call to_dict on it:

In [1003]: _.to_dict()
Out[1003]: {'Col1': 1, 'Col2': 1, 'Col3': 2}
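
Since the question is tagged numpy and mentions roughly 10,000 columns, the same count-then-select steps can also be sketched directly on the underlying array. This is only an illustration under assumptions: the function name top_n_numpy and its value parameter are hypothetical, the sample frame is rebuilt by hand, and whether it is actually faster than the pandas one-liner on the real data has not been measured here.

```python
import numpy as np
import pandas as pd


def top_n_numpy(frame, n, value=1):
    """Return {column: count} for the n numeric columns where
    `value` occurs most often, ordered from highest count down."""
    numeric = frame.select_dtypes(include="number")
    # Count occurrences of `value` per column (axis=0 runs down the rows).
    counts = (numeric.to_numpy() == value).sum(axis=0)
    if n <= 0 or counts.size == 0:
        return {}
    n = min(n, counts.size)
    # argpartition moves the n largest counts into the last n slots
    # without fully sorting the array.
    idx = np.argpartition(counts, -n)[-n:]
    # Sort only those n candidates, largest count first.
    idx = idx[np.argsort(-counts[idx], kind="stable")]
    return {numeric.columns[i]: int(counts[i]) for i in idx}


# Hypothetical sample data, rebuilt from the frame in the question.
df = pd.DataFrame({
    "Non-NumericCol1": ["ABC", "XYZ", "ABC"],
    "Non-NumericCol2": ["PQR", "LMN", "LMN"],
    "Col1": [1, 0, 0],
    "Col2": [0, 0, 1],
    "Col3": [1, 0, 1],
    "Col4": [0, 0, 0],
    "Coln": [0, 1, 0],
})

result = top_n_numpy(df, 3)
print(result)
```

Comparing against value explicitly, rather than summing, keeps the count correct even if a column contains numbers other than 0 and 1. When several columns tie, which of them are returned is not guaranteed.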

@RijulMagu, glad I could help :)