Python: finding the max of a DataFrame column in order to loop over all values
Tags: python, python-3.x, pandas, numpy, dataframe


I have a large DataFrame, built with pandas.

When I run max(df['A']), it reports a maximum of 9999, whereas from inspecting the data it should be 396450.

import numpy as np
import pandas as pd

# read file into array, ignore first 6 lines (comment lines are skipped via comments="#")
lines = np.loadtxt("20170901.as-rel2.txt", dtype='str', comments="#", delimiter="|", unpack=False)
# ignore col 4
lines = lines[:, :3]
# convert to dataframe (dtype='str' above means every column holds strings)
df = pd.DataFrame(lines, columns=['A', 'B', 'C'])

After finding the maximum, I have to count each node (column 'A') and report how many times it is repeated.

Here is a sample of the file:

df=
                 A       B   C
    0            2   45714   0
    1            2   52685  -1
    2            3     293   0
    3            3   23248  -1
    4            3  133296   0
    5            3  265301  -1
    6            5   28599  -1
    7            5   52352   0
    8            5  262879  -1
    9            5  265048  -1
    10           5  265316  -1
    11          10   46392   0
    .....
    384338  396238   62605  -1
    384339  396371    3785  -1
    384340  396434   35039  -1
    384341  396450    2495  -1
    384342  396450    5078  -1

    Expect:
    [1, 0
    2, 2
    3, 4
    4, 0
    5, 5
    10, 1
    ....]
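
A likely cause of the 9999 result is that the file was loaded with dtype='str', so column A holds text and max() compares the values character by character. The sketch below uses a small made-up column (not the real file) to show the difference and one way to convert the column to integers before taking the maximum.

```python
import pandas as pd

# Hypothetical sample: values stored as strings, as dtype='str' would produce.
df = pd.DataFrame({"A": ["2", "3", "9999", "396450"]})

# String comparison is lexicographic: "9999" sorts after "396450" because '9' > '3'.
string_max = max(df["A"])

# Convert the column to integers, then compare numerically.
df["A"] = df["A"].astype(int)
numeric_max = df["A"].max()

print(string_max, numeric_max)
```

If the real file contains anything that is not a plain integer in column A, astype(int) will raise an error, so that conversion is an assumption about the data.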

I would use a categorical column together with value_counts:

df.A=pd.Categorical(df.A,categories=np.arange(1,max(df.A)+1))
df.A.value_counts().sort_index()
Out[312]: 
1    0
2    2
3    4
4    0
5    5
6    0
7    0
8    0
9    0
Name: A, dtype: int64
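
A variant of the same idea, as a minimal sketch: count the values first and then reindex over the full range so that missing node IDs show up with a count of 0. The sample values are taken from the first rows of the question's table; it assumes column A has already been converted to integers.

```python
import pandas as pd

# First rows of column A from the sample, as integers.
df = pd.DataFrame({"A": [2, 2, 3, 3, 3, 3, 5, 5, 5, 5, 5, 10]})

# Count occurrences, then fill in every ID from 1 to the maximum with 0 where absent.
max_id = int(df["A"].max())
counts = df["A"].value_counts().reindex(range(1, max_id + 1), fill_value=0)

print(counts)
```

This avoids changing the dtype of column A, which may matter if the column is used numerically later on.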
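Alternatively, with np.bincount:

pd.Series(np.bincount(df.A))

0     0
1     0
2     2
3     4
4     0
5     5
6     0
7     0
8     0
9     0
10    1
dtype: int64

Thank you! Now I have a bigger challenge along similar lines. What if a row in column A holds multiple entries, e.g. 396450_396434396371396238? I also need to count the occurrences of each number. Any suggestions? Should I read the file in a different way?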
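
For the follow-up about several entries in one row of column A, one possible approach is to split each cell, expand the pieces into separate rows, and count as before. This is only a sketch: the underscore delimiter and the sample rows are assumptions, since the example in the comment does not make the separator clear.

```python
import pandas as pd

# Hypothetical data: several IDs per cell, joined by "_" (assumed delimiter).
df = pd.DataFrame({"A": ["396450_396434", "396434", "396450_396371_396434"]})

# Split each cell into a list, give every ID its own row, and convert to integers.
ids = df["A"].str.split("_").explode().astype(int)

# Count how often each ID occurs, ordered by ID.
counts = ids.value_counts().sort_index()

print(counts)
```

Series.explode requires pandas 0.25 or later. The resulting integer Series can also be passed to np.bincount if zero counts for absent IDs are needed.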