Python: set index and sort a specific column


I'm trying to prepare this data in a specific format.

import pandas as pd

voting = pd.read_json("GE2000.json")
voting.set_index(['county_fips','candidate_name','pty','vote_pct'],inplace=True)

print(voting)
This returns:

                                            vote
county_fips candidate_name  pty vote_pct
2000        Howard Phillips CS  0            596
            John Hagelin    NL  0            919
            Harry Browne    LB  1           2636
            George W. Bush  R   59        167398
            Al Gore         D   28         79004
1001        Howard Phillips I   0              9
            John Hagelin    I   0              5
            Harry Browne    LB  0             51
            George W. Bush  R   70         11993
            Al Gore         D   29          4942
After this, I want to sort by vote_pct and get the maximum (I tried sort_values, sort_index, etc., but could not get the desired output).

Here is the sample data:

[

  {
    "office" : "PRESIDENT",
    "county_name" : "Alaska",
    "vote_pct" : "0",
    "county_fips" : "2000",
    "pty" : "CS",
    "candidate_name" : "Howard Phillips",
  },
  {
    "office" : "PRESIDENT",
    "county_name" : "Alaska",
    "vote_pct" : "0",
    "county_fips" : "2000",
    "pty" : "NL",
    "candidate_name" : "John Hagelin",
  }
]

The data continues.

You can use groupby, for example:

voting.groupby('county_fips')['candidate_name'].max()
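
As a minimal sketch of the groupby idea, assuming a small hand-made frame in place of GE2000.json (the rows are copied from the printed output in the question, and the names max_pct and winners are only illustrative), the per-county maximum of vote_pct and the rows that hold it could be obtained like this:

```python
import pandas as pd

# Hypothetical stand-in for GE2000.json, built from rows shown in the question.
voting = pd.DataFrame({
    "county_fips": [2000, 2000, 2000, 1001, 1001],
    "candidate_name": ["Harry Browne", "George W. Bush", "Al Gore",
                       "George W. Bush", "Al Gore"],
    "pty": ["LB", "R", "D", "R", "D"],
    "vote_pct": [1, 59, 28, 70, 29],
    "vote": [2636, 167398, 79004, 11993, 4942],
})

# Largest vote_pct per county: a Series indexed by county_fips.
max_pct = voting.groupby("county_fips")["vote_pct"].max()

# idxmax returns the row label of each county's maximum,
# so .loc can recover the full row (candidate, party, votes).
winners = voting.loc[voting.groupby("county_fips")["vote_pct"].idxmax()]

print(max_pct)
print(winners)
```

Grouping on vote_pct rather than candidate_name is a deliberate choice here: taking max() of a name column returns the alphabetically last name, not the candidate with the highest share.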

There is also a more detailed answer here:

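Before calling set_index, you can use groupby with apply to get the maximum for each county, and only then set the index. This lets you run groupby on a column rather than on the index (which is awkward):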


Can you provide a sample of the raw data?
@juanpa.arrivillaga Updated, thank you.
Yay, better than my answer ;)
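import pandas as pd

voting = pd.read_json("GE2000.json")

# Given one county's rows, keep the row(s) whose vote_pct equals that county's maximum
get_largest_vote_pct = lambda row: row[row.vote_pct == row.vote_pct.max()]

# Group while county_fips is still an ordinary column
largest = voting.groupby('county_fips').apply(get_largest_vote_pct)

# Only now build the MultiIndex
largest.set_index(['county_fips','candidate_name','pty','vote_pct'],inplace=True)

print(largest)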

                                           vote
county_fips candidate_name pty vote_pct        
1001        George W. Bush R   70         11993
2000        George W. Bush R   59        167398
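
Since the question mentions trying sort_values, here is a hedged alternative sketch that reaches a similar result by sorting and then dropping duplicates. It assumes a small hand-made frame instead of GE2000.json (rows copied from the output shown in the question), and unlike the groupby/apply version it keeps only one row per county when several rows tie for the maximum.

```python
import pandas as pd

# Hypothetical stand-in for GE2000.json, built from rows shown in the question.
voting = pd.DataFrame({
    "county_fips": [2000, 2000, 2000, 1001, 1001, 1001],
    "candidate_name": ["Howard Phillips", "George W. Bush", "Al Gore",
                       "Harry Browne", "George W. Bush", "Al Gore"],
    "pty": ["CS", "R", "D", "LB", "R", "D"],
    "vote_pct": [0, 59, 28, 0, 70, 29],
    "vote": [596, 167398, 79004, 51, 11993, 4942],
})

largest = (
    voting.sort_values("vote_pct", ascending=False)  # highest share first
          .drop_duplicates("county_fips")            # keep the first row per county
          .sort_values("county_fips")                # restore a stable county order
          .set_index(["county_fips", "candidate_name", "pty", "vote_pct"])
)

print(largest)
```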