Python: How do I sort a table by the latest date within a level? (tags: python, sorting, date, pandas, pivot-table)


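I have created a DataFrame in the date order I want, but when I put it into a pivot table the order changes.

I want to sort the pivot table by the latest date of any row within a given level.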

import pandas as pd

data = [['yellow',1,'02/01/2015'],
        ['yellow',2,'04/01/2015'],
        ['green',3,'03/01/2015'],
        ['red',4,'01/01/2015']]

df = pd.DataFrame(data, columns=['colour','number','date'])
# index on colour and date so that number becomes the value column
df.pivot_table(index=['colour','date'])
The result is:

                    number
colour  date    
green   03/01/2015  3
red     01/01/2015  4
yellow  02/01/2015  1
        04/01/2015  2
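I would like the final result to list the colours with the most recent date at the top, that is, sorted by each colour's latest date (the dates wrapped in asterisks). So the result would be: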

                      number
colour  date
yellow  02/01/2015    1
        *04/01/2015*  2
green   *03/01/2015*  3
red     *01/01/2015*  4
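I can think of three possible approaches, but I have not managed to get any of them working:

a) get pivot_table to keep the original order
b) sort the pivot_table with a function based on the latest date of each colour
c) create an extra column holding the latest date for each colour

I'm not sure which route is the right one in the pandas world, but for now I'm stuck :(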

You can save the old MultiIndex before pivoting, and then reindex the output DataFrame by that old MultiIndex.

import pandas as pd

data = [['yellow',1,'02/01/2015'],
        ['yellow',2,'04/01/2015'],
        ['green',3,'03/01/2015'],
        ['red',4,'01/01/2015']]
df = pd.DataFrame(data, columns=['colour','number','date'])
# parse the date strings into datetimes (month first by default)
df['date'] = pd.to_datetime(df['date'])
# build a MultiIndex from the columns colour and date
df = df.set_index(['colour', 'date'])
print(df)
#                   number
#colour date              
#yellow 2015-02-01       1
#       2015-04-01       2
#green  2015-03-01       3
#red    2015-01-01       4

# save the old (unsorted) index in the variable idx
idx = df.index
print(idx)

# pivot_table sorts its index, so the row order changes here;
# aggfunc='sum' keeps number as an integer (each colour/date pair is unique)
pivoted = df.reset_index().pivot_table(index=['colour', 'date'],
                                       values='number', aggfunc='sum')

# reindex the pivoted frame by the old MultiIndex to restore the order
df1 = pivoted.reindex(idx)
print(df1)
#                   number
#colour date              
#yellow 2015-02-01       1
#       2015-04-01       2
#green  2015-03-01       3
#red    2015-01-01       4
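
As a possible shortcut for option (a) from the question: newer pandas versions (1.3 and later) give pivot_table a sort parameter. The sketch below assumes such a version and assumes that sort=False leaves the groups in order of first appearance, so the index would not need to be saved and restored. The variable name table and the explicit date format are choices made for this example only.

```python
import pandas as pd

data = [['yellow', 1, '02/01/2015'],
        ['yellow', 2, '04/01/2015'],
        ['green', 3, '03/01/2015'],
        ['red', 4, '01/01/2015']]
df = pd.DataFrame(data, columns=['colour', 'number', 'date'])
# month-first parsing, matching what the answer's pd.to_datetime call produces
df['date'] = pd.to_datetime(df['date'], format='%m/%d/%Y')

# sort=False (assumes pandas >= 1.3) asks pivot_table not to sort the group
# keys; aggfunc='sum' keeps number as an integer since each pair is unique
table = df.pivot_table(index=['colour', 'date'], values='number',
                       aggfunc='sum', sort=False)
print(table)
```

This only preserves whatever order the input already has, so the DataFrame still needs to be in the wanted order before pivoting.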
EDIT:

I think the problem is that the original DataFrame is not sorted. Its MultiIndex is:

MultiIndex(levels=[[u'green', u'red', u'yellow'], [u'2015-01-01', u'2015-02-01', u'2015-03-01', u'2015-04-01']],
           labels=[[2, 2, 0, 1], [1, 3, 2, 0]],
           names=[u'colour', u'date'])
The output DataFrame has a MultiIndex sorted by colour:

MultiIndex(levels=[[u'green', u'red', u'yellow'], [u'2015-01-01', u'2015-02-01', u'2015-03-01', u'2015-04-01']],
           labels=[[0, 1, 2, 2], [2, 0, 1, 3]],
           names=[u'colour', u'date'])
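You can sort by the date level instead, but then the MultiIndex and the output become:

# sortlevel was removed in later pandas versions; sort_index(level=...) replaces it
idx1 = df.sort_index(level='date').index
print(idx1)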
MultiIndex(levels=[[u'green', u'red', u'yellow'], [u'2015-01-01', u'2015-02-01', u'2015-03-01', u'2015-04-01']],
           labels=[[1, 2, 0, 2], [0, 1, 2, 3]],
           names=[u'colour', u'date'])


# reindex by idx1
df1 = df.reindex(idx1)
print(df1)
                   number
colour date              
red    2015-01-01       4
yellow 2015-02-01       1
green  2015-03-01       3
yellow 2015-04-01       2

So the solution is to reindex by the original MultiIndex.
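
If the input is not already in the wanted order, option (c) from the question can be sketched as follows. This is an illustration and not part of the answer above: the helper column name latest and the variable result are hypothetical, and the dates are parsed month first to match the answer's output.

```python
import pandas as pd

data = [['yellow', 1, '02/01/2015'],
        ['yellow', 2, '04/01/2015'],
        ['green', 3, '03/01/2015'],
        ['red', 4, '01/01/2015']]
df = pd.DataFrame(data, columns=['colour', 'number', 'date'])
df['date'] = pd.to_datetime(df['date'], format='%m/%d/%Y')

# 'latest' is a hypothetical helper column: the newest date of each colour,
# repeated on every row of that colour
df['latest'] = df.groupby('colour')['date'].transform('max')

# colours with the newest 'latest' come first; colour breaks ties between
# groups, and rows inside one colour stay in ascending date order
result = (df.sort_values(['latest', 'colour', 'date'],
                         ascending=[False, True, True])
            .drop(columns='latest')
            .set_index(['colour', 'date']))
print(result)
```

Because the sorting happens on a plain DataFrame before the index is set, no pivot_table call is needed for this small example.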

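Thanks jezrael. That makes sense. I tried your first answer but ran into some problems. I'll take another look at your second answer tomorrow. I have accepted your answer because it looks like it should work, but I'm still learning, so I haven't got it fully working yet.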