Python 2.7: Python DataFrame sort_values does not work for the second item
I have a DataFrame containing sales data:
Order ID  Order Date  Order Priority  Order Quantity  Sales
928.0     1/1/2009    High            32.0            180.36
10369.0   1/2/2009    Low             43.0            4,083.19
10144.0   1/2/2009    Critical        16.0            137.63
32323.0   1/1/2009    Not Specified   9.0             872.48
48353.0   1/2/2009    Critical        3.0             124.81
51008.0   1/3/2009    Critical        15.0            85.56
26756.0   1/2/2009    Critical        43.0            614.8
18144.0   1/2/2009    Low             4.0             1,239.06
22912.0   1/2/2009    Low             32.0            4,902.38
...
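I want to sort the values by date (earliest to latest) and by sales (largest to smallest). I wrote code for this in PyCharm Edu 3.5.1 (Python 2.7) and got the following output: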
Order ID  Order Date  Order Priority  Order Quantity  Sales
32323.0   2009-01-01  Not Specified   9.0             872.48
928.0     2009-01-01  High            32.0            180.36
26756.0   2009-01-02  Critical        43.0            614.8
22912.0   2009-01-02  Low             32.0            4,902.38
10369.0   2009-01-02  Low             43.0            4,083.19
10144.0   2009-01-02  Critical        16.0            137.63
48353.0   2009-01-02  Critical        3.0             124.81
18144.0   2009-01-02  Low             4.0             1,239.06
29376.0   2009-01-03  Not Specified   4.0             896.49
...
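The "Order Date" column is sorted correctly, but the "Sales" column is not sorted as expected. It looks as if PyCharm ignores the values that contain a thousands separator. Am I missing something?

Use read_csv with the thousands parameter to remove the , from the floats, and with parse_dates to convert the column to datetime. This is needed because the values of the Sales column are read as strings: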
import pandas as pd

# thousands=',' strips the separators so that Sales is parsed as float;
# parse_dates converts Order Date to datetime
df = pd.read_csv('sales.csv', thousands=',', parse_dates=['Order Date'])
print(df)
   Order ID  Order Date  Order Priority  Order Quantity    Sales
0     928.0  2009-01-01            High            32.0   180.36
1   10369.0  2009-01-02             Low            43.0  4083.19
2   10144.0  2009-01-02        Critical            16.0   137.63
3   32323.0  2009-01-01   Not Specified             9.0   872.48
4   48353.0  2009-01-02        Critical             3.0   124.81
5   51008.0  2009-01-03        Critical            15.0    85.56
6   26756.0  2009-01-02        Critical            43.0   614.80
7   18144.0  2009-01-02             Low             4.0  1239.06
8   22912.0  2009-01-02             Low            32.0  4902.38

# date ascending, sales descending within each date
df = df.sort_values(by=['Order Date', 'Sales'], ascending=[True, False])
print(df)
   Order ID  Order Date  Order Priority  Order Quantity    Sales
3   32323.0  2009-01-01   Not Specified             9.0   872.48
0     928.0  2009-01-01            High            32.0   180.36
8   22912.0  2009-01-02             Low            32.0  4902.38
1   10369.0  2009-01-02             Low            43.0  4083.19
7   18144.0  2009-01-02             Low             4.0  1239.06
6   26756.0  2009-01-02        Critical            43.0   614.80
2   10144.0  2009-01-02        Critical            16.0   137.63
4   48353.0  2009-01-02        Critical             3.0   124.81
5   51008.0  2009-01-03        Critical            15.0    85.56
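
To see why the Sales order in the question looks wrong, here is a minimal sketch, assuming Python 3 and a reasonably recent pandas. CSV_TEXT is a hypothetical in-memory stand-in for sales.csv (not the asker's file), with the comma-formatted values quoted. It reads the same data with and without thousands=',' and sorts both by Sales in descending order.

```python
import io

import pandas as pd

# Hypothetical stand-in for sales.csv. Values with a thousands separator
# are quoted so the comma is not treated as a field delimiter.
CSV_TEXT = '''Order ID,Order Date,Sales
10369,1/2/2009,"4,083.19"
10144,1/2/2009,137.63
26756,1/2/2009,614.8
18144,1/2/2009,"1,239.06"
'''

# Without thousands=',' the column cannot be parsed as numbers,
# so every Sales value stays a string.
raw = pd.read_csv(io.StringIO(CSV_TEXT))
raw_sorted = raw.sort_values(by='Sales', ascending=False)

# With thousands=',' the separators are stripped and Sales is numeric.
fixed = pd.read_csv(io.StringIO(CSV_TEXT), thousands=',')
fixed_sorted = fixed.sort_values(by='Sales', ascending=False)

print(raw['Sales'].dtype, raw_sorted['Sales'].tolist())
print(fixed['Sales'].dtype, fixed_sorted['Sales'].tolist())
```

Without the parameter the values are compared as text, character by character, which is consistent with 614.8 appearing ahead of 4,902.38 and 4,083.19 in the question's output.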

It seems you need df = pd.read_csv('sales.csv', header=0, thousands=',') so that the Sales column is converted to floats.

Another solution is to convert the columns after loading, using to_datetime + replace, or to_numeric:
df['Order Date'] = pd.to_datetime(df['Order Date'])
# remove the thousands separators, then convert the strings to float
df['Sales'] = df['Sales'].replace(',', '', regex=True).astype(float)
# if astype fails because of bad data, coerce unparseable values to NaN instead:
# df['Sales'] = pd.to_numeric(df['Sales'].replace(',', '', regex=True), errors='coerce')
df = df.sort_values(by=['Order Date', 'Sales'], ascending=[True, False])
print(df)
   Order ID  Order Date  Order Priority  Order Quantity    Sales
3   32323.0  2009-01-01   Not Specified             9.0   872.48
0     928.0  2009-01-01            High            32.0   180.36
8   22912.0  2009-01-02             Low            32.0  4902.38
1   10369.0  2009-01-02             Low            43.0  4083.19
7   18144.0  2009-01-02             Low             4.0  1239.06
6   26756.0  2009-01-02        Critical            43.0   614.80
2   10144.0  2009-01-02        Critical            16.0   137.63
4   48353.0  2009-01-02        Critical             3.0   124.81
5   51008.0  2009-01-03        Critical            15.0    85.56
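
The to_numeric variant with errors='coerce' can be tried on a small frame. This is a minimal sketch, assuming Python 3 and a recent pandas. The frame and its 'n/a' entry are made-up illustration data, not the asker's file, and the date format string is an assumption based on the sample rows.

```python
import pandas as pd

# Hypothetical frame mimicking a Sales column that was read as strings,
# including one deliberately bad entry ('n/a').
df = pd.DataFrame({
    'Order Date': ['1/2/2009', '1/1/2009', '1/2/2009', '1/2/2009'],
    'Sales': ['4,083.19', '872.48', 'n/a', '614.8'],
})

# Assumed month/day/year layout, as in the sample data.
df['Order Date'] = pd.to_datetime(df['Order Date'], format='%m/%d/%Y')

# Strip the thousands separators, then turn anything unparseable into NaN.
cleaned = df['Sales'].str.replace(',', '', regex=False)
df['Sales'] = pd.to_numeric(cleaned, errors='coerce')

# Rows whose Sales value could not be converted, worth inspecting by hand.
bad_rows = df[df['Sales'].isna()]
print(bad_rows)

df = df.sort_values(by=['Order Date', 'Sales'], ascending=[True, False])
print(df)
```

Checking bad_rows before sorting makes it visible when errors='coerce' has silently replaced a malformed value with NaN.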