Python: conditionally sorting a single column in ascending and descending order
I have 98,000 US home street addresses that I need to sort into "walking" order, i.e. listed in the order you would walk them: down one side of the street, then across the street, then back up the other side.
import pandas as pd
df = pd.read_excel('c:pdsort.xlsx')
# add boolean column for even or odd on number column
is_even = df.loc[:,'number'] % 2 == 0
df.loc[:, 'even'] = is_even
# group and then sort by number
df.groupby(['town','street','even']).apply(lambda x: x.sort_values('number'))
# sort odd numbers ascending and even numbers descending
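Desired df result: sort the odd street numbers in ascending order, then switch to descending order for the even street numbers. [Sorry, this is my first Stack Overflow question and I am not yet eligible to post an image of the Jupyter notebook.]
4 columns: number, street, town, even
Expected result for the "number" column: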
1231
1233
1235
1237
1239
1238
1236
1234
1232
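1230
Here is my attempt, if I understand correctly. I believe this could be done in a lambda function, but it helps to lay the logic out in a verbose way :)
Using np.lexsort, you can define the sequences to sort by. Data from @smj.
Setup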
import pandas as pd
import numpy as np
number_list = list(range(1, 11))
df = pd.DataFrame({'town': sorted(['Springfield', 'Shelbyville'] * 10),
'street': sorted(['Evergreen Terrace', 'Main Street'] * 10),
'number': number_list + number_list})
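Solution
Be careful with the key order in np.lexsort: it works from the last element of the sequence backwards. Here, s1 has the highest sort priority and s4 has the lowest.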
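The key-order rule described above is easy to get backwards, so here is a minimal sketch of it on toy data. The arrays `primary` and `secondary` are made-up names for illustration and are not part of the original answer.

```python
import numpy as np

# Toy keys (made up for illustration): `primary` should dominate,
# `secondary` only breaks ties within equal `primary` values.
primary = np.array([2, 1, 2, 1])
secondary = np.array([10, 40, 30, 20])

# np.lexsort treats the LAST key in the tuple as the primary sort key.
order = np.lexsort((secondary, primary))

print(order)             # row positions in sorted order
print(primary[order])    # primary key, non-decreasing
print(secondary[order])  # ascending within each primary value
```

Swapping the two keys in the tuple would sort by `secondary` first, which is the mistake the warning above is about.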
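s1 = df['town']
s2 = df['street']
s3 = ~df['number'] % 2 # ~ binds tighter than %, so this is 1 for even numbers and 0 for odd
s4 = np.where(s3, -df['number'], df['number']) # negate the even numbers so they sort descending
res = df.iloc[np.lexsort((s4, s3, s2, s1))] # last key (s1) is the primary sort key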
Result
print(res)
town street number
0 Shelbyville Evergreen Terrace 1
2 Shelbyville Evergreen Terrace 3
4 Shelbyville Evergreen Terrace 5
6 Shelbyville Evergreen Terrace 7
8 Shelbyville Evergreen Terrace 9
9 Shelbyville Evergreen Terrace 10
7 Shelbyville Evergreen Terrace 8
5 Shelbyville Evergreen Terrace 6
3 Shelbyville Evergreen Terrace 4
1 Shelbyville Evergreen Terrace 2
10 Springfield Main Street 1
12 Springfield Main Street 3
14 Springfield Main Street 5
16 Springfield Main Street 7
18 Springfield Main Street 9
19 Springfield Main Street 10
17 Springfield Main Street 8
15 Springfield Main Street 6
13 Springfield Main Street 4
11 Springfield Main Street 2
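For comparison, the same ordering can probably be reached with pandas alone by sorting on temporary helper columns. This is only a sketch on a small made-up frame that reuses the question's column names; `even` and `walk_key` are hypothetical helper names, not something from the answers.

```python
import pandas as pd

# Small made-up frame using the question's column names.
df = pd.DataFrame({
    'town': ['Shelbyville'] * 6,
    'street': ['Evergreen Terrace'] * 6,
    'number': [1230, 1231, 1232, 1233, 1234, 1235],
})

is_even = df['number'] % 2 == 0
walk = (
    df.assign(
        even=is_even,  # False (odd) sorts before True (even)
        # keep odd numbers as they are, negate even numbers so they sort descending
        walk_key=df['number'].where(~is_even, -df['number']),
    )
    .sort_values(['town', 'street', 'even', 'walk_key'])
    .drop(columns=['even', 'walk_key'])  # helper columns are no longer needed
)

print(walk['number'].tolist())
```

The idea is the same as in the np.lexsort solution: odd numbers come first within each street, and negating the even numbers turns an ascending sort into a descending one.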
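If I understand correctly, you first have to groupby street. Then split odd/even, sort each list, and concatenate the results as needed.
As the final output, do you want the whole database sorted in the new order? Please give sample input and desired output for two or three streets with perhaps four houses on each.
smj and jpp both attempted to answer my question (I could not "pandas-ize" piRSquared's suggestion), and I ran %%timeit on both in a Jupyter notebook. Option smj: 16.7 s ± 227 ms per loop (mean ± std. dev. of 7 runs, 1 loop each). Option jpp: 1.24 s ± 2.6 ms per loop (mean ± std. dev. of 7 runs, 1 loop each). Thank you very much, all 98k addresses are sorted!
Good answer, I have to leave the intermediate answer (-: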
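The approach suggested in the first comment above (group by street, split odd/even, sort each part, concatenate) might look roughly like the sketch below. `walking_order` is a hypothetical helper name and the data is made up. A Python-level loop over groups like this is likely to be slower than the vectorized np.lexsort approach on 98k rows.

```python
import pandas as pd

def walking_order(df):
    """Hypothetical helper: per street, odd numbers ascending, then even numbers descending."""
    parts = []
    # groupby sorts the group keys, so towns and streets come out alphabetically
    for _, block in df.groupby(['town', 'street'], sort=True):
        odd = block[block['number'] % 2 == 1].sort_values('number')
        even = block[block['number'] % 2 == 0].sort_values('number', ascending=False)
        parts.extend([odd, even])
    return pd.concat(parts)

# Made-up sample: two streets with four houses each, in arbitrary input order.
df = pd.DataFrame({
    'town': ['Springfield'] * 4 + ['Shelbyville'] * 4,
    'street': ['Main Street'] * 4 + ['Evergreen Terrace'] * 4,
    'number': [1, 2, 3, 4, 4, 3, 2, 1],
})

result = walking_order(df)
print(result['number'].tolist())
```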