Sort lists by a reference list in Python and return each list separately


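I have 3 lists: st, freq and pos.

Basically, st holds the distinct letters of a string, freq holds their corresponding frequencies, and pos holds the positions of those letters. Now I want to sort all 3 lists in descending order of frequency. zip did not help, because I need the sorted lists stored separately.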

Using NumPy:

import numpy as np

st = np.array(['B', 'D', 'C', 'A'])
freq = np.array([2, 3, 2, 4])
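# the sub-lists have different lengths, so an object array is needed
# (recent NumPy versions raise an error for ragged input without dtype=object)
pos = np.array([[1, 19], [3, 18, 21], [2, 20], [0, 17, 22, 23]], dtype=object)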

indices = np.argsort(freq)[::-1]  # indices that sort freq in decreasing order

st = st[indices]
freq = freq[indices]
pos = pos[indices]

You can compute the new indices and then apply them:

from operator import itemgetter

# sort enumerated freq values
sorted_freq = sorted(enumerate(freq), key=itemgetter(1), reverse=True)

# construct itemgetter object using first values from sorted_freq
# gives operator.itemgetter(3, 1, 0, 2)
order = itemgetter(*map(itemgetter(0), sorted_freq))

st = order(st)      # ('A', 'D', 'B', 'C')
freq = order(freq)  # (4, 3, 2, 2)
pos = order(pos)    # ([0, 17, 22, 23], [3, 18, 21], [1, 19], [2, 20])

This produces tuples, but converting to lists is trivial: list(order(st)), list(order(freq)), and so on.
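If separate lists are the goal, zip can also be made to work by unzipping after the sort. The following is a minimal sketch rather than part of the answer above; the names rows, st_sorted, freq_sorted and pos_sorted are only illustrative, and it assumes the lists are non-empty.

```python
from operator import itemgetter

st = ['B', 'D', 'C', 'A']
freq = [2, 3, 2, 4]
pos = [[1, 19], [3, 18, 21], [2, 20], [0, 17, 22, 23]]

# zip the lists into rows and sort the rows by frequency only (descending);
# sorted() is stable, so equal frequencies keep their original order
rows = sorted(zip(freq, st, pos), key=itemgetter(0), reverse=True)

# zip(*rows) transposes the rows back into columns; map(list, ...) turns
# each resulting tuple into a list
freq_sorted, st_sorted, pos_sorted = map(list, zip(*rows))

print(st_sorted)    # ['A', 'D', 'B', 'C']
print(freq_sorted)  # [4, 3, 2, 2]
print(pos_sorted)   # [[0, 17, 22, 23], [3, 18, 21], [1, 19], [2, 20]]
```

Sorting on itemgetter(0) rather than on the whole row avoids comparing the letters or position lists when two frequencies are equal.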

I would change the data structure to:

d = [{'st': 'B', 'freq': 2, 'pos': [1, 19]}, {'st': 'D', 'freq': 3, 'pos': [3, 18, 21]}, {'st': 'C', 'freq': 2, 'pos': [2, 20]}, {'st': 'A', 'freq': 4, 'pos': [0, 17, 22, 23]}]
Then sort by whichever criterion you need, in this case freq:

import operator
sorted(d, key=operator.itemgetter('freq'), reverse=True)
Result:

[{'freq': 4, 'pos': [0, 17, 22, 23], 'st': 'A'},
 {'freq': 3, 'pos': [3, 18, 21], 'st': 'D'},
 {'freq': 2, 'pos': [1, 19], 'st': 'B'},
 {'freq': 2, 'pos': [2, 20], 'st': 'C'}]
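Since the question asks for separate lists, the sorted records can be split again with list comprehensions. A minimal sketch, assuming the same data as above; the name sorted_d is only illustrative and does not appear in the answer.

```python
from operator import itemgetter

d = [{'st': 'B', 'freq': 2, 'pos': [1, 19]},
     {'st': 'D', 'freq': 3, 'pos': [3, 18, 21]},
     {'st': 'C', 'freq': 2, 'pos': [2, 20]},
     {'st': 'A', 'freq': 4, 'pos': [0, 17, 22, 23]}]

# sort the records by frequency, highest first
sorted_d = sorted(d, key=itemgetter('freq'), reverse=True)

# pull each field back out into its own list
st = [item['st'] for item in sorted_d]      # ['A', 'D', 'B', 'C']
freq = [item['freq'] for item in sorted_d]  # [4, 3, 2, 2]
pos = [item['pos'] for item in sorted_d]    # [[0, 17, 22, 23], [3, 18, 21], [1, 19], [2, 20]]
```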

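If the data belongs together, it really shouldn't be split across separate lists. At the very least I would use a namedtuple, or, if you have Python 3.7, a dataclass. You can easily repack the values for storage or further processing.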

To demonstrate the namedtuple approach:

from collections import namedtuple

string_info = namedtuple("string_info", "string,frequency,positions")

st = ['B', 'D', 'C', 'A']
freq = [2, 3, 2, 4]
pos = [[1, 19], [3, 18, 21], [2, 20], [0, 17, 22, 23]]

infos = [string_info(s, f, p) for s, f, p in zip(st, freq, pos)]
Now you have a single list in which related data is stored together. Sorting becomes almost trivial:

>>> sorted_infos_by_frequency = sorted(infos, key=lambda info: info.frequency, reverse=True)
>>> sorted_infos_by_frequency
[string_info(string='A', frequency=4, positions=[0, 17, 22, 23]),
 string_info(string='D', frequency=3, positions=[3, 18, 21]),
 string_info(string='B', frequency=2, positions=[1, 19]),
 string_info(string='C', frequency=2, positions=[2, 20])]
If you need to unpack it again:

>>> [i.string for i in sorted_infos_by_frequency]
['A', 'D', 'B', 'C']
>>> [i.frequency for i in sorted_infos_by_frequency]
[4, 3, 2, 2]
>>> [i.positions for i in sorted_infos_by_frequency]
[[0, 17, 22, 23], [3, 18, 21], [1, 19], [2, 20]]
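By the way, there is some redundancy here, because the length of positions already gives the frequency. In such cases it is usually better (although that may be subjective) not to store computed attributes directly if they can be computed efficiently, and len is efficient: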

string_info = namedtuple("string_info", "string,positions")

st = ['B', 'D', 'C', 'A']
pos = [[1, 19], [3, 18, 21], [2, 20], [0, 17, 22, 23]]

infos = [string_info(s, p) for s, p in zip(st, pos)]
Sort and unpack it like this:

>>> sorted_infos_by_frequency = sorted(infos, key=lambda info: len(info.positions), reverse=True)
>>> sorted_infos_by_frequency
[string_info(string='A', positions=[0, 17, 22, 23]),
 string_info(string='D', positions=[3, 18, 21]),
 string_info(string='B', positions=[1, 19]),
 string_info(string='C', positions=[2, 20])]
>>> [i.string for i in sorted_infos_by_frequency]
['A', 'D', 'B', 'C']
>>> [i.positions for i in sorted_infos_by_frequency]
[[0, 17, 22, 23], [3, 18, 21], [1, 19], [2, 20]]
>>> [len(i.positions) for i in sorted_infos_by_frequency]  # if you need the frequencies
[4, 3, 2, 2]
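The same idea can be sketched with a dataclass on Python 3.7 or later. This is only an illustration under assumptions: the class name StringInfo and its frequency property are hypothetical and not taken from the answer.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class StringInfo:
    string: str
    positions: List[int]

    @property
    def frequency(self) -> int:
        # derived from positions instead of being stored separately
        return len(self.positions)


st = ['B', 'D', 'C', 'A']
pos = [[1, 19], [3, 18, 21], [2, 20], [0, 17, 22, 23]]

infos = [StringInfo(s, p) for s, p in zip(st, pos)]
sorted_infos = sorted(infos, key=lambda info: info.frequency, reverse=True)

st_sorted = [i.string for i in sorted_infos]       # ['A', 'D', 'B', 'C']
freq_sorted = [i.frequency for i in sorted_infos]  # [4, 3, 2, 2]
pos_sorted = [i.positions for i in sorted_infos]   # [[0, 17, 22, 23], [3, 18, 21], [1, 19], [2, 20]]
```

Compared with the namedtuple, the dataclass keeps frequency available as an attribute without storing it twice.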

Thanks, but I would rather not use numpy in the first place.