Python: sorting a list so that the most similar values end up together


python python-3.x


Suppose I have six instruments that can be programmed on a synth, like this:

patches = {
  "piano":        (0,  10, 20, 30, 50), 
  "grand-piano":  (10, 10, 20, 35, 45),
  "guitar":       (80, 0,  50, 33, 80),
  "trumpet":      (85, 85, 85, 90, 90),
  "banjo":        (95, 0,  60, 45, 75),
  "trombone":     (95, 85, 85, 90, 85),
}

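Each instrument is defined by a unique set of five parameter values. So the piano sound is produced by programming the synthesizer with these values:

0, 10, 20, 30, 50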

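I want to order the list of patches so that the most similar instruments end up next to each other. You can see that piano and grand-piano are very similar, so they should be together. If you add up the absolute differences between the parameter values of these two instruments, you get: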

10+0+0+5+5=20

That is what I mean by close. Likewise, guitar and banjo are close, and trumpet and trombone are close.
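As a quick check of the arithmetic above, here is a minimal sketch that recomputes the sum of absolute parameter differences for the three pairs named in the question. The helper `distance` and the `pairs` list are illustrative names chosen here, not part of any library.

```python
patches = {
    "piano":        (0,  10, 20, 30, 50),
    "grand-piano":  (10, 10, 20, 35, 45),
    "guitar":       (80, 0,  50, 33, 80),
    "trumpet":      (85, 85, 85, 90, 90),
    "banjo":        (95, 0,  60, 45, 75),
    "trombone":     (95, 85, 85, 90, 85),
}

def distance(a, b):
    # Sum of absolute differences, parameter by parameter
    return sum(abs(x - y) for x, y in zip(a, b))

# The three pairs the question describes as "close" (illustrative list)
pairs = [("piano", "grand-piano"), ("guitar", "banjo"), ("trumpet", "trombone")]
pair_distances = {pair: distance(patches[pair[0]], patches[pair[1]]) for pair in pairs}

for (a, b), d in pair_distances.items():
    print(f"{a} / {b}: {d}")
```

With the values above this gives 20 for piano and grand-piano, 42 for guitar and banjo, and 15 for trumpet and trombone.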

I can work out how similar each instrument is to every other instrument like this:

from itertools import combinations

def distance(a,b):
    assert len(a) == len(b)
    return sum(abs(a[i]-b[i]) for i in range(len(a)))

distances = { (a,b): distance(patches[a], patches[b]) for a,b in combinations(patches.keys(),r=2) }

But I don't know how to order a list of patches based on this information. How can I do that?

Update

I think this is a solution based on АаСааааааа's implementation:

from sklearn.neighbors import NearestNeighbors as NN
# Note: on recent scikit-learn releases, import DistanceMetric from sklearn.metrics instead
from sklearn.neighbors import DistanceMetric as DM
from sklearn.preprocessing import normalize


patches = {
           "piano":        (0,  10, 20, 30, 50),
           "grand-piano":  (10, 10, 20, 35, 45),
           "guitar":       (80, 0,  50, 33, 80),
           "trumpet":      (85, 85, 85, 90, 90),
           "banjo":        (95, 0,  60, 45, 75),
           "trombone":     (95, 85, 85, 90, 85),
}

dist = DM.get_metric('manhattan')

patches = list(patches.items())
n = len(patches)

l = [i[1] for i in patches]

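# n_neighbors is passed by keyword, which recent scikit-learn versions require.
# No metric is given here, so the default (Euclidean) is used and `dist` above has no effect.
neighbors = NN(n_neighbors=n).fit(l)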
distances, indices = neighbors.kneighbors(l)

best_indices = sorted(((indices[i], sum(distances[i])) for i in range(len(indices))),key=lambda x:x[1])[0][0]
print([patches[i][0] for i in best_indices])

I wrote a solution:

patches = {
    "piano":        (0,  10, 20, 30, 50), 
    "grand-piano":  (10, 10, 20, 35, 45),
    "guitar":       (80, 0,  50, 33, 80),
    "trumpet":      (85, 85, 85, 90, 90),
    "banjo":        (95, 0,  60, 45, 75),
    "trombone":     (95, 85, 85, 90, 85),
}

import functools
import copy

# Returns the sum of absolute per-parameter differences between two patches
def return_diff(l1, l2):
    total = []
    for i in range(len(l1)):
        # Sort the pair so that (larger - smaller) gives the absolute difference
        total.append(functools.reduce(lambda x, y: y-x, sorted([l1[i], l2[i]])))
    return functools.reduce(lambda x, y: x+y, total)

# Nested loop: for each instrument, find the closest other instrument
for i, j in patches.items():
    print(i, end=" ")
    # maxI holds [smallest difference so far, instrument name];
    # despite its name it tracks a minimum
    maxI = [float("inf"), []]
    for k, l in patches.items():
        diff = [return_diff(l, j), k]
        # diff[0] != 0 skips the instrument itself
        if maxI[0] > diff[0] and diff[0] != 0:
            maxI = copy.deepcopy(diff)
    print(maxI)

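I assume you want to order them by computing the differences and then finding the smallest ones. Basically, for each item the program loops over the dict again and discards the larger values. maxI is lowered each time a smaller difference is found and is left unchanged otherwise, so it ends up holding the smallest difference and the name of the closest instrument.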

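It's not that simple. You have many dimensions, and there is no reference point (an ideal instrument) to compare everything else against. However, if you convert the data to an array, you can compute all the distances and sort them using sklearn.neighbors.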

Output:

[[0.         0.19664399 0.70818726 0.62892825 0.76357009 0.67486123]
 [0.19664399 0.         0.5912477  0.49435355 0.62671507 0.5346901 ]
 [0.70818726 0.5912477  0.         0.54458768 0.1255816  0.53411468]
 [0.62892825 0.49435355 0.54458768 0.         0.51928639 0.05581397]
 [0.76357009 0.62671507 0.1255816  0.51928639 0.         0.49940661]
 [0.67486123 0.5346901  0.53411468 0.05581397 0.49940661 0.        ]]


[[0.         0.19664399 0.62892825 0.67486123 0.70818726 0.76357009]
 [0.         0.19664399 0.49435355 0.5346901  0.5912477  0.62671507]
 [0.         0.1255816  0.53411468 0.54458768 0.5912477  0.70818726]
 [0.         0.05581397 0.49435355 0.51928639 0.54458768 0.62892825]
 [0.         0.1255816  0.49940661 0.51928639 0.62671507 0.76357009]
 [0.         0.05581397 0.49940661 0.53411468 0.5346901  0.67486123]]

[[0 1 3 5 2 4]
 [1 0 3 5 2 4]
 [2 4 5 3 1 0]
 [3 5 1 4 2 0]
 [4 2 5 3 1 0]
 [5 3 4 2 1 0]]

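In the first nested list of indices (the third print), we get all the instruments (their indices) sorted by their distance to the first instrument, and so on for the other rows.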
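The numbers in the output above appear consistent with scaling each patch to unit length (L2 normalization) and then taking Euclidean distances. The following is a minimal standard-library sketch under that assumption, not the answer's own code; the helper names `l2_normalize` and `euclidean` and the variables `dist_matrix` and `neighbor_indices` are made up for illustration.

```python
import math

patches = {
    "piano":        (0,  10, 20, 30, 50),
    "grand-piano":  (10, 10, 20, 35, 45),
    "guitar":       (80, 0,  50, 33, 80),
    "trumpet":      (85, 85, 85, 90, 90),
    "banjo":        (95, 0,  60, 45, 75),
    "trombone":     (95, 85, 85, 90, 85),
}

def l2_normalize(v):
    # Scale the vector to unit Euclidean length (assumed preprocessing step)
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

names = list(patches)
unit = [l2_normalize(patches[name]) for name in names]

# Full pairwise distance matrix between the normalized patches
dist_matrix = [[euclidean(a, b) for b in unit] for a in unit]

# For each instrument, the indices of all instruments sorted by distance to it
neighbor_indices = [
    sorted(range(len(names)), key=lambda j: row[j]) for row in dist_matrix
]

for name, order in zip(names, neighbor_indices):
    print(name, "->", [names[j] for j in order])
```

Because normalization discards overall magnitude and keeps only the direction of each parameter vector, the resulting ranking can differ from one based on raw Manhattan distances.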

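What you want is called "clustering". There are plenty of good algorithms for it.

Your objective function isn't very clear. Points in 5-dimensional space have no natural order, so I'm not sure it's useful to think of this as a sort. With only 6 of them, you could easily brute-force it, as long as you have a good way to judge between two possible arrangements.

@John Coleman "Order" is probably a better word than "sort".

So, in your algorithm, two instruments are close when their difference [...].

If you want an ordering of the list that minimizes the distance between consecutive elements, this is the travelling salesman problem in disguise.

Thanks for your answer. Why does the Euclidean metric make piano closer to trumpet (0.62892825) than to guitar (0.70818726)? I think "manhattan" should be used instead of Euclidean.

Honestly, I didn't check the calculations, but the sklearn library is quite reliable, so piano probably is closer to trumpet under the Euclidean metric. Yes, Manhattan is probably a better fit for this particular task. You should add it here: neighbors = NN(n, metric='manhattan').fit(l). You can remove dist = DM.get_metric('manhattan'); that was more for illustration.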
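The brute-force idea from the comments can be sketched as follows, assuming the goal is the ordering that minimizes the total Manhattan distance between consecutive instruments (an open path, not a round trip). The names `distance`, `path_cost`, and `best_order` are illustrative. Trying every permutation is only practical for a handful of items; six instruments give 720 orderings.

```python
from itertools import permutations

patches = {
    "piano":        (0,  10, 20, 30, 50),
    "grand-piano":  (10, 10, 20, 35, 45),
    "guitar":       (80, 0,  50, 33, 80),
    "trumpet":      (85, 85, 85, 90, 90),
    "banjo":        (95, 0,  60, 45, 75),
    "trombone":     (95, 85, 85, 90, 85),
}

def distance(a, b):
    # Manhattan distance: sum of absolute per-parameter differences
    return sum(abs(x - y) for x, y in zip(a, b))

def path_cost(order):
    # Total distance between each instrument and the next one in the ordering
    return sum(distance(patches[a], patches[b]) for a, b in zip(order, order[1:]))

# Exhaustive search over all orderings of the instrument names
best_order = min(permutations(patches), key=path_cost)

print(list(best_order), path_cost(best_order))
```

For the six patches in the question this keeps piano next to grand-piano, guitar next to banjo, and trumpet next to trombone, with a total cost of 389. The reversed ordering has the same cost.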