Python: train a model on different combinations of a list's elements and return the combination that gives the best score

I am trying to train several models on one dataset, using different combinations of a feature list. So far I have:

features = ['ca','thal','slope','oldpeak','chol','fbs','thalach','exang']
for i in range(1, len(features) + 1):  # iterate and select next features
    Sbest = [] # Sbest will contain the list of elements which give the best score
    input_f = features[:i]
    y = data['target']
    X = data[input_f] 
    model_= KMeans(n_clusters=2, random_state=0, init='k-means++', n_init=10, max_iter=100)
    model_.fit(X)
    precision,recall,fscore,support=score(y,model_.labels_,average='macro')
    Sbest.append(input_f)
    print(input_f,': {:.2f}'.format(fscore))

This gives the following output:

['ca'] : 0.62
['ca', 'thal'] : 0.62
['ca', 'thal', 'slope'] : 0.62
['ca', 'thal', 'slope', 'oldpeak'] : 0.71
['ca', 'thal', 'slope', 'oldpeak', 'chol'] : 0.42
['ca', 'thal', 'slope', 'oldpeak', 'chol', 'fbs'] : 0.42
['ca', 'thal', 'slope', 'oldpeak', 'chol', 'fbs', 'thalach'] : 0.56
['ca', 'thal', 'slope', 'oldpeak', 'chol', 'fbs', 'thalach', 'exang'] : 0.56

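What I want to output is the feature list that gives the best result, which, as we can see here, has an fscore of 0.71. So instead of all of that output, I want only this: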

['ca', 'thal', 'slope', 'oldpeak'] : 0.71

If several lists happen to produce the same score, the output should be the one with fewer elements. What is my code missing?

You get every list in the output because this call sits inside the for loop:

print(input_f, ': {:.2f}'.format(fscore))

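Instead, you can append a tuple of the list and its score to Sbest, then sort Sbest by whatever criteria you need.

Like this: (I modified the input so that two lists of different lengths have the same score)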
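A minimal sketch of that idea, assuming hypothetical hardcoded scores in place of real model runs (the pairs below are illustrative and were chosen only so that two lists of different lengths tie at 0.71):

```python
# Hypothetical (feature_list, score) pairs standing in for real model results.
Sbest = [
    (['ca'], 0.62),
    (['ca', 'thal'], 0.62),
    (['ca', 'thal', 'slope'], 0.71),
    (['ca', 'thal', 'slope', 'oldpeak'], 0.71),
    (['ca', 'thal', 'slope', 'oldpeak', 'chol'], 0.42),
]

# Highest score first; when scores tie, the shorter feature list comes first.
Sbest.sort(reverse=True, key=lambda x: (x[1], -len(x[0])))
print(Sbest[0])
```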

This gives:

(['ca', 'thal', 'slope'], 0.71)

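So, in your program, you can:

Remove the print line from inside the loop.

Append a tuple to Sbest on each iteration by doing

Sbest.append((input_f, fscore))

After the loop completes, sort Sbest and print its first item:

Sbest.sort(reverse=True, key=lambda x: (x[1], -1*len(x[0])))

and

print(Sbest[0])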
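Put together, the steps above can be sketched as follows. This is only a sketch: `best_prefix` and `score_fn` are hypothetical names introduced here, and the lookup table reuses the scores printed in the question instead of fitting KMeans. In the real program, `score_fn` would fit the model on `data[input_f]` and return the fscore.

```python
def best_prefix(features, score_fn):
    """Return the (feature_list, score) pair with the best score.

    Ties on score are broken in favour of the shorter feature list.
    """
    Sbest = []  # created once, outside the loop, so earlier results are kept
    for i in range(1, len(features) + 1):
        input_f = features[:i]
        fscore = score_fn(input_f)  # assumption: wraps model fitting and scoring
        Sbest.append((input_f, fscore))
    # Highest score first; on equal scores, fewer features first.
    Sbest.sort(reverse=True, key=lambda x: (x[1], -len(x[0])))
    return Sbest[0]


features = ['ca', 'thal', 'slope', 'oldpeak', 'chol', 'fbs', 'thalach', 'exang']

# Stand-in scores copied from the question's output, keyed by prefix length.
question_scores = {1: 0.62, 2: 0.62, 3: 0.62, 4: 0.71,
                   5: 0.42, 6: 0.42, 7: 0.56, 8: 0.56}

best = best_prefix(features, lambda f: question_scores[len(f)])
print(best[0], ': {:.2f}'.format(best[1]))
```

With these stand-in scores, the only line printed is the one for `['ca', 'thal', 'slope', 'oldpeak']` at 0.71. If sorting the whole list feels heavier than needed, `max(Sbest, key=lambda x: (x[1], -len(x[0])))` picks the same element without reordering it.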

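This returns the last score, not the best one.

Keep the declaration of Sbest outside the loop; otherwise you lose the information from previous iterations. That is why you only see the last score.

I figured it out, but it returns a list, and there is no return statement there.

Sbest is a list of tuples: Sbest --> [([], float), ([], float), ([], float)]

It's not like that; it's more like Sbest --> [['s'], float] on each line, with no commas between them.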
