Pipeline giving different answers in sklearn (Python)

Tags: python, machine-learning, scikit-learn, artificial-intelligence, logistic-regression

I wrote two programs that should follow the same logic, but they give different answers.

First:

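import numpy as np
from sklearn import preprocessing as pps
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

# train_features and label_features are assumed to be NumPy arrays loaded elsewhere.
# The first 1710 rows are used for training, the rest for testing.
train_data = train_features[:1710]
train_label = label_features[:1710].ravel()
test_data = train_features[1710:]
test_label = label_features[1710:].ravel()

def getAccuracy(ans):
    # Percentage of predictions that match test_label.
    d = 0
    for i in range(np.size(ans, 0)):
        if ans[i] == test_label[i]:
            d += 1
    return (d * 100) / float(np.size(ans, 0))

# Scaler and classifier chained in a single pipeline.
estimators = [('pps', pps.RobustScaler()), ('clf', LogisticRegression())]
pipe = Pipeline(estimators)
pipe = pipe.fit(train_data, train_label)

ans = pipe.predict(test_data)
print(getAccuracy(ans))

Second: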

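import numpy as np
from sklearn import preprocessing as pps
from sklearn.linear_model import LogisticRegression

# train_features and label_features are assumed to be NumPy arrays loaded elsewhere.
# The first 1710 rows are used for training, the rest for testing.
train_data = train_features[:1710]
train_label = label_features[:1710].ravel()
test_data = train_features[1710:]
test_label = label_features[1710:].ravel()

def getAccuracy(ans):
    # Percentage of predictions that match test_label.
    d = 0
    for i in range(np.size(ans, 0)):
        if ans[i] == test_label[i]:
            d += 1
    return (d * 100) / float(np.size(ans, 0))

def preprocess(features):
    # Creates and fits a new RobustScaler on whatever data is passed in.
    return pps.RobustScaler().fit_transform(features)

train_data = preprocess(train_data)
clf = LogisticRegression().fit(train_data, train_label)

test_data = preprocess(test_data)
ans = clf.predict(test_data)
print(getAccuracy(ans))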

The first gives 80.81 and the second gives 84.92. Why are the two different?

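Your second program is invalid: its `preprocess` function fits a new scaler on the test set, which should never happen. The pipeline, by contrast, fits the RobustScaler on your training data only and then just calls `transform` on the test data. The two programs therefore scale the test set with different statistics, so the classifier sees different inputs and gives different predictions.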
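A minimal sketch of the manual version done correctly, to illustrate the point: the scaler is fitted once on the training data and the same fitted object is reused for the test data. The synthetic dataset from `make_classification` and the 300/100 split are assumptions standing in for the asker's `train_features` and `label_features`, which are not shown in the question.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import RobustScaler

# Synthetic stand-in for the asker's data (an assumption, not the real dataset).
X, y = make_classification(n_samples=400, n_features=8, random_state=0)
X_train, X_test = X[:300], X[300:]
y_train, y_test = y[:300], y[300:]

# Pipeline version: the scaler is fitted on the training data only.
pipe = Pipeline([('pps', RobustScaler()), ('clf', LogisticRegression())])
pipe.fit(X_train, y_train)
pipe_pred = pipe.predict(X_test)

# Manual version: fit the scaler once on the training data, then reuse it.
scaler = RobustScaler().fit(X_train)
clf = LogisticRegression().fit(scaler.transform(X_train), y_train)
manual_pred = clf.predict(scaler.transform(X_test))  # transform only, no refit

# With the scaler handled this way, both versions should agree.
same_predictions = np.array_equal(pipe_pred, manual_pred)
print(same_predictions)
```

Written like this, the manual code performs the same steps as the pipeline. As a side note, `pipe.score(X_test, y_test)` returns the accuracy as a fraction, which could replace a hand-written accuracy loop.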

Thanks for the help.