Python: Why am I getting a data conversion warning?
I'm new to this, so I'd really appreciate your help. I'm playing with the MNIST dataset. I took the loading code from another source, but changed "images" to be 2-D so that each image is a single feature vector. I then run PCA on the data and use an SVM to check the score. Everything seems to work fine, but I get the following warning and I don't know why:
"DataConversionWarning: A column-vector y was passed when a 1d array was expected.\
Please change the shape of y to (n_samples, ), for example using ravel()."
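I've tried a few things but can't seem to get rid of this warning. Any suggestions? Here is the full code (the indentation may have been mangled when I pasted it here):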
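Thanks for your help.

I think scikit-learn expects y to be a 1-D array. Your labels variable is 2-D: labels.shape is (N, 1). The warning is telling you to use labels.ravel(), which converts labels into a 1-D array of shape (N,). Reshaping would also work: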
labels = labels.reshape((N,))
Come to think of it, calling squeeze would do the same:
labels = labels.squeeze()
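To make the three options concrete, here is a small sketch using a made-up label array. The size N = 5 and the variable names flat_ravel, flat_reshape and flat_squeeze are illustrative assumptions, not part of the original code. Each call should produce a 1-D array of shape (N,):

```python
import numpy as np

N = 5  # illustrative size, not the real MNIST sample count
labels = np.zeros((N, 1), dtype=np.int8)  # column vector, as the loader builds it

flat_ravel = labels.ravel()          # flattens to shape (N,)
flat_reshape = labels.reshape((N,))  # explicit target shape (N,)
flat_squeeze = labels.squeeze()      # drops every axis of length 1

print(labels.shape, flat_ravel.shape, flat_reshape.shape, flat_squeeze.shape)
```

One caveat: with a single sample, squeeze() also drops the remaining axis and returns a 0-d array, so ravel() or reshape are the safer choices in general.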
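I think the underlying issue is that, in numpy, a 1-D array is not the same thing as a 2-D array in which one dimension has length 1.

Thanks! For some reason I was convinced the problem was with the "images" array, and it never occurred to me to look at the labels. Silly me. Anyway, no more warnings. Thanks again :)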
import os, struct
from array import array as pyarray
from numpy import append, array, int8, uint8, zeros, arange
from sklearn import svm, decomposition
#from pylab import *
#from matplotlib import pyplot as plt

def load_mnist(dataset="training", digits=arange(10), path="."):
    """
    Loads MNIST files into 3D numpy arrays
    Adapted from: http://abel.ee.ucla.edu/cvxopt/_downloads/mnist.py
    """
    if dataset == "training":
        fname_img = os.path.join(path, 'train-images.idx3-ubyte')
        fname_lbl = os.path.join(path, 'train-labels.idx1-ubyte')
    elif dataset == "testing":
        fname_img = os.path.join(path, 't10k-images.idx3-ubyte')
        fname_lbl = os.path.join(path, 't10k-labels.idx1-ubyte')
    else:
        raise ValueError("dataset must be 'testing' or 'training'")

    # Label file: 8-byte header (magic number, item count), then one byte per label
    flbl = open(fname_lbl, 'rb')
    magic_nr, size = struct.unpack(">II", flbl.read(8))
    lbl = pyarray("b", flbl.read())
    flbl.close()

    # Image file: 16-byte header (magic number, count, rows, cols), then pixel bytes
    fimg = open(fname_img, 'rb')
    magic_nr, size, rows, cols = struct.unpack(">IIII", fimg.read(16))
    img = pyarray("B", fimg.read())
    fimg.close()

    # Keep only the samples whose label is one of the requested digits
    ind = [ k for k in range(size) if lbl[k] in digits ]
    N = len(ind)

    images = zeros((N, rows*cols), dtype=uint8)  # one flattened image per row
    labels = zeros((N, 1), dtype=int8)           # column vector of shape (N, 1)
    for i in range(len(ind)):
        images[i] = array(img[ ind[i]*rows*cols : (ind[i]+1)*rows*cols ])
        labels[i] = lbl[ind[i]]

    return images, labels

if __name__ == "__main__":
    images, labels = load_mnist('training', arange(10),"path...")
    pca = decomposition.PCA()
    pca.fit(images)
    pca.n_components = 200
    images_reduced = pca.fit_transform(images)

    lin_classifier = svm.LinearSVC()
    lin_classifier.fit(images_reduced, labels)

    images2, labels2 = load_mnist('testing', arange(10),"path...")
    images2_reduced = pca.transform(images2)
    score = lin_classifier.score(images2_reduced,labels2)
    print(score)
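As a rough illustration of the fix applied to the training step, the sketch below fits a LinearSVC on synthetic data standing in for the PCA-reduced MNIST features. The data, the array sizes, and the helper name conversion_warnings are assumptions made for this example only. With the column-vector labels, scikit-learn is expected to emit a DataConversionWarning; with the raveled labels it should not:

```python
import warnings

import numpy as np
from sklearn.exceptions import DataConversionWarning
from sklearn.svm import LinearSVC

rng = np.random.RandomState(0)
X = rng.rand(40, 5)  # synthetic stand-in for images_reduced
# Column vector of shape (40, 1), mimicking the loader's labels array
y_col = (X[:, 0] > 0.5).astype(np.int8).reshape(-1, 1)

def conversion_warnings(y):
    """Fit a LinearSVC on (X, y) and return the DataConversionWarnings raised."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        LinearSVC().fit(X, y)
    return [w for w in caught if issubclass(w.category, DataConversionWarning)]

n_before = len(conversion_warnings(y_col))         # labels of shape (40, 1)
n_after = len(conversion_warnings(y_col.ravel()))  # labels of shape (40,)

print(n_before, n_after)
```

In the code from the question, the equivalent change would be to pass labels.ravel() to fit and labels2.ravel() to score, or to allocate labels = zeros(N, dtype=int8) inside load_mnist so that the loader returns 1-D labels in the first place.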