Python 2.7: Logistic regression using SciPy's fmin_cg
I am trying to implement a logistic classifier in Python. The goal is to train the algorithm to recognize the digits 0-9 using the MNIST handwritten digit dataset. However, fmin_cg seems to be changing the dimensions of my input parameters. I have tried reshaping the arguments inside cost() and gradient(), but without success; it only produced more errors.
from scipy.io import loadmat
from numpy import shape, zeros, ones, dot, hstack, vstack, log, transpose, kron
from scipy.special import expit as sigmoid
import scipy.optimize

def cost(theta, X, y):
    h = sigmoid( X.dot(theta) )
    pos_class = y.T.dot( log(h) )
    neg_class = (1.0-y).T.dot( log(1.0-h) )
    cost = ((-1.0/m)*(pos_class+neg_class))
    return cost

def gradient(theta, X, y):
    h = sigmoid( X.dot(theta) )
    grad = (1.0/m)*(X.T.dot((h-y)))
    return grad

def one_vs_all(X, y, theta):
    # add x1 feature, x1 = 1, to each example set
    X = hstack( (ones((m,1)),X) )
    # train the classifier for digit 9.0
    temp_y = (y == 9.0)+0
    result = scipy.optimize.fmin_cg( cost, fprime=gradient, x0=theta, \
                                     args=(X, temp_y), maxiter=50, disp=False, full_output=True )
    print result[1]

# Load data from Matlab file
data = loadmat('data.mat')
X,y = data['X'],data['y']
m,n = shape(X)
theta = zeros((n+1, 1))
one_vs_all(X, y, theta)
The error I receive is:
Traceback (most recent call last):
  File "/Users/jkarimi91/Documents/Digit Recognizer/Digit_Recognizer.py", line 36, in <module>
    one_vs_all(X, y, theta)
  File "/Users/jkarimi91/Documents/Digit Recognizer/Digit_Recognizer.py", line 26, in one_vs_all
    args=(X, temp_y), maxiter=50, disp=False, full_output=True )
  File "/anaconda/lib/python2.7/site-packages/scipy/optimize/optimize.py", line 1092, in fmin_cg
    res = _minimize_cg(f, x0, args, fprime, callback=callback, **opts)
  File "/anaconda/lib/python2.7/site-packages/scipy/optimize/optimize.py", line 1156, in _minimize_cg
    deltak = numpy.dot(gfk, gfk)
ValueError: shapes (401,5000) and (401,5000) not aligned: 5000 (dim 1) != 401 (dim 0)
[Finished in 1.0s with exit code 1]
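With the current code, both cost() and gradient() return 2-D arrays. fmin_cg flattens x0 to a 1-D array before calling them, so theta arrives with shape (401,) while y still has shape (5000, 1). As a result, h - y broadcasts to shape (5000, 5000) and the gradient comes out as (401, 5000), which is exactly the shape in the error message. For fmin_cg to work, the cost function should return a scalar and the gradient function a 1-D array of the same length as theta (as stated in the documentation).

I know this may be a bit late, but this should work. I ran into several memory errors in your gradient function, so I modified the code slightly and added regularization. Please check: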
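from numpy import shape, reshape
from scipy.special import expit as sigmoid

def gradients(theta, X, y, Lambda):
    m, n = shape(X)
    # fmin_cg passes theta as a 1-D array; make it a column to match y of shape (m, 1)
    theta = reshape(theta, (n, 1))
    h = sigmoid(X.dot(theta))
    # Copy before zeroing the bias term so the optimizer's own array is not modified
    reg = theta.copy()
    reg[0, 0] = 0
    # float() avoids integer division of Lambda / m under Python 2
    grad = (X.T.dot(h - y) / m) + (float(Lambda) / m) * reg
    # Return a 1-D array, as fmin_cg expects
    return grad.ravel()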
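
For reference, here is a minimal, unregularized sketch of the same idea, where theta and y are kept 1-D throughout so that the cost is a scalar and the gradient is a 1-D array. data.mat is not available here, so a small random dataset stands in for it; the names X_raw, true_w, y_col and the eps guard inside the logarithms are illustrative assumptions, not part of the original code. It should run under both Python 2.7 and Python 3.

```python
import numpy as np
from scipy.optimize import fmin_cg
from scipy.special import expit as sigmoid


def cost(theta, X, y):
    # theta has shape (n,), y has shape (m,); the result is a scalar
    m = X.shape[0]
    h = sigmoid(X.dot(theta))
    eps = 1e-12  # assumption: small guard against log(0), not in the original
    return -(y.dot(np.log(h + eps)) + (1.0 - y).dot(np.log(1.0 - h + eps))) / m


def gradient(theta, X, y):
    # returns a 1-D array with the same shape as theta
    m = X.shape[0]
    h = sigmoid(X.dot(theta))
    return X.T.dot(h - y) / m


# Synthetic stand-in for data.mat (hypothetical data, for illustration only)
rng = np.random.RandomState(0)
m, n = 200, 5
X_raw = rng.randn(m, n)
true_w = rng.randn(n)
labels = (X_raw.dot(true_w) + 0.5 * rng.randn(m) > 0).astype(float)
y_col = labels.reshape(m, 1)  # column vector, the shape loadmat would give

X = np.hstack((np.ones((m, 1)), X_raw))  # prepend the bias column
y = y_col.ravel()                        # flatten the labels to 1-D
theta0 = np.zeros(n + 1)                 # 1-D initial guess

# Mixing a 1-D h with a column-vector y broadcasts h - y to (m, m)
h0 = sigmoid(X.dot(theta0))
bad_shape = X.T.dot(h0 - y_col).shape      # (n + 1, m)
good_shape = gradient(theta0, X, y).shape  # (n + 1,)

result = fmin_cg(cost, x0=theta0, fprime=gradient, args=(X, y),
                 maxiter=50, disp=False, full_output=True)
theta_opt, final_cost = result[0], result[1]
print("final cost: %.4f" % final_cost)
```

Flattening y once before the call avoids reshaping theta inside every cost and gradient evaluation. For the one-vs-all setup in the question, the same flattening would be applied to temp_y.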