What is wrong with my custom logistic regression implementation?
Tags: numpy, machine-learning, scikit-learn, logistic-regression

I am trying to reproduce almost the same results as sklearn, but I am not getting good results. The intercept values of my custom implementation and sklearn's implementation differ by 5, so I am trying to reduce that difference as much as possible. My sklearn code is as follows:
from sklearn import linear_model
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=50000, n_features=15, n_informative=10, n_redundant=5,
                           n_classes=2, weights=[0.7], class_sep=0.7, random_state=15)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=15)
# Note: scikit-learn 1.1 renamed loss='log' to loss='log_loss'; use the latter on recent versions.
clf = linear_model.SGDClassifier(eta0=0.0001, alpha=0.0001, loss='log', random_state=15, penalty='l2', tol=1e-3, verbose=2, learning_rate='constant')
clf.fit(X=X_train, y=y_train)  # fitting our model
print(clf.coef_, clf.coef_.shape, clf.intercept_)
This results in:
(array([[-0.42336692, 0.18547565, -0.14859036, 0.34144407, -0.2081867 ,
0.56016579, -0.45242483, -0.09408813, 0.2092732 , 0.18084126,
0.19705191, 0.00421916, -0.0796037 , 0.33852802, 0.02266721]]),
(1, 15),
array([-0.8531383]))
My custom implementation:

import numpy as np

def initialize_weights(dim):
    ''' In this function, we will initialize our weights and bias'''
    # initialize the weights to a zeros array with the same shape as dim
    # (zeros_like is used here, so dim is a sample row such as X_train[0])
    # initialize bias to zero
    w = np.zeros_like(dim)
    b = 0
    return w, b

def sigmoid(z):
    ''' In this function, we will return sigmoid of z'''
    # compute sigmoid(z) and return
    return 1/(1+np.exp(-z))

def logloss(y_true, y_pred):
    '''In this function, we will compute log loss '''
    loss = 0
    A = list(zip(y_true, y_pred))
    for y, y_score in A:
        loss += (-1/len(A))*(y*np.log10(y_score) + (1-y) * np.log10(1-y_score))
    return loss

def gradient_dw(x, y, w, b, alpha, N):
    '''In this function, we will compute the gradient w.r.to w '''
    z = np.dot(w, x) + b
    dw = x*(y - sigmoid(z)) - ((1/alpha)*(1/N) * w)
    return dw

def gradient_db(x, y, w, b):
    '''In this function, we will compute the gradient w.r.to b '''
    z = np.dot(w, x) + b
    db = y - sigmoid(z)
    return db

def train(X_train, y_train, X_test, y_test, epochs, alpha, eta0, tol=1e-3):
    ''' In this function, we will implement logistic regression'''
    # Here eta0 is the learning rate
    # initialize the weights (call the initialize_weights(X_train[0]) function)
    w, b = initialize_weights(X_train[0])
    train_loss = []
    test_loss = []
    # for every epoch
    for epoch in range(epochs):
        # for every data point (X_train, y_train)
        for x, y in zip(X_train, y_train):
            # compute gradient w.r.to w (call the gradient_dw() function)
            dw = gradient_dw(x, y, w, b, alpha, len(X_train))
            # compute gradient w.r.to b (call the gradient_db() function)
            db = gradient_db(x, y, w, b)
            # update w, b
            w = w + eta0 * dw
            b = b + eta0 * db
        # predict the output for all data points in X_train using w, b
        y_pred = [sigmoid(np.dot(w, x)) for x in X_train]
        # compute the loss between predicted and actual values and store it in a list
        train_loss.append(logloss(y_train, y_pred))
        # predict the output for all data points in X_test using w, b
        y_pred_test = [sigmoid(np.dot(w, x)) for x in X_test]
        print(f"EPOCH: {epoch} Train Loss: {logloss(y_train, y_pred)} Test Loss: {logloss(y_test, y_pred_test)}")
        # compute the loss between predicted and actual values and store it in a list
        test_loss.append(logloss(y_test, y_pred_test))
        # you can also compare the previous and current loss; if the loss is not changing, stop and return w, b
    return w, b, train_loss, test_loss

alpha = 0.0001
eta0 = 0.0001
N = len(X_train)
epochs = 50
w, b, train_loss, test_loss = train(X_train, y_train, X_test, y_test, epochs, alpha, eta0)
w, b results in:
(array([-0.22281323, 0.10570237, -0.02506523, 0.16630429, -0.07033019,
0.27985805, -0.27348925, -0.04622113, 0.13212066, 0.05330409,
0.09926212, -0.00791336, -0.02920803, 0.1828124 , 0.03442375]),
-0.8019981458384148)
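Please help.

In the function gradient_dw(), alpha (the regularization strength) should be in the numerator, not the denominator:

def gradient_dw(x, y, w, b, alpha, N):
    '''In this function, we will compute the gradient w.r.to w '''
    z = np.dot(w, x) + b
    dw = x*(y - sigmoid(z)) - ((alpha)*(1/N) * w)
    return dw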
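
To see how much the placement of alpha matters, it helps to plug in the question's settings. The sketch below assumes N = 37500, which is 75% of the 50000 generated samples; the variable names `wrong_coeff` and `right_coeff` are illustrative only.

```python
# Size of the per-sample weight-decay coefficient under both formulas,
# using the question's settings (50000 samples, 25% held out for testing).
alpha = 0.0001
N = int(50000 * 0.75)  # 37500 training samples (assumed from test_size=0.25)

wrong_coeff = (1 / alpha) * (1 / N)  # alpha in the denominator (question's code)
right_coeff = alpha * (1 / N)        # alpha in the numerator (corrected code)

print(f"(1/alpha)/N = {wrong_coeff:.4f}")
print(f"alpha/N     = {right_coeff:.3e}")
print(f"ratio       = {wrong_coeff / right_coeff:.1e}")
```

With these values the original formula shrinks the weights with a coefficient of roughly 0.27 on every update, against about 2.7e-9 for the corrected one. That likely explains why the custom weights in the question are consistently smaller in magnitude than sklearn's.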
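This follows from the cost function of regularized logistic regression. The L2 penalty in the cost is proportional to alpha, so differentiating the cost with respect to the weights gives a gradient-descent update in which alpha multiplies w rather than dividing it.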
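
One formulation consistent with the corrected gradient_dw() and gradient_db() code is sketched below. It assumes a per-sample objective with an L2 penalty of (alpha / 2N)·‖w‖², which is the scaling implied by the code rather than a statement of how Scikit-learn scales its penalty internally; the symbol η₀ stands for eta0.

```latex
z_i = w^\top x_i + b, \qquad \sigma(z) = \frac{1}{1 + e^{-z}}

J_i(w, b) = -\bigl[\, y_i \log \sigma(z_i) + (1 - y_i) \log\bigl(1 - \sigma(z_i)\bigr) \bigr]
            + \frac{\alpha}{2N} \lVert w \rVert^2

\frac{\partial J_i}{\partial w} = -\bigl(y_i - \sigma(z_i)\bigr)\, x_i + \frac{\alpha}{N}\, w,
\qquad
\frac{\partial J_i}{\partial b} = -\bigl(y_i - \sigma(z_i)\bigr)

w \leftarrow w + \eta_0 \Bigl[ \bigl(y_i - \sigma(z_i)\bigr)\, x_i - \frac{\alpha}{N}\, w \Bigr],
\qquad
b \leftarrow b + \eta_0 \bigl(y_i - \sigma(z_i)\bigr)
```

The bracketed term in the weight update is exactly what gradient_dw() returns, and the second update uses the value returned by gradient_db().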
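Another small correction to the code: the intercept b needs to be added when computing the arrays of predicted values, in the following lines:

y_pred = [sigmoid(np.dot(w, x) + b) for x in X_train]
y_pred_test = [sigmoid(np.dot(w, x) + b) for x in X_test]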
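
The same predictions can also be computed without a Python loop. The following is a minimal sketch, not part of the original answer: `predict_proba` is a hypothetical helper name and the small arrays are made-up inputs used only to show that the vectorized form matches the list comprehension and that dropping b changes the result.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def predict_proba(X, w, b):
    """Vectorized equivalent of [sigmoid(np.dot(w, x) + b) for x in X]."""
    return sigmoid(X @ w + b)

# Tiny made-up inputs, for illustration only.
X_demo = np.array([[1.0, 2.0], [0.5, -1.0], [-2.0, 0.0]])
w_demo = np.array([0.3, -0.2])
b_demo = -0.8

vectorized = predict_proba(X_demo, w_demo, b_demo)
looped = np.array([sigmoid(np.dot(w_demo, x) + b_demo) for x in X_demo])
without_b = sigmoid(X_demo @ w_demo)  # what the question's code computes

print(vectorized)
print(looped)
print(without_b)
```

Leaving out b does not change the learned weights, since the loss values are only reported, but it makes the printed train and test losses describe a different model from the one being trained.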
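The complete code in its final form is shown below. With these changes, all the weights differ from the Scikit-learn implementation by about 0.001.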
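
import numpy as np
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn import linear_model
import matplotlib.pyplot as plt

X, y = make_classification(n_samples=50000, n_features=15, n_informative=10, n_redundant=5,
                           n_classes=2, weights=[0.7], class_sep=0.7, random_state=15)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=15)

# Note: scikit-learn 1.1 renamed loss='log' to loss='log_loss'; use the latter on recent versions.
clf = linear_model.SGDClassifier(eta0=0.0001, alpha=0.0001, loss='log', random_state=15, penalty='l2', tol=1e-3, verbose=2, learning_rate='constant')
clf.fit(X=X_train, y=y_train)  # fitting our model
print(clf.coef_, clf.coef_.shape, clf.intercept_)

def initialize_weights(dim):
    ''' In this function, we will initialize our weights and bias'''
    # initialize the weights to a zeros array with the same shape as dim
    # initialize bias to zero
    w = np.zeros_like(dim)
    b = 0
    return w, b

def sigmoid(z):
    ''' In this function, we will return sigmoid of z'''
    return 1/(1+np.exp(-z))

def logloss(y_true, y_pred):
    '''In this function, we will compute log loss '''
    loss = 0
    A = list(zip(y_true, y_pred))
    for y, y_score in A:
        loss += (-1/len(A))*(y*np.log10(y_score) + (1-y) * np.log10(1-y_score))
    return loss

def gradient_dw(x, y, w, b, alpha, N):
    '''In this function, we will compute the gradient w.r.to w '''
    z = np.dot(w, x) + b
    dw = x*(y - sigmoid(z)) - ((alpha)*(1/N) * w)  # alpha in the numerator
    return dw

def gradient_db(x, y, w, b):
    '''In this function, we will compute the gradient w.r.to b '''
    z = np.dot(w, x) + b
    db = y - sigmoid(z)
    return db

def train(X_train, y_train, X_test, y_test, epochs, alpha, eta0, tol=1e-3):
    ''' In this function, we will implement logistic regression'''
    # Here eta0 is the learning rate
    # initialize the weights (call the initialize_weights(X_train[0]) function)
    w, b = initialize_weights(X_train[0])
    train_loss = []
    test_loss = []
    # for every epoch
    for epoch in range(epochs):
        # for every data point (X_train, y_train)
        for x, y in zip(X_train, y_train):
            # compute gradient w.r.to w (call the gradient_dw() function)
            dw = gradient_dw(x, y, w, b, alpha, len(X_train))
            # compute gradient w.r.to b (call the gradient_db() function)
            db = gradient_db(x, y, w, b)
            # update w, b
            w = w + eta0 * dw
            b = b + eta0 * db
        # predict the output for all data points in X_train using w, b (intercept included)
        y_pred = [sigmoid(np.dot(w, x) + b) for x in X_train]
        # compute the loss between predicted and actual values and store it in a list
        train_loss.append(logloss(y_train, y_pred))
        # predict the output for all data points in X_test using w, b (intercept included)
        y_pred_test = [sigmoid(np.dot(w, x) + b) for x in X_test]
        print(f"EPOCH: {epoch} Train Loss: {logloss(y_train, y_pred)} Test Loss: {logloss(y_test, y_pred_test)}")
        # compute the loss between predicted and actual values and store it in a list
        test_loss.append(logloss(y_test, y_pred_test))
        # you can also compare the previous and current loss; if the loss is not changing, stop and return w, b
    return w, b, train_loss, test_loss

alpha = 0.0001
eta0 = 0.0001
N = len(X_train)
epochs = 50
w, b, train_loss, test_loss = train(X_train, y_train, X_test, y_test, epochs, alpha, eta0)

print("Difference between custom w and Scikit-learn's clf.coef_:", w - clf.coef_)
print("Difference between custom intercept b and Scikit-learn's clf.intercept_:", b - clf.intercept_)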
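
Running the script prints the element-wise differences between the custom weights and intercept and those fitted by Scikit-learn.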
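
An element-wise difference can be condensed into a single number per quantity, which makes before-and-after comparisons easier. As a sketch, the snippet below applies this to the values printed in the question, that is, the results before the fix; the variable names are illustrative and not part of the original code.

```python
import numpy as np

# Values copied from the outputs printed in the question (before the fix).
sk_coef = np.array([[-0.42336692, 0.18547565, -0.14859036, 0.34144407, -0.2081867,
                     0.56016579, -0.45242483, -0.09408813, 0.2092732, 0.18084126,
                     0.19705191, 0.00421916, -0.0796037, 0.33852802, 0.02266721]])
sk_intercept = np.array([-0.8531383])

custom_w = np.array([-0.22281323, 0.10570237, -0.02506523, 0.16630429, -0.07033019,
                     0.27985805, -0.27348925, -0.04622113, 0.13212066, 0.05330409,
                     0.09926212, -0.00791336, -0.02920803, 0.1828124, 0.03442375])
custom_b = -0.8019981458384148

# Largest absolute gap across all 15 weights, and the gap in the intercept.
max_weight_gap = np.max(np.abs(custom_w - sk_coef))
intercept_gap = abs(custom_b - sk_intercept[0])

print(f"largest weight gap: {max_weight_gap:.4f}")
print(f"intercept gap:      {intercept_gap:.4f}")
```

For the question's pre-fix values this gives a largest weight gap of about 0.28 and an intercept gap of about 0.05. Substituting the w, b and clf attributes from a fresh run of the corrected script is a quick way to check the fix on your own machine.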