How to get k-fold splits for cross-validation from scratch in Python?

Tags: python, machine-learning, scikit-learn, logistic-regression, cross-validation

I think I have split my training data into 5 k-folds. Is there a way to label/identify each of the 5 splits, so that I can send each split to my algorithm and compute its own accuracy?

from sklearn.model_selection import KFold
kf = KFold(n_splits=5)
splits = kf.get_n_splits(X_train)
print(splits)
I also tried splitting the data manually and then running logistic regression on each part, but this prints an accuracy of nan%:

# Contiguous slices so no row is skipped at the fold boundaries
X_train1 = X[0:85]
Y_train1 = Y[0:85]
X_train2 = X[85:170]
Y_train2 = Y[85:170]
X_train3 = X[170:255]
Y_train3 = Y[170:255]
X_train4 = X[255:340]
Y_train4 = Y[255:340]
X_train5 = X[340:426]
Y_train5 = Y[340:426]
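The five manual slices above can also be produced with `np.array_split`, which divides the rows into contiguous, nearly equal folds with no gaps at the boundaries. A minimal sketch, assuming `X` and `Y` are NumPy arrays with 426 rows (the data here is a hypothetical stand-in):

```python
import numpy as np

# Hypothetical data standing in for the question's X and Y (426 rows).
X = np.arange(426 * 2).reshape(426, 2)
Y = np.arange(426) % 2

# array_split tolerates a length that is not divisible by 5 and
# returns 5 contiguous folds of nearly equal size.
X_folds = np.array_split(X, 5)
Y_folds = np.array_split(Y, 5)

# Every row is covered exactly once across the folds.
assert sum(len(f) for f in X_folds) == len(X)
print([len(f) for f in X_folds])  # fold sizes
```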

def Sigmoid(z):
    return 1/(1 + np.exp(-z))

def Hypothesis(theta, x):   
    return Sigmoid(x @ theta)

def Cost_Function(X,Y,theta,m):
    hi = Hypothesis(theta, X)
    _y = Y.reshape(-1, 1)
    J = 1/float(m) * np.sum(-_y * np.log(hi) - (1-_y) * np.log(1-hi))
    return J

# Despite the name, this returns the scaled gradient of the cost,
# alpha/m * X^T (h - y), which is used directly as the update step below.
def Cost_Function_Regularisation(X,Y,theta,m,alpha):
    hi = Hypothesis(theta,X)
    _y = Y.reshape(-1, 1)
    J = alpha/float(m) * X.T @ (hi - _y)
    return J

def Gradient_Descent(X,Y,theta,m,alpha):
    new_theta = theta - Cost_Function_Regularisation(X,Y,theta,m,alpha)
    return new_theta

def Accuracy(theta):
    length = len(X_test)
    prediction = (Hypothesis(theta, X_test) > 0.5)
    _y = Y_test.reshape(-1, 1)
    correct = prediction == _y
    my_accuracy = (np.sum(correct) / length)*100
    print('LR Accuracy CV: ', my_accuracy, "%")


def Logistic_Regression(X,Y,alpha,theta,num_iters):
    m = len(Y)
    for i in range(num_iters):
        new_theta = Gradient_Descent(X,Y,theta,m,alpha)
        theta = new_theta
        if i % 100 == 0:
            print('theta: ', theta)
            print('cost: ', Cost_Function(X,Y,theta,m))
    Accuracy(theta)

ep = .012 

initial_theta = np.random.rand(X_train.shape[1],1) * 2 * ep - ep
alpha = 0.5
iterations = 10000
Logistic_Regression(X_train1,Y_train1,alpha,initial_theta,iterations)
Logistic_Regression(X_train2,Y_train2,alpha,initial_theta,iterations)
Logistic_Regression(X_train3,Y_train3,alpha,initial_theta,iterations)
Logistic_Regression(X_train4,Y_train4,alpha,initial_theta,iterations)
Logistic_Regression(X_train5,Y_train5,alpha,initial_theta,iterations)

`get_n_splits` returns the number of splitting iterations configured for the splitter, not the splits themselves.

See the documentation here for an example:
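Since `get_n_splits` only returns the integer 5, the actual folds come from `KFold.split`, which yields `(train_index, test_index)` pairs; enumerating them labels each split so it can be sent to the algorithm separately. A minimal sketch, with hypothetical arrays standing in for the question's `X_train`/`Y_train`:

```python
import numpy as np
from sklearn.model_selection import KFold

# Hypothetical stand-in for the question's X_train / Y_train (426 rows).
X_train = np.arange(426 * 2).reshape(426, 2)
Y_train = np.arange(426) % 2

kf = KFold(n_splits=5)

# split() yields index arrays for each fold, so the enumeration index
# labels the split and the indices select its rows.
for fold, (train_idx, test_idx) in enumerate(kf.split(X_train), start=1):
    X_tr, X_te = X_train[train_idx], X_train[test_idx]
    Y_tr, Y_te = Y_train[train_idx], Y_train[test_idx]
    print(f"fold {fold}: train={len(train_idx)} test={len(test_idx)}")
```

Each `(X_tr, Y_tr, X_te, Y_te)` tuple can then be passed to `Logistic_Regression` in place of the manual `X_train1`…`X_train5` slices, with the held-out part of the fold used for that fold's accuracy.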