Python: How do I take a text file and split it into data usable by a machine learning classifier?

python numpy machine-learning

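For this exercise I am supposed to use only numpy, so I can't just use scikit-learn.

I have loaded the dataset and managed to split it into a positive array and a negative array. However, I don't know what to do next, or even whether what I have done so far is the right way to prepare the data for a classifier.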

datasettrain = np.loadtxt("Adaboost-trainer.txt")

negtrain, postrain = np.delete(datasettrain[datasettrain[:,2] < 0],2,1), np.delete(datasettrain[datasettrain[:,2] > 0],2,1)

clf = Adaboost(n_clf=5)
clf.fit(postrain, negtrain)

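If I'm interpreting your sample data correctly, the first two columns are your feature columns and the last column is your target value. If that's right, then to get training and test sets you would do the following: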

import numpy as np


data = np.loadtxt("Adaboost-trainer.txt")

# Determine your training/test split. I opted for 80/20
test_size = 0.2
split_index = int(data.shape[0] * test_size)

# Shuffle the row indices, then take the first 20% as test and the remaining 80% as train
indices = np.random.permutation(data.shape[0])
test_idx = indices[:split_index]
train_idx = indices[split_index:]
test = data[test_idx,:]
train = data[train_idx,:]

# Split the X and y for use in models
y_train = train[:,-1]
X_train = np.delete(train, 2, axis=1)
y_test = test[:,-1]
X_test = np.delete(test, 2, axis=1)

From there, you have an 80/20 train/test split of the data to use with your model.
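If you want the split to be reproducible, the same steps can be wrapped in a small helper. This is only a sketch under a few assumptions: the function name `train_test_split`, the `seed` parameter, and the synthetic array standing in for `Adaboost-trainer.txt` are all made up for illustration, and the last column is assumed to hold the -1/+1 label.

```python
import numpy as np


def train_test_split(data, test_size=0.2, seed=0):
    """Shuffle rows and return (X_train, y_train, X_test, y_test).

    Assumes the last column is the label and every other column is a feature.
    """
    rng = np.random.default_rng(seed)
    indices = rng.permutation(data.shape[0])
    n_test = int(data.shape[0] * test_size)
    test = data[indices[:n_test]]
    train = data[indices[n_test:]]
    return train[:, :-1], train[:, -1], test[:, :-1], test[:, -1]


# Hypothetical stand-in for the real file: two features plus a -1/+1 label.
rng = np.random.default_rng(42)
features = rng.normal(size=(100, 2))
labels = np.where(features[:, 0] + features[:, 1] > 0, 1.0, -1.0)
data = np.column_stack([features, labels])

X_train, y_train, X_test, y_test = train_test_split(data)
print(X_train.shape, y_train.shape, X_test.shape, y_test.shape)
```

With your real data you would pass the array returned by `np.loadtxt("Adaboost-trainer.txt")` instead of the synthetic one. Keeping the features and labels as `X` and `y`, rather than as separate positive and negative arrays, is the layout most classifier `fit` methods expect, though your own `Adaboost` class may differ.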

Can you provide a sample of the data in Adaboost-trainer.txt? What are you trying to train the model to classify, i.e. what is your target?

I'm trying to set up a boundary using weak classifiers with adaptive boosting. I have provided a sample of the data I have.