Python scikit-learn train_test_split return values
I am using scikit-learn's train_test_split function. It returns all of the arrays as expected except X_test, which contains strange values that do not appear to come from the dataset. Here is the code:
import numpy as np
from sklearn.model_selection import train_test_split

heart_data = np.genfromtxt("cleveland.txt", delimiter=",")
hd_y = heart_data[:, 13]
X_train, X_test, y_train, y_test = train_test_split(heart_data, hd_y, test_size=0.2, random_state=42)
print(X_train, y_train)
print(X_test, y_test)
It returns the output shown at the end of this post (arrays shortened because they are too long).
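By "strange values that do not come from the dataset", do you mean the scientific notation in X_test? I think you may want to explicitly convert all of the data to float32 before calling train_test_split, with heart_data = np.genfromtxt("cleveland.txt", delimiter=",").astype(np.float32).

Thanks for the idea, but it still produces the same output.

What do you mean by strange values? For example, 5.30e+01 is 53. What is strange about that?

Completely unrelated to your question, but you left the y vector (index 13) in X, so be careful about leakage.

Thanks for the comments, everyone. I forgot to mention that I have solved the problem, and my neural network now runs smoothly.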
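
The leakage point raised in the comments can be sketched as follows. This is only an illustration under stated assumptions: the array below is a small random stand-in for the contents of cleveland.txt (10 rows, 14 columns, target in column 13), not the real data, and the names X and y are introduced here for clarity.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical stand-in for heart_data: 10 rows, 14 columns,
# with the target assumed to sit in column 13 as in the question.
rng = np.random.default_rng(0)
heart_data = rng.integers(0, 5, size=(10, 14)).astype(float)

X = heart_data[:, :13]  # features only: columns 0 through 12
y = heart_data[:, 13]   # target: column 13

# Splitting X (not the full array) keeps the target out of the features.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

print(X_train.shape, X_test.shape)
```

With the target column removed from the features, a model trained on X_train cannot simply read the label from its input.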
#X_train
[[58. 1. 3. ... 0. 7. 0.]
[50. 0. 3. ... 0. 3. 0.]
[70. 1. 4. ... 0. 7. 4.]
...
[61. 1. 4. ... 1. 7. 2.]
[35. 1. 2. ... 0. 3. 0.]
[49. 1. 3. ... 3. 7. 3.]]
#y_train
[0. 0. 4. 2. 1. 3. 0. 0. 0. 0. 0. 2. 3. 1. 1. 0. 0. 2. 0. 0. 0. 0. 2. 4.
0. 0. 0. 1. 1. 0. 0. 2. 1. 2. 1. 3. 2. 0. 1. 0. 1. 2. 2. 2. 1. 3. 0. 1.
0. 1. 0. 0. 1. 1. 1. 0. 0. 3. 2. 0. 3.]
#X_test
[[5.30e+01 1.00e+00 4.00e+00 1.40e+02 2.03e+02 1.00e+00 2.00e+00 1.55e+02
1.00e+00 3.10e+00 3.00e+00 0.00e+00 7.00e+00 1.00e+00]
[4.00e+01 1.00e+00 4.00e+00 1.52e+02 2.23e+02 0.00e+00 0.00e+00 1.81e+02
0.00e+00 0.00e+00 1.00e+00 0.00e+00 7.00e+00 1.00e+00]
[5.10e+01 0.00e+00 4.00e+00 1.30e+02 3.05e+02 0.00e+00 0.00e+00 1.42e+02
1.00e+00 1.20e+00 2.00e+00 0.00e+00 7.00e+00 2.00e+00]]
#y_test
[1. 1. 0. 0. 0. 3. 0. 1. 1. 0. 1. 0. 0. 0. 0. 0. 3. 2. 4. 3. 0. 1. 0. 0.
0. 1. 1. 0. 2. 0. 0. 0. 3. 0. 0. 3. 0. 4. 1. 0. 0. 0. 1. 4. 3. 2. 0. 0.
0. 1. 1. 0. 0. 3. 3. 0. 0. 2.]
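
The X_test values printed above are ordinary dataset values shown in scientific notation (5.30e+01 is 53). NumPy picks the notation for each printed array from the range of the values being displayed, so two arrays from the same data can look different. Here is a minimal sketch of that behaviour, assuming a made-up three-element row (not taken from cleveland.txt) whose largest and smallest values differ by more than a factor of 1000:

```python
import numpy as np

# Hypothetical row mixing small and large magnitudes.
row = np.array([53.0, 0.1, 564.0])

# Default formatting: NumPy may switch to scientific notation
# when the values span a wide range.
default_text = np.array2string(row)

# suppress_small=True asks for fixed-point notation instead.
plain_text = np.array2string(row, suppress_small=True)

print(default_text)
print(plain_text)

# The stored numbers are identical either way; only the display differs.
print(row[0] == 53)
```

To change the display for every later print call, np.set_printoptions(suppress=True) can be used instead of formatting each array individually.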