Python ValueError: non-broadcastable output operand with shape (124,1) doesn't match the broadcast shape (124,13)
Tags: python, python-2.7, numpy, scikit-learn

I want to normalize my training and test datasets with MinMaxScaler from sklearn.preprocessing. However, the package does not seem to accept my test dataset.
import pandas as pd
import numpy as np
# Read in data.
df_wine = pd.read_csv('https://archive.ics.uci.edu/ml/machine-learning-databases/wine/wine.data',
                      header=None)
df_wine.columns = ['Class label', 'Alcohol', 'Malic acid', 'Ash',
                   'Alcalinity of ash', 'Magnesium', 'Total phenols',
                   'Flavanoids', 'Nonflavanoid phenols', 'Proanthocyanins',
                   'Color intensity', 'Hue', 'OD280/OD315 of diluted wines',
                   'Proline']
# Split into train/test data.
from sklearn.model_selection import train_test_split
X = df_wine.iloc[:, 1:].values
y = df_wine.iloc[:, 0].values
X_train, y_train, X_test, y_test = train_test_split(X, y, test_size=0.3,
                                                    random_state=0)
# Normalize features using min-max scaling.
from sklearn.preprocessing import MinMaxScaler
mms = MinMaxScaler()
X_train_norm = mms.fit_transform(X_train)
X_test_norm = mms.transform(X_test)
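When I run this, I get a DeprecationWarning: Passing 1d arrays as data is deprecated in 0.17 and will raise ValueError in 0.19. Reshape your data either using X.reshape(-1, 1) if your data has a single feature or X.reshape(1, -1) if it contains a single sample.
along with ValueError: operands could not be broadcast together with shapes (124,) (13,) (124,)
Reshaping the data still yields an error: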
X_test_norm = mms.transform(X_test.reshape(-1, 1))
This reshape produces the error ValueError: non-broadcastable output operand with shape (124,1) doesn't match the broadcast shape (124,13)
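Any input on how to fix this error would be helpful.

The train/test partitions must be unpacked in the same order as the arrays passed to the function: train_test_split returns the train and test pieces of its first argument, then those of its second. With the order X_train, y_train, X_test, y_test, the name y_train (len(y_train) = 54) actually receives the test rows of X and the name X_test (len(X_test) = 124) receives the training labels. Their shapes are swapped, which causes the ValueError.
Instead, you must: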
# Split into train/test data.
# train_test_split returns its outputs in input order:
#   X -> (X_train, X_test), then y -> (y_train, y_test)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)
# (or) pass y first and unpack in the matching order:
# y_train, y_test, X_train, X_test = train_test_split(y, X, test_size=0.3, random_state=0)
# Normalize features using min-max scaling.
from sklearn.preprocessing import MinMaxScaler
mms = MinMaxScaler()
X_train_norm = mms.fit_transform(X_train)
X_test_norm = mms.transform(X_test)
This yields:
X_train_norm[0]
array([ 0.72043011, 0.20378151, 0.53763441, 0.30927835, 0.33695652,
0.54316547, 0.73700306, 0.25 , 0.40189873, 0.24068768,
0.48717949, 1. , 0.5854251 ])
X_test_norm[0]
array([ 0.72849462, 0.16386555, 0.47849462, 0.29896907, 0.52173913,
0.53956835, 0.74311927, 0.13461538, 0.37974684, 0.4364852 ,
0.32478632, 0.70695971, 0.60566802])
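To see the swap described in the answer without downloading the wine data, here is a minimal sketch that compares the shapes produced by the two unpacking orders. The random arrays are an assumption standing in for the real dataset (178 rows, 13 features), and the names good_shapes and bad_shapes are made up for illustration.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the wine data (assumption): 178 samples, 13 features.
rng = np.random.RandomState(0)
X = rng.rand(178, 13)
y = rng.randint(1, 4, size=178)

# Correct order: both pieces of X first, then both pieces of y.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)
good_shapes = (X_train.shape, X_test.shape, y_train.shape, y_test.shape)

# Order used in the question: the 2nd and 3rd names receive the wrong arrays.
X_train, y_train, X_test, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)
bad_shapes = (X_train.shape, X_test.shape, y_train.shape, y_test.shape)

print("correct order :", good_shapes)
print("question order:", bad_shapes)
```

With the question's order, X_test ends up as a 1-D array of 124 labels, which is what the scaler then rejects.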
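When you get a shape error, the first thing to do is display the shapes of all the arrays involved in your problem. In this case that means X_train and X_test, and possibly more.

So he trained on a 13-feature set and tested on a 1-feature set. That is the reason for the exception's error message. Shape errors are common in sklearn questions, but not the ones involving "non-broadcastable". If his dense layer didn't match his number of features, that would also lead to a non-broadcastable error.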
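The two error messages from the question can be reproduced with NumPy alone. The sketch below assumes that the scaler multiplies the input in place by one factor per feature; that is an assumption about its internals, not something stated in this thread. The array scale and the helper inplace_scale_error are hypothetical names used only for this illustration.

```python
import numpy as np

# Stand-in for the 13 per-feature factors held by a fitted scaler (assumption).
scale = np.ones(13)

def inplace_scale_error(data):
    """Return the ValueError message raised by scaling in place, or None."""
    data = np.array(data, dtype=float)  # work on a copy of the input
    try:
        data *= scale  # the result must fit back into data's own shape
    except ValueError as exc:
        return str(exc)
    return None

msg_1d = inplace_scale_error(np.ones(124))        # 1-D array, like the swapped X_test
msg_col = inplace_scale_error(np.ones((124, 1)))  # the same data after reshape(-1, 1)
msg_ok = inplace_scale_error(np.ones((54, 13)))   # a proper 13-feature matrix

print(msg_1d)
print(msg_col)
print(msg_ok)
```

The reshaped column broadcasts against the 13 factors to shape (124, 13), but the result cannot be written back into a (124, 1) array, hence the "non-broadcastable output operand" wording. Only the 13-column input passes.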