Python: "Found input variables with inconsistent numbers of samples" error
Tags: python, scikit-learn
I am trying to train a model, but when I fit it I get the following error:
ValueError: Found input variables with inconsistent numbers of samples: [1, 3608]
Here is my code:
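import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline

data = pd.read_csv("/Users/amanpuranik/Desktop/fake-news-detection/data.csv")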
data = data[['Headline', "Label"]]
print(data)
x = data[["Headline"]]
y = data[["Label"]]
print(x.shape)
print(y.shape)
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.1, random_state=1)
print(x_train.shape)
tfidf_vectorizer=TfidfVectorizer(stop_words='english', max_df=1)
model = MultinomialNB()
#model.fit(x_train,y_train) #this part gives me a string to float error
pipeline = Pipeline([('vectorizer', tfidf_vectorizer), ('classifier', model)])
pipeline.fit(x_train, y_train)
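I'm not sure how to get past this error.

I think you have one feature and 3608 records, but the code sees only a single sample. TfidfVectorizer expects a 1-D iterable of strings. Iterating over a DataFrame such as data[["Headline"]] yields its column names rather than its rows, so the vectorizer receives one document while y still has 3608 labels. Change the code that defines x and y as follows, so that both are 1-D:
x = data["Headline"].values
y = data["Label"].values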
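
To make the cause concrete, here is a minimal sketch that reproduces the sample-count mismatch and then fits the pipeline on 1-D input. The small DataFrame is a made-up stand-in for data.csv (only the column names Headline and Label come from the question), and the variable names n_docs_from_frame and n_docs_from_series are purely illustrative.

```python
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline

# Hypothetical stand-in for data.csv: same column names, made-up rows.
data = pd.DataFrame({
    "Headline": [
        "Scientists discover water on distant planet",
        "Celebrity secretly replaced by robot clone",
        "Local council approves new library budget",
        "Aliens endorse candidate in shocking twist",
    ],
    "Label": [0, 1, 0, 1],
})

# Iterating a DataFrame yields its column names, so the vectorizer
# sees a single "document" (the string "Headline").
n_docs_from_frame = TfidfVectorizer().fit_transform(data[["Headline"]]).shape[0]

# A 1-D Series yields one string per row, which is what the vectorizer expects.
x = data["Headline"]
y = data["Label"]
n_docs_from_series = TfidfVectorizer().fit_transform(x).shape[0]

# With 1-D text input, the vectorizer and classifier agree on the sample count.
pipeline = Pipeline([
    ("vectorizer", TfidfVectorizer(stop_words="english")),
    ("classifier", MultinomialNB()),
])
pipeline.fit(x, y)
predictions = pipeline.predict(x)

print(n_docs_from_frame, n_docs_from_series, len(predictions))
```

The sketch leaves max_df at its default. In the question's code, max_df=1 is an integer, which scikit-learn interprets as an absolute document count, so every term that appears in more than one headline is dropped. If a proportion was intended, a float such as max_df=1.0 is probably what is wanted.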