Python SKLearn Pipeline with ColumnTransformer: 'numpy.ndarray' object has no attribute 'lower'

python scikit-learn

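I am trying to build a pipeline with SKLearn 0.20.2 using the new ColumnTransformer feature. My problem is that I keep getting this error:

AttributeError: 'numpy.ndarray' object has no attribute 'lower'

I have one column of text, named text. All of my other columns are numeric. I am trying to use CountVectorizer in my pipeline, and I think that is where the problem lies. Any help would be greatly appreciated.
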
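from sklearn.compose import ColumnTransformer
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.impute import SimpleImputer
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline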

# mapped to column names from dataframe
numeric_features = ['hasDate', 'iterationCount', 'hasItemNumber', 'isEpic']
numeric_transformer = Pipeline(steps=[
    ('imputer', SimpleImputer(strategy='median'))
])

# mapped to column names from dataframe
text_features = ['text']
text_transformer = Pipeline(steps=[
    ('imputer', SimpleImputer(strategy='most_frequent')),
    ('vect', CountVectorizer())
])

preprocessor = ColumnTransformer(
    transformers=[('num', numeric_transformer, numeric_features),('text', text_transformer, text_features)]
)

clf = Pipeline(steps=[('preprocessor', preprocessor),
                      ('classifier', MultinomialNB())
                     ])

x_train, x_test, y_train, y_test = train_test_split(features, labels, test_size=0.33)
clf.fit(x_train,y_train)

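@SergeyBushmanov helped me diagnose the error in the title: it was caused by running SimpleImputer on the text column. I have one more error, which I will write up as a new question.

SimpleImputer does not work on text. Try text_transformer = Pipeline([('vect', CountVectorizer())]) and see what happens.

Thanks @SergeyBushmanov! Now I have a new error: ValueError: all the input array dimensions except for the concatenation axis must match exactly. I will update my code snippet to remove the imputer.

@SergeyBushmanov Actually, I went ahead and put the problematic code back and left the answer in your comment, since it did fix the initial error.

As for your latest error: can you trace which step, preprocessor or clf, produces it, for example by calling the fit_transform method?

@SergeyBushmanov I am marking this question as answered, since you did give me the solution to the error. I started a new question here detailing the new error: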
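
For readers hitting the same two errors, here is a minimal sketch of one arrangement that may avoid both. It is not taken from the thread or from the follow-up question. The toy dataframe, its values, and the choice to fill missing text with an empty string are assumptions made for illustration. The sketch relies on two behaviours: CountVectorizer expects a 1-D sequence of strings, and ColumnTransformer passes a 1-D Series when the column is given as a scalar name ('text') but a 2-D frame when it is given as a list (['text']).

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.impute import SimpleImputer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline

# Hypothetical toy data standing in for the asker's dataframe.
features = pd.DataFrame({
    'hasDate': [1, 0, 1, 0],
    'iterationCount': [3.0, None, 1.0, 2.0],
    'text': ['fix login bug', 'add epic for search', None, 'update login page'],
})
labels = [0, 1, 0, 1]

# Assumption: missing text is replaced up front with an empty string,
# instead of running SimpleImputer ahead of CountVectorizer.
features['text'] = features['text'].fillna('')

# Diagnostic: iterating a DataFrame yields column labels, so the vectorizer
# sees one "document" per column rather than one per row.
by_frame = CountVectorizer().fit_transform(features[['text']])
by_series = CountVectorizer().fit_transform(features['text'])

preprocessor = ColumnTransformer(transformers=[
    ('num', SimpleImputer(strategy='median'), ['hasDate', 'iterationCount']),
    # Scalar column name: CountVectorizer receives a 1-D Series of strings.
    ('text', CountVectorizer(), 'text'),
])

clf = Pipeline(steps=[('preprocessor', preprocessor),
                      ('classifier', MultinomialNB())])
clf.fit(features, labels)
predictions = clf.predict(features)
```

Comparing by_frame.shape with by_series.shape is a quick way to see where a row-count mismatch could come from when the text column is selected with a list. Calling fit_transform on the preprocessor alone, as suggested in the comments, narrows the failure down in the same way.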