How do I correctly apply a label encoder in a scikit-learn pipeline?

Tags: python, machine-learning, scikit-learn, pipeline, encoder

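I have written a pipeline to process my data, but my program raises the following error:

AttributeError: 'numpy.ndarray' object has no attribute 'fit'

I created a new class because I first tried to use LabelEncoder directly in the pipeline, and that gave me a different error.

My class looks like this: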

class myLabelEncoder(BaseEstimator, TransformerMixin):
    
    def __init__(self):
        self.encoder = LabelEncoder()
        self.X = None
    def fit(self,X,y = None):
        self.X = X
        return self.encoder.fit(X)
    
    def transform(self, X, y = None):
        return self.encoder.transform(X)
Here is the pipeline:

#other transformers...

label_transformer = Pipeline(steps = [('imputer',SimpleImputer(missing_values = np.nan, 
strategy = 'most_frequent')),('label',myLabelEncoder)])


pipeline =ColumnTransformer(
        transformers = [('cat', categorical_transformer, cat_features),
                        ('num', numeric_transformer, num_features),  
                        ('ord', ordinal_transformer, ord_features),
                       ('lab', label_transformer, label_features)]) 

logistic = LogisticRegression()

pca = PCA()

clf = Pipeline(steps = [('preprocessor', pipeline),
                        ('pca', pca),
                        ('classifier',logistic)])

#creating parameters for tuning pca and logistic regression

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)


clf.fit(X_train, y_train)
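I think the problem is in the class code, but I could not find a solution, even though I searched for similar posts about this error.

Here are sample rows from the dataset:

2 sample rows of X =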

 {'hotel': {54261: 'City Hotel', 77908: 'City Hotel'},
 'meal': {54261: 'SC', 77908: 'BB'},
 'lead_time': {54261: 49, 77908: 9},
 'adr': {54261: 93.15, 77908: 155.08},
 'stays_in_weekend_nights': {54261: 1, 77908: 2},
 'stays_in_week_nights': {54261: 0, 77908: 1},
 'adults': {54261: 2, 77908: 2},
 'children': {54261: 0.0, 77908: 0.0},
 'previous_cancellations': {54261: 0, 77908: 0},
 'previous_bookings_not_canceled': {54261: 0, 77908: 0},
 'total_of_special_requests': {54261: 0, 77908: 1},
 'customer_type': {54261: 'Transient', 77908: 'Transient'},
 'market_segment': {54261: 'Online TA', 77908: 'Online TA'},
 'distribution_channel': {54261: 'TA/TO', 77908: 'TA/TO'},
 'reserved_room_type': {54261: 'A', 77908: 'A'},
 'assigned_room_type': {54261: 'A', 77908: 'A'},
 'arrival_date_month': {54261: 'July', 77908: 'September'}}
5 sample rows of y:

{54261: 1, 77908: 0, 38772: 0, 100452: 0, 2318: 1}

Thank you for your time.

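You forgot to instantiate the label encoder class in label_transformer: replace myLabelEncoder with myLabelEncoder().

Because the pipeline was handed the class itself rather than an instance, the data array most likely ended up bound as self when scikit-learn called fit_transform on that step, which would explain the "'numpy.ndarray' object has no attribute 'fit'" message.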
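
To make the fix concrete, here is a minimal sketch of the label step with the encoder instantiated. It goes a little beyond the answer and rests on a few assumptions of mine: the class is renamed MyLabelEncoder, fit returns self (the original returned the inner LabelEncoder), the single input column is flattened because LabelEncoder expects 1-D input, and the output is reshaped to 2-D on the assumption that it will be used inside a ColumnTransformer. X_demo is made-up data for illustration only.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import LabelEncoder


class MyLabelEncoder(BaseEstimator, TransformerMixin):
    """Wraps LabelEncoder so it can act on a single feature column."""

    def fit(self, X, y=None):
        # LabelEncoder expects 1-D input, so flatten the single column.
        self.encoder_ = LabelEncoder().fit(np.asarray(X).ravel())
        return self  # fit should return the estimator itself

    def transform(self, X, y=None):
        encoded = self.encoder_.transform(np.asarray(X).ravel())
        # Assumption: return a 2-D column so the result can be stacked
        # with the other transformer outputs.
        return encoded.reshape(-1, 1)


label_transformer = Pipeline(steps=[
    ("imputer", SimpleImputer(missing_values=np.nan, strategy="most_frequent")),
    ("label", MyLabelEncoder()),  # an instance, not the bare class
])

# Hypothetical one-column input with a missing value.
X_demo = np.array([["July"], ["September"], [np.nan], ["July"]], dtype=object)
encoded = label_transformer.fit_transform(X_demo)
```

If the column is an input feature rather than the target, scikit-learn's OrdinalEncoder is a built-in alternative that already accepts 2-D input, so the wrapper may not be needed at all.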