scikit-learn: fit() takes 2 positional arguments but 3 were given in FeatureUnion


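I am using sklearn in Python to fit a model and to transform the input data with the fitted model.

I use FeatureUnion to combine CountVectorizer and TfidfEmbeddingVectorizer.

Using only CountVectorizer or only TfidfEmbeddingVectorizer works fine, but combining the two with FeatureUnion raises the following error: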

TypeError: fit() takes 2 positional arguments but 3 were given

The TfidfEmbeddingVectorizer class is as follows:

from collections import defaultdict
from sklearn.feature_extraction.text import TfidfVectorizer

class TfidfEmbeddingVectorizer(object):
    ...
    def fit(self, X):
        # documents are already tokenized, so the analyzer passes them through
        tfidf = TfidfVectorizer(analyzer=lambda x: x)
        tfidf.fit(X)
        # if a word was never seen - it must be at least as infrequent
        # as any of the known words - so the default idf is the max of
        # known idf's
        max_idf = max(tfidf.idf_)
        self.word2weight = defaultdict(
            lambda: max_idf,
            [(w, tfidf.idf_[i]) for w, i in tfidf.vocabulary_.items()])

        return self

The FeatureUnion I use is as follows:

import gensim
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.pipeline import FeatureUnion

model = gensim.models.Word2Vec(speech.train_data, size=100)
w2v = dict(zip(model.wv.index2word, model.wv.syn0))

count = CountVectorizer(tokenizer=lambda doc: doc, lowercase=False)
w2v_tfidf = TfidfEmbeddingVectorizer(w2v)
feature_union = FeatureUnion([('ngram', count),
                              ('tfidf', w2v_tfidf)])
feature_union.fit(speech.train_data)

I have seen a suggestion that downgrading sklearn to 0.18.0 makes this work, but I cannot downgrade sklearn because of this error:

error: Microsoft Visual C++ 14.0 is required. Get it with "Microsoft Visual C++ Build Tools": http://landinghub.visualstudio.com/visual-cpp-build-tools

Is there any other way to make fit work with FeatureUnion?

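FeatureUnion's fit() method takes X and y as input, as its documentation shows:

fit(X, y=None)

Even though y defaults to None, it is still passed on to the internal transformers. The parameter exists for compatibility when FeatureUnion is used inside a Pipeline.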
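To illustrate why a forwarded y matters, here is a minimal pure-Python sketch. It only mimics the forwarding behaviour described above and is not scikit-learn's actual implementation; `NoYTransformer`, `WithYTransformer`, and `fit_all` are hypothetical names made up for this illustration.

```python
class NoYTransformer:
    """Hypothetical transformer whose fit() lacks the y parameter."""

    def fit(self, X):
        return self


class WithYTransformer:
    """Hypothetical transformer whose fit() accepts (and ignores) y."""

    def fit(self, X, y=None):
        return self


def fit_all(transformers, X, y=None):
    # Mimics a union forwarding y to every transformer, even when y is None.
    for transformer in transformers:
        transformer.fit(X, y)


docs = [["some", "tokens"]]

fit_all([WithYTransformer()], docs)  # works: the forwarded y=None is accepted

try:
    fit_all([NoYTransformer()], docs)
except TypeError as exc:
    # self, X and y make three positional arguments, but fit(self, X) takes two
    error_message = str(exc)
    print(error_message)
```

The call fails even though y is None, because the argument is still passed positionally.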

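Now for the internal transformers. CountVectorizer's fit() method has the signature:

fit(raw_documents, y=None)

As you can see, it also accepts y for the same reason, even though it never uses it.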
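One way to check this for yourself is to inspect the fit() signature of the built-in vectorizers. This is a small sketch assuming scikit-learn is installed; `fit_accepts_y` is a hypothetical helper name, not part of the library.

```python
import inspect

from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer


def fit_accepts_y(transformer_cls):
    # True if the class's fit() declares a parameter named y
    return "y" in inspect.signature(transformer_cls.fit).parameters


for cls in (CountVectorizer, TfidfVectorizer):
    print(cls.__name__, "fit accepts y:", fit_accepts_y(cls))
```

The same helper can be pointed at a custom transformer class to spot a missing y before putting it into a FeatureUnion.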

Your custom TfidfEmbeddingVectorizer's fit() does not have that extra y parameter.

FeatureUnion, however, will still try to pass y (with its None value) to it, which causes the error. Just change fit to:

def fit(self, X, y=None):
    ....
    ....
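For a complete picture, here is a small self-contained sketch of a custom transformer that works inside FeatureUnion once fit() accepts y. `DocLengthTransformer` is a hypothetical stand-in for TfidfEmbeddingVectorizer (it just counts tokens, so no gensim model is needed), and the two toy documents are made up for illustration.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.pipeline import FeatureUnion


class DocLengthTransformer(TransformerMixin, BaseEstimator):
    """Hypothetical custom transformer: one feature per document (token count)."""

    def fit(self, X, y=None):
        # y is accepted and ignored, because FeatureUnion passes it along
        return self

    def transform(self, X):
        return np.array([[len(doc)] for doc in X], dtype=float)


# Toy pre-tokenized documents (assumption: the real data is a list of token lists)
docs = [["the", "cat", "sat"], ["a", "dog"]]

union = FeatureUnion([
    ("ngram", CountVectorizer(analyzer=lambda doc: doc)),  # tokens used as-is
    ("length", DocLengthTransformer()),
])

union.fit(docs)  # no TypeError, since every fit() accepts y
features = union.transform(docs)
print(features.shape)
```

Deriving from TransformerMixin also gives the class a fit_transform() method for free, which is convenient when the union is later placed in a Pipeline.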