
Python Spark Word2Vec.save() Method Error


I am trying to use Spark ML's Word2Vec. However, when I try to save the fitted model object using either

model.save() or model.write().save(), it throws the following error:

---> 69 w2vdf,w2vmodel=word2Vec(sdf_cleaned)

<ipython-input-244-e0067f9959c6> in word2Vec(df)
     60 
     61     # Saving the model
---> 62     w2vmodel.write().save()
     63 
     64     return w2vdf,w2vmodel

AttributeError: 'Word2VecModel' object has no attribute 'write'

My second question: is it possible to get a vector representation for each document, rather than for individual words? Does Spark implement anything like Doc2Vec?
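As an aside on the Doc2Vec question: Spark ML's `Word2VecModel.transform` already produces one vector per row by averaging the vectors of the words in that row, which is a common rough substitute for Doc2Vec. A minimal sketch of that averaging, with a hypothetical two-word toy vocabulary:

```python
import numpy as np

# Hypothetical toy word vectors; in Spark these come from the fitted model.
word_vectors = {
    "big": np.array([1.0, 0.0]),
    "data": np.array([0.0, 1.0]),
}

def doc_vector(tokens, vectors):
    """Average the vectors of known tokens, as Spark's transform does."""
    known = [vectors[t] for t in tokens if t in vectors]
    if not known:
        # No known word: return the zero vector of the right dimension.
        return np.zeros(next(iter(vectors.values())).shape)
    return np.mean(known, axis=0)

print(doc_vector(["big", "data"], word_vectors))  # [0.5 0.5]
```

So the `outputCol` of the transformed data frame is already a per-document vector, not a per-word one.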

Comments:

- Just do `w2vmodel.save(sc, 'filename')` and it should work.
- I did. Same error. Wait, I need a context? OK, let me try. OK, that doesn't work either; same error. The Spark docs also don't say we need to pass a SparkContext. I saw that while looking at previous open issues on GitHub, but maybe I made a mistake.
- This does seem to be a known issue, however, and loading the model also appears to be problematic. Is it not possible to pickle this model for later use? Also, what version of Spark are you using?
- It's a Spark model. Can I pickle it?
def word2Vec(df):
    """Tokenise the text column of `df`, fit a Word2Vec model, and return
    the transformed data frame together with the fitted model."""

    from pyspark.ml.feature import Tokenizer, Word2Vec

    # Tokenization: split each text document into words
    tokenizer = Tokenizer(inputCol="desc", outputCol="tokenised_text")
    tokensDf = tokenizer.transform(df)

    # Fit the Word2Vec model and add the document vectors as a new column
    word2Vec = Word2Vec(vectorSize=100, seed=42, inputCol="tokenised_text", outputCol="model")
    w2vmodel = word2Vec.fit(tokensDf)
    w2vdf = w2vmodel.transform(tokensDf)

    # Saving the model: save() needs an explicit path.
    # This is the line that raises AttributeError on Spark < 2.0,
    # where the ml-side Word2VecModel has no write() method.
    w2vmodel.write().save("w2v_model")

    return w2vdf, w2vmodel