Python kmeans.cluster() gives an error: TypeError: 'float' object is not iterable, when using word embeddings (word2vec) on sentences


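I am trying to use k-means to cluster sentences, but I am not getting the right input type for `cluster()`. I have tried passing the list `Y` built by the word-embedding `sent_vectorizer` function, and I have also tried a `DataFrame` version of `Y`.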

import numpy as np
import pandas as pd
import nltk
from nltk.cluster import KMeansClusterer

def sent_vectorizer(sent, model): # creates a vector for each tokenized sentence
    sent_vec =[]
    numw = 0
    for w in sent:
        try:
            if numw == 0:
                sent_vec = model[w]
            else:
                sent_vec = np.add(sent_vec, model[w]) #adds vectors of all words in a sentence over iterations
            numw += 1 # counts the words of this sentence that were found in the model
        except:
            pass

    return np.asarray(sent_vec) / numw

Y=[]
for sentence in all_words:
    Y.append(sent_vectorizer(sentence, model))   

print ("========================")
print (Y)

df_Y = pd.DataFrame(Y) 

NUM_CLUSTERS=3
kclusterer = KMeansClusterer(NUM_CLUSTERS, distance=nltk.cluster.util.cosine_distance, repeats=25,avoid_empty_clusters=True)
assigned_clusters = kclusterer.cluster(df_Y, assign_clusters=True)
print (assigned_clusters)

`all_words` holds a list of tokenized sentences:

[[cloud], [technologies], [still], [building], [strong], [foundation], [game], [changers], [hyper], [Convergend], [technology], [sd], [wan], [Ping], [college], [college], [security], [plane], [Protection], [college], [data], [analytics], [customer], [experience], [improvements], [ar], [vr], ..., ['none', 'timeframe', 'longer', 'term', 'cloud', 'services', 'ai', 'technologies', 'game', 'changer'], ['cloud', 'finance', 'integrated', 'ship', 'management'], ['MicroService', 'based', 'api', 'platform', 'omni', 'Channel'], ['moving', 'erp', 'cloud'], ['online', 'learning', 'open', 'Education', 'resources'], ['online', 'programs', 'meet', 'needs', 'today', 'learners'], ['automating', 'existing', 'processes']]

Error traceback:


    ---------------------------------------------------------------------------
    TypeError                                 Traceback (most recent call last)
    <ipython-input-108-68f5fe386b54> in <module>
          1 NUM_CLUSTERS=3
          2 kclusterer = KMeansClusterer(NUM_CLUSTERS, distance=nltk.cluster.util.cosine_distance, repeats=25,avoid_empty_clusters=True)
    ----> 3 assigned_clusters = kclusterer.cluster(df_Y, assign_clusters=True)
          4 print (assigned_clusters)

    ~\AppData\Local\Continuum\anaconda3\lib\site-packages\nltk\cluster\util.py in cluster(self, vectors, assign_clusters, trace)
         60 
         61         # call abstract method to cluster the vectors
    ---> 62         self.cluster_vectorspace(vectors, trace)
         63 
         64         # assign the vectors to clusters

    ~\AppData\Local\Continuum\anaconda3\lib\site-packages\nltk\cluster\kmeans.py in cluster_vectorspace(self, vectors, trace)
         99             # effect the distance comparison)
        100             for means in meanss:
    --> 101                 means.sort(key=sum)
        102 
        103             # find the set of means that's minimally different from the others

    TypeError: 'float' object is not iterable

I also tried the following code and got an error there as well:
    ```
    kmeans = cluster.KMeans(n_clusters=NUM_CLUSTERS)
    kmeans.fit(Y)
    ```
The error in this case is:
    ```
    ---------------------------------------------------------------------------
    ValueError                                Traceback (most recent call last)
    <ipython-input-48-229383cd99be> in <module>
          1 kmeans = cluster.KMeans(n_clusters=NUM_CLUSTERS)
    ----> 2 kmeans.fit(Y)
          3 
          4 labels = kmeans.labels_
          5 centroids = kmeans.cluster_centers_

    ~\AppData\Local\Continuum\anaconda3\lib\site-packages\sklearn\cluster\k_means_.py in fit(self, X, y, sample_weight)
        969                 tol=self.tol, random_state=random_state, copy_x=self.copy_x,
        970                 n_jobs=self.n_jobs, algorithm=self.algorithm,
    --> 971                 return_n_iter=True)
        972         return self
        973 

    ~\AppData\Local\Continuum\anaconda3\lib\site-packages\sklearn\cluster\k_means_.py in k_means(X, n_clusters, sample_weight, init, precompute_distances, n_init, max_iter, verbose, tol, random_state, copy_x, n_jobs, algorithm, return_n_iter)
        309     order = "C" if copy_x else None
        310     X = check_array(X, accept_sparse='csr', dtype=[np.float64, np.float32],
    --> 311                     order=order, copy=copy_x)
        312     # verify that the number of samples given is larger than k
        313     if _num_samples(X) < n_clusters:

    ~\AppData\Local\Continuum\anaconda3\lib\site-packages\sklearn\utils\validation.py in check_array(array, accept_sparse, accept_large_sparse, dtype, order, copy, force_all_finite, ensure_2d, allow_nd, ensure_min_samples, ensure_min_features, warn_on_dtype, estimator)
        525             try:
        526                 warnings.simplefilter('error', ComplexWarning)
    --> 527                 array = np.asarray(array, dtype=dtype, order=order)
        528             except ComplexWarning:
        529                 raise ValueError("Complex data not supported\n"

    ~\AppData\Local\Continuum\anaconda3\lib\site-packages\numpy\core\numeric.py in asarray(a, dtype, order)
        536 
        537     """
    --> 538     return array(a, dtype, copy=False, order=order)
        539 
        540 

    ValueError: setting an array element with a sequence.
    ```
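
One possible contributing cause, offered as an assumption rather than a confirmed diagnosis: when none of a sentence's words are found in the model, `sent_vectorizer` returns an empty array, so `Y` can end up with rows of different lengths. NumPy cannot stack such a ragged list into a 2-D float array, and that is a typical situation in which "setting an array element with a sequence" is raised. The sketch below uses a plain dict named `toy_model` as a hypothetical stand-in for the word2vec model, and a hypothetical helper `mean_vector` that returns `None` for sentences with no known words so that they can be filtered out before clustering.

```python
import numpy as np

# Hypothetical stand-in for a trained word2vec model: word -> 3-d vector.
toy_model = {
    "cloud": np.array([1.0, 0.0, 0.0]),
    "erp": np.array([0.0, 1.0, 0.0]),
    "learning": np.array([0.0, 0.0, 1.0]),
}


def mean_vector(tokens, model):
    """Average the vectors of the tokens that are present in `model`.

    Returns None when no token is in the vocabulary, so the caller can
    skip the sentence instead of storing an empty array.
    """
    vectors = [model[w] for w in tokens if w in model]
    if not vectors:
        return None
    return np.mean(vectors, axis=0)


sentences = [["moving", "erp", "cloud"], ["ar", "vr"], ["online", "learning"]]

# Keep only the sentences that produced a vector, and remember which ones.
kept_sentences, rows = [], []
for s in sentences:
    v = mean_vector(s, toy_model)
    if v is not None:
        kept_sentences.append(s)
        rows.append(v)

# All rows now have the same length, so they stack into a 2-D array.
X = np.vstack(rows)
print(X.shape)  # (number of kept sentences, embedding dimension)
```

With every row the same length, `X` is a regular 2-D float array. Whether this resolves the errors above depends on the actual model and data, which are not shown in the question.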



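Can you post the traceback of the error?

Check the type of `sent`, and use the edit option to add the full traceback to the question.

`type(sent)` is `str`. I have added the tracebacks for both code snippets.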
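
If `type(sent)` really is `str`, as the last comment reports, then `for w in sent` walks over single characters rather than words, so lookups against a whole-word vocabulary would rarely match. A minimal illustration, assuming a hypothetical `tokenize` helper based on `str.split` (the tokenizer actually used is not shown in the question):

```python
def tokenize(sentence):
    """Hypothetical tokenizer: lowercase the text and split on whitespace."""
    return sentence.lower().split()


sent = "moving erp cloud"

chars = [w for w in sent]   # iterating over a str yields single characters
tokens = tokenize(sent)     # iterating over the token list yields words

print(chars[:6])
print(tokens)
```

Printing `type(sentence)` for a few elements of `all_words` just before the vectorizing loop would show which of the two cases applies.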