Python kmeans.cluster() gives error "TypeError: 'float' object is not iterable" when using word embeddings (word2vec) on sentences

Tags: python, k-means, word2vec

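I am trying to cluster sentences with k-means, but I cannot get the right input type for `cluster()`.

I have tried passing the list `Y`, built from the word embeddings by the `sent_vectorizer` function, and I have also tried a `DataFrame` version of `Y`.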

import nltk
import numpy as np
import pandas as pd
from nltk.cluster import KMeansClusterer

def sent_vectorizer(sent, model):  # builds one averaged vector per tokenized sentence
    sent_vec =[]
    numw = 0
    for w in sent:
        try:
            if numw == 0:
                sent_vec = model[w]
            else:
                sent_vec = np.add(sent_vec, model[w])  # running sum of the word vectors in this sentence
            numw += 1  # counts the words of this sentence that were found in the model
        except:
            pass

    return np.asarray(sent_vec) / numw

Y=[]
for sentence in all_words:
    Y.append(sent_vectorizer(sentence, model))   

print ("========================")
print (Y)

df_Y = pd.DataFrame(Y) 

NUM_CLUSTERS=3
kclusterer = KMeansClusterer(NUM_CLUSTERS, distance=nltk.cluster.util.cosine_distance, repeats=25,avoid_empty_clusters=True)
assigned_clusters = kclusterer.cluster(df_Y, assign_clusters=True)
print (assigned_clusters)
`all_words` holds a list of tokenized sentences:

[['cloud', 'technologies', 'still', 'building', 'strong', 'foundation', 'game', 'changers', 'hyper', 'Convergend', 'technology', 'sd', 'wan', 'Ping', 'college', 'college', 'security', 'plane', 'Protection', 'college', 'data', 'analytics', 'customer', 'experience', 'improvements', 'ar', 'vr', ...], ['none', 'timeframe', 'longer', 'term', 'cloud', 'services', 'ai', 'technologies', 'game', 'changer'], ['cloud', 'finance', 'integrated', 'ship', 'management'], ['MicroService', 'based', 'api', 'platform', 'omni', 'Channel'], ['moving', 'erp', 'cloud'], ['online', 'learning', 'open', 'Education', 'resources'], ['online', 'programs', 'meet', 'needs', 'today', 'learners'], ['automating', 'existing', 'processes']]

Error traceback:


    ---------------------------------------------------------------------------
    TypeError                                 Traceback (most recent call last)
    <ipython-input-108-68f5fe386b54> in <module>
          1 NUM_CLUSTERS=3
          2 kclusterer = KMeansClusterer(NUM_CLUSTERS, distance=nltk.cluster.util.cosine_distance, repeats=25,avoid_empty_clusters=True)
    ----> 3 assigned_clusters = kclusterer.cluster(df_Y, assign_clusters=True)
          4 print (assigned_clusters)

    ~\AppData\Local\Continuum\anaconda3\lib\site-packages\nltk\cluster\util.py in cluster(self, vectors, assign_clusters, trace)
         60 
         61         # call abstract method to cluster the vectors
    ---> 62         self.cluster_vectorspace(vectors, trace)
         63 
         64         # assign the vectors to clusters

    ~\AppData\Local\Continuum\anaconda3\lib\site-packages\nltk\cluster\kmeans.py in cluster_vectorspace(self, vectors, trace)
         99             # effect the distance comparison)
        100             for means in meanss:
    --> 101                 means.sort(key=sum)
        102 
        103             # find the set of means that's minimally different from the others

    TypeError: 'float' object is not iterable

I also tried the following code and got an error there as well:

```
kmeans = cluster.KMeans(n_clusters=NUM_CLUSTERS)
kmeans.fit(Y)
```

The error in this case is:

    ---------------------------------------------------------------------------
    ValueError                                Traceback (most recent call last)
    <ipython-input-48-229383cd99be> in <module>
          1 kmeans = cluster.KMeans(n_clusters=NUM_CLUSTERS)
    ----> 2 kmeans.fit(Y)
          3 
          4 labels = kmeans.labels_
          5 centroids = kmeans.cluster_centers_

    ~\AppData\Local\Continuum\anaconda3\lib\site-packages\sklearn\cluster\k_means_.py in fit(self, X, y, sample_weight)
        969                 tol=self.tol, random_state=random_state, copy_x=self.copy_x,
        970                 n_jobs=self.n_jobs, algorithm=self.algorithm,
    --> 971                 return_n_iter=True)
        972         return self
        973 

    ~\AppData\Local\Continuum\anaconda3\lib\site-packages\sklearn\cluster\k_means_.py in k_means(X, n_clusters, sample_weight, init, precompute_distances, n_init, max_iter, verbose, tol, random_state, copy_x, n_jobs, algorithm, return_n_iter)
        309     order = "C" if copy_x else None
        310     X = check_array(X, accept_sparse='csr', dtype=[np.float64, np.float32],
    --> 311                     order=order, copy=copy_x)
        312     # verify that the number of samples given is larger than k
        313     if _num_samples(X) < n_clusters:

    ~\AppData\Local\Continuum\anaconda3\lib\site-packages\sklearn\utils\validation.py in check_array(array, accept_sparse, accept_large_sparse, dtype, order, copy, force_all_finite, ensure_2d, allow_nd, ensure_min_samples, ensure_min_features, warn_on_dtype, estimator)
        525             try:
        526                 warnings.simplefilter('error', ComplexWarning)
    --> 527                 array = np.asarray(array, dtype=dtype, order=order)
        528             except ComplexWarning:
        529                 raise ValueError("Complex data not supported\n"

    ~\AppData\Local\Continuum\anaconda3\lib\site-packages\numpy\core\numeric.py in asarray(a, dtype, order)
        536 
        537     """
    --> 538     return array(a, dtype, copy=False, order=order)
        539 
        540 

    ValueError: setting an array element with a sequence.
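
One thing worth checking, offered as a guess rather than a confirmed diagnosis: NumPy raises "setting an array element with a sequence" when the rows of `Y` do not all have the same length. With the `sent_vectorizer` above, a sentence in which no token is found in the model returns an empty array, while every other sentence returns a full-length vector. The sketch below shows one way to keep every row the same length. The toy dictionary `toy_model`, the `dim` parameter, and the use of `w in model` are assumptions made for illustration, not part of the original code or of any particular word2vec API.

```python
import numpy as np

def sent_vectorizer(sent, model, dim):
    """Average the vectors of the tokens in `sent` that `model` knows.

    `model` is assumed to support `token in model` and `model[token]`;
    `dim` is the assumed embedding size.
    """
    vectors = [model[w] for w in sent if w in model]
    if not vectors:
        # No known token: return zeros so every row has the same length.
        return np.zeros(dim)
    return np.mean(vectors, axis=0)

# Toy stand-in for a trained word2vec model (hypothetical values).
toy_model = {
    "cloud": np.array([1.0, 0.0, 0.0]),
    "erp": np.array([0.0, 1.0, 0.0]),
    "learning": np.array([0.0, 0.0, 1.0]),
}
all_words = [["cloud", "erp"], ["learning"], ["unknown", "tokens"]]

Y = [sent_vectorizer(s, toy_model, dim=3) for s in all_words]
X = np.vstack(Y)  # 2-D float array, one row per sentence
print(X.shape)
```

If the rows stack cleanly like this, `X` can be passed to `kmeans.fit(X)`. Note that an all-zero row has no direction, so with `nltk.cluster.util.cosine_distance` it may be better to drop such sentences than to keep them as zeros.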



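Can you post the error traceback? Check the type of `sent`, and use the edit option to add the full traceback to the question.

`type(sent)` is `str`. I have added the tracebacks for both code snippets.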
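
If `sent` really is a `str`, then `for w in sent` walks over single characters rather than words, so the lookups in the model would be made per character. A minimal sketch of the difference, assuming a plain whitespace split is an acceptable stand-in for the real tokenizer:

```python
sent = "moving erp cloud"

chars = [w for w in sent]  # iterating a str yields one character at a time
tokens = sent.split()      # whitespace tokenization yields whole words

print(chars[:6])
print(tokens)
```

Passing `tokens` instead of the raw string would make the loop see whole words.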