Data size problem in a Python scatter plot

Tags: python, matplotlib, knn

I want to apply KMeans to the Wholesale customers data, which is available at the following URL:

My code so far is as follows:

import pandas as pd
import numpy as np
from sklearn.preprocessing import MinMaxScaler
from sklearn.cluster import KMeans
import matplotlib.pyplot as plt

data = pd.read_csv('Wholesale customers data.csv')
cont_features = ['Fresh', 'Milk', 'Grocery', 'Frozen', 'Detergents_Paper', 'Delicassen']
dataS=data[cont_features]
mms = MinMaxScaler()
mms.fit(dataS)
data_norm = mms.transform(dataS)
dataNorm=pd.DataFrame(data_norm,columns=cont_features)
kmeans = KMeans(n_clusters=5).fit(data)
centroids = kmeans.cluster_centers_
labels = kmeans.predict(data)
data=data.iloc[:,[3,4]].values #only to select two features for visualizing the scatter plot
plt.scatter(data[labels==0, 0], data[labels==0, 1], s=10, c='red', label ='Cluster 1')
plt.scatter(data[labels==1, 0], data[labels==1, 1], s=10, c='blue', label ='Cluster 2')
plt.scatter(data[labels==2, 0], data[labels==2, 1], s=10, c='green', label ='Cluster 3')
plt.scatter(data[labels==3, 0], data[labels==3, 1], s=10, c='cyan', label ='Cluster 4')
plt.scatter(data[labels==4, 0], data[labels==3, 1], s=10, c='cyan', label ='Cluster 5')
    
plt.scatter(centroids[:, 0], centroids[:, 1], s=10, c='yellow', label = 'Centroids')
plt.title('Clusters')
plt.xlabel('Frozen')
plt.ylabel('Detergent')
plt.show()
The problem is that when I run the code, I get the following error:

x and y must be the same size
My plot looks like this:

I can't find the cause of this error. Any help?

plt.scatter(data[labels==4, 0], data[labels==4, 1], s=10, c='cyan', label='Cluster 5')
You had a 3 here in the labels part of the y argument (labels==3 instead of labels==4).
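
To see why the mismatched label raises "x and y must be the same size", here is a minimal sketch using only NumPy. The arrays `points` and `labels` are hypothetical stand-ins for the two plotted columns and the KMeans labels, and `cluster_xy` is an illustrative helper, not part of the original code. A boolean mask keeps only the rows where it is True, so two different masks generally return arrays of different lengths.

```python
import numpy as np

# Hypothetical stand-ins: 12 two-column points and a cluster label per row.
rng = np.random.default_rng(0)
points = rng.random((12, 2))
labels = np.array([0, 1, 2, 3, 4, 4, 3, 3, 2, 1, 0, 3])

# The typo from the question: x uses cluster 4, y uses cluster 3.
x_wrong = points[labels == 4, 0]   # 2 rows belong to cluster 4
y_wrong = points[labels == 3, 1]   # 4 rows belong to cluster 3
# plt.scatter(x_wrong, y_wrong) would fail because the lengths differ.

def cluster_xy(points, labels, k):
    """Return the x and y columns for cluster k using one shared mask."""
    mask = labels == k
    return points[mask, 0], points[mask, 1]

# Building the mask once guarantees x and y have the same length.
x_ok, y_ok = cluster_xy(points, labels, 4)
```

Calling such a helper in a loop over the cluster ids, instead of copying the `plt.scatter` line five times, is one way to avoid this kind of copy-paste slip.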

Your plot:


I suggest plotting the points in 3D for better visualization.
Thanks, @ombk. Could you give me some hints?
It's simple. Check out matplotlib's 3D plotting.
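
As a rough sketch of the 3D suggestion, the snippet below draws one scatter call per cluster on a matplotlib 3D axes. The function name `plot_clusters_3d`, the random demo data, and the choice of three feature names for the axes are assumptions made for illustration; in practice you would pass three columns of the real data and the labels returned by KMeans.

```python
import numpy as np
import matplotlib.pyplot as plt

def plot_clusters_3d(points, labels, axis_names=("x", "y", "z")):
    """Scatter three-column points in 3D, one colour per cluster label."""
    fig = plt.figure()
    ax = fig.add_subplot(projection="3d")
    for k in np.unique(labels):
        mask = labels == k  # one mask shared by all three coordinates
        ax.scatter(points[mask, 0], points[mask, 1], points[mask, 2],
                   s=10, label=f"Cluster {k + 1}")
    ax.set_xlabel(axis_names[0])
    ax.set_ylabel(axis_names[1])
    ax.set_zlabel(axis_names[2])
    ax.legend()
    return fig, ax

# Hypothetical demo data: 30 random points assigned to 5 clusters.
rng = np.random.default_rng(0)
demo_points = rng.random((30, 3))
demo_labels = rng.integers(0, 5, size=30)
fig, ax = plot_clusters_3d(demo_points, demo_labels,
                           ("Grocery", "Frozen", "Detergents_Paper"))
```

Call `plt.show()` afterwards to display the figure in an interactive session.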