Creating clusters in R

r scikit-learn

I have a df that looks like this:

selection.body selection.hair selection.eyes selection.breasts selection.butt selection.skin         
normal         blonde          other             large         medium         tanned
normal         blonde          other                xl         medium         tanned
normal         blonde          other             large         medium         tanned
chubby         blonde           blue                xl          large         tanned
slim           blonde          other            medium          small          white

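Think of this dataset as the answers to a survey:

Each row holds a single respondent's choices, each one picked from a closed set of options.

So far I have checked the frequency of each individual choice, but I would like to go further.

My goals are:

Identify the most common combinations of choices
Group users according to those combinations
Find correlations between the choices

Thanks for any hints.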

Finding the most common combinations is not clustering; it is frequent itemset mining.

Have you tried Apriori?
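A minimal sketch of what frequent itemset mining with Apriori could look like on the sample rows from the question. It assumes the arules package is installed; the 0.4 support threshold and the minlen = 2 setting are illustrative choices, not something the answer specifies.

```r
library(arules)

# The five sample rows from the question, stored as factors so that
# arules can turn every column=value pair into an item.
survey <- data.frame(
  selection.body    = c("normal", "normal", "normal", "chubby", "slim"),
  selection.hair    = c("blonde", "blonde", "blonde", "blonde", "blonde"),
  selection.eyes    = c("other", "other", "other", "blue", "other"),
  selection.breasts = c("large", "xl", "large", "xl", "medium"),
  selection.butt    = c("medium", "medium", "medium", "large", "small"),
  selection.skin    = c("tanned", "tanned", "tanned", "tanned", "white"),
  stringsAsFactors  = TRUE
)

# One transaction per respondent.
trans <- as(survey, "transactions")

# Keep itemsets of at least two items that appear in at least 40% of rows
# (both thresholds are assumptions for this tiny example).
itemsets <- apriori(
  trans,
  parameter = list(support = 0.4, minlen = 2, target = "frequent itemsets"),
  control   = list(verbose = FALSE)
)

# Most frequent combinations first.
inspect(sort(itemsets, by = "support"))
```

On real survey data the support threshold would need tuning: too low and the list explodes, too high and only trivial combinations survive.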

Try data.table. The syntax below should be enough to answer your first two questions:

dt[, .(Count = .N), by = .(col1, col2, ...)]
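As a runnable sketch of the idea above, applied to the sample rows from the question: count every full combination of answers, then tag each respondent with the id of their combination. It assumes the data.table package is installed; the names combo_counts and group are made up for illustration.

```r
library(data.table)

# The five sample rows from the question.
dt <- data.table(
  selection.body    = c("normal", "normal", "normal", "chubby", "slim"),
  selection.hair    = c("blonde", "blonde", "blonde", "blonde", "blonde"),
  selection.eyes    = c("other", "other", "other", "blue", "other"),
  selection.breasts = c("large", "xl", "large", "xl", "medium"),
  selection.butt    = c("medium", "medium", "medium", "large", "small"),
  selection.skin    = c("tanned", "tanned", "tanned", "tanned", "white")
)

# Copy the column names so the vector is not affected when a column
# is later added by reference.
cols <- copy(names(dt))

# Goal 1: how often does each complete combination occur?
combo_counts <- dt[, .(Count = .N), by = cols][order(-Count)]
print(combo_counts)

# Goal 2: give every respondent the id of their combination.
dt[, group := .GRP, by = cols]
print(dt)
```

Grouping on a subset of the columns (for example only body and skin) works the same way by passing a shorter character vector to by.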

For the third question, try ?cor from base R, or similar functions from packages.
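Note that cor expects numeric input, while the answers here are categorical. One possible alternative, which is not part of the original answer, is a chi-squared based association measure such as Cramér's V. The sketch below uses only base R; the helper cramers_v is a hypothetical name, and with only five rows the numbers are purely illustrative.

```r
# The five sample rows from the question.
survey <- data.frame(
  selection.body    = c("normal", "normal", "normal", "chubby", "slim"),
  selection.hair    = c("blonde", "blonde", "blonde", "blonde", "blonde"),
  selection.eyes    = c("other", "other", "other", "blue", "other"),
  selection.breasts = c("large", "xl", "large", "xl", "medium"),
  selection.butt    = c("medium", "medium", "medium", "large", "small"),
  selection.skin    = c("tanned", "tanned", "tanned", "tanned", "white")
)

# Cramér's V for two categorical vectors (hypothetical helper).
# Returns NA when either variable has fewer than two observed levels.
cramers_v <- function(x, y) {
  tab <- table(factor(x), factor(y))
  k <- min(nrow(tab), ncol(tab))
  if (k < 2) return(NA_real_)
  # Warnings about small expected counts are suppressed for this toy data.
  chi2 <- suppressWarnings(chisq.test(tab, correct = FALSE)$statistic)
  as.numeric(sqrt(chi2 / (sum(tab) * (k - 1))))
}

# Pairwise association matrix over all selection columns.
vars <- names(survey)
v_mat <- sapply(vars, function(a) {
  sapply(vars, function(b) cramers_v(survey[[a]], survey[[b]]))
})
print(round(v_mat, 2))
```

Values near 1 indicate that knowing one answer almost determines the other; columns with a single observed value (hair, in the sample) yield NA.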

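It seems to me you are just handing your task over to other people!

I did not ask for any code. I only asked for some brainstorming and a few suggestions to get started. It seems to me you have a lot of time on your hands.

What comes to mind is finding correlations among multiple categorical variables: log-linear models and the MASS package, followed by a mosaic plot.

Very useful. But in fact, this is not clustering. What is Apriori about?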
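The log-linear model and mosaic plot suggestion could be sketched roughly as follows, assuming the comment refers to loglm from the MASS package. Only two of the selection columns are used, and the independence model is an illustrative choice rather than anything stated in the thread.

```r
library(MASS)

# Two columns from the sample rows in the question.
survey <- data.frame(
  selection.body = c("normal", "normal", "normal", "chubby", "slim"),
  selection.skin = c("tanned", "tanned", "tanned", "tanned", "white")
)

# Contingency table of the two answers.
tab <- xtabs(~ selection.body + selection.skin, data = survey)
print(tab)

# Log-linear model assuming the two answers are independent; a large
# test statistic relative to its degrees of freedom suggests they are not.
fit <- loglm(~ selection.body + selection.skin, data = tab)
print(fit)

# Mosaic plot: tile areas are proportional to the cell counts.
mosaicplot(tab, main = "Body vs. skin selections")
```

With more variables, interaction terms such as selection.body:selection.skin can be added to the formula and the competing models compared.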