Machine learning: Which clustering algorithm should I use to cluster job titles?

Tags: machine-learning, nlp, scikit-learn

I have a dataset of job titles, and I would like to cluster them.

The job titles include:

Automotive Service Worker
Community Police Services Aide
DEPUTY SHERIFF
COUNSELOR, JUVENILE HALL
Swimming Instructor
FIREFIGHTER
Porter
Account Clerk
Deputy Sheriff
Assistant Retirement Analyst
POLICE OFFICER III
Patient Care Assistant
Public Service Trainee
PUBLIC RELATIONS OFFICER
SPECIAL NURSE

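I will clean the titles (remove unwanted characters, uppercase all titles, and so on) to make them easier to work with. Once I vectorize the corpus, the dimensionality will be very large. Which clustering algorithms would you recommend for a problem like this? Does KMeans perform well on high-dimensional problems?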
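For reference, here is a minimal sketch of the pipeline described above: clean the titles, vectorize them, and run KMeans. It assumes scikit-learn's `TfidfVectorizer` as the vectorizer, which the question does not specify. The `clean_title` helper and `n_clusters=4` are illustrative choices, not tuned values.

```python
import re

from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

titles = [
    "Automotive Service Worker",
    "Community Police Services Aide",
    "DEPUTY SHERIFF",
    "COUNSELOR, JUVENILE HALL",
    "Swimming Instructor",
    "FIREFIGHTER",
    "Porter",
    "Account Clerk",
    "Deputy Sheriff",
    "Assistant Retirement Analyst",
    "POLICE OFFICER III",
    "Patient Care Assistant",
    "Public Service Trainee",
    "PUBLIC RELATIONS OFFICER",
    "SPECIAL NURSE",
]


def clean_title(title):
    """Hypothetical helper: uppercase, replace punctuation with spaces, collapse whitespace."""
    title = re.sub(r"[^A-Za-z0-9 ]+", " ", title.upper())
    return " ".join(title.split())


# Deduplicate after cleaning so "DEPUTY SHERIFF" and "Deputy Sheriff" become one entry.
cleaned = sorted({clean_title(t) for t in titles})

# TF-IDF yields a sparse matrix with one column per distinct token,
# so the column count grows with the vocabulary of the corpus.
vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(cleaned)

# n_clusters=4 is an arbitrary illustrative value (assumption, not from the question).
kmeans = KMeans(n_clusters=4, n_init=10, random_state=0)
labels = kmeans.fit_predict(X)

for label, title in sorted(zip(labels, cleaned)):
    print(label, title)
```

KMeans accepts the sparse matrix directly. With titles this short, most rows share no tokens at all, so the clusters on a sample this small are not meaningful. The sketch only shows the mechanics.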

Use Brown clustering. An implementation is available.

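Can you elaborate on why Brown clustering is a good choice for this problem?

It is hard to explain in a post. You may want to look at Collins's lectures. If I remember correctly, he covers this in his NLP course: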