Genetic algorithm for classification and fitness evaluation

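I have been reading Tom Mitchell's book on machine learning, specifically the part on genetic algorithms for classification. The example they give is fairly simple. They say that if I have the following: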

age: continuous.
workclass: Private, Self-emp-not-inc, Self-emp-inc, Federal-gov, Local-gov, State-gov, Without-pay, Never-worked.
fnlwgt: continuous.
education: Bachelors, Some-college, 11th, HS-grad, Prof-school, Assoc-acdm, Assoc-voc, 9th, 7th-8th, 12th, Masters, 1st-4th, 10th, Doctorate, 5th-6th, Preschool.
education-num: continuous.
marital-status: Married-civ-spouse, Divorced, Never-married, Separated, Widowed, Married-spouse-absent, Married-AF-spouse.
occupation: Tech-support, Craft-repair, Other-service, Sales, Exec-managerial, Prof-specialty, Handlers-cleaners, Machine-op-inspct, Adm-clerical, Farming-fishing, Transport-moving, Priv-house-serv, Protective-serv, Armed-Forces.
relationship: Wife, Own-child, Husband, Not-in-family, Other-relative, Unmarried.
race: White, Asian-Pac-Islander, Amer-Indian-Eskimo, Other, Black.
sex: Female, Male.
capital-gain: continuous.
capital-loss: continuous.
hours-per-week: continuous.
native-country

then the fitness function can be defined as:

I want to apply this approach to classifying census income data, which has the following form:

39, State-gov, 77516, Bachelors, 13, Never-married, Adm-clerical, Not-in-family, White, Male, 2174, 0, 40, United-States, <=50K
50, Self-emp-not-inc, 83311, Bachelors, 13, Married-civ-spouse, Exec-managerial, Husband, White, Male, 0, 0, 13, United-States, <=50K
38, Private, 215646, HS-grad, 9, Divorced, Handlers-cleaners, Not-in-family, White, Male, 0, 0, 40, United-States, <=50K
53, Private, 234721, 11th, 7, Married-civ-spouse, Handlers-cleaners, Husband, Black, Male, 0, 0, 40, United-States, <=50K
28, Private, 338409, Bachelors, 13, Married-civ-spouse, Prof-specialty, Wife, Black, Female, 0, 0, 40, Cuba, <=50K

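In the end, what I want is a classifier that, given some attributes, can predict whether a person's income is below or above 50,000. How do I model the fitness function for this case?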

Usually, genetic programming is used for this purpose. Here is an article describing such a case:

If you are looking for source code, you can use Riccardo Poli's Tiny GP. However, you first have to convert all attributes to numeric values.
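To make the conversion step and the fitness question more concrete, here is a minimal Python sketch. It is not taken from Tiny GP or from the article; the names `encode`, `fitness`, and `toy_candidate` are hypothetical, and the scheme (integer codes for categorical columns, fitness as the fraction of records classified correctly, output above 0 meaning ">50K") is only one assumption among several reasonable choices.

```python
# Hypothetical sketch: none of these names come from Tiny GP or any library.
RECORDS = [
    "39, State-gov, 77516, Bachelors, 13, Never-married, Adm-clerical, Not-in-family, White, Male, 2174, 0, 40, United-States, <=50K",
    "50, Self-emp-not-inc, 83311, Bachelors, 13, Married-civ-spouse, Exec-managerial, Husband, White, Male, 0, 0, 13, United-States, <=50K",
    "38, Private, 215646, HS-grad, 9, Divorced, Handlers-cleaners, Not-in-family, White, Male, 0, 0, 40, United-States, <=50K",
    "53, Private, 234721, 11th, 7, Married-civ-spouse, Handlers-cleaners, Husband, Black, Male, 0, 0, 40, United-States, <=50K",
    "28, Private, 338409, Bachelors, 13, Married-civ-spouse, Prof-specialty, Wife, Black, Female, 0, 0, 40, Cuba, <=50K",
]

# Columns marked "continuous" in the attribute list (0-based positions).
NUMERIC_COLUMNS = {0, 2, 4, 10, 11, 12}
LABEL_COLUMN = 14


def encode(lines):
    """Turn raw comma-separated records into numeric rows and 0/1 labels."""
    codebooks = {}  # column index -> {category string: integer code}
    features, labels = [], []
    for line in lines:
        fields = [field.strip() for field in line.split(",")]
        row = []
        for col, value in enumerate(fields[:LABEL_COLUMN]):
            if col in NUMERIC_COLUMNS:
                row.append(float(value))
            else:
                # Assign the next unused integer to each new category.
                book = codebooks.setdefault(col, {})
                row.append(float(book.setdefault(value, len(book))))
        features.append(row)
        labels.append(1 if fields[LABEL_COLUMN] == ">50K" else 0)
    return features, labels, codebooks


def fitness(candidate, features, labels):
    """Fraction of records the candidate classifies correctly."""
    correct = sum(
        1
        for row, label in zip(features, labels)
        if (1 if candidate(row) > 0 else 0) == label
    )
    return correct / len(labels)


def toy_candidate(row):
    # Stand-in for an evolved expression: education-num minus 10.
    return row[4] - 10.0


features, labels, codebooks = encode(RECORDS)
score = fitness(toy_candidate, features, labels)
print(score)
```

Integer codes impose an arbitrary order on categories such as workclass, so a one-hot encoding may suit some GP setups better. Accuracy is also only one possible fitness measure; if one income class dominates the data, a measure that weights the two classes separately might be preferable.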


You can also use other GP variants. I have made an implementation of Multi Expression Programming here: