I'm trying to tabulate the branches of a BinaryTree (party) into a data frame in R
After fitting a tree with party::ctree(), I want to create a table that describes the characteristics of the branches. I have fitted these variables:
> summary(juridicos_segmentar)
actividad_economica
Financieras : 89
Gubernamental : 48
Sector Primario : 34
Sector Secundario:596
Sector Terciario :669
ingresos_cut
(-Inf,1.03e+08] :931
(1.03e+08,4.19e+08]:252
(4.19e+08,1.61e+09]:144
(1.61e+09, Inf] :109
egresos_cut
(-Inf,6e+07] :922
(6e+07,2.67e+08] :256
(2.67e+08,1.03e+09]:132
(1.03e+09, Inf] :126
patrimonio_cut
(-Inf,2.72e+08] :718
(2.72e+08,1.46e+09]:359
(1.46e+09,5.83e+09]:191
(5.83e+09, Inf] :168
op_ingreso_cut
(-Inf,3] :1308
(3,7] : 53
(7,22] : 44
(22, Inf]: 31
The first is categorical and the others are ordinal. I fit them against another factor variable:
> summary(as.factor(segmento))
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
27 66 30 39 36 33 39 15 84 70 271 247 101 34 100 74 47 25 48 50
I used the following code:
library(party)
fit_jur <- ctree(cluster ~ .,
data=data.frame(juridicos_segmentar, cluster=as.factor(segmento)))
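The problem is that there are several leaves to characterize, and sometimes a variable appears more than once along a path, so I would like to intersect the conditions, i.e. intersect the ranges.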
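The intersection described above can be sketched in base R. This is only an illustration under assumptions: it supposes the binned variables are stored as ordered factors, so that each split along a path has the form "variable <= level" or "variable > level". The helper intersect_levels and the example path are hypothetical and are not taken from the fitted tree; only the level labels come from the summary above.

```r
# Levels of ingresos_cut as shown in the summary above, lowest to highest.
ingresos_levels <- c("(-Inf,1.03e+08]", "(1.03e+08,4.19e+08]",
                     "(4.19e+08,1.61e+09]", "(1.61e+09, Inf]")

# Hypothetical helper: keep only the levels that satisfy every condition
# met along one root-to-leaf path. Each condition is list(op, level),
# where op is "<=" or ">".
intersect_levels <- function(lvls, conditions) {
  idx  <- seq_along(lvls)
  keep <- rep(TRUE, length(lvls))
  for (cond in conditions) {
    pos <- match(cond$level, lvls)
    if (is.na(pos)) stop("unknown level: ", cond$level)
    keep <- keep & switch(cond$op,
                          "<=" = idx <= pos,
                          ">"  = idx > pos,
                          stop("unsupported operator: ", cond$op))
  }
  lvls[keep]
}

# Made-up path in which ingresos_cut is split twice.
path_conditions <- list(
  list(op = "<=", level = "(4.19e+08,1.61e+09]"),
  list(op = ">",  level = "(-Inf,1.03e+08]")
)
remaining <- intersect_levels(ingresos_levels, path_conditions)
print(remaining)
```

Working on level positions rather than on the label strings avoids having to parse the interval text; the surviving levels can then be pasted into a single cell of the table for that leaf.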
I thought of data.tree::ToDataFrameTable, but I don't know how it works with party.
Thanks a lot, everyone.
Something like this is what I am after:
actividad economica ingresos (rango) egresos (rango) patrimonio (rango) operaciones de ingreso segmento
Sector Primario <=261.000.000 18
Sector Primario >261.000.000 20
You can convert the party class (from partykit) and the BinaryTree class (from party) to data.tree, and then use that to convert to a data frame and/or print. For example:
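library(party)
library(data.tree)  # as.Node() and ToDataFrameTable() come from data.tree
# Fit a conditional inference tree on the rows with a known Ozone value
airq <- subset(airquality, !is.na(Ozone))
airct <- ctree(Ozone ~ ., data = airq,
               controls = ctree_control(maxsurrogate = 3))
# Convert the BinaryTree to a data.tree Node, then tabulate one row per leaf
tree <- as.Node(airct)
df <- ToDataFrameTable(tree,
                       "pathString",
                       "label",
                       criterion = function(x) round(x$criterion$maxcriterion, 3),
                       statistic = function(x) round(max(x$criterion$statistic), 3)
)
df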
Plotting:
#print subtree
subtree <- Clone(tree$`2`)
SetNodeStyle(subtree,
style = "filled,rounded",
shape = "box",
fillcolor = "GreenYellow",
fontname = "helvetica",
label = function(x) x$label,
tooltip = function(x) round(x$criterion$maxcriterion, 3))
plot(subtree)
This might be similar to what you want to do:
Thank you, Achim. I am studying this solution.
Thank you, Christoph. But in this case I need the rules for each path. Note that weights is not one of the fitted variables. I need something
Sorry. Something like this: I need the decision rules, for each variable, that lead to the final fitted Ozone prediction. Wind | Temp | Month | Day | Solar.R | Ozone >z | w | prediction
Printing df from the example above gives:
pathString label criterion statistic
1 1/2/3 weights = 10 0.000 0.000
2 1/2/4/5 weights = 48 0.936 6.141
3 1/2/4/6 weights = 21 0.891 5.182
4 1/7/8 weights = 30 0.675 3.159
5 1/7/9 weights = 7 0.000 0.000
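The paths in this output identify the leaves but not the split conditions on each variable. A minimal base R sketch of the wide layout requested in the comments (one row per leaf, one column per variable) follows. It assumes the conditions along each path have already been collected by some other means; leaf_conditions, interval_for, the leaf names, and all thresholds are hypothetical placeholders and are not the splits of airct.

```r
# Hypothetical input: for each leaf, the split conditions met on the way
# down from the root. Thresholds are placeholders, NOT the splits of airct.
leaf_conditions <- list(
  leaf_a = data.frame(var = c("Temp", "Wind"), op = c("<=", ">"),
                      value = c(80, 7), stringsAsFactors = FALSE),
  leaf_b = data.frame(var = c("Temp", "Wind", "Temp"),
                      op = c("<=", "<=", ">"),
                      value = c(80, 7, 75), stringsAsFactors = FALSE)
)

# Collapse repeated conditions on one variable into a single interval:
# the largest lower bound and the smallest upper bound win.
interval_for <- function(cond, v) {
  rows <- cond[cond$var == v, , drop = FALSE]
  if (nrow(rows) == 0) return(NA_character_)
  lower <- max(c(-Inf, rows$value[rows$op == ">"]))
  upper <- min(c(Inf, rows$value[rows$op == "<="]))
  paste0("(", lower, ", ", upper, "]")
}

# One row per leaf, one column per variable.
vars <- c("Temp", "Wind")
rules <- do.call(rbind, lapply(names(leaf_conditions), function(id) {
  cond <- leaf_conditions[[id]]
  row <- lapply(vars, function(v) interval_for(cond, v))
  names(row) <- vars
  data.frame(leaf = id, row, stringsAsFactors = FALSE)
}))
print(rules)
```

A variable that never appears on a path gets NA in its column, which keeps "no restriction" distinct from an explicit range.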