Support and lift for FP-growth rules in MLlib (Spark/Scala)
I want to use FP-growth and extract the support and lift of the association rules it generates. After finding the rules with the code below, I go through the transactions by hand and compute support and lift myself. Is there a more sensible way to extract this information? Thanks.
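import org.apache.spark.mllib.fpm.FPGrowth

// transactions: RDD[Array[String]], one array of items per transaction
val fpg = new FPGrowth()
  .setMinSupport(0.2)
  .setNumPartitions(10)
val model = fpg.run(transactions)

// Print every frequent itemset with its absolute frequency
model.freqItemsets.collect().foreach { itemset =>
  println(itemset.items.mkString("[", ",", "]") + ", " + itemset.freq)
}

// Print every rule that reaches the minimum confidence
val minConfidence = 0.8
model.generateAssociationRules(minConfidence).collect().foreach { rule =>
  println(
    rule.antecedent.mkString("[", ",", "]") +
      " => " + rule.consequent.mkString("[", ",", "]") +
      ", " + rule.confidence)
}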
Well, it's not elegant, but this is how I did it:
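// fpgrowth_model and get_rules are my own helper functions (not shown here);
// total_transactions is the number of transactions.
val freqs = fpgrowth_model(transactions, min_supp=supp)
// Relative support of each frequent itemset
val supps = freqs.withColumn("support", $"freq" / total_transactions)
val rules = get_rules(transactions, min_supp=supp, min_confidence=conf)
// lift = confidence / support(consequent)
val cross_df = supps.join(rules, $"items" === $"consequent")
  .withColumn("lift", $"confidence" / $"support")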
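For anyone on the RDD-based API from the question, here is a minimal sketch of the same calculation. It uses only `freqItemsets` and `generateAssociationRules`. The names `ScoredRule`, `RuleMetrics` and `scoreRules` are made up for illustration and are not part of Spark. It assumes the frequent itemsets fit in driver memory and that items are strings. Support of a rule is taken as the support of antecedent and consequent together, and lift as confidence divided by the support of the consequent.

```scala
import org.apache.spark.mllib.fpm.FPGrowth
import org.apache.spark.rdd.RDD

// Hypothetical container for one rule and its metrics (not a Spark class).
case class ScoredRule(
    antecedent: Set[String],
    consequent: Set[String],
    support: Double,
    confidence: Double,
    lift: Double)

object RuleMetrics {
  // Hypothetical helper: mines rules and attaches support and lift to each one.
  def scoreRules(
      transactions: RDD[Array[String]],
      minSupport: Double,
      minConfidence: Double): Seq[ScoredRule] = {
    val total = transactions.count().toDouble
    val model = new FPGrowth().setMinSupport(minSupport).run(transactions)

    // Relative support of every frequent itemset, keyed by its items as a Set.
    // Arrays are not compared by content, so they would not work as map keys.
    // Assumption: the collected itemsets are small enough for the driver.
    val supportOf: Map[Set[String], Double] =
      model.freqItemsets
        .map(fi => (fi.items.toSet, fi.freq / total))
        .collect()
        .toMap

    model.generateAssociationRules(minConfidence).collect().toSeq.map { rule =>
      val antecedent = rule.antecedent.toSet
      val consequent = rule.consequent.toSet
      ScoredRule(
        antecedent,
        consequent,
        // The rule comes from the frequent itemset antecedent ∪ consequent
        support = supportOf(antecedent ++ consequent),
        confidence = rule.confidence,
        // Every subset of a frequent itemset is frequent, so this lookup succeeds
        lift = rule.confidence / supportOf(consequent))
    }
  }
}
```

Calling `RuleMetrics.scoreRules(transactions, 0.2, 0.8)` with the `transactions` RDD from the question would return one `ScoredRule` per generated rule. If the frequent itemsets are too large to collect, the lookup map would have to be replaced by a join, as in the DataFrame version above.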
There is a JIRA ticket for this; I have just asked for it to be reopened. See