Support and lift for FP-growth association rules in Spark MLlib (Scala)


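I want to use FP-growth to extract the support and lift of the association rules it generates. Currently, after finding the rules with the code below, I go through the transactions manually and compute support and lift myself. Is there a more sensible way to get this information? Thanks.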

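import org.apache.spark.mllib.fpm.FPGrowth

// transactions is assumed to be an RDD[Array[String]], one array of items per transaction
val fpg = new FPGrowth()
  .setMinSupport(0.2)
  .setNumPartitions(10)
val model = fpg.run(transactions)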

model.freqItemsets.collect().foreach { itemset =>
  println(itemset.items.mkString("[", ",", "]") + ", " + itemset.freq)
}

val minConfidence = 0.8
model.generateAssociationRules(minConfidence).collect().foreach { rule =>
  println(
    rule.antecedent.mkString("[", ",", "]")
      + " => " + rule.consequent.mkString("[", ",", "]")
      + ", " + rule.confidence)
}

Well, it's not elegant, but this is how I did it:

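// fpgrowth_model and get_rules are my own helpers (not shown); both return DataFrames
val freqs = fpgrowth_model(transactions, min_supp = supp)
// support = itemset frequency / total number of transactions
val supps = freqs.withColumn("support", $"freq" / total_transactions)
val rules = get_rules(transactions, min_supp = supp, min_confidence = conf)
// join each rule to the support of its consequent: lift = confidence / support(consequent)
val cross_df = supps.join(rules, $"items" === $"consequent")
  .withColumn("lift", $"confidence" / $"support")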
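
For the RDD-based model in the question, the same idea can be sketched without a join. The rule's support is freq(antecedent ∪ consequent) / N and its lift is confidence / (freq(consequent) / N), where N is the number of transactions. Both counts should already be in `model.freqItemsets`, since every subset of a frequent itemset is itself frequent. This is only a minimal sketch: `RuleIn`, `RuleStats` and `RuleMetrics` are hypothetical names introduced for illustration, not part of Spark, and the commented lines showing how to collect the inputs from the model are an assumption that has not been run here.

```scala
// Inputs could be collected from the MLlib model roughly like this (assumption):
//   val freqs = model.freqItemsets.collect().map(fi => fi.items.toSet -> fi.freq).toMap
//   val rules = model.generateAssociationRules(minConfidence).collect()
//     .map(r => RuleIn(r.antecedent.toSet, r.consequent.toSet, r.confidence)).toSeq
//   val total = transactions.count()

// Hypothetical containers for illustration; not Spark classes.
final case class RuleIn(antecedent: Set[String], consequent: Set[String], confidence: Double)
final case class RuleStats(rule: RuleIn, support: Double, lift: Double)

object RuleMetrics {
  // Attach support and lift to each rule, using frequent-itemset counts only.
  def withSupportAndLift(
      rules: Seq[RuleIn],
      freqs: Map[Set[String], Long],
      total: Long): Seq[RuleStats] =
    rules.flatMap { r =>
      for {
        unionFreq <- freqs.get(r.antecedent ++ r.consequent) // count of A and C together
        consFreq  <- freqs.get(r.consequent)                 // count of C alone
      } yield {
        val support     = unionFreq.toDouble / total // support of the whole rule
        val consSupport = consFreq.toDouble / total  // support of the consequent
        RuleStats(r, support, r.confidence / consSupport)
      }
    }
}
```

Rules whose itemsets are missing from `freqs` are silently dropped, which keeps the sketch short; you may prefer to fail loudly instead. Newer Spark releases may already report some of these metrics on the DataFrame-based `ml.fpm` API, so it is worth checking the documentation for your version before computing them by hand.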

There is a JIRA ticket about this; I have just asked for it to be reopened.