Spring Boot: group-by with the Mongo Spark Java connector

I store events from a client mobile application on my server; the event store is MongoDB. I have a MongoSpark connector that fetches the list of these events, and it should expose them through a REST API. Eventually this should be streamed, but for now I am trying to return it in a single call.

So far I have written the controller as follows:

import com.mongodb.spark.MongoSpark
import com.mongodb.spark.rdd.api.java.JavaMongoRDD
import org.apache.spark.api.java.JavaSparkContext
import org.bson.Document

@RestController
@RequestMapping("/analytics")
class EventController @Autowired constructor(val eventMongoServiceImpl: EventMongoServiceImpl,
                                             val javaSparkContext: JavaSparkContext) {

    @GetMapping("/event")
    fun getEvent(): ResponseEntity<EventResponse> {
        // Load the events collection from MongoDB as an RDD of BSON documents
        val customRdd: JavaMongoRDD<Document> = MongoSpark.load(javaSparkContext)
        // Convert the RDD to a DataFrame so it can be aggregated with Spark SQL
        val toDF = customRdd.toDF()
        // TODO: aggregate the DataFrame and map it into an EventResponse
    }
}
My dataset looks like this:

/* 1 */
{
    "_id" : ObjectId("5e61e38eb8425d3b1c7679ea"),
    "name" : "Event A",
    "description" : "Event A Description",
    "date" : ISODate("2020-03-05T18:30:00.000Z"),
    "_class" : "x"
}

/* 2 */
{
    "_id" : ObjectId("5e61e416b8425d3b1c7679ec"),
    "name" : "Event A",
    "description" : "Event A Description",
    "date" : ISODate("2020-03-05T18:30:00.000Z"),
    "_class" : "x"
}

/* 3 */
{
    "_id" : ObjectId("5e61e47fb8425d3b1c7679ee"),
    "name" : "Event A",
    "description" : "Event A Description",
    "date" : ISODate("2020-03-05T18:30:00.000Z"),
    "_class" : "x"
}

You should be able to produce something like this on the dataframe:

// count and max come from org.apache.spark.sql.functions
val aggDf = toDF
    .groupBy("name")
    .agg(count("name"), max("description"))
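What this aggregation computes can be illustrated without Spark, using plain Java collections over the three sample documents. The Event record and its field names mirror the JSON above; everything else here is an assumption for illustration, not part of the original code:

```java
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class GroupBySketch {
    record Event(String name, String description) {}

    public static void main(String[] args) {
        // The three sample documents, reduced to the fields the aggregation uses.
        List<Event> events = List.of(
                new Event("Event A", "Event A Description"),
                new Event("Event A", "Event A Description"),
                new Event("Event A", "Event A Description"));

        // Equivalent of groupBy("name").agg(count("name")): events per name.
        Map<String, Long> countByName = events.stream()
                .collect(Collectors.groupingBy(Event::name, Collectors.counting()));

        // Equivalent of agg(max("description")): lexicographic max per name.
        Map<String, String> maxDescriptionByName = events.stream()
                .collect(Collectors.groupingBy(Event::name,
                        Collectors.mapping(Event::description,
                                Collectors.collectingAndThen(
                                        Collectors.maxBy(Comparator.<String>naturalOrder()),
                                        opt -> opt.orElse("")))));

        System.out.println(countByName);          // {Event A=3}
        System.out.println(maxDescriptionByName); // {Event A=Event A Description}
    }
}
```

Since all three sample documents share the same name, the result is a single group with a count of 3 and the one description as its max.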

Now, on the new dataframe aggDf, you can call aggDf.toJSON and get the result. If the columns do not match the desired output, you can adjust them with withColumnRenamed.

I get an error:

Error:(36, 18) Kotlin: None of the following functions can be called with the arguments supplied:
public open fun agg(p0: Column!, vararg p1: Column!): Dataset! defined in org.apache.spark.sql.RelationalGroupedDataset
public open fun agg(p0: Column!, p1: Seq!): Dataset! defined in org.apache.spark.sql.RelationalGroupedDataset
public open fun agg(p0: Tuple2!, p1: Seq!): Dataset! defined in org.apache.spark.sql.RelationalGroupedDataset

I wrote it in Scala; converting it to Java should be very similar.
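One detail worth flagging for the toJSON step: Spark names the aggregated columns count(name) and max(description) by default, which is exactly the case where withColumnRenamed helps. A plain-Java sketch of the single JSON line the pipeline could emit for the sample data above, after renaming those columns to friendlier keys (the key names here are assumptions for illustration, not actual Spark output):

```java
public class AggJsonSketch {
    public static void main(String[] args) {
        // Aggregated values for the single "Event A" group in the sample data:
        // three documents share the name, all with the same description.
        String name = "Event A";
        long count = 3L;                               // result of count("name")
        String maxDescription = "Event A Description"; // result of max("description")

        // Default Spark column names would be "count(name)" and "max(description)";
        // withColumnRenamed lets you expose e.g. "count" and "description" instead.
        String json = String.format(
                "{\"name\":\"%s\",\"count\":%d,\"description\":\"%s\"}",
                name, count, maxDescription);

        System.out.println(json);
        // {"name":"Event A","count":3,"description":"Event A Description"}
    }
}
```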