Scala: Adding custom codecs to CassandraConnector

Tags: scala, apache-spark, cassandra, spark-streaming, spark-cassandra-connector

Is there a way to register custom codecs when instantiating CassandraConnector?

At the moment I register my codecs on every call to cassandraConnector.withSessionDo:

val cassandraConnector = CassandraConnector(ssc.sparkContext.getConf)
...
...
.mapPartitions(partition => {
  cassandraConnector.withSessionDo(session => {
    // register custom codecs once for each partition so it isn't loaded as often for each data point
    if (partition.nonEmpty) {
      session.getCluster.getConfiguration.getCodecRegistry
        .register(new TimestampLongCodec)
        .register(new SummaryStatsBlobCodec)
        .register(new JavaHistogramBlobCodec)
    }
    // ... remainder of the per-partition work (elided in the original)
  })
})
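
Doing it this way feels like an anti-pattern. It also clogs up our logs: we have a Spark Streaming service that runs every 30 seconds, and each run fills the logs with the following: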

16/11/01 14:14:44 WARN CodecRegistry: Ignoring codec SummaryStatsBlobCodec [blob <-> SummaryStats] because it collides with previously registered codec SummaryStatsBlobCodec [blob <-> SummaryStats]
16/11/01 14:14:44 WARN CodecRegistry: Ignoring codec JavaHistogramBlobCodec [blob <-> Histogram] because it collides with previously registered codec JavaHistogramBlobCodec [blob <-> Histogram]
16/11/01 14:14:44 WARN CodecRegistry: Ignoring codec TimestampLongCodec [timestamp <-> java.lang.Long] because it collides with previously registered codec TimestampLongCodec [timestamp <-> java.lang.Long]

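This works when run locally, but when deployed to our Mesos cluster the codecs are not found. I assume this is because they are only registered locally in the driver and never added to the executors' copy.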

A better approach is to override the Cassandra connection factory, something like this:

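import com.datastax.driver.core.Cluster
import com.datastax.spark.connector.cql.{CassandraConnectionFactory, CassandraConnectorConf, DefaultConnectionFactory}

// Connection factory that registers the custom codecs whenever a Cluster is created.
// TimestampLongCodec, SummaryStatsBlobCodec and JavaHistogramBlobCodec are the
// codec classes from the question and must be on the classpath.
object MyConnectionFactory extends CassandraConnectionFactory {
  override def createCluster(conf: CassandraConnectorConf): Cluster = {
    // Build the cluster the default way, then add the codecs to its registry.
    val cluster = DefaultConnectionFactory.createCluster(conf)
    cluster.getConfiguration.getCodecRegistry
      .register(new TimestampLongCodec)
      .register(new SummaryStatsBlobCodec)
      .register(new JavaHistogramBlobCodec)
    cluster
  }
}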

and set the spark.cassandra.connection.factory parameter to point to that class.
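
As a rough illustration of that last step, the parameter could be set on the SparkConf before the connector is created. This is only a sketch: the package name com.example, the application name, and the host value are assumptions made up for the example, and the factory value should be the fully qualified name of wherever MyConnectionFactory actually lives.

```scala
import org.apache.spark.SparkConf
import com.datastax.spark.connector.cql.CassandraConnector

// Assumption: MyConnectionFactory is defined in a package called com.example
// (hypothetical). Replace it with the fully qualified name of your own object.
val conf = new SparkConf()
  .setAppName("custom-codec-example")                  // hypothetical app name
  .set("spark.cassandra.connection.host", "127.0.0.1") // hypothetical host
  .set("spark.cassandra.connection.factory", "com.example.MyConnectionFactory")

// Build the connector from the same configuration, as in the question.
// Codec registration now happens inside MyConnectionFactory.createCluster,
// so it no longer needs to be repeated inside withSessionDo.
val cassandraConnector = CassandraConnector(conf)
```

The same property can presumably also be passed at submit time (for example with --conf), since it is an ordinary Spark configuration key.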

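Comments:

Why not register them just once, e.g. right after instantiating the connector? They are registered on the session, so if you reuse it you should be fine.

I thought I had added something about this to my question, sorry. I have edited my question. Is that how you meant to register them?

I really like this answer. Thanks for the help.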