Scala Cassandra/DataStax: setting the DataStax package programmatically
The following spark-submit script works:
nohup ./bin/spark-submit --jars ./ikoda/extrajars/ikoda_assembled_ml_nlp.jar,./ikoda/extrajars/stanford-corenlp-3.8.0.jar,./ikoda/extrajars/stanford-parser-3.8.0.jar \
--packages datastax:spark-cassandra-connector:2.0.1-s_2.11 \
--class ikoda.mlserver.Application \
--conf spark.cassandra.connection.host=192.168.0.33 \
--master local[*] ./ikoda/ikodaanalysis-mlserver-0.1.0.jar 1000 > ./logs/nohup.out &
Programmatically, I can set up the same configuration through the SparkConf that is passed to the SparkContext:
val conf = new SparkConf().setMaster("local[4]").setAppName("MLPCURLModelGenerationDataStream")
conf.set("spark.streaming.stopGracefullyOnShutdown", "true")
conf.set("spark.cassandra.connection.host", sparkcassandraconnectionhost)
conf.set("spark.driver.maxResultSize", sparkdrivermaxResultSize)
conf.set("spark.network.timeout", sparknetworktimeout)
Question:
Can I add --packages datastax:spark-cassandra-connector:2.0.1-s_2.11 programmatically? If so, how?

The corresponding option is spark.jars.packages:
conf.set(
"spark.jars.packages",
"datastax:spark-cassandra-connector:2.0.1-s_2.11")
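
For reference, here is a minimal sketch that combines the configuration from the question with the spark.jars.packages option in one self-contained program. The object name PackagesConfSketch and the helper buildConf are hypothetical, introduced only for illustration; the host value is the one from the spark-submit script above. One caveat: as far as I know, package resolution is carried out by spark-submit when the application is launched, so setting this option inside an application whose driver JVM is already running may have no effect. If that happens, supplying it through --conf or spark-defaults.conf is likely the more dependable route.

```scala
import org.apache.spark.SparkConf

// Hypothetical wrapper object, used only to make the sketch self-contained.
object PackagesConfSketch {
  // Maven coordinates copied from the --packages flag in the question.
  val connectorPackage = "datastax:spark-cassandra-connector:2.0.1-s_2.11"

  // Builds a SparkConf that carries the same settings as the command line.
  def buildConf(cassandraHost: String): SparkConf =
    new SparkConf()
      .setMaster("local[4]")
      .setAppName("MLPCURLModelGenerationDataStream")
      .set("spark.cassandra.connection.host", cassandraHost)
      // Configuration-key counterpart of the --packages flag.
      .set("spark.jars.packages", connectorPackage)

  def main(args: Array[String]): Unit = {
    val conf = buildConf("192.168.0.33")
    // Print the effective settings to check what the application will see.
    println(conf.toDebugString)
  }
}
```

Printing conf.toDebugString is a quick way to confirm that the key is present before the SparkContext is created.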