Apache Spark: Could not get a Transport from the Transport Pool for host
I am trying to write from Spark Structured Streaming on IBM Analytics Engine to an IBM Compose Elasticsearch sink. My Spark code:
dataDf
.writeStream
.outputMode(OutputMode.Append)
.format("org.elasticsearch.spark.sql")
.queryName("ElasticSink")
.option("checkpointLocation", s"${s3Url}/checkpoint_elasticsearch")
.option("es.nodes", "xxx1.composedb.com,xxx2.composedb.com")
.option("es.port", "xxxx")
.option("es.net.http.auth.user", "admin")
.option("es.net.http.auth.pass", "xxxx")
.option("es.net.ssl", true)
.option("es.nodes.wan.only", true)
.option("es.net.ssl.truststore.location", SparkFiles.getRootDirectory() + "/my.jks")
.option("es.net.ssl.truststore.pass", "xxxx")
.start("test/broadcast")
However, I get the following exception:
org.elasticsearch.hadoop.EsHadoopException: Could not get a Transport from the Transport Pool for host [xxx2.composedb.com:xxxx]
at org.elasticsearch.hadoop.rest.pooling.PooledHttpTransportFactory.borrowFrom(PooledHttpTransportFactory.java:106)
at org.elasticsearch.hadoop.rest.pooling.PooledHttpTransportFactory.create(PooledHttpTransportFactory.java:55)
at org.elasticsearch.hadoop.rest.NetworkClient.selectNextNode(NetworkClient.java:99)
at org.elasticsearch.hadoop.rest.NetworkClient.<init>(NetworkClient.java:82)
at org.elasticsearch.hadoop.rest.NetworkClient.<init>(NetworkClient.java:59)
at org.elasticsearch.hadoop.rest.RestClient.<init>(RestClient.java:94)
at org.elasticsearch.hadoop.rest.InitializationUtils.discoverEsVersion(InitializationUtils.java:317)
at org.elasticsearch.hadoop.rest.RestService.createWriter(RestService.java:576)
at org.elasticsearch.spark.rdd.EsRDDWriter.write(EsRDDWriter.scala:58)
at org.elasticsearch.spark.sql.streaming.EsStreamQueryWriter.run(EsStreamQueryWriter.scala:41)
at org.elasticsearch.spark.sql.streaming.EsSparkSqlStreamingSink$$anonfun$addBatch$2$$anonfun$2.apply(EsSparkSqlStreamingSink.scala:52)
at org.elasticsearch.spark.sql.streaming.EsSparkSqlStreamingSink$$anonfun$addBatch$2$$anonfun$2.apply(EsSparkSqlStreamingSink.scala:51)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
at org.apache.spark.scheduler.Task.run(Task.scala:109)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Any ideas?

Update: I modified the elasticsearch-hadoop library to log the underlying exception, and the root problem is that the truststore cannot be found:
org.elasticsearch.hadoop.EsHadoopIllegalStateException: Cannot initialize SSL - Expected to find keystore file at [/tmp/spark-e2203f9c-4f0f-4929-870f-d491fce0ad06/userFiles-62df70b0-7b76-403d-80a1-8845fd67e6a0/my.jks] but was unable to. Make sure that it is available on the classpath, or if not, that you have specified a valid URI.
at org.elasticsearch.hadoop.rest.pooling.PooledHttpTransportFactory.borrowFrom(PooledHttpTransportFactory.java:106)
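The path in that message is the driver's `SparkFiles` directory; the executors run on other hosts, where that exact `/tmp/spark-.../userFiles-.../my.jks` path does not exist when es-hadoop tries to open the truststore. One way around this (a sketch only, not a confirmed fix: it assumes the job is launched with `spark-submit --files /local/path/my.jks`, which copies `my.jks` into every executor's working directory) is to reference the truststore by bare file name so each executor resolves its own local copy:

```scala
// Sketch, assuming the job was submitted with:
//   spark-submit --files /local/path/my.jks ...
// so that a copy of my.jks lands in each executor's working directory.
dataDf
  .writeStream
  .outputMode(OutputMode.Append)
  .format("org.elasticsearch.spark.sql")
  .option("es.net.ssl", true)
  // Relative location: resolved on the executor doing the write,
  // not against the driver's SparkFiles directory.
  .option("es.net.ssl.truststore.location", "my.jks")
  .option("es.net.ssl.truststore.pass", "xxxx")
  .start("test/broadcast")
```

The key point is that `es.net.ssl.truststore.location` is evaluated on the executors, so any absolute path passed to it must exist on every worker node, not just on the driver.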