Java EsHadoopInvalidRequest: [POST]
Hello, I am trying to read data from ElasticSearch. My code is as follows:
import java.util.Map;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.elasticsearch.spark.rdd.api.java.JavaEsSpark;

SparkConf sparkConf = new SparkConf()
        .setAppName("Spark ES Integration")
        .setMaster("local");
        // .set("spark.ui.port", "7077");
// Connection settings and the resource (index/type) to read from
sparkConf.set("es.nodes", "xx.xx.xx.xx");
sparkConf.set("es.port", "9200");
sparkConf.set("es.resource", "blog/post");
sparkConf.set("es.query", "?q=user:dilbert");
JavaSparkContext sc = new JavaSparkContext(sparkConf);
JavaPairRDD<String, Map<String, Object>> esRDD = JavaEsSpark.esRDD(sc);
System.out.println("**********" + esRDD.count()); // Prints 1 - Only one record is present
System.out.println("**********" + esRDD.first()); // Throws exception
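The program prints the correct count, but it throws an exception when fetching the first record from the RDD.
Why is the request of type POST when I only want to read data?
What is wrong here? The query specified in the configuration executes correctly.
The exception is: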
org.elasticsearch.hadoop.rest.EsHadoopInvalidRequest: [blog/POST/_search?search_type=scan&scroll=5m&size=50&preference=_shards:0;_only_node:03oYNb7BTjG2vOzo9lSnzQ] failed; server[null] returned [400|Bad Request:]
.
.
3091 [Executor task launch worker-0] ERROR org.apache.spark.executor.Executor - Exception in task 0.0 in stage 0.0 (TID 0)
org.apache.spark.util.TaskCompletionListenerException: [blog/POST/_search?search_type=scan&scroll=5m&size=50&preference=_shards:0;_only_node:03oYNb7BTjG2vOzo9lSnzQ] failed; server[null] returned [400|Bad Request:]
.
.
3099 [task-result-getter-0] WARN org.apache.spark.scheduler.TaskSetManager - Lost task
Which connector are you using?
The ElasticSearch version is 1.7 and the ES-Hadoop version is 2.2.0.
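On the POST question: Elasticsearch's _search endpoint accepts a request body over either GET or POST, so a POST to _search is still a read and does not by itself explain the 400. The sketch below only illustrates that point with the JDK 11+ HTTP client. It builds the request without sending it. The class name SearchRequestSketch, the helper buildSearch, and the host localhost:9200 are assumptions for illustration, and the query_string body is a hand-written equivalent of ?q=user:dilbert, not necessarily what the connector sends.

```java
import java.net.URI;
import java.net.http.HttpRequest;

public class SearchRequestSketch {
    // Hypothetical helper: builds (but does not send) a search request
    // that carries the query in the request body via POST.
    static HttpRequest buildSearch(String baseUrl, String resource, String queryJson) {
        return HttpRequest.newBuilder()
                .uri(URI.create(baseUrl + "/" + resource + "/_search"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(queryJson))
                .build();
    }

    public static void main(String[] args) {
        // Assumed body: a query_string query equivalent to ?q=user:dilbert
        String query = "{\"query\":{\"query_string\":{\"query\":\"user:dilbert\"}}}";
        // Assumed host; replace with the real es.nodes / es.port values
        HttpRequest req = buildSearch("http://localhost:9200", "blog/post", query);
        System.out.println(req.method() + " " + req.uri());
        // prints "POST http://localhost:9200/blog/post/_search"
    }
}
```

Sending an equivalent request by hand to the real node may help show whether the 400 comes from the request itself or from something between the client and the cluster.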