Apache Spark: spark-solr job on CDH fails with java.lang.VerifyError at solrj.impl.HttpClientUtil

Tags: apache-spark, solr, sbt, solrj, cloudera-cdh

I'm running CDH 5.7.1 and submitting a Spark job that uses spark-solr 2.0.1 in yarn-cluster mode. The job fails with the error below, which is (probably) caused by a binary compatibility problem in org.apache.solr.client.solrj.impl.HttpClientUtil. The same job runs fine in local mode, where it lets me index Spark DataFrames into Solr 6.1.0.

I'm building the fat jar that gets deployed to the cluster with the sbt-assembly plugin; see my build.sbt configuration below.

I've tried various ways to work around the problem, such as supplying the appropriate solr-solrj and httpclient jars via spark-submit's --jars option, and repeatedly changing build.sbt to pin explicit versions of solrj and httpclient, all without luck. A sketch of one such submission is shown below.
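For reference, here is the shape of such a submission. This is only a sketch: the jar names, versions, and paths are illustrative rather than the exact artifacts I used, and the experimental userClassPathFirst settings are Spark's switches for preferring application jars over the cluster's copies.

spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --conf spark.driver.userClassPathFirst=true \
  --conf spark.executor.userClassPathFirst=true \
  --jars solr-solrj-6.1.0.jar,httpclient-4.4.1.jar \
  --class com.tempurer.intelligence.searchindexer.SolrIndexer3 \
  search-indexer-assembly-0.1.0-SNAPSHOT.jar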

Any insight or solutions would be greatly appreciated.

Spark job-history error from yarn-cluster mode:

16/07/21 02:10:07 ERROR yarn.ApplicationMaster: User class threw exception: com.google.common.util.concurrent.ExecutionError: java.lang.VerifyError: Bad return type
Exception Details:
  Location:
    org/apache/solr/client/solrj/impl/HttpClientUtil.createClient(Lorg/apache/solr/common/params/SolrParams;Lorg/apache/http/conn/ClientConnectionManager;)Lorg/apache/http/impl/client/CloseableHttpClient; @58: areturn
  Reason:
    Type 'org/apache/http/impl/client/DefaultHttpClient' (current frame, stack[0]) is not assignable to 'org/apache/http/impl/client/CloseableHttpClient' (from method signature)
  Current Frame:
    bci: @58
    flags: { }
    locals: { 'org/apache/solr/common/params/SolrParams', 'org/apache/http/conn/ClientConnectionManager', 'org/apache/solr/common/params/ModifiableSolrParams', 'org/apache/http/impl/client/DefaultHttpClient' }
    stack: { 'org/apache/http/impl/client/DefaultHttpClient' }
  Bytecode:
    0x0000000: bb00 0359 2ab7 0004 4db2 0005 b900 0601
    0x0000010: 0099 001e b200 05bb 0007 59b7 0008 1209
    0x0000020: b600 0a2c b600 0bb6 000c b900 0d02 002b
    0x0000030: b800 104e 2d2c b800 0f2d b0            
  Stackmap Table:
    append_frame(@47,Object[#143])
    at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2232)
    at com.google.common.cache.LocalCache.get(LocalCache.java:3965)
    at com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3969)
    at com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4829)
    at com.lucidworks.spark.util.SolrSupport$.getCachedCloudClient(SolrSupport.scala:93)
    at com.lucidworks.spark.util.SolrSupport$.getSolrBaseUrl(SolrSupport.scala:97)
    at com.lucidworks.spark.util.SolrQuerySupport$.getUniqueKey(SolrQuerySupport.scala:82)
    at com.lucidworks.spark.rdd.SolrRDD.<init>(SolrRDD.scala:32)
    at com.lucidworks.spark.SolrRelation.<init>(SolrRelation.scala:63)
    at solr.DefaultSource.createRelation(DefaultSource.scala:26)
    at org.apache.spark.sql.execution.datasources.ResolvedDataSource$.apply(ResolvedDataSource.scala:222)
    at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:148)
    at com.tempurer.intelligence.searchindexer.SolrIndexer3$.runJob(SolrIndexer3.scala:127)
    at com.tempurer.intelligence.searchindexer.SolrIndexer3$.main(SolrIndexer3.scala:80)
    at com.tempurer.intelligence.searchindexer.SolrIndexer3.main(SolrIndexer3.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:542)
Caused by: java.lang.VerifyError: Bad return type
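If I read the verifier's complaint correctly (DefaultHttpClient is not assignable to CloseableHttpClient), an httpclient older than 4.3, in which DefaultHttpClient does not yet extend CloseableHttpClient, is being loaded on the cluster, presumably from the Hadoop/YARN classpath, instead of the newer version solrj was compiled against. The following minimal, hypothetical diagnostic (not part of the original job) could be run on the cluster to confirm which jar each class is actually loaded from:

// Hypothetical classpath diagnostic: prints the jar each httpclient class
// is loaded from. On httpclient < 4.3, CloseableHttpClient does not exist,
// which is exactly the mismatch the verifier reports above.
object HttpClientClasspathCheck {
  private def locate(className: String): String =
    try {
      val cls = Class.forName(className)
      Option(cls.getProtectionDomain.getCodeSource)
        .flatMap(cs => Option(cs.getLocation))
        .map(_.toString)
        .getOrElse("(no code source)")
    } catch {
      case _: ClassNotFoundException => "(not on classpath)"
    }

  def main(args: Array[String]): Unit =
    Seq(
      "org.apache.http.impl.client.CloseableHttpClient",
      "org.apache.http.impl.client.DefaultHttpClient"
    ).foreach(c => println(s"$c -> ${locate(c)}"))
}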

I'm building the Spark application as a fat jar with the sbt-assembly plugin; here is my build.sbt file:

name := "search-indexer"
version := "0.1.0-SNAPSHOT"
scalaVersion := "2.10.6"

resolvers ++= Seq(
  "Cloudera CDH 5.0"        at "https://repository.cloudera.com/artifactory/cloudera-repos"
)

libraryDependencies ++= Seq(
  "org.apache.hadoop"           % "hadoop-common"           % "2.6.0-cdh5.7.0" % "provided",
  "org.apache.hadoop"           % "hadoop-hdfs"             % "2.6.0-cdh5.7.0" % "provided",
  "org.apache.hive"             % "hive-exec"               % "1.1.0-cdh5.7.0",
  "org.apache.spark"            % "spark-core_2.10"         % "1.6.0-cdh5.7.0" % "provided",
  "org.apache.spark"            % "spark-sql_2.10"          % "1.6.0-cdh5.7.0" % "provided",
  "org.apache.spark"            % "spark-catalyst_2.10"     % "1.6.0-cdh5.7.0" % "provided",
  "org.apache.spark"            % "spark-mllib_2.10"        % "1.6.0-cdh5.7.0" % "provided",
  "org.apache.spark"            % "spark-graphx_2.10"       % "1.6.0-cdh5.7.0" % "provided",
  "org.apache.spark"            % "spark-streaming_2.10"    % "1.6.0-cdh5.7.0" % "provided",
  "com.databricks"              % "spark-avro_2.10"         % "2.0.1",
  "com.databricks"              % "spark-csv_2.10"          % "1.4.0",
    "com.lucidworks.spark"        % "spark-solr"              % "2.0.1",
  "com.fasterxml.jackson.core"  % "jackson-core"            % "2.8.0",  // Solves runtime error: java.lang.NoSuchMethodError: com.fasterxml.jackson.core.JsonFactory.requiresPropertyOrdering()Z
  "org.scalatest"               % "scalatest_2.10"          % "2.2.4"          % "test"
)

// See: https://github.com/sbt/sbt-assembly
mergeStrategy in assembly <<= (mergeStrategy in assembly) { (old) =>
   {
    case PathList("META-INF", xs @ _*) => MergeStrategy.discard
    case x => MergeStrategy.first
   }
}
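One further avenue, sketched below under the assumption that sbt-assembly 0.14.x or newer is in use (shade rules do not exist in earlier releases), would be to relocate the httpclient packages inside the assembly so the fat jar's copy cannot collide with the older one the cluster provides:

// Hypothetical addition to build.sbt: rename all bundled org.apache.http
// classes (and every reference to them) inside the assembly; @1 is the
// package suffix matched by the ** wildcard.
assemblyShadeRules in assembly := Seq(
  ShadeRule.rename("org.apache.http.**" -> "shaded.org.apache.http.@1").inAll
)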

More information, including the logged YARN context, can be found in the GitHub project ticket I opened:

I ran into a similar problem. Were you able to resolve it?