Apache Spark: saving a DataFrame to HBase fails with java.lang.NoClassDefFoundError: org/apache/hadoop/hbase/client/TableDescriptor


I created a project on Apache Spark.

Versions:

  • Scala 2.11.8
  • Apache Spark 2.3.0
  • Apache HBase 1.2.0
  • Hortonworks shc 1.1.0.3.1.2.0-4 (the Hortonworks connector)
I need to save a simple DataFrame into an HBase table. To do that, I started HBase 1.2.0 in a Docker container () and created the following table:

hbase(main):002:0> create "table1", "cf1", "cf2", "cf3", "cf4", "cf5", "cf6", "cf7", "cf8"
0 row(s) in 1.4440 seconds
To save the DataFrame into HBase, I:

  • declared a catalog exactly as in the example
  • created a DataFrame based on that catalog
  • tried to save the DataFrame to HBase, as shown in the example:
Code:

val sparkSession = SparkSession.builder
      .appName("SparkTest")
      .master("local[*]")
      .config("spark.testing.memory", 2147480000)
      .getOrCreate()
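A minimal sketch of what those three steps look like with shc, modeled on the shc README example (the column mapping and the Record case class here are illustrative, not the project's actual ones):

import org.apache.spark.sql.execution.datasources.hbase.HBaseTableCatalog

// Catalog that maps DataFrame columns onto the HBase table and its column
// families (illustrative; the real catalog maps the cf1..cf8 families created above)
val catalog = """{
  "table":{"namespace":"default", "name":"table1"},
  "rowkey":"key",
  "columns":{
    "col0":{"cf":"rowkey", "col":"key", "type":"string"},
    "col1":{"cf":"cf1", "col":"col1", "type":"boolean"},
    "col2":{"cf":"cf2", "col":"col2", "type":"double"}
  }
}"""

case class Record(col0: String, col1: Boolean, col2: Double)

import sparkSession.implicits._
val df = (0 to 255).map(i => Record(s"row$i", i % 2 == 0, i.toDouble)).toDF()

// save() is the call that throws the NoClassDefFoundError below
df.write
  .options(Map(HBaseTableCatalog.tableCatalog -> catalog,
               HBaseTableCatalog.newTable -> "5")) // newTable is only used when the table does not exist yet
  .format("org.apache.spark.sql.execution.datasources.hbase")
  .save()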

Error:

java.lang.NoClassDefFoundError: org/apache/hadoop/hbase/client/TableDescriptor

    at org.apache.spark.sql.execution.datasources.hbase.DefaultSource.createRelation(HBaseRelation.scala:63)
    at org.apache.spark.sql.execution.datasources.SaveIntoDataSourceCommand.run(SaveIntoDataSourceCommand.scala:46)
    at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
    at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
    at org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:86)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:131)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:127)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:155)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
    at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:152)
    at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:127)
    at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:80)
    at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:80)
    at org.apache.spark.sql.DataFrameWriter$$anonfun$runCommand$1.apply(DataFrameWriter.scala:654)
    at org.apache.spark.sql.DataFrameWriter$$anonfun$runCommand$1.apply(DataFrameWriter.scala:654)
    at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:77)
    at org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:654)
    at org.apache.spark.sql.DataFrameWriter.saveToV1Source(DataFrameWriter.scala:273)
    at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:267)
    at SparkTest.bar(SparkTest.scala:56)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.junit.internal.runners.TestMethod.invoke(TestMethod.java:59)
    at org.junit.internal.runners.MethodRoadie.runTestMethod(MethodRoadie.java:98)
    at org.junit.internal.runners.MethodRoadie$2.run(MethodRoadie.java:79)
    at org.junit.internal.runners.MethodRoadie.runBeforesThenTestThenAfters(MethodRoadie.java:87)
    at org.junit.internal.runners.MethodRoadie.runTest(MethodRoadie.java:77)
    at org.junit.internal.runners.MethodRoadie.run(MethodRoadie.java:42)
    at org.junit.internal.runners.JUnit4ClassRunner.invokeTestMethod(JUnit4ClassRunner.java:88)
    at org.junit.internal.runners.JUnit4ClassRunner.runMethods(JUnit4ClassRunner.java:51)
    at org.junit.internal.runners.JUnit4ClassRunner$1.run(JUnit4ClassRunner.java:44)
    at org.junit.internal.runners.ClassRoadie.runUnprotected(ClassRoadie.java:27)
    at org.junit.internal.runners.ClassRoadie.runProtected(ClassRoadie.java:37)
    at org.junit.internal.runners.JUnit4ClassRunner.run(JUnit4ClassRunner.java:42)
    at org.junit.runner.JUnitCore.run(JUnitCore.java:130)
    at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:68)
    at com.intellij.rt.execution.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:47)
    at com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:242)
    at com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:70)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hbase.client.TableDescriptor
    at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    ... 41 more
java.lang.NoClassDefFoundError: org/apache/hadoop/hbase/client/TableDescriptor means you are running locally and your hbase-client jar is missing, e.g.:

<!-- https://mvnrepository.com/artifact/org.apache.hbase/hbase-client -->
<dependency>
    <groupId>org.apache.hbase</groupId>
    <artifactId>hbase-client</artifactId>
    <version>2.1.4</version>
</dependency>

(If it is already in the classpath, then you can change its scope to runtime rather than compile.)

export HBASE_CLASSPATH=$HBASE_CLASSPATH:`hbase classpath`

This will add all the HBase jars to the classpath.
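If the project builds with sbt rather than Maven, the equivalent declaration is a one-liner (a sketch of the same coordinates):

// build.sbt: same hbase-client dependency as the Maven snippet above
libraryDependencies += "org.apache.hbase" % "hbase-client" % "2.1.4"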

Printing all the jars on the classpath, as below, will help you understand what is actually on it:

// Walk up the classloader hierarchy and collect the URL of every jar on the classpath
def urlsinclasspath(cl: ClassLoader): Array[java.net.URL] = cl match {
  case null => Array() // reached the top of the hierarchy
  case u: java.net.URLClassLoader => u.getURLs() ++ urlsinclasspath(cl.getParent)
  case _ => urlsinclasspath(cl.getParent) // not a URLClassLoader: skip it and keep climbing
}
And call it with:

  urlsinclasspath(getClass.getClassLoader).foreach(println)
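In the printed list, look for an hbase-client jar: if none shows up, the dependency is missing; if only a 1.x client shows up, it cannot provide this class, since org.apache.hadoop.hbase.client.TableDescriptor only exists from HBase 2.0 onwards (likely the crux here, as the shc build used is an HDP 3.x build targeting HBase 2.x while the project declares HBase 1.2.0).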

Did it help? – No; I changed the Hortonworks shc version to 1.1.1-2.1-s_2.11 and now it works. Thanks anyway for trying to help me. – Probably your new version pulls in an hbase-client that has the class org.apache.hadoop.hbase.client.TableDescriptor, but the answer still stands: hbase-client was not on the classpath, and after you upgraded, that jar is there. In any case, urlsinclasspath is very useful for debugging this kind of problem. In general users don't keep upgrading; they try to fix the classpath problem behind the NoClassDefFoundError. Cheers.
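For reference, the connector version that worked, expressed as sbt coordinates (a sketch; the Hortonworks repository URL is an assumption, since shc builds were published to the Hortonworks repository rather than Maven Central):

// build.sbt: shc version that resolved the error (repository URL assumed)
resolvers += "Hortonworks" at "http://repo.hortonworks.com/content/groups/public/"
libraryDependencies += "com.hortonworks" % "shc-core" % "1.1.1-2.1-s_2.11"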