Apache Spark: NoClassDefFoundError on a Databricks Spark cluster during insert

I am experimenting with a Databricks Spark cluster. The first time I tried to create a table in a Hive database, I ran into the following error:

19/06/18 21:34:17 ERROR SparkExecuteStatementOperation: Error running hive query: 
org.apache.hive.service.cli.HiveSQLException: java.lang.NoClassDefFoundError: org/joda/time/ReadWritableInstant
    at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(SparkExecuteStatementOperation.scala:296)
    at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2$$anonfun$run$2.apply$mcV$sp(SparkExecuteStatementOperation.scala:182)
    at org.apache.spark.sql.hive.thriftserver.server.SparkSQLUtils$class.withLocalProperties(SparkSQLOperationManager.scala:190)
On subsequent attempts to create the same table (without restarting the cluster), I get this instead:

org.apache.hive.service.cli.HiveSQLException: java.lang.NoClassDefFoundError: Could not initialize class org.apache.hadoop.hive.serde2.lazy.objectinspector.primitive.LazyPrimitiveObjectInspectorFactory
    at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(SparkExecuteStatementOperation.scala:296)
    at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2$$anonfun$run$2.apply$mcV$sp(SparkExecuteStatementOperation.scala:182)
    at org.apache.spark.sql.hive.thriftserver.server.SparkSQLUtils$class.withLocalProperties(SparkSQLOperationManager.scala:190)
    at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.withLocalProperties(SparkExecuteStatementOperation.scala:44)
From beeline (the client), I get the errors below... essentially the same thing:

13: jdbc:spark://dbc-e1ececb9-10d2.cloud.data> create table test_dnax_db.sample2 (name2 string);
Error: [Simba][SparkJDBCDriver](500051) ERROR processing query/statement. Error Code: 0, SQL state: java.lang.NoClassDefFoundError: org/joda/time/ReadWritableInstant, Query: create table test_dnax_db.sample2 (name2 string). (state=HY000,code=500051)
13: jdbc:spark://dbc-e1ececb9-10d2.cloud.data> create table test_dnax_db.sample2 (name2 string);
Error: [Simba][SparkJDBCDriver](500051) ERROR processing query/statement. Error Code: 0, SQL state: java.lang.NoClassDefFoundError: Could not initialize class org.apache.hadoop.hive.serde2.lazy.objectinspector.primitive.LazyPrimitiveObjectInspectorFactory, Query: create table test_dnax_db.sample2 (name2 string). (state=HY000,code=500051)
I have tried uploading the dependent joda-time and serde jars using Databricks' upload functionality. I have also set the Spark property spark.driver.extraClassPath (the errors above come from the Spark driver, not the workers). Neither helped. I do see the dependency jars available on the hosts, under the /databricks/hive and /databricks/jars folders.

I also tried setting environment variables such as HADOOP_CLASSPATH, but no luck.
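
For reference, here is a minimal diagnostic sketch I can run on the driver to check whether the jar actually landed on the driver JVM's classpath. It assumes the joda-time jar's file name contains "joda", which may need adjusting:

// Diagnostic sketch: print driver classpath entries that look like joda-time.
// Assumes the jar file name contains "joda"; adjust the filter if needed.
String classpath = System.getProperty("java.class.path");
for (String entry : classpath.split(java.io.File.pathSeparator)) {
    if (entry.toLowerCase().contains("joda")) {
        System.out.println("Found on driver classpath: " + entry);
    }
}

If nothing prints, the spark.driver.extraClassPath setting presumably never made it into the driver JVM's launch classpath.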

The Databricks forums are notoriously useless, since they are not curated at all (compared to Splunk or similar commercial products).

Any suggestions are welcome.

Note that I can successfully create the database itself using the location keyword, and queries against existing tables in the metastore work fine.

EDIT:

I suspect that SparkExecuteStatementOperation (the Thrift entry-point class for SQL execution in the Spark cluster, which runs on the driver) may be using a different classloader than my application. I added the following to a static block in an application class that I know gets initialized, and I do not see a ClassNotFoundException, i.e. the jar is available to the application. But the underlying driver still cannot see the relevant jars.

static {
    try {
        // Sanity check: confirm joda-time is visible to the application's classloader.
        Class<?> aClass = Class.forName("org.joda.time.ReadWritableInstant");
    } catch (ClassNotFoundException e) {
        LOG.warn("Unable to find ReadWritableInstant class", e);
    }
}
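
Since Class.forName in a static block resolves through the calling class's own classloader, a class can be visible to my application while remaining invisible to whatever loader the Thrift server path uses. A minimal sketch along these lines, walking the context classloader chain, can show which loaders can and cannot see the class:

// Diagnostic sketch: walk the classloader chain and print where each loader
// would resolve ReadWritableInstant from (null means not visible to that loader).
ClassLoader cl = Thread.currentThread().getContextClassLoader();
while (cl != null) {
    java.net.URL location = cl.getResource("org/joda/time/ReadWritableInstant.class");
    System.out.println(cl + " -> " + location);
    cl = cl.getParent();
}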