java.lang.NoClassDefFoundError: org/apache/hadoop/fs/StorageStatistics

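I am trying to run a simple Spark-to-S3 application from a server, but I keep getting the error below. The server has Hadoop 2.7.3 installed, and that version does not seem to include this class. My pom.xml declares Hadoop 2.8.x, which I used while testing the application locally.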

If I have to stay on Hadoop 2.7.3, how can I make it skip the lookup for this class, or what is the way to include the class?

Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/fs/StorageStatistics
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:348)
    at org.apache.hadoop.conf.Configuration.getClassByNameOrNull(Configuration.java:2134)
    at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2099)
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2193)
    at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2654)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2667)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:94)
    at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2703)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2685)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:373)
    at org.apache.hadoop.fs.Path.getFileSystem(Path.java:295)
    at org.apache.spark.sql.execution.datasources.DataSource.hasMetadata(DataSource.scala:301)
    at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:344)
    at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:152)
    at org.apache.spark.sql.DataFrameReader.parquet(DataFrameReader.scala:441)
    at org.apache.spark.sql.DataFrameReader.parquet(DataFrameReader.scala:425)
    at com.ibm.cos.jdbc2DF$.main(jdbc2DF.scala:153)
    at com.ibm.cos.jdbc2DF.main(jdbc2DF.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:738)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:187)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:212)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:126)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.fs.StorageStatistics
    at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    ... 28 more

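You can't mix pieces of different Hadoop versions and expect things to work. It is not only the tight coupling between internal classes in hadoop-common and hadoop-aws; it is also the specific version of the Amazon AWS SDK that the hadoop-aws module was built against.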

Unless you can isolate the classpath, you need to roll your POM back to 2.7.3.
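
A minimal sketch of what that rollback could look like in pom.xml, assuming the project pulls in hadoop-common and hadoop-aws directly (the question does not show its POM, so the dependency list, the `hadoop.version` property name, and the `provided` scope are all illustrative assumptions). The idea is to drive every Hadoop artifact from a single version property, so that hadoop-aws and the AWS SDK it brings in transitively match the Hadoop 2.7.3 install on the server:

```xml
<!-- Sketch only: this fragment belongs inside the <project> element of pom.xml. -->
<properties>
  <!-- Assumption: must match the Hadoop version installed on the server -->
  <hadoop.version>2.7.3</hadoop.version>
</properties>

<dependencies>
  <dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-common</artifactId>
    <version>${hadoop.version}</version>
    <!-- Assumption: the server already supplies the Hadoop JARs at runtime -->
    <scope>provided</scope>
  </dependency>
  <dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-aws</artifactId>
    <!-- Same property, so hadoop-aws is never newer than hadoop-common -->
    <version>${hadoop.version}</version>
  </dependency>
</dependencies>
```

With the versions aligned this way, there should be no need to declare the AWS SDK separately: hadoop-aws 2.7.3 pulls in the SDK release it was built against as a transitive dependency, and overriding that with a newer SDK reintroduces the same kind of mismatch.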

Sorry.