Spark Java: java.lang.NoClassDefFoundError
I'm running Spark standalone locally and using Maven as the build automation tool, so I've set up all the required dependencies for Spark and json-simple. A simple application such as word count runs fine, but as soon as I import `JSONParser` from the json-simple API I get a `NoClassDefFoundError`. I've tried adding the jar file via `SparkConf` and the Spark context, but it still doesn't help. Below is my pom.xml:
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>org</groupId>
  <artifactId>sparketl</artifactId>
  <version>0.0.1-SNAPSHOT</version>
  <packaging>jar</packaging>
  <name>sparketl</name>
  <url>http://maven.apache.org</url>
  <properties>
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
  </properties>
  <dependencies>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-core_2.10</artifactId>
      <version>1.3.1</version>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-streaming_2.10</artifactId>
      <version>1.3.1</version>
    </dependency>
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-client</artifactId>
      <version>2.2.0</version>
    </dependency>
    <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <version>3.8.1</version>
      <scope>test</scope>
    </dependency>
    <dependency>
      <groupId>com.googlecode.json-simple</groupId>
      <artifactId>json-simple</artifactId>
      <version>1.1.1</version>
    </dependency>
  </dependencies>
  <build>
    <plugins>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-compiler-plugin</artifactId>
        <version>3.1</version>
        <configuration>
          <source>1.8</source>
          <target>1.8</target>
        </configuration>
      </plugin>
    </plugins>
  </build>
</project>
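A common alternative to pointing the driver classpath at a local repository jar is to bundle json-simple into the application jar itself, so nothing extra has to be shipped to the driver or executors. A minimal sketch using the maven-shade-plugin (the plugin version here is illustrative) that could be added next to the compiler plugin in the `<plugins>` section above:

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <version>2.4</version>
  <executions>
    <execution>
      <!-- build an uber jar containing the json-simple classes at package time -->
      <phase>package</phase>
      <goals>
        <goal>shade</goal>
      </goals>
    </execution>
  </executions>
</plugin>
```

With this, `mvn package` produces a jar that already contains `org.json.simple.*`; marking the Spark and Hadoop dependencies as `<scope>provided</scope>` would keep that uber jar from pulling in the runtime Spark classes as well.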
My Spark configuration has:
spark.executor.memory 512m
spark.driver.cores 1
spark.driver.memory 512m
spark.driver.extraClassPath /Users/username/.m2/repository/com/googlecode/json-simple/json-simple/1.1.1/json-simple-1.1.1-sources.jar
Has anyone come across this problem? If so, what is the solution to it?

Per `spark.driver.extraClassPath` (and the code base), the library handed to Spark is a sources jar (`json-simple-1.1.1-sources.jar`). That jar most likely contains only `.java` files (source files, not compiled Java classes).

Changing it to `json-simple-1.1.1.jar` (with the full path, of course) should help. The log output below confirms that it was indeed the sources jar that was shipped to the executors.
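As a sketch of the fix, the configuration would point the driver, and, since the `NoClassDefFoundError` is thrown inside an executor task, also the executor, at the binary jar instead of the sources jar. The path below simply mirrors the one in the question with the `-sources` suffix dropped:

```
spark.driver.extraClassPath   /Users/username/.m2/repository/com/googlecode/json-simple/json-simple/1.1.1/json-simple-1.1.1.jar
spark.executor.extraClassPath /Users/username/.m2/repository/com/googlecode/json-simple/json-simple/1.1.1/json-simple-1.1.1.jar
```

Equivalently, passing the binary jar to `SparkContext.addJar` (rather than the sources jar, as the `Added JAR ... -sources.jar` log line shows was done) would distribute the compiled classes to the executors.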
15/07/08 16:09:17 INFO SparkDeploySchedulerBackend: SchedulerBackend is ready for scheduling beginning after reached minRegisteredResourcesRatio: 0.0
15/07/08 16:09:17 INFO SparkContext: Added JAR /Users/username/.m2/repository/com/googlecode/json-simple/json-simple/1.1.1/json-simple-1.1.1-sources.jar at http://172.16.8.157:52255/jars/json-simple-1.1.1-sources.jar with timestamp 1436396957111
15/07/08 16:09:17 INFO MemoryStore: ensureFreeSpace(110248) called with curMem=0, maxMem=278019440
15/07/08 16:09:17 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 107.7 KB, free 265.0 MB)
15/07/08 16:09:17 INFO MemoryStore: ensureFreeSpace(10090) called with curMem=110248, maxMem=278019440
15/07/08 16:09:17 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 9.9 KB, free 265.0 MB)
15/07/08 16:09:17 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on 172.16.8.157:52257 (size: 9.9 KB, free: 265.1 MB)
15/07/08 16:09:17 INFO SparkContext: Created broadcast 0 from textFile at SparkEtl.java:35
15/07/08 16:09:17 INFO FileInputFormat: Total input paths to process : 1
15/07/08 16:09:17 INFO SparkContext: Starting job: sortByKey at SparkEtl.java:58
15/07/08 16:09:17 INFO DAGScheduler: Got job 0 (sortByKey at SparkEtl.java:58) with 2 output partitions (allowLocal=false)
15/07/08 16:09:17 INFO DAGScheduler: Final stage: ResultStage 0(sortByKey at SparkEtl.java:58)
15/07/08 16:09:17 INFO DAGScheduler: Parents of final stage: List()
15/07/08 16:09:17 INFO DAGScheduler: Missing parents: List()
15/07/08 16:09:17 INFO DAGScheduler: Submitting ResultStage 0 (MapPartitionsRDD[5] at sortByKey at SparkEtl.java:58), which has no missing parents
15/07/08 16:09:17 INFO MemoryStore: ensureFreeSpace(5248) called with curMem=120338, maxMem=278019440
15/07/08 16:09:17 INFO MemoryStore: Block broadcast_1 stored as values in memory (estimated size 5.1 KB, free 265.0 MB)
15/07/08 16:09:17 INFO MemoryStore: ensureFreeSpace(2888) called with curMem=125586, maxMem=278019440
15/07/08 16:09:17 INFO MemoryStore: Block broadcast_1_piece0 stored as bytes in memory (estimated size 2.8 KB, free 265.0 MB)
15/07/08 16:09:17 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on 172.16.8.157:52257 (size: 2.8 KB, free: 265.1 MB)
15/07/08 16:09:17 INFO SparkContext: Created broadcast 1 from broadcast at DAGScheduler.scala:874
15/07/08 16:09:17 INFO DAGScheduler: Submitting 2 missing tasks from ResultStage 0 (MapPartitionsRDD[5] at sortByKey at SparkEtl.java:58)
15/07/08 16:09:17 INFO TaskSchedulerImpl: Adding task set 0.0 with 2 tasks
15/07/08 16:09:18 INFO SparkDeploySchedulerBackend: Registered executor: AkkaRpcEndpointRef(Actor[akka.tcp://sparkExecutor@172.16.8.157:52260/user/Executor#2100827222]) with ID 0
15/07/08 16:09:18 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, 172.16.8.157, PROCESS_LOCAL, 1560 bytes)
15/07/08 16:09:18 INFO TaskSetManager: Starting task 1.0 in stage 0.0 (TID 1, 172.16.8.157, PROCESS_LOCAL, 1560 bytes)
15/07/08 16:09:18 INFO BlockManagerMasterEndpoint: Registering block manager 172.16.8.157:52263 with 265.1 MB RAM, BlockManagerId(0, 172.16.8.157, 52263)
15/07/08 16:09:18 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on 172.16.8.157:52263 (size: 2.8 KB, free: 265.1 MB)
15/07/08 16:09:18 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on 172.16.8.157:52263 (size: 9.9 KB, free: 265.1 MB)
15/07/08 16:09:19 WARN TaskSetManager: Lost task 1.0 in stage 0.0 (TID 1, 172.16.8.157): java.lang.NoClassDefFoundError: org/json/simple/parser/JSONParser
at org.sparketl.etljobs.SparkEtl.lambda$main$b9f570ea$1(SparkEtl.java:44)
at org.sparketl.etljobs.SparkEtl$$Lambda$11/1498038525.call(Unknown Source)
at org.apache.spark.api.java.JavaPairRDD$$anonfun$pairFunToScalaFun$1.apply(JavaPairRDD.scala:1030)
at org.apache.spark.api.java.JavaPairRDD$$anonfun$pairFunToScalaFun$1.apply(JavaPairRDD.scala:1030)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
at org.apache.spark.util.random.SamplingUtils$.reservoirSampleAndCount(SamplingUtils.scala:42)
at org.apache.spark.RangePartitioner$$anonfun$8.apply(Partitioner.scala:259)
at org.apache.spark.RangePartitioner$$anonfun$8.apply(Partitioner.scala:257)
at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsWithIndex$1$$anonfun$apply$18.apply(RDD.scala:703)
at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsWithIndex$1$$anonfun$apply$18.apply(RDD.scala:703)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:63)
at org.apache.spark.scheduler.Task.run(Task.scala:70)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
15/07/08 16:09:19 INFO TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0) on executor 172.16.8.157: java.lang.NoClassDefFoundError (org/json/simple/parser/JSONParser) [duplicate 1]