Using the Spark API in Java, yarn-client mode, container: Tomcat 8, error: FileNotFoundException: __spark_libs__*.zip
When I submit a job from the web using the Spark API for Java in yarn-client mode, this is the exception in the Tomcat log:
For more detailed output, check application tracking page: http://dev1:8088/cluster/app/application_1495078972652_0002 Then, click on links to logs of each attempt.
Diagnostics: File file:/usr/local/tomcat/temp/spark-aae1afc9-5738-4e5b-ae29-f8935adf53b8/__spark_libs__1867469074993542381.zip does not exist
java.io.FileNotFoundException: File file:/usr/local/tomcat/temp/spark-aae1afc9-5738-4e5b-ae29-f8935adf53b8/__spark_libs__1867469074993542381.zip does not exist
at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:611)
at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:824)
at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:601)
at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:421)
at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:253)
at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:63)
at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:361)
at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:359)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
My code:
private void initSparkConf(String appName) {
    SparkConf conf = new SparkConf().setMaster("yarn-client").setAppName(appName);
    conf.set("spark.driver.maxResultSize", "2g");
    this.context = new JavaSparkContext(conf);
}
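For context on where the missing zip comes from: in yarn-client mode the driver packages the Spark jars into a __spark_libs__*.zip archive under a spark-* directory in java.io.tmpdir. Tomcat sets that system property to $CATALINA_BASE/temp, which matches the /usr/local/tomcat/temp path in the stack trace above. A minimal check (plain Java, no Spark required; the printed path is environment-dependent):

```java
public class TmpDirCheck {
    public static void main(String[] args) {
        // The driver stages its spark-*/__spark_libs__*.zip archive under
        // java.io.tmpdir. When running inside Tomcat, this property points
        // at $CATALINA_BASE/temp (e.g. /usr/local/tomcat/temp), a local
        // path that only exists on the node where Tomcat runs.
        String tmpDir = System.getProperty("java.io.tmpdir");
        System.out.println("java.io.tmpdir = " + tmpDir);
    }
}
```

Since the driver registers the archive with YARN as a file: URI, the NodeManagers on the worker nodes then try to localize that same local path and fail, because the file exists only on the Tomcat host.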
public static void main(String[] args) {
    Date start = new Date();
    Utils.taskStart("Test Spark", start);
    SparkTest model = new SparkTest("test");
    testWorkCount(model.getContext());
    model.getContext().close();
    Utils.taskEnd("Test Spark", start, new Date());
}
/**
 * word count test
 * @param context JavaSparkContext
 */
private static void testWorkCount(JavaSparkContext context) {
    JavaRDD<String> lines = context.textFile("hdfs://192.168.1.110:9000/east3/JYLS/JYLS-20150520.txt");
    JavaRDD<String> words = lines.flatMap(line -> Arrays.asList(line.split(",")).iterator());
    JavaPairRDD<String, Integer> pairRDD = words.mapToPair(word -> new Tuple2<>(word, 1));
    JavaPairRDD<String, Integer> countRDD = pairRDD.reduceByKey((count1, count2) -> count1 + count2);
    countRDD.coalesce(1, true).saveAsTextFile("hdfs://192.168.1.110:9000/east3//result/" + new SimpleDateFormat("yyyyMMddHHmmss").format(new Date()));
}
spark-defaults.conf:
spark.master spark://dev1:7077
spark.eventLog.enabled true
spark.eventLog.dir hdfs://dev1:9000/spark/eventlogs
spark.serializer org.apache.spark.serializer.KryoSerializer
spark.executor.memory 2g
spark.executor.extraJavaOptions -XX:+PrintGCDetails -Dkey=value -Dnumbers="one two three"
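One commonly used way to keep YARN from resolving the jars out of the driver's local temp directory is to pre-upload the Spark jars to HDFS and point spark.yarn.jars at them, so executors fetch them from HDFS instead of a file: path that exists only on the Tomcat host. This is a sketch under the assumption that Spark 2.x is in use; the HDFS path below is hypothetical and must match wherever the jars are actually uploaded:

```
# Assumption: the contents of $SPARK_HOME/jars were copied to this HDFS dir,
# e.g. hdfs dfs -put $SPARK_HOME/jars/* /spark/jars/
spark.yarn.jars    hdfs://dev1:9000/spark/jars/*.jar
```

With this set, the driver no longer builds a __spark_libs__*.zip in java.io.tmpdir at submit time.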
Tomcat is running on the master node.
Why does the __spark_libs__*.zip file exist only on the master node of the cluster, and why do the worker nodes look for it under the Tomcat temp directory?
How can I resolve this exception?
Thanks.