Error creating a Wordcount project with Spark & Java on Eclipse in Cloudera via VMware
I am trying to create a Wordcount project with Spark and Java on Eclipse in Cloudera, running in VMware. The Java version is 1.7 and the Spark version is 2.0.0. The code in the project's "JavaWordCount.java" class is as follows:
package com.vishal.wc;

import scala.Tuple2;
import org.apache.hadoop.hive.ql.exec.spark.session.SparkSession;
import org.apache.spark.api.java.JavaRDD;

public class JavaWordCount {
    public static final Pattern SPACE = Pattern.compile(" ");

    public static void main(String[] args) throws Exception {
        if (args.length < 2) {
            System.err.println("Usage: JavaWordCount <InputFile> <OutputFile>");
            System.exit(1);
        }
        SparkSession spark = SparkSession.builder().appName("JavaWordCount").getOrCreate();
        JavaRDD<String> lines = spark.read().textFile(args[0]).javaRDD();
        JavaRDD<String> words = lines.flatMap(new FlatMapFunction<String, String>() {
            public Iterator<String> call(String s) {
                return Arrays.asList(s.split(" ")).iterator();
            }
        });
        JavaPairRDD<String, Integer> ones = words.mapToPair(new PairFunction<String, String, Integer>() {
            public tuple2<String, Integer> call(String s) {
                return new tuple2<>(s, 1);
            }
        });
        JavaPairRDD<String, Integer> counts = ones.reduceByKey(
            new Function2<Integer, Integer, Integer>() {
                public Integer call(Integer i1, Integer i2) {
                    return i1 = i2;
                }
            });
        counts.saveAsTextFile(args[1]);
        spark.stop();
    }
}
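At first the errors appeared because the Spark JARs had not been added.
I then added the JARs from Spark-2.0.0-bin-hadoop-2.7.tgz to the build path, but the errors remain almost the same. They are listed below: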
Description Resource Path Location Type
FlatMapFunction cannot be resolved to a type JavaWordCount.java /SparkProject/src/com/vishal/wc line 26 Java Problem
Function2 cannot be resolved to a type JavaWordCount.java /SparkProject/src/com/vishal/wc line 44 Java Problem
Iterator cannot be resolved to a type JavaWordCount.java /SparkProject/src/com/vishal/wc line 28 Java Problem
JavaPairRDD cannot be resolved to a type JavaWordCount.java /SparkProject/src/com/vishal/wc line 32 Java Problem
JavaPairRDD cannot be resolved to a type JavaWordCount.java /SparkProject/src/com/vishal/wc line 42 Java Problem
PairFunction cannot be resolved to a type JavaWordCount.java /SparkProject/src/com/vishal/wc line 32 Java Problem
The method builder() is undefined for the type SparkSession JavaWordCount.java /SparkProject/src/com/vishal/wc line 22 Java Problem
The method flatMap(FlatMapFunction<String,U>) in the type AbstractJavaRDDLike<String,JavaRDD<String>> is not applicable for the arguments (new FlatMapFunction<String,String>(){}) JavaWordCount.java /SparkProject/src/com/vishal/wc line 26 Java Problem
The method mapToPair(PairFunction<String,K2,V2>) in the type AbstractJavaRDDLike<String,JavaRDD<String>> is not applicable for the arguments (new PairFunction<String,String,Integer>(){}) JavaWordCount.java /SparkProject/src/com/vishal/wc line 32 Java Problem
The method read() is undefined for the type SparkSession JavaWordCount.java /SparkProject/src/com/vishal/wc line 24 Java Problem
The method stop() is undefined for the type SparkSession JavaWordCount.java /SparkProject/src/com/vishal/wc line 52 Java Problem
tuple2 cannot be resolved to a type JavaWordCount.java /SparkProject/src/com/vishal/wc line 35 Java Problem
tuple2 cannot be resolved to a type JavaWordCount.java /SparkProject/src/com/vishal/wc line 37 Java Problem
Please help.

You need to import the missing classes, like this:
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.function.FlatMapFunction;
import org.apache.spark.api.java.function.Function2;
import org.apache.spark.api.java.function.PairFunction;
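
The four imports above cover only some of the reported errors. Going by the error list, the rest look like import problems too. The "builder()", "read()" and "stop()" errors suggest the wrong SparkSession is imported: the class with those methods is in org.apache.spark.sql, not the Hive package used in the question. Iterator, Arrays and Pattern come from the JDK. The Scala tuple class is spelled Tuple2 with a capital T, so the two "tuple2" references in the code would need to change as well. A sketch of the full import header, assuming the Spark 2.0.0 JARs are on the build path:

```java
// JDK classes used by the word count
import java.util.Arrays;
import java.util.Iterator;
import java.util.regex.Pattern;

// Spark Java API
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.function.FlatMapFunction;
import org.apache.spark.api.java.function.Function2;
import org.apache.spark.api.java.function.PairFunction;

// SparkSession from Spark SQL (replaces the org.apache.hadoop.hive... import)
import org.apache.spark.sql.SparkSession;

// Scala tuple type; note the capital T
import scala.Tuple2;
```

The two "is not applicable for the arguments" errors on flatMap and mapToPair will most likely disappear once FlatMapFunction and PairFunction resolve, since they follow from the unresolved types.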
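Eclipse provides the shortcut Ctrl+Shift+O to add all missing imports.

Do you have the correct import statements at the top? FlatMapFunction is a member of the org.apache.spark.api.java.function package, and you have not imported it. The other unresolved types probably have the same problem.

Try restarting Eclipse.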
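
Separately from the import errors, the reducer in the question returns "i1 = i2", which is an assignment and not a sum, so even once the project compiles every count would probably come out as 1. The plain-Java sketch below illustrates the difference without needing the Spark JARs. The names ReduceByKeySketch, Reducer and countWords are hypothetical stand-ins for illustration only. Reducer mimics the shape of Spark's Function2<Integer, Integer, Integer>, and countWords only approximates what mapToPair followed by reduceByKey does on a single machine.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class ReduceByKeySketch {

    // Hypothetical stand-in for Spark's Function2<Integer, Integer, Integer>,
    // so the sketch compiles without Spark on the classpath.
    interface Reducer {
        Integer call(Integer i1, Integer i2);
    }

    // Emits a count of 1 per word and combines counts that share the same word.
    static Map<String, Integer> countWords(List<String> words, Reducer reducer) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String word : words) {
            Integer previous = counts.get(word);
            counts.put(word, previous == null ? 1 : reducer.call(previous, 1));
        }
        return counts;
    }

    // Intended reducer: adds the two partial counts.
    static final Reducer SUM = new Reducer() {
        public Integer call(Integer i1, Integer i2) {
            return i1 + i2;
        }
    };

    // Reducer as written in the question: assigns i2 to i1 and returns i2.
    static final Reducer ASSIGN = new Reducer() {
        public Integer call(Integer i1, Integer i2) {
            return i1 = i2;
        }
    };

    public static void main(String[] args) {
        List<String> words = Arrays.asList("to be or not to be".split(" "));
        System.out.println(countWords(words, SUM));    // {to=2, be=2, or=1, not=1}
        System.out.println(countWords(words, ASSIGN)); // {to=1, be=1, or=1, not=1}
    }
}
```

If this matches what the Spark job does, changing the reducer body in JavaWordCount to "return i1 + i2;" should give the expected totals.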