Spark error: saveAsTextFile() creates an existing file that cannot be opened with the bash file command

Tags: file, hadoop, apache-spark

I am using Spark and working through some tutorials. Suppose we are in the mycode directory and start spark-shell there. The final directory layout looks like this:

/mycode
|
|-input.txt
|-output
  |--part-0000
  |--part-0001
Then I typed the following commands:

scala> val inputfile = sc.textFile("input.txt")
inputfile: org.apache.spark.rdd.RDD[String] = input.txt MapPartitionsRDD[14] at textFile at <console>:24

scala> val counts = inputfile.flatMap(line => line.split(" ")).map(word => (word, 1)).reduceByKey(_+_)
counts: org.apache.spark.rdd.RDD[(String, Int)] = ShuffledRDD[17] at reduceByKey at <console>:26

scala> counts.collect().foreach(println)
(talk.,1)
(are,2)
(only,1)
(as,8)
(,1)
(they,7)
(love,,1)
(not,1)
(people,1)
(share.,1)
(or,1)
(care,1)
(beautiful,2)
(walk,1)
(look,,1)

scala> counts.saveAsTextFile("file:///home/hadoop/Mycode/output")
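
Worth noting: saveAsTextFile writes a directory, not a single file, with one part file per partition of the RDD. If a single output file is wanted, the RDD can be coalesced to one partition first. A minimal sketch in the same spark-shell session (the output_single path is hypothetical):

scala> // coalesce(1) moves everything into one partition, so exactly one part file is written
scala> counts.coalesce(1).saveAsTextFile("file:///home/hadoop/Mycode/output_single")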

The weird thing is that when I try to cat the file part-0000, I get a "No such file or directory" error. I also wondered whether the file might be in a different binary format on the host, but that by itself should not produce this error. I strongly suspect this is caused by a wrong operation on the file system, or by a misconfiguration of Hadoop or Spark. Could anyone help me? Thanks :)
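
One common cause worth ruling out: when Spark runs against a cluster rather than in local mode, a file:// destination is written by each executor on its own local filesystem, so the part files may not exist on the machine where cat is run. A quick diagnostic from the same spark-shell session, a sketch using only the path from the question (listFiles returns null when the directory does not exist):

scala> import java.io.File
scala> val out = new File("/home/hadoop/Mycode/output")
scala> println(out.exists)  // does the driver machine see the directory at all?
scala> Option(out.listFiles).getOrElse(Array.empty[File]).foreach(println)  // list actual part files, if any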

Please add the specific cat command you are using. This should work: cat /home/hadoop/Mycode/output/part-0000
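
If the exact part-file name is in doubt, Spark itself can read the whole output directory back, since textFile accepts a directory and reads every part file inside it. A small sketch in the same session:

scala> // point textFile at the output directory, not at an individual part file
scala> val reloaded = sc.textFile("file:///home/hadoop/Mycode/output")
scala> reloaded.take(5).foreach(println)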