HDFS directory as a parameter in Scala Spark Streaming


I am having a problem with the Spark Streaming example:

When I try to launch it with SBT using

run local /user/dir/subdir/
I get this exception:

[info] Running org.apache.spark.streaming.examples.HdfsWordCount local /user/dir/subdir/
14/04/21 18:45:55 INFO StreamingExamples: Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
14/04/21 18:45:55 INFO StreamingExamples: Setting log level to [WARN] for streaming example. To override add a custom log4j.properties to the classpath.
14/04/21 18:45:55 WARN Utils: Your hostname, ubuntu resolves to a loopback address: 127.0.1.1; using 10.4.4.6 instead (on interface eth0)
14/04/21 18:45:55 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
14/04/21 18:45:57 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
14/04/21 18:46:00 ERROR JobScheduler: Error generating jobs for time 1398098760000 ms
java.io.FileNotFoundException: File /user/dir/subdir/ does not exist
I am sure the directory exists on the Hadoop filesystem; I even copied a file into it.
Is there some input format I am not aware of?

I found the solution to my question. The correct way to specify the HDFS directory is the following (at least in my case):

run local hdfs://localhost:9000/user/dir/subdir/
I found it in the Spark documentation.
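For reference, here is a minimal sketch of what such a streaming job does with that argument, using the fully qualified HDFS URI. The object name and master string are illustrative, not the exact example source; it assumes the standard textFileStream-based word count:

import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.StreamingContext._  // reduceByKey implicits on older Spark versions

object HdfsDirWordCount {
  def main(args: Array[String]): Unit = {
    // Local master with two worker threads and a 2-second batch interval
    val ssc = new StreamingContext("local[2]", "HdfsDirWordCount", Seconds(2))

    // Without the hdfs:// scheme the path is resolved against the default
    // filesystem, which is often file:/// when Hadoop's configuration is not
    // on the classpath -- hence the FileNotFoundException above
    val lines = ssc.textFileStream("hdfs://localhost:9000/user/dir/subdir/")

    val wordCounts = lines.flatMap(_.split(" ")).map(word => (word, 1)).reduceByKey(_ + _)
    wordCounts.print()

    ssc.start()
    ssc.awaitTermination()
  }
}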

You have to look at the core-site.xml file in your Hadoop configuration. It must contain a property with the default filesystem path:

<configuration>
    <property>
        <name>fs.default.name</name>
        <value>hdfs://localhost:9000</value>
    </property>
</configuration>

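To check which default filesystem your client actually resolves paths against, here is a small sketch using the Hadoop FileSystem API. It assumes hadoop-client is on the classpath; the object name is hypothetical:

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

object DefaultFsCheck {
  def main(args: Array[String]): Unit = {
    // Picks up core-site.xml (and related files) from the classpath, if present
    val conf = new Configuration()
    val fs = FileSystem.get(conf)

    // Prints hdfs://localhost:9000 when core-site.xml is found,
    // or file:/// when it is not
    println(s"Default FS: ${fs.getUri}")

    // A path without a scheme is resolved against that default filesystem
    println(s"Directory exists: ${fs.exists(new Path("/user/dir/subdir/"))}")
  }
}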

You are right, the link was wrong. It should be corrected now.