Apache Spark: how to fix "can't connect to HDFS" in spark-shell?
I am trying to connect to HDFS from spark-shell. I am using Spark 2.4.3, Scala 2.11.12, and Hadoop 3.1.2. Code in spark-shell:
scala> val rdd = sc.textFile("hdfs://localhost:8020/tmp/1.json")
rdd: org.apache.spark.rdd.RDD[String] = hdfs://localhost:8020/tmp/1.json MapPartitionsRDD[1] at textFile at <console>:24
scala> rdd.count()
java.net.ConnectException: Call From localhost/127.0.0.1 to localhost:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
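Note that the `ConnectException` only surfaces at `rdd.count()` because `textFile` is lazy: the RDD is built without touching HDFS, and the connection is first attempted when an action runs. Before digging into Spark itself, it helps to confirm the NameNode RPC port is reachable at all. A minimal sketch using plain JVM sockets (no Spark or Hadoop dependencies assumed):

```scala
import java.net.{InetSocketAddress, Socket}

// Probe whether a TCP port accepts connections -- essentially the check
// that fails when the Hadoop RPC client throws "Connection refused".
def portOpen(host: String, port: Int, timeoutMs: Int = 2000): Boolean = {
  val socket = new Socket()
  try {
    socket.connect(new InetSocketAddress(host, port), timeoutMs)
    true
  } catch {
    case _: java.io.IOException => false
  } finally {
    socket.close()
  }
}
```

If `portOpen("localhost", 8020)` returns false while the NameNode process is running, the daemon is either down or bound to a different interface or port than the one the client is dialing.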
Configuration in Hadoop's core-site.xml:
<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/usr/local/Cellar/hadoop/hdfs/tmp</value>
    <description>A base for other temporary directories.</description>
  </property>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:8020</value>
  </property>
</configuration>
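Spark only honors this file if the driver can find it (typically via `HADOOP_CONF_DIR`), so a quick sanity check is to parse the file directly and confirm `fs.defaultFS` says what you think it does. A sketch using only the JDK's built-in XML parser (the helper name `hadoopProperty` is mine, not a Hadoop API):

```scala
import java.io.ByteArrayInputStream
import javax.xml.parsers.DocumentBuilderFactory

// Extract a property value from Hadoop-style XML (core-site.xml etc.).
def hadoopProperty(xml: String, key: String): Option[String] = {
  val doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
    .parse(new ByteArrayInputStream(xml.getBytes("UTF-8")))
  val props = doc.getElementsByTagName("property")
  (0 until props.getLength).iterator.flatMap { i =>
    val children = props.item(i).getChildNodes
    // Collect child element names -> text ("name", "value", ...).
    val fields = (0 until children.getLength).map(children.item)
      .map(n => n.getNodeName -> n.getTextContent.trim).toMap
    if (fields.get("name").contains(key)) fields.get("value") else None
  }.toSeq.headOption
}
```

Running it against the contents of the core-site.xml above should yield `Some("hdfs://localhost:8020")`; if Spark is reading a different copy of the file, the mismatch shows up immediately.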
But it fails in spark-shell:
scala> val rdd = sc.textFile("hdfs://localhost:8020/tmp/1.json")
rdd: org.apache.spark.rdd.RDD[String] = hdfs://localhost:8020/tmp/1.json MapPartitionsRDD[1] at textFile at <console>:24
scala> rdd.count()
java.net.ConnectException: Call From localhost/127.0.0.1 to localhost:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
What if you read from /tmp/1.json instead of hdfs://...? Are you running the Spark session in local or in cluster mode?
Have you tried the machine's actual hostname, e.g. hdfs://<hostname>:8020/tmp/1.json, instead of localhost?
@Lamanus Same problem either way. I figured it out: it works after removing the extra localhost entry from /etc/hosts.
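The fix above points at name resolution: a stray or duplicate localhost entry in /etc/hosts can make the client resolve "localhost" to an address the NameNode is not bound to. You can inspect exactly what the JVM (which both Spark and the Hadoop RPC client go through) resolves it to with a short sketch:

```scala
import java.net.InetAddress

// Every address the JVM resolves "localhost" to; the RPC client will
// dial one of these when connecting to hdfs://localhost:8020.
val localhostAddrs: Seq[String] =
  InetAddress.getAllByName("localhost").map(_.getHostAddress).toSeq

localhostAddrs.foreach(println)
```

If this prints an address other than the loopback addresses (127.0.0.1 or ::1), or an address the NameNode is not listening on, /etc/hosts is the place to look.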