Apache Spark: how to fix "can't connect to HDFS" in spark-shell?
I am trying to connect to HDFS from spark-shell. I am using Spark 2.4.3, Scala 2.11.12, and Hadoop 3.1.2. Code in spark-shell:
scala> val rdd = sc.textFile("hdfs://localhost:8020/tmp/1.json")
rdd: org.apache.spark.rdd.RDD[String] = hdfs://localhost:8020/tmp/1.json MapPartitionsRDD[1] at textFile at <console>:24
scala> rdd.count()
java.net.ConnectException: Call From localhost/127.0.0.1 to localhost:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
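Note that the `ConnectException` only surfaces at `rdd.count()` because `textFile` is lazy: the RDD is built without touching HDFS, and the connection is first attempted when an action runs. Before digging into Spark itself, it helps to confirm the NameNode RPC port is reachable at all. A minimal sketch using plain JVM sockets (no Spark or Hadoop dependencies assumed):

```scala
import java.net.{InetSocketAddress, Socket}

// Probe whether a TCP port accepts connections -- essentially the check
// that fails when the Hadoop RPC client throws "Connection refused".
def portOpen(host: String, port: Int, timeoutMs: Int = 2000): Boolean = {
  val socket = new Socket()
  try {
    socket.connect(new InetSocketAddress(host, port), timeoutMs)
    true
  } catch {
    case _: java.io.IOException => false
  } finally {
    socket.close()
  }
}
```

If `portOpen("localhost", 8020)` returns false while the NameNode process is running, the daemon is either down or bound to a different interface or port than the one the client is dialing.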
Configuration in Hadoop's core-site.xml:
<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/usr/local/Cellar/hadoop/hdfs/tmp</value>
    <description>A base for other temporary directories.</description>
  </property>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:8020</value>
  </property>
</configuration>
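Spark only honors this file if the driver can find it (typically via `HADOOP_CONF_DIR`), so a quick sanity check is to parse the file directly and confirm `fs.defaultFS` says what you think it does. A sketch using only the JDK's built-in XML parser (the helper name `hadoopProperty` is mine, not a Hadoop API):

```scala
import java.io.ByteArrayInputStream
import javax.xml.parsers.DocumentBuilderFactory

// Extract a property value from Hadoop-style XML (core-site.xml etc.).
def hadoopProperty(xml: String, key: String): Option[String] = {
  val doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
    .parse(new ByteArrayInputStream(xml.getBytes("UTF-8")))
  val props = doc.getElementsByTagName("property")
  (0 until props.getLength).iterator.flatMap { i =>
    val children = props.item(i).getChildNodes
    // Collect child element names -> text ("name", "value", ...).
    val fields = (0 until children.getLength).map(children.item)
      .map(n => n.getNodeName -> n.getTextContent.trim).toMap
    if (fields.get("name").contains(key)) fields.get("value") else None
  }.toSeq.headOption
}
```

Running it against the contents of the core-site.xml above should yield `Some("hdfs://localhost:8020")`; if Spark is reading a different copy of the file, the mismatch shows up immediately.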
But it fails in spark-shell:
scala> val rdd = sc.textFile("hdfs://localhost:8020/tmp/1.json")
rdd: org.apache.spark.rdd.RDD[String] = hdfs://localhost:8020/tmp/1.json MapPartitionsRDD[1] at textFile at <console>:24
scala> rdd.count()
java.net.ConnectException: Call From localhost/127.0.0.1 to localhost:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
What if you read from /tmp/1.json instead of hdfs://...? Are you running the Spark session in local or in cluster mode?
Have you tried the machine's actual hostname, e.g. hdfs://<hostname>:8020/tmp/1.json, instead of localhost?
@Lamanus Same problem either way. I figured it out: it works after removing the extra localhost entry from /etc/hosts.
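The fix above points at name resolution: a stray or duplicate localhost entry in /etc/hosts can make the client resolve "localhost" to an address the NameNode is not bound to. You can inspect exactly what the JVM (which both Spark and the Hadoop RPC client go through) resolves it to with a short sketch:

```scala
import java.net.InetAddress

// Every address the JVM resolves "localhost" to; the RPC client will
// dial one of these when connecting to hdfs://localhost:8020.
val localhostAddrs: Seq[String] =
  InetAddress.getAllByName("localhost").map(_.getHostAddress).toSeq

localhostAddrs.foreach(println)
```

If this prints an address other than the loopback addresses (127.0.0.1 or ::1), or an address the NameNode is not listening on, /etc/hosts is the place to look.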