Apache Spark: how to fix "can't connect to HDFS" in spark-shell?

Tags: apache-spark, hdfs

I am trying to connect to HDFS from spark-shell.

I am using Spark 2.4.3, Scala 2.11.12, and Hadoop 3.1.2.

The code in spark-shell:

scala> val rdd = sc.textFile("hdfs://localhost:8020/tmp/1.json")
rdd: org.apache.spark.rdd.RDD[String] = hdfs://localhost:8020/tmp/1.json MapPartitionsRDD[1] at textFile at <console>:24

scala> rdd.count()
java.net.ConnectException: Call From localhost/127.0.0.1 to localhost:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
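Note that textFile is lazy: the RDD is constructed without touching HDFS, so the connection is only attempted when count() runs. To take Spark out of the picture, you can probe the NameNode RPC directly with Hadoop's FileSystem API from the same spark-shell (a minimal sketch; the URI and path are the ones from the question):

import java.net.URI
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

// Connects straight to the NameNode; throws the same ConnectException if it is unreachable.
val fs = FileSystem.get(new URI("hdfs://localhost:8020"), new Configuration())
println(fs.exists(new Path("/tmp/1.json")))
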
The configuration in Hadoop's core-site.xml:

<configuration>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/usr/local/Cellar/hadoop/hdfs/tmp</value>
        <description>A base for other temporary directories.</description>
    </property>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:8020</value>
    </property>
</configuration>
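
spark-shell only sees this file if HADOOP_CONF_DIR (or the classpath) points at the directory containing it, so it is worth checking which default filesystem Spark actually resolved (a quick sketch; sc is the SparkContext that spark-shell provides):

// Prints the default filesystem Spark resolved from its Hadoop configuration;
// "file:///" means core-site.xml was not picked up at all.
println(sc.hadoopConfiguration.get("fs.defaultFS"))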
Even with this configuration, spark-shell fails with the ConnectException shown above.

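Since fs.defaultFS points at localhost:8020, it is also worth verifying that anything is listening on that port at all. This can be done from the same spark-shell with plain java.net (a sketch; the 2-second timeout is arbitrary):

import java.net.{InetSocketAddress, Socket}

// Succeeds silently if something is listening on localhost:8020,
// throws ConnectException otherwise.
val socket = new Socket()
socket.connect(new InetSocketAddress("localhost", 8020), 2000)
println("port 8020 is reachable")
socket.close()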

Comments:

Does it work if you read from /tmp/1.json instead of hdfs://...? Are you running the Spark session in local or in cluster mode? Have you tried hdfs://<host>:8020/tmp/1.json instead of localhost?

@Lamanus Same problem with that. I figured it out: it works after removing the localhost entry from /etc/hosts.
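
That fix points at name resolution: the NameNode had bound to an address that the /etc/hosts mapping for localhost did not resolve to, so client calls to localhost:8020 were refused. You can inspect what the JVM resolves localhost to from spark-shell (a sketch using java.net.InetAddress):

import java.net.InetAddress

// Prints every address the JVM resolves "localhost" to; a stray /etc/hosts
// entry here can point clients away from the address the NameNode bound to.
InetAddress.getAllByName("localhost").foreach(a => println(a.getHostAddress))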