Hadoop: cannot read a file from HDFS using Spark

I installed Cloudera CDH 5 using Cloudera Manager.

I can easily run

hadoop fs -ls /input/war-and-peace.txt
hadoop fs -cat /input/war-and-peace.txt
The second command prints the entire text file to the console.

Now I start the Spark shell and enter

val textFile = sc.textFile("hdfs://input/war-and-peace.txt")
textFile.count
and I get an error:

Spark context available as sc.

scala> val textFile = sc.textFile("hdfs://input/war-and-peace.txt")
2014-12-14 15:14:57,874 INFO  [main] storage.MemoryStore (Logging.scala:logInfo(59)) - ensureFreeSpace(177621) called with curMem=0, maxMem=278302556
2014-12-14 15:14:57,877 INFO  [main] storage.MemoryStore (Logging.scala:logInfo(59)) - Block broadcast_0 stored as values in memory (estimated size 173.5 KB, free 265.2 MB)
textFile: org.apache.spark.rdd.RDD[String] = hdfs://input/war-and-peace.txt MappedRDD[1] at textFile at <console>:12

scala> textFile.count
2014-12-14 15:15:21,791 INFO  [main] ipc.Client (Client.java:handleConnectionTimeout(814)) - Retrying connect to server: input/92.242.140.21:8020. Already tried 0 time(s); maxRetries=45
2014-12-14 15:15:41,905 INFO  [main] ipc.Client (Client.java:handleConnectionTimeout(814)) - Retrying connect to server: input/92.242.140.21:8020. Already tried 1 time(s); maxRetries=45
2014-12-14 15:16:01,925 INFO  [main] ipc.Client (Client.java:handleConnectionTimeout(814)) - Retrying connect to server: input/92.242.140.21:8020. Already tried 2 time(s); maxRetries=45
2014-12-14 15:16:21,983 INFO  [main] ipc.Client (Client.java:handleConnectionTimeout(814)) - Retrying connect to server: input/92.242.140.21:8020. Already tried 3 time(s); maxRetries=45
2014-12-14 15:16:42,001 INFO  [main] ipc.Client (Client.java:handleConnectionTimeout(814)) - Retrying connect to server: input/92.242.140.21:8020. Already tried 4 time(s); maxRetries=45
2014-12-14 15:17:02,062 INFO  [main] ipc.Client (Client.java:handleConnectionTimeout(814)) - Retrying connect to server: input/92.242.140.21:8020. Already tried 5 time(s); maxRetries=45
2014-12-14 15:17:22,082 INFO  [main] ipc.Client (Client.java:handleConnectionTimeout(814)) - Retrying connect to server: input/92.242.140.21:8020. Already tried 6 time(s); maxRetries=45
2014-12-14 15:17:42,116 INFO  [main] ipc.Client (Client.java:handleConnectionTimeout(814)) - Retrying connect to server: input/92.242.140.21:8020. Already tried 7 time(s); maxRetries=45
2014-12-14 15:18:02,138 INFO  [main] ipc.Client (Client.java:handleConnectionTimeout(814)) - Retrying connect to server: input/92.242.140.21:8020. Already tried 8 time(s); maxRetries=45
2014-12-14 15:18:22,298 INFO  [main] ipc.Client (Client.java:handleConnectionTimeout(814)) - Retrying connect to server: input/92.242.140.21:8020. Already tried 9 time(s); maxRetries=45
2014-12-14 15:18:42,319 INFO  [main] ipc.Client (Client.java:handleConnectionTimeout(814)) - Retrying connect to server: input/92.242.140.21:8020. Already tried 10 time(s); maxRetries=45
2014-12-14 15:19:02,354 INFO  [main] ipc.Client (Client.java:handleConnectionTimeout(814)) - Retrying connect to server: input/92.242.140.21:8020. Already tried 11 time(s); maxRetries=45
2014-12-14 15:19:22,373 INFO  [main] ipc.Client (Client.java:handleConnectionTimeout(814)) - Retrying connect to server: input/92.242.140.21:8020. Already tried 12 time(s); maxRetries=45
2014-12-14 15:19:42,424 INFO  [main] ipc.Client (Client.java:handleConnectionTimeout(814)) - Retrying connect to server: input/92.242.140.21:8020. Already tried 13 time(s); maxRetries=45
2014-12-14 15:20:02,446 INFO  [main] ipc.Client (Client.java:handleConnectionTimeout(814)) - Retrying connect to server: input/92.242.140.21:8020. Already tried 14 time(s); maxRetries=45
2014-12-14 15:20:22,512 INFO  [main] ipc.Client (Client.java:handleConnectionTimeout(814)) - Retrying connect to server: input/92.242.140.21:8020. Already tried 15 time(s); maxRetries=45
2014-12-14 15:20:42,515 INFO  [main] ipc.Client (Client.java:handleConnectionTimeout(814)) - Retrying connect to server: input/92.242.140.21:8020. Already tried 16 time(s); maxRetries=45
2014-12-14 15:21:02,550 INFO  [main] ipc.Client (Client.java:handleConnectionTimeout(814)) - Retrying connect to server: input/92.242.140.21:8020. Already tried 17 time(s); maxRetries=45
2014-12-14 15:21:22,558 INFO  [main] ipc.Client (Client.java:handleConnectionTimeout(814)) - Retrying connect to server: input/92.242.140.21:8020. Already tried 18 time(s); maxRetries=45
2014-12-14 15:21:42,683 INFO  [main] ipc.Client (Client.java:handleConnectionTimeout(814)) - Retrying connect to server: input/92.242.140.21:8020. Already tried 19 time(s); maxRetries=45
2014-12-14 15:22:02,702 INFO  [main] ipc.Client (Client.java:handleConnectionTimeout(814)) - Retrying connect to server: input/92.242.140.21:8020. Already tried 20 time(s); maxRetries=45
2014-12-14 15:22:22,832 INFO  [main] ipc.Client (Client.java:handleConnectionTimeout(814)) - Retrying connect to server: input/92.242.140.21:8020. Already tried 21 time(s); maxRetries=45
2014-12-14 15:22:42,852 INFO  [main] ipc.Client (Client.java:handleConnectionTimeout(814)) - Retrying connect to server: input/92.242.140.21:8020. Already tried 22 time(s); maxRetries=45
2014-12-14 15:23:02,974 INFO  [main] ipc.Client (Client.java:handleConnectionTimeout(814)) - Retrying connect to server: input/92.242.140.21:8020. Already tried 23 time(s); maxRetries=45
2014-12-14 15:23:22,995 INFO  [main] ipc.Client (Client.java:handleConnectionTimeout(814)) - Retrying connect to server: input/92.242.140.21:8020. Already tried 24 time(s); maxRetries=45
2014-12-14 15:23:43,109 INFO  [main] ipc.Client (Client.java:handleConnectionTimeout(814)) - Retrying connect to server: input/92.242.140.21:8020. Already tried 25 time(s); maxRetries=45
2014-12-14 15:24:03,128 INFO  [main] ipc.Client (Client.java:handleConnectionTimeout(814)) - Retrying connect to server: input/92.242.140.21:8020. Already tried 26 time(s); maxRetries=45
2014-12-14 15:24:23,250 INFO  [main] ipc.Client (Client.java:handleConnectionTimeout(814)) - Retrying connect to server: input/92.242.140.21:8020. Already tried 27 time(s); maxRetries=45
java.net.ConnectException: Call From dn1home/192.168.1.21 to input:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
        at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:783)
        at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:730)
        at org.apache.hadoop.ipc.Client.call(Client.java:1415)
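
The retry lines above show the client trying to reach a server named `input` on port 8020, which suggests that in `hdfs://input/war-and-peace.txt` the first path segment is being read as the NameNode host. A minimal sketch using only `java.net.URI` shows how such URIs split into scheme, host, and path. The object name `HdfsUriCheck` and the helper `parts` are made up for illustration, and Hadoop's own path handling may differ in details.

```scala
import java.net.URI

object HdfsUriCheck {
  // Split a URI string into (scheme, host, path) as java.net.URI parses it.
  // The host is wrapped in Option because it is null when no authority is given.
  def parts(s: String): (String, Option[String], String) = {
    val u = new URI(s)
    (u.getScheme, Option(u.getHost), u.getPath)
  }

  def main(args: Array[String]): Unit = {
    // Two slashes: "input" lands in the host position.
    println(parts("hdfs://input/war-and-peace.txt"))
    // Three slashes: no host, the whole string after the scheme is the path.
    println(parts("hdfs:///input/war-and-peace.txt"))
    // Explicit NameNode host and port (nn1home:8020 is taken from the thread).
    println(parts("hdfs://nn1home:8020/input/war-and-peace.txt"))
  }
}
```

In the first case the host is `input` and the path is only `/war-and-peace.txt`, which matches the hostname in the `Retrying connect to server` lines.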
sc.textFile("hdfs://nn1home:8020/input/war-and-peace.txt")
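
One way to avoid dropping the host by accident is to assemble the URI from its parts instead of typing it by hand. The sketch below assumes the NameNode is `nn1home` on port 8020, as in the line above. `HdfsPaths.qualified` is a hypothetical helper, not part of Spark or Hadoop.

```scala
import java.net.URI

object HdfsPaths {
  // Build a fully qualified hdfs:// URI from a NameNode host, port, and an
  // absolute path. java.net.URI throws URISyntaxException if the path does
  // not start with "/" when a host is supplied.
  def qualified(host: String, port: Int, path: String): String =
    new URI("hdfs", null, host, port, path, null, null).toString

  def main(args: Array[String]): Unit = {
    println(qualified("nn1home", 8020, "/input/war-and-peace.txt"))
  }
}
```

In the Spark shell this could be used as `sc.textFile(HdfsPaths.qualified("nn1home", 8020, "/input/war-and-peace.txt"))`.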
val textFile=sc.textFile("hdfs:/input1/Card_History2016_3rdFloor.csv")
textFile: org.apache.spark.rdd.RDD[String] = MapPartitionsRDD[1] at textFile at <console>:22

textFile.count

res1: Long = 58973  
logFile = "hdfs://localhost:9000/sampledata/sample.txt"
sc.textFile("/myhdfsdirectory/myfiletoprocess.txt")
hdfs dfs -mkdir /myhdfsdirectory
hdfs dfs -copyFromLocal mylocalfile /myhdfsdirectory/myfiletoprocess.txt
hdfs://localhost:54310/input/war-and-peace.txt
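import org.apache.spark.{SparkConf, SparkContext}
val conf = new SparkConf().setMaster("local[*]").setAppName("HDFSFileReader")
// Spark passes only keys prefixed with spark.hadoop. on to the Hadoop configuration
conf.set("spark.hadoop.fs.defaultFS", "hdfs://hostname:9000")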
val sc = new SparkContext(conf)
val data = sc.textFile("hdfs://hostname:9000/hdfspath/")
data.saveAsTextFile("C:\\dummy\\")
val textFile = sc.textFile("hdfs://localhost:9000/user/input.txt")
var result= scontext.textFile("hdfs://localhost:9000/home/usr/abc/fileName.txt", 2)