Hadoop single-node installation on Windows 7

Tags: windows-7, hadoop

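I am new to Hadoop and am trying to set up a single-node installation of Hadoop 0.20.2 on my Windows 7 machine.

My question has two parts: one is about the completeness of the installation itself, the other is about an error in the reduce phase of the example word-count program.

My installation steps were as follows:

I followed the installation guide.

I installed cygwin and set up passwordless ssh to localhost.

My Java version is: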

java version "1.7.0_02"
Java(TM) SE Runtime Environment (build 1.7.0_02-b13)
Java HotSpot(TM) 64-Bit Server VM (build 22.0-b10, mixed mode)

Contents of conf/core-site.xml:

<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>

Contents of conf/hdfs-site.xml:

<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>

Contents of conf/mapred-site.xml:

<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:9001</value>
  </property>
</configuration>

I noticed that jps prints PIDs for the tasktracker and secondarynamenode.
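
One way to make the jps check repeatable is to compare its output against the five daemons a pseudo-distributed Hadoop 0.20.x node is expected to run. The following is a minimal sketch, not part of the original post; the function name and the sample PIDs are hypothetical, and it only reports what jps itself can see.

```python
# Daemon main-class names normally shown by jps on a pseudo-distributed 0.20.x node.
EXPECTED_DAEMONS = {
    "NameNode",
    "DataNode",
    "SecondaryNameNode",
    "JobTracker",
    "TaskTracker",
}


def missing_daemons(jps_output: str) -> set:
    """Return the expected daemons that do not appear in the given jps output."""
    running = set()
    for line in jps_output.splitlines():
        parts = line.split()
        # Each useful jps line looks like "<pid> <MainClassName>"; skip anything else.
        if len(parts) >= 2 and parts[0].isdigit():
            running.add(parts[1])
    return EXPECTED_DAEMONS - running


# Hypothetical jps output matching the situation described in the question.
sample = "4120 TaskTracker\n5536 SecondaryNameNode\n6012 Jps\n"
print(sorted(missing_daemons(sample)))
```

A daemon missing from this list is not necessarily down; it only means jps did not report it, so the result is best read together with the web interfaces and the daemon logs.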

I am able to view http://localhost:50030 for the jobtracker, http://localhost:50060 for the tasktracker and http://localhost:50070 for the namenode. I can also browse the DFS through the namenode's HTTP interface and see the files there.
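
The same reachability check can be scripted rather than done in a browser. This is a rough sketch under the assumption that a plain TCP connect is a good enough probe; the port numbers are the ones quoted in this post, while `port_open` and `report` are hypothetical helper names.

```python
import socket


def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


# Ports mentioned in the question: the two RPC ports and the three web UIs.
PORTS = {
    "namenode RPC": 9000,
    "jobtracker RPC": 9001,
    "jobtracker web UI": 50030,
    "tasktracker web UI": 50060,
    "namenode web UI": 50070,
}


def report(host: str = "localhost") -> dict:
    """Probe every known port on the given host and return name -> reachable."""
    return {name: port_open(host, port) for name, port in PORTS.items()}


if __name__ == "__main__":
    for name, ok in report().items():
        print(f"{name:20s} {'open' if ok else 'closed'}")
```

Calling `report()` with the machine's LAN address instead of `localhost` shows whether each port is also reachable on that interface, which is a different question from whether it is reachable on the loopback address.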

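  • Is my installation complete?
  • If yes, why does the jps command not show PIDs for all five components?
  • If not, what steps do I still need to take to complete the installation?
  • What other sanity checks can I use to test the completeness of the installation?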

I initially assumed that my installation was complete and followed the word-count example.

I get the following output:

    12/03/25 00:10:26 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
    12/03/25 00:10:26 INFO input.FileInputFormat: Total input paths to process : 1
    12/03/25 00:10:27 INFO mapred.JobClient: Running job: job_201203242348_0001
    12/03/25 00:10:28 INFO mapred.JobClient:  map 0% reduce 0%
    12/03/25 00:10:35 INFO mapred.JobClient:  map 100% reduce 0%
    12/03/25 00:21:29 INFO mapred.JobClient: Task Id : attempt_201203242348_0001_r_000000_0, Status : FAILED
    Shuffle Error: Exceeded MAX_FAILED_UNIQUE_FETCHES; bailing-out.
    12/03/25 00:32:25 INFO mapred.JobClient: Task Id : attempt_201203242348_0001_r_000000_1, Status : FAILED
    Shuffle Error: Exceeded MAX_FAILED_UNIQUE_FETCHES; bailing-out.
    12/03/25 00:44:02 INFO mapred.JobClient: Task Id : attempt_201203242348_0001_r_000000_2, Status : FAILED
    Shuffle Error: Exceeded MAX_FAILED_UNIQUE_FETCHES; bailing-out.
    12/03/25 00:55:00 INFO mapred.JobClient: Job complete: job_201203242348_0001
    12/03/25 00:55:00 INFO mapred.JobClient: Counters: 12
    12/03/25 00:55:00 INFO mapred.JobClient:   Job Counters
    12/03/25 00:55:00 INFO mapred.JobClient:     Launched reduce tasks=4
    12/03/25 00:55:00 INFO mapred.JobClient:     Launched map tasks=1
    12/03/25 00:55:00 INFO mapred.JobClient:     Data-local map tasks=1
    12/03/25 00:55:00 INFO mapred.JobClient:     Failed reduce tasks=1
    12/03/25 00:55:00 INFO mapred.JobClient:   FileSystemCounters
    12/03/25 00:55:00 INFO mapred.JobClient:     HDFS_BYTES_READ=13366
    12/03/25 00:55:00 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=23511
    12/03/25 00:55:00 INFO mapred.JobClient:   Map-Reduce Framework
    12/03/25 00:55:00 INFO mapred.JobClient:     Combine output records=0
    12/03/25 00:55:00 INFO mapred.JobClient:     Map input records=244
    12/03/25 00:55:00 INFO mapred.JobClient:     Spilled Records=1887
    12/03/25 00:55:00 INFO mapred.JobClient:     Map output bytes=19699
    12/03/25 00:55:00 INFO mapred.JobClient:     Combine input records=0
    12/03/25 00:55:00 INFO mapred.JobClient:     Map output records=1887
    
The map task seems to have completed, but the reduce task shows the following error in its log:

    2012-03-25 00:10:35,202 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201203242348_0001_r_000000_0: Got 1 new map-outputs
    2012-03-25 00:10:40,193 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201203242348_0001_r_000000_0 Scheduled 1 outputs (0 slow hosts and0 dup hosts)
    2012-03-25 00:10:40,243 INFO org.apache.hadoop.mapred.ReduceTask: header: attempt_201203242348_0001_m_000000_0, compressed len: 23479, decompressed len: 23475
    2012-03-25 00:10:40,243 INFO org.apache.hadoop.mapred.ReduceTask: Shuffling 23475 bytes (23479 raw bytes) into RAM from attempt_201203242348_0001_m_000000_0
    2012-03-25 00:11:35,194 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201203242348_0001_r_000000_0 Need another 1 map output(s) where 1 is already in progress
    2012-03-25 00:11:35,194 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201203242348_0001_r_000000_0 Scheduled 0 outputs (0 slow hosts and0 dup hosts)
    2012-03-25 00:12:35,197 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201203242348_0001_r_000000_0 Need another 1 map output(s) where 1 is already in progress
    2012-03-25 00:12:35,197 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201203242348_0001_r_000000_0 Scheduled 0 outputs (0 slow hosts and0 dup hosts)
    2012-03-25 00:13:35,202 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201203242348_0001_r_000000_0 Need another 1 map output(s) where 1 is already in progress
    2012-03-25 00:13:35,202 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201203242348_0001_r_000000_0 Scheduled 0 outputs (0 slow hosts and0 dup hosts)
    2012-03-25 00:13:40,249 INFO org.apache.hadoop.mapred.ReduceTask: Failed to shuffle from attempt_201203242348_0001_m_000000_0
    java.net.SocketTimeoutException: Read timed out
    at java.net.SocketInputStream.socketRead0(Native Method)
    at java.net.SocketInputStream.read(SocketInputStream.java:150)
    at java.net.SocketInputStream.read(SocketInputStream.java:121)
    at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
    at java.io.BufferedInputStream.read1(BufferedInputStream.java:275)
    at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
    at sun.net.www.http.ChunkedInputStream.fastRead(ChunkedInputStream.java:239)
    at sun.net.www.http.ChunkedInputStream.read(ChunkedInputStream.java:680)
    at java.io.FilterInputStream.read(FilterInputStream.java:133)
    at sun.net.www.protocol.http.HttpURLConnection$HttpInputStream.read(HttpURLConnection.java:2959)
    at org.apache.hadoop.mapred.IFileInputStream.doRead(IFileInputStream.java:149)
    at org.apache.hadoop.mapred.IFileInputStream.read(IFileInputStream.java:101)
    at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.shuffleInMemory(ReduceTask.java:1522)
    at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.getMapOutput(ReduceTask.java:1408)
    at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.copyOutput(ReduceTask.java:1261)
    at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.run(ReduceTask.java:1195)
    
Here are the contents of the tasktracker log:

    2012-03-25 00:10:27,910 INFO org.apache.hadoop.mapred.TaskTracker: LaunchTaskAction (registerTask): attempt_201203242348_0001_m_000002_0 task's state:UNASSIGNED
    2012-03-25 00:10:27,915 INFO org.apache.hadoop.mapred.TaskTracker: Trying to launch : attempt_201203242348_0001_m_000002_0
    2012-03-25 00:10:27,915 INFO org.apache.hadoop.mapred.TaskTracker: In TaskLauncher, current free slots : 2 and trying to launch attempt_201203242348_0001_m_000002_0
    2012-03-25 00:10:28,453 INFO org.apache.hadoop.mapred.JvmManager: In JvmRunner constructed JVM ID: jvm_201203242348_0001_m_625085452
    2012-03-25 00:10:28,454 INFO org.apache.hadoop.mapred.JvmManager: JVM Runner jvm_201203242348_0001_m_625085452 spawned.
    2012-03-25 00:10:29,217 INFO org.apache.hadoop.mapred.TaskTracker: JVM with ID: jvm_201203242348_0001_m_625085452 given task: attempt_201203242348_0001_m_000002_0
    2012-03-25 00:10:29,523 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201203242348_0001_m_000002_0 0.0% setup
    2012-03-25 00:10:29,524 INFO org.apache.hadoop.mapred.TaskTracker: Task attempt_201203242348_0001_m_000002_0 is done.
    2012-03-25 00:10:29,524 INFO org.apache.hadoop.mapred.TaskTracker: reported output size for attempt_201203242348_0001_m_000002_0  was 0
    2012-03-25 00:10:29,526 INFO org.apache.hadoop.mapred.TaskTracker: addFreeSlot : current free slots : 2
    2012-03-25 00:10:29,718 INFO org.apache.hadoop.mapred.JvmManager: JVM : jvm_201203242348_0001_m_625085452 exited. Number of tasks it ran: 1
    2012-03-25 00:10:30,911 INFO org.apache.hadoop.mapred.TaskTracker: org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find taskTracker/jobcache/job_201203242348_0001/attempt_201203242348_0001_m_000002_0/output/file.out in any of the configured local directories
    2012-03-25 00:10:30,952 INFO org.apache.hadoop.mapred.TaskTracker: LaunchTaskAction (registerTask): attempt_201203242348_0001_m_000000_0 task's state:UNASSIGNED
    2012-03-25 00:10:30,952 INFO org.apache.hadoop.mapred.TaskTracker: Trying to launch : attempt_201203242348_0001_m_000000_0
    2012-03-25 00:10:30,952 INFO org.apache.hadoop.mapred.TaskTracker: In TaskLauncher, current free slots : 2 and trying to launch attempt_201203242348_0001_m_000000_0
    2012-03-25 00:10:30,952 INFO org.apache.hadoop.mapred.TaskTracker: Received KillTaskAction for task: attempt_201203242348_0001_m_000002_0
    2012-03-25 00:10:30,952 INFO org.apache.hadoop.mapred.TaskTracker: About to purge task: attempt_201203242348_0001_m_000002_0
    2012-03-25 00:10:30,952 INFO org.apache.hadoop.mapred.TaskRunner: attempt_201203242348_0001_m_000002_0 done; removing files.
    2012-03-25 00:10:30,952 INFO org.apache.hadoop.mapred.IndexCache: Map ID attempt_201203242348_0001_m_000002_0 not found in cache
    2012-03-25 00:10:31,077 INFO org.apache.hadoop.mapred.JvmManager: In JvmRunner constructed JVM ID: jvm_201203242348_0001_m_-1399302881
    2012-03-25 00:10:31,077 INFO org.apache.hadoop.mapred.JvmManager: JVM Runner jvm_201203242348_0001_m_-1399302881 spawned.
    2012-03-25 00:10:31,812 INFO org.apache.hadoop.mapred.TaskTracker: JVM with ID: jvm_201203242348_0001_m_-1399302881 given task: attempt_201203242348_0001_m_000000_0
    2012-03-25 00:10:32,642 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201203242348_0001_m_000000_0 1.0% 
    2012-03-25 00:10:32,642 INFO org.apache.hadoop.mapred.TaskTracker: Task attempt_201203242348_0001_m_000000_0 is done.
    2012-03-25 00:10:32,642 INFO org.apache.hadoop.mapred.TaskTracker: reported output size for attempt_201203242348_0001_m_000000_0  was 0
    2012-03-25 00:10:32,642 INFO org.apache.hadoop.mapred.TaskTracker: addFreeSlot : current free slots : 2
    2012-03-25 00:10:32,822 INFO org.apache.hadoop.mapred.JvmManager: JVM : jvm_201203242348_0001_m_-1399302881 exited. Number of tasks it ran: 1
    2012-03-25 00:10:33,982 INFO org.apache.hadoop.mapred.TaskTracker: LaunchTaskAction (registerTask): attempt_201203242348_0001_r_000000_0 task's state:UNASSIGNED
    2012-03-25 00:10:33,982 INFO org.apache.hadoop.mapred.TaskTracker: Trying to launch : attempt_201203242348_0001_r_000000_0
    2012-03-25 00:10:33,982 INFO org.apache.hadoop.mapred.TaskTracker: In TaskLauncher, current free slots : 2 and trying to launch attempt_201203242348_0001_r_000000_0
    2012-03-25 00:10:34,057 INFO org.apache.hadoop.mapred.JvmManager: In JvmRunner constructed JVM ID: jvm_201203242348_0001_r_625085452
    2012-03-25 00:10:34,057 INFO org.apache.hadoop.mapred.JvmManager: JVM Runner jvm_201203242348_0001_r_625085452 spawned.
    2012-03-25 00:10:34,852 INFO org.apache.hadoop.mapred.TaskTracker: JVM with ID: jvm_201203242348_0001_r_625085452 given task: attempt_201203242348_0001_r_000000_0
    2012-03-25 00:10:40,243 INFO org.apache.hadoop.mapred.TaskTracker: Sent out 23479 bytes for reduce: 0 from map: attempt_201203242348_0001_m_000000_0 given 23479/23475
    2012-03-25 00:10:40,243 INFO org.apache.hadoop.mapred.TaskTracker.clienttrace: src: 192.168.1.33:50060, dest: 192.168.1.33:60790, bytes: 23479, op: MAPRED_SHUFFLE, cliID: attempt_201203242348_0001_m_000000_0
    2012-03-25 00:10:41,153 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201203242348_0001_r_000000_0 0.0% reduce > copy > 
    2012-03-25 00:10:44,158 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201203242348_0001_r_000000_0 0.0% reduce > copy > 
    2012-03-25 00:16:05,244 INFO org.apache.hadoop.mapred.TaskTracker: Sent out 23479 bytes for reduce: 0 from map: attempt_201203242348_0001_m_000000_0 given 23479/23475
    2012-03-25 00:16:05,244 INFO org.apache.hadoop.mapred.TaskTracker.clienttrace: src: 192.168.1.33:50060, dest: 192.168.1.33:60864, bytes: 23479, op: MAPRED_SHUFFLE, cliID: attempt_201203242348_0001_m_000000_0
    2012-03-25 00:16:05,249 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201203242348_0001_r_000000_0 0.0% reduce > copy > 
    2012-03-25 00:16:08,249 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201203242348_0001_r_000000_0 0.0% reduce > copy > 
    2012-03-25 00:21:25,251 FATAL org.apache.hadoop.mapred.TaskTracker: Task: attempt_201203242348_0001_r_000000_0 - Killed due to Shuffle Failure: Exceeded MAX_FAILED_UNIQUE_FETCHES; bailing-out.
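
In this log the tasktracker reports each map output as sent, while the reducer side times out reading it, and the transfers run between 192.168.1.33 addresses rather than 127.0.0.1. A small sketch for pulling those shuffle transfers out of a tasktracker log follows; the regular expression is written against the `clienttrace` lines shown above and is an assumption about their format, not a documented Hadoop interface.

```python
import re

# Matches the "clienttrace" lines that record one shuffle transfer each.
CLIENTTRACE = re.compile(
    r"clienttrace: src: (?P<src>[\d.]+):(?P<sport>\d+), "
    r"dest: (?P<dest>[\d.]+):(?P<dport>\d+), "
    r"bytes: (?P<bytes>\d+), op: (?P<op>\w+)"
)


def shuffle_transfers(log_text: str) -> list:
    """Return (src_ip, src_port, dest_ip, byte_count) for each MAPRED_SHUFFLE line."""
    rows = []
    for match in CLIENTTRACE.finditer(log_text):
        if match.group("op") == "MAPRED_SHUFFLE":
            rows.append((
                match.group("src"),
                int(match.group("sport")),
                match.group("dest"),
                int(match.group("bytes")),
            ))
    return rows


# One line copied from the tasktracker log in this post.
line = (
    "2012-03-25 00:10:40,243 INFO org.apache.hadoop.mapred.TaskTracker.clienttrace: "
    "src: 192.168.1.33:50060, dest: 192.168.1.33:60790, bytes: 23479, "
    "op: MAPRED_SHUFFLE, cliID: attempt_201203242348_0001_m_000000_0"
)
print(shuffle_transfers(line))
```

This only summarizes which addresses the shuffle used; it does not by itself explain why the read on the reducer side timed out.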
    
I opened ports 9000 and 9001 in Windows Firewall. I checked that these ports are indeed open; the netstat output is below:

    C:\Windows\system32>netstat -a -n | grep -e "500[367]0"
      TCP    0.0.0.0:50030          0.0.0.0:0              LISTENING
      TCP    0.0.0.0:50060          0.0.0.0:0              LISTENING
      TCP    0.0.0.0:50070          0.0.0.0:0              LISTENING
      TCP    [::]:50030             [::]:0                 LISTENING
      TCP    [::]:50060             [::]:0                 LISTENING
      TCP    [::]:50070             [::]:0                 LISTENING
    
    C:\Windows\system32>netstat -a -n | grep -e "900[01]"
      TCP    127.0.0.1:9000         0.0.0.0:0              LISTENING
      TCP    127.0.0.1:9000         127.0.0.1:60332        ESTABLISHED
      TCP    127.0.0.1:9000         127.0.0.1:60987        ESTABLISHED
      TCP    127.0.0.1:9001         0.0.0.0:0              LISTENING
      TCP    127.0.0.1:9001         127.0.0.1:60410        ESTABLISHED
      TCP    127.0.0.1:60332        127.0.0.1:9000         ESTABLISHED
      TCP    127.0.0.1:60410        127.0.0.1:9001         ESTABLISHED
      TCP    127.0.0.1:60987        127.0.0.1:9000         ESTABLISHED
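
The netstat output above shows ports 9000 and 9001 listening on 127.0.0.1 only, while 50030, 50060 and 50070 listen on all interfaces. To make that distinction easy to see, here is a minimal sketch that groups listening ports by bound address; it assumes the four-column layout shown above, and `listening_addresses` is a hypothetical helper name.

```python
def listening_addresses(netstat_output: str) -> dict:
    """Map each listening TCP port to the set of local addresses bound to it."""
    result = {}
    for line in netstat_output.splitlines():
        fields = line.split()
        # Expected shape: TCP <local addr:port> <remote addr:port> LISTENING
        if len(fields) == 4 and fields[0] == "TCP" and fields[3] == "LISTENING":
            # rpartition keeps IPv6 forms such as "[::]" intact.
            addr, _, port = fields[1].rpartition(":")
            result.setdefault(int(port), set()).add(addr)
    return result


# A few lines copied from the netstat output in this post.
sample = """
  TCP    0.0.0.0:50060          0.0.0.0:0              LISTENING
  TCP    [::]:50060             [::]:0                 LISTENING
  TCP    127.0.0.1:9000         0.0.0.0:0              LISTENING
  TCP    127.0.0.1:9000         127.0.0.1:60332        ESTABLISHED
"""
for port, addrs in sorted(listening_addresses(sample).items()):
    print(port, sorted(addrs))
```

Note that a LISTENING entry shows a process is bound to the port; it does not show whether the firewall lets traffic through, so the two checks complement each other.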
    
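Can you help with both issues: the installation and getting the reduce task to work?

I looked at http://wiki.apache.org/hadoop/SocketTimeout and a few other links and tried the suggestions, but without any success.

I appreciate your patience in reading this post and am happy to provide more details.

Thanks in advance.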

Please look at this line in the log:

    2012-03-25 00:10:30,911 INFO org.apache.hadoop.mapred.TaskTracker: org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find taskTracker/jobcache/job_201203242348_0001/attempt_201203242348_0001_m_000002_0/output/file.out in any of the configured local directories
    
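I guess you need to check hadoop.tmp.dir and mapred.local.dir. Going by the configuration you posted, neither is set, so both are at their documented default values. Set them to suitable locations and try again.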


Note: stop Hadoop before making this change and start it again afterwards.
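
As an illustration of that suggestion, the two properties could be added to the existing site files: hadoop.tmp.dir in conf/core-site.xml and mapred.local.dir in conf/mapred-site.xml. The sketch below renders such files from name/value pairs; the directory paths are placeholders rather than recommended values, and `build_site_xml` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET


def build_site_xml(props: dict) -> str:
    """Render name/value pairs as a Hadoop <configuration> document."""
    root = ET.Element("configuration")
    for name, value in props.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


# Placeholder paths only; choose directories that exist and are writable.
core_site = build_site_xml({
    "fs.default.name": "hdfs://localhost:9000",
    "hadoop.tmp.dir": "/tmp/hadoopuser/hadooptmp",
})
mapred_site = build_site_xml({
    "mapred.job.tracker": "localhost:9001",
    "mapred.local.dir": "/tmp/hadoopuser/mapredtmp",
})
print(core_site)
print(mapred_site)
```

The output is unindented XML, which Hadoop's configuration loader should accept like any other well-formed file; the existing fs.default.name and mapred.job.tracker entries from the question are carried over unchanged.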

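I tried setting explicit values, /tmp/hadoopuser/mapredtmp and /tmp/hadoopuser/hadooptmp, and I verified that the actual output of the map task exists at that location, but I still get the same error as before. This error is reported as still unresolved in 0.20.2 and fixed in 0.21.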