Matlab: How to read HDF data from Hadoop's HDFS
I am working on image processing on Hadoop, using HDF satellite data. With Hadoop Streaming I can access and process JPG and other image types, but with HDF data I get an error: Hadoop cannot read the HDF data from HDFS. It also takes more than 20 minutes for the error to appear. Each of my HDF files is larger than 150 MB. How can I solve this? How can I make Hadoop read this HDF data from HDFS? Some of my output:
hadoop@master:/usr/local/master/hdf/examples$ ./runD1.sh
Buildfile: /usr/local/master/hdf/build.xml
downloader:
setup:
test_settings:
compile:
BUILD SUCCESSFUL
Total time: 0 seconds
Output HIB: /var/www/html/uploads/
14/09/26 15:28:46 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
Found host successfully: 0
Repeated host: 1
Repeated host: 2
Repeated host: 3
Tried to get 2 nodes, got 1
14/09/26 15:28:46 INFO input.FileInputFormat: Total input paths to process : 1
First n-1 nodes responsible for 1592259 images
Last node responsible for 1592259 images
14/09/26 15:29:04 INFO mapred.JobClient: Running job: job_201409191212_0006
14/09/26 15:29:05 INFO mapred.JobClient: map 0% reduce 0%
14/09/26 15:39:15 INFO mapred.JobClient: Task Id : attempt_201409191212_0006_m_000000_0, Status : FAILED
Task attempt_201409191212_0006_m_000000_0 failed to report status for 600 seconds. Killing!
14/09/26 15:49:17 INFO mapred.JobClient: Task Id : attempt_201409191212_0006_m_000000_1, Status : FAILED
Task attempt_201409191212_0006_m_000000_1 failed to report status for 600 seconds. Killing!
14/09/26 15:59:19 INFO mapred.JobClient: Task Id : attempt_201409191212_0006_m_000000_2, Status : FAILED
Task attempt_201409191212_0006_m_000000_2 failed to report status for 600 seconds. Killing!
The error log is:
2014-09-26 15:38:45,133 INFO org.apache.hadoop.mapred.JvmManager: In JvmRunner constructed JVM ID: jvm_201409191212_0006_m_-1211757488
2014-09-26 15:38:45,133 INFO org.apache.hadoop.mapred.JvmManager: JVM Runner jvm_201409191212_0006_m_-1211757488 spawned.
2014-09-26 15:38:45,136 INFO org.apache.hadoop.mapred.TaskController: Writing commands to /usr/local/master/temp/mapred/local/ttprivate/taskTracker/hadoop/jobcache/job_201409191212_0006/attempt_201409191212_0006_m_000000_0.cleanup/taskjvm.sh
2014-09-26 15:38:45,631 INFO org.apache.hadoop.mapred.TaskTracker: JVM with ID: jvm_201409191212_0006_m_-1211757488 given task: attempt_201409191212_0006_m_000000_0
2014-09-26 15:38:46,145 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201409191212_0006_m_000000_0 0.0%
2014-09-26 15:38:46,198 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201409191212_0006_m_000000_0 0.0% cleanup
2014-09-26 15:38:46,200 INFO org.apache.hadoop.mapred.TaskTracker: Task attempt_201409191212_0006_m_000000_0 is done.
2014-09-26 15:38:46,200 INFO org.apache.hadoop.mapred.TaskTracker: reported output size for attempt_201409191212_0006_m_000000_0 was -1
2014-09-26 15:38:46,200 INFO org.apache.hadoop.mapred.TaskTracker: addFreeSlot : current free slots : 2
2014-09-26 15:38:46,340 INFO org.apache.hadoop.mapred.JvmManager: JVM : jvm_201409191212_0006_m_-1211757488 exited with exit code 0. Number of tasks it ran: 1
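The console output shows each attempt being killed after it "failed to report status for 600 seconds", which matches Hadoop's default task timeout (`mapred.task.timeout`, 600000 ms in Hadoop 1.x). Three attempts at ten minutes each also accounts for the wait of more than 20 minutes. The following is a minimal sketch, assuming the job runs as a Hadoop Streaming mapper written in Python (the question mentions streaming, but the log does not confirm it). It reads its input in chunks and writes `reporter:status:` lines to stderr, which Hadoop Streaming interprets as status updates. The function name `read_with_status` and the chunk size are hypothetical, and the sketch does not explain why the original read stalls.

```python
import sys

CHUNK_SIZE = 4 * 1024 * 1024  # 4 MiB per read; an arbitrary choice for this sketch


def read_with_status(stream, status_out, chunk_size=CHUNK_SIZE):
    """Read a binary stream to the end, writing a status line after each chunk.

    Hadoop Streaming treats stderr lines of the form
    'reporter:status:<message>' as task status updates.
    Returns the total number of bytes read.
    """
    total = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        total += len(chunk)
        # A real mapper would pass the chunk to its own processing step here.
        status_out.write("reporter:status:read %d bytes\n" % total)
        status_out.flush()
    return total


def main():
    # In a streaming job, stdin carries the input and stderr carries status lines.
    n = read_with_status(sys.stdin.buffer, sys.stderr)
    # Emit one tab-separated key/value record on stdout.
    print("bytes_read\t%d" % n)
```

Used as a streaming mapper, the script would end with a call to `main()`. Raising `mapred.task.timeout` in the job configuration is the other common way to avoid the kill, but it only hides a read that never finishes.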
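Can anyone please help me solve this problem?

Which InputFormat are you using?
My input data format is HDF (that is the file extension).
I mean the MapReduce InputFormat, not HDF. Can you post the contents of runD1.sh?
/bin/bash ./runDownloader2.sh /usr/O2_01AUG2013_010_012_GAN_L1B_ST_.hdf /var/www/html/uploads/mynew2
It is still hard to guess without runDownloader2.sh. Please update your question.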
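The exchange above turns on the difference between the file format (HDF) and the MapReduce InputFormat. Part of the confusion may be that the `.hdf` extension is used for both HDF4 and HDF5 files, which are different binary formats and need different reader libraries. The sketch below is only an illustration, not something from the thread. It checks the leading bytes of a file against the two well-known signatures, and the function name `detect_hdf_version` is hypothetical. It checks offset 0 only, whereas an HDF5 file with a user block can carry its signature at a later offset.

```python
HDF4_MAGIC = b"\x0e\x03\x13\x01"     # first four bytes of an HDF4 file
HDF5_MAGIC = b"\x89HDF\r\n\x1a\n"    # eight-byte HDF5 superblock signature


def detect_hdf_version(header):
    """Classify a file by its leading bytes (offset 0 only).

    Returns "HDF5", "HDF4", or "unknown".
    """
    if header.startswith(HDF5_MAGIC):
        return "HDF5"
    if header.startswith(HDF4_MAGIC):
        return "HDF4"
    return "unknown"


def detect_hdf_version_of_file(path):
    """Read the first eight bytes of a local file and classify them."""
    with open(path, "rb") as f:
        return detect_hdf_version(f.read(8))
```

Either format is a binary container, so an InputFormat that splits its input into text lines would hand the mapper fragments that no HDF library can open.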