FileNotFoundException on Hadoop

hadoop mapreduce


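In the map function, I am trying to read a file from the DistributedCache and load its contents into a hash map.

The stdout log of the MapReduce job prints the contents of the hash map. This shows that the file was found, loaded into the data structure, and processed as intended: the code iterates over the entries and prints them, which proves the operation succeeded.

However, a few minutes into the MR job I still get the following error: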

    13/01/27 18:44:21 INFO mapred.JobClient: Task Id : attempt_201301271841_0001_m_000001_2, Status : FAILED
    java.io.FileNotFoundException: File does not exist: /app/hadoop/jobs/nw_single_pred_in/predict
        at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.openInfo(DFSClient.java:1843)
        at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.<init>(DFSClient.java:1834)
        at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:578)
        at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:154)
        at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:427)
        at org.apache.hadoop.mapreduce.lib.input.LineRecordReader.initialize(LineRecordReader.java:67)
        at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.initialize(MapTask.java:522)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:763)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
        at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
        at org.apache.hadoop.mapred.Child.main(Child.java:249)

Below is the part that initializes the path with the location of the file to be placed in the distributed cache:

    // inside main, surrounded by a try/catch block, yet no exception is thrown here
    Configuration conf = new Configuration();
    // rest of the stuff that relates to conf
    Path knowledgefilepath = new Path(args[3]); // args[3] = /app/hadoop/jobs/nw_single_pred_in/predict/knowledge.txt
    DistributedCache.addCacheFile(knowledgefilepath.toUri(), conf);
    job.setJarByClass(NBprediction.class);
    // rest of job settings
    job.waitForCompletion(true); // kick off load

This one is in the map function:

    try {
        System.out.println("Inside try !!");
        Path files[] = DistributedCache.getLocalCacheFiles(context.getConfiguration());
        Path cfile = new Path(files[0].toString()); // only one file
        System.out.println("File path : " + cfile.toString());
        CSVReader reader = new CSVReader(new FileReader(cfile.toString()), '\t');
        while ((nline = reader.readNext()) != null)
            data.put(nline[0], Double.parseDouble(nline[1])); // load into a hashmap
    } catch (Exception e) {
        // handle exception
    }

Thanks for your help.

Cheers

I reinstalled Hadoop and ran the job with the same jar, and the problem went away. This seems to have been a bug rather than a programming error.
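The trace in the question fails inside LineRecordReader.initialize, which is the job's input reader, not the code that reads the cached file. To rule out the cache-loading step on its own, it can be pulled out of the mapper and exercised without a cluster. What follows is only a minimal sketch under stated assumptions: each line holds a key and a numeric value separated by a tab, the names KnowledgeLoader and parse are hypothetical, and a plain BufferedReader stands in for the CSVReader used in the question.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of the original job.
class KnowledgeLoader {

    // Parses lines of the form "key<TAB>value" into a map.
    // Blank lines are skipped. A line with fewer than two fields raises
    // an IOException; a non-numeric value raises NumberFormatException.
    static Map<String, Double> parse(Reader source) throws IOException {
        Map<String, Double> data = new HashMap<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            int lineNo = 0;
            while ((line = reader.readLine()) != null) {
                lineNo++;
                if (line.trim().isEmpty()) {
                    continue;
                }
                String[] fields = line.split("\t");
                if (fields.length < 2) {
                    throw new IOException(
                        "line " + lineNo + ": expected 2 tab-separated fields");
                }
                data.put(fields[0], Double.parseDouble(fields[1].trim()));
            }
        }
        return data;
    }
}
```

In the mapper this could be called as KnowledgeLoader.parse(new FileReader(files[0].toString())), with files obtained from DistributedCache.getLocalCacheFiles as in the question. If the load currently runs inside map() itself, moving it to the mapper's setup(Context) method would read the file once per task instead of once per record.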

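It's hard to find the problem unless you share the part of your code that uses the distributed cache.

/app/hadoop/jobs/nw_single_pred_in/predict: is this the absolute path of the file, or the directory the file resides in?

@shazin This is the directory on HDFS where the file resides.

@CharlesMenguy I've added the code. Please see the post.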