Hadoop Pig Latin cannot read a file from HDFS


I tried the Pig demo code by following its online manual.

First, I created a test file named myfile.txt. It contains six integers on two lines:

4 5 3 
1 2 3 
I put the file into HDFS with hadoop fs -copyFromLocal myfile.txt /user/myfile.txt
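For reproducibility, the setup step can be scripted like this (a sketch; the hadoop fs commands are commented out because they need a running cluster, and the ls/cat verification steps are my addition):

```shell
# Recreate the test file locally: two lines, six integers
printf '4 5 3\n1 2 3\n' > myfile.txt
cat myfile.txt

# Copy it into HDFS, then verify it actually arrived
# (these require a running Hadoop cluster):
# hadoop fs -copyFromLocal myfile.txt /user/myfile.txt
# hadoop fs -ls /user/myfile.txt    # confirm the file exists
# hadoop fs -cat /user/myfile.txt   # confirm its contents
```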

Then I ran:

A = LOAD '/user/myfile.text';
DUMP A;
but got the following error message:

2014-10-08 14:15:54,259 [main] INFO  org.apache.pig.tools.pigstats.ScriptState - Pig features used in the script: UNKNOWN
2014-10-08 14:15:54,594 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimistic? false
2014-10-08 14:15:54,692 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 1
2014-10-08 14:15:54,693 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 1
2014-10-08 14:15:54,909 [main] INFO  org.apache.pig.tools.pigstats.ScriptState - Pig script settings are added to the job
2014-10-08 14:15:54,998 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2014-10-08 14:15:55,006 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Using reducer estimator: org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.InputSizeReducerEstimator
2014-10-08 14:15:55,013 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.InputSizeReducerEstimator - BytesPerReducer=1000000000 maxReducers=999 totalInputFileSize=12
2014-10-08 14:15:55,015 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting Parallelism to 1
2014-10-08 14:15:55,016 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - creating jar file Job7804857093829884774.jar
2014-10-08 14:15:58,229 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - jar file Job7804857093829884774.jar created
2014-10-08 14:15:58,266 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
2014-10-08 14:15:58,304 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-reduce job(s) waiting for submission.
2014-10-08 14:15:58,353 [JobControl] WARN  org.apache.hadoop.mapred.JobClient - Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
2014-10-08 14:15:58,806 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
2014-10-08 14:15:58,964 [JobControl] WARN  org.apache.hadoop.conf.Configuration - fs.default.name is deprecated. Instead, use fs.defaultFS
2014-10-08 14:15:58,968 [JobControl] WARN  org.apache.hadoop.conf.Configuration - dfs.https.address is deprecated. Instead, use dfs.namenode.https-address
2014-10-08 14:15:58,969 [JobControl] WARN  org.apache.hadoop.conf.Configuration - io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum
2014-10-08 14:15:59,024 [JobControl] INFO  org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2014-10-08 14:15:59,025 [JobControl] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
2014-10-08 14:15:59,051 [JobControl] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths (combined) to process : 1
2014-10-08 14:16:00,533 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_201410081312_0015
2014-10-08 14:16:00,534 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Processing aliases A
2014-10-08 14:16:00,534 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - detailed locations: M: A[2,4] C:  R: 
2014-10-08 14:16:05,098 [main] WARN  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Ooops! Some job has failed! Specify -stop_on_failure if you want Pig to stop immediately on failure.
2014-10-08 14:16:05,098 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - job job_201410081312_0015 has failed! Stop running all dependent jobs
2014-10-08 14:16:05,099 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
2014-10-08 14:16:05,109 [main] ERROR org.apache.pig.tools.pigstats.PigStatsUtil - 1 map reduce job(s) failed!
2014-10-08 14:16:05,111 [main] INFO  org.apache.pig.tools.pigstats.SimplePigStats - Script Statistics: 

HadoopVersion   PigVersion  UserId  StartedAt   FinishedAt  Features
2.0.0-cdh4.7.0  0.11.0-cdh4.7.0 hdfs    2014-10-08 14:15:54 2014-10-08 14:16:05 UNKNOWN

Failed!

Failed Jobs:
JobId   Alias   Feature Message Outputs
job_201410081312_0015   A   MAP_ONLY    Message: Job failed!    

Input(s):
Failed to read data from "/user/myfile.txt"

It seems that Pig is not connecting to HDFS and therefore cannot access the file. Can anyone help me solve this?
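One detail worth double-checking in a script like the one above: PigStorage's default field delimiter is tab, so space-separated integers are safer to load with the delimiter and a schema spelled out explicitly. A sketch (the field names x, y, z are assumptions, and the path is the one the file was copied to):

```shell
# Write a Pig script with an explicit loader, delimiter, and schema
cat > load_demo.pig <<'EOF'
A = LOAD '/user/myfile.txt' USING PigStorage(' ') AS (x:int, y:int, z:int);
DUMP A;
EOF
cat load_demo.pig

# Run it against the cluster (requires Pig and Hadoop):
# pig load_demo.pig
```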

Change the file's permissions; you may not be able to read it.

In a Linux environment, change the file's permissions with

chmod 755 myfile.txt

and then run the copyFromLocal command again.
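Put end to end, the permission fix looks like this (a sketch; the stat check is my addition to confirm the bits, and the hadoop command is commented out because it needs a cluster):

```shell
# Recreate the file and make it world-readable before copying
printf '4 5 3\n1 2 3\n' > myfile.txt
chmod 755 myfile.txt

# Confirm the permission bits (GNU stat; on macOS use: stat -f '%Lp')
stat -c '%a' myfile.txt

# Re-run the copy into HDFS (requires a running cluster):
# hadoop fs -copyFromLocal myfile.txt /user/myfile.txt
```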

Did you check the Hadoop logs? Do you see your Hadoop job on the JobDetails page at http://:50030/? Can you find the file with hdfs dfs -ls?

I am facing the same problem. Did you find the root cause?

I am facing the same problem; is there any solution? The permissions are correct.