Hadoop Pig Latin: DUMP command not displaying results


hadoop apache-pig


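I am trying to display the result of grouped_records with DUMP, but no data is shown; all I get is a lot of log output. I am only working with 10 records.

Details below: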

grunt> DUMP grouped_records;
2016-02-21 17:34:24,338 [main] INFO  org.apache.pig.tools.pigstats.ScriptState - Pig features used in the script: GROUP_BY,FILTER
2016-02-21 17:34:24,339 [main] INFO  org.apache.pig.newplan.logical.optimizer.LogicalPlanOptimizer - {RULES_ENABLED=[AddForEach, ColumnMapKeyPrune, DuplicateForEachColumnRewrite, GroupByConstParallelSetter, ImplicitSplitInserter, LimitOptimizer, LoadTypeCastInserter, MergeFilter, MergeForEach, NewPartitionFilterOptimizer, PushDownForEachFlatten, PushUpFilter, SplitFilter, StreamTypeCastInserter], RULES_DISABLED=[FilterLogicExpressionSimplifier, PartitionFilterOptimizer]}
2016-02-21 17:34:24,354 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimistic? false
2016-02-21 17:34:24,374 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 1
2016-02-21 17:34:24,374 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 1
2016-02-21 17:34:24,434 [main] INFO  org.apache.hadoop.yarn.client.RMProxy - Connecting to ResourceManager at /0.0.0.0:8032
2016-02-21 17:34:24,440 [main] INFO  org.apache.pig.tools.pigstats.ScriptState - Pig script settings are added to the job
2016-02-21 17:34:24,527 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2016-02-21 17:34:24,530 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Reduce phase detected, estimating # of required reducers.
2016-02-21 17:34:24,534 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Using reducer estimator: org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.InputSizeReducerEstimator
2016-02-21 17:34:24,541 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.InputSizeReducerEstimator - BytesPerReducer=1000000000 maxReducers=999 totalInputFileSize=142
2016-02-21 17:34:24,541 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting Parallelism to 1
2016-02-21 17:34:25,128 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - creating jar file Job662989067023626482.jar
2016-02-21 17:34:31,290 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - jar file Job662989067023626482.jar created
2016-02-21 17:34:31,335 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
2016-02-21 17:34:31,338 [main] INFO  org.apache.pig.data.SchemaTupleFrontend - Key [pig.schematuple] is false, will not generate code.
2016-02-21 17:34:31,338 [main] INFO  org.apache.pig.data.SchemaTupleFrontend - Starting process to move generated code to distributed cache
2016-02-21 17:34:31,338 [main] INFO  org.apache.pig.data.SchemaTupleFrontend - Setting key [pig.schematuple.classes] with classes to deserialize []
2016-02-21 17:34:31,549 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-reduce job(s) waiting for submission.
2016-02-21 17:34:31,550 [main] INFO  org.apache.hadoop.conf.Configuration.deprecation - mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
2016-02-21 17:34:31,556 [JobControl] INFO  org.apache.hadoop.yarn.client.RMProxy - Connecting to ResourceManager at /0.0.0.0:8032
2016-02-21 17:34:31,607 [JobControl] INFO  org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2016-02-21 17:34:31,918 [JobControl] INFO  org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2016-02-21 17:34:31,918 [JobControl] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
2016-02-21 17:34:31,921 [JobControl] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths (combined) to process : 1
2016-02-21 17:34:31,979 [JobControl] INFO  org.apache.hadoop.mapreduce.JobSubmitter - number of splits:1
2016-02-21 17:34:32,092 [JobControl] INFO  org.apache.hadoop.mapreduce.JobSubmitter - Submitting tokens for job: job_1454294818944_0034
2016-02-21 17:34:32,192 [JobControl] INFO  org.apache.hadoop.yarn.client.api.impl.YarnClientImpl - Submitted application application_1454294818944_0034
2016-02-21 17:34:32,198 [JobControl] INFO  org.apache.hadoop.mapreduce.Job - The url to track the job: http://quickstart.cloudera:8088/proxy/application_1454294818944_0034/
2016-02-21 17:34:32,198 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_1454294818944_0034
2016-02-21 17:34:32,198 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Processing aliases filtered_records,grouped_records,records
2016-02-21 17:34:32,198 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - detailed locations: M: records[1,10],records[-1,-1],filtered_records[2,19],grouped_records[3,18] C:  R: 
2016-02-21 17:34:32,198 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - More information at: http://localhost:50030/jobdetails.jsp?jobid=job_1454294818944_0034
2016-02-21 17:34:32,428 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
2016-02-21 17:35:02,623 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 50% complete
2016-02-21 17:35:23,469 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
2016-02-21 17:35:23,470 [main] INFO  org.apache.pig.tools.pigstats.SimplePigStats - Script Statistics: 

HadoopVersion   PigVersion  UserId  StartedAt   FinishedAt  Features
2.6.0-cdh5.5.0  0.12.0-cdh5.5.0 cloudera    2016-02-21 17:34:24 2016-02-21 17:35:23 GROUP_BY,FILTER

Success!

Job Stats (time in seconds):
JobId   Maps    Reduces MaxMapTime  MinMapTIme  AvgMapTime  MedianMapTime   MaxReduceTime   MinReduceTime   AvgReduceTime   MedianReducetime    Alias   Feature Outputs
job_1454294818944_0034  1   1   12  12  12  12  16  16  16  16  filtered_records,grouped_records,records    GROUP_BY    hdfs://quickstart.cloudera:8020/tmp/temp-1703423271/tmp-988597361,

Input(s):
Successfully read 10 records (525 bytes) from: "/user/hduser/input/maxtemppig.tsv"

Output(s):
Successfully stored 0 records in: "hdfs://quickstart.cloudera:8020/tmp/temp-1703423271/tmp-988597361"

Counters:
Total records written : 0
Total bytes written : 0
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 0
Total records proactively spilled: 0

Job DAG:
job_1454294818944_0034


2016-02-21 17:35:23,646 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Success!
2016-02-21 17:35:23,648 [main] INFO  org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2016-02-21 17:35:23,648 [main] INFO  org.apache.hadoop.conf.Configuration.deprecation - mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
2016-02-21 17:35:23,649 [main] INFO  org.apache.pig.data.SchemaTupleBackend - Key [pig.schematuple] was not set... will not generate code.
2016-02-21 17:35:23,660 [main] INFO  org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2016-02-21 17:35:23,660 [main] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1

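Commands I tried:

records = LOAD '/user/hduser/input/maxtemppig.tsv' AS (year:chararray, temperature:int, quality:int);
filtered_records = FILTER records BY temperature IN (-10,19) AND quality IN (0,1,4,5,9);
DUMP filtered_records;

grouped_records = GROUP filtered_records BY year;
DUMP grouped_records;

max_temp = FOREACH grouped_records GENERATE group, MAX(filtered_records.temperature);
DUMP max_temp;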

My input TSV file:

1950 32001459
1951 33 01459
1950 21 01459
1940 24 01459
1950 33 01459
2000 30 01459
2010 44 01459
2014 -10 01459
2016 -20 01459
2011 1901459

What am I missing?

Most likely the parsing is not working, so your filter ends up discarding every record.

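Try:

records = LOAD '/user/hduser/input/maxtemppig.tsv' USING PigStorage('\t') AS (year:chararray, temperature:int, quality:int);

Please edit your question to include your file and the rest of your Pig code. Some people may want to reproduce the problem. You will find the "edit" link under your question.

Please don't use comments for code. Also, add the contents of the TSV file again. Please delete these comments and update your question.

Thank you for your reply… I will check the parsing part.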
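The answer's point, that a parsing failure leads to every record being filtered out, can be illustrated with a rough plain-Python imitation of what happens to one line. This is only a sketch and not Pig itself. It assumes that a field which cannot be cast to int becomes null, that a record with a null in the filter condition is dropped, and that fields missing after the split are null. The helper names (to_int, parse, passes_filter) are made up for this illustration, and the sample lines are modeled on the rows in the question; whether the real file is tab-separated is not known from the post.

```python
from typing import Optional, Tuple

# Values accepted by the FILTER statement in the question.
ALLOWED_TEMPERATURES = {-10, 19}
ALLOWED_QUALITIES = {0, 1, 4, 5, 9}

Record = Tuple[Optional[str], Optional[int], Optional[int]]


def to_int(field: Optional[str]) -> Optional[int]:
    """Rough stand-in for an int cast: unparseable input becomes None (null)."""
    if field is None:
        return None
    try:
        return int(field)
    except ValueError:
        return None


def parse(line: str, delimiter: str = "\t") -> Record:
    """Split one line on the delimiter; fields that are missing become None."""
    parts = line.split(delimiter)
    parts += [None] * (3 - len(parts))
    year, temperature, quality = parts[:3]
    return year, to_int(temperature), to_int(quality)


def passes_filter(record: Record) -> bool:
    """Imitate the FILTER: a null in either tested field drops the record."""
    _, temperature, quality = record
    if temperature is None or quality is None:
        return False
    return temperature in ALLOWED_TEMPERATURES and quality in ALLOWED_QUALITIES


# Hypothetical sample lines modeled on the rows shown in the question.
samples = {
    "space-separated": "2014 -10 01459",
    "tab-separated": "2014\t-10\t01459",
    "tab-separated, quality 1": "2014\t-10\t1",
}

for label, line in samples.items():
    record = parse(line)
    print(label, record, passes_filter(record))
```

In this imitation only the third sample passes. The space-separated line is never split on the tab, so temperature and quality come out as null. The tab-separated line parses, but its quality field becomes the single integer 1459, which is not among the listed quality values. In grunt, running DUMP records; before applying the filter is a quick way to see which of these cases applies to the real file.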