Plotting data from Apache Pig via matplotlib

Tags: python, matplotlib, streaming, apache-pig

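I'm trying to plot some data from Apache Pig using Python/matplotlib.

Specifically, I want to read and process the data with Pig, then stream it through a plotting script written in Python.

I've been using the plotting script outside of Apache Pig for a while without incident, so I'm fairly sure it isn't the problem, but I can post it if anyone wants to see it.

Here is my Pig script: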

%default BINSIZE 5.0

/* functions */
define plot `test_plot.py -f output_image.png` ship('/tank/user/eric/dev/pig/test_plot.py');

/* load the data */
cd /scratch;
VALUE = load 'test_data.txt' as (x_val:double);

/* bin the data */
BINNED_VAL = foreach VALUE
         generate (double)((int)( x_val / $BINSIZE )) * $BINSIZE;

/* make a histogram */
COUNTED = group BINNED_VAL by $0;
HIST = foreach COUNTED generate group, COUNT(BINNED_VAL);

A = stream HIST through plot;

dump A;
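The -f flag of test_plot.py specifies the output file. The script reads data from stdin but writes nothing to stdout, so A is never actually set to anything, which means dump A has nothing to dump. The job does throw an error, though.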
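For illustration only, here is a minimal sketch of what a stream-friendly plotting script could look like. The real test_plot.py is not shown in this post, so every function name below is hypothetical. The sketch assumes records arrive on stdin as tab-separated `bin<TAB>count` lines, selects the non-interactive Agg backend (assuming the task node has no display), and echoes each record back to stdout so that the streamed relation is not empty.

```python
import sys


def read_histogram(lines):
    """Parse tab-separated 'bin<TAB>count' lines into two parallel lists."""
    bins, counts = [], []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        fields = line.split("\t")
        bins.append(float(fields[0]))
        counts.append(int(fields[1]))
    return bins, counts


def save_plot(bins, counts, path, width=5.0):
    """Draw a bar chart and write it to `path` (hypothetical helper)."""
    import matplotlib
    matplotlib.use("Agg")  # must be selected before importing pyplot
    import matplotlib.pyplot as plt
    plt.bar(bins, counts, width=width, align="edge")
    plt.savefig(path)


def main(argv):
    # -f <file> names the output image, as in the question's define statement.
    path = argv[argv.index("-f") + 1] if "-f" in argv else "output_image.png"
    bins, counts = read_histogram(sys.stdin)
    if bins:  # only plot when some records actually arrived
        save_plot(bins, counts, path)
    # Echo the records so the relation produced by `stream` has rows.
    for b, c in zip(bins, counts):
        sys.stdout.write("%s\t%d\n" % (b, c))


if __name__ == "__main__":
    main(sys.argv[1:])
```

Note that a relative output path such as output_image.png would most likely be resolved against the working directory of the task that runs the script, not the directory Pig was launched from.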

Here are the contents of test_data.txt:

5
5
6
6.5
8
12
28
25
25
25
26
29
32
35
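As a sanity check outside Pig, the histogram that the script above should produce from this file can be worked out in plain Python. This is only a sketch: the function names are made up for illustration, and it assumes Python's `int()` truncation matches Pig's `(int)` cast for these non-negative values.

```python
BINSIZE = 5.0
VALUES = [5, 5, 6, 6.5, 8, 12, 28, 25, 25, 25, 26, 29, 32, 35]


def bin_value(x, binsize=BINSIZE):
    # Same arithmetic as the Pig expression:
    # (double)((int)( x_val / $BINSIZE )) * $BINSIZE
    return float(int(x / binsize)) * binsize


def histogram(values, binsize=BINSIZE):
    """Count how many values fall into each bin, sorted by bin start."""
    counts = {}
    for x in values:
        key = bin_value(x, binsize)
        counts[key] = counts.get(key, 0) + 1
    return sorted(counts.items())


# Print one tab-separated 'bin<TAB>count' line per bin.
for bin_start, count in histogram(VALUES):
    print("%s\t%d" % (bin_start, count))
```

For the 14 values above this yields five bins: 5.0 with 5 entries, 10.0 with 1, 25.0 with 6, 30.0 with 1, and 35.0 with 1. Piping that output into the plotting script by hand is one way to test it without involving Pig at all.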
Here is the error message I receive:

2014-07-07 12:49:30,973 [main] ERROR org.apache.pig.tools.pigstats.SimplePigStats - ERROR 2090: Received Error while processing the reduce plan: 'test_plot.py (stdin-org.apache.pig.builtin.PigStreaming/stdout-org.apache.pig.builtin.PigStreaming)' failed with exit status: 1
2014-07-07 12:49:30,973 [main] ERROR org.apache.pig.tools.pigstats.PigStatsUtil - 1 map reduce job(s) failed!
2014-07-07 12:49:30,974 [main] INFO  org.apache.pig.tools.pigstats.SimplePigStats - Script Statistics: 

HadoopVersion   PigVersion      UserId  StartedAt       FinishedAt      Features
2.4.0.2.1.2.1-471       0.12.1.2.1.2.1-471      eric    2014-07-07 12:48:57  2014-07-07 12:49:30      GROUP_BY,STREAMING

Failed!

Failed Jobs:
JobId   Alias   Feature Message Outputs
job_1404713698289_0021  A,BINNED_VAL,COUNTED,HIST,VALUE GROUP_BY,STREAMING,COMBINER   Message: Job failed!    hdfs://hypno.st.hmc.edu:8020/tmp/temp-2122498041/tmp461187682,

Input(s):
Failed to read data from "hdfs://hypno.st.hmc.edu:8020/scratch/test_data.txt"

Output(s):
Failed to produce result in "hdfs://hypno.st.hmc.edu:8020/tmp/temp-2122498041/tmp461187682"

Counters:
Total records written : 0
Total bytes written : 0
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 0
Total records proactively spilled: 0

Job DAG:
job_1404713698289_0021


2014-07-07 12:49:30,974 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Failed!
2014-07-07 12:49:30,986 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2090: Received Error while processing the reduce plan: 'test_plot.py (stdin-org.apache.pig.builtin.PigStreaming/stdout-org.apache.pig.builtin.PigStreaming)' failed with exit status: 1
Details at logfile: /tank/user/eric/dev/pig/pig_1404762535492.log
Here is the output log file:

Backend error message
---------------------
AttemptID:attempt_1404713698289_0021_m_000000_0 Info:Container killed by the ApplicationMaster.
Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143

Backend error message
---------------------
AttemptID:attempt_1404713698289_0021_r_000000_0 Info:Error: org.apache.pig.backend.executionengine.ExecException: ERROR 2090: Received Error while processing the reduce plan: 'test_plot.py (stdin-org.apache.pig.builtin.PigStreaming/stdout-org.apache.pig.builtin.PigStreaming)' failed with exit status: 1
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.runPipeline(PigGenericMapReduce.java:496)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.cleanup(PigGenericMapReduce.java:522)
        at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:179)
        at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:627)
        at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:389)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:167)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1557)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)

Backend error message
---------------------
AttemptID:attempt_1404713698289_0021_r_000000_0 Info:Container killed by the ApplicationMaster.
Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143

Backend error message
---------------------
AttemptID:attempt_1404713698289_0021_r_000000_1 Info:Error: org.apache.pig.backend.executionengine.ExecException: ERROR 2090: Received Error while processing the reduce plan: 'test_plot.py (stdin-org.apache.pig.builtin.PigStreaming/stdout-org.apache.pig.builtin.PigStreaming)' failed with exit status: 1
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.runPipeline(PigGenericMapReduce.java:496)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.cleanup(PigGenericMapReduce.java:522)
        at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:179)
        at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:627)
        at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:389)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:167)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1557)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)

Backend error message
---------------------
AttemptID:attempt_1404713698289_0021_r_000000_1 Info:Container killed by the ApplicationMaster.
Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143

Backend error message
---------------------
AttemptID:attempt_1404713698289_0021_r_000000_2 Info:Error: org.apache.pig.backend.executionengine.ExecException: ERROR 2090: Received Error while processing the reduce plan: 'test_plot.py (stdin-org.apache.pig.builtin.PigStreaming/stdout-org.apache.pig.builtin.PigStreaming)' failed with exit status: 1
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.runPipeline(PigGenericMapReduce.java:496)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.cleanup(PigGenericMapReduce.java:522)
        at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:179)
        at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:627)
        at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:389)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:167)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1557)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)

Backend error message
---------------------
AttemptID:attempt_1404713698289_0021_r_000000_2 Info:Container killed by the ApplicationMaster.
Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143

Backend error message
---------------------
AttemptID:attempt_1404713698289_0021_r_000000_3 Info:Error: org.apache.pig.backend.executionengine.ExecException: ERROR 2090: Received Error while processing the reduce plan: 'test_plot.py (stdin-org.apache.pig.builtin.PigStreaming/stdout-org.apache.pig.builtin.PigStreaming)' failed with exit status: 1
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.runPipeline(PigGenericMapReduce.java:496)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.cleanup(PigGenericMapReduce.java:522)
        at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:179)
        at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:627)
        at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:389)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:167)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1557)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)

Error message from task (reduce) task_1404713698289_0021_r_000000
-----------------------------------------------------------------
ERROR 2090: Received Error while processing the reduce plan: 'test_plot.py (stdin-org.apache.pig.builtin.PigStreaming/stdout-org.apache.pig.builtin.PigStreaming)' failed with exit status: 1

org.apache.pig.backend.executionengine.ExecException: ERROR 2090: Received Error while processing the reduce plan: 'test_plot.py (stdin-org.apache.pig.builtin.PigStreaming/stdout-org.apache.pig.builtin.PigStreaming)' failed with exit status: 1
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.runPipeline(PigGenericMapReduce.java:496)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.cleanup(PigGenericMapReduce.java:522)
        at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:179)
        at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:627)
        at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:389)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:167)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1557)
================================================================================
Error message from task (reduce) task_1404713698289_0021_r_000000
-----------------------------------------------------------------
ERROR 2090: Received Error while processing the reduce plan: 'test_plot.py (stdin-org.apache.pig.builtin.PigStreaming/stdout-org.apache.pig.builtin.PigStreaming)' failed with exit status: 1

org.apache.pig.backend.executionengine.ExecException: ERROR 2090: Received Error while processing the reduce plan: 'test_plot.py (stdin-org.apache.pig.builtin.PigStreaming/stdout-org.apache.pig.builtin.PigStreaming)' failed with exit status: 1
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.runPipeline(PigGenericMapReduce.java:496)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.cleanup(PigGenericMapReduce.java:522)
        at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:179)
        at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:627)
        at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:389)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:167)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1557)
================================================================================
Error message from task (reduce) task_1404713698289_0021_r_000000
-----------------------------------------------------------------
ERROR 2090: Received Error while processing the reduce plan: 'test_plot.py (stdin-org.apache.pig.builtin.PigStreaming/stdout-org.apache.pig.builtin.PigStreaming)' failed with exit status: 1

org.apache.pig.backend.executionengine.ExecException: ERROR 2090: Received Error while processing the reduce plan: 'test_plot.py (stdin-org.apache.pig.builtin.PigStreaming/stdout-org.apache.pig.builtin.PigStreaming)' failed with exit status: 1
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.runPipeline(PigGenericMapReduce.java:496)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.cleanup(PigGenericMapReduce.java:522)
        at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:179)
        at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:627)
        at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:389)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:167)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1557)
================================================================================
Error message from task (reduce) task_1404713698289_0021_r_000000
-----------------------------------------------------------------
ERROR 2090: Received Error while processing the reduce plan: 'test_plot.py (stdin-org.apache.pig.builtin.PigStreaming/stdout-org.apache.pig.builtin.PigStreaming)' failed with exit status: 1

org.apache.pig.backend.executionengine.ExecException: ERROR 2090: Received Error while processing the reduce plan: 'test_plot.py (stdin-org.apache.pig.builtin.PigStreaming/stdout-org.apache.pig.builtin.PigStreaming)' failed with exit status: 1
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.runPipeline(PigGenericMapReduce.java:496)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.cleanup(PigGenericMapReduce.java:522)
        at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:179)
        at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:627)
        at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:389)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:167)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1557)
================================================================================
Pig Stack Trace
---------------
ERROR 2090: Received Error while processing the reduce plan: 'test_plot.py (stdin-org.apache.pig.builtin.PigStreaming/stdout-org.apache.pig.builtin.PigStreaming)' failed with exit status: 1

org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1066: Unable to open iterator for alias A. Backend error : Received Error while processing the reduce plan: 'test_plot.py (stdin-org.apache.pig.builtin.PigStreaming/stdout-org.apache.pig.builtin.PigStreaming)' failed with exit status: 1
        at org.apache.pig.PigServer.openIterator(PigServer.java:872)
        at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:774)
        at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:372)
        at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:198)
        at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:173)
        at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:84)
        at org.apache.pig.Main.run(Main.java:607)
        at org.apache.pig.Main.main(Main.java:156)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 2090: Received Error while processing the reduce plan: 'test_plot.py (stdin-org.apache.pig.builtin.PigStreaming/stdout-org.apache.pig.builtin.PigStreaming)' failed with exit status: 1
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.runPipeline(PigGenericMapReduce.java:496)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.cleanup(PigGenericMapReduce.java:522)
        at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:179)
        at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:627)
        at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:389)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:167)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1557)
================================================================================
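My Pig version is Apache Pig version 0.12.1.2.1.2.1-471, and I'm using Python 2.6.6.

I'm also very new to Pig, so I apologize if I've missed something silly.

I'd appreciate it if anyone could point me in the right direction.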

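Is Python installed on all of your nodes? I'd suggest starting in local mode and getting it working there first. Also, I'm not sure dumping a PNG file would work...

Yes, this is a single-node/test installation with Python 2.6.6 installed. I'm also not trying to dump a PNG; I'm trying to write a PNG to the local filesystem from within the Python script. The dump may be part of my problem, since it doesn't actually dump anything. Is there a way to end the script without a dump?

Can you post your Python script? Maybe start with a very simple Python script that just writes one line to stdout, to find out where the error is.

By the way, Pig adds parentheses to the data; can your Python script handle that?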
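Following the suggestion to start with a very simple script, here is a minimal pass-through sketch that could stand in for test_plot.py while debugging. The helper names are hypothetical. It copies every record to stdout, logs what it received to stderr, and tolerates records wrapped in parentheses in case they arrive in that form. Whether parentheses actually appear depends on how Pig serializes the stream, so that part is an assumption rather than documented behavior.

```python
import sys


def clean_fields(line):
    """Split one streamed record into fields, dropping optional outer parentheses."""
    text = line.strip()
    if text.startswith("(") and text.endswith(")"):
        text = text[1:-1]
    # Prefer tabs; fall back to commas for records shaped like '(5.0,5)'.
    delimiter = "\t" if "\t" in text else ","
    return text.split(delimiter)


def passthrough(lines, out, err):
    """Echo each record to `out`, log it to `err`, and return the record count."""
    count = 0
    for line in lines:
        if not line.strip():
            continue  # ignore blank lines
        fields = clean_fields(line)
        err.write("record: %r\n" % (fields,))
        out.write("\t".join(fields) + "\n")
        count += 1
    return count


if __name__ == "__main__":
    passthrough(sys.stdin, sys.stdout, sys.stderr)
```

If this script runs cleanly under `stream HIST through ...` and `dump A` prints rows, the streaming setup itself works and the failure is more likely inside the plotting code, for example a missing module or display on the task node.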