Hadoop: Error running a MapReduce streaming job with Python


I am trying to run this mapper and reducer code (disclaimer: it is part of a training course solution).

mapper.py

reducer.py

It keeps throwing this error:

UNDEF/bin/hadoop job  -Dmapred.job.tracker=0.0.0.0:8021 -kill job_201404041914_0012
14/04/04 23:13:53 INFO streaming.StreamJob: Tracking URL: http://0.0.0.0:50030/jobdetails.jsp?jobid=job_201404041914_0012
14/04/04 23:13:53 ERROR streaming.StreamJob: Job not successful. Error: NA
14/04/04 23:13:53 INFO streaming.StreamJob: killJob...
Streaming Command Failed!
Traceback (most recent call last):
  File "r.py", line 14, in <module>
    print transactions, "\t", salesTotal
NameError: name 'transactions' is not defined
I have tried invoking the Python scripts explicitly, and I have also tried specifying the Python interpreter, i.e. /usr/bin/env python.


Any idea what is going wrong?

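The job fails because your reducer.py has an error. Strictly speaking it is not a syntax error but a NameError raised at runtime, as the traceback shows.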

The problem is on this line:

print transactions, "\t", salesTotal
No variables named transactions and salesTotal are defined at the point where this line runs.
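
Since the reducer source is not shown in the question, the following is only a hypothetical sketch of a streaming reducer in which every name is assigned before it is printed. The input format (key-sorted lines of "key<TAB>amount") and the names reduce_sales, current_key, sales_total and main are assumptions for illustration, not the course's actual solution.

```python
import sys


def reduce_sales(lines):
    """Yield (key, total) pairs from key-sorted "key<TAB>amount" lines."""
    # Both names are assigned before the loop, so they exist even for empty input.
    current_key = None
    sales_total = 0.0
    for line in lines:
        parts = line.strip().split("\t")
        if len(parts) != 2:
            continue  # skip malformed lines instead of crashing the task
        key, amount = parts
        if current_key is not None and key != current_key:
            # The key changed: emit the finished group and start a new total.
            yield current_key, sales_total
            sales_total = 0.0
        current_key = key
        sales_total += float(amount)
    if current_key is not None:
        # Emit the last group once the input is exhausted.
        yield current_key, sales_total


def main(stream_in=sys.stdin, stream_out=sys.stdout):
    """Read from stream_in and write "key<TAB>total" lines to stream_out."""
    for key, total in reduce_sales(stream_in):
        stream_out.write("{0}\t{1}\n".format(key, total))

# When saved as the reducer script, call main() at the bottom of the file.
```

The sketch avoids the print statement so that it behaves the same under Python 2 and Python 3. The point is only that a name such as transactions must be bound on every code path before the line that prints it.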

If you run the script locally, you get the following error:

Traceback (most recent call last):
  File "r.py", line 14, in <module>
    print transactions, "\t", salesTotal
NameError: name 'transactions' is not defined
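
A script error like this can usually be caught before submitting the job by feeding the script some sample input on stdin, which is how Hadoop Streaming drives it. The helper below is a minimal sketch of that idea. The function name run_script_locally is hypothetical, and it assumes the script can run under the interpreter passed in (by default, the one running the helper).

```python
import subprocess
import sys


def run_script_locally(script_path, input_text, interpreter=sys.executable):
    """Run a streaming script with input_text on stdin.

    Returns (exit_code, stdout, stderr) so a traceback can be inspected
    without going through the Hadoop job logs.
    """
    result = subprocess.run(
        [interpreter, script_path],
        input=input_text,      # sent to the script's stdin
        capture_output=True,   # collect stdout and stderr (Python 3.7+)
        text=True,             # work with str instead of bytes
    )
    return result.returncode, result.stdout, result.stderr
```

Calling run_script_locally("r.py", sample_input) with a few lines of mapper-style output would be expected to return a non-zero exit code and the NameError traceback in stderr. For a Python 2 script such as the one in the question, pass the Python 2 interpreter path as the interpreter argument.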

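Jay Setti: Can you share the command you executed? The information provided is not enough to debug this; please check the log files for more details.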
print trans, "\t", sTotal