Error using streaming Python in Pig
When I run the following commands:
REGISTER /home/hduser/Documents/ccc/Research/phd/code/ECentre/scripts/bags.py USING streaming_python
AS bp;
raw = LOAD 'hdfs:///user/hduser/smsCorpus_en_2012.04.30_all.xml' AS (line:chararray);
b = foreach raw generate bp.enumerate_bag(line);
I get:
Failed to parse: Pig script failed to parse:
<file /home/hduser/Documents/ccc/Research/phd/code/ECentre/scripts/nltk.pig, line 13, column
25> Failed to generate logical plan. Nested exception: org.apache.pig.backend.executionengine.ExecException:
ERROR 1070: Could not resolve bp.enumerate_bag using imports: [, java.lang., org.apache.pig.builtin.,
org.apache.pig.impl.builtin.]
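Can anyone tell me why?
My version is:
Apache Pig version 0.12.2-SNAPSHOT (r: unknown)
compiled Apr 29 2014, 13:40:45
Are those quotes actually present in the Python source?
No. Without them the code would not display properly (or so I thought!)
Thanks, I removed them for you and formatted it :)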
#!/usr/bin/env python
def enumerate_bag(input):
    output = []
    for rank, item in enumerate(input):
        output.append(tuple([rank] + list(item)))
    return output
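One likely cause of ERROR 1070 here is that the function carries no output schema. As far as I understand, Pig's streaming_python engine only registers functions that are decorated with @outputSchema, so an undecorated enumerate_bag is never made visible as bp.enumerate_bag. The sketch below shows what a decorated bags.py might look like. The schema string (and its field names rank and value) is an assumption for illustration and is not taken from the question. The fallback decorator is also an assumption, added only so the file can be imported and tested outside Pig, where pig_util is not available.

```python
#!/usr/bin/env python
# Sketch of bags.py with an output schema declared on the UDF.
try:
    # Provided by Pig when the script is registered USING streaming_python.
    from pig_util import outputSchema
except ImportError:
    # Fallback (assumption): a stand-in decorator for running the file locally.
    def outputSchema(schema):
        def decorator(func):
            func.outputSchema = schema
            return func
        return decorator


# Hypothetical schema: a bag of tuples, each holding a rank and one value.
@outputSchema("enumerated:{t:(rank:int, value:chararray)}")
def enumerate_bag(input):
    output = []
    # Prefix every tuple in the input bag with its position.
    for rank, item in enumerate(input):
        output.append(tuple([rank] + list(item)))
    return output
```

Called locally with a list of one-field tuples such as [("a",), ("b",)], this returns [(0, "a"), (1, "b")]. Note that the function expects a bag of tuples, while the script above passes a single chararray field, so the input may also need to be grouped into a bag first.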