Python word count MapReduce: error when reading from stdin

python hadoop mapreduce

I'm trying to write a basic word count MapReduce job in Python. Here is the mapper code:

#!/usr/bin/env python

import sys

# input comes from STDIN (standard input)
for line in sys.stdin:
    try:
        # remove leading and trailing whitespace
        line = line.strip()
        # split the line into words
        words = line.split()
        # loop over words
        for word in words:
            # write out word and trivial count
            print '%s\t%s' % (word.strip(), 1)
    except:
        pass

I'm running it on Ulysses from Project Gutenberg.

When I run it on the Hadoop cluster, I get the following error message:

    File "<stdin>", line 1
    The Project Gutenberg EBook of Ulysses, by James Joyce
              ^
SyntaxError: invalid syntax

I don't know what's going wrong. Any help?

It looks like you may be trying to run the book itself as a Python file. Perhaps you are passing the arguments to something in the wrong order.
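To see why that would give exactly this error, here is a small sketch that reproduces it locally. It assumes Python 3.7 or later for `subprocess.run(..., capture_output=True, text=True)`; `BOOK_LINE` is just the first line quoted in the error message. When the interpreter is started with no script argument and stdin is not a terminal, it treats stdin as the program to run.

```python
import subprocess
import sys

# First line of the book, copied from the error message in the question.
BOOK_LINE = "The Project Gutenberg EBook of Ulysses, by James Joyce\n"

# Start the interpreter WITHOUT a script argument. With nothing to run,
# it reads the program from stdin, so the book text gets parsed as code.
result = subprocess.run(
    [sys.executable],
    input=BOOK_LINE,
    capture_output=True,
    text=True,
)

print("exit code:", result.returncode)
print(result.stderr)
```

The captured stderr should name `<stdin>` as the file and end in a SyntaxError, which matches what the cluster reported.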

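Oh, are you running Python 3?

Python 3 changed the syntax of print: it is now a function, so the call has to be written as print(...).

Also, you can use .format() like this:

print('{word}\t{value}'.format(word=word.strip(), value=1))

which can be simplified to:

print('{}\t{}'.format(word.strip(), 1))

Also, if you have a "line" like "The Project Gutenberg EBook of Ulysses, by James Joyce", you may also want to strip out punctuation such as , and ; from each word.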
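Putting those suggestions together, a mapper could look roughly like the sketch below. This is only one way to do it: the helper names `map_line` and `main` are made up for illustration, and stripping with `string.punctuation` only removes ASCII punctuation at the start and end of each token.

```python
#!/usr/bin/env python
import string
import sys


def map_line(line):
    """Yield (word, 1) for each whitespace-separated token in the line."""
    for token in line.split():
        # Drop leading/trailing ASCII punctuation, e.g. "Ulysses," -> "Ulysses".
        word = token.strip(string.punctuation)
        if word:  # skip tokens that were nothing but punctuation
            yield word, 1


def main(stream=sys.stdin):
    for line in stream:
        for word, count in map_line(line):
            # print(...) with a single argument also parses under Python 2.
            print('{}\t{}'.format(word, count))


if __name__ == '__main__':
    main()
```

Keeping the per-line logic in its own function makes it easy to check on a sample line without going through Hadoop at all.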

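Yes, that was the problem. The execution string I was using was wrong, so Python was trying to read the input as a script. I had to wrap the mapper in quotes, e.g. -mapper "python mapper.py".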
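For anyone wondering why the quotes matter, the sketch below uses `shlex.split` to mimic how a POSIX shell breaks a command line into arguments. The original failing command is not shown in this thread, so the unquoted form here is only a guess at what it looked like.

```python
import shlex

# Without quotes the shell passes "python" and "mapper.py" as separate arguments.
unquoted = shlex.split('-mapper python mapper.py')

# With quotes the whole mapper command arrives as a single argument.
quoted = shlex.split('-mapper "python mapper.py"')

print(unquoted)  # ['-mapper', 'python', 'mapper.py']
print(quoted)    # ['-mapper', 'python mapper.py']
```

In the unquoted case the value that follows -mapper is just python, which would be consistent with the interpreter starting with no script and reading the book from stdin.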