Python: replacing the last element in an RDD

python apache-spark text replace

My RDD looks like this:

 uplherc.upl.com [01/Aug/1995:00:00:07] "GET /" 304 0
 uplherc.upl.com [01/Aug/1995:00:00:08] "GET /images/ksclogo-medium.gif" 304 0
 uplherc.upl.com [01/Aug/1995:00:00:08] "GET /images/MOSAIC-logosmall.gif" 304 0
 uplherc.upl.com [01/Aug/1995:00:00:08] "GET /images/USA-logosmall.gif" 304 0
 ix-esc-ca2-07.ix.netcom.com [01/Aug/1995:00:00:09] "GET /images/launch-logo.gif" 200 1713
 uplherc.upl.com [01/Aug/1995:00:00:10] "GET /images/WORLD-logosmall.gif" 304 0
 slppp6.intermind.net [01/Aug/1995:00:00:10] "GET /history/skylab/skylab.html" 200 1687
 piweba4y.prodigy.com [01/Aug/1995:00:00:10] "GET /images/launchmedium.gif" 200 11853
 slppp6.intermind.net [01/Aug/1995:00:00:11] "GET /history/skylab/skylab-small.gif" 200 9202

I want to check whether the last element (token) of each line is a hyphen and, if it is, replace it with zero. My code is as follows:

 def process_row(row):
     words = row.replace('"', '').split(' ')
     words.map(lambda row: 0 if x[5] == '-' else  x[5])
     return words

 nasa = (
     nasa_raw.flatMap(process_row)
 )

 for row in nasa.take(5):
     print(row)

When I try to run it, I get an error saying the object has no attribute 'map'.

What am I missing here?

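split returns a plain Python list, which has no map method. You can use the built-in map function instead:

 words = map(lambda x: 0 if x[5] == '-' else x[5], words)

Note that the lambda receives each token in turn, so x[5] indexes a character inside that token rather than the sixth token of the row. To test the last token of the row, index the list directly, for example with words[-1].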
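To make the idea concrete, here is a minimal sketch of how process_row could look if it indexes the token list directly. It assumes that the token to check is always the last one on the line and that the tokens should stay strings (so the hyphen becomes '0' rather than the integer 0). The second sample line is made up for illustration and does not come from the question's data.

```python
def process_row(row):
    """Split one log line into tokens; replace a trailing '-' with '0'."""
    words = row.replace('"', '').split(' ')
    # Index the list itself: words[-1] is the last token of the line.
    if words[-1] == '-':
        words[-1] = '0'
    return words


# Plain-Python check, no Spark needed.
sample = [
    'uplherc.upl.com [01/Aug/1995:00:00:07] "GET /" 304 0',
    # Hypothetical line with a hyphen as the last token (not from the question).
    'example.host [01/Aug/1995:00:00:12] "GET /missing.gif" 404 -',
]
parsed = [process_row(line) for line in sample]

for tokens in parsed:
    print(tokens)

# With the RDD from the question (nasa_raw), this would be applied as:
#   nasa = nasa_raw.map(process_row)
# map keeps one token list per log line, whereas flatMap would flatten
# every row into a single stream of individual tokens.
```

Using map rather than flatMap is a design choice here: if the goal is one record per log line, map preserves that shape, while flatMap is the right tool only when a flat stream of tokens is actually wanted.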