
Replacing double quotes with a blank in Spark (Python)


I am trying to remove the double quotes from a text file like this:

in24.inernebr.com [01/Aug/1995:00:00:01] "GET/shutter/missions/sts-68/news/sts-68-mcc-05.txt" 200 1839
uplherc.upl.com [01/Aug/1995:00:00:07] "GET/" 304 0
uplherc.upl.com [01/Aug/1995:00:08] "GET/images/ksclogo medium.gif" 304 0
uplherc.com [01/Aug/1995:00:00:08] "GET/images/MOSAIC small.gif"
upllogherc.upl.com [01/Aug/1995:00:00:08] "GET/images/USA logosall.gif" 304 0
ix-esc-ca2-07.ix.netcom.com [01/Aug/1995:00:09] "GET/images/launch logo.gif" 200 1713
uplherc.upl.com [01/Aug/1995:00:10] "GET/images/WORLD logosall.gif" 304 0
slppp6.interndd.net [01/Aug/1995:00:00:10] "GET/history/skylab/skylab/skylab.html" 200 1687
-WebA4y.com [01/Aug/1995:00:00:10] "GET/images/launchmedium.gif" 200 11853
slppp6.internd.net [01/Aug/1995:00:00:11] "GET/history/skylab/skylab small.gif" 200 9202

The code I am trying is:

def process_row(row):
    row.replace('""', '')
    row.split('\t')

nasa = nasa_raw.map(process_row)
for row in nasa.take(10):
    print(row)
The result when I run this code is:

None None None None None None None None None None
What am I doing wrong?

Two things:

You are missing the `return` statement, and in the `replace` call you should match a single `"` character rather than `""`.

def process_row(row):
    # Without an explicit return, a Python function returns None
    return row.replace('"', '')

file = open('filename')
for row in file.readlines():
    print(row)
    print(process_row(row))

Your function does not return anything, which is why you get all those `None`s. Test your code with plain Python first, before moving on to Spark/RDDs.
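Putting it together, here is a minimal sketch of the corrected function used the way the question intends. The Spark calls are shown only in comments (they assume `nasa_raw` is an existing RDD of raw log lines), so the function itself can be checked with plain Python:

```python
def process_row(row):
    # Strip the double quotes, then split the record into fields
    return row.replace('"', '').split('\t')

# With Spark this would be applied as in the question
# (nasa_raw is assumed to be an RDD of raw log lines):
#   nasa = nasa_raw.map(process_row)
#   for row in nasa.take(10):
#       print(row)

# Plain-Python check of the same function on one sample line:
line = 'uplherc.upl.com\t[01/Aug/1995:00:00:07]\t"GET /"\t304\t0'
print(process_row(line))
```

Because `process_row` now returns the list of fields, `map` produces real values instead of `None`.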