
Understanding PySpark reduce()

Tags: python, apache-spark, pyspark, bigdata, reduce

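I am learning Spark with PySpark and trying different things with the reduce() function to understand it properly, but I did something and got a result that makes no sense to me.

The examples I had run with reduce before were basically: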

>>> a = sc.parallelize(['a','b','c','d'])
>>> a.reduce(lambda x,y:x+y)
'abcd'

>>> a = sc.parallelize([1,2,3,4])
>>> a.reduce(lambda x,y:x+y)
10

>>> a = sc.parallelize(['azul','verde','azul','rojo','amarillo'])
>>> aV2 = a.map(lambda x:(x,1))
>>> aRes = aV2.reduceByKey(lambda x,y: x+y)
>>> aRes.collect()
[('rojo', 1), ('azul', 2), ('verde', 1), ('amarillo', 1)]
But then I tried:

>>> a = sc.parallelize(['a','b','c','d'])
>>> a.reduce(lambda x,y:x+x)
'aaaaaaaa'
I expected the result to be 'aaaa', not 'aaaaaaaa'.

I looked for an answer in the reduce() documentation, but I think I am missing something.


Thanks

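The x in your lambda function keeps changing: reduce() passes the result of the previous call back in as x, and your lambda ignores y. So the x going into each step is

a
aa
aaaa
and the last step returns
aaaaaaaa
. Because the expression is x+x, the number of characters doubles at every step.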
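The doubling can be traced locally without a Spark cluster. The sketch below uses Python's built-in functools.reduce as a stand-in for a left-to-right reduce over a single partition; the helper trace_reduce and the name double_left are hypothetical, introduced only for illustration. The second part is a simplified model of a two-partition reduce, assuming each partition is reduced on its own and the partial results are then combined in partition order. It is not Spark's actual scheduler, only a way to see why the result can depend on partitioning.

```python
from functools import reduce


def trace_reduce(func, items):
    """Fold items left to right and record the result of every call."""
    steps = []

    def wrapped(x, y):
        result = func(x, y)
        steps.append(result)
        return result

    return reduce(wrapped, items), steps


# Same lambda as in the question: y is ignored, x is doubled.
double_left = lambda x, y: x + x

final, steps = trace_reduce(double_left, ['a', 'b', 'c', 'd'])
print(steps)   # ['aa', 'aaaa', 'aaaaaaaa']
print(final)   # aaaaaaaa

# Simplified model (assumption): two partitions, each reduced locally,
# then the partial results reduced again in partition order.
partitions = [['a', 'b'], ['c', 'd']]
partials = [reduce(double_left, part) for part in partitions]
combined = reduce(double_left, partials)
print(partials)  # ['aa', 'cc']
print(combined)  # aaaa
```

The PySpark documentation describes RDD.reduce as taking a commutative and associative binary operator. lambda x, y: x + x is neither, so under this model the answer changes with how the data is split, which is why 'aaaaaaaa' should not be treated as a guaranteed output.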