Python 2.7: Py4JJavaError in PySpark

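I am using the Python API to develop with Spark. My code is below. When I execute the line wordCount.first(), I get ValueError: need more than 1 value to unpack. Any insight into this error would be appreciated. Thanks.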

#create an RDD with textFile method
text_data_file=sc.textFile('/resources/yelp_labelled.txt')

#import the required library for word count operation
from operator import add
#Use filter to return RDD for words length greater than zero
wordCountFilter=text_data_file.filter(lambda x:len(x)>0)
#use flat map to split each line into words
wordFlatMap=wordCountFilter.flatMap(lambda x: x.split())
#map each key with value 1 using map function
wordMapper=wordFlatMap.flatMap(lambda x:(x,5))
#Use reducebykey function to reduce above mapped keys
#returns the key-value pairs by adding values for similar keys
wordCount=wordMapper.reduceByKey(add)
#view the first element
wordCount.first()
Your mistake is here:

wordMapper=wordFlatMap.flatMap(lambda x:(x,5))
It should be:

wordMapper=wordFlatMap.map(lambda x:(x,5))
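Otherwise flatMap flattens each returned tuple, so you simply emit x and 5 as individual values instead of as one (x, 5) pair. reduceByKey expects every element to be a key-value pair, so Spark will try to unpack x and fail whenever its length is not equal to 2. If x does happen to have length 2, it will then try to unpack 5 and fail.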
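The difference between the two calls can be sketched without a Spark cluster. The snippet below is only a plain-Python emulation, not PySpark itself: the list comprehension stands in for map, itertools.chain stands in for flatMap, and reduce_by_key is a hypothetical helper that unpacks each record the way a key-value reduction has to. The sample words are made up for illustration.

```python
from itertools import chain

# Made-up sample data standing in for the words produced by the split step.
words = ["good", "food", "good"]


def to_pair(word):
    # Same shape as the lambda in the question: (word, 5).
    return (word, 5)


# Emulates map: each returned tuple stays a single element.
mapped = [to_pair(w) for w in words]

# Emulates flatMap: each returned tuple is flattened into its items.
flat_mapped = list(chain.from_iterable(to_pair(w) for w in words))


def reduce_by_key(records):
    # Hypothetical stand-in for a key-value reduction with addition.
    totals = {}
    for record in records:
        key, value = record  # fails unless the record has exactly 2 items
        totals[key] = totals.get(key, 0) + value
    return totals


counts = reduce_by_key(mapped)

try:
    reduce_by_key(flat_mapped)
    flat_error = None
except (ValueError, TypeError) as exc:
    flat_error = exc

print(mapped)
print(flat_mapped)
print(counts)
print(type(flat_error).__name__)
```

With the mapped list every record is a pair, so the reduction succeeds. With the flattened list the first record is the bare string "good", which cannot be unpacked into a key and a value, and this is the same kind of unpacking failure the question reports.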