Spark statistics functions in Python

I asked a question about statistics functions and got an answer, but I am looking for another approach:

What I find strange is that this works:

myData = dataSplit.map(lambda arr: (arr[1]))
myData2 = myData.map(lambda line: line.split(',')).map(lambda fields: ("Column", float(fields[0]))).groupByKey()
stats[1] = myData2.map(lambda (Column, values): (min(values))).collect()
But when I add this line:

stats[4] = myData2.map(lambda (Column, values): (values)).variance()
it fails.

So I printed some intermediate values:

myData = dataSplit.map(lambda arr: (arr[1]))
print myData.collect()
myData2 = myData.map(lambda line: line.split(',')).map(lambda fields: ("Column", float(fields[0]))).groupByKey()
print myData2.map(lambda (Column, values): (values)).collect()
Printing myData gives:

[u'18964', u'18951', u'18950', u'18949', u'18960', u'18958', u'18956', u'19056', u'18948', u'18969', u'18961', u'18959', u'18957', u'18968', u'18966', u'18967', u'18971', u'18972', u'18353', u'18114', u'18349', u'18348', u'18347', u'18346', u'19053', u'19052', u'18305', u'18306', u'18318', u'18317']
Printing myData2 gives:

[<pyspark.resultiterable.ResultIterable object at 0x7f3f7d3e0710>]
[]
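
The `ResultIterable` in that output hints at a shape problem: after `groupByKey()` there is one record per key, so mapping each record to its values leaves a single element that is itself a collection rather than a number. The plain-Python sketch below only imitates that shape with ordinary lists. It does not use Spark, and the three-item `pairs` sample and all variable names are made up for illustration.

```python
from statistics import pvariance

# Hypothetical stand-in for the (key, value) pairs that exist before groupByKey()
pairs = [("Column", 18964.0), ("Column", 18951.0), ("Column", 18950.0)]

# Imitate groupByKey(): one record per key, holding all of that key's values
grouped = {}
for key, value in pairs:
    grouped.setdefault(key, []).append(value)
grouped_records = list(grouped.items())

# Mapping each grouped record to its values yields ONE element, and it is a list
values_per_record = [values for _key, values in grouped_records]
print(len(values_per_record))  # 1

# A numeric reduction needs a flat sequence of numbers instead
flat_values = [value for _key, value in pairs]
print(pvariance(flat_values))
```

Under this reading, `min(values)` works because it is applied inside the map to the grouped collection, while a statistic called on the whole mapped result has no plain numbers to work with.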
Solved by dropping groupByKey() and mapping straight to the float values:

print myData.map(lambda line: line.split(',')).map(lambda fields: ("Column", float(fields[0]))).map(lambda (column, value): value).stdev()
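
To sanity-check the number Spark reports, the same statistic can be computed locally on a small sample. This is only a sketch: it assumes the input is exactly the list printed for myData above, and it relies on `RDD.variance()` and `RDD.stdev()` being the population forms (Spark exposes the sample forms separately as `sampleVariance()` and `sampleStdev()`). The names `raw` and `values` are illustrative.

```python
from statistics import pstdev, pvariance

# Values copied from the myData printout above
raw = [
    '18964', '18951', '18950', '18949', '18960', '18958', '18956', '19056',
    '18948', '18969', '18961', '18959', '18957', '18968', '18966', '18967',
    '18971', '18972', '18353', '18114', '18349', '18348', '18347', '18346',
    '19053', '19052', '18305', '18306', '18318', '18317',
]
values = [float(s) for s in raw]

# Population statistics, to compare against RDD.variance() and RDD.stdev()
print(pvariance(values))
print(pstdev(values))
```

For data too large to collect, this check is only practical on a sample.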