Python cannot import the pyspark stat module


Python 2.7, Apache Spark 2.1.0, Ubuntu 14.04. In the pyspark shell I get the following error:

>>> from pyspark.mllib.stat import Statistics
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: No module named stat

Similarly:

>>> from pyspark.mllib.linalg import SparseVector
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: No module named linalg

I already have numpy installed, and:

>>> sys.path
['', u'/tmp/spark-2d5ea25c-e2e7-490a-b5be-815e320cdee0/userFiles-2f177853-e261-46f9-97e5-01ac8b7c4987', '/usr/local/lib/python2.7/dist-packages/setuptools-18.1-py2.7.egg', '/usr/local/lib/python2.7/dist-packages/pyspark-2.1.0+hadoop2.7-py2.7.egg', '/usr/local/lib/python2.7/dist-packages/py4j-0.10.4-py2.7.egg', '/home/d066537/spark/spark-2.1.0-bin-hadoop2.7/python/lib/py4j-0.10.4-src.zip', '/home/d066537/spark/spark-2.1.0-bin-hadoop2.7/python', '/home/d066537', '/usr/lib/python2.7', '/usr/lib/python2.7/plat-x86_64-linux-gnu', '/usr/lib/python2.7/lib-tk', '/usr/lib/python2.7/lib-old', '/usr/lib/python2.7/lib-dynload', '/usr/local/lib/python2.7/dist-packages', '/usr/lib/python2.7/dist-packages', '/usr/lib/python2.7/dist-packages/PILcompat', '/usr/lib/python2.7/dist-packages/gst-0.10', '/usr/lib/python2.7/dist-packages/gtk-2.0', '/usr/lib/python2.7/dist-packages/ubuntu-sso-client']
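For reference, a quick way to check which pyspark copy the shell is actually picking up from this sys.path (a small diagnostic that is not part of the original post; it relies only on the package's standard __version__ and __file__ attributes):

>>> import pyspark
>>> pyspark.__version__
>>> pyspark.__file__  # if this points into /usr/local/lib/python2.7/dist-packages, the pip-installed egg is shadowing the pyspark bundled with Spark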
Solution

Remove the pip-installed pyspark:

sudo -H pip uninstall pyspark
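After uninstalling and restarting the pyspark shell, the import should resolve against the pyspark shipped with your Spark installation. A quick smoke test (a sketch of what to expect, using the shell's built-in sc SparkContext):

>>> from pyspark.mllib.stat import Statistics
>>> rdd = sc.parallelize([[1.0, 2.0], [3.0, 4.0]])
>>> Statistics.colStats(rdd).mean()  # column-wise means
array([ 2.,  3.])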

I had the same problem. The Python file stat.py does not seem to be in Spark 2.1.x, only in Spark 2.2.x, so it looks like you need to upgrade Spark to get the newer pyspark (but Zeppelin 0.7.x does not seem to work with Spark 2.2.x).
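To see whether an installed copy actually ships the module before deciding to upgrade, the package can be inspected on disk from the shell (a minimal sketch, not part of the original answer, assuming a normal on-disk install rather than a zipped egg; only standard-library calls are used):

>>> import os, pyspark, pyspark.mllib
>>> pyspark.__version__
>>> os.listdir(os.path.dirname(pyspark.mllib.__file__))  # look for stat (or stat.py) in this listing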