IPython notebook with Spark errors when creating a SparkContext


I am testing Turi with this example on my MacBook (OS X 10.10.5).

When I get to this step:

# Set up the SparkContext object
# this can be 'local' or 'yarn-client' in PySpark
# Remember if using yarn-client then all the paths should be accessible
# by all nodes in the cluster.
sc = SparkContext('local')
the following error appears:

---------------------------------------------------------------------------
Exception                                 Traceback (most recent call last)
<ipython-input-12-dc1befb4186c> in <module>()
      3 # Remember if using yarn-client then all the paths should be accessible
      4 # by all nodes in the cluster.
----> 5 sc = SparkContext()

/usr/local/Cellar/apache-spark/1.6.2/libexec/python/pyspark/context.pyc in __init__(self, master, appName, sparkHome, pyFiles, environment, batchSize, serializer, conf, gateway, jsc, profiler_cls)
    110         """
    111         self._callsite = first_spark_call() or CallSite(None, None, None)
--> 112         SparkContext._ensure_initialized(self, gateway=gateway)
    113         try:
    114             self._do_init(master, appName, sparkHome, pyFiles, environment, batchSize, serializer,

/usr/local/Cellar/apache-spark/1.6.2/libexec/python/pyspark/context.pyc in _ensure_initialized(cls, instance, gateway)
    243         with SparkContext._lock:
    244             if not SparkContext._gateway:
--> 245                 SparkContext._gateway = gateway or launch_gateway()
    246                 SparkContext._jvm = SparkContext._gateway.jvm
    247 

/usr/local/Cellar/apache-spark/1.6.2/libexec/python/pyspark/java_gateway.pyc in launch_gateway()
     92                 callback_socket.close()
     93         if gateway_port is None:
---> 94             raise Exception("Java gateway process exited before sending the driver its port number")
     95 
     96         # In Windows, ensure the Java child processes do not linger after Python has exited.

Exception: Java gateway process exited before sending the driver its port number
Does anyone know how to fix this error?


Thanks.

This can happen for two reasons:

  • The environment variable SPARK_HOME may be pointing to the wrong path.
  • Set export PYSPARK_SUBMIT_ARGS="--master local[2]" - this is the
    configuration you want PySpark to start with.
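The two checks above can be sketched as a small sanity check to run in the notebook before creating the SparkContext. This is only an illustrative helper (the function name `check_spark_env` is mine, not part of PySpark); it reports which of the two settings is missing or wrong.

```python
import os

def check_spark_env():
    """Return a list of problems with the two env vars discussed above."""
    problems = []
    spark_home = os.environ.get("SPARK_HOME")
    if spark_home is None:
        problems.append("SPARK_HOME is not set")
    elif not os.path.isdir(spark_home):
        # A stale path (e.g. after a Homebrew upgrade) also breaks the gateway.
        problems.append("SPARK_HOME points to a missing directory: " + spark_home)
    if "PYSPARK_SUBMIT_ARGS" not in os.environ:
        problems.append('PYSPARK_SUBMIT_ARGS is not set (e.g. "--master local[2]")')
    return problems

for problem in check_spark_env():
    print(problem)
```

If the list is empty, both settings look plausible and `SparkContext('local')` has a better chance of launching the Java gateway.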

  • Is the SPARK_HOME path correct? Did you set
    PYSPARK_SUBMIT_ARGS="--master spark://:" in your environment
    variables? That may be your missing port number.
  • SPARK_HOME is correct; I did not configure PYSPARK_SUBMIT_ARGS.
    What should I specify in that case?
  • Try export PYSPARK_SUBMIT_ARGS="--master local[2]"
  • @KartikKannapur I think that did the trick. Could you edit your
    answer so I can accept it? Thank you very much.
  • Happy to help. I will add it as an answer:
    # added by Anaconda2 4.1.1 installer
    export PATH="/Users/me/anaconda/bin:$PATH"
    
    export SCALA_HOME=/usr/local/Cellar/scala/2.11.8/libexec
    export SPARK_HOME=/usr/local/Cellar/apache-spark/1.6.2/libexec
    export PYTHONPATH=$SPARK_HOME/python/pyspark:$PYTHONPATH
    export PYTHONPATH=$SPARK_HOME/python/lib/py4j-0.9-src.zip:$PYTHONPATH 
    export PYTHONPATH=$SPARK_HOME/python/:$PYTHONPATH
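If editing ~/.bash_profile and restarting the notebook is inconvenient, the same variables can be set from inside the notebook itself, before importing pyspark. This is a sketch under the same assumptions as the answer above (Spark 1.6.2 installed via Homebrew); adjust the paths to your installation, and note that some Spark versions expect PYSPARK_SUBMIT_ARGS set this way to end with "pyspark-shell".

```python
import os

# Assumed Homebrew install path from the answer above; adjust as needed.
os.environ["SPARK_HOME"] = "/usr/local/Cellar/apache-spark/1.6.2/libexec"
# "pyspark-shell" at the end is required by some Spark versions when
# this variable is set programmatically rather than in the shell.
os.environ["PYSPARK_SUBMIT_ARGS"] = "--master local[2] pyspark-shell"

# Only after the environment is set:
# from pyspark import SparkContext
# sc = SparkContext('local')
```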