Can we run a pyspark Python script outside the pyspark shell?


My pyspark script, m.py, contains:

l = [1,2,3,4,7,5,6,7,8,9,0]
k = sc.parallelize(l)
type(k)
When I submit m.py:

SPARK_MAJOR_VERSION is set to 2, using Spark2
Traceback (most recent call last):
  File "/root/m.py", line 3, in <module>
    k = sc.parallelize(l)
NameError: name 'sc' is not defined
The error appears again:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "m.py", line 3, in <module>
    k = sc.parallelize(l)
NameError: name 'sc' is not defined

Yes, you can, but you have to make sure the PYTHONPATH is set correctly and initialize all the objects you want to use yourself:

from pyspark import SparkContext

sc = SparkContext.getOrCreate()
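
For example, a minimal self-contained version of m.py could look like the sketch below (the print and stop() calls are additions for illustration and are not part of the original script):

from pyspark import SparkContext

# Create the context explicitly instead of relying on the "sc" variable,
# which is only pre-defined inside the interactive pyspark shell.
sc = SparkContext.getOrCreate()

l = [1, 2, 3, 4, 7, 5, 6, 7, 8, 9, 0]
k = sc.parallelize(l)
print(type(k))   # k is a pyspark RDD

sc.stop()

You can then run it with spark-submit m.py instead of pasting it into the shell.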

In your driver program, make sure you create a SparkContext variable first. As far as I can see, you used sc directly without initializing it. Then you can run your program:

from pyspark import SparkContext

sc = SparkContext.getOrCreate()

import m   # import the module by its name; "import m.py" is not valid Python
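
Note that import m by itself does not hand the driver's local sc variable to the imported module; a Python module does not see the importing script's variables. As a sketch (assuming m.py otherwise stays as in the question), m.py should obtain the context itself:

# inside m.py -- getOrCreate() returns the same context the driver already created
from pyspark import SparkContext
sc = SparkContext.getOrCreate()

With that in place, submitting the driver script with spark-submit runs m.py as well.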