Python PySpark: referencing self inside map() and reduce() functions
As mentioned there, I should avoid calling self inside a map function. Building on that, I have two questions. Let's use the same code described there:
    class C0(object):
        def func0(self, arg):  # added self
            ...

        def func1(self, rdd):  # added self
            func = self.func0
            result = rdd.map(lambda x: func(x))
Is there a difference between

    result = rdd.map(lambda x: func(x))

and

    result = rdd.map(func)

? In particular, how does Spark handle the case where I previously assigned func = self.func0? And should I do func2 = self.func2 inside func0? I think this is what I want. Thanks a lot:
    def func2(self, arg):
        ...

    def func0(self, arg):  # added self
        self.func2(arg)
        ...

    def func1(self, rdd):  # added self
        func = self.func0
        result = rdd.map(lambda x: func(x))
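For concreteness, here is a minimal, runnable sketch of the pattern the two questions are about, assuming a local Spark installation; the class name Worker, the doubling logic, and the SparkContext setup are placeholders of mine, not part of the original code:

    from pyspark import SparkContext

    class Worker(object):
        def func0(self, arg):
            return arg * 2  # placeholder logic

        def func1(self, rdd):
            func = self.func0
            # `func` is a bound method, so it holds a reference to `self`;
            # both forms below therefore ship the whole Worker instance to
            # the executors when Spark pickles the task closure.
            via_lambda = rdd.map(lambda x: func(x)).collect()
            direct = rdd.map(func).collect()
            return via_lambda, direct

    if __name__ == "__main__":
        sc = SparkContext("local[2]", "self-in-map-sketch")
        try:
            w = Worker()
            print(w.func1(sc.parallelize([1, 2, 3])))  # ([2, 4, 6], [2, 4, 6])
        finally:
            sc.stop()

As far as I can tell, rdd.map(func) and rdd.map(lambda x: func(x)) behave the same in this respect: in both cases the serialized closure contains the bound method, and a bound method carries its instance, so self is pickled either way.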