Python: How do I create value pairs with a lambda in pyspark?
I am trying to transform a pyspark RDD as follows.

Before:
[
[('169', '5'), ('2471', '6'), ('48516', '10')],
[('58', '7'), ('163', '7')],
[('172', '5'), ('186', '4'), ('236', '6')]
]
After:
[
[('169', '5'), ('2471', '6')],
[('169', '5'), ('48516', '10')],
[('2471', '6'), ('48516', '10')],
[('58', '7'), ('163', '7')],
[('172', '5'), ('186', '4')],
[('172', '5'), ('236', '6')],
[('186', '4'), ('236', '6')]
]
The idea is to go through each row and create new rows from its elements, pairwise. I tried to find a solution on my own through lambda tutorials, but it didn't work out. Could I ask for some help? I apologize if this duplicates another question. Thanks.

I would use flatMap with itertools.combinations:
from itertools import combinations
rdd.flatMap(lambda xs: combinations(xs, 2))
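As a quick sanity check without a Spark cluster: `flatMap` applies the function to each row and flattens the results, which in plain Python is equivalent to chaining the per-row combinations together. A minimal sketch, using the sample data from the question:

```python
from itertools import chain, combinations

# The "before" rows from the question
rows = [
    [('169', '5'), ('2471', '6'), ('48516', '10')],
    [('58', '7'), ('163', '7')],
    [('172', '5'), ('186', '4'), ('236', '6')],
]

# flatMap over rows == chain each row's 2-element combinations into one flat list
pairs = list(chain.from_iterable(combinations(xs, 2) for xs in rows))

for p in pairs:
    print(p)
# The 3-element rows each yield 3 pairs and the 2-element row yields 1,
# giving the 7 pairs shown in the "after" output.
```

Note that each emitted pair is a tuple of tuples, e.g. `(('169', '5'), ('2471', '6'))`, rather than a list; wrap it in `list(...)` inside the lambda if list output is required.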
That's exactly what I needed. Thanks!