Apache Spark: executing PySpark code from a Java/Scala application
Is there a way to execute PySpark code from a Java/Scala application on an existing SparkSession? Specifically, given a piece of PySpark code that receives and returns a PySpark DataFrame, is there a way to submit it to the Java/Scala SparkSession and get the output DataFrame back:
String pySparkCode = "def my_func(input_df):\n" +
                     "    from pyspark.sql.functions import *\n" +
                     "    return input_df.selectExpr(...)\n" +
                     "        .drop(...)\n" +
                     "        .withColumn(...)\n";

SparkSession spark = SparkSession.builder().master("local").getOrCreate();
Dataset<Row> inputDF = spark.sql("SELECT * from my_table");
Dataset<Row> outputDf = spark.<SUBMIT_PYSPARK_METHOD>(pySparkCode, inputDF);
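There is no built-in `<SUBMIT_PYSPARK_METHOD>` on `SparkSession`. One workaround, sketched below under assumptions, is to exchange DataFrames with a Python worker through shared storage (e.g. Parquet paths) and launch the PySpark script via `spark-submit` as a subprocess: the Java side writes `inputDF` to a path, the Python script reads it, applies `my_func`, and writes its result to an output path the Java side then reads back. The class and method names here (`PySparkBridge`, `buildSubmitCommand`) are illustrative, not a real Spark API:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: bridge Java and PySpark through Parquet files on
// shared storage, invoking the Python script via spark-submit.
public class PySparkBridge {

    // Builds the spark-submit command line for a given script and I/O paths.
    // The Java caller is expected to have written inputDF to inPath first,
    // e.g. inputDF.write().parquet(inPath), and to read outPath afterwards.
    static List<String> buildSubmitCommand(String script, String inPath, String outPath) {
        List<String> cmd = new ArrayList<>();
        cmd.add("spark-submit");
        cmd.add(script);   // the PySpark script wrapping my_func
        cmd.add(inPath);   // where the Java side wrote inputDF as Parquet
        cmd.add(outPath);  // where the Python side writes its result
        return cmd;
    }

    public static void main(String[] args) throws Exception {
        List<String> cmd = buildSubmitCommand("my_func.py", "/tmp/in.parquet", "/tmp/out.parquet");
        // In a real application, run it and wait for completion:
        // new ProcessBuilder(cmd).inheritIO().start().waitFor();
        System.out.println(String.join(" ", cmd));
    }
}
```

The trade-off is that the Python code runs in its own SparkSession rather than the existing Java one, so the exchange goes through storage instead of memory; for true in-session interop the usual direction is the reverse (PySpark driving the JVM through Py4J).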