How to write to a blob from the worker nodes in HDInsight PySpark


Sample code:

from pyspark.sql.functions import pandas_udf, PandasUDFType

@pandas_udf(return_schema, functionType=PandasUDFType.GROUPED_MAP)
def g(df):
    # summarize the df and write to the Azure blob storage for the HDInsight cluster
    ...

df = spark.createDataFrame(...)
df.groupBy("x").apply(g).show()

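Inside the function g, we want to summarize df and then write the summarized data to a blob (wasbs:///etc) in the Azure storage container attached to the HDInsight cluster. How can we do this?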
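One possible direction, offered only as a minimal sketch and not as a confirmed answer: since g runs on the worker nodes, it cannot use the driver's spark session, but it may be able to stage the summary in a local temporary file and copy it out with the Hadoop command-line client. The sketch assumes the hadoop CLI is on each worker's PATH and that the cluster's storage account is already configured for wasbs:// URIs. The helper name upload_text_to_blob and the destination path under wasbs:///summaries/ are hypothetical.

```python
import os
import subprocess
import tempfile


def upload_text_to_blob(text, dest_uri, run=subprocess.run):
    """Write `text` to a local temp file, then copy it to `dest_uri`.

    Assumption: the `hadoop` CLI is available on the worker and the
    cluster's storage account is configured for wasbs:// URIs.
    `run` is injectable so the helper can be tried without a cluster.
    """
    fd, local_path = tempfile.mkstemp(suffix=".csv")
    try:
        with os.fdopen(fd, "w") as handle:
            handle.write(text)
        # -f overwrites an existing destination, so a retried task does not fail.
        cmd = ["hadoop", "fs", "-put", "-f", local_path, dest_uri]
        run(cmd, check=True)
    finally:
        # Always clean up the staging file on the worker.
        os.remove(local_path)
    return cmd


# Inside the grouped-map UDF from the question, the call might look like
# this (hypothetical summary logic and destination path):
#
#     def g(df):
#         summary = df.describe().reset_index()
#         key = df["x"].iloc[0]
#         upload_text_to_blob(summary.to_csv(index=False),
#                             "wasbs:///summaries/x=%s.csv" % key)
#         return summary  # must still match return_schema
```

Writing one blob per group key keeps concurrent tasks from overwriting each other's output. An alternative with the same shape would be to replace the subprocess call with an Azure Storage SDK upload, provided that package is installed on every worker.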