How to specify individual sheet names when writing multiple org.apache.spark.sql.Dataset objects to an .xls file using crealytics/spark-excel in Java?


I am trying to write several Java Datasets into a single Excel file containing multiple sheets, using the crealytics/spark-excel library:

<dependency>
            <groupId>com.crealytics</groupId>
            <artifactId>spark-excel_2.11</artifactId>
            <version>0.13.0</version>
</dependency>

How can I give each of these Excel sheets its own name?

Here is what I am trying to do:

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SaveMode;
import org.apache.spark.sql.SparkSession;

SparkSession spark = SparkSession.builder().appName("LineQuery").getOrCreate();

Dataset<Row> df1 = spark.sql("SELECT * FROM my_table1");
Dataset<Row> df2 = spark.sql("SELECT * FROM my_table2");

df1.write().format("com.crealytics.spark.excel")
        .option("sheetName", "My Sheet 1")
        .option("header", "true")
        .save("hdfs://127.0.0.1:9000/var/www/" + outFile + ".xls");

df2.write().format("com.crealytics.spark.excel")
        .option("sheetName", "My Sheet 2")
        .option("header", "true")
        .mode(SaveMode.Append)
        .save("hdfs://127.0.0.1:9000/var/www/" + outFile + ".xls");

Use the dataAddress option instead. In spark-excel 0.13, the target sheet is named through dataAddress (for example "My Sheet 1[#All]") rather than sheetName.

For example (in the PySpark shell):

>>> df = spark.createDataFrame([(11, 12), (21, 22)])
>>> df.show()
+---+---+
| _1| _2|
+---+---+
| 11| 12|
| 21| 22|
+---+---+
>>> df.where("_1 == 11").write.format("com.crealytics.spark.excel").option("dataAddress", "my sheet 1[#All]").option("header", "true").mode("append").save("/tmp/excel-df.xlsx")
>>> df.where("_1 == 21").write.format("com.crealytics.spark.excel").option("dataAddress", "my sheet 2[#All]").option("header", "true").mode("append").save("/tmp/excel-df.xlsx")
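The equivalent in Java, matching the question's setup, would look roughly like the sketch below. This is untested and assumes spark-excel 0.13+ on the classpath, a running Spark environment, and a hypothetical output path /tmp/excel-df.xlsx; note that every write into the same workbook uses SaveMode.Append so earlier sheets are not overwritten.

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SaveMode;
import org.apache.spark.sql.SparkSession;

public class MultiSheetExcel {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("LineQuery")
                .getOrCreate();

        Dataset<Row> df1 = spark.sql("SELECT * FROM my_table1");
        Dataset<Row> df2 = spark.sql("SELECT * FROM my_table2");

        // dataAddress names the target sheet; the [#All] suffix means
        // the whole sheet, including the header row.
        df1.write().format("com.crealytics.spark.excel")
                .option("dataAddress", "My Sheet 1[#All]")
                .option("header", "true")
                .mode(SaveMode.Append)
                .save("/tmp/excel-df.xlsx");

        // Appending a second dataset into the same workbook adds a
        // second, separately named sheet.
        df2.write().format("com.crealytics.spark.excel")
                .option("dataAddress", "My Sheet 2[#All]")
                .option("header", "true")
                .mode(SaveMode.Append)
                .save("/tmp/excel-df.xlsx");

        spark.stop();
    }
}
```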