
How to correctly create a view in SparkSQL with Java


I'm trying to do something very basic, but I'm running into an error that I can't find a way around.

What I want is to be able to run SQL queries on a DataFrame. The documentation says that for this you need to create a view from the DataFrame. The problem is that even though I do exactly what the documentation describes, IntelliJ keeps giving me a compile error.

Here is my code:

    @Override
    public Long execute() {
        log.info("Starting processing query");
        Instant start = Instant.now();

        Dataset<Row> dataframe = this.hdfsIO.readParquetAsDataframe(vaccineAdministrationSummaryFile);
        // Dataset<Row> dataframe = this.sparkSession.read().parquet(this.hdfsUrl + inputDir + "/" + filename);
        dataframe.createOrReplaceTempView("query");

        Dataset<Row> sqlDF = sparkSession.sql("SELECT * FROM query");
Instead, running the same command before and after dataframe.createOrReplaceTempView("query"), I get this:

21/05/31 09:36:29 INFO CodeGenerator: Code generated in 27.591936 ms
21/05/31 09:36:29 INFO CodeGenerator: Code generated in 11.52196 ms
21/05/31 09:36:29 INFO CodeGenerator: Code generated in 15.643281 ms
+-----+--------+-----------+---------+-----------+
|name |database|description|tableType|isTemporary|
+-----+--------+-----------+---------+-----------+
|query|null    |null       |TEMPORARY|true       |
+-----+--------+-----------+---------+-----------+
+----+--------+-----------+---------+-----------+
|name|database|description|tableType|isTemporary|
+----+--------+-----------+---------+-----------+
+----+--------+-----------+---------+-----------+
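The two tables above look like the output of Spark's catalog listing. As a hedged sketch (the tiny in-memory DataFrame stands in for the asker's Parquet data, and the class name is invented), the listing before and after registering the view can be reproduced in Java like this:

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class ListTablesSketch {
    public static void main(String[] args) {
        SparkSession sparkSession = SparkSession.builder()
                .master("local")
                .appName("ListTablesSketch")
                .getOrCreate();

        // Before registering: the catalog has no temporary views yet
        sparkSession.catalog().listTables().show(false);

        // Register a DataFrame as a temp view (a tiny in-memory range
        // stands in for the Parquet data from the question)
        Dataset<Row> dataframe = sparkSession.range(3).toDF("id");
        dataframe.createOrReplaceTempView("query");

        // After registering: "query" appears with tableType TEMPORARY
        sparkSession.catalog().listTables().show(false);
    }
}
```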
The function readParquetAsDataframe() is defined as follows; the string passed is the path of a file in HDFS:

    public Dataset<Row> readParquetAsDataframe(String filename){
        return this.sparkSession.read().parquet(this.hdfsUrl + inputDir + "/" + filename);
    }
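For reference, a minimal, self-contained sketch of the same flow (write and read Parquet, register a temp view, query it) compiles and runs fine, which suggests the compile error in the question is likely environmental, e.g. a missing import for Dataset or Row. The path and column name below are placeholders, not the asker's real data:

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class TempViewSketch {
    public static void main(String[] args) {
        SparkSession sparkSession = SparkSession.builder()
                .master("local")
                .appName("TempViewSketch")
                .getOrCreate();

        // To stay self-contained, write a tiny DataFrame to Parquet first;
        // in the question this file already exists on HDFS
        String path = System.getProperty("java.io.tmpdir") + "/tempview_sketch_parquet";
        sparkSession.range(5).toDF("id").write().mode("overwrite").parquet(path);

        Dataset<Row> dataframe = sparkSession.read().parquet(path);

        // Register the DataFrame so SQL can reference it by name
        dataframe.createOrReplaceTempView("query");

        // Query the temp view like a table
        Dataset<Row> sqlDF = sparkSession.sql("SELECT * FROM query WHERE id > 2");
        sqlDF.show();
    }
}
```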

It looks like you have Java code. Take a look at this JDBC example: once you have the Dataset, reading Parquet is not much different. Note the comments.

public static void main(String[] args) throws Exception {
        
        SparkSession sparkSession = SparkSession.builder()
                                .master("local")
                                .appName("Test App")
                                .getOrCreate();

        // JDBC connection details
        String driver = "com.mysql.cj.jdbc.Driver";
        String url = "jdbc:mysql://192.168.1.113:3306/db";
        String user = "user";
        String pass = "password";
        
        // JDBC Connection and load table in Dataframe
        Dataset<Row> trans = sparkSession.read()
                                .format("jdbc")
                                .option("driver", driver)
                                .option("url", url)
                                //your table name here
                                .option("dbtable", "<table name>")
                                .option("user", user)
                                .option("password", pass).load();
        
        //your table view name here
        trans.createOrReplaceTempView("<table_view_name>");

        Dataset<Row> sqlResult = sparkSession
                                    //your same table view name here
                                    .sql("select firstname,lastname from <table_view_name>");

        sqlResult.foreach(f -> {
            //print the firstname from your table
            System.out.println("firstname -> "+f.get(0));
        });

}

Does this answer your question?

No, I've already read that answer, but it only contains information already provided in the documentation, nothing more.

Do you really need raw SQL? Using the available DataFrame select methods would be better optimized at compile time.

Honestly, I think I could do without it, but I tried writing this statement, ran into this error, and didn't understand why, so I posted this question to figure out what's going on.

What happens if you put spark.catalog.listTables.show(false) right after createOrReplaceTempView(…)?
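As the comment suggests, the same query can also be expressed with the DataFrame API instead of a temp view, so column references are checked when the job is built rather than when a SQL string is parsed. A hedged sketch (the DataFrame and column name are invented for illustration):

```java
import static org.apache.spark.sql.functions.col;

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class SelectSketch {
    public static void main(String[] args) {
        SparkSession sparkSession = SparkSession.builder()
                .master("local")
                .appName("SelectSketch")
                .getOrCreate();

        // Tiny stand-in DataFrame; in the question this would come from Parquet
        Dataset<Row> dataframe = sparkSession.range(10).toDF("id");

        // Equivalent of "SELECT id FROM query WHERE id >= 5" without a temp view
        Dataset<Row> result = dataframe
                .select(col("id"))
                .filter(col("id").geq(5));

        result.show();
    }
}
```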