JDBC: Pointing HiveServer2 at a mini cluster for Hive integration testing

jdbc, hive, integration-testing

I've been wanting to do Hive integration testing for some of the code I've been developing. There are two major requirements for the test framework I need:

  • It needs to work with the Cloudera version of Hive and Hadoop (preferably 2.0.0-cdh4.7.0)
  • It has to be entirely local, meaning the Hadoop cluster and the Hive server should start up at the beginning of the test, run a few queries, and be torn down once the test is over

So I broke this problem down into three parts:

  • Getting the code for the HiveServer2 part (I decided to use a JDBC connector over the Thrift service client)
  • Getting the code for building the in-memory MapReduce cluster (I decided to use MiniMRCluster for this)
  • Setting up (1) and (2) above to work with each other. I was able to get this out of the way by looking through many resources; some that were very useful are:

    For (2), I followed this excellent post on StackOverflow (the sketch below gives a rough idea of the resulting setup):
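
    For concreteness, here is a minimal sketch of what (2) can look like with the MR1 mini-cluster classes from the hadoop-test and hadoop-hdfs test artifacts listed below. The single-node sizing and the bare main() harness are my own assumptions, not the linked post's exact code:

        import org.apache.hadoop.conf.Configuration;
        import org.apache.hadoop.fs.FileSystem;
        import org.apache.hadoop.hdfs.MiniDFSCluster;
        import org.apache.hadoop.mapred.MiniMRCluster;

        public class MiniClusterSketch {
            public static void main(String[] args) throws Exception {
                // Start a single-node HDFS cluster, formatting a fresh name node.
                Configuration conf = new Configuration();
                MiniDFSCluster dfsCluster = new MiniDFSCluster(conf, 1, true, null);
                FileSystem fs = dfsCluster.getFileSystem();

                // Start a single-task-tracker MR1 cluster on top of that HDFS.
                MiniMRCluster mrCluster = new MiniMRCluster(1, fs.getUri().toString(), 1);

                // ... run MapReduce jobs / Hive queries against the mini cluster ...

                // Tear everything down once the test is over.
                mrCluster.shutdown();
                dfsCluster.shutdown();
            }
        }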

    So far, so good. At this point, the pom.xml of my Maven project, covering both of the above, looks like the following:

    <repositories>
        <repository>
            <id>cloudera</id>
            <url>https://repository.cloudera.com/artifactory/cloudera-repos/</url>
        </repository>
    </repositories>
    
    <dependencies>
        <dependency>
            <groupId>commons-io</groupId>
            <artifactId>commons-io</artifactId>
            <version>2.1</version>
        </dependency>
        <dependency>
            <groupId>junit</groupId>
            <artifactId>junit</artifactId>
            <version>4.11</version>
        </dependency>
        <!-- START: dependencies for getting MiniMRCluster to work -->
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-auth</artifactId>
            <version>2.0.0-cdh4.7.0</version>
        </dependency>
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-test</artifactId>
            <version>2.0.0-mr1-cdh4.7.0</version>
        </dependency>
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-hdfs</artifactId>
            <version>2.0.0-cdh4.7.0</version>
        </dependency>
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-hdfs</artifactId>
            <version>2.0.0-cdh4.7.0</version>
            <classifier>tests</classifier>
        </dependency>
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-common</artifactId>
            <version>2.0.0-cdh4.7.0</version>
        </dependency>
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-common</artifactId>
            <version>2.0.0-cdh4.7.0</version>
            <classifier>tests</classifier>
        </dependency>
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-core</artifactId>
            <version>2.0.0-mr1-cdh4.7.0</version>
        </dependency>
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-core</artifactId>
            <version>2.0.0-mr1-cdh4.7.0</version>
            <classifier>tests</classifier>
        </dependency>
        <!-- END: dependencies for getting MiniMRCluster to work -->
    
        <!-- START: dependencies for getting Hive JDBC to work -->
        <dependency>
            <groupId>org.apache.hive</groupId>
            <artifactId>hive-builtins</artifactId>
            <version>${hive.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.hive</groupId>
            <artifactId>hive-cli</artifactId>
            <version>${hive.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.hive</groupId>
            <artifactId>hive-metastore</artifactId>
            <version>${hive.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.hive</groupId>
            <artifactId>hive-serde</artifactId>
            <version>${hive.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.hive</groupId>
            <artifactId>hive-common</artifactId>
            <version>${hive.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.hive</groupId>
            <artifactId>hive-exec</artifactId>
            <version>${hive.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.hive</groupId>
            <artifactId>hive-jdbc</artifactId>
            <version>${hive.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.thrift</groupId>
            <artifactId>libfb303</artifactId>
            <version>0.9.1</version>
        </dependency>
        <dependency>
            <groupId>log4j</groupId>
            <artifactId>log4j</artifactId>
            <version>1.2.15</version>
        </dependency>
        <dependency>
            <groupId>org.antlr</groupId>
            <artifactId>antlr-runtime</artifactId>
            <version>3.5.1</version>
        </dependency>
        <dependency>
            <groupId>org.apache.derby</groupId>
            <artifactId>derby</artifactId>
            <version>10.10.1.1</version>
        </dependency>
        <dependency>
            <groupId>javax.jdo</groupId>
            <artifactId>jdo2-api</artifactId>
            <version>2.3-ec</version>
        </dependency>
        <dependency>
            <groupId>jpox</groupId>
            <artifactId>jpox</artifactId>
            <version>1.1.9-1</version>
        </dependency>
        <dependency>
            <groupId>jpox</groupId>
            <artifactId>jpox-rdbms</artifactId>
            <version>1.2.0-beta-5</version>
        </dependency>
        <!-- END: dependencies for getting Hive JDBC to work -->
    </dependencies>
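
    Note that the pom above references ${hive.version} without showing where it is defined; presumably there is a <properties> block along these lines (the exact value is an assumption on my part; CDH 4.7.0 shipped Hive 0.10.0):

        <properties>
            <hive.version>0.10.0-cdh4.7.0</hive.version>
        </properties>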
    
    I was hoping that (3) would be taken care of by the following lines of code in my test method:

        Connection hiveConnection = DriverManager.getConnection(
                "jdbc:hive2:///", "", "");
    
    However, I got the following error:

    java.sql.SQLException: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask
        at org.apache.hive.jdbc.Utils.verifySuccess(Utils.java:161)
        at org.apache.hive.jdbc.Utils.verifySuccessWithInfo(Utils.java:150)
        at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:207)
        at com.ask.nkelkar.hive.HiveUnitTest.testHiveMiniDFSClusterIntegration(HiveUnitTest.java:54)
    
    Could anyone tell me what I need to do, or what I'm doing wrong, to get this working?


    As a side note, I also looked at the … and … projects as options, but I wasn't able to get either of them to work with the Cloudera version of Hadoop.

    Your test is failing at the first CREATE TABLE statement. Hive is unhelpfully suppressing the following error message:

    file:/user/hive/warehouse/test is not a directory or unable to create one
    
    Hive is trying to use the default warehouse directory /user/hive/warehouse, which doesn't exist on your file system. You could create that directory, but for tests you will likely want to override the default. For example:

    import static org.apache.hadoop.hive.conf.HiveConf.ConfVars;
    ...
    System.setProperty(ConfVars.METASTOREWAREHOUSE.toString(), "/Users/nishantkelkar/IdeaProjects/" +
                "nkelkar-incubator/hive-test/target/hive/warehouse");
    

    Thanks for replying to this post! It no longer gives me that error. You were right, it needed to see the warehouse folder before it could proceed. I wish Hive's error messages were more semantically descriptive. I'd appreciate it if you could take a look. Thanks all! --