Dataproc dependency conflict: google-api-client

Tags: java, scala, google-cloud-storage, google-cloud-dataproc, google-cloud-kms

I'm building a library for fetching encrypted secrets from Cloud Storage (in Scala, using the Java clients). I'm using the following Google libraries:

"com.google.apis"  % "google-api-services-cloudkms" % "v1-rev26-1.23.0" exclude("com.google.guava", "guava-jdk5"),
"com.google.cloud" % "google-cloud-storage"         % "1.14.0",
Everything works fine locally, but when I try to run my code on Dataproc I get the following error:

Exception in thread "main" java.lang.NoSuchMethodError: com.google.api.client.googleapis.services.json.AbstractGoogleJsonClient$Builder.setBatchPath(Ljava/lang/String;)Lcom/google/api/client/googleapis/services/AbstractGoogleClient$Builder;
    at com.google.api.services.cloudkms.v1.CloudKMS$Builder.setBatchPath(CloudKMS.java:4250)
    at com.google.api.services.cloudkms.v1.CloudKMS$Builder.<init>(CloudKMS.java:4229)
    at gcp.encryption.EncryptedSecretsUser$class.clients(EncryptedSecretsUser.scala:111)
    at gcp.encryption.EncryptedSecretsUser$class.getEncryptedSecrets(EncryptedSecretsUser.scala:62)
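I have seen that some Google libraries are already available on Dataproc (I'm using a Spark cluster with image version 1.2.15). But as far as I can tell, the transitive google-api-client dependency there is the same version I'm using locally (1.23.0). So why is the method not found?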

Should I set up different dependencies for running on Dataproc?

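EDIT: I finally solved this in another project. It turns out that besides shading all Google dependencies (including the gcs-connector!!), you also have to register your shaded class with the JVM as the handler for the gs:// file system. Below is the Maven configuration that works for me; something similar can be achieved with sbt: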

Parent POM:

<project xmlns="http://maven.apache.org/POM/4.0.0"...>
...
<properties>
    <!-- Spark version -->
    <spark.version>[2.2.1]</spark.version>
    <!-- Jackson-libs version pulled in by spark -->
    <jackson.version>[2.6.5]</jackson.version>
    <!-- Avro version pulled in by jackson -->
    <avro.version>[1.7.7]</avro.version>
    <!-- Kryo-shaded version pulled in by spark -->
    <kryo.version>[3.0.3]</kryo.version>
    <!-- Apache commons-lang version pulled in by spark -->
    <commons.lang.version>2.6</commons.lang.version>

    <!-- TODO: need to shade google libs because of version-conflicts on Dataproc. Remove this when Dataproc 1.3/2.0 is released -->
    <bigquery-conn.version>[0.10.6-hadoop2]</bigquery-conn.version>
    <gcs-conn.version>[1.6.5-hadoop2]</gcs-conn.version>
    <google-storage.version>[1.29.0]</google-storage.version>
    <!-- The guava version we want to use -->
    <guava.version>[23.2-jre]</guava.version>
    <!-- The google api version used by the google-cloud-storage lib -->
    <api-client.version>[1.23.0]</api-client.version>
    <!-- The google-api-services-storage version used by the google-cloud-storage lib -->
    <storage-api.version>[v1-rev114-1.23.0]</storage-api.version>

    <!-- Picked up by compiler and resource plugins -->
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
</properties>

...

<build>
    <pluginManagement>
        <plugins>
...

        <plugin>
            <groupId>org.apache.maven.plugins</groupId>
            <artifactId>maven-shade-plugin</artifactId>
            <version>3.1.1</version>
            <executions>
                <execution>
                    <phase>package</phase>
                    <goals>
                        <goal>shade</goal>
                    </goals>
                    <configuration>
                        <minimizeJar>true</minimizeJar>
                        <filters>
                            <filter>
                                <artifact>com.google.**:*</artifact>
                                <includes>
                                    <include>**</include>
                                </includes>
                            </filter>
                            <filter>
                                <artifact>com.google.cloud.bigdataoss:gcs-connector</artifact>
                                <excludes>
                                    <!-- Register a provider with the shaded name instead-->
                                    <exclude>META-INF/services/org.apache.hadoop.fs.FileSystem</exclude>
                                </excludes>
                            </filter>
                        </filters>
                        <artifactSet>
                            <includes>
                                <include>com.google.*:*</include>
                            </includes>
                            <excludes>
                                <exclude>com.google.code.findbugs:jsr305</exclude>
                            </excludes>
                        </artifactSet>
                        <relocations>
                            <relocation>
                                <pattern>com.google</pattern>
                                <shadedPattern>com.shaded.google</shadedPattern>
                            </relocation>
                        </relocations>
                    </configuration>
                </execution>
            </executions>
        </plugin>
...
        </plugins>
    </pluginManagement>
</build>

<dependencyManagement>
    <dependencies>
        <dependency>
...
            <groupId>com.google.cloud.bigdataoss</groupId>
            <artifactId>gcs-connector</artifactId>
            <version>${gcs-conn.version}</version>
            <exclusions>
                <!-- conflicts with Spark dependencies -->
                <exclusion>
                    <groupId>org.apache.hadoop</groupId>
                    <artifactId>hadoop-common</artifactId>
                </exclusion>
                <!-- conflicts with Spark dependencies -->
                <exclusion>
                    <groupId>org.apache.hadoop</groupId>
                    <artifactId>hadoop-mapreduce-client-core</artifactId>
                </exclusion>
                <exclusion>
                    <groupId>com.google.guava</groupId>
                    <artifactId>guava</artifactId>
                </exclusion>
            </exclusions>
        </dependency>
        <dependency>
            <!-- Avoid conflict with the version pulled in by the GCS-connector on Dataproc -->
            <groupId>com.google.apis</groupId>
            <artifactId>google-api-services-storage</artifactId>
            <version>${storage-api.version}</version>
        </dependency>
        <dependency>
            <groupId>commons-lang</groupId>
            <artifactId>commons-lang</artifactId>
            <version>${commons.lang.version}</version>
        </dependency>
        <dependency>
            <groupId>com.esotericsoftware</groupId>
            <artifactId>kryo-shaded</artifactId>
            <version>${kryo.version}</version>
        </dependency>
        <dependency>
            <groupId>com.fasterxml.jackson.core</groupId>
            <artifactId>jackson-databind</artifactId>
            <version>${jackson.version}</version>
        </dependency>
        <dependency>
            <groupId>com.google.api-client</groupId>
            <artifactId>google-api-client</artifactId>
            <version>${api-client.version}</version>
        </dependency>
        <dependency>
            <groupId>com.google.guava</groupId>
            <artifactId>guava</artifactId>
            <version>${guava.version}</version>
        </dependency>
    </dependencies>
</dependencyManagement>

<dependencies>
    <dependency>
        <groupId>com.google.cloud</groupId>
        <artifactId>google-cloud-storage</artifactId>
        <version>${google-storage.version}</version>
        <exclusions>
            <!-- conflicts with Spark dependencies -->
            <exclusion>
                <groupId>com.google.guava</groupId>
                <artifactId>guava</artifactId>
            </exclusion>
        </exclusions>
    </dependency>
    <dependency>
        <groupId>com.google.guava</groupId>
        <artifactId>guava</artifactId>
    </dependency>
...
</dependencies>

...
</project>

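(Note that this file, META-INF/services/org.apache.hadoop.fs.FileSystem, is filtered out of the gcs-connector library in the parent POM.)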
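To make it clearer why the provider file has to name the shaded class, here is a minimal sketch of what a prefix relocation does to class names. It uses the pattern and shadedPattern values from the relocation element in the POM above; the class name RelocationSketch and the method relocate are hypothetical, and this is a simplified illustration, not the maven-shade-plugin's actual matching logic.

```java
/** Illustrative prefix relocation; a simplification, not the shade plugin's own code. */
final class RelocationSketch {
    // Values copied from the <relocation> element in the parent POM.
    static final String PATTERN = "com.google";
    static final String SHADED = "com.shaded.google";

    /** Rewrites a fully qualified class name if it lives under PATTERN. */
    static String relocate(String className) {
        if (className.equals(PATTERN) || className.startsWith(PATTERN + ".")) {
            return SHADED + className.substring(PATTERN.length());
        }
        return className; // classes outside the pattern are left untouched
    }

    public static void main(String[] args) {
        // The GCS connector's file system class ends up under the shaded package.
        System.out.println(relocate("com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystem"));
        // Hadoop's own classes are not relocated.
        System.out.println(relocate("org.apache.hadoop.fs.FileSystem"));
    }
}
```

Once the connector class has moved to the shaded package, a provider file that still lists the original name would point at a class that is no longer in the jar, which is presumably why the original file is excluded and replaced.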

It may not be obvious, but the google-api-client version in the latest stable GCS connector is actually 1.20.0.

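The reason is that the version bump was part of a series of commits whose overall goal is to stop leaking the transitive dependency onto the job classpath at all, precisely to avoid future version conflicts, at the cost of everyone having to bring their own fat jar containing the full api client dependency.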

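However, it turned out that many people had come to rely on the api client provided by the GCS connector being on the classpath, so there are production workloads that would not survive such a change in a minor version upgrade. The upgraded GCS connector, which uses 1.23.0 but also shades it so that it no longer appears on the job classpath, is therefore reserved for a future Dataproc 1.3+ or 2.0+ release.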

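In your case, you could try using version 1.20.0 of your dependencies (you may also have to downgrade the google-cloud-storage dependency you include, although version 1.22.0 may still work assuming no breaking changes, since setBatchPath was only introduced in 1.23.0), or otherwise you can try shading your own copies of the Google dependencies.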

We can verify that setBatchPath was only introduced in 1.23.0:

$ javap -cp google-api-client-1.22.0.jar com.google.api.client.googleapis.services.AbstractGoogleClient.Builder | grep set
  public com.google.api.client.googleapis.services.AbstractGoogleClient$Builder setRootUrl(java.lang.String);
  public com.google.api.client.googleapis.services.AbstractGoogleClient$Builder setServicePath(java.lang.String);
  public com.google.api.client.googleapis.services.AbstractGoogleClient$Builder setGoogleClientRequestInitializer(com.google.api.client.googleapis.services.GoogleClientRequestInitializer);
  public com.google.api.client.googleapis.services.AbstractGoogleClient$Builder setHttpRequestInitializer(com.google.api.client.http.HttpRequestInitializer);
  public com.google.api.client.googleapis.services.AbstractGoogleClient$Builder setApplicationName(java.lang.String);
  public com.google.api.client.googleapis.services.AbstractGoogleClient$Builder setSuppressPatternChecks(boolean);
  public com.google.api.client.googleapis.services.AbstractGoogleClient$Builder setSuppressRequiredParameterChecks(boolean);
  public com.google.api.client.googleapis.services.AbstractGoogleClient$Builder setSuppressAllChecks(boolean);

$ javap -cp google-api-client-1.23.0.jar com.google.api.client.googleapis.services.AbstractGoogleClient.Builder | grep set
  public com.google.api.client.googleapis.services.AbstractGoogleClient$Builder setRootUrl(java.lang.String);
  public com.google.api.client.googleapis.services.AbstractGoogleClient$Builder setServicePath(java.lang.String);
  public com.google.api.client.googleapis.services.AbstractGoogleClient$Builder setBatchPath(java.lang.String);
  public com.google.api.client.googleapis.services.AbstractGoogleClient$Builder setGoogleClientRequestInitializer(com.google.api.client.googleapis.services.GoogleClientRequestInitializer);
  public com.google.api.client.googleapis.services.AbstractGoogleClient$Builder setHttpRequestInitializer(com.google.api.client.http.HttpRequestInitializer);
  public com.google.api.client.googleapis.services.AbstractGoogleClient$Builder setApplicationName(java.lang.String);
  public com.google.api.client.googleapis.services.AbstractGoogleClient$Builder setSuppressPatternChecks(boolean);
  public com.google.api.client.googleapis.services.AbstractGoogleClient$Builder setSuppressRequiredParameterChecks(boolean);
  public com.google.api.client.googleapis.services.AbstractGoogleClient$Builder setSuppressAllChecks(boolean);
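The javap comparison above inspects jars on disk. A similar check can be made at runtime, on the cluster, to see which jar a class was actually loaded from and whether it exposes a given method. The following is only a sketch using standard reflection; the class name ClasspathProbe and its helper methods are hypothetical, and the default class and method names are simply the ones from the stack trace in the question.

```java
import java.lang.reflect.Method;
import java.security.CodeSource;

/** Hypothetical runtime probe: where was a class loaded from, and does it have a method? */
final class ClasspathProbe {
    /** Returns the jar or directory a class was loaded from, or a note if none is recorded. */
    static String locationOf(Class<?> cls) {
        CodeSource src = cls.getProtectionDomain().getCodeSource();
        return src == null ? "(bootstrap or unknown)" : String.valueOf(src.getLocation());
    }

    /** True if the class exposes a public method (declared or inherited) with this name. */
    static boolean hasPublicMethod(Class<?> cls, String name) {
        for (Method m : cls.getMethods()) {
            if (m.getName().equals(name)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Defaults come from the NoSuchMethodError in the question; override via arguments.
        String className = args.length > 0 ? args[0]
            : "com.google.api.client.googleapis.services.AbstractGoogleClient$Builder";
        String methodName = args.length > 1 ? args[1] : "setBatchPath";
        try {
            Class<?> cls = Class.forName(className);
            System.out.println(className + " loaded from " + locationOf(cls));
            System.out.println("has " + methodName + ": " + hasPublicMethod(cls, methodName));
        } catch (ClassNotFoundException e) {
            System.out.println(className + " is not on the classpath");
        }
    }
}
```

If the reported location is a jar shipped with the cluster image rather than your own fat jar, the cluster's copy is winning on the classpath, which would be consistent with the NoSuchMethodError.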

Thanks for your reply, Dennis. I'm doing my best to shade the Google libraries, but I keep getting this error on Dataproc:
Exception in thread "main" java.lang.AbstractMethodError: MyClass$.gcp$encryption$EncryptedSecretsUser$_setter_$creds_$eq(Lcomgoogleshade/api/client/googleapis/auth/oauth2/GoogleCredential;)V
The offending line is
val creds: GoogleCredential = GoogleCredential.getApplicationDefault
Apparently I'm still picking up some incompatible libraries from the classpath. Any idea where this is coming from?

Can you tell me which libraries you are shading?

Yes, my shade rule is
ShadeRule.rename("com.google.*.*" -> "comgoogleshade@1shade@2").inAll
and I verified that there are no com/google paths in the fat jar I'm producing. The error shows that my shaded com.google.api is being picked up:
Exception in thread "main" java.lang.AbstractMethodError: MarketMonitorETL$.gcp$EncryptedSecretsUser$_setter_$creds_$eq(Lcomgoogleshade/apishade/client/googleapis/auth/oauth2/GoogleCredential;)V

Managed to get this working. I had to shade all Google dependencies, including the gcs-connector, and register my shaded class as the file system provider. See the edit in the question.

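Contents of the META-INF/services/org.apache.hadoop.fs.FileSystem provider file that registers the shaded class:

# WORKAROUND FOR DEPENDENCY CONFLICTS ON DATAPROC
#
# Use the shaded class as a provider for the gs:// file system
#

com.shaded.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystem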
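
Provider files under META-INF/services follow the java.util.ServiceLoader format: everything after a # on a line is a comment, and blank lines are ignored. As a rough sketch of how the file above resolves to a single class name, the snippet below re-implements just that parsing step. The class ProviderFileSketch and the method providerNames are hypothetical names for illustration; this is neither Hadoop's nor the JDK's actual code.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper that extracts provider class names from a services file. */
final class ProviderFileSketch {
    /** Applies the ServiceLoader file rules: '#' starts a comment, blank lines are skipped. */
    static List<String> providerNames(String fileContents) {
        List<String> names = new ArrayList<>();
        for (String line : fileContents.split("\\R")) {
            int hash = line.indexOf('#');
            if (hash >= 0) {
                line = line.substring(0, hash); // drop the comment part
            }
            line = line.trim();
            if (!line.isEmpty()) {
                names.add(line);
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // Same contents as the provider file shown above.
        String contents = String.join("\n",
            "# WORKAROUND FOR DEPENDENCY CONFLICTS ON DATAPROC",
            "#",
            "# Use the shaded class as a provider for the gs:// file system",
            "#",
            "",
            "com.shaded.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystem");
        System.out.println(providerNames(contents));
    }
}
```

The only entry that survives parsing is the shaded GoogleHadoopFileSystem name, so that is the class offered as a file system provider from the fat jar.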