Apache Flink: UnsupportedFileSystemSchemeException when using an HDFS file with OrcTableSource


I am trying to use OrcTableSource to read a file from HDFS:

String fileName = "hdfs://master.host/data/groot.db/events/dt=2017-11-24/000044_0";

Configuration config = new Configuration();

OrcTableSource orcTableSource = OrcTableSource.builder()
    // path to ORC file(s)
    .path(fileName)
    // schema of ORC files
    .forOrcSchema(Schema.schema)
    // Hadoop configuration
    .withConfiguration(config)
    // build OrcTableSource
    .build();
But when I run it, I get this exception:

Caused by: org.apache.flink.core.fs.UnsupportedFileSystemSchemeException: Could not find a file system implementation for scheme 'hdfs'. The scheme is not directly supported by Flink and no Hadoop file system to support this scheme could be loaded.
at org.apache.flink.core.fs.FileSystem.getUnguardedFileSystem(FileSystem.java:405)
at org.apache.flink.core.fs.FileSystem.get(FileSystem.java:320)
at org.apache.flink.core.fs.Path.getFileSystem(Path.java:293)
at org.apache.flink.api.common.io.FileInputFormat.createInputSplits(FileInputFormat.java:472)
at org.apache.flink.api.common.io.FileInputFormat.createInputSplits(FileInputFormat.java:62)
at org.apache.flink.runtime.executiongraph.ExecutionJobVertex.<init>(ExecutionJobVertex.java:248)
... 22 more
Caused by: org.apache.flink.core.fs.UnsupportedFileSystemSchemeException: Cannot support file system for 'hdfs' via Hadoop, because Hadoop is not in the classpath, or some classes are missing from the classpath.
at org.apache.flink.runtime.fs.hdfs.HadoopFsFactory.create(HadoopFsFactory.java:179)
at org.apache.flink.core.fs.FileSystem.getUnguardedFileSystem(FileSystem.java:401)
... 27 more
Caused by: java.lang.NoClassDefFoundError: Could not initialize class org.apache.hadoop.hdfs.HdfsConfiguration
at org.apache.flink.runtime.util.HadoopUtils.getHadoopConfiguration(HadoopUtils.java:49)
at org.apache.flink.runtime.fs.hdfs.HadoopFsFactory.create(HadoopFsFactory.java:85)
... 28 more
My pom.xml is:

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>

<groupId>ru.ivi</groupId>
<artifactId>hdfs2ch</artifactId>
<version>1.0-SNAPSHOT</version>

<repositories>
    <repository>
        <id>apache.snapshots</id>
        <name>Apache Development Snapshot Repository</name>
        <url>https://repository.apache.org/content/repositories/snapshots/</url>
        <releases>
            <enabled>false</enabled>
        </releases>
        <snapshots>
            <enabled>true</enabled>
        </snapshots>
    </repository>
</repositories>

<dependencies>
    <!-- https://mvnrepository.com/artifact/org.apache.flink/flink-core -->
    <dependency>
        <groupId>org.apache.flink</groupId>
        <artifactId>flink-core</artifactId>
        <version>1.5-SNAPSHOT</version>
    </dependency>

    <dependency>
        <groupId>org.apache.flink</groupId>
        <artifactId>flink-java</artifactId>
        <version>1.5-SNAPSHOT</version>
    </dependency>
    <dependency>
        <groupId>org.apache.flink</groupId>
        <artifactId>flink-streaming-java_2.11</artifactId>
        <version>1.5-SNAPSHOT</version>
    </dependency>

    <dependency>
        <groupId>org.apache.flink</groupId>
        <artifactId>flink-streaming-java_2.11</artifactId>
        <version>1.5-SNAPSHOT</version>
    </dependency>

    <dependency>
        <groupId>org.apache.flink</groupId>
        <artifactId>flink-table_2.11</artifactId>
        <version>1.5-SNAPSHOT</version>
    </dependency>

    <dependency>
        <groupId>org.apache.flink</groupId>
        <artifactId>flink-clients_2.11</artifactId>
        <version>1.5-SNAPSHOT</version>
    </dependency>

    <dependency>
        <groupId>org.apache.flink</groupId>
        <artifactId>flink-connector-filesystem_2.11</artifactId>
        <version>1.5-SNAPSHOT</version>
    </dependency>

    <dependency>
        <groupId>org.apache.flink</groupId>
        <artifactId>flink-orc_2.11</artifactId>
        <version>1.5-SNAPSHOT</version>
    </dependency>

    <dependency>
        <groupId>org.apache.flink</groupId>
        <artifactId>flink-jdbc</artifactId>
        <version>1.5-SNAPSHOT</version>
    </dependency>


    <!-- https://mvnrepository.com/artifact/org.apache.flink/flink-table -->
    <dependency>
        <groupId>org.apache.flink</groupId>
        <artifactId>flink-table_2.11</artifactId>
        <version>1.5-SNAPSHOT</version>
    </dependency>

    <!-- https://mvnrepository.com/artifact/org.apache.flink/flink-scala -->
    <dependency>
        <groupId>org.apache.flink</groupId>
        <artifactId>flink-scala_2.11</artifactId>
        <version>1.5-SNAPSHOT</version>
    </dependency>

    <!-- https://mvnrepository.com/artifact/org.apache.flink/flink-streaming-scala -->
    <dependency>
        <groupId>org.apache.flink</groupId>
        <artifactId>flink-streaming-scala_2.11</artifactId>
        <version>1.5-SNAPSHOT</version>
    </dependency>


    <dependency>
        <groupId>org.apache.flink</groupId>
        <artifactId>flink-hadoop-fs</artifactId>
        <version>1.5-SNAPSHOT</version>
    </dependency>

    <!-- https://mvnrepository.com/artifact/org.apache.hadoop/hadoop-core -->
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-core</artifactId>
        <version>1.2.1</version>
    </dependency>


    <dependency>
        <groupId>ru.yandex.clickhouse</groupId>
        <artifactId>clickhouse-jdbc</artifactId>
        <version>0.1.34</version>
        <scope>system</scope>
        <systemPath>
            /home/akonyaev/git/clickhouse-jdbc/target/clickhouse-jdbc-0.1-SNAPSHOT-jar-with-dependencies.jar
        </systemPath>
    </dependency>

    <!-- https://mvnrepository.com/artifact/org.apache.hadoop/hadoop-hdfs -->
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-hdfs</artifactId>
        <version>2.6.0</version>
    </dependency>

    <!-- https://mvnrepository.com/artifact/org.apache.hadoop/hadoop-hdfs -->
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-common</artifactId>
        <version>2.6.0</version>
    </dependency>

    <dependency>
        <groupId>org.apache.flink</groupId>
        <artifactId>flink-connector-filesystem_2.11</artifactId>
        <version>1.5-SNAPSHOT</version>
    </dependency>

    <dependency>
        <groupId>org.postgresql</groupId>
        <artifactId>postgresql</artifactId>
        <version>42.1.4</version>
    </dependency>

</dependencies>


<build>
    <plugins>
        <plugin>
            <groupId>org.apache.maven.plugins</groupId>
            <artifactId>maven-compiler-plugin</artifactId>
            <configuration>
                <source>1.8</source>
                <target>1.8</target>
            </configuration>
        </plugin>

        <plugin>
            <artifactId>maven-assembly-plugin</artifactId>
            <configuration>
                <archive>
                    <manifest>
                        <mainClass>hdfs2ch.Migrate</mainClass>
                    </manifest>
                </archive>
                <descriptorRefs>
                    <descriptorRef>jar-with-dependencies</descriptorRef>
                </descriptorRefs>
            </configuration>
            <executions>
                <execution>
                    <id>make-assembly</id> <!-- this is used for inheritance merges -->
                    <phase>package</phase> <!-- bind to the packaging phase -->
                    <goals>
                        <goal>single</goal>
                    </goals>
                </execution>
            </executions>
        </plugin>
    </plugins>
</build>

</project>

I added the "flink-hadoop-fs" library, but it did not help.

Any idea what I should add to the classpath?

I am using Flink 1.5-SNAPSHOT.

Thanks!

Solved it for me. It was a Maven problem: after I reformatted pom.xml and removed the unneeded dependencies, it started working.

I was running my Flink job in Eclipse and got this error. Adding the flink-shaded-hadoop2 dependency to my Maven project solved it for me.

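This sounds like a Maven problem. Could you also post your pom.xml?

I tried version 1.4.0 from the Maven repo, but I still get the same result.

I get the same error. Could you provide more details? What did you change in your pom.xml?

Remove all of the hadoop-hdfs dependencies and set HADOOP_CONF_DIR.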
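Since the trace ends in a NoClassDefFoundError for org.apache.hadoop.hdfs.HdfsConfiguration, it may help to check which Hadoop classes the job's classpath can actually load before changing the pom. The following is only a rough diagnostic sketch: the class name HadoopClasspathProbe and its helper methods are made up for illustration, and the list of probed Hadoop classes is an assumption about what an hdfs:// path needs, not something taken from Flink itself.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not part of Flink or Hadoop): reports whether the
// Hadoop classes needed for hdfs:// paths can be loaded on this classpath.
class HadoopClasspathProbe {

    // True if the class can be found AND its static initializer succeeds.
    // "Could not initialize class" surfaces as a LinkageError, which often
    // points to conflicting library versions rather than a missing jar.
    static boolean isLoadable(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    // Probes each class name and keeps the results in the given order.
    static Map<String, Boolean> probe(String... classNames) {
        Map<String, Boolean> result = new LinkedHashMap<>();
        for (String name : classNames) {
            result.put(name, isLoadable(name));
        }
        return result;
    }

    public static void main(String[] args) {
        // Assumed set of classes; HdfsConfiguration is the one named in the trace.
        Map<String, Boolean> result = probe(
                "org.apache.hadoop.conf.Configuration",
                "org.apache.hadoop.fs.FileSystem",
                "org.apache.hadoop.hdfs.HdfsConfiguration",
                "org.apache.hadoop.hdfs.DistributedFileSystem");

        for (Map.Entry<String, Boolean> entry : result.entrySet()) {
            System.out.println((entry.getValue() ? "OK      " : "MISSING ") + entry.getKey());
        }

        // The last comment above suggests setting this environment variable.
        System.out.println("HADOOP_CONF_DIR = " + System.getenv("HADOOP_CONF_DIR"));
    }
}
```

Run it with the same classpath as the Flink job (for example from the same IDE run configuration). If a class shows as MISSING even though its jar is declared in the pom, a version conflict such as hadoop-core 1.2.1 alongside the hadoop 2.6.0 artifacts is one plausible cause to look into.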