Apache Flink: getting `pyflink.util.exceptions.TableException: findAndCreateTableSource failed` when running a PyFlink example


I am running the PyFlink program below (copied from):

    from pyflink.dataset import ExecutionEnvironment
    from pyflink.table import TableConfig, DataTypes, BatchTableEnvironment
    from pyflink.table.descriptors import Schema, OldCsv, FileSystem
    from pyflink.table.expressions import lit

    exec_env = ExecutionEnvironment.get_execution_environment()
    exec_env.set_parallelism(1)
    t_config = TableConfig()
    t_env = BatchTableEnvironment.create(exec_env, t_config)

    t_env.connect(FileSystem().path('/tmp/input')) \
        .with_format(OldCsv()
                     .field('word', DataTypes.STRING())) \
        .with_schema(Schema()
                     .field('word', DataTypes.STRING())) \
        .create_temporary_table('mySource')

    t_env.connect(FileSystem().path('/tmp/output')) \
        .with_format(OldCsv()
                     .field_delimiter('\t')
                     .field('word', DataTypes.STRING())
                     .field('count', DataTypes.BIGINT())) \
        .with_schema(Schema()
                     .field('word', DataTypes.STRING())
                     .field('count', DataTypes.BIGINT())) \
        .create_temporary_table('mySink')

    tab = t_env.from_path('mySource')
    tab.group_by(tab.word) \
        .select(tab.word, lit(1).count) \
        .execute_insert('mySink').wait()
To verify that it works, I did the following in order:

  • Ran echo -e "flink\npyflink\nflink" > /tmp/input
  • Ran
    python WordCount.py
  • Ran
    cat /tmp/output
    and found the expected output
  • Then I changed my PyFlink program to prefer SQL over the Table API, but found that it does not work:

    from pyflink.dataset import ExecutionEnvironment
    from pyflink.table import TableConfig, DataTypes, BatchTableEnvironment
    from pyflink.table.descriptors import Schema, OldCsv, FileSystem
    from pyflink.table.expressions import lit

    exec_env = ExecutionEnvironment.get_execution_environment()
    exec_env.set_parallelism(1)
    t_config = TableConfig()
    t_env = BatchTableEnvironment.create(exec_env, t_config)

    my_source_ddl = """
        create table mySource (
            word VARCHAR
        ) with (
            'connector' = 'filesystem',
            'format' = 'csv',
            'path' = '/tmp/input'
        )
    """

    my_sink_ddl = """
        create table mySink (
            word VARCHAR,
            `count` BIGINT
        ) with (
            'connector' = 'filesystem',
            'format' = 'csv',
            'path' = '/tmp/output'
        )
    """

    t_env.sql_update(my_source_ddl)
    t_env.sql_update(my_sink_ddl)

    tab = t_env.from_path('mySource')
    tab.group_by(tab.word) \
        .select(tab.word, lit(1).count) \
        .execute_insert('mySink').wait()
    
    Here is the error:

    Traceback (most recent call last):
      File "WordCount.py", line 38, in <module>
        .execute_insert('mySink').wait()
      File "/usr/local/anaconda3/envs/pyflink-quickstart/lib/python3.7/site-packages/pyflink/table/table.py", line 864, in execute_insert
        return TableResult(self._j_table.executeInsert(table_path, overwrite))
      File "/usr/local/anaconda3/envs/pyflink-quickstart/lib/python3.7/site-packages/py4j/java_gateway.py", line 1286, in __call__
        answer, self.gateway_client, self.target_id, self.name)
      File "/usr/local/anaconda3/envs/pyflink-quickstart/lib/python3.7/site-packages/pyflink/util/exceptions.py", line 162, in deco
        raise java_exception
    pyflink.util.exceptions.TableException: findAndCreateTableSink failed.
         at org.apache.flink.table.factories.TableFactoryUtil.findAndCreateTableSink(TableFactoryUtil.java:87)
         at org.apache.flink.table.api.internal.TableEnvImpl.getTableSink(TableEnvImpl.scala:1097)
         at org.apache.flink.table.api.internal.TableEnvImpl.org$apache$flink$table$api$internal$TableEnvImpl$$writeToSinkAndTranslate(TableEnvImpl.scala:929)
         at org.apache.flink.table.api.internal.TableEnvImpl$$anonfun$1.apply(TableEnvImpl.scala:556)
         at org.apache.flink.table.api.internal.TableEnvImpl$$anonfun$1.apply(TableEnvImpl.scala:554)
         at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
         at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
         at scala.collection.Iterator$class.foreach(Iterator.scala:891)
         at scala.collection.AbstractIterator.foreach(Iterator.scala:1334)
         at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
         at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
         at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
         at scala.collection.AbstractTraversable.map(Traversable.scala:104)
         at org.apache.flink.table.api.internal.TableEnvImpl.executeInternal(TableEnvImpl.scala:554)
         at org.apache.flink.table.api.internal.TableImpl.executeInsert(TableImpl.java:572)
         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
         at java.lang.reflect.Method.invoke(Method.java:498)
         at org.apache.flink.api.python.shaded.py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
         at org.apache.flink.api.python.shaded.py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
         at org.apache.flink.api.python.shaded.py4j.Gateway.invoke(Gateway.java:282)
         at org.apache.flink.api.python.shaded.py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
         at org.apache.flink.api.python.shaded.py4j.commands.CallCommand.execute(CallCommand.java:79)
         at org.apache.flink.api.python.shaded.py4j.GatewayConnection.run(GatewayConnection.java:238)
         at java.lang.Thread.run(Thread.java:748)
    
    

    I am wondering what is wrong with my new program?

    The problem is that the legacy planner you are using (the one tied to the old DataSet API) does not support the filesystem connector you declared. You can use the blink planner instead, which meets your needs:

    t_env = BatchTableEnvironment.create(
        environment_settings=EnvironmentSettings.new_instance()
        .in_batch_mode().use_blink_planner().build())
    t_env._j_tenv.getPlanner().getExecEnv().setParallelism(1)
    
    my_source_ddl = """
        create table mySource (
            word VARCHAR
        ) with (
            'connector' = 'filesystem',
            'format' = 'csv',
            'path' = '/tmp/input'
        )
    """
    
    my_sink_ddl = """
        create table mySink (
            word VARCHAR,
            `count` BIGINT
        ) with (
            'connector' = 'filesystem',
            'format' = 'csv',
            'path' = '/tmp/output'
        )
    """
    
    t_env.execute_sql(my_source_ddl)
    t_env.execute_sql(my_sink_ddl)
    
    tab = t_env.from_path('mySource')
    tab.group_by(tab.word) \
        .select(tab.word, lit(1).count) \
        .execute_insert('mySink').wait()
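
    For context, the old planner resolves sinks through legacy table factories, which only match the legacy property keys; the new-style options ('connector' = 'filesystem', 'format' = 'csv') have no matching factory there, which is why findAndCreateTableSink fails. As a rough sketch, the legacy-style DDL properties the old planner expected looked like the following (key names from memory; they may differ between Flink versions):

        create table mySink (
            word VARCHAR,
            `count` BIGINT
        ) with (
            'connector.type' = 'filesystem',
            'connector.path' = '/tmp/output',
            'format.type' = 'csv'
        )

    Switching to the blink planner, as shown above, lets you keep the new-style options instead.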
    