Apache Flink: Writing a DataStream into a Postgres table


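I am trying to write a streaming job that sinks a DataStream into a Postgres table. For full context: I based my work on an article that suggests using JDBCOutputFormat.

My code looks like this: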

98     ... 
99     String strQuery = "INSERT INTO public.alarm (entity, duration, first, type, windowsize) VALUES (?, ?, ?, 'dur', 6)";
100
101     JDBCOutputFormat jdbcOutput = JDBCOutputFormat.buildJDBCOutputFormat()
102      .setDrivername("org.postgresql.Driver")
103      .setDBUrl("jdbc:postgresql://localhost:5432/postgres?user=michel&password=polnareff")
104      .setQuery(strQuery)
105      .setSqlTypes(new int[] { Types.VARCHAR, Types.INTEGER, Types.VARCHAR}) //set the types
106      .finish();
107
108     DataStream<Row> rows = FilterStream
109                 .map((tuple)-> {
110                    Row row = new Row(3);                  // our prepared statement has 3 parameters
111                    row.setField(0, tuple.f0);             // first parameter: entity
112                    row.setField(1, tuple.f1);             // second parameter: duration
113                    row.setField(2, f.format(tuple.f2));   // third parameter: first (formatted as a string)
114                    return row;
115                 });
116
117     rows.writeUsingOutputFormat(jdbcOutput);
118
119     env.execute();
120
121     }
122 }


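My problem now is that the values are only inserted when my job stops (to be precise, when I cancel the job from the Apache Flink dashboard).

So my question is: am I missing something? Should I commit the rows I insert somewhere?

Best regards, Ignatius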

As Chesnay said, you have to adjust the batch interval.

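However, that is not the full story. If you want at-least-once results, you have to synchronize the batch writes with Flink's checkpoints. Basically, you have to wrap the JdbcOutputFormat in a SinkFunction that also implements the CheckpointedFunction interface. When snapshotState() is called, you have to write the batch to the database. The next release is expected to provide this functionality, so you can look at it there.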
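To make the checkpoint-synchronized idea above more concrete, here is a minimal plain-Java model. It does not use the Flink API: the class CheckpointSyncedSinkModel, its in-memory "database" list, and the processNext() and failAndRestore() helpers are all hypothetical stand-ins. Only the method name snapshotState() mirrors the CheckpointedFunction method mentioned above. The sketch assumes a replayable source that rewinds to the last checkpointed offset after a failure.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Plain-Java model (not Flink API) of a sink that flushes its batch on every checkpoint. */
class CheckpointSyncedSinkModel {
    private final List<String> source;                       // replayable input, stand-in for the stream
    private final List<String> pending = new ArrayList<>();  // rows buffered since the last checkpoint
    final List<String> database = new ArrayList<>();         // stand-in for the Postgres table
    private int offset = 0;                                  // next source record to read
    private int checkpointedOffset = 0;                      // offset stored by the last checkpoint

    CheckpointSyncedSinkModel(List<String> source) {
        this.source = source;
    }

    /** Reads one record and only buffers it; nothing reaches the table yet. */
    void processNext() {
        pending.add(source.get(offset));
        offset++;
    }

    /** Models snapshotState(): flush the batch, then record the source position. */
    void snapshotState() {
        database.addAll(pending);
        pending.clear();
        checkpointedOffset = offset;
    }

    /** Models a crash: the buffer is lost and the source rewinds to the last checkpoint. */
    void failAndRestore() {
        pending.clear();
        offset = checkpointedOffset;
    }

    public static void main(String[] args) {
        CheckpointSyncedSinkModel job =
                new CheckpointSyncedSinkModel(Arrays.asList("a", "b", "c", "d", "e"));
        for (int i = 0; i < 3; i++) {
            job.processNext();
        }
        job.snapshotState();   // a, b, c are written
        job.processNext();
        job.processNext();     // d, e are only buffered
        job.failAndRestore();  // buffer lost, source rewinds to d
        job.processNext();
        job.processNext();     // d, e are replayed
        job.snapshotState();
        System.out.println(job.database); // prints [a, b, c, d, e]
    }
}
```

Because rows are flushed before the checkpointed offset advances, a failure cannot skip buffered rows: anything lost from the buffer is read again from the source. A failure between the flush and the completion of a real checkpoint could still write rows twice, which is why this gives at-least-once rather than exactly-once results.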

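The answer above is one way to achieve at-least-once semantics: synchronizing the writes with Flink's checkpoints. The drawback is that the data freshness of the sink is now tightly coupled to the checkpoint interval.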

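As an alternative, you could store the tuples or rows with the (entity, duration, first) fields in Flink's own managed state, so that Flink takes care of checkpointing them (in other words, you make the sink's state fault-tolerant). To do that, you implement the CheckpointedFunction and CheckpointedRestoring interfaces. You then do not need to synchronize your writes with checkpoints, and you can even execute your SQL inserts individually if you do not have to use JDBCOutputFormat. Another solution is to implement only the ListCheckpointed interface, which can be used in a similar way to the deprecated CheckpointedRestoring interface and supports list-style state redistribution.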
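The managed-state alternative above can be sketched in plain Java as follows. This is a model only and does not use the Flink API: the class StatefulBufferingSinkModel, its flush threshold, and the restoreState() helper are hypothetical. In a real job the buffer would be kept in Flink's operator state through the interfaces named above. The sketch also assumes the source resumes right after the checkpoint, so replay is not modelled.

```java
import java.util.ArrayList;
import java.util.List;

/** Plain-Java model (not Flink API) of a sink whose buffer is part of the checkpointed state. */
class StatefulBufferingSinkModel {
    private final int threshold;                        // flush once this many rows are buffered
    private List<String> buffer = new ArrayList<>();    // rows not yet written to the table
    final List<String> database = new ArrayList<>();    // stand-in for the Postgres table

    StatefulBufferingSinkModel(int threshold) {
        this.threshold = threshold;
    }

    /** Buffers a row and writes the batch whenever the threshold is reached. */
    void invoke(String row) {
        buffer.add(row);
        if (buffer.size() >= threshold) {
            database.addAll(buffer);
            buffer.clear();
        }
    }

    /** Models a checkpoint: returns a copy of the buffer without forcing a write. */
    List<String> snapshotState() {
        return new ArrayList<>(buffer);
    }

    /** Models recovery: the buffer is rebuilt from the last snapshot. */
    void restoreState(List<String> snapshot) {
        buffer = new ArrayList<>(snapshot);
    }
}
```

A possible usage: call invoke("a") and invoke("b") on a sink with threshold 3, take a snapshot, create a fresh instance to simulate a restart, call restoreState(snapshot), and then invoke("c"). The restored instance writes a, b and c together, so rows that were buffered at checkpoint time survive the failure even though the write itself was not tied to the checkpoint.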

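JDBCOutputFormat writes values in batches; the default batch size is 5000. You can control this parameter by calling setBatchInterval() in the buildJDBCOutputFormat block. If the job's input is smaller than the interval, the batch is only submitted when the sink is closed, i.e. when the job terminates.

Hello, your comment is actually the answer to my question. I added ".setBatchInterval(1)" at line 106 and it works perfectly. Thank you very much.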
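The batching behaviour described in the comment above explains the symptom in the question. The following plain-Java model illustrates it; it is not the Flink JDBCOutputFormat class, and BatchingWriterModel and its in-memory "database" list are hypothetical stand-ins. It assumes a flush is triggered once the number of buffered rows reaches the batch interval, and once more on close.

```java
import java.util.ArrayList;
import java.util.List;

/** Plain-Java model (not Flink API) of a writer that submits rows in batches. */
class BatchingWriterModel {
    private final int batchInterval;                        // rows to buffer before a flush
    private final List<String> buffer = new ArrayList<>();  // rows waiting to be written
    final List<String> database = new ArrayList<>();        // stand-in for the Postgres table

    BatchingWriterModel(int batchInterval) {
        this.batchInterval = batchInterval;
    }

    /** Buffers one row and flushes once the buffer reaches the batch interval. */
    void writeRecord(String row) {
        buffer.add(row);
        if (buffer.size() >= batchInterval) {
            flush();
        }
    }

    /** Writes all buffered rows to the stand-in table. */
    void flush() {
        database.addAll(buffer);
        buffer.clear();
    }

    /** Closing the writer submits whatever is still buffered. */
    void close() {
        flush();
    }

    public static void main(String[] args) {
        BatchingWriterModel large = new BatchingWriterModel(5000);
        large.writeRecord("alarm-1");
        large.writeRecord("alarm-2");
        large.writeRecord("alarm-3");
        System.out.println(large.database.size()); // prints 0: nothing visible while the job runs
        large.close();
        System.out.println(large.database.size()); // prints 3: rows appear only on close

        BatchingWriterModel small = new BatchingWriterModel(1);
        small.writeRecord("alarm-1");
        System.out.println(small.database.size()); // prints 1: every row is written immediately
    }
}
```

With an interval of 1 every row is written as it arrives, at the cost of one database round trip per row. A larger interval trades latency for throughput.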