Strange behavior of a Java Kafka Streams application in a Docker container
I am running a Kafka Streams application in a Docker container using docker compose, but the application behaves strangely. I have a source topic (topicSource) and multiple destination topics (topicDestination1, topicDestination2 … topicDestination10), and I branch to them based on certain predicates.
topicSource and topicDestination1 have a direct mapping, i.e. every record simply goes to the destination topic without any filtering.
When I run the application locally, or on a server without containers, all of this works fine.
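When I run the streams application in a container (with docker compose and with kubernetes), however, it does not forward all the logs from topicSource to topicDestination1. In fact, only a few records are forwarded. For example, the source topic holds more than 3,000 records while the destination topic has only 6. This is all very strange.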
Here is my Dockerfile:
#FROM openjdk:8u151-jdk-alpine3.7
FROM openjdk:8-jdk
COPY /target/streams-examples-0.1.jar /streamsApp/
COPY /target/libs /streamsApp/libs
COPY log4j.properties /
CMD ["java", "-jar", "/streamsApp/streams-examples-0.1.jar"]
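Note: I build the jar before creating the image, so the image always contains the latest code. I have confirmed that the code running without a container and the code running in a container are identical.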
Main.java:
Create the source stream from the source topic:
KStream<String, String> source_stream = builder.stream("topicSource");
Send the logs from the branches to the destination topics:
AppUtil.pushToTopic(branches_source_topic[0], Constant.SHARING_SET_BY_DATE, "topicDestination2");
AppUtil.pushToTopic(branches_source_topic[1], Constant.ADDED_TO_SECURE_LINK_BY_DATE, "topicDestination3");
AppUtil.pushToTopic(branches_source_topic[2], Constant.ADDED_TO_GROUP_BY_DATE, "topicDestination4");
AppUtil.pushToTopic(branches_source_topic[3], Constant.ROLE_UPDATE_BY_DATE, "topicDestination5");
AppUtil.pushToTopic(branches_source_topic[4], Constant.UPLOAD_FILE_BY_DATE, "topicDestination6");
AppUtil.pushToTopic(branches_source_topic[5], Constant.USER_LOGGED_IN_BY_DATE, "topicDestination7");
AppUtil.pushToTopic(branches_source_topic[6], Constant.MANAGE_USER_BY_DATE, "topicDestination8");
AppUtil.java:
public static void pushToTopic(KStream<String, String> sourceTopic, HashMap<String, String> hmap, String destTopicName) {
    sourceTopic.flatMapValues(new ValueMapper<String, Iterable<String>>() {
        @Override
        public Iterable<String> apply(String value) {
            // An empty list means the record is dropped and nothing is forwarded.
            ArrayList<String> keywords = new ArrayList<String>();
            try {
                JSONObject send = new JSONObject();
                JSONObject received = processJSON(new JSONObject(value), destTopicName);
                boolean valid_json = true;
                // hmap maps each output key to the field name expected in the received JSON.
                for (String key : hmap.keySet()) {
                    if (received.has(hmap.get(key))) {
                        send.put(key, received.get(hmap.get(key)));
                    } else {
                        valid_json = false;
                    }
                }
                // Forward the record only if every expected field was present.
                if (valid_json) {
                    keywords.add(send.toString());
                }
            } catch (Exception e) {
                System.err.println("Unable to convert to json");
                e.printStackTrace();
            }
            return keywords;
        }
    }).to(destTopicName);
}
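Because pushToTopic returns an empty list both when a field is missing and when an exception is caught, records can vanish without any trace in the destination topic. The following is a minimal sketch of how such drops could be counted. It is not the original code: the class DropCounter, the method project and the sample field names are hypothetical, and plain java.util maps stand in for JSONObject so that the example runs without extra libraries.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

public class DropCounter {
    // Per-outcome counters, so that silent drops become visible.
    long forwarded = 0;
    long droppedMissingField = 0;

    /**
     * Mirrors the field-mapping loop of pushToTopic on plain maps.
     * fieldMap maps each output key to the field name expected in the record.
     * Returns the projected record, or empty if any expected field is missing.
     */
    Optional<Map<String, Object>> project(Map<String, ?> received, Map<String, String> fieldMap) {
        Map<String, Object> send = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : fieldMap.entrySet()) {
            if (!received.containsKey(e.getValue())) {
                droppedMissingField++;
                return Optional.empty();
            }
            send.put(e.getKey(), received.get(e.getValue()));
        }
        forwarded++;
        return Optional.of(send);
    }

    public static void main(String[] args) {
        DropCounter counter = new DropCounter();
        // Hypothetical mapping: output key "user" is read from field "user_name".
        Map<String, String> fieldMap = Collections.singletonMap("user", "user_name");
        counter.project(Collections.singletonMap("user_name", "alice"), fieldMap);
        counter.project(Collections.singletonMap("other", "x"), fieldMap);
        // prints "forwarded=1 droppedMissingField=1"
        System.out.println("forwarded=" + counter.forwarded
                + " droppedMissingField=" + counter.droppedMissingField);
    }
}
```

Comparing such counters with the number of records read from the source topic would show whether records are being dropped by the mapper or simply have not been processed yet.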
Where the logs come from:
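The logs come from a continuous online stream. A Python job fetches the logs, which are basically URLs, and sends them to a topic upstream of topicSource. In the streams application I then create a stream from that topic, hit these URLs, get JSON logs back, and push them to topicSource.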
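I have spent a lot of time trying to solve this. I don't know what is going wrong or why not all the logs are processed. Please help me figure this out.

After a lot of debugging, I realized I had been exploring in the wrong direction: this was a simple case of the consumer being slower than the producer. The producer kept writing new records to the topic, and because messages are consumed only after stream processing, the consumer was clearly slow. Simply increasing the number of topic partitions and starting multiple instances of the application with the same application id did the trick.

My only idea is that your flatMapValues() drops data if an exception is thrown... Did you capture stderr and stdout and verify that the data is valid and that flatMapValues() does not drop anything?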
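To make the diagnosis in the answer concrete (a consumer slower than the producer, fixed by more partitions plus more instances), here is a rough back-of-the-envelope model. It is only a sketch: the class LagEstimate, the method backlog and all rates are hypothetical, not measurements from the original application. The one Kafka-specific assumption is that Kafka Streams creates one task per input partition, so instances beyond the partition count stay idle.

```java
public class LagEstimate {
    /**
     * Estimated number of unprocessed records after `seconds`.
     * Rates are in records per second. Effective parallelism is capped by the
     * partition count, since each input partition is processed by a single task.
     */
    static long backlog(long produceRate, long perInstanceRate,
                        int instances, int partitions, long seconds) {
        int active = Math.min(instances, partitions);
        long net = produceRate - perInstanceRate * active;
        return Math.max(0L, net * seconds);
    }

    public static void main(String[] args) {
        // Hypothetical rates: 50 records/s produced, 5 records/s processed per instance.
        System.out.println(backlog(50, 5, 1, 1, 60));   // 2700: one instance, one partition
        System.out.println(backlog(50, 5, 10, 1, 60));  // 2700: extra instances idle on one partition
        System.out.println(backlog(50, 5, 10, 10, 60)); // 0: ten partitions, ten instances keep up
    }
}
```

Under these assumed numbers, adding instances alone changes nothing, which is consistent with the answer's point that the partition count had to be increased as well.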