Apache Kafka: reading the peek topic in Kafka Streams


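I have a topic named push-processing-KSTREAM-PEEK-0000000014-repartition, which is a Kafka internal topic. I did not create this topic myself. After a repartition I call the .peek() method, and I use peek 3-4 times in total.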

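My problem is that I can read the topic with "topic read push-processing-KSTREAM-PEEK-0000000014-repartition", but when I say "topic read push-processing-KSTREAM-PEEK-0000000014-repartition --from-beginning" I cannot read it.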

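This internal topic is created because of the peek method, right?

Or is it related to some other repartitioning code in the stream, even though its name contains KSTREAM-PEEK?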

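It has 50 partitions. Since peek is a stateless operation, it should not create an internal topic, right? Then why is the topic name related to peek, and why can't I read it from the beginning?

Any ideas?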

Here is the first topology:

   Sub-topology: 0
    Source: KSTREAM-SOURCE-0000000000 (topics: [appconnect_deviceIds_exported_for_push])
      --> KSTREAM-FLATMAP-0000000004
    Processor: KSTREAM-FLATMAP-0000000004 (stores: [])
      --> KSTREAM-PEEK-0000000005
      <-- KSTREAM-SOURCE-0000000000
    Processor: KSTREAM-PEEK-0000000005 (stores: [])
      --> KSTREAM-FILTER-0000000007
      <-- KSTREAM-FLATMAP-0000000004
    Processor: KSTREAM-FILTER-0000000007 (stores: [])
      --> KSTREAM-SINK-0000000006
      <-- KSTREAM-PEEK-0000000005
    Sink: KSTREAM-SINK-0000000006 (topic: KSTREAM-PEEK-0000000005-repartition)
      <-- KSTREAM-FILTER-0000000007

  Sub-topology: 1
    Source: KSTREAM-SOURCE-0000000008 (topics: [KSTREAM-PEEK-0000000005-repartition])
      --> KSTREAM-JOIN-0000000009
    Source: KSTREAM-SOURCE-0000000028 (topics: [KSTREAM-PEEK-0000000025-repartition])
      --> KSTREAM-JOIN-0000000029
    Processor: KSTREAM-JOIN-0000000009 (stores: [appconnect_device_stream-STATE-STORE-0000000001])
      --> KSTREAM-MAP-0000000010
      <-- KSTREAM-SOURCE-0000000008
    Processor: KSTREAM-JOIN-0000000029 (stores: [appconnect_device_stream-STATE-STORE-0000000001])
      --> KSTREAM-PEEK-0000000030
      <-- KSTREAM-SOURCE-0000000028
    Processor: KSTREAM-MAP-0000000010 (stores: [])
      --> KSTREAM-PEEK-0000000011
      <-- KSTREAM-JOIN-0000000009
    Processor: KSTREAM-PEEK-0000000030 (stores: [])
      --> KSTREAM-MAP-0000000031
      <-- KSTREAM-JOIN-0000000029
    Processor: KSTREAM-MAP-0000000031 (stores: [])
      --> KSTREAM-SINK-0000000032
      <-- KSTREAM-PEEK-0000000030
    Processor: KSTREAM-PEEK-0000000011 (stores: [])
      --> KSTREAM-SINK-0000000012
      <-- KSTREAM-MAP-0000000010
    Source: KSTREAM-SOURCE-0000000002 (topics: [appconnect_device_stream])
      --> KTABLE-SOURCE-0000000003
    Sink: KSTREAM-SINK-0000000012 (topic: appconnect_devices_exported_for_push)
      <-- KSTREAM-PEEK-0000000011
    Sink: KSTREAM-SINK-0000000032 (topic: appconnect_devices_exported_for_push)
      <-- KSTREAM-MAP-0000000031
    Processor: KTABLE-SOURCE-0000000003 (stores: [appconnect_device_stream-STATE-STORE-0000000001])
      --> none
      <-- KSTREAM-SOURCE-0000000002

  Sub-topology: 2
    Source: KSTREAM-SOURCE-0000000013 (topics: [appconnect_userIds_exported_for_push])
      --> KSTREAM-FLATMAP-0000000017
    Processor: KSTREAM-FLATMAP-0000000017 (stores: [])
      --> KSTREAM-PEEK-0000000018
      <-- KSTREAM-SOURCE-0000000013
    Processor: KSTREAM-PEEK-0000000018 (stores: [])
      --> KSTREAM-FILTER-0000000020
      <-- KSTREAM-FLATMAP-0000000017
    Processor: KSTREAM-FILTER-0000000020 (stores: [])
      --> KSTREAM-SINK-0000000019
      <-- KSTREAM-PEEK-0000000018
    Sink: KSTREAM-SINK-0000000019 (topic: KSTREAM-PEEK-0000000018-repartition)
      <-- KSTREAM-FILTER-0000000020

  Sub-topology: 3
    Source: KSTREAM-SOURCE-0000000021 (topics: [KSTREAM-PEEK-0000000018-repartition])
      --> KSTREAM-JOIN-0000000022
    Processor: KSTREAM-JOIN-0000000022 (stores: [appconnect_user_stream-STATE-STORE-0000000014])
      --> KSTREAM-PEEK-0000000023
      <-- KSTREAM-SOURCE-0000000021
    Processor: KSTREAM-PEEK-0000000023 (stores: [])
      --> KSTREAM-MAP-0000000024
      <-- KSTREAM-JOIN-0000000022
    Processor: KSTREAM-MAP-0000000024 (stores: [])
      --> KSTREAM-PEEK-0000000025
      <-- KSTREAM-PEEK-0000000023
    Processor: KSTREAM-PEEK-0000000025 (stores: [])
      --> KSTREAM-FILTER-0000000027
      <-- KSTREAM-MAP-0000000024
    Processor: KSTREAM-FILTER-0000000027 (stores: [])
      --> KSTREAM-SINK-0000000026
      <-- KSTREAM-PEEK-0000000025
    Source: KSTREAM-SOURCE-0000000015 (topics: [appconnect_user_stream])
      --> KTABLE-SOURCE-0000000016
    Sink: KSTREAM-SINK-0000000026 (topic: KSTREAM-PEEK-0000000025-repartition)
      <-- KSTREAM-FILTER-0000000027
    Processor: KTABLE-SOURCE-0000000016 (stores: [appconnect_user_stream-STATE-STORE-0000000014])
      --> none
      <-- KSTREAM-SOURCE-0000000015
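The peek() operation is not related to this. Looking at the topology description you posted, (part of) your program looks like this:

    KStream inputUser = builder.stream().flatMap().peek().filter();
    KStream inputDevice = builder.stream().flatMap().peek().filter();
    inputUser.join(inputDevice, …);

(It would be easier if you also posted the code in the question.)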

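Because you call flatMap(), Kafka Streams assumes that you changed the key, and therefore the call to join() triggers a repartitioning of the data. The repartition topic name is generated from the upstream operator (to be fair, I am not 100% sure why PEEK is picked rather than FILTER).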

All of these operations use the same key.

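For this case, you may want to use flatMapValues() instead of flatMap(). With flatMapValues(), Kafka Streams knows that the key did not change and therefore does not create a repartition topic.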

Similarly, if you do not change the key, you may want to use mapValues() instead of map() to avoid unnecessary repartitioning.
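
To check the difference yourself, a minimal sketch along the following lines might help. It assumes a reasonably recent kafka-streams dependency on the classpath. The topic names input-events, lookup-table and output-events and the class name RepartitionDemo are made up for illustration and are not taken from the question. The sketch builds the same small pipeline twice, once with flatMap() and once with flatMapValues(), and prints the topology description. No broker is needed, because the topology is only described and never started.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KeyValue;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.KTable;
import org.apache.kafka.streams.kstream.Produced;

public class RepartitionDemo {

    // Builds a small stream-table join and returns the textual topology description.
    // changeKey = true uses flatMap(), changeKey = false uses flatMapValues().
    static String describe(boolean changeKey) {
        StreamsBuilder builder = new StreamsBuilder();

        // Hypothetical topic names, chosen only for this example.
        KStream<String, String> input =
            builder.stream("input-events", Consumed.with(Serdes.String(), Serdes.String()));
        KTable<String, String> table =
            builder.table("lookup-table", Consumed.with(Serdes.String(), Serdes.String()));

        KStream<String, String> expanded;
        if (changeKey) {
            // flatMap() may change the key, so Kafka Streams marks the stream
            // as requiring repartitioning, even though the key is kept here.
            expanded = input.flatMap((key, value) -> {
                List<KeyValue<String, String>> out = new ArrayList<>();
                for (String part : value.split(",")) {
                    out.add(KeyValue.pair(key, part));
                }
                return out;
            });
        } else {
            // flatMapValues() cannot change the key, so no repartitioning is needed.
            expanded = input.flatMapValues(value -> Arrays.asList(value.split(",")));
        }

        expanded
            .peek((key, value) -> { })                        // stateless, present in both variants
            .join(table, (left, right) -> left + "|" + right) // key-based join against the table
            .to("output-events", Produced.with(Serdes.String(), Serdes.String()));

        return builder.build().describe().toString();
    }

    public static void main(String[] args) {
        System.out.println("With flatMap():");
        System.out.println(describe(true));
        System.out.println("With flatMapValues():");
        System.out.println(describe(false));
    }
}
```

In the flatMap() variant the description should contain a sink to a topic ending in -repartition in front of the join. In the flatMapValues() variant it should not, even though peek() is present in both.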

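"My problem is that I can read the topic with 'topic read push-processing-KSTREAM-PEEK-0000000014-repartition', but when I say 'topic read push-processing-KSTREAM-PEEK-0000000014-repartition --from-beginning' I cannot read it."

I am not sure what you mean by this. What does

"when I say 'topic read push-processing-KSTREAM-PEEK-0000000014-repartition --from-beginning'"

mean? Are you referring to the command line tool bin/kafka-console-consumer.sh? In general, yes, you can read repartition topics, but I am not sure why this would be useful.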

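Sorry, I have edited my question and added the topologies. I meant bin/kafka-console-consumer.sh --bootstrap-server :9092 --topic test --from-beginning. I cannot read the internal topic from the beginning. With my logic, I cannot use flatMapValues instead of flatMap. My operations look like this: .flatMap().peek().join().map().peek().to(y topic) ... After that, I consume the y topic continuously with Kafka Streams and run .peek().leftJoin().selectKey().peek().join().filter().to(), and then my processing is done. My concern is that all the operations that select a new key end up using the same key. I could distribute the messages (about 2-3 million records) over different partitions, but would that be faster than using only one partition? I am curious about RocksDB and Kafka disk operations, and how they behave with one partition versus many. According to my topology, all my data goes into the same partition. Does KStream perform better with 1 partition, or should I explicitly distribute the data over partitions? I repartition 3-4 times and always select the same single key, so I am blocking the process by not distributing the data. I am not sure which strategy is faster.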
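
As a rough illustration of the single-key concern raised in this comment, here is a sketch in plain Java without Kafka. It uses String.hashCode() as a stand-in for the hash that the producer's default partitioner applies to the serialized key. That substitution is an assumption made to keep the example self-contained, and the class and method names are hypothetical. The point does not depend on the hash function: records with the same key always map to the same partition.

```java
import java.util.Set;
import java.util.TreeSet;

public class KeySkewDemo {

    // Stand-in for key-based partitioning: hash the key, then take it modulo
    // the partition count. Kafka hashes the serialized key bytes instead;
    // String.hashCode() is used here only for illustration.
    static int partitionFor(String key, int numPartitions) {
        return Math.floorMod(key.hashCode(), numPartitions);
    }

    // Returns the set of partitions that receive at least one record.
    static Set<Integer> partitionsUsed(int records, int numPartitions, boolean sameKey) {
        Set<Integer> used = new TreeSet<>();
        for (int i = 0; i < records; i++) {
            String key = sameKey ? "constant-key" : "key-" + i;
            used.add(partitionFor(key, numPartitions));
        }
        return used;
    }

    public static void main(String[] args) {
        int numPartitions = 50; // partition count mentioned in the question
        System.out.println("same key:      " + partitionsUsed(10_000, numPartitions, true).size()
            + " partition(s) used");
        System.out.println("distinct keys: " + partitionsUsed(10_000, numPartitions, false).size()
            + " partition(s) used");
    }
}
```

With one constant key, only one of the 50 partitions receives data. Kafka Streams creates its tasks per input partition, so the tasks for the other partitions would have nothing to process.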
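"I cannot read the internal topic from the beginning": what exactly is the problem? Do you get an error? And why do you want to do this in the first place? Also, why can't you use flatMapValues()?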

Here is the second topology from the question:

   Sub-topology: 0
    Source: KSTREAM-SOURCE-0000000017 (topics: [KSTREAM-PEEK-0000000014-repartition])
      --> KSTREAM-JOIN-0000000018
    Processor: KSTREAM-JOIN-0000000018 (stores: [appconnect_push_processing_submissions-STATE-STORE-0000000000])
      --> KSTREAM-FILTER-0000000019
      <-- KSTREAM-SOURCE-0000000017
    Processor: KSTREAM-FILTER-0000000019 (stores: [])
      --> KSTREAM-SINK-0000000020
      <-- KSTREAM-JOIN-0000000018
    Source: KSTREAM-SOURCE-0000000001 (topics: [appconnect_push_processing_submissions])
      --> KTABLE-SOURCE-0000000002
    Sink: KSTREAM-SINK-0000000020 (topic: appconnect_push_send_bulk)
      <-- KSTREAM-FILTER-0000000019
    Processor: KTABLE-SOURCE-0000000002 (stores: [appconnect_push_processing_submissions-STATE-STORE-0000000000])
      --> none
      <-- KSTREAM-SOURCE-0000000001

  Sub-topology: 1
    Source: KSTREAM-SOURCE-0000000003 (topics: [appconnect_devices_exported_for_push])
      --> KSTREAM-MAP-0000000007
    Processor: KSTREAM-MAP-0000000007 (stores: [])
      --> KSTREAM-PEEK-0000000008
      <-- KSTREAM-SOURCE-0000000003
    Processor: KSTREAM-PEEK-0000000008 (stores: [])
      --> KSTREAM-FILTER-0000000010
      <-- KSTREAM-MAP-0000000007
    Processor: KSTREAM-FILTER-0000000010 (stores: [])
      --> KSTREAM-SINK-0000000009
      <-- KSTREAM-PEEK-0000000008
    Sink: KSTREAM-SINK-0000000009 (topic: KSTREAM-PEEK-0000000008-repartition)
      <-- KSTREAM-FILTER-0000000010

  Sub-topology: 2
    Source: KSTREAM-SOURCE-0000000011 (topics: [KSTREAM-PEEK-0000000008-repartition])
      --> KSTREAM-LEFTJOIN-0000000012
    Processor: KSTREAM-LEFTJOIN-0000000012 (stores: [appconnect_user_stream-STATE-STORE-0000000004])
      --> KSTREAM-KEY-SELECT-0000000013
      <-- KSTREAM-SOURCE-0000000011
    Processor: KSTREAM-KEY-SELECT-0000000013 (stores: [])
      --> KSTREAM-PEEK-0000000014
      <-- KSTREAM-LEFTJOIN-0000000012
    Processor: KSTREAM-PEEK-0000000014 (stores: [])
      --> KSTREAM-FILTER-0000000016
      <-- KSTREAM-KEY-SELECT-0000000013
    Processor: KSTREAM-FILTER-0000000016 (stores: [])
      --> KSTREAM-SINK-0000000015
      <-- KSTREAM-PEEK-0000000014
    Source: KSTREAM-SOURCE-0000000005 (topics: [appconnect_user_stream])
      --> KTABLE-SOURCE-0000000006
    Sink: KSTREAM-SINK-0000000015 (topic: KSTREAM-PEEK-0000000014-repartition)
      <-- KSTREAM-FILTER-0000000016
    Processor: KTABLE-SOURCE-0000000006 (stores: [appconnect_user_stream-STATE-STORE-0000000004])
      --> none
      <-- KSTREAM-SOURCE-0000000005