Java Storm topology goes idle and stops reading messages from KafkaSpout


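My Storm topology runs for a while (for example 18 or 21 hours) and then goes idle. After that, it no longer responds to messages sent through Kafka. I have looked through the logs, but I cannot tell what is going wrong.

While the topology is not responding to Kafka messages, I see the following in the logs: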

.ZkCoordinator [INFO] Task [7/8] Refreshing partition manager connections
2016-03-30 02:59:15 s.k.DynamicBrokersReader [INFO] Read partition info from zookeeper: GlobalPartitionInformation{partitionMap={0=IP:6667, 1=IP:6667, 2=IP:6667, 3=IP:6667, 4=IP:6667, 5=IP:6667, 6=IP:6667, 7=IP:6667}
2016-03-30 02:59:15 s.k.KafkaUtils [INFO] Task [7/8] assigned [Partition{host=IP:6667, partition=6}]
2016-03-30 02:59:15 s.k.ZkCoordinator [INFO] Task [7/8] Deleted partition managers: []
2016-03-30 02:59:15 s.k.ZkCoordinator [INFO] Task [7/8] New partition managers: []
2016-03-30 02:59:15 s.k.ZkCoordinator [INFO] Task [7/8] Finished refreshing
2016-03-30 03:01:15 s.k.ZkCoordinator [INFO] Task [7/8] Refreshing partition manager connections
2016-03-30 03:01:15 s.k.DynamicBrokersReader [INFO] Read partition info from zookeeper: GlobalPartitionInformation{partitionMap={0=IP:6667, 1=IP:6667, 2=IP:6667, 3=IP:6667, 4=IP:6667, 5=IP:6667, 6=IP:6667}
2016-03-30 03:01:15 s.k.KafkaUtils [INFO] Task [7/8] assigned [Partition{host=IP:6667, partition=6}]
2016-03-30 03:01:15 s.k.ZkCoordinator [INFO] Task [7/8] Deleted partition managers: []
2016-03-30 03:01:15 s.k.ZkCoordinator [INFO] Task [7/8] New partition managers: []
2016-03-30 03:01:15 s.k.ZkCoordinator [INFO] Task [7/8] Finished refreshing
2016-03-30 03:03:15 s.k.ZkCoordinator [INFO] Task [7/8] Refreshing partition manager connections
2016-03-30 03:03:15 s.k.DynamicBrokersReader [INFO] Read partition info from zookeeper: GlobalPartitionInformation{partitionMap={0=IP:6667, 1=IP:6667, 2=IP:6667, 3=IP:6667, 4=IP:6667, 5=IP:6667, 6=IP}
2016-03-30 03:03:15 s.k.KafkaUtils [INFO] Task [7/8] assigned [Partition{host=IP:6667, partition=6}]
2016-03-30 03:03:15 s.k.ZkCoordinator [INFO] Task [7/8] Deleted partition managers: []
2016-03-30 03:03:15 s.k.ZkCoordinator [INFO] Task [7/8] New partition managers: []
2016-03-30 03:03:15 s.k.ZkCoordinator [INFO] Task [7/8] Finished refreshing
2016-03-30 03:05:15 s.k.ZkCoordinator [INFO] Task [7/8] Refreshing partition manager connections

How can I track down the problem?

Below is the thread dump for one of the PIDs:

Thread 26783: (state = BLOCKED)
 - sun.misc.Unsafe.park(boolean, long) @bci=0 (Compiled frame; information may be imprecise)
 - java.util.concurrent.locks.LockSupport.parkNanos(java.lang.Object, long) @bci=20, line=226 (Compiled frame)
 - java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(long) @bci=68, line=2082 (Compiled frame)
 - java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take() @bci=122, line=1090 (Compiled frame)
 - java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take() @bci=1, line=807 (Compiled frame)
 - java.util.concurrent.ThreadPoolExecutor.getTask() @bci=156, line=1068 (Compiled frame)
 - java.util.concurrent.ThreadPoolExecutor.runWorker(java.util.concurrent.ThreadPoolExecutor$Worker) @bci=26, line=1130 (Compiled 
 - java.util.concurrent.ThreadPoolExecutor$Worker.run() @bci=5, line=615 (Interpreted frame)
 - java.lang.Thread.run() @bci=11, line=745 (Interpreted frame)
Thread 26776: (state = BLOCKED)
 - sun.misc.Unsafe.park(boolean, long) @bci=0 (Compiled frame; information may be imprecise)
 - java.util.concurrent.locks.LockSupport.park(java.lang.Object) @bci=14, line=186 (Compiled frame)
 - java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await() @bci=42, line=2043 (Compiled frame)
 - java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take() @bci=98, line=1085 (Compiled frame)
 - java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take() @bci=1, line=807 (Compiled frame)
 - java.util.concurrent.ThreadPoolExecutor.getTask() @bci=156, line=1068 (Compiled frame)
 - java.util.concurrent.ThreadPoolExecutor.runWorker(java.util.concurrent.ThreadPoolExecutor$Worker) @bci=26, line=1130 (Interpreted 
 - java.util.concurrent.ThreadPoolExecutor$Worker.run() @bci=5, line=615 (Interpreted frame)
 - java.lang.Thread.run() @bci=11, line=745 (Interpreted frame)
Thread 26769: (state = BLOCKED)
 - sun.misc.Unsafe.park(boolean, long) @bci=0 (Interpreted frame)
 - java.util.concurrent.locks.LockSupport.park(java.lang.Object) @bci=14, line=186 (Interpreted frame)
 - java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await() @bci=42, line=2043 (Interpreted frame)
 - java.util.concurrent.DelayQueue.take() @bci=28, line=209 (Interpreted frame)
 - java.util.concurrent.DelayQueue.take() @bci=1, line=68 (Interpreted frame)
 - org.apache.curator.framework.imps.CuratorFrameworkImpl.backgroundOperationsLoop() @bci=10, line=781 (Interpreted frame)
 - org.apache.curator.framework.imps.CuratorFrameworkImpl.access$400(org.apache.curator.framework.imps.CuratorFrameworkImpl)                line=57 (Interpreted frame)
 - org.apache.curator.framework.imps.CuratorFrameworkImpl$4.call() @bci=4, line=275 (Interpreted frame)

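Which version of Storm are you using?

@JayaAnanthram storm-0.9.3

Did you try taking a thread dump of all the workers, especially the spout workers? I once ran into a similar issue, and after taking a thread dump I found a Java code-level deadlock around a log statement. The spout worker was stuck in that deadlock, so it was not trying to consume messages from the topic even though Kafka had plenty of pending messages. A thread dump can be taken with
jstack PID_OF_SPOUT_WORKER > dump.txt
In dump.txt, you can find the deadlock information at the end.

@JayaAnanthram I am using RHEL. How do I take a thread dump?

It has nothing to do with your OS. Go to
$JAVA_HOME/bin
and type
jstack PID_OF_THE_SPOUT_WORKER > dump.txt
PID_OF_THE_SPOUT_WORKER is the process ID of the Storm spout worker.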
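The deadlock check described in the comments can also be done from inside a JVM with the standard java.lang.management API. The following is a minimal sketch, not part of Storm or storm-kafka: the class name DeadlockCheck and the method findDeadlocks are made up for illustration, and the check only sees threads of the JVM it runs in, so it would have to be called from code running inside the spout worker (or adapted to a remote JMX connection) to be useful here.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Hypothetical helper class; the name is an assumption for this example.
public class DeadlockCheck {

    /** Returns details of threads deadlocked in the current JVM, or an empty array if none. */
    static ThreadInfo[] findDeadlocks() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        // Covers deadlocks on object monitors and on java.util.concurrent locks.
        long[] ids = mx.findDeadlockedThreads(); // null when no deadlock exists
        if (ids == null) {
            return new ThreadInfo[0];
        }
        // true, true: also report locked monitors and locked synchronizers.
        return mx.getThreadInfo(ids, true, true);
    }

    public static void main(String[] args) {
        ThreadInfo[] deadlocked = findDeadlocks();
        if (deadlocked.length == 0) {
            System.out.println("No deadlocked threads found");
        }
        for (ThreadInfo info : deadlocked) {
            // ThreadInfo.toString() includes the thread name, state, and a stack trace.
            System.out.print(info);
        }
    }
}
```

If this reports nothing, the worker may be idle for a reason other than a deadlock, and a full jstack dump of the spout worker is still the more complete picture.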