Java Storm Word Count topology: a conceptual question about the number of executions

Good afternoon. I am working through the Storm word count example. The Java files are below for reference.

This is the main file:

    public class WordCountTopology {
      public static class SplitSentence extends ShellBolt implements IRichBolt {

        public SplitSentence() {
          super("python", "splitsentence.py");
        }

        @Override
        public void declareOutputFields(OutputFieldsDeclarer declarer) {
          declarer.declare(new Fields("word"));
        }

        @Override
        public Map<String, Object> getComponentConfiguration() {
          return null;
        }
      }

      public static class WordCount extends BaseBasicBolt {
        Map<String, Integer> counts = new HashMap<String, Integer>();

        @Override
        public void execute(Tuple tuple, BasicOutputCollector collector) {
          String word = tuple.getString(0);
          Integer count = counts.get(word);
          if (count == null)
            count = 0;
          count++;
          counts.put(word, count);
          collector.emit(new Values(word, count));
        }

        @Override
        public void declareOutputFields(OutputFieldsDeclarer declarer) {
          declarer.declare(new Fields("word", "count"));
        }
      }

      public static void main(String[] args) throws Exception {

        TopologyBuilder builder = new TopologyBuilder();

        builder.setSpout("spout", new TextFileSpout(), 5);

        builder.setBolt("split", new SplitSentence(), 8).shuffleGrouping("spout");
        builder.setBolt("count", new WordCount(), 12).fieldsGrouping("split", new Fields("word"));

        Config conf = new Config();
        conf.setDebug(true);

        if (args != null && args.length > 0) {
          conf.setNumWorkers(3);

          StormSubmitter.submitTopology(args[0], conf, builder.createTopology());
        }
        else {
          conf.setMaxTaskParallelism(3);
          LocalCluster cluster = new LocalCluster();
          cluster.submitTopology("word-count", conf, builder.createTopology());
          Thread.sleep(10000);
          cluster.shutdown();
        }
      }
    }

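When this code runs, it prints a large amount of thread/emit output. The problem is that the program processes the same sentence 85 times instead of once. My guess is that this happens because the original code was written to emit a new random sentence over and over.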


What causes nextTuple to be called so many times?

You should move the file initialization code into the open method. Otherwise your file handle is re-initialized every time nextTuple is called.

Edit:

In the open method, do something like this:

    br = new BufferedReader(new FileReader(csvFileToRead));
The logic that reads the file should then go in the nextTuple method:

     while ((line = br.readLine()) != null) {
         // your logic
     }
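
The open/nextTuple split described above can be tried out without a Storm cluster. The following is a minimal sketch, not a real spout: `OneShotLineSource` and its `emitted` list are hypothetical names invented for illustration, no Storm classes are used, and a `StringReader` stands in for the file. It assumes the framework calls `open()` once and then calls `nextTuple()` repeatedly, and it emits at most one line per call rather than looping over the whole file.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a spout; it does not use any Storm classes.
class OneShotLineSource {
    private final String text;       // stands in for the file contents
    private BufferedReader reader;   // created once, in open()
    private final List<String> emitted = new ArrayList<>();

    OneShotLineSource(String text) {
        this.text = text;
    }

    // Called once before any nextTuple() call, like a spout's open().
    void open() {
        reader = new BufferedReader(new StringReader(text));
    }

    // Called over and over by the framework. Emits at most one line per call
    // and does nothing once the input is exhausted.
    void nextTuple() {
        try {
            String line = reader.readLine();
            if (line != null) {
                emitted.add(line);   // stands in for collector.emit(...)
            }
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }

    List<String> emitted() {
        return emitted;
    }
}

class OneShotLineSourceDemo {
    public static void main(String[] args) {
        OneShotLineSource source = new OneShotLineSource("wordOne wordTwo\n");
        source.open();
        // Simulate the framework calling nextTuple() far more often than there is data.
        for (int i = 0; i < 85; i++) {
            source.nextTuple();
        }
        System.out.println(source.emitted()); // prints [wordOne wordTwo]
    }
}
```

Because the reader keeps its position between calls, the number of `nextTuple()` calls no longer affects how many times each line is emitted.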

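Can you share your spout code?

@user2720864 I have shared the spout code. Sorry about that.

I have moved the file initialization into open. The resulting sentence is now correct, with all the words in the file separated by spaces. However, nextTuple is called 86 times, so my counts are 86 times what they should be. I think this narrows my problem down to how to have nextTuple called only once. Thank you very much for your time.

The reading logic should be in the nextTuple method. I have updated my answer.

Thanks for your answer. Even when I remove the whole file-reading part and reduce the sentence to a single word, nextTuple is still called 85 times. Do you know how Storm decides how many times to run nextTuple in this case? Maybe I am missing some configuration. Thanks.

I reduced the code to one sentence. I looked through the code and cannot tell what calls nextTuple. I need the spout to run once and return wordOne:1 and Word2:1, not 85 and 85. Thank you.

Storm is designed for streaming sources that emit data as it becomes available. nextTuple() is called in an infinite loop, so in your case the spout needs to keep track of where it is in the data source. If you want at-least-once processing, it should also track ack() and fail() calls.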
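
The bookkeeping described in the last comment (a position in the data source plus ack/fail tracking) can be modelled in plain Java. This is a sketch under stated assumptions, not Storm's API: `ReplayingSource`, the integer message ids, and the `-1` "nothing to emit" return value are all hypothetical, and a real spout would emit through its collector instead of returning an id.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

// Hypothetical, Storm-free model of a spout that replays failed messages.
class ReplayingSource {
    private final List<String> lines;
    private int position = 0;                                      // next unread line
    private final Map<Integer, String> pending = new HashMap<>();  // emitted, not yet acked
    private final Queue<Integer> retries = new ArrayDeque<>();     // failed ids awaiting replay

    ReplayingSource(List<String> lines) {
        this.lines = lines;
    }

    // Returns the message id of the emitted line, or -1 if there is nothing to emit.
    int nextTuple() {
        Integer retryId = retries.poll();
        if (retryId != null) {
            return retryId;              // re-emit pending.get(retryId) under the same id
        }
        if (position < lines.size()) {
            int id = position++;
            pending.put(id, lines.get(id));
            return id;
        }
        return -1;
    }

    // The line was fully processed downstream; forget it.
    void ack(int id) {
        pending.remove(id);
    }

    // The line failed downstream; queue it for another attempt.
    void fail(int id) {
        if (pending.containsKey(id)) {
            retries.add(id);
        }
    }

    boolean isDone() {
        return position == lines.size() && pending.isEmpty();
    }
}

class ReplayingSourceDemo {
    public static void main(String[] args) {
        ReplayingSource source = new ReplayingSource(Arrays.asList("wordOne wordTwo"));
        int id = source.nextTuple();        // first attempt
        source.fail(id);                    // downstream reports a failure
        int replayed = source.nextTuple();  // the same id is emitted again
        source.ack(replayed);
        System.out.println(id == replayed && source.isDone()); // prints true
    }
}
```

Note that replaying a failed line means downstream bolts may see it more than once, which is what at-least-once processing implies; a counting bolt would then count those words again.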