Java: a bolt that counts a user's original tweets
After using Storm to store all the tweets I downloaded in a MongoDB database, I am trying to count each user's original tweets. However, whenever I use the code below to count an author's original tweets, it keeps reading (and counting) the same tweets over and over.

The bolt:
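    import java.util.HashMap;
    import java.util.Map;
    // Storm 1.0+ package names; releases before 1.0 use backtype.storm.* instead.
    import org.apache.storm.topology.BasicOutputCollector;
    import org.apache.storm.topology.OutputFieldsDeclarer;
    import org.apache.storm.topology.base.BaseBasicBolt;
    import org.apache.storm.tuple.Fields;
    import org.apache.storm.tuple.Tuple;
    import org.apache.storm.tuple.Values;

    public class CalculateTheMetrics extends BaseBasicBolt {
        // Running count of original tweets per author
        Map<String, Double> OT1 = new HashMap<String, Double>();

        @Override
        public void declareOutputFields(OutputFieldsDeclarer declarer) {
            declarer.declare(new Fields("USERNAME", "OT1"));
        }

        @Override
        public void execute(Tuple input, BasicOutputCollector collector) {
            String author = input.getString(0);
            String tweet = input.getString(2);
            Double OT1 = this.OT1.get(author);
            if (OT1 == null) {
                OT1 = 0.0;
            }
            if (author != null && tweet != null) {
                // Count only tweets that are neither replies ("@...") nor
                // retweets ("RT ..."); both tests must hold, hence &&.
                if (!tweet.startsWith("@") && !tweet.startsWith("RT")) {
                    OT1 += 1;
                }
                this.OT1.put(author, OT1);
                System.out.println(author + " " + OT1);
                collector.emit(new Values(author, OT1));
            }
        }
    }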
What I want is for it to stop reading and counting the same tweet repeatedly. Please help.

Something like this:
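    import java.util.HashMap;
    import java.util.Map;
    // Storm 1.0+ package names; releases before 1.0 use backtype.storm.* instead.
    import org.apache.storm.task.OutputCollector;
    import org.apache.storm.task.TopologyContext;
    import org.apache.storm.topology.OutputFieldsDeclarer;
    import org.apache.storm.topology.base.BaseRichBolt;
    import org.apache.storm.tuple.Fields;
    import org.apache.storm.tuple.Tuple;
    import org.apache.storm.tuple.Values;

    public class Calculate1Metric extends BaseRichBolt {
        private OutputCollector collector;
        Map<String, Integer> OT1;

        @Override
        public void declareOutputFields(OutputFieldsDeclarer declarer) {
            declarer.declare(new Fields("username", "OT1"));
        }

        @Override
        public void prepare(Map stormConf, TopologyContext context, OutputCollector collector) {
            this.collector = collector;
            this.OT1 = new HashMap<String, Integer>();
        }

        @Override
        public void execute(Tuple input) {
            String author = input.getString(0);
            String tweet = input.getString(2);
            if (author != null && tweet != null) {
                Integer OT1 = this.OT1.get(author);
                if (OT1 == null) {
                    OT1 = 0;
                }
                // Count only tweets that are neither replies nor retweets.
                if (!tweet.startsWith("@") && !tweet.contains("RT ") && !tweet.startsWith("RT")) {
                    OT1 += 1;
                }
                if (!this.OT1.containsKey(author)) {
                    // First tuple seen for this author: just store the count.
                    this.OT1.put(author, OT1);
                } else {
                    // Emit anchored to the input tuple so Storm can track it,
                    // then drop the author's entry.
                    collector.emit(input, new Values(author, OT1));
                    System.out.println(author + " " + OT1);
                    this.OT1.remove(author);
                }
                // Ack only tuples that were processed successfully.
                collector.ack(input);
            } else {
                collector.fail(input);
            }
        }
    }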
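Did you enable fault tolerance? In that case you need to ack each processed tuple in the bolt: collector.ack(input). (A side remark: why do you use Double rather than Long for counting?)

Adding to @MatthiasJ.Sax's comment: you also need to anchor tuples when you emit them so that they are acked correctly.

But doesn't BaseBasicBolt ack and anchor automatically?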
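To make the side remark about Long and the repeated-counting problem concrete, here is a minimal plain-Java sketch of the counting logic on its own, with no Storm dependency. It assumes each tweet carries a unique ID, which the thread does not confirm; the class name OriginalTweetCounter, the method names, and the tweetId parameter are all hypothetical. It also assumes the simple heuristic from the question (a tweet is original when it starts with neither "@" nor "RT"), which will misclassify an ordinary tweet that happens to begin with those letters.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of Storm: counts original tweets per author
// and ignores a tweet ID it has already seen, so a replayed tuple is not
// counted twice.
final class OriginalTweetCounter {
    // Long rather than Double: the values are whole-number counts.
    private final Map<String, Long> originalsByAuthor = new HashMap<>();
    // Assumption: every tweet has a unique ID. This set grows without bound,
    // so a real topology would need to cap or expire it.
    private final Set<String> seenTweetIds = new HashSet<>();

    // Heuristic from the question: neither a reply nor a retweet.
    static boolean isOriginal(String tweet) {
        return tweet != null && !tweet.startsWith("@") && !tweet.startsWith("RT");
    }

    // Records one tweet and returns the author's current original-tweet count.
    long record(String author, String tweetId, String tweet) {
        boolean firstTime = seenTweetIds.add(tweetId); // false if already seen
        if (firstTime && isOriginal(tweet)) {
            originalsByAuthor.merge(author, 1L, Long::sum);
        }
        return originalsByAuthor.getOrDefault(author, 0L);
    }

    public static void main(String[] args) {
        OriginalTweetCounter counter = new OriginalTweetCounter();
        System.out.println(counter.record("alice", "t1", "hello storm"));  // 1
        System.out.println(counter.record("alice", "t1", "hello storm"));  // 1 (replay ignored)
        System.out.println(counter.record("alice", "t2", "RT @bob: hi"));  // 1 (retweet)
        System.out.println(counter.record("alice", "t3", "another one"));  // 2
    }
}
```

Inside a bolt, execute could call record with the author, ID, and text read from the tuple and emit the returned count; which tuple field holds the ID depends on the spout and is not shown in the thread.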