Unexpected output from the reduceByKey function in Apache Spark


I am writing code in which I need to aggregate keys using the reduceByKey function.

// mapToPair code

JavaPairRDD<String, Integer> taxiPair = taxiData.mapToPair(
    x -> {
        if (!x.isEmpty()) {
            String[] split = x.split(",");
            x = split[9]; // extracting index value 9
        }
        return new Tuple2<String, Integer>("Payment:" + x, 1);
    }
);

List<Tuple2<String, Integer>> sample = taxiPair.take(10);
for (Tuple2<String, Integer> t : sample) {
    System.out.println(t._1 + "," + t._2);
}
Based on the above, my understanding is that once reduceByKey completes, it should give the following result:

Payment:1,9
Payment:2,1
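The expected per-key counting can be sketched without Spark. The sketch below (class name `ReduceByKeySketch` is hypothetical, and the ten-record sample is assumed from the expected output above: nine records with payment type 1 and one with payment type 2) simulates `mapToPair` followed by `reduceByKey((x, y) -> x + y)` using a plain `Map.merge`:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ReduceByKeySketch {
    // Each key is mapped to ("Payment:" + key, 1), and the 1s are summed
    // per key -- the same shape as mapToPair + reduceByKey with (x, y) -> x + y.
    static Map<String, Integer> countByKey(List<String> keys) {
        Map<String, Integer> counts = new HashMap<>();
        for (String k : keys) {
            counts.merge("Payment:" + k, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Assumed sample: nine records of payment type 1, one of type 2
        List<String> keys = List.of("1", "1", "1", "1", "1", "1", "1", "1", "1", "2");
        Map<String, Integer> counts = countByKey(keys);
        System.out.println(counts.get("Payment:1")); // 9
        System.out.println(counts.get("Payment:2")); // 1
    }
}
```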
However, that is not what happens.

// reduceByKey code

JavaPairRDD<String, Integer> taxiReduce = taxiPair.reduceByKey(
    (x, y) -> (y + y)
);


List<Tuple2<String, Integer>> sample2 = taxiReduce.collect();
for (Tuple2<String, Integer> t : sample2) {
    System.out.println(t._1 + "," + t._2);
}
There is a typo in the statement; "x+y" is needed here instead of "y+y":

JavaPairRDD<String, Integer> taxiReduce = taxiPair.reduceByKey(
    (x, y) -> (y + y));

It should be:

(x, y) -> (x + y)

While this may answer the question, it would be better to add some context explaining what the code does.
The actual output, with every key reduced to 2, was:

Payment:3,2
Payment:2,2
Payment:,2
Payment:4,2
Payment:1,2
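The constant 2 in that output follows from the buggy lambda: `(x, y) -> (y + y)` ignores the running total `x` and returns twice the incoming value `y`, so when the per-record values of 1 are combined one after another the result stays at 2. (Spark may combine partial results in any order, so 2 is not guaranteed in general; the sequential left fold below, with hypothetical class name `BrokenReduceDemo`, is only an illustration of the principle.)

```java
import java.util.List;

public class BrokenReduceDemo {
    // Left fold with the buggy combiner (x, y) -> y + y:
    // the accumulator is discarded at every step.
    static int foldBuggy(List<Integer> values) {
        int acc = values.get(0);
        for (int i = 1; i < values.size(); i++) {
            acc = values.get(i) + values.get(i); // x (= acc) is never used
        }
        return acc;
    }

    // Left fold with the intended combiner (x, y) -> x + y.
    static int foldCorrect(List<Integer> values) {
        int acc = values.get(0);
        for (int i = 1; i < values.size(); i++) {
            acc = acc + values.get(i);
        }
        return acc;
    }

    public static void main(String[] args) {
        List<Integer> nineOnes = List.of(1, 1, 1, 1, 1, 1, 1, 1, 1);
        System.out.println(foldBuggy(nineOnes));   // 2, no matter how many ones
        System.out.println(foldCorrect(nineOnes)); // 9, the intended count
    }
}
```

This also shows why reduceByKey requires an associative and commutative function: `(x, y) -> y + y` depends on argument position, so its result changes with the order in which partitions are merged.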