setOutputKeyComparator: MapReduce secondary sort (after value grouping)

Tags: sorting, hadoop, mapreduce, key, grouping


I am trying out a MapReduce program on the AskUbuntu dataset.

My mapper output has the following syntax:

Key: TAG-<TAG_NAME>-<PARAM>  Value: <PARAM>-<Count>
I use a Partitioner to partition the mapper output by the first three characters of the key, i.e. TAG.
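The partitioning rule described above could look roughly like the sketch below. It is written as a plain static helper so it runs without Hadoop on the classpath; the class and method names (`PrefixPartitionSketch`, `partitionFor`) are hypothetical, and in a real job the same body would presumably sit inside `Partitioner.getPartition(key, value, numPartitions)`.

```java
// Hypothetical helper, not from the original post. It mirrors what the body of
// a Hadoop Partitioner.getPartition(key, value, numPartitions) could do.
final class PrefixPartitionSketch {
    private PrefixPartitionSketch() {}

    /** Partitions on the first three characters of the key (or the whole key if shorter). */
    static int partitionFor(String key, int numPartitions) {
        String prefix = key.length() < 3 ? key : key.substring(0, 3);
        // Mask the sign bit so the result is never negative.
        return (prefix.hashCode() & Integer.MAX_VALUE) % numPartitions;
    }
}
```

Note that if every key starts with TAG, this rule sends all records to the same reducer, whatever the number of partitions.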

In addition, I use a value grouping comparator to group the values of related keys, for example all records for the tag windows:

public static class ValueGroupingComparator implements RawComparator<Text> {

        /* The value grouping comparator groups keys by their prefix, up to (but not including) the second hyphen ("-"). */
        public int compare(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) {
            String sOne = new String(b1);
            String sTwo = new String(b2);

            // return new Character((char)b1[0]).compareTo((char)b2[0]);
            return sOne.substring(0, sOne.indexOf('-', 4)).compareTo(
                    sTwo.substring(0, sTwo.indexOf('-', 4)));
        }

        public int compare(Text o1, Text o2) {
            return compare(o1.getBytes(), 0, o1.getLength(), o2.getBytes(), 0,
                    o2.getLength());
        }
    }
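One thing to watch in a raw comparator like the one above: `new String(b1)` decodes the whole buffer and ignores `s1` and `l1`, so it may read bytes that belong to other records. The sketch below is a plain-Java illustration of the same grouping rule that respects the offset and length it is given. The names `GroupingSketch`, `groupPrefix` and `compareGroups` are hypothetical. It also does not model the variable-length size prefix that a serialized `Text` carries; in a real Hadoop `RawComparator` that prefix would need to be skipped first (for example with `WritableUtils.decodeVIntSize(b1[s1])`).

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration of the grouping rule; not a drop-in Hadoop class.
final class GroupingSketch {
    private GroupingSketch() {}

    /** Returns the key up to (not including) the second '-', or the whole key if there is none. */
    static String groupPrefix(String key) {
        int first = key.indexOf('-');
        int second = first < 0 ? -1 : key.indexOf('-', first + 1);
        return second < 0 ? key : key.substring(0, second);
    }

    /** Compares two keys by group prefix, decoding only bytes [s, s + l) of each buffer. */
    static int compareGroups(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) {
        String one = new String(b1, s1, l1, StandardCharsets.UTF_8);
        String two = new String(b2, s2, l2, StandardCharsets.UTF_8);
        return groupPrefix(one).compareTo(groupPrefix(two));
    }
}
```

Unlike `substring(0, indexOf('-', 4))`, this version does not throw when a key has no second hyphen; it treats the whole key as the group.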
This lets my reducer output the following:

Key - TAG-windows Value - QUE-1242, VIEWS-4370
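To make the intended result concrete, here is a small plain-Java simulation of the reduce side: records are sorted by full key, and values whose keys share a group prefix are joined into one line. It is only a sketch under assumptions: the keys `TAG-windows-QUE` and `TAG-windows-VIEWS` are inferred from the key syntax above, each key is assumed to occur once (for example after a combiner), and `ReduceGroupingSketch` is a hypothetical name.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical simulation of sort + grouping; it does not use the Hadoop API.
final class ReduceGroupingSketch {
    private ReduceGroupingSketch() {}

    /** Returns the key up to (not including) the second '-', or the whole key if there is none. */
    static String groupPrefix(String key) {
        int first = key.indexOf('-');
        int second = first < 0 ? -1 : key.indexOf('-', first + 1);
        return second < 0 ? key : key.substring(0, second);
    }

    /** Sorts records by full key, then joins the values of keys that share a group prefix. */
    static Map<String, String> reduce(Map<String, String> mapperOutput) {
        // TreeMap orders by the full key, standing in for the shuffle sort.
        Map<String, String> sorted = new TreeMap<>(mapperOutput);
        // LinkedHashMap keeps groups in the order they are first seen.
        Map<String, String> result = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : sorted.entrySet()) {
            result.merge(groupPrefix(e.getKey()), e.getValue(), (a, b) -> a + ", " + b);
        }
        return result;
    }
}
```

Feeding it `TAG-windows-VIEWS -> VIEWS-4370` and `TAG-windows-QUE -> QUE-1242` yields a single entry for `TAG-windows` with the value `QUE-1242, VIEWS-4370`, because `QUE` sorts before `VIEWS`.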
I tried the following key comparator, but I could not get the expected output.

public static class KeyComparator extends WritableComparator {
    public KeyComparator() {
        super(Text.class);
    }

    public int compare(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) {

        int hypen = '-';
        int s1Ind = 0;
        int s2Ind = 0;
        for (int i = 4; i < b1.length; i++) {
            if (b1[i] == hypen) {
                s1Ind = i;
                break;
            }
        }

        for (int i = 4; i < b2.length; i++) {
            if (b2[i] == hypen) {
                s2Ind = i;
                break;
            }
        }

        if (s1Ind == 0 || s2Ind == 0)
            System.out.println(s1Ind + "<->" + s2Ind);

        int compare = compareBytes(b1, s1, s1Ind, b2, s2, s2Ind);
        if (compare == 0) {
            return compareBytes(b1, s1Ind + 1, l1 - s1Ind + 2, b2,
                    s2Ind + 1, l2 - s2Ind + 2);             
        }
        return compare;
    }
}
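A few details in the comparator above look like plausible causes of the mismatch. The hyphen search starts at absolute index 4 and runs to `b1.length`, rather than staying inside the key's own range `[s1, s1 + l1)`. Also, `compareBytes` takes lengths as its third and sixth arguments, but it is given the indices `s1Ind` and `s2Ind`, and the remainder length `l1 - s1Ind + 2` does not correspond to the bytes after the hyphen. The sketch below shows the same two-stage comparison with offsets and lengths handled consistently. It is plain Java with its own `compareBytes`, the names are hypothetical, and it assumes the key bytes start directly at `s1` (a serialized `Text` also has a length prefix that would have to be skipped first).

```java
// Hypothetical sketch of the two-stage key comparison; not a Hadoop class.
final class KeyCompareSketch {
    private KeyCompareSketch() {}

    /** Lexicographic comparison of unsigned bytes; l1 and l2 are lengths, not end indices. */
    static int compareBytes(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) {
        int n = Math.min(l1, l2);
        for (int i = 0; i < n; i++) {
            int x = b1[s1 + i] & 0xff;
            int y = b2[s2 + i] & 0xff;
            if (x != y) return x - y;
        }
        return l1 - l2;
    }

    /** Index of the first '-' in [from, end), or end if there is none. */
    static int hyphenFrom(byte[] b, int from, int end) {
        for (int i = from; i < end; i++) {
            if (b[i] == '-') return i;
        }
        return end;
    }

    /** Orders keys by TAG-<TAG_NAME> first, then by <PARAM>. */
    static int compare(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) {
        int end1 = s1 + l1;
        int end2 = s2 + l2;
        // Skip the 4-byte "TAG-" prefix relative to each key's own offset.
        int h1 = hyphenFrom(b1, Math.min(s1 + 4, end1), end1);
        int h2 = hyphenFrom(b2, Math.min(s2 + 4, end2), end2);
        int cmp = compareBytes(b1, s1, h1 - s1, b2, s2, h2 - s2);
        if (cmp != 0) return cmp;
        // Compare what follows the hyphen; the length is zero if nothing follows.
        int r1 = Math.min(h1 + 1, end1);
        int r2 = Math.min(h2 + 1, end2);
        return compareBytes(b1, r1, end1 - r1, b2, r2, end2 - r2);
    }
}
```

In a real job, a comparator built along these lines would be registered with `job.setSortComparatorClass(...)` in the newer API, or `JobConf.setOutputKeyComparatorClass(...)` in the older one.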

I need help from Hadoop MapReduce experts.

I found the following two links on Stack Overflow that are related to my question.

Let me try my luck.
