Java: How to compute the hash code for a stream in the same way as List.hashCode()

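I just realized that the following algorithm for computing the hash code cannot be implemented for a stream by a simple reduction. The problem is that the initial seed for the hash code is 1, which is not an identity for the accumulator.

The algorithm of List.hashCode() starts with 1 and, for each element e, computes hashCode = 31*hashCode + (e==null ? 0 : e.hashCode()).

You might be tempted to think that the following is correct, but it isn't, although it will work if the stream processing is not split up: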

List<Object> list = Arrays.asList(1,null, new Object(),4,5,6);
int hashCode = list.stream().map(Objects::hashCode).reduce(1, (a, b) -> 31 * a + b);
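
To make the problem concrete, here is a minimal sketch that checks the two requirements a reduction function has to meet (an identity value and associativity) against the accumulator above. The class and method names are made up for illustration.

```java
import java.util.function.IntBinaryOperator;

class ReduceLawCheck {
    // The accumulator used in the reduce call above.
    static final IntBinaryOperator OP = (a, b) -> 31 * a + b;

    // An identity must satisfy OP(identity, t) == t for every t.
    static boolean isIdentity(int identity, int t) {
        return OP.applyAsInt(identity, t) == t;
    }

    // The function must also be associative: OP(OP(a, b), c) == OP(a, OP(b, c)).
    static boolean isAssociativeFor(int a, int b, int c) {
        return OP.applyAsInt(OP.applyAsInt(a, b), c)
            == OP.applyAsInt(a, OP.applyAsInt(b, c));
    }

    public static void main(String[] args) {
        System.out.println(isIdentity(1, 5));          // false: 31 * 1 + 5 is 36, not 5
        System.out.println(isAssociativeFor(1, 2, 3)); // false: 1026 vs. 96
    }
}
```

Both checks fail, so the result of the reduction is only predictable as long as the stream is processed as a single sequential chunk.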

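It seems that the only sensible way of doing it is to get the Iterator of the Stream and do normal sequential processing, or to collect it to a List first.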
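
The iterator variant could look roughly like the sketch below. The class and method names are hypothetical; the loop simply repeats the List.hashCode() recurrence over Stream.iterator(), which visits the elements one by one in encounter order.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Objects;
import java.util.stream.Stream;

class SequentialStreamHash {
    // Consumes the stream sequentially through its iterator and applies
    // the same recurrence as List.hashCode(): seed 1, then 31 * hash + elementHash.
    static int hashCodeOf(Stream<?> stream) {
        int hash = 1;
        Iterator<?> it = stream.iterator();
        while (it.hasNext()) {
            hash = 31 * hash + Objects.hashCode(it.next()); // null counts as 0
        }
        return hash;
    }

    public static void main(String[] args) {
        List<Object> list = Arrays.asList(1, null, "x", 4, 5, 6);
        System.out.println(hashCodeOf(list.stream()) == list.hashCode());
    }
}
```

The price is that the traversal itself is no longer parallel, even if the stream was created as a parallel one.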
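As a first approach, I would use the collect-to-a-list solution as long as you don't have performance concerns. That way you avoid reinventing the wheel, you benefit if the hash algorithm changes one day, and you are also safe if the stream is parallel (even if I'm not sure that's a real concern).
The way I would implement it can vary depending on how and when you need to compare your different data structures (let's call it Foo).
If you do it manually, a simple static function may be enough: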
public static int computeHash(Foo origin, Collection<Function<Foo, ?>> selectors) {
    return selectors.stream()
            .map(f -> f.apply(origin))
            .collect(Collectors.toList())
            .hashCode();
}
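
As a usage sketch, assuming a hypothetical Foo with two properties (the Foo class, its getters and the selectors() helper are invented here for illustration), the function can be called like this. The result is the same hash a List of the selected values would have.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;

class ComputeHashUsage {
    // Hypothetical data structure; the original answer does not define Foo.
    static final class Foo {
        private final String name;
        private final int size;
        Foo(String name, int size) { this.name = name; this.size = size; }
        String getName() { return name; }
        int getSize() { return size; }
    }

    // Same function as above: hash the selected properties like a List would.
    static int computeHash(Foo origin, Collection<Function<Foo, ?>> selectors) {
        return selectors.stream()
                .map(f -> f.apply(origin))
                .collect(Collectors.toList())
                .hashCode();
    }

    // The properties that take part in the hash, in a fixed order.
    static List<Function<Foo, ?>> selectors() {
        List<Function<Foo, ?>> selectors = new ArrayList<>();
        selectors.add(Foo::getName);
        selectors.add(Foo::getSize);
        return selectors;
    }

    public static void main(String[] args) {
        Foo foo = new Foo("a", 3);
        int expected = Arrays.<Object>asList("a", 3).hashCode();
        System.out.println(computeHash(foo, selectors()) == expected);
    }
}
```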

But if instances of Foo are themselves stored in a Collection and you need both hashCode() and equals() (from Object) to be implemented, I would wrap it inside a FooEqualable:
public final class FooEqualable {
    private final Foo origin;
    private final Collection<Function<Foo, ?>> selectors;

    public FooEqualable(Foo origin, Collection<Function<Foo, ?>> selectors) {
        this.origin = origin;
        this.selectors = selectors;
    }

    @Override
    public int hashCode() {
        return selectors.stream()
                .map(f -> f.apply(origin))
                .collect(Collectors.toList())
                .hashCode();
    }

    @Override
    public boolean equals(Object obj) {
        if (obj instanceof FooEqualable) {
            FooEqualable that = (FooEqualable) obj;

            Object[] a1 = selectors.stream().map(f -> f.apply(this.origin)).toArray();
            Object[] a2 = selectors.stream().map(f -> f.apply(that.origin)).toArray();

            return Arrays.equals(a1, a2);
        }
        return false;
    }
}

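I'm fully aware that this solution isn't optimized (performance-wise) if hashCode() and equals() are called many times, but I tend not to optimize unless it becomes a concern.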
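The right solution has already been written. If you want a simple way of doing it, there are two additional possibilities:
1. Collect to a List and call hashCode()
As a reminder, the algorithm used by List.hashCode():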
int hashCode = 1;
for (E e : list)
  hashCode = 31*hashCode + (e==null ? 0 : e.hashCode());

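While, at first glance, the hash code algorithm seems to be non-parallelizable due to its non-associativity, it is possible if we transform the function: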
((a * 31 + b) * 31 + c ) * 31 + d

to
a * 31 * 31 * 31 + b * 31 * 31 + c * 31 + d

which basically is
a * 31³ + b * 31² + c * 31¹ + d * 31⁰

or, for an arbitrary List of size n:
1 * 31ⁿ + e₀ * 31ⁿ⁻¹ + e₁ * 31ⁿ⁻² + e₂ * 31ⁿ⁻³ +  …  + eₙ₋₃ * 31² + eₙ₋₂ * 31¹ + eₙ₋₁ * 31⁰

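with the first 1 being the initial value of the original algorithm and eₓ being the hash code of the list element at index x. While the summands are now independent of the evaluation order, there is obviously a dependency on the element's position. We can solve this by streaming over the indices in the first place, which works for random access lists and arrays, or solve it generally with a collector that tracks the number of encountered objects. The collector can resort to repeated multiplication for the accumulation and has to resort to the power function only for combining results: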
static <T> Collector<T,?,Integer> hashing() {
    return Collector.of(() -> new int[2],
        (a,o)    -> { a[0]=a[0]*31+Objects.hashCode(o); a[1]++; },
        (a1, a2) -> { a1[0]=a1[0]*iPow(31,a2[1])+a2[0]; a1[1]+=a2[1]; return a1; },
        a -> iPow(31,a[1])+a[0]);
}
// derived from http://stackoverflow.com/questions/101439
private static int iPow(int base, int exp) {
    int result = 1;
    for(; exp>0; exp >>= 1, base *= base)
        if((exp & 1)!=0) result *= base;
    return result;
}
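
Why the combiner multiplies by a power of 31 can be sketched as follows. The notation s(A), |A| and A ∥ B (concatenation of two adjacent chunks) is introduced here only for illustration; all arithmetic is int arithmetic, i.e. modulo 2³².

```latex
\begin{align*}
s(A) &= \sum_{i=0}^{n-1} e_i \cdot 31^{\,n-1-i}
      && \text{partial sum kept in } a[0] \text{, with } n = |A| \text{ kept in } a[1] \\
s(A \parallel B) &= s(A) \cdot 31^{\,|B|} + s(B)
      && \text{combiner: every element of } A \text{ moves } |B| \text{ positions to the left} \\
h(A) &= 31^{\,|A|} + s(A)
      && \text{finisher: adds the term for the initial value } 1
\end{align*}
```

The second line is exactly what `a1[0]*iPow(31,a2[1])+a2[0]` computes, and the third line is the finisher `iPow(31,a[1])+a[0]`.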

List<Object> list = Arrays.asList(1,null, new Object(),4,5,6);
int expected = list.hashCode();

int hashCode = list.stream().collect(hashing());
if(hashCode != expected)
    throw new AssertionError();

// works in parallel
hashCode = list.parallelStream().collect(hashing());
if(hashCode != expected)
    throw new AssertionError();

// a method avoiding auto-boxing is more complicated:
int[] result=list.parallelStream().mapToInt(Objects::hashCode)
    .collect(() -> new int[2],
    (a,o)    -> { a[0]=a[0]*31+o; a[1]++; }, // o is already the element's hash code
    (a1, a2) -> { a1[0]=a1[0]*iPow(31,a2[1])+a2[0]; a1[1]+=a2[1]; });
hashCode = iPow(31,result[1])+result[0];

if(hashCode != expected)
    throw new AssertionError();

// random access lists allow a better solution:
hashCode = IntStream.range(0, list.size()).parallel()
    .map(ix -> Objects.hashCode(list.get(ix))*iPow(31, list.size()-ix-1))
    .sum() + iPow(31, list.size());

if(hashCode != expected)
    throw new AssertionError();

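The easiest and shortest way I found was to implement a Collector using Collectors.reducing (like the reduce call in the question, it uses 1 as the identity, so it is only safe for sequential streams):
/**
 * Creates a new Collector that collects the hash code of the elements.
 * @param <T> the type of the input elements
 * @return the hash code
 * @see Arrays#hashCode(java.lang.Object[])
 * @see AbstractList#hashCode()
 */
public static <T> Collector<T, ?, Integer> toHashCode() {
    return Collectors.reducing(1, Objects::hashCode, (i, j) -> 31 * i + j);
}

@Test
public void testHashCode() {
    List<Object> list = Arrays.asList(Math.PI, 42, "stackoverflow.com");
    int …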
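
How this collector behaves when the input is processed in chunks can be checked without relying on the scheduler of a parallel stream. The sketch below drives the collector's supplier, accumulator, combiner and finisher by hand for two chunks; collectInTwoChunks and the class name are hypothetical helpers, not part of any answer.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;
import java.util.stream.Collector;
import java.util.stream.Collectors;

class SplitSimulation {
    // The reducing-based collector from above.
    static <T> Collector<T, ?, Integer> toHashCode() {
        return Collectors.reducing(1, Objects::hashCode, (i, j) -> 31 * i + j);
    }

    // Evaluates a collector the way a split stream may do it: each chunk is
    // accumulated into its own container, then the containers are combined.
    static <T, A, R> R collectInTwoChunks(Collector<T, A, R> c,
                                          List<? extends T> left,
                                          List<? extends T> right) {
        A a = c.supplier().get();
        left.forEach(t -> c.accumulator().accept(a, t));
        A b = c.supplier().get();
        right.forEach(t -> c.accumulator().accept(b, t));
        return c.finisher().apply(c.combiner().apply(a, b));
    }

    public static void main(String[] args) {
        List<Integer> all = Arrays.asList(1, 2, 3, 4);
        Collector<Object, ?, Integer> collector = toHashCode();

        int sequential = all.stream().collect(collector);
        int chunked = collectInTwoChunks(collector, all.subList(0, 2), all.subList(2, 4));

        System.out.println(sequential == all.hashCode()); // single chunk matches List.hashCode()
        System.out.println(chunked == all.hashCode());    // two chunks: seed and weights are off
    }
}
```

With a single chunk the result matches List.hashCode(); with two chunks the seed 1 is applied once per chunk and the left chunk is multiplied by 31 only once, so the values differ.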