Java: Error while saving and reloading a Guava Bloom filter - need help finding any bugs in the code
I have recently been testing Google's implementation of the classic Bloom filter before using it in production. I am using version 18 of the Guava library. When I run the program below, the "varying" count printed to sysout is more than 200. I can't see what could be going wrong here; could someone provide a second pair of eyes?
import com.google.common.collect.Lists;
import com.google.common.hash.BloomFilter;
import com.google.common.hash.Funnels;
import com.google.common.hash.Hashing;
import org.apache.commons.lang3.RandomStringUtils;

import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.charset.Charset;
import java.util.*;

/**
 * http://code.google.com/p/guava-libraries/wiki/HashingExplained
 * stackoverflow.com/questions/12319560/how-should-i-use-guavas-hashingconsistenthash
 */
public class GuavaHashing {
    private static final int N = 2500;

    public static void main(String[] args) throws IOException {
        List<String> ids = generateStoryIds(N);
        Set<String> testIds = generateTest(ids);
        bloomfiltertime(ids, testIds);
    }

    private static List<String> generateStoryIds(int size) {
        List<String> stories = new ArrayList<>();
        for (int i = 0; i < size; ++i) {
            stories.add(RandomStringUtils.randomAlphanumeric(16));
        }
        return stories;
    }

    private static Set<String> generateTest(List<String> presList) {
        Set<String> test = new HashSet<>();
        Random rand = new Random(System.currentTimeMillis());
        for (int i = 0; i < 200; ++i) {
            test.add(presList.get(Math.abs(rand.nextInt() % N)));
        }
        for (int i = 0; i < 250; ++i) {
            test.add(RandomStringUtils.randomAlphanumeric(16));
        }
        return test;
    }

    public static void bloomfiltertime(List<String> storyIds, Set<String> testPresent) throws IOException {
        BloomFilter<String> stories = BloomFilter.create(Funnels.stringFunnel(Charset.defaultCharset()), N, 0.05);
        long startTime = System.currentTimeMillis();
        for (String story : storyIds) {
            stories.put(story);
        }
        long endTime = System.currentTimeMillis();
        System.out.println("bloom put time " + (endTime - startTime));

        FileOutputStream fos = new FileOutputStream("testfile.dat");
        stories.writeTo(fos);
        fos.close();

        FileInputStream fis = new FileInputStream("testfile.dat");
        BloomFilter<String> readStories = BloomFilter.create(Funnels.stringFunnel(Charset.defaultCharset()), N, 0.05);
        startTime = System.currentTimeMillis();
        readStories.readFrom(fis, Funnels.stringFunnel(Charset.defaultCharset()));
        endTime = System.currentTimeMillis();
        System.out.println("bloom read file time " + (endTime - startTime));

        startTime = System.currentTimeMillis();
        int count = 0;
        for (String story : testPresent) {
            if (stories.mightContain(story) != readStories.mightContain(story)) {
                ++count;
            }
        }
        endTime = System.currentTimeMillis();
        System.out.println("bloom check time " + (endTime - startTime));
        System.out.println("varying : " + count);
    }
}
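The method BloomFilter.readFrom is a static method that returns a new BloomFilter object. You are ignoring this return value (and apparently assuming that the method "fills" the object it is called on).
So change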
BloomFilter<String> readStories =
BloomFilter.create(Funnels.stringFunnel(Charset.defaultCharset()), N, 0.05);
readStories.readFrom(fis, Funnels.stringFunnel(Charset.defaultCharset()));
to
BloomFilter<CharSequence> readStories =
    BloomFilter.readFrom(fis, Funnels.stringFunnel(Charset.defaultCharset()));
and it should work.
(By the way: modern IDEs can warn when a static method is called on an instance. In Eclipse, for example: Window -> Preferences -> Java -> Compiler -> Errors/Warnings -> Code Style -> set "Non-static access to static member" to "Warning".)
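The pitfall described in this answer (calling a static factory through an instance reference and dropping its return value) can be reproduced without Guava. The following is only a minimal sketch: `Box`, its `readFrom` method, and the two `load...` helpers are hypothetical stand-ins invented for illustration, not part of Guava's API. It reads a single byte rather than a real serialized filter.

```java
import java.io.ByteArrayInputStream;

public class StaticFactoryPitfall {

    /** Hypothetical stand-in for a class with a static deserializing factory. */
    static final class Box {
        final int value;

        Box(int value) {
            this.value = value;
        }

        /** Static factory: builds a NEW Box from the stream; it cannot modify an existing one. */
        static Box readFrom(ByteArrayInputStream in) {
            return new Box(in.read());
        }
    }

    /** Mirrors the bug: the static call compiles through an instance, but its result is dropped. */
    @SuppressWarnings("static-access")
    static int loadIgnoringResult(byte[] data) {
        Box box = new Box(0);
        box.readFrom(new ByteArrayInputStream(data)); // return value discarded
        return box.value;                              // box is unchanged
    }

    /** Mirrors the fix: call the factory on the class and keep what it returns. */
    static int loadKeepingResult(byte[] data) {
        Box box = Box.readFrom(new ByteArrayInputStream(data));
        return box.value;
    }

    public static void main(String[] args) {
        byte[] data = {42};
        System.out.println("ignoring result: " + loadIgnoringResult(data));
        System.out.println("keeping result:  " + loadKeepingResult(data));
    }
}
```

In the question's program the same thing happens with `readStories`: it stays the empty filter returned by `BloomFilter.create`, so it disagrees with `stories` on roughly every id that really was inserted.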