Java: Filter a collection and retrieve data by multiple fields

I have a class:

public class Address {
    private String country;
    private String state;
    private String city;
}
There is also a list of Person objects. The Person class looks like:

public class Person {
    private String country;
    private String state;
    private String city;
    //other fields
}
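I need to filter the Person objects and get the most suitable one. The Address object can have at least one non-null field. A Person object may have none, some, or all of the mentioned fields initialized.

Here is a possible input example: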

Three Person objects:
a. PersonA: country = 'A'
b. PersonB: country = 'A', state = 'B'
c. PersonC: country = 'A', state = 'B', city = 'C'

Address object:
a. Address: country = 'A', state = 'B'
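The expected result after filtering is PersonB. If there were only the PersonA and PersonC objects, then PersonA would be preferable.

I would like to show how I did this, but it is in fact a pure brute-force algorithm and I don't like it. Its complexity grows with the number of fields. I also considered using Guava's filter by predicate, but I have no idea what the predicate should be.

Which algorithm is preferable for this kind of filtering, if there is one besides brute force?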
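The asker's own code is not shown. As a baseline for the discussion, the brute-force check described in the question might look like the sketch below. It assumes that a Person is compatible with an Address when every field the Person has set equals the corresponding Address field, and that the compatible Person with the most fields set wins. The `Place` class and the `compatible` and `bestMatch` names are hypothetical stand-ins, not part of the original post.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

public class BruteForceMatch {

    // Hypothetical stand-in for both Person and Address: three nullable fields.
    public static class Place {
        public final String name;
        public final String country;
        public final String state;
        public final String city;

        public Place(String name, String country, String state, String city) {
            this.name = name;
            this.country = country;
            this.state = state;
            this.city = city;
        }

        // Number of initialized fields, used to rank candidates by specificity.
        public int filled() {
            int n = 0;
            if (country != null) n++;
            if (state != null) n++;
            if (city != null) n++;
            return n;
        }
    }

    // True if the person field is unset or equal to the address field.
    static boolean fieldOk(String personField, String addressField) {
        return personField == null || personField.equals(addressField);
    }

    // Assumed matching rule: every field the person sets must agree with the address.
    public static boolean compatible(Place person, Place address) {
        return fieldOk(person.country, address.country)
                && fieldOk(person.state, address.state)
                && fieldOk(person.city, address.city);
    }

    // Checks every person (brute force) and keeps the most specific compatible one.
    public static Optional<Place> bestMatch(List<Place> persons, Place address) {
        return persons.stream()
                .filter(p -> compatible(p, address))
                .max(Comparator.comparingInt(Place::filled));
    }

    public static void main(String[] args) {
        Place a = new Place("PersonA", "A", null, null);
        Place b = new Place("PersonB", "A", "B", null);
        Place c = new Place("PersonC", "A", "B", "C");
        Place address = new Place("Address", "A", "B", null);

        System.out.println(bestMatch(Arrays.asList(a, b, c), address).get().name); // PersonB
        System.out.println(bestMatch(Arrays.asList(a, c), address).get().name);    // PersonA
    }
}
```

The `compatible` method is also one possible shape for the predicate the question asks about; it still touches every field of every Person.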

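As far as I understand, by brute force you mean checking all fields of all entities. Well, that cannot be avoided unless you refactor your classes, but there is a simple trick that helps. It uses the state pattern.

You can add a flag notNulls to both classes: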

public class Address {
    private int notNulls = 0;
    private String country;
    private String state;
    private String city;
}

public class Person {
    private int notNulls = 0;
    private String country;
    private String state;
    private String city;
    //other fields
}
I'll show you a possible implementation of one setter, since the others are similar:

public void setCountry(String s) {
    if (country == null) {
        if (s != null) {
            // null -> non-null: one more field is set
            country = s;
            notNulls++;
        }
    } else {
        if (s == null) {
            // non-null -> null: one field fewer is set
            country = null;
            notNulls--;
        } else {
            country = s;
        }
    }
}

public boolean isValid() {
    return notNulls != 0;
}

Now you can simply loop through the objects.
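Since the other setters follow the same pattern, the counter bookkeeping can be shared. The sketch below is one possible way to do that, followed by a loop that uses isValid() to skip objects with no fields set. The `track`, `getNotNulls` and `onlyValid` names are assumptions made for this example, not part of the answer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class NotNullsSketch {

    public static class Address {
        private int notNulls = 0;
        private String country;
        private String state;
        private String city;

        // Shared bookkeeping: the counter changes only on null <-> non-null transitions.
        private String track(String current, String next) {
            if (current == null && next != null) {
                notNulls++;
            } else if (current != null && next == null) {
                notNulls--;
            }
            return next;
        }

        public void setCountry(String s) { country = track(country, s); }
        public void setState(String s) { state = track(state, s); }
        public void setCity(String s) { city = track(city, s); }

        public int getNotNulls() { return notNulls; }
        public boolean isValid() { return notNulls != 0; }
    }

    // Loops over the objects and keeps those with at least one field set,
    // without inspecting the individual fields.
    public static List<Address> onlyValid(List<Address> all) {
        List<Address> result = new ArrayList<>();
        for (Address a : all) {
            if (a.isValid()) {
                result.add(a);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Address empty = new Address();
        Address partial = new Address();
        partial.setCountry("A");
        partial.setState("B");
        partial.setState(null); // back to one non-null field

        System.out.println(partial.getNotNulls());                          // 1
        System.out.println(onlyValid(Arrays.asList(empty, partial)).size()); // 1
    }
}
```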

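To avoid brute force you need to index your persons by address. For a good search you definitely need a country (guess it or default it somehow, otherwise the results will be too imprecise anyway).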

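The index would be a number in which the first 3 digits stand for the country, the next 3 digits for the state, and the last 4 digits for the city. With this layout an int can hold 213 countries, with up to 999 states per country and 9999 cities per state.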
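The digit layout can be illustrated with a small encode/decode sketch. The class and method names here are hypothetical; the factors 10000000 and 10000 follow from the 3-3-4 digit split described above.

```java
public class AddressCode {
    static final int COUNTRY_FACTOR = 10_000_000; // country occupies the leading digits
    static final int STATE_FACTOR = 10_000;       // state occupies the middle 3 digits

    // Packs the three ids into one int; 0 means "not set" for state and city.
    public static int encode(int countryId, int stateId, int cityId) {
        if (countryId < 1 || countryId > 213
                || stateId < 0 || stateId > 999
                || cityId < 0 || cityId > 9999) {
            throw new IllegalArgumentException("id out of range");
        }
        return countryId * COUNTRY_FACTOR + stateId * STATE_FACTOR + cityId;
    }

    public static int countryOf(int code) { return code / COUNTRY_FACTOR; }
    public static int stateOf(int code) { return code / STATE_FACTOR % 1000; }
    public static int cityOf(int code) { return code % STATE_FACTOR; }

    public static void main(String[] args) {
        int code = encode(1, 1, 0);                // country 1, state 1, no city
        System.out.println(code);                  // 10010000
        System.out.println(stateOf(code));         // 1
        System.out.println(encode(213, 999, 9999)); // 2139999999, still below Integer.MAX_VALUE
    }
}
```

A country id of 214 combined with the largest state and city ids would exceed Integer.MAX_VALUE, which is why the limit is 213.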

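This lets us use hashCode and a TreeSet to index the Person instances and look them up by partial address in O(log(n)) without touching their fields. The fields are touched when the TreeSet is built, and you will need some extra logic around modifying a Person to keep the index intact.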
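To see why floor is the relevant TreeSet operation, here is a minimal sketch that works on bare integer codes instead of Person objects. It assumes country A, state B and city C each received id 1; the `nearestAtOrBelow` name is made up for this example.

```java
import java.util.Arrays;
import java.util.TreeSet;

public class FloorLookupDemo {

    // TreeSet.floor returns the greatest element <= query, or null if there is none.
    public static Integer nearestAtOrBelow(TreeSet<Integer> codes, int query) {
        return codes.floor(query);
    }

    public static void main(String[] args) {
        // Codes for A (country only), A/B (country + state), A/B/C (all three fields).
        TreeSet<Integer> codes = new TreeSet<>(Arrays.asList(10_000_000, 10_010_000, 10_010_001));
        int addressAB = 10_010_000;

        System.out.println(nearestAtOrBelow(codes, addressAB)); // 10010000, the exact match

        codes.remove(10_010_000);
        System.out.println(nearestAtOrBelow(codes, addressAB)); // 10000000, falls back to country only

        System.out.println(nearestAtOrBelow(codes, 9_999_999)); // null, nothing at or below
    }
}
```

Note that floor only finds the nearest lower code. It does not check that the result belongs to the same country or state as the query, so that has to be verified separately.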

The index is calculated part by part, starting from the country:

    import java.util.HashMap;
    import java.util.Map;

    public class PartialAddressSearch {

        private final static Map<String, AddressPartHolder> COUNTRY_MAP = new HashMap<>(200);

        private static class AddressPartHolder {
            int id;
            Map<String, AddressPartHolder> subPartMap;

            public AddressPartHolder(int id, Map<String, AddressPartHolder> subPartMap) {
                this.id = id;
                this.subPartMap = subPartMap;
            }
        }

        public static int getCountryStateCityHashCode(String country, String state, String city) {
            if (country != null && country.length() != 0) {
                int result = 0;
                AddressPartHolder countryHolder = COUNTRY_MAP.get(country);
                if (countryHolder == null) {
                    countryHolder = new AddressPartHolder(COUNTRY_MAP.size() + 1, new HashMap<>());
                    COUNTRY_MAP.put(country, countryHolder);
                }
                result += countryHolder.id * 10000000;

                if (state != null) {
                    AddressPartHolder stateHolder = countryHolder.subPartMap.get(state);
                    if (stateHolder == null) {
                        stateHolder = new AddressPartHolder(countryHolder.subPartMap.size() + 1, new HashMap<>());
                        countryHolder.subPartMap.put(state, stateHolder);
                    }
                    result += stateHolder.id * 10000;

                    if (city != null && city.length() != 0) {
                        AddressPartHolder cityHolder = stateHolder.subPartMap.get(city);
                        if (cityHolder == null) {
                            cityHolder = new AddressPartHolder(stateHolder.subPartMap.size() + 1, null);
                            stateHolder.subPartMap.put(city, cityHolder);
                        }
                        result += cityHolder.id;
                    }
                }

                return result;
            } else {
                throw new IllegalArgumentException("Non-empty country is expected");
            }
        }
    }
After that, your filtering code shrinks to filling the index and computing the floor for a partial address:

    TreeSet<Person> personSetByAddress = new TreeSet<>();
    Person personA = new Person();
    personA.setCountry("A");
    personSetByAddress.add(personA);
    Person personB = new Person();
    personB.setCountry("A");
    personB.setState("B");
    personSetByAddress.add(personB);
    Person personC = new Person();
    personC.setCountry("A");
    personC.setState("B");
    personC.setCity("C");
    personSetByAddress.add(personC);

    Address addressAB = new Address();
    addressAB.setCountry("A");
    addressAB.setState("B");

    System.out.println(personSetByAddress.floor(new AddressPersonAdapter(addressAB)));

    Yields:
    Person{hashCode=10010000, country='A', state='B', city='null'}
A test with a different state also yields null:

    TreeSet<Person> personSetByAddress = new StrictCountryTreeSet();
    Person personD = new Person();
    personD.setCountry("D");
    personSetByAddress.add(personD);

    Person personE = new Person();
    personE.setCountry("A");
    personE.setState("E");
    personSetByAddress.add(personE);

    Person personC = new Person();
    personC.setCountry("A");
    personC.setState("B");
    personC.setCity("C");
    personSetByAddress.add(personC);

    Address addressA = new Address();
    addressA.setCountry("A");

    Address addressAB = new Address();
    addressAB.setCountry("A");
    addressAB.setState("B");

    Address addressABC = new Address();
    addressABC.setCountry("A");
    addressABC.setState("B");
    addressABC.setCity("C");

    System.out.println(personSetByAddress.floor(new AddressPersonAdapter(addressAB)));

    Yields:
    null

Note that in this case you will need to store the hashCode result in the Address and Person classes to avoid recalculating it.
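A minimal sketch of what storing the code could look like, assuming the code has already been computed elsewhere and is passed in. The `Person` shape, the `code` field and the `reindex` helper are all hypothetical. The helper also illustrates the earlier point about keeping the index intact: an element is removed before its code changes and added back afterwards.

```java
import java.util.TreeSet;

public class CachedCodeSketch {

    // Hypothetical Person that keeps its address code instead of recomputing it.
    public static class Person implements Comparable<Person> {
        private final String name;
        private int code; // cached index value

        public Person(String name, int code) {
            this.name = name;
            this.code = code;
        }

        public String getName() { return name; }
        public int getCode() { return code; }

        // Ordering uses only the cached code, so lookups never touch address fields.
        // Caveat: two persons with the same code count as duplicates in a TreeSet.
        @Override
        public int compareTo(Person other) {
            return Integer.compare(code, other.code);
        }
    }

    // Changing the code of an element while it sits in the TreeSet would corrupt
    // the ordering, so remove it first, update it, then add it back.
    public static void reindex(TreeSet<Person> index, Person p, int newCode) {
        index.remove(p);
        p.code = newCode;
        index.add(p);
    }

    public static void main(String[] args) {
        TreeSet<Person> index = new TreeSet<>();
        Person a = new Person("A", 10_000_000);
        Person b = new Person("B", 10_010_000);
        index.add(a);
        index.add(b);

        Person probe = new Person("probe", 10_010_000);
        System.out.println(index.floor(probe).getName()); // B

        reindex(index, b, 20_000_000); // B moves to another country
        System.out.println(index.floor(probe).getName()); // A
    }
}
```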

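Hope my answer helps!

Is it correct that the country field is mandatory?

@bashnesnos No, all fields can be null or partially initialized. If the country and state of the Address are not null and the list of persons contains a Person with only the country field initialized, the filter should return that Person object.

Since there is a strict dependency between the fields, the idea proposed by Xentros fits quite well. You still need to check each field for equality, so you will check all fields of the entities that have the same number of fields, and fall back to the next level if none is suitable (say you are filtering among A-null-null, A-D-null, A-E-null and A-B-null). So it is not far from brute force, really.

@bashnesnos Yes. I wonder whether there is an algorithm that avoids brute force in this case. In more specific cases the Xentros suggestion can be applied.

Why use the notNulls variable here? It is completely redundant. You can simply check in the isValid() method whether all fields are filled in.

It will not speed anything up at all. The question was how to avoid iterating over all the fields every time.

@Xentros I see. Nice solution. I don't think it will work correctly when any field can be null and the most applicable result has to be chosen. But if there is a strict dependency between the fields as here (a city cannot exist without a state, a state cannot exist without a country), your solution is a good option.

If all fields can be null, there is no need to filter like this :)

@Xentros No, I mean the Address has the country and state fields, while the Person object has only the country field. In that case we can return a more general Person object, so a Person with only the country field initialized is a suitable result.

But I think one can only speculate here until he checks all the fields and decides which result should count as the general one.

Wow. Looks incredible. I have to experiment with it.

@Dragon, any progress with the experiments? Does it work for you?

Looks good, but I tried to apply the validation logic to the case where the person's country and the address's country differ. In test case #2, when the person has country "D", the filter should not return anything. Anyway, I have almost decided to accept your answer; more experiments are needed.

@Dragon Yes, that is a good case. I have edited my answer to include it as well.

@Dragon, I have just added the case of a differing state as well.
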
If the set contains only PersonA and PersonC, the same lookup falls back to PersonA:

    TreeSet<Person> personSetByAddress = new TreeSet<>();
    Person personA = new Person();
    personA.setCountry("A");
    personSetByAddress.add(personA);
    Person personC = new Person();
    personC.setCountry("A");
    personC.setState("B");
    personC.setCity("C");
    personSetByAddress.add(personC);

    Address addressAB = new Address();
    addressAB.setCountry("A");
    addressAB.setState("B");

    System.out.println(personSetByAddress.floor(new AddressPersonAdapter(addressAB)));

    Yields:
    Person{hashCode=10000000, country='A', state='null', city='null'}

With a plain TreeSet, a person from a different country can be returned:

    TreeSet<Person> personSetByAddress = new TreeSet<>();
    Person personA = new Person();
    personA.setCountry("D");
    personSetByAddress.add(personA);
    Person personC = new Person();
    personC.setCountry("A");
    personC.setState("B");
    personC.setCity("C");
    personSetByAddress.add(personC);

    Address addressAB = new Address();
    addressAB.setCountry("A");
    addressAB.setState("B");

    System.out.println(personSetByAddress.floor(new AddressPersonAdapter(addressAB)));

    Yields:
    Person{hashCode=10000000, country='D', state='null', city='null'}
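
StrictCountryTreeSet overrides floor so that a candidate from a different country or state is rejected:

//we need this class to allow flooring just by id
public class IntegerPersonAdapter extends Person {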
    private Integer id;
    public IntegerPersonAdapter(Integer id) {
        this.id = id;
    }

    @Override
    public boolean equals(Object o) {
        return id.equals(o);
    }

    @Override
    public int hashCode() {
        return id.hashCode();
    }

    @Override
    public int compareTo(Object o) {
        return id.hashCode() - o.hashCode();
    }

    @Override
    public String toString() {
        return id.toString();
    }
}

public class StrictCountryTreeSet extends TreeSet<Person> {

    @Override
    public Person floor(Person e) {
        Person candidate = super.floor(e);
        if (candidate != null) {
            //we check if the country is the same
            int candidateCode = candidate.hashCode();
            int eCode = e.hashCode();
            if (candidateCode == eCode) {
                return candidate;
            } else {
                int countryCandidate = candidateCode / 10000000;
                if (countryCandidate == (eCode / 10000000)) {
                    //we check if the state is the same
                    int stateCandidate = candidateCode / 10000;
                    if (stateCandidate == (eCode / 10000)) {
                        //we check if is a state
                        if (candidateCode % 10000 == 0) { //no city part, so the candidate is a state
                            return candidate;
                        } else { //since it's not exact match we haven't found a city - we need to get someone just from state
                            return this.floor(new IntegerPersonAdapter(stateCandidate * 10000));
                        }

                    } else if (stateCandidate % 1000 == 0) { //we check if it's a country already (no state part)
                        return candidate;
                    } else {
                        return this.floor(new IntegerPersonAdapter(countryCandidate * 10000000));
                    }
                }
            }
        }
        return null;
    }
}

With StrictCountryTreeSet, the different-country case yields null:

    TreeSet<Person> personSetByAddress = new StrictCountryTreeSet();
    Person personA = new Person();
    personA.setCountry("D");
    personSetByAddress.add(personA);
    Person personC = new Person();
    personC.setCountry("A");
    personC.setState("B");
    personC.setCity("C");
    personSetByAddress.add(personC);

    Address addressAB = new Address();
    addressAB.setCountry("A");
    addressAB.setState("B");

    System.out.println(personSetByAddress.floor(new AddressPersonAdapter(addressAB)));

    Yields:
    null
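
The fallback order (exact match, then state level, then country level) can also be expressed directly on the integer codes. The sketch below is a simplified alternative and not the answer's floor-based method: it probes the three possible codes in a set. The class and method names are hypothetical, and it assumes the same digit layout (10000000 per country, 10000 per state).

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class HierarchyLookup {

    // Returns the most specific stored code among: the query itself, the query
    // with the city part cleared, and the query with state and city cleared.
    // Returns -1 when none of them is stored.
    public static int bestCode(Set<Integer> codes, int query) {
        int stateLevel = query / 10_000 * 10_000;           // clear the city digits
        int countryLevel = query / 10_000_000 * 10_000_000; // clear state and city digits
        for (int candidate : new int[] {query, stateLevel, countryLevel}) {
            if (codes.contains(candidate)) {
                return candidate;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Same scenario as the last test: D (country 1), A/E (2, 1), A/B/C (2, 2, 1).
        Set<Integer> codes = new HashSet<>(Arrays.asList(10_000_000, 20_010_000, 20_020_001));
        int addressAB = 20_020_000;

        System.out.println(bestCode(codes, addressAB)); // -1, no A/B and no A-only person

        codes.add(20_000_000); // add a person with only country A
        System.out.println(bestCode(codes, addressAB)); // 20000000
    }
}
```

With a HashSet each probe is a constant-time lookup on average, and a candidate from another country or state can never be returned because only prefixes of the query are tried.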