Java: Filter a collection and retrieve data by multiple fields

I have a class:

public class Address {
    private String country;
    private String state;
    private String city;
}

And there is a list of Person objects. The Person class looks like this:

public class Person {
    private String country;
    private String state;
    private String city;
    //other fields
}

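I need to filter the Person objects and get the most suitable one. The Address object has at least one non-null field. A Person object can have none, some, or all of the mentioned fields initialized.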

Here is one possible input example:

Three Person objects:
a. PersonA: country = 'A'
b. PersonB: country = 'A', state = 'B'
c. PersonC: country = 'A', state = 'B', city = 'C'

Address object:
a. Address: country = 'A', state = 'B'

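The expected result after filtering is PersonB. If only PersonA and PersonC were present, PersonA would be preferable.

I would show how I did this, but it is a pure brute-force algorithm and I don't like it. Its complexity grows with every added field. I also considered Guava's filter by predicate, but I have no idea what the predicate should be.

Which algorithm, if any exists besides brute force, is preferable for this kind of filtering?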

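As I understand it, by brute force you mean checking all fields of all entities. Without refactoring your classes that cannot be avoided, but there is a simple trick that helps. It uses the state pattern.

You can add a notNulls flag (an int counter of the non-null fields) to both classes: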

public class Address {
    private int notNulls = 0;
    private String country;
    private String state;
    private String city;
}

public class Person {
    private int notNulls = 0;
    private String country;
    private String state;
    private String city;
    //other fields
}

I'll show a possible implementation of one setter, since the others are similar:

public void setCountry(String s) {
    if (country == null) {
        if (s != null) {
            // null -> value: one more field is set
            country = s;
            notNulls++;
        }
    } else {
        if (s == null) {
            // value -> null: one field fewer is set
            country = null;
            notNulls--;
        } else {
            country = s;
        }
    }
}

public boolean isValid() {
    return notNulls != 0;
}

Now you can simply loop through the objects.
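The loop itself is not shown in the answer. A minimal sketch of one way it could look, assuming that a person fits when every field it sets equals the same field of the address, and that the fitting person with the most non-null fields wins. `Location`, `matches` and `mostSuitable` are hypothetical names used only for illustration; `Location` stands in for both Person and Address.

```java
import java.util.Arrays;
import java.util.List;

class NotNullsLoopSketch {

    // Hypothetical stand-in for both Person and Address: three nullable fields plus the counter.
    static final class Location {
        final String country;
        final String state;
        final String city;
        final int notNulls;

        Location(String country, String state, String city) {
            this.country = country;
            this.state = state;
            this.city = city;
            this.notNulls = (country != null ? 1 : 0) + (state != null ? 1 : 0) + (city != null ? 1 : 0);
        }

        boolean isValid() {
            return notNulls != 0;
        }
    }

    // Assumption: a person fits when every field it sets equals the same field of the address.
    static boolean matches(Location person, Location address) {
        return (person.country == null || person.country.equals(address.country))
            && (person.state == null || person.state.equals(address.state))
            && (person.city == null || person.city.equals(address.city));
    }

    // Linear scan that keeps the fitting person with the most non-null fields.
    static Location mostSuitable(List<Location> persons, Location address) {
        Location best = null;
        for (Location p : persons) {
            // Cheap pre-check on the counter: a person with more set fields than the address cannot fit.
            if (!p.isValid() || p.notNulls > address.notNulls) {
                continue;
            }
            if (matches(p, address) && (best == null || p.notNulls > best.notNulls)) {
                best = p;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        Location a = new Location("A", null, null);
        Location b = new Location("A", "B", null);
        Location c = new Location("A", "B", "C");
        Location address = new Location("A", "B", null);
        System.out.println(mostSuitable(Arrays.asList(a, b, c), address) == b); // true
        System.out.println(mostSuitable(Arrays.asList(a, c), address) == a);    // true
    }
}
```

The counter only saves the field comparisons for candidates that are obviously too specific. The scan is still linear in the number of persons.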

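To avoid brute force you need to index your persons by address. For a good search you definitely need a country (guess it or default it somehow, otherwise the results will be too imprecise anyway).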

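The index is a number: the first 3 digits are for the country, the next 3 digits for the state, and the last 4 digits for the city. With this layout an int can hold 213 countries, each with up to 999 states, each with up to 9999 cities.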
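The digit layout can be checked with a little arithmetic. The sketch below is only an illustration of that layout; the class and method names are hypothetical, and an id of 0 is assumed to mean "this part is not set".

```java
class AddressCodeLayoutSketch {
    static final int COUNTRY_FACTOR = 10_000_000; // country id sits above the 3 state digits and 4 city digits
    static final int STATE_FACTOR = 10_000;       // state id sits above the 4 city digits

    // Packs the three ids into one int; an id of 0 is assumed to mean "not set".
    static int encode(int countryId, int stateId, int cityId) {
        return countryId * COUNTRY_FACTOR + stateId * STATE_FACTOR + cityId;
    }

    static int countryOf(int code) {
        return code / COUNTRY_FACTOR;
    }

    static int stateOf(int code) {
        return (code % COUNTRY_FACTOR) / STATE_FACTOR;
    }

    static int cityOf(int code) {
        return code % STATE_FACTOR;
    }

    public static void main(String[] args) {
        System.out.println(encode(1, 0, 0));        // 10000000
        System.out.println(encode(1, 1, 0));        // 10010000
        System.out.println(encode(1, 1, 1));        // 10010001
        System.out.println(encode(213, 999, 9999)); // 2139999999, still below Integer.MAX_VALUE (2147483647)
    }
}
```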

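This lets us use hashCode and a TreeSet to index the Person instances and look them up by a partial address in O(log(n)) without touching their fields. The fields are touched when the TreeSet is built, and you need some extra logic around Person modification to keep the index intact.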
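To see why a floor lookup finds the closest less specific entry, here is a small sketch that works on bare integer codes instead of Person objects. The codes are made up for illustration (country A = 1, state B = 1, city C = 1), and `indexOf` is a hypothetical helper.

```java
import java.util.TreeSet;

class FloorLookupSketch {

    // Builds a sorted index from raw address codes.
    static TreeSet<Integer> indexOf(int... codes) {
        TreeSet<Integer> index = new TreeSet<>();
        for (int code : codes) {
            index.add(code);
        }
        return index;
    }

    public static void main(String[] args) {
        int a = 10_000_000;   // country A only
        int ab = 10_010_000;  // country A, state B
        int abc = 10_010_001; // country A, state B, city C

        // floor(x) returns the greatest element that is <= x, in O(log n).
        System.out.println(indexOf(a, ab, abc).floor(ab)); // 10010000
        System.out.println(indexOf(a, abc).floor(ab));     // 10000000
    }
}
```

A plain floor has no notion of country boundaries: if nothing in the same country sorts at or below the query, it returns the entry of a lower-numbered country, so the result needs a further check.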

The index is calculated part by part, starting from the country:

    import java.util.HashMap;
    import java.util.Map;

    public class PartialAddressSearch {

        private final static Map<String, AddressPartHolder> COUNTRY_MAP = new HashMap<>(200);

        private static class AddressPartHolder {
            int id;
            Map<String, AddressPartHolder> subPartMap;

            public AddressPartHolder(int id, Map<String, AddressPartHolder> subPartMap) {
                this.id = id;
                this.subPartMap = subPartMap;
            }
        }

        public static int getCountryStateCityHashCode(String country, String state, String city) {
            if (country != null && country.length() != 0) {
                int result = 0;
                AddressPartHolder countryHolder = COUNTRY_MAP.get(country);
                if (countryHolder == null) {
                    countryHolder = new AddressPartHolder(COUNTRY_MAP.size() + 1, new HashMap<>());
                    COUNTRY_MAP.put(country, countryHolder);
                }
                result += countryHolder.id * 10000000;

                if (state != null) {
                    AddressPartHolder stateHolder = countryHolder.subPartMap.get(state);
                    if (stateHolder == null) {
                        stateHolder = new AddressPartHolder(countryHolder.subPartMap.size() + 1, new HashMap<>());
                        countryHolder.subPartMap.put(state, stateHolder);
                    }
                    result += stateHolder.id * 10000;

                    if (city != null && city.length() != 0) {
                        AddressPartHolder cityHolder = stateHolder.subPartMap.get(city);
                        if (cityHolder == null) {
                            cityHolder = new AddressPartHolder(stateHolder.subPartMap.size() + 1, null);
                            stateHolder.subPartMap.put(city, cityHolder);
                        }
                        result += cityHolder.id;
                    }
                }

                return result;
            } else {
                throw new IllegalArgumentException("Non-empty country is expected");
            }
        }
    }

After that, your filtering code shrinks to filling the index and taking the floor for a partial address:

    TreeSet<Person> personSetByAddress = new TreeSet<>();
    Person personA = new Person();
    personA.setCountry("A");
    personSetByAddress.add(personA);
    Person personB = new Person();
    personB.setCountry("A");
    personB.setState("B");
    personSetByAddress.add(personB);
    Person personC = new Person();
    personC.setCountry("A");
    personC.setState("B");
    personC.setCity("C");
    personSetByAddress.add(personC);

    Address addressAB = new Address();
    addressAB.setCountry("A");
    addressAB.setState("B");

    System.out.println(personSetByAddress.floor(new AddressPersonAdapter(addressAB)));

    Yields:
    Person{hashCode=10010000, country='A', state='B', city='null'}

A test with a different state also yields null:

    TreeSet<Person> personSetByAddress = new StrictCountryTreeSet();
    Person personD = new Person();
    personD.setCountry("D");
    personSetByAddress.add(personD);

    Person personE = new Person();
    personE.setCountry("A");
    personE.setState("E");
    personSetByAddress.add(personE);

    Person personC = new Person();
    personC.setCountry("A");
    personC.setState("B");
    personC.setCity("C");
    personSetByAddress.add(personC);

    Address addressA = new Address();
    addressA.setCountry("A");

    Address addressAB = new Address();
    addressAB.setCountry("A");
    addressAB.setState("B");

    Address addressABC = new Address();
    addressABC.setCountry("A");
    addressABC.setState("B");
    addressABC.setCity("C");

    System.out.println(personSetByAddress.floor(new AddressPersonAdapter(addressAB)));

    Yields:
    null

Note that in this case you need to store the hashCode result in the Address and Person classes to avoid recomputing it.
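The answer does not show how Person stores the code. The following is a rough sketch of one possibility, not the author's actual class: the code is cached on first use and invalidated by the setters. `codeOf` is a hypothetical stand-in for getCountryStateCityHashCode that assumes single-letter parts 'A' to 'Z' mapped to ids 1 to 26, so its numbers differ from the outputs shown in the answer.

```java
import java.util.TreeSet;

class CachedCodeSketch {

    // Hypothetical stand-in for getCountryStateCityHashCode (assumes single-letter parts 'A'..'Z').
    static int codeOf(String country, String state, String city) {
        if (country == null || country.isEmpty()) {
            throw new IllegalArgumentException("Non-empty country is expected");
        }
        int code = id(country) * 10_000_000;
        if (state != null) {
            code += id(state) * 10_000;
            if (city != null) {
                code += id(city);
            }
        }
        return code;
    }

    private static int id(String part) {
        return part.charAt(0) - 'A' + 1;
    }

    static class Person implements Comparable<Person> {
        private String country;
        private String state;
        private String city;
        private int cachedCode;
        private boolean codeValid; // false until the code has been computed for the current fields

        void setCountry(String v) { country = v; codeValid = false; }
        void setState(String v) { state = v; codeValid = false; }
        void setCity(String v) { city = v; codeValid = false; }

        @Override
        public int hashCode() {
            if (!codeValid) { // compute once, reuse until a setter invalidates it
                cachedCode = codeOf(country, state, city);
                codeValid = true;
            }
            return cachedCode;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof Person && ((Person) o).hashCode() == hashCode();
        }

        @Override
        public int compareTo(Person other) {
            return Integer.compare(hashCode(), other.hashCode());
        }
    }

    public static void main(String[] args) {
        Person a = new Person();
        a.setCountry("A");
        Person b = new Person();
        b.setCountry("A");
        b.setState("B");
        TreeSet<Person> index = new TreeSet<>();
        index.add(a);
        index.add(b);

        Person probe = new Person();
        probe.setCountry("A");
        probe.setState("B");
        probe.setCity("C");
        System.out.println(index.floor(probe).hashCode()); // 10020000
    }
}
```

Two caveats with this sketch: a Person must be removed from the set before its address changes and re-added afterwards, and persons that share the same address collapse into one TreeSet entry, so a real index would probably map each code to a list of persons.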

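Hope my answer helps!

Is it correct that the country field is mandatory?

@bashnesnos No, all fields can be null or partially initialized. If country and state in the Address are not null and the Person list contains a person with only the country field initialized, the filter should return that Person object.

Since there is a strict dependency between the fields, the idea proposed by Xentros fits well. You still need to check each field for equality, so you check all fields of the entities with the same number of fields, and fall back to the next best if none fits (suppose A-null-null, A-D-null, A-E-null and A-B-null are being filtered). So it's not far from brute force, really.

@bashnesnos Yes. I wonder whether there is an algorithm that avoids brute force in this case. In more specific cases the Xentros suggestion can be applied.

Why use the notNulls variable here? It is completely redundant. You can simply check in the isValid() method whether all fields are filled. It doesn't speed anything up. The question is how to avoid going through all fields every time.

@Xentros I see. Nice solution. I don't think it will work correctly when any field can be null and the most applicable result should be chosen. But if there is a strict dependency between fields as there is here (no city without a state, no state without a country), your solution is a good option.

If all fields can be null, there is no need to filter this way :)

@Xentros No, I mean that the Address has country and state fields while the Person object has only the country field. In that case we can return the more general Person object, so a Person with only the country field initialized is a suitable result. But I think one can speculate here until it comes down to checking all fields and deciding which result should count as the general one.

Wow. Looks incredible. I have to experiment with it.

@Dragon, any progress with the experiment? Does it work for you?

Looks nice, but I tried to apply the validation logic to the case where the person's country and the address's country differ. In test case #2, when the person has country "D", the filter shouldn't return anything. Anyway, I have almost decided to accept your answer, it needs more experiments.

@Dragon yes, that's a good case. I've edited my answer to cover it too.

@Dragon, I've also just added a case for a different state.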

    TreeSet<Person> personSetByAddress = new TreeSet<>();
    Person personA = new Person();
    personA.setCountry("A");
    personSetByAddress.add(personA);
    Person personC = new Person();
    personC.setCountry("A");
    personC.setState("B");
    personC.setCity("C");
    personSetByAddress.add(personC);

    Address addressAB = new Address();
    addressAB.setCountry("A");
    addressAB.setState("B");

    System.out.println(personSetByAddress.floor(new AddressPersonAdapter(addressAB)));

    Yields:
    Person{hashCode=10000000, country='A', state='null', city='null'}

    TreeSet<Person> personSetByAddress = new TreeSet<>();
    Person personA = new Person();
    personA.setCountry("D");
    personSetByAddress.add(personA);
    Person personC = new Person();
    personC.setCountry("A");
    personC.setState("B");
    personC.setCity("C");
    personSetByAddress.add(personC);

    Address addressAB = new Address();
    addressAB.setCountry("A");
    addressAB.setState("B");

    System.out.println(personSetByAddress.floor(new AddressPersonAdapter(addressAB)));

    Yields:
    Person{hashCode=10000000, country='D', state='null', city='null'}

 //we need this class to allow flooring just by id
 public class IntegerPersonAdapter extends Person {
    private Integer id;
    public IntegerPersonAdapter(Integer id) {
        this.id = id;
    }

    @Override
    public boolean equals(Object o) {
        return id.equals(o);
    }

    @Override
    public int hashCode() {
        return id.hashCode();
    }

    @Override
    public int compareTo(Object o) {
        return id.hashCode() - o.hashCode();
    }

    @Override
    public String toString() {
        return id.toString();
    }
}

public class StrictCountryTreeSet extends TreeSet<Person> {

    @Override
    public Person floor(Person e) {
        Person candidate = super.floor(e);
        if (candidate != null) {
            //we check if the country is the same
            int candidateCode = candidate.hashCode();
            int eCode = e.hashCode();
            if (candidateCode == eCode) {
                return candidate;
            } else {
                int countryCandidate = candidateCode / 10000000;
                if (countryCandidate == (eCode / 10000000)) {
                    //we check if the state is the same
                    int stateCandidate = candidateCode / 10000;
                    if (stateCandidate == (eCode / 10000)) {
                        //we check if it is a state-level entry (all 4 city digits are zero)
                        if (candidateCode % 10000 == 0) {
                            return candidate;
                        } else { //since it's not exact match we haven't found a city - we need to get someone just from state
                            return this.floor(new IntegerPersonAdapter(stateCandidate * 10000));
                        }

                    } else if (stateCandidate % 1000 == 0) { //we check if it's a country-level entry (all 3 state digits are zero)
                        return candidate;
                    } else {
                        return this.floor(new IntegerPersonAdapter(countryCandidate * 10000000));
                    }
                }
            }
        }
        return null;
    }
}

    TreeSet<Person> personSetByAddress = new StrictCountryTreeSet();
    Person personA = new Person();
    personA.setCountry("D");
    personSetByAddress.add(personA);
    Person personC = new Person();
    personC.setCountry("A");
    personC.setState("B");
    personC.setCity("C");
    personSetByAddress.add(personC);

    Address addressAB = new Address();
    addressAB.setCountry("A");
    addressAB.setState("B");

    System.out.println(personSetByAddress.floor(new AddressPersonAdapter(addressAB)));

    Yields:
    null
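The fallback that StrictCountryTreeSet performs (exact match, then state level, then country level) can also be written as at most three exact lookups in a hash map. This is a different technique from the floor-based one above, sketched here only as a possible alternative; the class name, `mostSpecific`, and the sample codes (D = 1, A = 2) are assumptions for illustration.

```java
import java.util.HashMap;
import java.util.Map;

class PrefixFallbackSketch {
    static final int COUNTRY_FACTOR = 10_000_000;
    static final int STATE_FACTOR = 10_000;

    // Tries the full code first, then the state-level code, then the country-level code.
    static <T> T mostSpecific(Map<Integer, T> index, int queryCode) {
        int stateLevel = queryCode / STATE_FACTOR * STATE_FACTOR;       // city digits zeroed
        int countryLevel = queryCode / COUNTRY_FACTOR * COUNTRY_FACTOR; // state and city digits zeroed
        for (int code : new int[] {queryCode, stateLevel, countryLevel}) {
            T hit = index.get(code);
            if (hit != null) {
                return hit;
            }
        }
        return null; // nothing registered for this country at a matching level
    }

    public static void main(String[] args) {
        Map<Integer, String> index = new HashMap<>();
        index.put(10_000_000, "D");     // country D only
        index.put(20_010_000, "A/E");   // country A, state E
        index.put(20_020_001, "A/B/C"); // country A, state B, city C

        System.out.println(mostSpecific(index, 20_020_000)); // null
        index.put(20_000_000, "A");     // country A only
        System.out.println(mostSpecific(index, 20_020_000)); // A
    }
}
```

Each query costs a constant number of hash lookups, and entries from another country or state are never returned, at the price of losing the ordered traversal that a TreeSet offers.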
