Java: Collecting and retrieving data filtered by multiple fields
I have a class:
public class Address {
private String country;
private String state;
private String city;
}
I also have a list of Person objects. The Person class looks like this:
public class Person {
private String country;
private String state;
private String city;
//other fields
}
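I need to filter the Person objects and get the most suitable one. The Address object has at least one non-null field. A Person object may have none, some, or all of the mentioned fields initialized.
Here is a possible input example: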
Three Person objects:
a. PersonA: country = 'A'
b. PersonB: country = 'A', state = 'B'
c. PersonC: country = 'A', state = 'B', city = 'C'
Address object:
a. Address: country = 'A', state = 'B'
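The expected result after filtering is PersonB. If only the PersonA and PersonC objects were present, PersonA would be preferred.
I would like to show how I did this, but it is a pure brute-force algorithm and I don't like it. Its complexity grows as fields are added. I also considered using Guava's filter by predicate, but I don't know what the predicate should be.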
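For reference, the brute-force baseline described above might look like the following minimal sketch. The question does not show the asker's code, so `BruteForceMatch`, `Loc`, `score`, and `bestMatch` are hypothetical names, and the rule "every non-null field of the person must equal the address field, and more matching fields win" is an assumption drawn from the example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

class BruteForceMatch {
    // Hypothetical stand-in for both Address and Person: three fields, null means "not set".
    static final class Loc {
        final String name, country, state, city;
        Loc(String name, String country, String state, String city) {
            this.name = name; this.country = country; this.state = state; this.city = city;
        }
        String[] fields() { return new String[] {country, state, city}; }
    }

    // Number of non-null candidate fields if all of them equal the address fields, otherwise -1.
    static int score(Loc candidate, Loc address) {
        String[] c = candidate.fields();
        String[] a = address.fields();
        int score = 0;
        for (int i = 0; i < c.length; i++) {
            if (c[i] == null) continue;          // unset fields are ignored
            if (!c[i].equals(a[i])) return -1;   // a set field that differs disqualifies
            score++;
        }
        return score;
    }

    // Checks every field of every person; requires at least one matching field.
    static Optional<Loc> bestMatch(List<Loc> people, Loc address) {
        Loc best = null;
        int bestScore = 0;
        for (Loc p : people) {
            int s = score(p, address);
            if (s > bestScore) { best = p; bestScore = s; }
        }
        return Optional.ofNullable(best);
    }

    public static void main(String[] args) {
        Loc address = new Loc("Address", "A", "B", null);
        List<Loc> people = Arrays.asList(
                new Loc("PersonA", "A", null, null),
                new Loc("PersonB", "A", "B", null),
                new Loc("PersonC", "A", "B", "C"));
        System.out.println(bestMatch(people, address).map(p -> p.name).orElse("none"));
    }
}
```

The cost is proportional to the number of persons times the number of fields, which is the growth the question complains about.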
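If there are filtering algorithms other than brute force, which one suits this kind of filtering best?
As I understand it, brute force means checking all fields of all entities. Well, avoiding that is not possible unless you refactor your classes, but there is a simple trick that can help. It uses the State pattern.
You can add a notNulls flag to both classes: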
public class Address {
private int notNulls = 0;
private String country;
private String state;
private String city;
}
public class Person {
private int notNulls = 0;
private String country;
private String state;
private String city;
//other fields
}
I will show a possible implementation of one setter, since the others are similar:
public void setCountry(String s) {
if (country == null) {
if (s != null) {
country = s;
notNulls++;
}
} else {
if (s == null) {
country = null;
notNulls--;
} else {
country = s;
}
}
}
public boolean isValid() {
return notNulls != 0;
}
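The same counter can be kept with less repetition by computing the change once per setter. This is only a sketch of the idea above; `CountedAddress`, `delta`, and `getNotNulls` are hypothetical names that do not appear in the answer.

```java
class CountedAddress {
    private int notNulls = 0;
    private String country;
    private String state;
    private String city;

    // +1 when a field goes from null to a value, -1 for the reverse, 0 otherwise.
    private static int delta(String oldValue, String newValue) {
        if (oldValue == null && newValue != null) return 1;
        if (oldValue != null && newValue == null) return -1;
        return 0;
    }

    void setCountry(String s) { notNulls += delta(country, s); country = s; }
    void setState(String s)   { notNulls += delta(state, s);   state = s; }
    void setCity(String s)    { notNulls += delta(city, s);    city = s; }

    int getNotNulls() { return notNulls; }

    // Valid means at least one field is set.
    boolean isValid() { return notNulls != 0; }

    public static void main(String[] args) {
        CountedAddress a = new CountedAddress();
        a.setCountry("A");
        a.setState("B");
        a.setCountry(null);
        System.out.println(a.getNotNulls()); // prints 1: only state is still set
    }
}
```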
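Now you can simply loop over the objects.
To avoid brute force, you need to index your persons by address. For a good search you definitely need a country (guess it or default it somehow, otherwise the results will be too imprecise anyway).
The index is a number: the first 3 digits stand for the country, the next 3 for the state, and the last 4 for the city. With this layout an int can hold 213 countries, each with up to 999 states and 9999 cities.
This lets us use hashCode and a TreeSet to index Person instances and look them up partially by address in O(log(n)) without touching their fields. The fields are touched when the TreeSet is built, and you will need some extra logic for modifying a Person to keep the index intact.
The index is computed part by part, in order, starting with the country: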
import java.util.HashMap;
import java.util.Map;
public class PartialAddressSearch {
private final static Map<String, AddressPartHolder> COUNTRY_MAP = new HashMap<>(200);
private static class AddressPartHolder {
int id;
Map<String, AddressPartHolder> subPartMap;
public AddressPartHolder(int id, Map<String, AddressPartHolder> subPartMap) {
this.id = id;
this.subPartMap = subPartMap;
}
}
public static int getCountryStateCityHashCode(String country, String state, String city) {
if (country != null && country.length() != 0) {
int result = 0;
AddressPartHolder countryHolder = COUNTRY_MAP.get(country);
if (countryHolder == null) {
countryHolder = new AddressPartHolder(COUNTRY_MAP.size() + 1, new HashMap<>());
COUNTRY_MAP.put(country, countryHolder);
}
result += countryHolder.id * 10000000;
if (state != null) {
AddressPartHolder stateHolder = countryHolder.subPartMap.get(state);
if (stateHolder == null) {
stateHolder = new AddressPartHolder(countryHolder.subPartMap.size() + 1, new HashMap<>());
countryHolder.subPartMap.put(state, stateHolder);
}
result += stateHolder.id * 10000;
if (city != null && city.length() != 0) {
AddressPartHolder cityHolder = stateHolder.subPartMap.get(city);
if (cityHolder == null) {
cityHolder = new AddressPartHolder(stateHolder.subPartMap.size() + 1, null);
stateHolder.subPartMap.put(city, cityHolder);
}
result += cityHolder.id;
}
}
return result;
} else {
throw new IllegalArgumentException("Non-empty country is expected");
}
}
}
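The digit layout used by getCountryStateCityHashCode can be checked in isolation. The sketch below packs and unpacks the three ids with the same multipliers (10000000 and 10000); `AddressCode` and its method names are hypothetical, and the range checks are an added assumption based on the limits stated in the answer.

```java
class AddressCode {
    static final int COUNTRY_FACTOR = 10_000_000; // country id occupies the leading digits
    static final int STATE_FACTOR = 10_000;       // state id occupies the next 3 digits

    // Packs the ids into one int; 0 for state or city means "not set".
    static int encode(int countryId, int stateId, int cityId) {
        if (countryId < 1 || countryId > 213) throw new IllegalArgumentException("country id out of range");
        if (stateId < 0 || stateId > 999) throw new IllegalArgumentException("state id out of range");
        if (cityId < 0 || cityId > 9999) throw new IllegalArgumentException("city id out of range");
        return countryId * COUNTRY_FACTOR + stateId * STATE_FACTOR + cityId;
    }

    static int countryOf(int code) { return code / COUNTRY_FACTOR; }
    static int stateOf(int code)   { return (code / STATE_FACTOR) % 1000; }
    static int cityOf(int code)    { return code % STATE_FACTOR; }

    public static void main(String[] args) {
        System.out.println(encode(1, 1, 0));        // prints 10010000
        System.out.println(encode(213, 999, 9999)); // prints 2139999999, still below Integer.MAX_VALUE
    }
}
```

A country id of 214 with the largest state and city ids would give 2149999999, which exceeds Integer.MAX_VALUE (2147483647); that is where the limit of 213 countries comes from.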
After this, your filtering code shrinks to filling the index and computing the floor of the partial address:
TreeSet<Person> personSetByAddress = new TreeSet<>();
Person personA = new Person();
personA.setCountry("A");
personSetByAddress.add(personA);
Person personB = new Person();
personB.setCountry("A");
personB.setState("B");
personSetByAddress.add(personB);
Person personC = new Person();
personC.setCountry("A");
personC.setState("B");
personC.setCity("C");
personSetByAddress.add(personC);
Address addressAB = new Address();
addressAB.setCountry("A");
addressAB.setState("B");
System.out.println(personSetByAddress.floor(new AddressPersonAdapter(addressAB)));
Yields:
Person{hashCode=10010000, country='A', state='B', city='null'}
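The answer does not show how Person is ordered inside the TreeSet or what AddressPersonAdapter looks like. The following sketch assumes the ordering is simply by the precomputed address code; `CodedPerson` and `floorLabel` are hypothetical stand-ins used only to illustrate how floor behaves.

```java
import java.util.TreeSet;

class FloorLookupDemo {
    // Hypothetical stand-in for Person: a label plus a precomputed address code.
    static final class CodedPerson implements Comparable<CodedPerson> {
        final String label;
        final int code;
        CodedPerson(String label, int code) { this.label = label; this.code = code; }
        @Override
        public int compareTo(CodedPerson other) { return Integer.compare(code, other.code); }
    }

    // floor returns the greatest element whose code is <= queryCode, or null if there is none.
    static String floorLabel(TreeSet<CodedPerson> index, int queryCode) {
        CodedPerson hit = index.floor(new CodedPerson("query", queryCode));
        return hit == null ? null : hit.label;
    }

    public static void main(String[] args) {
        TreeSet<CodedPerson> index = new TreeSet<>();
        index.add(new CodedPerson("PersonA", 10_000_000)); // country only
        index.add(new CodedPerson("PersonB", 10_010_000)); // country + state
        index.add(new CodedPerson("PersonC", 10_010_001)); // country + state + city
        System.out.println(floorLabel(index, 10_010_000)); // prints PersonB
    }
}
```

Because a less specific address always has a smaller code than a more specific one under the same prefix, floor falls back from city to state to country, but it can also fall through to a different state or country, which is why a stricter check is needed.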
A test against a different state, using StrictCountryTreeSet, yields null:
TreeSet<Person> personSetByAddress = new StrictCountryTreeSet();
Person personD = new Person();
personD.setCountry("D");
personSetByAddress.add(personD);
Person personE = new Person();
personE.setCountry("A");
personE.setState("E");
personSetByAddress.add(personE);
Person personC = new Person();
personC.setCountry("A");
personC.setState("B");
personC.setCity("C");
personSetByAddress.add(personC);
Address addressA = new Address();
addressA.setCountry("A");
Address addressAB = new Address();
addressAB.setCountry("A");
addressAB.setState("B");
Address addressABC = new Address();
addressABC.setCountry("A");
addressABC.setState("B");
addressABC.setCity("C");
System.out.println(personSetByAddress.floor(new AddressPersonAdapter(addressAB)));
Yields:
null
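Note that in this case you need to store the hash code result in the Address and Person classes to avoid recomputing it.
Hope my answer helps!
Is it correct that the country field is mandatory?
@bashnesnos No, all fields can be null or partially initialized. If country and state are non-null in the Address, and the list contains a person with only the country field initialized, the filter should return that Person object.
Since there is a strict dependency between the fields, the idea proposed by Xentros fits well. You still need to check every field for equality, so you will check all fields of the entities with the same number of fields and, if none fits, fall back to the next best (suppose A-null-null, A-D-null, A-E-null, and A-B-null are being filtered). So it isn't far from brute force, really.
@bashnesnos Yes. I want to know whether there is an algorithm that avoids brute force in this situation. In more specific cases the Xentros suggestion can be applied.
Why is the notNulls variable used here? It is completely redundant. You can just check in the isValid() method whether all fields are filled in.
That would not speed anything up. The question is how to avoid traversing all the fields every time.
@Xentros I see. Nice solution. I don't think it will work correctly when any field can be null and the most applicable result should be chosen. But if there is a strict dependency between fields as here (no city without a state, no state without a country), your solution is a good choice.
If all fields can be null, there is no need to filter like this :)
@Xentros No, I mean the Address has the country and state fields while the Person object has only the country field. In that case we can return the more general Person object, so a Person only needs the country field initialized to be a suitable result.
But I think one can speculate here until he comes to check all the fields and decides which result should count as the general one.
Wow. Looks incredible. I have to experiment with it.
@Dragon, any progress with the experiments? Does it work for you?
Looks good, but I tried to apply the validation logic to the case where the person's country differs from the address's country. In test case #2, when the person has country "D", the filter should return nothing. Anyway, I have almost decided to accept your answer; it needs more experimenting.
@Dragon Yes, that is a good case. I have edited my answer to include it as well.
@Dragon, I have also just added a case for a different state.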
TreeSet<Person> personSetByAddress = new TreeSet<>();
Person personA = new Person();
personA.setCountry("A");
personSetByAddress.add(personA);
Person personC = new Person();
personC.setCountry("A");
personC.setState("B");
personC.setCity("C");
personSetByAddress.add(personC);
Address addressAB = new Address();
addressAB.setCountry("A");
addressAB.setState("B");
System.out.println(personSetByAddress.floor(new AddressPersonAdapter(addressAB)));
Yields:
Person{hashCode=10000000, country='A', state='null', city='null'}
TreeSet<Person> personSetByAddress = new TreeSet<>();
Person personA = new Person();
personA.setCountry("D");
personSetByAddress.add(personA);
Person personC = new Person();
personC.setCountry("A");
personC.setState("B");
personC.setCity("C");
personSetByAddress.add(personC);
Address addressAB = new Address();
addressAB.setCountry("A");
addressAB.setState("B");
System.out.println(personSetByAddress.floor(new AddressPersonAdapter(addressAB)));
Yields:
Person{hashCode=10000000, country='D', state='null', city='null'}
//we need this class to allow flooring just by id
public class IntegerPersonAdapter extends Person {
private Integer id;
public IntegerPersonAdapter(Integer id) {
this.id = id;
}
@Override
public boolean equals(Object o) {
return id.equals(o);
}
@Override
public int hashCode() {
return id.hashCode();
}
@Override
public int compareTo(Object o) {
return id.hashCode() - o.hashCode();
}
@Override
public String toString() {
return id.toString();
}
}
public class StrictCountryTreeSet extends TreeSet<Person> {
@Override
public Person floor(Person e) {
Person candidate = super.floor(e);
if (candidate != null) {
//we check if the country is the same
int candidateCode = candidate.hashCode();
int eCode = e.hashCode();
if (candidateCode == eCode) {
return candidate;
} else {
int countryCandidate = candidateCode / 10000000;
if (countryCandidate == (eCode / 10000000)) {
//we check if the state is the same
int stateCandidate = candidateCode / 10000;
if (stateCandidate == (eCode / 10000)) {
//we check if is a state
if (candidateCode % 10000 == 0) { //the city part (last 4 digits) is zero
return candidate;
} else { //since it's not exact match we haven't found a city - we need to get someone just from state
return this.floor(new IntegerPersonAdapter(stateCandidate * 10000));
}
} else if (stateCandidate % 1000 == 0) { //the state part is zero, so it's a country already
return candidate;
} else {
return this.floor(new IntegerPersonAdapter(countryCandidate * 10000000));
}
}
}
}
return null;
}
}
TreeSet<Person> personSetByAddress = new StrictCountryTreeSet();
Person personA = new Person();
personA.setCountry("D");
personSetByAddress.add(personA);
Person personC = new Person();
personC.setCountry("A");
personC.setState("B");
personC.setCity("C");
personSetByAddress.add(personC);
Address addressAB = new Address();
addressAB.setCountry("A");
addressAB.setState("B");
System.out.println(personSetByAddress.floor(new AddressPersonAdapter(addressAB)));
Yields:
null
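Given the strict country, state, city hierarchy, the same "most specific match within the same prefix" result can also be sketched without a TreeSet by probing a hash map at most three times: the exact code, the state-level code, and the country-level code. This is a different technique from the StrictCountryTreeSet above, not the answer's method; `PrefixFallbackLookup` and its members are hypothetical, and it assumes the same digit layout and one label per code.

```java
import java.util.HashMap;
import java.util.Map;

class PrefixFallbackLookup {
    static final int COUNTRY_FACTOR = 10_000_000;
    static final int STATE_FACTOR = 10_000;

    // Maps an address code to a person label (assumption: one person per code).
    private final Map<Integer, String> byCode = new HashMap<>();

    void put(int code, String label) { byCode.put(code, label); }

    // Tries the full code first, then the same code with the city cleared, then with the state cleared too.
    String find(int queryCode) {
        int stateLevel = queryCode / STATE_FACTOR * STATE_FACTOR;
        int countryLevel = queryCode / COUNTRY_FACTOR * COUNTRY_FACTOR;
        for (int probe : new int[] {queryCode, stateLevel, countryLevel}) {
            String hit = byCode.get(probe);
            if (hit != null) return hit;
        }
        return null; // nobody shares even the country
    }

    public static void main(String[] args) {
        PrefixFallbackLookup index = new PrefixFallbackLookup();
        index.put(10_000_000, "PersonD"); // country D
        index.put(20_010_000, "PersonE"); // country A, state E
        index.put(20_020_001, "PersonC"); // country A, state B, city C
        System.out.println(index.find(20_020_000)); // prints null: no A-B or A-only person exists
    }
}
```

Each lookup is three hash probes regardless of how many persons are indexed, at the cost of supporting only this fixed prefix hierarchy.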