Java: compare two iterators and detect which elements were added, removed, or are the same between them
I am writing some code that works on two huge lists of data, using two iterators. For simplicity, you can imagine that both are lists of numbers. The same number can be present in list 1, in list 2, or in both.
What it should do is walk through both lists and, while doing so, determine which of the encountered items are present in both lists, which are only in list 1, and which are only in list 2.
I could create sets containing all values of the first and of the second list and use those to compute the difference, but the code will be used to compare really large data sets (millions of records). Loading both sets into memory is not an option; the data has to be processed in a "streaming fashion".
I can sort both lists before processing them, so you can assume they will be sorted.
This is the best I could come up with, but in some cases it gets stuck in an endless loop:
public class ChangeScanner {
    public static <T> void compareEntriesOfTwoStreams(Iterator<T> sourceOne,
                                                      Iterator<T> sourceTwo,
                                                      Comparator<T> comparator) {
        T valueInOne = sourceOne.next();
        T valueInTwo = sourceTwo.next();
        while (sourceOne.hasNext() || sourceTwo.hasNext()) {
            if (comparator.compare(valueInOne, valueInTwo) == 0) {
                System.out.println("Present in both list 1 and 2: " + valueInOne);
                valueInOne = getNextValue(valueInOne, sourceOne);
                valueInTwo = getNextValue(valueInTwo, sourceTwo);
            } else if (comparator.compare(valueInOne, valueInTwo) < 0) {
                System.out.println("Present in list 1, Not present in list 2: " + valueInOne);
                valueInOne = getNextValue(valueInOne, sourceOne);
            } else if (comparator.compare(valueInOne, valueInTwo) > 0) {
                System.out.println("Not present in list 1, Present in list 2: " + valueInTwo);
                valueInTwo = getNextValue(valueInTwo, sourceTwo);
            }
        }
    }

    private static <T> T getNextValue(T current, Iterator<T> iterator) {
        if (iterator.hasNext()) {
            return iterator.next();
        }
        return current;
    }
}
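This all works great until one of the iterators runs out.
In this example, iterator 2 is now finished (13 is its last element).
The current logic gets stuck on the last check, because it can no longer advance iterator 2 while iterator 1 still has more elements: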
check 14 and 13 -> 14 is larger than 13 so I know that 13 is not present in list 1. Advance two.
check 14 and 13 -> 14 is larger than 13 so I know that 13 is not present in list 1. Advance two.
check 14 and 13 -> 14 is larger than 13 so I know that 13 is not present in list 1. Advance two.
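This is where I don't know what to do. I am pretty sure I have to include some extra logic for the case where either of the two iterators is finished.
Two questions:
I have been looking for a third-party library that can do this, since I would rather not invent it myself. If there is one, please let me know :)
If there is not, I would like to know what check I can add to handle one of the two iterators running out.

This is a fun challenge, because an iterator can only be consumed once and the code must not prematurely discard values it has read from an iterator. I could only come up with a recursive solution, but it would be better if you could rewrite it with a loop.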
static <T> void diff(Iterator<T> lefts, Iterator<T> rights, Comparator<T> comparator,
                     Consumer<T> onlyLeft, Consumer<T> equals, Consumer<T> onlyRight) {
    while (lefts.hasNext() && rights.hasNext()) {
        recur(lefts.next(), rights.next(), lefts, rights, comparator, onlyLeft, equals, onlyRight);
    }
    if (!lefts.hasNext()) {
        rights.forEachRemaining(onlyRight);
    }
    if (!rights.hasNext()) {
        lefts.forEachRemaining(onlyLeft);
    }
}

static <T> void recur(T left, T right, Iterator<T> lefts, Iterator<T> rights,
                      Comparator<T> comparator, Consumer<T> onlyLeft, Consumer<T> equals,
                      Consumer<T> onlyRight) {
    if (comparator.compare(left, right) == 0) {
        equals.accept(left);
    } else if (comparator.compare(left, right) < 0) {
        onlyLeft.accept(left);
        if (lefts.hasNext()) {
            recur(lefts.next(), right, lefts, rights, comparator, onlyLeft, equals, onlyRight);
        } else {
            onlyRight.accept(right);
        }
    } else {
        onlyRight.accept(right);
        if (rights.hasNext()) {
            recur(left, rights.next(), lefts, rights, comparator, onlyLeft, equals, onlyRight);
        } else {
            onlyLeft.accept(left);
        }
    }
}
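The recursive version above adds a stack frame for every consecutive unmatched element, which can become a concern with millions of records. A minimal sketch of a loop-based rewrite with the same diff signature might look like this. The class name IterativeDiff and the hasLeft/hasRight flags are assumptions made for illustration, and both iterators are assumed to be sorted consistently with the comparator.

```java
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;

class IterativeDiff {

    /** Loop-based sketch of diff(...); inputs are assumed sorted according to comparator. */
    public static <T> void diff(Iterator<T> lefts, Iterator<T> rights, Comparator<T> comparator,
                                Consumer<T> onlyLeft, Consumer<T> equals, Consumer<T> onlyRight) {
        // One buffered value per side, plus a flag saying whether that value is valid.
        boolean hasLeft = lefts.hasNext();
        boolean hasRight = rights.hasNext();
        T left = hasLeft ? lefts.next() : null;
        T right = hasRight ? rights.next() : null;

        while (hasLeft && hasRight) {
            int order = comparator.compare(left, right);
            if (order == 0) {
                equals.accept(left);
            } else if (order < 0) {
                onlyLeft.accept(left);
            } else {
                onlyRight.accept(right);
            }
            // Advance only the side(s) whose buffered value was just reported.
            if (order <= 0) {
                hasLeft = lefts.hasNext();
                left = hasLeft ? lefts.next() : null;
            }
            if (order >= 0) {
                hasRight = rights.hasNext();
                right = hasRight ? rights.next() : null;
            }
        }

        // At most one side still holds a buffered value; report it and drain the rest.
        if (hasLeft) {
            onlyLeft.accept(left);
            lefts.forEachRemaining(onlyLeft);
        }
        if (hasRight) {
            onlyRight.accept(right);
            rights.forEachRemaining(onlyRight);
        }
    }

    public static void main(String[] args) {
        diff(List.of(1, 2, 3, 10).iterator(), List.of(3, 4, 10, 12).iterator(), Integer::compare,
                v -> System.out.println("Left " + v),
                v -> System.out.println("Both " + v),
                v -> System.out.println("Right " + v));
    }
}
```

Because the flags, not null, mark an exhausted side, a buffered value is never dropped: it is either reported inside the loop or by the drain step afterwards.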
We can write the function by following the pseudocode below, so that the corner cases are handled.
Output:
Not present in list 2, Present in list 1: 1
Not present in list 2, Present in list 1: 2
present in both list : 3
Not present in list 1, Present in list 2: 4
present in both list : 10
Not present in list 1, Present in list 2: 12
Adding the following lines inside the while loop solved the problem:
if (!sourceOne.hasNext()) {
    sourceTwo.next();
}
if (!sourceTwo.hasNext()) {
    sourceOne.next();
}
Complete code:
public static <T> void compareEntriesOfTwoStreams(Iterator<T> sourceOne,
                                                  Iterator<T> sourceTwo,
                                                  Comparator<T> comparator) {
    T valueInOne = sourceOne.next();
    T valueInTwo = sourceTwo.next();
    while (sourceOne.hasNext() || sourceTwo.hasNext()) {
        if (comparator.compare(valueInOne, valueInTwo) == 0) {
            System.out.println("Present in both list 1 and 2: " + valueInOne);
            valueInOne = getNextValue(valueInOne, sourceOne);
            valueInTwo = getNextValue(valueInTwo, sourceTwo);
        } else if (comparator.compare(valueInOne, valueInTwo) < 0) {
            System.out.println("Present in list 1, Not present in list 2: " + valueInOne);
            valueInOne = getNextValue(valueInOne, sourceOne);
        } else if (comparator.compare(valueInOne, valueInTwo) > 0) {
            System.out.println("Not present in list 1, Present in list 2: " + valueInTwo);
            valueInTwo = getNextValue(valueInTwo, sourceTwo);
        }
        if (!sourceOne.hasNext()) {
            sourceTwo.next();
        }
        if (!sourceTwo.hasNext()) {
            sourceOne.next();
        }
    }
}

private static <T> T getNextValue(T current, Iterator<T> iterator) {
    if (iterator.hasNext()) {
        return iterator.next();
    }
    return current;
}
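Note that the added next() calls read a value without using it, and Iterator.next() throws NoSuchElementException once an iterator is empty, so this patch may skip elements or fail on some inputs. As an alternative, here is a hedged sketch that keeps the question's message format but buffers exactly one look-ahead value per side. The Cursor helper, the LookAheadCompare class name, and the extra sink parameter are hypothetical names introduced for illustration; sorted input is assumed.

```java
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;

class LookAheadCompare {

    /** Hypothetical helper: holds the current element so it is never read and then dropped. */
    static final class Cursor<T> {
        private final Iterator<T> source;
        private T current;
        private boolean done;

        Cursor(Iterator<T> source) {
            this.source = source;
            advance();
        }

        boolean isDone() { return done; }

        T current() { return current; }

        void advance() {
            if (source.hasNext()) {
                current = source.next();
            } else {
                current = null;
                done = true;
            }
        }
    }

    public static <T> void compareEntriesOfTwoStreams(Iterator<T> sourceOne, Iterator<T> sourceTwo,
                                                      Comparator<T> comparator, Consumer<String> sink) {
        Cursor<T> one = new Cursor<>(sourceOne);
        Cursor<T> two = new Cursor<>(sourceTwo);
        while (!one.isDone() || !two.isDone()) {
            // An exhausted side is treated as larger than everything, so the other side drains.
            int order = one.isDone() ? 1
                    : two.isDone() ? -1
                    : comparator.compare(one.current(), two.current());
            if (order == 0) {
                sink.accept("Present in both list 1 and 2: " + one.current());
                one.advance();
                two.advance();
            } else if (order < 0) {
                sink.accept("Present in list 1, Not present in list 2: " + one.current());
                one.advance();
            } else {
                sink.accept("Not present in list 1, Present in list 2: " + two.current());
                two.advance();
            }
        }
    }

    public static void main(String[] args) {
        // List 2 ends at 13 while list 1 still has 14 and 15, as in the question's stuck case.
        compareEntriesOfTwoStreams(List.of(12, 14, 15).iterator(), List.of(12, 13).iterator(),
                Integer::compare, System.out::println);
    }
}
```

Passing a sink instead of calling System.out.println directly is only a convenience for testing; any callback with the same shape would do.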
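What do you expect to get from the function? The differing elements, the number of differing elements, a boolean telling you they are not equal?
In the real code, the System.out.println calls are calls to (Java 8) functions that you can also supply to this utility. That way you can "hook in" and do whatever you want when any of the three cases (added, removed, or same) occurs. In this example, my goal is simply to print the right message for each case.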
Left 1
Both 2
Right 2
Left 3
Left 4
Both 5
Left 6
Right 7
Right 8
while list1 and list2 both have elements
    if (list1.next < list2.next)
        keep advancing list1; these are in list1 and not in list2
    else if (list1.next > list2.next)
        keep advancing list2; these are in list2 and not in list1
    else if (list1.next == list2.next)
        advance both list1 and list2; these are common to both lists
while (list1.hasNext)
    all remaining are only in list1
while (list2.hasNext)
    all remaining are only in list2
public static <T> void compareEntriesOfTwoStreams(Iterator<T> sourceOne, Iterator<T> sourceTwo,
                                                  Comparator<T> comparator) {
    T valueInOne = sourceOne != null ? sourceOne.hasNext() ? sourceOne.next() : null : null;
    T valueInTwo = sourceTwo != null ? sourceTwo.hasNext() ? sourceTwo.next() : null : null;
    while (valueInOne != null && valueInTwo != null) {
        if (comparator.compare(valueInOne, valueInTwo) > 0) {
            // advance sourceTwo
            while (valueInTwo != null && comparator.compare(valueInOne, valueInTwo) > 0) {
                System.out.println("Not present in list 1, Present in list 2: " + valueInTwo);
                valueInTwo = sourceTwo.hasNext() ? sourceTwo.next() : null;
            }
        } else if (comparator.compare(valueInOne, valueInTwo) < 0) {
            // advance sourceOne
            while (valueInOne != null && comparator.compare(valueInOne, valueInTwo) < 0) {
                // this will advance
                System.out.println("Not present in list 2, Present in list 1: " + valueInOne);
                valueInOne = sourceOne.hasNext() ? sourceOne.next() : null;
            }
        } else if (comparator.compare(valueInOne, valueInTwo) == 0) {
            System.out.println("present in both list:" + valueInOne);
            valueInTwo = sourceTwo.hasNext() ? sourceTwo.next() : null;
            valueInOne = sourceOne.hasNext() ? sourceOne.next() : null;
            // present in both list if one of list is ended
        }
    }
    while (valueInOne != null) {
        // all these are only in list1
        System.out.println("Not present in list 2, Present in list 1: " + valueInOne);
        valueInOne = sourceOne.hasNext() ? sourceOne.next() : null;
    }
    while (valueInTwo != null) {
        // these are only in list2
        System.out.println("Not present in list 1, Present in list 2: " + valueInTwo);
        valueInTwo = sourceTwo.hasNext() ? sourceTwo.next() : null;
    }
}
compareEntriesOfTwoStreams(Stream.of(1,2,3,10).iterator(), Stream.of(3,4,10,12).iterator(), Integer::compare);