How to find common sequences between two lists in Java

java list algorithm

I am trying to find the common sequences between two lists. I can do it when every value in the lists is unique. For example:

listOne: [1, 8, 3, 13, 14, 6, 11]
listTwo: [8, 9, 10, 11, 12, 13, 14, 15]

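As we can see, the sequence [13, 14] is common to both lists. My algorithm is: using the retainAll function, I get the common values, which for this example are [8, 11, 13, 14]. Since retainAll modifies list one, I create a copy of list one. Then I look up the positions of these common values in their original lists (list one and list two). After that, I take the difference between the positions of consecutive values. For example: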

value   index in list1   index in list2   difList1     difList2
[8]     1                0                -1  (0-1)    -1  (0-1)
[11]    6                3                -5  (1-6)    -3  (0-3)
[13]    3                5                 3  (6-3)    -2  (3-5)
[14]    4                6                -1  (3-4)    -1  (5-6)

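If the difList1 and difList2 entries for a value are both -1, that value and the previous one are consecutive in both lists and form a sequence. Since [14] satisfies this condition in the example, the sequence is [13, 14].

My code for this case is: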

import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class CommonSequence {
    public static void main(String[] args) {
        List<Integer> list1 = new ArrayList<>(Arrays.asList(1, 8, 3, 13, 14, 6, 11));
        List<Integer> list2 = new ArrayList<>(Arrays.asList(8, 9, 10, 11, 12, 13, 14, 15));
        // retainAll modifies list1, so keep a copy of the original order first
        List<Integer> ori_list1 = new ArrayList<>(list1);
        list1.retainAll(list2);
        List<Integer> difList1 = new ArrayList<>();
        List<Integer> diffList2 = new ArrayList<>();
        // The first common element has no previous element, so store -1 as a placeholder.
        difList1.add(-1);
        diffList2.add(-1);
        System.out.println(list1); // common elements are [8, 13, 14, 11]

        for (int k = 1; k < list1.size(); k++) { // Let's say k = 2 ..
            int index1_1 = ori_list1.indexOf(list1.get(k));     // index of value 14 in the original list -> 4
            int index1_2 = ori_list1.indexOf(list1.get(k - 1)); // index of value 13 in the original list -> 3
            int diff_list1 = index1_2 - index1_1; // 3 - 4 = -1, which means they are consecutive
            difList1.add(diff_list1);
            int index2_1 = list2.indexOf(list1.get(k));     // same lookup in list2 -> 6
            int index2_2 = list2.indexOf(list1.get(k - 1)); // same lookup in list2 -> 5
            int diff_doc2 = index2_2 - index2_1; // 5 - 6 = -1
            diffList2.add(diff_doc2);
        }
        for (int y = 1; y < difList1.size(); y++) {
            // Both differences are -1 for value 14, so 13 and 14 are adjacent in both lists
            if (difList1.get(y) == -1 && diffList2.get(y) == -1) {
                System.out.println("The common sequence is:" + list1.get(y - 1) + " " + list1.get(y));
            }
        }
    }
}

I'm not sure what you are trying to do, but if I were looking for common sequences, I would create sublists and compare them:
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class CommonSequenceFinder {

    public static Set<List<Integer>> findCommonSequence(List<Integer> source, List<Integer> target, int startLength) {
        Set<List<Integer>> sequences = new LinkedHashSet<>();

        // The algorithm works this way:
        // we generate every sublist of the source list that has at least startLength elements
        // and then check each of those sublists against the target list to see if the target contains it.

        // length runs from startLength to source.size(), so every sublist size is checked;
        // e.g. if startLength is 2 and source has 10 elements, lengths 2 to 10 are tried
        for (int length = startLength; length <= source.size(); length++) {
            // startIndex moves from 0 to source.size() - length, so for length 2 and 10 elements
            // it generates the sublists with indexes 0,1; 1,2; 2,3 ... 8,9
            for (int startIndex = 0; startIndex + length <= source.size(); startIndex++) {
                // creates a lightweight sublist view that shares the data
                List<Integer> sublist = source.subList(startIndex, startIndex + length);
                // add all occurrences found in target to the set
                sequences.addAll(findSequenceIn(target, sublist));
            }
        }

        return sequences;
    }

    // Returns the sublist (as a copy) if it occurs as a contiguous run inside the target list
    private static Set<List<Integer>> findSequenceIn(List<Integer> target, List<Integer> sublist) {
        Set<List<Integer>> subsequences = new LinkedHashSet<>();

        // same process as in the first method, but with the length fixed to the length of sublist
        for (int i = 0; i <= target.size() - sublist.size(); i++) {
            // create another sublist view, this time from target (again sharing the data)
            List<Integer> testSublist = target.subList(i, i + sublist.size());

            // if the two sublists are equal, target contains the sublist from the source list
            if (testSublist.equals(sublist)) {
                // store a copy so the result does not depend on the source list
                subsequences.add(new ArrayList<>(sublist));
            }
        }

        return subsequences;
    }
}
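
The sublist comparison above is brute force. As an alternative sketch (not the answerer's code), a dynamic-programming pass in the style of the longest-common-substring table finds the common runs in roughly O(n·m) comparisons. The class name `CommonRuns` and the method `findCommonRuns` are hypothetical names chosen for this illustration. It also reports only maximal runs, so a sub-run of a longer common run is not listed separately, which differs from the brute-force version.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class CommonRuns {

    // Returns every maximal run of consecutive elements that occurs in both lists
    // and has at least minLength elements (hypothetical helper, not from the thread).
    public static Set<List<Integer>> findCommonRuns(List<Integer> source, List<Integer> target, int minLength) {
        Set<List<Integer>> runs = new LinkedHashSet<>();
        // prev[j] = length of the common run ending at source[i-2] and target[j-1]
        int[] prev = new int[target.size() + 1];
        for (int i = 1; i <= source.size(); i++) {
            int[] curr = new int[target.size() + 1];
            for (int j = 1; j <= target.size(); j++) {
                if (source.get(i - 1).equals(target.get(j - 1))) {
                    // extend the run that ended one position earlier in both lists
                    curr[j] = prev[j - 1] + 1;
                    // the run is maximal if the next elements differ or a list ends here
                    boolean canExtend = i < source.size() && j < target.size()
                            && source.get(i).equals(target.get(j));
                    if (!canExtend && curr[j] >= minLength) {
                        runs.add(new ArrayList<>(source.subList(i - curr[j], i)));
                    }
                }
            }
            prev = curr;
        }
        return runs;
    }

    public static void main(String[] args) {
        List<Integer> listOne = List.of(1, 8, 3, 13, 14, 8, 10, 6, 11);
        List<Integer> listTwo = List.of(8, 9, 10, 11, 12, 8, 10, 13, 14, 15);
        System.out.println(findCommonRuns(listOne, listTwo, 2)); // prints [[13, 14], [8, 10]]
    }
}
```

Only two rows of the table are kept at a time, so the extra memory is proportional to the length of the target list.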

Processing with code points:

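[1] convert list 2 to str2

[2] convert list 1 to str1, dropping from the left every int value that is not in list 2

[3] slide str1 over str2, recording the longest sequences where str1 lines up with str2
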
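import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.IntStream;

public class CodePointSequences {
    public static void main(String[] args) {
        List<int[]> results = new ArrayList<>();

        // Each int is stored as one code point of the string
        String str2 = new String(
                new int[] { 1, 8, 3, 10, 13, 14, 8, 10, 14, 6, 11 }, 0, 11); // [1]
        int[] tmp = new int[] { 8, 9, 10, 11, 12, 8, 10, 13, 14, 15 };
        int[] arr1 = IntStream.of(tmp).dropWhile(
                c -> str2.indexOf(c) < 0).toArray();      // [2] dropWhile needs Java 9+
        String str1 = new String(arr1, 0, arr1.length);
        for (int i = str1.length() - 2; i >= 0; i--) {    // [3]
            for (int j = 0; j < str2.length() - 1; j++) {
                int[] idx2 = new int[] { j }; // mutable position in str2 for the lambda
                // take values from str1 while they match str2, stopping at the end of str2
                int[] rslt = str1.substring(i).codePoints().takeWhile(
                        c -> idx2[0] < str2.length() && c == str2.charAt(idx2[0]++)).toArray();
                if (rslt.length >= 2) {
                    results.add(rslt);
                }
            }
        }

        results.forEach(a -> System.out.println(Arrays.toString(a)));
    }
}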


Output:
[13, 14]
[10, 13, 14]
[8, 10]
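
The code-point snippet above hard-codes its arrays. If the values arrive as lists instead, a conversion along the following lines might be used. This is only a sketch: `CodePointStrings`, `toCodePointString` and `toValues` are hypothetical names, and it assumes every value is a valid Unicode code point below 0x10000, because larger values occupy two chars and would no longer line up with `charAt`.

```java
import java.util.Arrays;
import java.util.List;

public class CodePointStrings {

    // Hypothetical helper: packs the list values into a String, one code point per value.
    // new String(int[], int, int) throws IllegalArgumentException for invalid code points
    // (for example negative numbers).
    public static String toCodePointString(List<Integer> values) {
        int[] codePoints = values.stream().mapToInt(Integer::intValue).toArray();
        return new String(codePoints, 0, codePoints.length);
    }

    // Reverse direction: read the values back out of the String.
    public static int[] toValues(String s) {
        return s.codePoints().toArray();
    }

    public static void main(String[] args) {
        List<Integer> list1 = List.of(1, 8, 3, 13, 14, 6, 11);
        String str2 = toCodePointString(list1);
        System.out.println(str2.length());                     // prints 7
        System.out.println(Arrays.toString(toValues(str2)));   // prints [1, 8, 3, 13, 14, 6, 11]
    }
}
```

Because the array is built from the list at run time, its size and contents do not need to be known in advance.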
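Hello @Enerccio. My main problem is basically this: we have list 1: [1,8,3,13,14,8,10,6,11] and list 2: [8,9,10,11,12,8,10,13,14,15]. I want to see output like: the common sequences are [[13,14],[8,10]]. As you can see, there are two common sequences, each with two values. But it is not limited to two values; a sequence may contain more than two. Could you explain your code a little? Maybe with a small demo.

It works with your lists. When I call System.out.println(findCommonSequence(listOne, listTwo, 2)); I get the result [[13,14],[8,10]]. I will add a breakdown of the algorithm in an edit.

Yes, I used your algorithm in my project, but its O(n^4) complexity is much higher than I expected. It looks like brute force and needs some optimization. Thanks a lot!

You can always precompute all the sublists and store them in a map, which would cut the complexity by at least a factor of n or n^2 at the cost of memory.

Thank you @Kaplan. Do you have any thoughts on the time complexity of the algorithm? Also, we need to get the array sizes and array values dynamically. How can I create a list, such as List list1 = new ArrayList(Arrays.asList(1,8,3,13,14,6,11)), and put its values into String str2 = new S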
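
The suggestion in the comments to precompute the sublists and store them could be sketched roughly as follows. This is an assumption about what was meant, not code from the thread: the class name `PrecomputedSublists` is hypothetical, and a `HashSet` is used in place of a map because only membership is needed.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class PrecomputedSublists {

    public static Set<List<Integer>> findCommonSequence(List<Integer> source, List<Integer> target, int startLength) {
        // Precompute every contiguous sublist of target with at least startLength elements.
        Set<List<Integer>> targetSublists = new HashSet<>();
        for (int from = 0; from < target.size(); from++) {
            for (int to = from + startLength; to <= target.size(); to++) {
                targetSublists.add(new ArrayList<>(target.subList(from, to)));
            }
        }

        // Look up each sublist of source in the precomputed set instead of rescanning target.
        Set<List<Integer>> sequences = new LinkedHashSet<>();
        for (int length = startLength; length <= source.size(); length++) {
            for (int from = 0; from + length <= source.size(); from++) {
                List<Integer> candidate = source.subList(from, from + length);
                if (targetSublists.contains(candidate)) {
                    sequences.add(new ArrayList<>(candidate));
                }
            }
        }
        return sequences;
    }

    public static void main(String[] args) {
        List<Integer> listOne = List.of(1, 8, 3, 13, 14, 8, 10, 6, 11);
        List<Integer> listTwo = List.of(8, 9, 10, 11, 12, 8, 10, 13, 14, 15);
        System.out.println(findCommonSequence(listOne, listTwo, 2)); // prints [[13, 14], [8, 10]]
    }
}
```

The trade-off is memory: the set holds every qualifying sublist of the target, which grows quickly for long lists.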