C#: Finding a subsequence in a longer sequence


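I need to find a sequence inside other, larger sequences. For example, {1,3,2,3} occurs in {1,3,2,3,4,3} and in {5,1,3,2,3}. Is there a way to do this quickly with IEnumerable or some other approach?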

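You could start with something like this. After converting the lists to strings, you can look for the sequence with a substring search:

// numericList is the list to search; sequence (an assumed name) is the list to find.
// The surrounding commas stop "1,3" from matching inside "11,3".
string haystack = "," + String.Join(",", numericList) + ",";
string needle = "," + String.Join(",", sequence) + ",";
if (haystack.Contains(needle))
{
    // get sequence
}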

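This method finds a subsequence of any element type within a parent sequence, as long as the elements can be compared with Equals(). Note that it assumes the target sequence contains no null elements.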

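Update: thanks for the upvotes, everyone, but the original code actually had a bug! If a partial match was found but did not turn into a full match, the process ended instead of resetting, which is obviously incorrect when applied to something like {1,2,1,2,3}.ContainsSubsequence({1,2,3}).

The original code works well for the more common definition of a subsequence (one that does not require contiguity), but to handle resetting (which most IEnumerator implementations do not support), the target sequence has to be enumerated up front. That leads to the following code: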

public static bool ContainsSubsequence<T>(this IEnumerable<T> parent, IEnumerable<T> target)
{
    bool foundOneMatch = false;
    var enumeratedTarget = target.ToList();
    int enumPos = 0;

    using (IEnumerator<T> parentEnum = parent.GetEnumerator())
    {
        while (parentEnum.MoveNext())
        {
            if (enumeratedTarget[enumPos].Equals(parentEnum.Current))
            {
                // Match, so move the target enum forward
                foundOneMatch = true;
                if (enumPos == enumeratedTarget.Count - 1)
                {
                    // We went through the entire target, so we have a match
                    return true;
                }

                enumPos++;
            }
            else if (foundOneMatch)
            {
                foundOneMatch = false;
                enumPos = 0;

                if (enumeratedTarget[enumPos].Equals(parentEnum.Current))
                {
                    foundOneMatch = true;
                    enumPos++;
                }
            }
        }

        return false;
    }
}

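This version handles the reset case described above. It restarts matching only at the current element, though, so a target with a repeated prefix can still be missed, and because the target is enumerated up front it does not work well for large (or infinite) sequences.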

Similar to @dlev's answer, but this one also handles {1,1,2}.ContainsSubsequence({1,1,2}):

public static bool ContainsSubsequence<T>(this IEnumerable<T> parent, IEnumerable<T> target)
{
    var pattern = target.ToArray();
    var source = new LinkedList<T>();
    foreach (var element in parent)
    {
        source.AddLast(element);
        if (source.Count == pattern.Length)
        {
            if (source.SequenceEqual(pattern))
                return true;
            source.RemoveFirst();
        }
    }
    return false;
}
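
The sliding-window idea in this answer matters most when the target starts with repeated elements. The following is a minimal sketch of the same idea, not the answerer's code: the name ContainsWindowOf, the class WindowExtensions, the use of Queue<T>, the optional comparer parameter, and the treatment of an empty target as a match are all assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// A target with a repeated prefix: the window must slide by one element at a time.
Console.WriteLine(new[] { 1, 1, 1, 2 }.ContainsWindowOf(new[] { 1, 1, 2 }));    // True
Console.WriteLine(new[] { 1, 2, 1, 2, 3 }.ContainsWindowOf(new[] { 1, 2, 3 })); // True
Console.WriteLine(new[] { 1, 2, 4 }.ContainsWindowOf(new[] { 1, 2, 3 }));       // False

static class WindowExtensions
{
    // Hypothetical helper: keeps the last pattern.Length elements of the parent
    // in a queue and compares that window against the pattern.
    public static bool ContainsWindowOf<T>(
        this IEnumerable<T> parent,
        IEnumerable<T> target,
        IEqualityComparer<T> comparer = null)
    {
        var pattern = target.ToArray();
        if (pattern.Length == 0) return true; // assumption: an empty target always matches

        var window = new Queue<T>(pattern.Length);
        foreach (var element in parent)
        {
            window.Enqueue(element);
            if (window.Count < pattern.Length) continue;      // window not full yet
            if (window.SequenceEqual(pattern, comparer)) return true;
            window.Dequeue();                                 // slide forward by one
        }
        return false;
    }
}
```

Each window comparison costs up to the target length, so the worst case is proportional to the parent length times the target length. The parent is only enumerated once, which keeps the approach usable on streamed input.
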
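This works for me, although it relies on IndexOf, which returns only the first occurrence, so it is reliable only when the elements of a1 are distinct: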

var a1 = new List<int> { 1, 2, 3, 4, 5 };
var a2 = new List<int> { 2, 3, 4 };

int index = -1;
bool res = a2.All(
    x => index != -1 ? (++index == a1.IndexOf(x)) : ((index = a1.IndexOf(x)) != -1)
);
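
Because IndexOf always reports the first occurrence of a value, the expression above can return false for lists with repeated values, such as the {1,3,2,3} example from the question. A possible alternative, sketched here under the assumption that both inputs are indexable lists, compares every window of the right length. The names WindowSearch and ContainsWindow are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var a1 = new List<int> { 1, 3, 2, 3, 4, 3 };
var a2 = new List<int> { 3, 2, 3 };

Console.WriteLine(WindowSearch.ContainsWindow(a1, a2));                     // True
Console.WriteLine(WindowSearch.ContainsWindow(a1, new List<int> { 2, 4 })); // False

// Hypothetical helper, not part of the original answer.
static class WindowSearch
{
    public static bool ContainsWindow<T>(IReadOnlyList<T> source, IReadOnlyList<T> pattern)
    {
        // Number of start positions at which the pattern could still fit.
        int starts = Math.Max(0, source.Count - pattern.Count + 1);

        // Compare the window beginning at each start position with the pattern.
        return Enumerable.Range(0, starts)
            .Any(i => source.Skip(i).Take(pattern.Count).SequenceEqual(pattern));
    }
}
```

This is a readability-first sketch: it does a full window comparison at every start position, so it is not tuned for long inputs.
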

This function uses a bit of LINQ to check whether the list parent contains the list target:

    public static bool ContainsSequence<T>(this List<T> parent, List<T> target)
    {
        for (int fromElement = parent.IndexOf(target.First());
            (fromElement != -1) && (fromElement <= parent.Count - target.Count);
            fromElement = parent.FindIndex(fromElement + 1, p => p.Equals(target.First())))
        {
            var comparedSequence = parent.Skip(fromElement).Take(target.Count);
            if (comparedSequence.SequenceEqual(target)) return true;
        }
        return false;
    }       

If you are dealing with simple serializable types, this is very easy to do once you convert the lists to strings:

public static bool ContainsList<T>(this List<T> containingList, List<T> containedList)
{
    string strContaining = "," + string.Join(",", containingList) + ",";
    string strContained = "," + string.Join(",", containedList) + ",";
    return strContaining.Contains(strContained);
}

Usage:

if (bigList.ContainsList(smallList))
{
    ...
}
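
The string approach depends on the separator never appearing inside an element's ToString() output. The sketch below repeats the ContainsList method so that it runs on its own (the container class ListStringSearch is a hypothetical name) and shows both why the surrounding commas are needed and where the approach can report a false positive.

```csharp
using System;
using System.Collections.Generic;

var big = new List<int> { 5, 1, 3, 2, 3 };
Console.WriteLine(big.ContainsList(new List<int> { 1, 3, 2, 3 }));               // True

// The leading and trailing commas keep "3,2" from matching inside "13,2".
Console.WriteLine(new List<int> { 13, 2 }.ContainsList(new List<int> { 3, 2 })); // False

// False positive: the single element "a,b" is reported as containing "a", "b".
Console.WriteLine(new List<string> { "a,b" }.ContainsList(new List<string> { "a", "b" })); // True

static class ListStringSearch // hypothetical container for the extension method
{
    public static bool ContainsList<T>(this List<T> containingList, List<T> containedList)
    {
        string strContaining = "," + string.Join(",", containingList) + ",";
        string strContained = "," + string.Join(",", containedList) + ",";
        return strContaining.Contains(strContained);
    }
}
```

For integers and similar types whose text form cannot contain a comma, the check is safe. For strings or custom types, an element-wise comparison is the more cautious choice.
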

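This is a well-studied problem and, according to my research, two algorithms are suited to the job. One of them is the Knuth-Morris-Pratt (KMP) algorithm, and I submit my implementation of it here.

It is written to handle a source (parent) sequence of indeterminate length; positions in the source are tracked as long values.

As we can see, the internal implementation returns a sequence of the indices at which the substring, or target pattern, occurs.

You can use it as simply as this: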

var contains = new[] { 1, 3, 2, 3, 4, 3 }.Contains(new[] { 1, 3, 2, 3 });

Below is the fully commented code for my answer:

namespace Code
{
    using System;
    using System.Collections.Generic;
    using System.Linq;

    /// <summary>
    /// A generic implementation of the Knuth-Morris-Pratt algorithm that searches,
    /// in a memory efficient way, over a given <see cref="IEnumerable{T}"/>.
    /// </summary>
    public static class KMP
    {
        /// <summary>
        /// Determines whether a sequence contains the search string.
        /// </summary>
        /// <typeparam name="T">
        /// The type of elements of <paramref name="source"/>
        /// </typeparam>
        /// <param name="source">
        /// A sequence of elements
        /// </param>
        /// <param name="pattern">The search string.</param>
        /// <param name="equalityComparer">
        /// Determines whether the sequence contains a specified element.
        /// If <c>null</c>
        /// <see cref="EqualityComparer{T}.Default"/> will be used.
        /// </param>
        /// <returns>
        /// <c>true</c> if the source contains the specified pattern;
        /// otherwise, <c>false</c>.
        /// </returns>
        /// <exception cref="ArgumentNullException">pattern</exception>
        public static bool Contains<T>(
                this IEnumerable<T> source,
                IEnumerable<T> pattern,
                IEqualityComparer<T> equalityComparer = null)
        {
            if (pattern == null)
            {
                throw new ArgumentNullException(nameof(pattern));
            }

            equalityComparer = equalityComparer ?? EqualityComparer<T>.Default;

            return SearchImplementation(source, pattern, equalityComparer).Any();
        }

        public static IEnumerable<long> IndicesOf<T>(
                this IEnumerable<T> source,
                IEnumerable<T> pattern,
                IEqualityComparer<T> equalityComparer = null)
        {
            if (pattern == null)
            {
                throw new ArgumentNullException(nameof(pattern));
            }

            equalityComparer = equalityComparer ?? EqualityComparer<T>.Default;

            return SearchImplementation(source, pattern, equalityComparer);
        }

        /// <summary>
        /// Identifies indices of a pattern string in a given sequence.
        /// </summary>
        /// <typeparam name="T">
        /// The type of elements of <paramref name="source"/>
        /// </typeparam>
        /// <param name="source">
        /// The sequence to search.
        /// </param>
        /// <param name="patternString">
        /// The string to find in the sequence.
        /// </param>
        /// <param name="equalityComparer">
        /// Determines whether the sequence contains a specified element.
        /// </param>
        /// <returns>
        /// A sequence of indices where the pattern can be found
        /// in the source.
        /// </returns>
        /// <exception cref="ArgumentOutOfRangeException">
        /// patternSequence - The pattern must contain 1 or more elements.
        /// </exception>
        private static IEnumerable<long> SearchImplementation<T>(
            IEnumerable<T> source,
            IEnumerable<T> patternString,
            IEqualityComparer<T> equalityComparer)
        {
            // Pre-process the pattern
            (var slide, var pattern) = GetSlide(patternString, equalityComparer);
            var patternLength = pattern.Count;

            if (patternLength == 0)
            {
                throw new ArgumentOutOfRangeException(
                    nameof(patternString),
                    "The pattern must contain 1 or more elements.");
            }

            var buffer = new Dictionary<long, T>(patternLength);
            var more = true;

            long sourceIndex = 0; // index for source
            int patternIndex = 0; // index for pattern

            using(var sourceEnumerator = source.GetEnumerator())
            while (more)
            {
                more = FillBuffer(
                        buffer,
                        sourceEnumerator,
                        sourceIndex,
                        patternLength,
                        out T t);

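                // Only compare while an element was actually read; once the source is
                // exhausted, t is default(T) and could falsely match (e.g. a 0 in the pattern).
                if (more && equalityComparer.Equals(pattern[patternIndex], t))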
                {
                    patternIndex++;
                    sourceIndex++;

                    more = FillBuffer(
                        buffer,
                        sourceEnumerator,
                        sourceIndex,
                        patternLength,
                        out t);
                }

                if (patternIndex == patternLength)
                {
                    yield return sourceIndex - patternIndex;
                    patternIndex = slide[patternIndex - 1];
                }
                else if (more && !equalityComparer.Equals(pattern[patternIndex], t))
                {
                    if (patternIndex != 0)
                    {
                        patternIndex = slide[patternIndex - 1];
                    }
                    else
                    {
                        sourceIndex = sourceIndex + 1;
                    }
                }
            }
        }

        /// <summary>
        /// Services the buffer and retrieves the value.
        /// </summary>
        /// <remarks>
        /// The buffer is used so that it is not necessary to hold the
        /// entire source in memory.
        /// </remarks>
        /// <typeparam name="T">
        /// The type of elements of <paramref name="source"/>.
        /// </typeparam>
        /// <param name="buffer">The buffer.</param>
        /// <param name="source">The source enumerator.</param>
        /// <param name="sourceIndex">The element index to retrieve.</param>
        /// <param name="patternLength">Length of the search string.</param>
        /// <param name="value">The element value retrieved from the source.</param>
        /// <returns>
        /// <c>true</c> if there is potentially more data to process;
        /// otherwise <c>false</c>.
        /// </returns>
        private static bool FillBuffer<T>(
            IDictionary<long, T> buffer,
            IEnumerator<T> source,
            long sourceIndex,
            int patternLength,
            out T value)
        {
            bool more = true;
            if (!buffer.TryGetValue(sourceIndex, out value))
            {
                more = source.MoveNext();
                if (more)
                {
                    value = source.Current;
                    buffer.Remove(sourceIndex - patternLength);
                    buffer.Add(sourceIndex, value);
                }
            }

            return more;
        }

        /// <summary>
        /// Gets the offset array which acts as a slide rule for the KMP algorithm.
        /// </summary>
        /// <typeparam name="T">
        /// The type of elements of <paramref name="source"/>.
        /// </typeparam>
        /// <param name="pattern">The search string.</param>
        /// <param name="equalityComparer">
        /// Determines whether the sequence contains a specified element.
        /// If <c>null</c>
        /// <see cref="EqualityComparer{T}.Default"/> will be used.
        /// </param>
        /// <returns>A tuple of the offsets and the enumerated pattern.</returns>
        private static (IReadOnlyList<int> Slide, IReadOnlyList<T> Pattern) GetSlide<T>(
                IEnumerable<T> pattern,
                IEqualityComparer<T> equalityComparer)
        {
            var patternList = pattern.ToList();
            var slide = new int[patternList.Count];

            int length = 0;
            int patternIndex = 1;

            while (patternIndex < patternList.Count)
            {
                if (equalityComparer.Equals(
                        patternList[patternIndex],
                        patternList[length]))
                {
                    length++;
                    slide[patternIndex] = length;
                    patternIndex++;
                }
                else
                {
                    if (length != 0)
                    {
                        length = slide[length - 1];
                    }
                    else
                    {
                        slide[patternIndex] = length;
                        patternIndex++;
                    }
                }
            }

            return (slide, patternList);
        }
    }
}
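
The GetSlide step above can be looked at in isolation. The following is a small sketch of the same prefix-table computation, assuming the default equality comparer; the names PrefixTable and Build are hypothetical and not part of the answer's code. Each entry gives the length of the longest proper prefix of the pattern that is also a suffix of the pattern up to that position, which is how far the search can fall back after a mismatch without re-reading the source.

```csharp
using System;
using System.Collections.Generic;

Console.WriteLine(string.Join(",", PrefixTable.Build(new[] { 1, 2, 1, 2, 3 }))); // 0,0,1,2,0
Console.WriteLine(string.Join(",", PrefixTable.Build(new[] { 1, 1, 2 })));       // 0,1,0

// Hypothetical helper mirroring the logic of GetSlide.
static class PrefixTable
{
    public static int[] Build<T>(IReadOnlyList<T> pattern)
    {
        var comparer = EqualityComparer<T>.Default;
        var table = new int[pattern.Count];
        int length = 0; // length of the currently matched prefix
        int i = 1;      // table[0] is always 0

        while (i < pattern.Count)
        {
            if (comparer.Equals(pattern[i], pattern[length]))
            {
                // The matched prefix grows by one element.
                length++;
                table[i] = length;
                i++;
            }
            else if (length != 0)
            {
                // Fall back to the next shorter candidate prefix and retry.
                length = table[length - 1];
            }
            else
            {
                // No prefix matches at this position.
                table[i] = 0;
                i++;
            }
        }

        return table;
    }
}
```

For the question's pattern {1,3,2,3} every entry is 0, because no proper prefix of that pattern reappears as a suffix.
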