C#: Extracting common prefixes from a list of strings


I have a list of strings, for example:

{ abc001, abc002, abc003, cdef001, cdef002, cdef004, ghi002, ghi001 }

I want to get all the unique common prefixes; for example, for the list above:

{ abc, cdef, ghi }

How can I do this?

You can use a regular expression to select the text part, then add that part to a HashSet so that duplicates are not added:

using System.Collections.Generic;
using System.Text.RegularExpressions;

// Simulate your real list
List<string> myList = new List<string> { "abc001", "abc002", "cdef001" };

// \D* captures any run of non-digit characters; \d+ requires at least one digit after it.
// To also accept strings such as "abc" with no trailing digits,
// use @"^(\D*)\d*$" instead.
string pattern = @"^(\D*)\d+$";

Regex regex = new Regex(pattern);

// A HashSet silently ignores values it already contains
HashSet<string> matchesStrings = new HashSet<string>();

foreach (string item in myList)
{
    Match match = regex.Match(item);

    // Only use the captured group when the whole pattern matched
    if (match.Success)
    {
        matchesStrings.Add(match.Groups[1].Value);
    }
}

var list = new List<string> {
    "abc001", "abc002", "abc003", "cdef001",
    "cdef002", "cdef004", "ghi002", "ghi001"
};
var prefixes = list.Select(x => Regex.Match(x, @"^[^\d]+").Value).Distinct();

Writing a helper class to represent the data is probably a good idea. For example:

public class PrefixedNumber
{
    private static Regex parser = new Regex(@"^(\p{L}+)(\d+)$");

    public PrefixedNumber(string source) // you may want a static Parse method.
    {
        Match parsed = parser.Match(source); // think about an error here when it doesn't match
        Prefix = parsed.Groups[1].Value;
        Index = parsed.Groups[2].Value;
    }

    public string Prefix { get; set; }
    public string Index { get; set; }
}
Of course, you will want to come up with a better name and better access modifiers.

Now the task is simple:

List<string> data = new List<string> { "abc001", "abc002", "abc003", "cdef001",
                                       "cdef002", "cdef004", "ghi002", "ghi001" };
var groups = data.Select(str => new PrefixedNumber(str))
                 .GroupBy(prefixed => prefixed.Prefix);

The result is all of the data, parsed and grouped by prefix.
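
To make the grouped result easier to picture, here is a minimal sketch that collects the numeric suffixes under each prefix. The names `PrefixGrouping` and `GroupByPrefix` are hypothetical, it works on regex matches directly rather than on `PrefixedNumber` objects, and skipping inputs that do not match is an assumption, not something the answer specifies.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class PrefixGrouping
{
    // Same pattern shape as in PrefixedNumber: one or more letters, then one or more digits.
    private static readonly Regex Parser = new Regex(@"^(\p{L}+)(\d+)$");

    // Hypothetical helper: maps each prefix to the numeric suffixes seen with it.
    // Assumption: strings that do not match the pattern are skipped.
    public static Dictionary<string, List<string>> GroupByPrefix(IEnumerable<string> source)
    {
        return source
            .Select(s => Parser.Match(s))
            .Where(m => m.Success)
            .GroupBy(m => m.Groups[1].Value, m => m.Groups[2].Value)
            .ToDictionary(g => g.Key, g => g.ToList());
    }
}
```

The keys of the returned dictionary are the unique prefixes, so `GroupByPrefix(data).Keys` gives the list the question asks for.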

Assuming your prefixes consist only of letters and end at the first non-letter character, you can use the following LINQ expression:

List<string> listOfStrings = new List<String>() 
  { "abc001d", "abc002", "abc003", "cdef001", "cdef002", "cdef004", "ghi002", "ghi001" }; 

var prefixes = (from s in listOfStrings
                select new string(s.TakeWhile(c => char.IsLetter(c)).ToArray())).Distinct();

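I know. It's probably overkill.
@Saeed: that is an imaginary problem the OP never raised. By that logic, you could solve the problem by returning […] or all of the first letters.
What about this set: {abc, abd, abd, ad}? What are the common prefixes?
I don't know what your code does, but I know you can select at most n prefixes, while there can be more than n prefixes; see my comment on the question.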
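
The disagreement in these comments comes down to what counts as a prefix. As a rough illustration, the sketch below applies the definition the answers rely on (the leading run of letters) to the set from the comment. `LetterPrefixes` and `Extract` are hypothetical names introduced here only for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LetterPrefixes
{
    // Hypothetical helper: treats the leading run of letters as the prefix,
    // which is the definition the answers assume.
    public static List<string> Extract(IEnumerable<string> source)
    {
        return source
            .Select(s => new string(s.TakeWhile(c => char.IsLetter(c)).ToArray()))
            .Distinct()
            .ToList();
    }
}
```

For { "abc", "abd", "abd", "ad" } this yields three values (abc, abd and ad), not a shared "a" or "ab". Under this definition the digits mark where a prefix ends, so strings without digits are returned whole.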