C#: parse and compare huge lists/strings


I have two huge lists (more than 2,000 entries each).

I want to parse and compare them.

The first list looks like this:

zone "exampledomain.com" {
zone "exampledomain2.com" {
zone "exampledomain3.com" {
zone "exampledomain4.com" {
zone "exampledomain5.com" {
zone "exampledomain6.com" {
zone "exampledomain7.com" {
zone "exampledomain.com" {
zone "exampledomain3.com" {
zone "exampledomain5.com" {
zone "exampledomain7.com" {

The other list looks like this:

zone "exampledomain.com" {
zone "exampledomain2.com" {
zone "exampledomain3.com" {
zone "exampledomain4.com" {
zone "exampledomain5.com" {
zone "exampledomain6.com" {
zone "exampledomain7.com" {
zone "exampledomain.com" {
zone "exampledomain3.com" {
zone "exampledomain5.com" {
zone "exampledomain7.com" {
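
Both lists use the same format: zone "<domain>" {

I want to parse them so that I can compare the domains and get the differences between them. That way I know what each list is missing, since both should contain the same entries.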

I came across the following code:

static void Main(string[] args)
{
    string s1 = "i have a car a car";
    string s2 = "i have a new car bmw";

    List<string> diff;
    IEnumerable<string> set1 = s1.Split(' ').Distinct();
    IEnumerable<string> set2 = s2.Split(' ').Distinct();

    if (set2.Count() > set1.Count())
    {
        diff = set2.Except(set1).ToList();
    }
    else
    {
        diff = set1.Except(set2).ToList();
    }
}

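But given that each list has more than 2,000 lines, I would like to know what the best approach is.

A HashSet is meant for collections of unique elements: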

HashSet<string> uniqueStrings = new HashSet<string>();
foreach (string s1 in list1)
{
    uniqueStrings.Add(s1);
}
foreach (string s2 in list2)
{
    uniqueStrings.Add(s2);
}
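
Since every line in the question has the form zone "<domain>" {, the domain names could be extracted before they go into the set. The following is only a sketch: the ZoneParser class and ParseZones method are hypothetical names, and comparing domains case-insensitively is an assumption, not something the question states.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class ZoneParser
{
    // Matches lines such as: zone "exampledomain.com" {
    // and captures the text between the quotes.
    private static readonly Regex ZoneLine =
        new Regex("^\\s*zone\\s+\"([^\"]+)\"", RegexOptions.Compiled);

    // Hypothetical helper: collects the quoted domain of every zone line.
    public static HashSet<string> ParseZones(IEnumerable<string> lines)
    {
        // Assumption: domain names are compared case-insensitively.
        var domains = new HashSet<string>(StringComparer.OrdinalIgnoreCase);
        foreach (string line in lines)
        {
            Match match = ZoneLine.Match(line);
            if (match.Success)
            {
                // Duplicates are ignored because Add is a no-op for existing items.
                domains.Add(match.Groups[1].Value);
            }
        }
        return domains;
    }
}
```

The lines could come from File.ReadLines, for example. With only a few thousand entries per list, building the two sets and comparing them should be cheap.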

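The example you gave only shows list 1 with the items from list 2 removed. If you also need the items in list 2 that are not in list 1, you have to run two queries: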

var difference1 = list1.Except(list2);
var difference2 = list2.Except(list1);
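
I am not sure what code is involved when Except executes, but if you would like to see an implementation that produces two lists containing the differences, here is one solution: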

static void Difference(
  IEnumerable<string> source1, IEnumerable<string> source2, 
  out List<string> difference1, out List<string> difference2)
{
    //Move the data from the sources into ordered queues
    var sourceValues1 = new Queue<string>(source1.OrderBy(x => x));
    var sourceValues2 = new Queue<string>(source2.OrderBy(x => x));

    difference1 = new List<string>();
    difference2 = new List<string>();

    while(sourceValues1.Count > 0 && sourceValues2.Count > 0)
    {
        string value1 = sourceValues1.Peek();
        string value2 = sourceValues2.Peek();
        //string.Compare only guarantees the sign of its result, so normalize it to -1, 0 or 1
        switch (Math.Sign(string.Compare(value1, value2)))
        {
            //If they match then don't add difference to either list
            case 0:
                sourceValues1.Dequeue();
                sourceValues2.Dequeue();
                break;

            //The left queue has the lowest value, record that and move on
            case -1:
                difference1.Add(value1);
                sourceValues1.Dequeue();
                break;

            //The right queue has the lowest value, record that and move on
            case 1:
                difference2.Add(value2);
                sourceValues2.Dequeue();
                break;

        }
    }
    //At least one of the queues is empty, so everything left in the other queue is a difference
    difference1.AddRange(sourceValues1);
    difference2.AddRange(sourceValues2);
}

The two results can then be combined into a single sequence with Union:

var allDifferences = differenceX1.Union(differenceX2);
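
If only the combined result is needed (everything that appears in exactly one of the two lists), HashSet<string>.SymmetricExceptWith can produce it directly. This is a minimal sketch of an alternative, not the code from the answer above. The ZoneDiff class and InExactlyOne method are hypothetical names, and the result does not say which list an item came from.

```csharp
using System;
using System.Collections.Generic;

static class ZoneDiff
{
    // Hypothetical helper: returns every item that appears in exactly one of the two inputs.
    public static HashSet<string> InExactlyOne(
        IEnumerable<string> first, IEnumerable<string> second)
    {
        // Start with the distinct items of the first input.
        var result = new HashSet<string>(first);

        // Keep items that are in either collection, but not in both.
        result.SymmetricExceptWith(second);
        return result;
    }
}
```

When it matters which list is missing an entry, the two separate Except queries shown earlier remain the better fit.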