C# decorate-sort-undecorate: how to sort an alphabetic field in descending order


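I have a large data set for which computing sort keys is fairly expensive. What I'd like to do is use the DSU (decorate-sort-undecorate) pattern, where I take the rows and compute a sort key for each. For example: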

        Qty   Name      Supplier   
Row 1:   50   Widgets   IBM
Row 2:   48   Thingies  Dell
Row 3:   99   Googaws   IBM

To sort by quantity and supplier, I can use the sort keys:

    0050 IBM
    0048 Dell
    0099 IBM

Numbers are right-aligned, text is left-aligned, and everything is padded as needed.

If I need to sort by quantity in descending order, I can subtract the value from a constant (say 10000) to build the sort key:

    9950 IBM
    9952 Dell
    9901 IBM
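
The key layout described above can be sketched as follows. This is only an illustration under stated assumptions: the `SortKeys` class and `Build` method are hypothetical names, the four-digit quantity width and the constant 10000 come from the example rows, and the supplier width of 10 is an arbitrary choice.

```csharp
using System;

public static class SortKeys
{
    // Assumption: quantities are in 1..9999, so the encoded value always fits in four digits.
    private const int QtyWidth = 4;
    private const int Ceiling = 10000;
    // Assumption: supplier names are at most 10 characters long.
    private const int SupplierWidth = 10;

    // Builds "QQQQ Supplier  ". For a descending quantity the value is subtracted
    // from the ceiling, so larger quantities produce smaller keys.
    public static string Build(int qty, string supplier, bool qtyDescending)
    {
        int encoded = qtyDescending ? Ceiling - qty : qty;
        return encoded.ToString().PadLeft(QtyWidth, '0') + " " + supplier.PadRight(SupplierWidth, ' ');
    }
}
```

For example, `SortKeys.Build(50, "IBM", true)` starts with `9950 IBM`, matching the first descending key above.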

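How can I quickly and cheaply build a descending key for alphabetic fields in C#?

[My data is all 8-bit ASCII with ISO 8859 extension characters.]

Note: in Perl, this can be done by complementing each byte of the string.

Porting this solution directly to C# does not work: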

 subkey = encoding.GetString(encoding.GetBytes(stringval).
                      Select(x => (byte)(x ^ 0xff)).ToArray());
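
I suspect this is because C# and Perl handle strings differently. Perhaps Perl sorts in ASCII order, while C# is trying to be smart.

Here is a sample piece of code that tries to do this: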

        System.Text.ASCIIEncoding encoding = new System.Text.ASCIIEncoding();
        List<List<string>> sample = new List<List<string>>() {
            new List<string>() { "", "apple", "table" },
            new List<string>() { "", "apple", "chair" },
            new List<string>() { "", "apple", "davenport" },
            new List<string>() { "", "orange", "sofa" },
            new List<string>() { "", "peach", "bed" },
        };
        foreach(List<string> line in sample)
        {
            StringBuilder sb = new StringBuilder();

            string key1 = line[1].PadRight(10, ' ');
            string key2 = line[2].PadRight(10, ' ');

            // Comment the next line to sort desc, desc
            key2 = encoding.GetString(encoding.GetBytes(key2).
                  Select(x => (byte)(x ^ 0xff)).ToArray());

            sb.Append(key2);
            sb.Append(key1);
            line[0] = sb.ToString();
        }

        List<List<string>> output = sample.OrderBy(p => p[0]).ToList();

        return;
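
Running the ported expression on a single value may help show what goes wrong. The sketch below (the `XorPortDemo` class and its method name are hypothetical) isolates the complement-and-decode step. Every complemented ASCII byte is 0x80 or above, and the ASCII decoder replaces such bytes with `?`, so all keys of the same length collapse to the same string.

```csharp
using System;
using System.Linq;
using System.Text;

public static class XorPortDemo
{
    // Reproduces the direct port: complement each ASCII byte, then decode as ASCII.
    public static string ComplementViaAscii(string value)
    {
        Encoding ascii = Encoding.ASCII;
        byte[] flipped = ascii.GetBytes(value).Select(b => (byte)(b ^ 0xff)).ToArray();
        // Each flipped byte is >= 0x80, which is outside ASCII, so the decoder
        // substitutes '?' for it and the ordering information is lost.
        return ascii.GetString(flipped);
    }
}
```

With this, `ComplementViaAscii("chair")` and `ComplementViaAscii("table")` both come back as `?????`, so the resulting keys can no longer distinguish the rows.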

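Just write an IComparer that works as a chain of comparers. If a stage compares equal, pass the evaluation on to the next part of the key. If the result is less than or greater than, return it immediately.

You need something like this: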

int comparision = 0;

for (int i = 0; i < n; i++)
{
 // comparisionSign[i] is +1 for an ascending field, -1 for a descending one
 comparision = a[i].CompareTo(b[i]) * comparisionSign[i];

 if (comparision != 0)
  return comparision;
}
return comparision;
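
Wrapped in a reusable ComparerChain class (an implementation appears further down), usage looks like this:

var comparerChain = new ComparerChain<Row>()
.By(r => r.Qty, false)
.By(r => r.Name, false)
.By(r => r.Supplier, false);

var sortedByCustom = rows.OrderBy(i => i, comparerChain).ToList();

The first call returns an IOrderedEnumerable, which can then be sorted by other fields.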
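
Plain LINQ can express the same idea without a custom comparer, since `OrderBy` and `OrderByDescending` return an `IOrderedEnumerable<T>` that accepts further `ThenBy` or `ThenByDescending` calls. A minimal sketch, where the `StockRow` and `LinqOrdering` names are hypothetical and the choice of quantity descending followed by supplier ascending is just an example:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical row type mirroring the columns in the question.
public class StockRow
{
    public int Qty;
    public string Name;
    public string Supplier;
}

public static class LinqOrdering
{
    // Quantity descending, then supplier ascending, comparing strings ordinally.
    public static List<StockRow> SortByQtyDescThenSupplier(IEnumerable<StockRow> rows)
    {
        return rows
            .OrderByDescending(r => r.Qty)                    // primary key, descending
            .ThenBy(r => r.Supplier, StringComparer.Ordinal)  // tie-breaker, ascending
            .ToList();
    }
}
```

No key string is built here; each field is compared directly, and later fields are consulted only when earlier ones compare equal.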

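Answering my own question (though not satisfactorily). To construct a descending alphabetic key, I used the following code and then appended this subkey to the object's sort key: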

   if ( reverse )
        subkey = encoding.GetString(encoding.GetBytes(subkey)
                 .Select(x => (byte)(0x80 - x)).ToArray());
   rowobj.sortKey.Append(subkey);

Once I had the keys, I couldn't simply do:

   rowobjList.Sort();
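
because the default comparer does not sort in ASCII order (which my 0x80-x trick relies on). So I had to write an IComparable implementation that uses ordinal comparison: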

    public int CompareTo(RowObject other)
    {
        return String.Compare(this.sortKey, other.sortKey, 
                                 StringComparison.Ordinal);
    }

This seems to work. I'm a little dissatisfied because encoding and decoding the strings feels clunky in C#.
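
The difference between the two orderings can be seen on a pair as small as "a" and "B". The sketch below uses hypothetical names (`OrdinalVsCulture`, `Ordinal`, `Linguistic`) and assumes a runtime with full globalization support for the linguistic case.

```csharp
using System;

public static class OrdinalVsCulture
{
    // Ordinal comparison looks at UTF-16 code units: 'B' (66) sorts before 'a' (97).
    public static int Ordinal(string x, string y)
    {
        return Math.Sign(string.Compare(x, y, StringComparison.Ordinal));
    }

    // Linguistic comparison (invariant culture here) orders alphabetically first,
    // so "a" normally sorts before "B".
    public static int Linguistic(string x, string y)
    {
        return Math.Sign(string.Compare(x, y, StringComparison.InvariantCulture));
    }
}
```

`Ordinal("a", "B")` is positive, whereas `Linguistic("a", "B")` is normally negative. Any key trick based on arithmetic over character codes therefore needs the ordinal form.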

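You can get where you want to go, although I admit I don't know whether there is a better overall approach.

The problem with translating the Perl approach directly is that .NET simply doesn't let you be so laissez-faire about encodings. However, if, as you say, your data is all printable ASCII (i.e. made up of characters with Unicode code points in the range 32..127; note that there is no such thing as "8-bit ASCII"), then you can do the following: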

            key2 = encoding.GetString(encoding.GetBytes(key2).
                Select(x => (byte)(32+95-(x-32))).ToArray());
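
In this expression, I've been explicit about what I'm doing:

  • Take x (which I assume is in 32..127)
  • Map the range to 0..95 to make it zero-based
  • Reverse by subtracting from 95
  • Add 32 to map back to the printable range

It's not pretty, but it does work.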
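
The four steps above can also be applied to the characters directly, which avoids the round trip through an encoding. This is a sketch rather than the answer's exact code: the `DescendingKey` class and `Reverse` method are hypothetical names, and it assumes the keys are compared ordinally afterwards.

```csharp
using System;
using System.Linq;

public static class DescendingKey
{
    // Assumption: every character is in 32..127; anything else is rejected.
    public static string Reverse(string paddedKey)
    {
        char[] mapped = paddedKey.Select(c =>
        {
            if (c < 32 || c > 127)
                throw new ArgumentException("Character outside 32..127: " + (int)c);
            // Shift to 0..95, mirror within that range, shift back to 32..127.
            return (char)(32 + 95 - (c - 32));
        }).ToArray();
        return new string(mapped);
    }
}
```

Keys built this way sort in reverse alphabetical order only under an ordinal comparison, for example `OrderBy(k => k, StringComparer.Ordinal)`. Applying `Reverse` twice returns the original string, since the mapping is its own inverse.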

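If the key computation is expensive, why compute keys at all? String comparison is not free either: it is itself an expensive loop over characters, and it will not perform better than a custom comparison loop.

In this test, sorting with a custom comparison performs roughly 3 times better than DSU.

Note that the DSU key computation is not measured in this test; the keys are precomputed.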

using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Text;
using Microsoft.VisualStudio.TestTools.UnitTesting;

namespace DSUPatternTest
{
 [TestClass]
 public class DSUPatternPerformanceTest
 {
  public class Row
  {
   public int Qty;
   public string Name;
   public string Supplier;
   public string PrecomputedKey;

   public void ComputeKey()
   {
    // Do not need StringBuilder here, String.Concat does better job internally.
    PrecomputedKey =
     Qty.ToString().PadLeft(4, '0') + " "
     + Name.PadRight(12, ' ') + " "
     + Supplier.PadRight(12, ' ');
   }

   public bool Equals(Row other)
   {
    if (ReferenceEquals(null, other)) return false;
    if (ReferenceEquals(this, other)) return true;
    return other.Qty == Qty && Equals(other.Name, Name) && Equals(other.Supplier, Supplier);
   }

   public override bool Equals(object obj)
   {
    if (ReferenceEquals(null, obj)) return false;
    if (ReferenceEquals(this, obj)) return true;
    if (obj.GetType() != typeof (Row)) return false;
    return Equals((Row) obj);
   }

   public override int GetHashCode()
   {
    unchecked
    {
     int result = Qty;
     result = (result*397) ^ (Name != null ? Name.GetHashCode() : 0);
     result = (result*397) ^ (Supplier != null ? Supplier.GetHashCode() : 0);
     return result;
    }
   }
  }

  public class RowComparer : IComparer<Row>
  {
   public int Compare(Row x, Row y)
   {
    int comparision;

    comparision = x.Qty.CompareTo(y.Qty);
    if (comparision != 0) return comparision;

    comparision = x.Name.CompareTo(y.Name);
    if (comparision != 0) return comparision;

    comparision = x.Supplier.CompareTo(y.Supplier);

    return comparision;
   }
  }

  [TestMethod]
  public void CustomLoopIsFaster()
  {
   var random = new Random();
   var rows = Enumerable.Range(0, 5000).Select(i =>
    new Row
    {
     Qty = (int) (random.NextDouble()*9999),
     Name = random.Next().ToString(),
     Supplier = random.Next().ToString()
    }).ToList();

   foreach (var row in rows)
   {
    row.ComputeKey();
   }

   var dsuSw = Stopwatch.StartNew();
   var sortedByDSU = rows.OrderBy(i => i.PrecomputedKey).ToList();
   var dsuTime = dsuSw.ElapsedMilliseconds;

   var customSw = Stopwatch.StartNew();
   var sortedByCustom = rows.OrderBy(i => i, new RowComparer()).ToList();
   var customTime = customSw.ElapsedMilliseconds;

   Trace.WriteLine(dsuTime);
   Trace.WriteLine(customTime);

   CollectionAssert.AreEqual(sortedByDSU, sortedByCustom);

   Assert.IsTrue(dsuTime > customTime * 2.5);
  }
 }
}

A possible implementation of the ComparerChain used above:

    public class ComparerChain<T> : IComparer<T>
    {
        private List<PropComparer<T>> Comparers = new List<PropComparer<T>>();

        public int Compare(T x, T y)
        {
            foreach (var comparer in Comparers)
            {
                var result = comparer._f(x, y);
                if (result != 0)
                    return result;
            }
            return 0;
        }
        public ComparerChain<T> By<Tp>(Func<T,Tp> property, bool descending) where Tp:IComparable<Tp>
        {
            Comparers.Add(PropComparer<T>.By(property, descending));
            return this;
        }
    }

    public class PropComparer<T>
    {
        public Func<T, T, int> _f;

        public static PropComparer<T> By<Tp>(Func<T,Tp> property, bool descending) where Tp:IComparable<Tp>
        {
            Func<T, T, int> ascendingCompare = (a, b) => property(a).CompareTo(property(b));
            Func<T, T, int> descendingCompare = (a, b) => property(b).CompareTo(property(a));
            return new PropComparer<T>(descending ?  descendingCompare : ascendingCompare);
        }

        public PropComparer(Func<T, T, int> f)
        {
            _f = f;
        }
    }