C# SortedList<string, TValue>.ContainsKey returns false for a successfully added key
See Update 3 below. I found that the problem I ran into is related to a known, serious issue with the C# string comparers in .NET 4.0, 4.0 Client Profile, and 4.5. It causes lists of strings to be sorted inconsistently: the output depends on the order of the input and on the sorting algorithm used. The issue was reported to Microsoft in December 2012 and closed as "won't be fixed". A workaround is available, but it is too slow to be practical for large collections.
While implementing an immutable PatriciaTrie, I wanted to compare its performance with System.Collections.Generic.SortedList. I used the following file to create a list of input words for testing.
When each word is inserted into the C# SortedList using either Comparer<string>.Default or StringComparer.InvariantCulture as the key comparer, many of the successfully inserted entries cannot be retrieved with the normal search methods (for example, ContainsKey returns false), even though iterating over the list shows that the key is present.
Even stranger, when a key retrieved from the sorted list is compared with the search key that ContainsKey could not find, the comparer returns 0.
The complete example below demonstrates the problem on my system.
using System;
using System.IO;
using System.Linq;
using System.Collections.Generic;

class Program
{
    static void Main(string[] args)
    {
        // the problem is possibly related to comparison.
        var fail = true;
        var comparer = fail ? StringComparer.InvariantCulture : StringComparer.Ordinal;

        // read hamlet (contains duplicate words)
        var words = File
            .ReadAllLines("hamlet.txt")
            .SelectMany(l => l.Split(new[] { ' ', '\t' }, StringSplitOptions.RemoveEmptyEntries))
            .Select(w => w.Trim())
            .Where(w => !string.IsNullOrEmpty(w))
            .Distinct(comparer)
            .ToArray();

        // insert hamlet's words in the sorted list.
        var list = new SortedList<string, int>(comparer);
        var ndx = 0;
        foreach (var word in words)
            list[word] = ndx++;

        // search for each of the added words.
        foreach (var keyToSearch in words)
        {
            if (!list.ContainsKey(keyToSearch))
            {
                // was inserted, but cannot be retrieved.
                Console.WriteLine("Error - Key not found: \"{0}\"", keyToSearch);

                // however when we iterate over the list, we see that the entry is present
                var prefix = keyToSearch.Substring(0, Math.Min(keyToSearch.Length, 3));
                foreach (var wordCloseToSearchKey in list.Keys.Where(s => s.StartsWith(prefix)))
                {
                    // and using the SortedList's supplied comparison returns 0, signaling equality
                    var comparisonResult = list.Comparer.Compare(wordCloseToSearchKey, keyToSearch);
                    Console.WriteLine("{0} - comparison result = {1}", wordCloseToSearchKey, comparisonResult);
                }
            }
        }

        // Check that sort order of List.Keys is correct
        var keys = list.Keys.ToArray();
        BinarySearchAll("list.Keys", keys, list.Comparer);
        CheckCorrectSortOrder("list.Keys", keys, list.Comparer);

        // Check that sort order of Array.Sort(List.Keys) is correct
        var arraySortedKeys = CopySortSearchAndCheck("Array.Sort(List.Keys)", keys, list.Comparer);

        // Check that sort order of the Array.Sort(input words) is correct
        var sortedInput = CopySortSearchAndCheck("Array.Sort(input words)", words, list.Comparer);

        Console.ReadLine();
    }

    static string[] CopySortSearchAndCheck(string arrayDesc, string[] input, IComparer<string> comparer)
    {
        // copy input
        var sortedInput = new string[input.Length];
        Array.Copy(input, sortedInput, sortedInput.Length);

        // sort it
        Array.Sort(sortedInput, comparer);

        // check that we can actually find the keys in the array using bin. search
        BinarySearchAll(arrayDesc, sortedInput, comparer);

        // check that sort order is correct
        CheckCorrectSortOrder(arrayDesc, sortedInput, comparer);

        return sortedInput;
    }

    static void BinarySearchAll(string arrayDesc, string[] sortedInput, IComparer<string> comparer)
    {
        // check that each key in the input can be found using bin. search
        foreach (var word in sortedInput)
        {
            var ix = Array.BinarySearch(sortedInput, word, comparer);
            if (ix < 0)
                // and it appears it cannot!
                Console.WriteLine("Error - {0} - Key not found: \"{1}\"", arrayDesc, word);
        }
    }

    static void CheckCorrectSortOrder(string arrayDesc, string[] sortedKeys, IComparer<string> comparer)
    {
        for (int n = 0; n < sortedKeys.Length; n++)
        {
            // every later element must compare greater than sortedKeys[n]
            for (int up = n + 1; up < sortedKeys.Length; up++)
            {
                var cmp = comparer.Compare(sortedKeys[n], sortedKeys[up]);
                if (cmp >= 0)
                {
                    Console.WriteLine(
                        "{0}[{1}] = \"{2}\" not < than {0}[{3}] = \"{4}\" - cmp = {5}",
                        arrayDesc, n, sortedKeys[n], up, sortedKeys[up], cmp);
                }
            }

            // every earlier element, including index 0, must compare less than sortedKeys[n]
            for (int down = n - 1; down >= 0; down--)
            {
                var cmp = comparer.Compare(sortedKeys[n], sortedKeys[down]);
                if (cmp <= 0)
                {
                    Console.WriteLine(
                        "{0}[{1}] = \"{2}\" not > than {0}[{3}] = \"{4}\" - cmp = {5}",
                        arrayDesc, n, sortedKeys[n], down, sortedKeys[down], cmp);
                }
            }
        }
    }
}
This is its output:
a1 < a2 (A < a')
a2 < a3 (a' < 'a)
a1 > a3 (A > 'a)
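The three relations above form a cycle: a1 < a2 and a2 < a3, yet a1 > a3. The comparison is therefore not transitive, and transitivity is the property that sorting and binary search rely on. The following is a minimal sketch of how such a triple could be checked. The helper name `IsConsistent` is made up for illustration, and the three strings are assumed to be the ones shown in the annotations of the output above ("A", "a'" and "'a"). The result for the culture-sensitive comparer depends on the runtime version, so no output is claimed here.

```csharp
using System;
using System.Collections.Generic;

static class TransitivityCheck
{
    // Hypothetical helper: returns true when the three pairwise comparisons
    // are mutually consistent, i.e. they do not contradict transitivity.
    public static bool IsConsistent(IComparer<string> comparer, string a, string b, string c)
    {
        int ab = Math.Sign(comparer.Compare(a, b));
        int bc = Math.Sign(comparer.Compare(b, c));
        int ac = Math.Sign(comparer.Compare(a, c));

        if (ab == 0) return ac == bc;   // a equals b, so a must relate to c as b does
        if (bc == 0) return ac == ab;   // b equals c, so a must relate to c as it does to b
        if (ab == bc) return ac == ab;  // a < b < c (or a > b > c) forces a < c (or a > c)
        return true;                    // b is an extreme; a versus c is unconstrained
    }

    static void Main()
    {
        // Assumption: the strings behind a1, a2, a3 in the output above.
        string a1 = "A", a2 = "a'", a3 = "'a";

        Console.WriteLine("InvariantCulture consistent: {0}",
            IsConsistent(StringComparer.InvariantCulture, a1, a2, a3));
        Console.WriteLine("Ordinal consistent: {0}",
            IsConsistent(StringComparer.Ordinal, a1, a2, a3));
    }
}
```

Running the same check over every triple of a word list is cubic in the number of words, so it is only practical for narrowing down a small set of suspicious keys.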
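So I still hope that Microsoft will reconsider, or that someone knows of a viable alternative. Otherwise, the only remaining option is to fall back to StringComparer.Ordinal.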
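As a rough illustration of the StringComparer.Ordinal fallback, the sketch below inserts a handful of words into a SortedList and reports any key that cannot be found again. The helper name `FindMissingKeys` and the sample words are assumptions made for this example; they are not taken from the original test file.

```csharp
using System;
using System.Collections.Generic;

static class OrdinalFallbackDemo
{
    // Hypothetical helper: inserts every word, then returns the words
    // that ContainsKey fails to find afterwards.
    public static List<string> FindMissingKeys(string[] words, IComparer<string> comparer)
    {
        var list = new SortedList<string, int>(comparer);
        var index = 0;
        foreach (var word in words)
            list[word] = index++;

        var missing = new List<string>();
        foreach (var word in words)
        {
            if (!list.ContainsKey(word))
                missing.Add(word);
        }
        return missing;
    }

    static void Main()
    {
        // Sample input, made up for illustration.
        var words = new[] { "A", "a'", "'a", "b", "B" };

        var missing = FindMissingKeys(words, StringComparer.Ordinal);
        Console.WriteLine("Keys not found with Ordinal: {0}", missing.Count);
    }
}
```

The trade-off is that ordinal comparison orders strings by their UTF-16 code unit values, so the resulting order is not linguistic: for instance, all uppercase ASCII letters sort before all lowercase ones.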
Is it related to .NET Framework 4/4.5? I adapted your example to .NET 3.5 as follows:
var words = ReadFile("hamlet.txt");
//...
private static string[] ReadFile(string path)
{
    List<string> lines = new List<string>();
    using (StreamReader sr = new StreamReader(path))
    {
        string text = sr.ReadToEnd();
        lines.Add(text);
    }
    return lines.SelectMany(l => l.Split(new[] { ' ', '\t' }, StringSplitOptions.RemoveEmptyEntries).Select(w => w.Trim()))
        .Where(w => !(w.ToCharArray().All(c => c == ' ')))
        .ToArray();
}
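With .NET 3.5, both comparers work fine on XP.
Confirmed: on the same system, the problem does not occur with .NET 3.5 or the .NET 3.5 Client Profile. 4.0, 4.0 Client Profile, and 4.5 all show the same problem.
I did some more troubleshooting and confirmed that Array.BinarySearch shows the same problem. The information has been added to the original question.
With this new information, this appears to be a known issue: see the linked, already reported (and closed) issue.
Interestingly, Marc Gravell reported the same or a similar problem for .NET 2.0 in 2006, and it was also closed as "won't fix". Perhaps it was fixed in 3.5 and regressed in 4.0?
Is there any practical use for a comparer that is not 100% antisymmetric? I can imagine using a comparer to define a partial order (if a > b and b > c, and a and c have a defined ranking, then a > c), but I can see none for a comparer with which a > b and b > a can both be true.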
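For the antisymmetry property raised in the last comment, a quick check is to compare every pair in both directions and verify that the signs are opposite. The following is only a sketch: the helper name `CountAntisymmetryViolations` and the sample words are assumptions for illustration.

```csharp
using System;
using System.Collections.Generic;

static class AntisymmetryCheck
{
    // Hypothetical helper: counts pairs (x, y) for which Compare(x, y)
    // and Compare(y, x) do not have opposite signs.
    public static int CountAntisymmetryViolations(string[] words, IComparer<string> comparer)
    {
        var violations = 0;
        for (var i = 0; i < words.Length; i++)
        {
            for (var j = i + 1; j < words.Length; j++)
            {
                int forward = Math.Sign(comparer.Compare(words[i], words[j]));
                int backward = Math.Sign(comparer.Compare(words[j], words[i]));
                if (forward != -backward)
                    violations++;
            }
        }
        return violations;
    }

    static void Main()
    {
        // Sample input, made up for illustration.
        var words = new[] { "A", "a'", "'a", "b", "B" };

        Console.WriteLine("Violations (InvariantCulture): {0}",
            CountAntisymmetryViolations(words, StringComparer.InvariantCulture));
        Console.WriteLine("Violations (Ordinal): {0}",
            CountAntisymmetryViolations(words, StringComparer.Ordinal));
    }
}
```

Note that a comparer can pass this pairwise test and still be inconsistent across three or more keys, which is the failure the question reports.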
using System;
using System.Globalization;
using System.Linq.Expressions;
using System.Reflection;

public class WorkAroundStringComparer : StringComparer
{
    private static readonly Func<CompareInfo, string, CompareOptions, int> _getHashCodeOfString;
    private readonly CompareInfo _compareInfo;
    private readonly CompareOptions _compareOptions;

    static WorkAroundStringComparer()
    {
        // Need this internal method to compute hashcode
        // as an IEqualityComparer implementation.
        _getHashCodeOfString = BuildGetHashCodeOfStringDelegate();
    }

    static Func<CompareInfo, string, CompareOptions, int> BuildGetHashCodeOfStringDelegate()
    {
        var compareInfoType = typeof(CompareInfo);
        var argTypes = new[] { typeof(string), typeof(CompareOptions) };
        var flags = BindingFlags.NonPublic | BindingFlags.Instance;

        // Look up the non-public CompareInfo.GetHashCodeOfString(string, CompareOptions)
        // and compile a delegate that calls it.
        var method = compareInfoType.GetMethod("GetHashCodeOfString", flags, null, argTypes, null);
        var instance = Expression.Parameter(compareInfoType, "instance");
        var stringArg = Expression.Parameter(typeof(string), "string");
        var optionsArg = Expression.Parameter(typeof(CompareOptions), "options");
        var methodCall = Expression.Call(instance, method, stringArg, optionsArg);
        var expr = Expression.Lambda<Func<CompareInfo, string, CompareOptions, int>>(methodCall, instance, stringArg, optionsArg);
        return expr.Compile();
    }

    public WorkAroundStringComparer()
        : this(CultureInfo.InvariantCulture)
    {
    }

    public WorkAroundStringComparer(CultureInfo cultureInfo, CompareOptions compareOptions = CompareOptions.None)
    {
        if (cultureInfo == null)
            throw new ArgumentNullException("cultureInfo");
        this._compareInfo = cultureInfo.CompareInfo;
        this._compareOptions = compareOptions;
    }

    public override int Compare(string x, string y)
    {
        if (ReferenceEquals(x, y))
            return 0;
        if (ReferenceEquals(x, null))
            return -1;
        if (ReferenceEquals(y, null))
            return 1;

        var sortKeyFor_x = _compareInfo.GetSortKey(x, _compareOptions);
        var sortKeyFor_y = _compareInfo.GetSortKey(y, _compareOptions);
        return SortKey.Compare(sortKeyFor_x, sortKeyFor_y);
    }

    public override bool Equals(string x, string y)
    {
        return Compare(x, y) == 0;
    }

    public override int GetHashCode(string obj)
    {
        return _getHashCodeOfString(_compareInfo, obj, _compareOptions);
    }
}
StringComparer.InvariantCulture : 00:00:15.3120013
WorkAroundStringComparer : 00:01:35.8322409
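The two timings above put the workaround at roughly six times the cost of StringComparer.InvariantCulture. How they were measured is not stated, so the following is only a sketch of one way such a comparison could be timed, using Stopwatch around repeated Array.Sort calls. The helper name `TimeSort`, the sample words, and the repetition count are assumptions, and StringComparer.Ordinal stands in as the second comparer so that the example stays self-contained.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

static class ComparerTiming
{
    // Hypothetical helper: sorts a fresh copy of the input the given number
    // of times and returns the total elapsed time.
    public static TimeSpan TimeSort(string[] input, IComparer<string> comparer, int repetitions)
    {
        var stopwatch = Stopwatch.StartNew();
        for (var i = 0; i < repetitions; i++)
        {
            var copy = (string[])input.Clone();
            Array.Sort(copy, comparer);
        }
        stopwatch.Stop();
        return stopwatch.Elapsed;
    }

    static void Main()
    {
        // Sample input, made up for illustration; a real run would use the full word list.
        var words = new[] { "to", "be", "or", "not", "that", "is", "the", "question" };

        Console.WriteLine("StringComparer.InvariantCulture : {0}",
            TimeSort(words, StringComparer.InvariantCulture, 1000));
        Console.WriteLine("StringComparer.Ordinal          : {0}",
            TimeSort(words, StringComparer.Ordinal, 1000));
    }
}
```

Any IComparer<string>, including a workaround comparer like the one above, can be passed to the same helper.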