C#: Returning a multiple-enumeration-safe IEnumerable when a value needs to be reset each time


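The following code fails when it is enumerated more than once, because the existingNames hash set still contains the results of the previous enumeration, so the numeric suffixes advance further than they should. What is an elegant way to improve this method so that it handles multiple enumerations correctly?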

public static IEnumerable<TOutput> UniquifyNames<TSource, TOutput>(
   this IEnumerable<TSource> source,
   Func<TSource, string> nameSelector,
   Func<TSource, string, TOutput> resultProjection
) {
   HashSet<string> existingNames = new HashSet<string>();
   return source
      .Select(item => {
         string name = nameSelector(item);
         return resultProjection(
            item,
            Enumerable.Range(1, int.MaxValue)
               .Select(i => {
                  string suffix = i == 1
                     ? ""
                     : (name.EndsWithDigit() ? "-" : "") + i.ToString();
                  return $@"{name}{suffix}";
               })
               .First(candidateName => existingNames.Add(candidateName))
         );
      });
}

private static bool EndsWithDigit(this string value)
   => !string.IsNullOrEmpty(value) && "0123456789".Contains(value[value.Length - 1]);

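I considered creating an extension method, something like UponEnumeration, to wrap the outer enumerable. It would take a callback Action to run whenever enumeration starts again (which could be used to reset the HashSet). Is this a good idea?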

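I just realized that, as stated, this is not a good idea, because the same resulting IEnumerable could be enumerated by different classes at the same time (one place starts enumerating while another is still halfway through, so things change once the latter resumes, because the HashSet was cleared). It sounds like the best option is simply to call ToList(), but I would really like to preserve lazy evaluation if possible.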

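If you make your code a deferred IEnumerable itself (an iterator method), its body runs again every time someone enumerates the result, so each enumeration starts with its own fresh HashSet: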

public static IEnumerable<TOutput> UniquifyNames<TSource, TOutput>(
   this IEnumerable<TSource> source,
   Func<TSource, string> nameSelector,
   Func<TSource, string, TOutput> resultProjection
) {
   HashSet<string> existingNames = new HashSet<string>();
   var items = source
      .Select(item => {
         string name = nameSelector(item);
         return resultProjection(
            item,
            Enumerable.Range(1, int.MaxValue)
               .Select(i => {
                  string suffix = i == 1
                     ? ""
                     : (name.EndsWithDigit() ? "-" : "") + i.ToString();
                  return $@"{name}{suffix}";
               })
               .First(candidateName => existingNames.Add(candidateName))
         );
      });
    foreach(TOutput item in items)
    {
        yield return item;
    }
}
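
To make the difference concrete, here is a minimal sketch (not taken from the answer itself) that contrasts eager setup with an iterator block. The names DeferredStateDemo, SharedState, and FreshState are hypothetical, and the "!" suffix is only a stand-in for the real renaming logic.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredStateDemo
{
    // Not an iterator: the HashSet is created once, when the method is called,
    // and is then shared by every enumeration of the returned query.
    public static IEnumerable<string> SharedState(IEnumerable<string> names)
    {
        var seen = new HashSet<string>();
        return names.Select(n => seen.Add(n) ? n : n + "!");
    }

    // Iterator block: nothing runs until enumeration starts, and the whole body,
    // including the HashSet creation, runs again for each GetEnumerator() call.
    public static IEnumerable<string> FreshState(IEnumerable<string> names)
    {
        var seen = new HashSet<string>();
        foreach (var n in names)
        {
            yield return seen.Add(n) ? n : n + "!";
        }
    }
}
```

Enumerating SharedState(new[] { "a", "b" }) twice yields "a", "b" and then "a!", "b!", while FreshState yields "a", "b" both times. Two enumerators obtained from FreshState at the same time also get separate hash sets, which addresses the concurrent-enumeration concern from the question.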

I did come up with an approach that works, but I don't know whether it is any good:

public class ResettingEnumerable<T> : IEnumerable<T> {
    private readonly Func<IEnumerable<T>> _enumerableFetcher;

    public ResettingEnumerable(Func<IEnumerable<T>> enumerableFetcher) {
        _enumerableFetcher = enumerableFetcher;
    }

    public IEnumerator<T> GetEnumerator() => _enumerableFetcher().GetEnumerator();
    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}
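
As a rough illustration of how such a wrapper might be used, the sketch below repeats the class so that it compiles on its own and wraps a stateful query in a factory delegate. ResettingDemo and MarkDuplicates are hypothetical names, and the "!" suffix again stands in for the real renaming logic.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

public class ResettingEnumerable<T> : IEnumerable<T>
{
    private readonly Func<IEnumerable<T>> _enumerableFetcher;

    public ResettingEnumerable(Func<IEnumerable<T>> enumerableFetcher)
    {
        _enumerableFetcher = enumerableFetcher;
    }

    // Every enumeration asks the factory for a brand-new sequence.
    public IEnumerator<T> GetEnumerator() => _enumerableFetcher().GetEnumerator();
    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}

public static class ResettingDemo
{
    // Hypothetical helper: the HashSet lives inside the factory delegate,
    // so it is recreated for each GetEnumerator() call.
    public static IEnumerable<string> MarkDuplicates(IEnumerable<string> names) =>
        new ResettingEnumerable<string>(() =>
        {
            var seen = new HashSet<string>();
            return names.Select(n => seen.Add(n) ? n : n + "!");
        });
}
```

Enumerating MarkDuplicates(new[] { "a", "a", "b" }) yields "a", "a!", "b" on every pass, because the state is rebuilt by the factory each time.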

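However, after looking at the other answer, I think his idea is probably better: just write the method as a yielding IEnumerable, which has the property of being re-run every time GetEnumerator is called. That is a nice general solution: when multiple enumeration cannot be tolerated, switch to a deferred IEnumerable.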

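For the record, I ended up with a slightly different final implementation. Initially I wanted to preserve the lazy nature of IEnumerable, where a collection can be enumerated only partially and still produce useful results. However, I realized that my goal was to change as few of the existing names as possible, which led me to a different algorithm that requires enumerating the whole list (every name that can be kept as-is is claimed before any numeric incrementing begins). Here is the solution: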

private class NamedItem<TSource> {
   public TSource Item { get; set; }
   public string Name { get; set; }
}

private static bool EndsWithADigit(this string value) =>
   !string.IsNullOrEmpty(value) && "0123456789".Contains(value[value.Length - 1]);

private static string GetNumberedName(string name, int index) =>
   name + (index == 1 ? "" : name.EndsWithADigit() ? $"-{index}" : $"{index}");

private static bool ConditionalSetName<T>(
   NamedItem<T> namedItem, string name, HashSet<string> hashset
) {
   bool isNew = hashset.Add(name);
   if (isNew) { namedItem.Name = name; }
   return !isNew;
}

public static IEnumerable<TOutput> UniquifyNames<TSource, TOutput>(
   this IEnumerable<TSource> source,
   Func<TSource, string> nameSelector,
   Func<TSource, string, TOutput> resultProjection
) {
   var seen = new HashSet<string>();
   var result = source.Select((item, seq) => new NamedItem<TSource>{
      Item = item, Name = nameSelector(item)
   }).ToList();
   var remaining = result;
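   int i = 1;
   do {
      // Every item still waiting for a name tries the same suffix within a pass;
      // the index advances once per pass, not once per item.
      remaining = remaining.Where(namedItem =>
         ConditionalSetName(namedItem, GetNumberedName(namedItem.Name, i), seen)
      ).ToList();
      i++;
   } while (remaining.Any());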
   return result.Select(namedItem => resultProjection(namedItem.Item, namedItem.Name));
}

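The result is:

"String2", "String", "String4", "String3", "String3-2"

This is better because the name String3 is left unchanged.

My original implementation gave the following result:

"String2", "String", "String3", "String3-2", "String3-3"

This is worse because it needlessly mutates the first String3.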

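That really is a very simple way to make it safe to enumerate multiple times! I learned something good today. It makes the answer I gave look rather too heavyweight.

@ErikE Be sure to refresh; I made a few more tweaks to "my version" to make UniquifyNames simpler by moving the .First() into GenerateName.

A small mistake: "s" does not exist; I bet you meant "item", since the item is of type TSource. The whole algorithm is based on a hash set that is then probed with Enumerable.Range.Select.First, which is really too complicated, not to mention that it is done for every element in the sequence. Try using a Dictionary instead and test whether the key, the name, exists. If it does not, this must be the first occurrence, so add it. If it does, increment the value stored under the key. That cuts out all the Enumerable.Range.Select.First(hashset.Add) fun, improves speed, and creates less garbage for the GC.

@ipavlu First, a Dictionary is heavier than a HashSet. It feels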
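
For illustration, the Dictionary suggestion from the comments might be sketched as follows. This is an interpretation of the comment, not code from the thread: DictionaryUniquifier and Uniquify are hypothetical names, the method works on plain strings rather than on TSource with selectors, and it still has to check each candidate for collisions, because a generated name such as "String2" may already be taken by another item.

```csharp
using System;
using System.Collections.Generic;

public static class DictionaryUniquifier
{
    private static bool EndsWithDigit(string value) =>
        !string.IsNullOrEmpty(value) && "0123456789".IndexOf(value[value.Length - 1]) >= 0;

    // Iterator method, so the dictionary is rebuilt on every enumeration.
    public static IEnumerable<string> Uniquify(IEnumerable<string> names)
    {
        // Key: a name already handed out. Value: the last suffix tried for that name.
        var lastIndex = new Dictionary<string, int>();
        foreach (var name in names)
        {
            string candidate = name;
            if (lastIndex.TryGetValue(name, out int i))
            {
                // The name is taken: resume from the last suffix instead of from 1.
                do
                {
                    i++;
                    candidate = name + (EndsWithDigit(name) ? "-" : "") + i;
                } while (lastIndex.ContainsKey(candidate));
                lastIndex[name] = i;
            }
            lastIndex[candidate] = 1; // the candidate itself is now a used name
            yield return candidate;
        }
    }
}
```

For the input "String2", "String", "String", "String3", "String3" this yields "String2", "String", "String3", "String3-2", "String3-3", the same result as the original lazy implementation, so it does not provide the "change as few names as possible" behavior of the final full-enumeration version.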