Erroneous character fixing of strings in C#

c# string performance

I have five strings like the ones below:

ABBCCD
ABBDCD
ABBDCD
ABBECD
ABBDCD

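All the strings are basically the same except for the fourth character. Only the character that occurs most often should take that place. For example, D appears three times in the fourth position here, so the final string is ABBDCD. I wrote the code below, but it seems inefficient in terms of time, and this function may be called a million times. What should I do to improve performance?

Here changedString is the string to be matched against the other five strings. If any position of changedString does not match the others, the character that occurs most often at that position is placed into changedString.

len is the length of the strings, which is the same for all of them.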

for (int i = 0; i < len;i++ )
{
    String findDuplicate = string.Empty + changedString[i] + overlapStr[0][i] + overlapStr[1][i] + overlapStr[2][i] +
                           overlapStr[3][i] + overlapStr[4][i];

    char c = findDuplicate.GroupBy(x => x).OrderByDescending(x => x.Count()).First().Key;
    if(c!=changedString[i])
    {
        if (i > 0)
        {
            changedString = changedString.Substring(0, i) + c +
                            changedString.Substring(i + 1, changedString.Length - i - 1);
        }
        else
        {
            changedString = c + changedString.Substring(i + 1, changedString.Length - 1);
        }
    }
    //string cleanString = new string(findDuplicate.ToCharArray().Distinct().ToArray());
}

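I'm not quite sure what you are trying to do, but if it is about sorting strings by their n-th character, then the best approach is counting sort, which is used for sorting arrays of small integers and works well for characters too. It runs in linear, O(n), time. The main idea is that if you know all possible elements (it looks like they can only be A-Z here), you can create an additional array and count them. For example, if we use 0 for 'A', 1 for 'B' and so on, the counts would be {0,0,1,3,1,0,…}.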
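The counting idea above can be sketched roughly as follows. This is only an illustration, not code from the thread: `ColumnCounter`, `CountColumn` and `MostFrequent` are hypothetical names, and the sketch assumes every character is an uppercase letter 'A' to 'Z'.

```csharp
using System;

public static class ColumnCounter
{
    // Hypothetical helper: counts the letters found at one position across
    // all strings. Assumes every character is in the range 'A'..'Z'.
    public static int[] CountColumn(string[] strings, int position)
    {
        int[] counts = new int[26];            // one slot per letter, index 0 = 'A'
        foreach (string s in strings)
        {
            counts[s[position] - 'A']++;
        }
        return counts;
    }

    // Returns the letter with the highest count. On a tie, the letter that
    // comes first alphabetically wins (an arbitrary choice for this sketch).
    public static char MostFrequent(int[] counts)
    {
        int best = 0;
        for (int k = 1; k < counts.Length; k++)
        {
            if (counts[k] > counts[best])
            {
                best = k;
            }
        }
        return (char)('A' + best);
    }
}
```

For sample strings whose fourth characters are C, D, D, E and D, `CountColumn(strings, 3)` yields 1 for C, 3 for D and 1 for E, so `MostFrequent` returns 'D'. Because the array has a fixed 26 slots, no dictionary or LINQ grouping is needed per position.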

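Here is a function that might help with performance, as it runs about five times faster. The idea is to count the occurrences yourself: a dictionary maps each character to a position in a counting array, the value at that position is incremented, and the result is checked against the highest number of occurrences seen so far. If it is greater, the current character becomes the top one and is stored as the result. This is repeated for every string in overlapStr and for every position within the strings. Please read the comments in the code for details.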

string HighestOccurrenceByPosition(string[] overlapStr)
{
    int len = overlapStr[0].Length;
    //  Dictionary transforms character to offset into counting array
    Dictionary<char, int> char2offset = new Dictionary<char, int>();
    //  Counting array. Each character has an entry here
    int[] counters = new int[overlapStr.Length];
    //  Highest occurrence characters found so far
    char[] topChars = new char[len];

    for (int i = 0; i < len; ++i)
    {
        char2offset.Clear();
        //  faster! char2offset = new Dictionary<char, int>();
        //  Highest number of occurrences at the moment
        int highestCount = 0;
        //  Allocation of counters - as previously unseen character arrives 
        //  it is given a slot at this offset
        int lastOffset = 0;
        //  Current offset into "counters"
        int offset = 0;
        //  Small optimization. As your data seems very similar, this helps
        //  to reduce number of expensive calls to TryGetValue
        //  You might need to remove this optimization if you don't have 
        //  unused value of char in your dataset
        char lastChar = (char)0;

        for (int j = 0; j < overlapStr.Length; ++ j)
        {
            char thisChar = overlapStr[j][i];
            //  If this is the same character as last one
            //  Offset already points to correct cell in "counters"
            if (lastChar != thisChar)
            {
                //  Get offset
                if (!char2offset.TryGetValue(thisChar, out offset))
                {
                    //  First time seen - allocate & initialize cell
                    offset = lastOffset;
                    counters[offset] = 0;
                    //  Map character to this cell
                    char2offset[thisChar] = lastOffset++;
                }
                //  This is now last character
                lastChar = thisChar;
            }
            //  increment and get count for character
            int charCount = ++counters[offset];
            //  This is now highestCount.
            //  TopChars receives current character
            if (charCount > highestCount)
            {
                highestCount = charCount;
                topChars[i] = thisChar;
            }
        }
    }
    return new string(topChars);
}

P.S. This is certainly not the best possible solution. But since it is considerably faster than the original, I thought I should offer it.

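I had trouble coming up with a title. OK, I've made it short.

No, I am working on a project and I need this function for it. But it takes a lot of time.

First, string concatenation is expensive. Use StringBuilder. If the strings are of fixed length, char[] will be faster still.

@BryceWagner: That's a good point. If I understand what they are trying to do with the Substring part, I don't think they actually need the whole substring, just a single character.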
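The char[] suggestion from the comments could look something like the sketch below. It is not code from the thread: `MajorityFixer` and `FixString` are hypothetical names, and it assumes that all strings have the same length as changedString. It writes single characters into a buffer instead of rebuilding the string with Substring, and it keeps the tie-breaking of the LINQ version in the question (the earliest character in the column wins).

```csharp
using System;

public static class MajorityFixer
{
    // Hypothetical helper: returns changedString with each position replaced
    // by the most frequent character at that position across changedString
    // and all strings in overlapStr. Assumes all strings share one length.
    public static string FixString(string changedString, string[] overlapStr)
    {
        char[] result = changedString.ToCharArray();       // mutable copy, made once
        char[] column = new char[overlapStr.Length + 1];   // reused for every position

        for (int i = 0; i < result.Length; i++)
        {
            // Gather the characters at position i, changedString first.
            column[0] = result[i];
            for (int j = 0; j < overlapStr.Length; j++)
            {
                column[j + 1] = overlapStr[j][i];
            }

            // Count every candidate against the whole column. The column is
            // tiny (six entries in the question), so the nested loop is cheap.
            char best = column[0];
            int bestCount = 0;
            for (int a = 0; a < column.Length; a++)
            {
                int count = 0;
                for (int b = 0; b < column.Length; b++)
                {
                    if (column[b] == column[a])
                    {
                        count++;
                    }
                }
                if (count > bestCount)   // strict comparison: earliest candidate wins ties
                {
                    bestCount = count;
                    best = column[a];
                }
            }

            result[i] = best;            // single-character write, no Substring
        }

        return new string(result);       // one allocation for the final string
    }
}
```

The only allocations per call are the two arrays and the returned string, regardless of how many positions change.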