C# DNA code mutation process: finding substrings


I have this amino acid sequence:

PhePheLeoArgStopValGlyArgTyrStopPheArgGleHis

I want to apply a mutation point to this sequence, so that the result is several keys generated from the sequence above, like this:

key1:PhePheLeoArgStopValGlyArgTyrStopPheArgGleHis
key2:ValGlyArgTyrStopPheArgGleHis
key3:PheArgGleHis

So whenever it reaches "Stop", it creates a key containing the rest of the string.

I tried the following code, but it does not work as expected:

string mRNA = textBox3.Text;
string Rna = mRNA.Replace("Stop", "*");
string[] keys = Rna.Split('*');
foreach (string key in keys)
{
    listBox1.Items.Add( key);
}

Can someone help me fix the code?

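Split is probably one of the most commonly used constructs. It is easy to see why: it divides a problem into multiple parts, which, seen from a bird's-eye view, is what programming is all about.

In this case, however, it is better to search within the string using Regex.Match:

"This method returns the first substring in the input that matches the regular expression pattern. You can retrieve subsequent matches by repeatedly calling the returned Match object's Match.NextMatch method."

You can then use Match.Index to find the index of the "key" (at least if the key is everything after the match, as you show). The key itself can then be retrieved simply by using String.Substring.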

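Here is sample code (you seem to have made a real effort yourself):

Put using System.Text.RegularExpressions; at the start of your file to make it work.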
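The sample code itself is not present on this page, so the following is only a minimal sketch of the Regex.Match / Match.NextMatch approach described above, not the answerer's original code. The helper name KeysByRegex is hypothetical, and the sketch assumes C# 9 or later (top-level statements) and writes to the console instead of listBox1 so that it is self-contained.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper: returns the full sequence followed by the
// remainder of the string after each "Stop" match.
static List<string> KeysByRegex(string rna)
{
    var keys = new List<string> { rna };
    // Regex.Match finds the first match; NextMatch walks to the following ones.
    for (Match m = Regex.Match(rna, "Stop"); m.Success; m = m.NextMatch())
    {
        // m.Index is where the match starts, m.Length is how long it is,
        // so their sum is the position right after "Stop".
        keys.Add(rna.Substring(m.Index + m.Length));
    }
    return keys;
}

foreach (string key in KeysByRegex("PhePheLeoArgStopValGlyArgTyrStopPheArgGleHis"))
{
    Console.WriteLine(key);
}
```

Note that this sketch adds an empty key when the input ends with "Stop"; whether that is wanted depends on the use case.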

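If you are only searching for a static string, you can also use String.IndexOf. In that case you can find the start of the next string by adding "Stop".Length to the index that was found (after checking that the "Stop" string was actually found, of course).

See an example.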
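As an illustration of the IndexOf variant, here is a minimal sketch that uses pattern.Length rather than a hard-coded number. The helper name KeysByIndexOf and its pattern parameter are hypothetical, and the sketch assumes C# 9 or later (top-level statements) and console output instead of listBox1.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: returns the full sequence followed by the
// remainder of the string after each occurrence of the pattern.
static List<string> KeysByIndexOf(string rna, string pattern)
{
    // An empty pattern would be "found" at every position and never advance.
    if (string.IsNullOrEmpty(pattern))
    {
        throw new ArgumentException("pattern must not be empty", nameof(pattern));
    }

    var keys = new List<string> { rna };
    int index = rna.IndexOf(pattern, StringComparison.Ordinal);
    while (index >= 0) // IndexOf returns -1 when the pattern is not found
    {
        int next = index + pattern.Length; // first character after the pattern
        keys.Add(rna.Substring(next));
        index = rna.IndexOf(pattern, next, StringComparison.Ordinal);
    }
    return keys;
}

foreach (string key in KeysByIndexOf("PhePheLeoArgStopValGlyArgTyrStopPheArgGleHis", "Stop"))
{
    Console.WriteLine(key);
}
```

Like any plain Substring approach, this adds an empty key when the input ends with the pattern.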

I would solve this using the IndexOf approach. See below:

// Requires: using System.Collections.Generic; using System.Linq;
var rna = "PhePheLeoArgStopValGlyArgTyrStopPheArgGleHis";
var keys = new List<string>();
var pos = -1;

// Prime the loop by adding the first item (the full sequence)
keys.Add(rna);
listBox1.Items.Add(keys.Last());

// Loop through all occurrences of "Stop" and add the key that follows each one
while ((pos = rna.IndexOf("Stop", pos + 1)) >= 0)
{
    keys.Add(rna.Substring(pos + 4)); // 4 == "Stop".Length
    listBox1.Items.Add(keys.Last());
}

You can use a function like this:

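static IEnumerable<string> Mutate(string input)
{
  // Nothing to yield for null or empty input
  if (input == null || input.Length == 0) { yield break; }
  // The first key is always the full sequence
  yield return input;
  int pos = 0;
  while (pos < input.Length)
  {
    int index = input.IndexOf("Stop", pos);
    // Finish when there is no further "Stop", or when "Stop" ends the input
    if (index == -1 || index == input.Length - 4) { yield break; }
    index += 4; // skip past "Stop"
    yield return input.Substring(index);
    pos = index;
  }
}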

I tested this solution with the following inputs:

null
empty,
starting with "Stop"
ending with "Stop"
having multiple "Stop"s in a row, and
having no "Stop" at all
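The cases listed above could be exercised with a small harness along these lines. This is only a sketch: the concrete test strings are made up for illustration, Mutate is copied from the answer as a local function, and C# 9 or later (top-level statements) is assumed.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Copy of the Mutate function from the answer, as a local function.
static IEnumerable<string> Mutate(string input)
{
    if (input == null || input.Length == 0) { yield break; }
    yield return input;
    int pos = 0;
    while (pos < input.Length)
    {
        int index = input.IndexOf("Stop", pos);
        if (index == -1 || index == input.Length - 4) { yield break; }
        index += 4;
        yield return input.Substring(index);
        pos = index;
    }
}

// Illustrative inputs, one per case: null, empty, leading "Stop",
// trailing "Stop", consecutive "Stop"s, and no "Stop" at all.
string[] cases = { null, "", "StopAbc", "AbcStop", "AbcStopStopDef", "AbcDef" };
foreach (string c in cases)
{
    string shown = c == null ? "null" : "\"" + c + "\"";
    Console.WriteLine(shown + " -> [" + string.Join(", ", Mutate(c)) + "]");
}
```

Because Mutate stops when "Stop" is the last thing in the input, a trailing "Stop" does not produce an empty key.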

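Is there a requirement to use regular expressions? You could easily accomplish this in other ways.

How? Could you help me out?

Added an example to my answer showing how to do it with indexes. It is probably more efficient, but I think the NextMatch solution is more readable (no magic integer values) and more flexible (regex).

Upvoted... but it does not work if the input starts with "Stop". I believe the while loop needs >= 0 rather than > 0.

@CallumWatkins - Thanks, I updated the answer to reflect your suggestion.

@MaartenBodewes - To each their own; I find this more readable. For strings this short, I don't think efficiency matters that much.

This is a test string. I think real RNA strings may be just slightly larger.

Yes, to each their own. I would certainly use var pattern = "Stop" and then pattern.Length instead of the literal.

Very robust solution!

Thank you, my friend, you really helped me solve my problem.

Thank you, I used your code and it solved my problem. @CallumWatkins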

The Mutate function can then be called like this:

string mRNA = textBox3.Text;
foreach (string key in Mutate(mRNA))
{
  listBox1.Items.Add(key);
}