C#: Tokenize or split a string into text and HTML tag items


I'm looking for the most efficient way to take a string and tokenize it into an array, so that any HTML tags end up as separate items.

Example Input (String): 
    "I can format my text so that <strong>This is bold</strong> and this is not."

Desired Output (String[] array): 
    "I can format my text so that",
    "<strong>",
    "This is bold",
    "</strong>",
    "and this is not."

Alternate Output, Just As Good (String[] array):
    "I",
    "can",
    "format",
    "my",
    "text",
    "so",
    "that",
    "<strong>",
    "This",
    "is",
    "bold",
    "</strong>",
    "and",
    "this",
    "is",
    "not."

I'm not sure of the best way to approach this. Any help would be appreciated.

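You can use Regex.Split() with a pair of zero-width assertions, a lookahead and a lookbehind, so that the string is split at the position just before each < and just after each >: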

    string input = "I can format my text so that <strong>This is bold</strong> and this is not.";
    string[] output = Regex.Split(input, "(?=<)|(?<=>)");
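
Note that the split above keeps the whitespace around each tag (for example, the first item ends with a space). As a minimal sketch of how both outputs from the question might be produced, the helper below trims each piece and offers a word-level variant. The class name HtmlTokenizer and both method names are hypothetical, not part of the original answer, and the approach assumes simple markup where < and > appear only as tag delimiters; it is not a substitute for a real HTML parser.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; the names are not from the original answer.
public static class HtmlTokenizer
{
    // Split before every '<' and after every '>', then trim each piece
    // and drop pieces that are empty after trimming.
    public static string[] SplitTagsAndText(string input)
    {
        return Regex.Split(input, "(?=<)|(?<=>)")
                    .Select(piece => piece.Trim())
                    .Where(piece => piece.Length > 0)
                    .ToArray();
    }

    // Word-level variant: tags stay whole, text runs are split on whitespace.
    // Assumes anything starting with '<' is a tag.
    public static string[] SplitTagsAndWords(string input)
    {
        char[] whitespace = { ' ', '\t', '\r', '\n' };
        return SplitTagsAndText(input)
            .SelectMany(piece => piece[0] == '<'
                ? new[] { piece }
                : piece.Split(whitespace, StringSplitOptions.RemoveEmptyEntries))
            .ToArray();
    }
}
```

With the example input from the question, SplitTagsAndText should return the five-item array and SplitTagsAndWords the sixteen-item array. Input containing a literal < or > inside text or attribute values would be split in the wrong places, which is where a dedicated HTML parser becomes the safer choice.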