C#: How can I strip any and all HTML tags from a string?

c# html

I have a string defined like so:

private const String REFER_TO_BUSINESS = "<pre> (Refer to business office for guidance and explain below the circumstances for exception to policy or attach a copy of request)</pre>";

...which, as you can see, has "pre" tags to preserve the space that precedes the verbiage. However, I want to reference this string without the "pre" tags. It would be easy enough to search for "<pre>" and "</pre>" and remove them, but doing that for every type of HTML tag would quickly get tedious.

In C#, how can I strip all tags out of a string, whether they are "<pre>" or any other tag?

This works:

// For strings that have embedded HTML tags for presentation on the form (such as "<pre>" and such), but need to be rendered free of these (such as on the PDF)
private String RemoveHTMLTags(String stringContainingHTMLTags)
{
    String regexified = Regex.Replace(stringContainingHTMLTags, "<.*?>", string.Empty);
    return regexified;
}

This should be what you need:

// stripMeOfHTML is an existing string variable that holds the markup
stripMeOfHTML = Regex.Replace(stripMeOfHTML, @"<[^>]+>", "").Trim();

Try a regex replacement. This pattern, taken from another source, matches the HTML tags within a string:

var pattern = @"</?\w+((\s+\w+(\s*=\s*(?:"".*?""|'.*?'|[^'"">\s]+))?)+\s*|\s*)/?>";
var source = "<pre> (Refer to business office for guidance and explain below the circumstances for exception to policy or attach a copy of request)</pre>";
// Regex.Replace returns a new string, so keep the result
var result = Regex.Replace(source, pattern, string.Empty);
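The snippets in this thread omit the surrounding class and `using` directives. The following is a minimal self-contained sketch of the two simple patterns, `<.*?>` and `<[^>]+>`, side by side. The class name `HtmlTagStripper` and both method names are hypothetical and do not come from the thread, and the `RegexOptions.Singleline` flag is an addition of this sketch rather than part of the posted code.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper; the names are illustrative, not from the original posts.
public static class HtmlTagStripper
{
    // Lazy pattern from the RemoveHTMLTags method. RegexOptions.Singleline
    // (added in this sketch) lets '.' match newlines, so a tag that is
    // split across lines is still removed.
    public static string StripLazy(string input)
    {
        return Regex.Replace(input, "<.*?>", string.Empty, RegexOptions.Singleline);
    }

    // Negated-character-class pattern, followed by Trim() as in the answer.
    // Note that Trim() also discards the leading space that <pre> preserved.
    public static string StripAndTrim(string input)
    {
        return Regex.Replace(input, @"<[^>]+>", string.Empty).Trim();
    }
}
```

For the input `"<pre> (Refer)</pre>"`, `StripLazy` keeps the leading space and returns `" (Refer)"`, while `StripAndTrim` returns `"(Refer)"`. Which one is appropriate depends on whether the whitespace inside the tags still matters once the markup is gone.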

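The regex implementations in the answers so far have a problem: they will mangle a string such as x<6&&y>8, which does not contain any HTML tags.
Good point, but that doesn't apply to my use case.
@jdpenix Probably because, to be valid HTML, the string would have to be x &lt; 6 &amp;&amp; y &gt; 8
@B.ClayShannon Good, but that doesn't apply to my use case.
If your use case is always as simple as the one in your question, then you can use it, but it is not the correct way to get the text between HTML tags. Use an HTML parser, such as HtmlAgilityPack.
@EZI Fair enough, I'm with you there. I don't have a test case that is valid HTML and on which the regex solution fails. It just... smells. :)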
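The concern raised in the comments can be reproduced with the standard library alone. The sketch below is an illustration, not code from the thread: `TagStripDemo` and its method names are hypothetical, and decoding entities with `WebUtility.HtmlDecode` after stripping is an added assumption about what a caller would want. It does not replace a real HTML parser such as HtmlAgilityPack.

```csharp
using System;
using System.Net;
using System.Text.RegularExpressions;

// Hypothetical demo class; not from the original posts.
public static class TagStripDemo
{
    // Attribute-aware pattern quoted in one of the answers.
    private const string StrictPattern =
        @"</?\w+((\s+\w+(\s*=\s*(?:"".*?""|'.*?'|[^'"">\s]+))?)+\s*|\s*)/?>";

    // Simple pattern: removes anything between '<' and the next '>'.
    public static string StripTags(string input)
    {
        return Regex.Replace(input, @"<[^>]+>", string.Empty);
    }

    // Only removes text shaped like a tag name with optional attributes.
    public static string StripTagsStrict(string input)
    {
        return Regex.Replace(input, StrictPattern, string.Empty);
    }

    // Strip first, then decode entities such as &lt; and &amp;.
    // Decoding first would reintroduce '<' and '>' and invite mangling.
    public static string StripTagsAndDecode(string input)
    {
        return WebUtility.HtmlDecode(StripTags(input));
    }
}
```

With this sketch, `StripTags("x<6&&y>8")` returns `"x8"`, which is the mangling described in the comments. `StripTagsStrict` happens to leave that particular input alone, because `6&&y` does not look like a tag name followed by attributes, but that is no guarantee for other inputs. For entity-escaped text such as `"<pre>x &lt; 6 &amp;&amp; y &gt; 8</pre>"`, `StripTagsAndDecode` returns `"x < 6 && y > 8"`.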