C#: GetSafeHtmlFragment removes all HTML tags (C#, AntiXssLibrary)


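I am using GetSafeHtmlFragment on my website and found that it removes every tag except two.

I researched the problem and found that Microsoft has not provided a fix.

Is there a replacement or a workaround?

Thanks.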

Another solution is to use the Html Agility Pack in combination with your own tag whitelist:

using System;
using System.IO;
using System.Text;
using System.Linq;
using System.Collections.Generic;
using HtmlAgilityPack;

class Program
{
    static void Main(string[] args)
    {
        // Names of the nodes that are allowed to stay in the document.
        var whiteList = new[] 
            { 
                "#comment", "html", "head", 
                "title", "body", "img", "p",
                "a"
            };
        var html = File.ReadAllText("input.html");
        var doc = new HtmlDocument();
        doc.LoadHtml(html);
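        // Collect the offending nodes first and remove them afterwards,
        // so the tree is not modified while it is being enumerated.
        var nodesToRemove = new List<HtmlAgilityPack.HtmlNode>();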
        var e = doc
            .CreateNavigator()
            .SelectDescendants(System.Xml.XPath.XPathNodeType.All, false)
            .GetEnumerator();
        while (e.MoveNext())
        {
            var node =
                ((HtmlAgilityPack.HtmlNodeNavigator)e.Current)
                .CurrentNode;
            if (!whiteList.Contains(node.Name))
            {
                nodesToRemove.Add(node);
            }
        }
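        // Remove every non-whitelisted node together with its children.
        nodesToRemove.ForEach(node => node.Remove());
        // Serialize the filtered document back to a string.
        var sb = new StringBuilder();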
        using (var w = new StringWriter(sb))
        {
            doc.Save(w);
        }
        Console.WriteLine(sb.ToString());
    }
}
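
The program above drops each non-whitelisted node along with everything inside it, and it leaves the attributes of the surviving tags untouched. If you would rather keep the inner content and also filter attributes, something along the following lines might work. This is only a sketch: the WhitelistSketch class, the Filter method and both allow-lists are names and values I made up for illustration, and it is not a complete XSS defense (for instance, it does not inspect href or src values for javascript: URLs).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

// Hypothetical helper, not part of HtmlAgilityPack or the AntiXSS library.
static class WhitelistSketch
{
    // Illustrative allow-lists; adjust them to your own needs.
    static readonly HashSet<string> AllowedTags =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase) { "p", "a", "img" };

    static readonly HashSet<string> AllowedAttributes =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase) { "href", "src", "alt" };

    public static string Filter(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // Snapshot the element nodes, because the tree changes while we work.
        var elements = doc.DocumentNode.Descendants()
            .Where(n => n.NodeType == HtmlNodeType.Element)
            .ToList();

        // Descendants() yields document order, so reversing the list lets us
        // handle nested elements before the elements that contain them.
        elements.Reverse();

        foreach (var node in elements)
        {
            if (!AllowedTags.Contains(node.Name))
            {
                // Drop the tag itself but keep its children in place.
                node.ParentNode.RemoveChild(node, true);
                continue;
            }

            // Strip every attribute that is not explicitly allowed.
            foreach (var attr in node.Attributes.ToList())
            {
                if (!AllowedAttributes.Contains(attr.Name))
                {
                    attr.Remove();
                }
            }
        }

        return doc.DocumentNode.OuterHtml;
    }
}
```

You would call it as WhitelistSketch.Filter(File.ReadAllText("input.html")). Note that unwrapping a script or style element this way leaves its body behind as plain text, so you may prefer to remove those two outright.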


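Amazingly, in version 4.2.1 Microsoft badly overcompensated for a security hole in version 4.2 of the XSS library, and a year later there is still no update. As I saw someone comment somewhere, the GetSafeHtmlFragment method should have been renamed to StripHtml.

In the end I used HtmlSanitizer, which had been suggested elsewhere. I like that it is available as a package through NuGet.

This library basically implements a variation of the whitelist approach used by the now-accepted answer. However, it is based on CsQuery rather than the Html Agility Pack. The package also offers some additional options, such as the ability to keep style information (for example, HTML attributes). Using this library resulted in code like the following in my project, which is at least a lot less code than the accepted answer :)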

using Html;
...
var sanitizer = new HtmlSanitizer();
sanitizer.AllowedTags = new List<string> { "p", "ul", "li", "ol", "br" };
string sanitizedHtml = sanitizer.Sanitize(htmlString);

What is the expected behavior? I have some image tags that were removed, and I want to display them.