How to comment out all script tags in an HTML document using HTML Agility Pack

Developer (https://www.devze.com), 2023-03-19 02:42. Source: web
I would like to comment out all script tags in an HtmlDocument. That way, when I render the document, the scripts are not executed, but we can still see what was there. Unfortunately, my current approach is failing:

foreach (var scriptTag in htmlDocument.DocumentNode.SelectNodes("//script"))
{
    var commentedScript = new HtmlNode(HtmlNodeType.Comment, htmlDocument, 0) { InnerHtml = scriptTag.ToString() };
    scriptTag.ParentNode.AppendChild(commentedScript);
    scriptTag.Remove();
}

Note that I could do this with string replacement on the raw HTML, but I do not think it would be as robust:

domHtml = domHtml.Replace("<script", "<!-- <script");
domHtml = domHtml.Replace("</script>", "</script> -->");


Try this:

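foreach (var scriptTag in htmlDocument.DocumentNode.SelectNodes("//script"))
{
    // Wrap the script's full markup (tags included) in an HTML comment node.
    var commentedScript = HtmlTextNode.CreateNode(string.Format("<!--{0}-->", scriptTag.OuterHtml));
    // Swap the comment in at the script's original position in its parent.
    scriptTag.ParentNode.ReplaceChild(commentedScript, scriptTag);
}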


Refer to this SO post, which has a very clean solution using the LINQ query support of the HTML Agility Pack: "htmlagilitypack - remove script and style?"
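
For illustration, here is a rough sketch of how the LINQ style could be combined with the comment-wrapping idea from the answer above. It is not taken from the linked post. The class name ScriptCommenter and the method CommentOutScripts are hypothetical, and the HtmlAgilityPack NuGet package is assumed to be referenced.

```csharp
using System.Linq;
using HtmlAgilityPack;

// Hypothetical helper, for illustration only.
public static class ScriptCommenter
{
    public static string CommentOutScripts(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // Descendants() is evaluated lazily, so take a snapshot with ToList()
        // before changing the tree. Unlike SelectNodes, it yields an empty
        // sequence (not null) when the document has no script tags.
        var scripts = doc.DocumentNode.Descendants("script").ToList();

        foreach (var script in scripts)
        {
            // Same technique as the answer above: build a comment node that
            // contains the script's full markup, then swap it in place.
            var comment = HtmlNode.CreateNode("<!--" + script.OuterHtml + "-->");
            script.ParentNode.ReplaceChild(comment, script);
        }

        return doc.DocumentNode.OuterHtml;
    }
}
```

One caveat, as far as I can tell: an HTML comment ends at the first "-->", so a script whose body contains that sequence would probably need extra handling before being wrapped this way.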
