I currently use the following code to sanitize a string before storing it:
ERB::Util::h(string)
My problem occurs when the string has been sanitized already like this:
string = "Watching baseball &amp; football"
The sanitized string will look like:
sanitized_string = "Watching baseball &amp;amp; football"
Can I sanitize by just turning < into &lt; and > into &gt; via substitution?
Unescape first, then escape again:
require 'cgi'
string = "Watching baseball &amp; football"
CGI.escapeHTML(CGI.unescapeHTML(string))
=> "Watching baseball &amp; football"
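To make the round trip easier to reuse, here is a minimal sketch that wraps the two calls in a helper and checks that applying it twice does not stack entities. The name `escape_once` is hypothetical and not part of CGI. One caveat: a user who literally typed `&amp;` will have it treated as an already-escaped ampersand.

```ruby
require 'cgi'

# Hypothetical helper: unescape first so existing entities are not escaped twice.
def escape_once(value)
  CGI.escapeHTML(CGI.unescapeHTML(value.to_s))
end

raw     = "Watching baseball & football <b>"
escaped = escape_once(raw)

puts escaped                          # Watching baseball &amp; football &lt;b&gt;
puts escape_once(escaped) == escaped  # true: a second pass changes nothing
```

If the app already loads ActiveSupport, `ERB::Util.html_escape_once` is worth a look as well, since it aims at the same problem.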
A fast approach, based on this snippet from Erubis:
ESCAPE_TABLE = { '<'=>'&lt;', '>'=>'&gt;' }
def custom_h(value)
   value.to_s.gsub(/[<>]/) { |s| ESCAPE_TABLE[s] }
end
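As a quick check of what this kind of substitution does and does not touch, here is a small self-contained variant. The names `ANGLE_ESCAPES` and `escape_angles` are made up for illustration, and it uses the hash form of `String#gsub` instead of a block. Note that leaving `&` and quotes alone is only reasonable for text placed between tags, not inside attribute values.

```ruby
# Hypothetical names; same idea as custom_h, using gsub's hash argument.
ANGLE_ESCAPES = { '<' => '&lt;', '>' => '&gt;' }.freeze

def escape_angles(value)
  # Each matched character is looked up in the hash and replaced.
  value.to_s.gsub(/[<>]/, ANGLE_ESCAPES)
end

puts escape_angles("Watching baseball &amp; football")  # Watching baseball &amp; football
puts escape_angles("<script>alert(1)</script>")         # &lt;script&gt;alert(1)&lt;/script&gt;
```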
Yes, you can. Taking it further, you can delete entire tags with a basic regex like this:
mystring.gsub( /<(.|\n)*?>/, '' )
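A short sketch of how that regex behaves, wrapped in a hypothetical `strip_tags_naive` helper. The second call shows one of the edge cases: anything between a `<` and the next `>` is removed, even when it is not a tag.

```ruby
# Hypothetical helper around the regex above.
def strip_tags_naive(value)
  value.to_s.gsub(/<(.|\n)*?>/, '')
end

puts strip_tags_naive("Watching <b>baseball</b> &amp; football")  # Watching baseball &amp; football
puts strip_tags_naive("3 < 5 and 7 > 2")                          # "3  2": the comparison is eaten
```

An unclosed `<` with no later `>` is left in place, so this is not a substitute for escaping.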
You could write your own sanitizer, but there are lots of corner cases and tricky edges in sanitization.
A better approach might be to decode your string before sanitizing it. Does h() have an inverse you could put your strings through first?