I am parsing request parameters to detect any dangerous characters and prevent XSS attacks. Our web application supports French and German in addition to English. I am using the following regular expression to do this, but it fails to handle French and German characters:
^[a-zA-Z0-9\r\n\\-=\\*\\.\\?;,+\\/:&_ %@#]*$
Any suggestions would be highly appreciated.
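\p{L} will match any Unicode character that is a letter, in any script (not only Latin).
Try [\p{Latin}\p{Punctuation}\p{Math_Symbol}] or add more character classes. The exact property names vary between regex engines, so check your engine's documentation for the other Unicode character classes it supports.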
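As a rough illustration of the letter-class idea in Python: the standard `re` module does not support `\p{...}`, so this sketch uses `unicodedata.category` instead. The function name `is_allowed` and the constant `ALLOWED_PUNCTUATION` are made up for the example, and the punctuation set is assumed to be the one intended by the pattern in the question.

```python
import unicodedata

# Punctuation and whitespace copied from the question's pattern
# (assumption: this is the allow-list the asker intended).
ALLOWED_PUNCTUATION = set("\r\n-=*.?;,+/:&_ %@#")


def is_allowed(value: str) -> bool:
    """Return True if every character is a Unicode letter,
    a decimal digit, or one of the allowed punctuation characters."""
    for ch in value:
        category = unicodedata.category(ch)
        # Categories starting with "L" are letters (roughly \p{L});
        # "Nd" is a decimal digit in any script.
        if category.startswith("L") or category == "Nd":
            continue
        if ch in ALLOWED_PUNCTUATION:
            continue
        return False
    return True
```

With this check, `is_allowed("Crème brûlée")` and `is_allowed("Straße über")` pass, while `is_allowed("<script>")` is rejected because `<` and `>` are neither letters nor in the allowed set. Note that, like `\p{L}`, it accepts letters from every script, not only French and German ones.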
I know this is an old question, but I hope this helps someone out there! You can try this regex:
([\u0020-\u007e\u00a0-\u00ff\u0100-\u017F]+)
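Basically, it should match all the Latin and extended Latin characters, including numbers: the three ranges cover printable ASCII (U+0020 to U+007E), the Latin-1 Supplement (U+00A0 to U+00FF), and Latin Extended-A (U+0100 to U+017F). Feel free to remove Unicode characters as necessary. Keep in mind that the ASCII range also includes characters such as < and >, so narrow it if those must be rejected. With that adjustment, I would say this is a reliable way of covering French and German input.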
References:
- http://unicode.org/charts/PDF/U0000.pdf
- http://unicode.org/charts/PDF/U0080.pdf
- http://unicode.org/charts/PDF/U0100.pdf
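To try out the range-based pattern, here is a minimal Python sketch. The names `LATIN_PATTERN` and `is_latin_text` are hypothetical, and the two adjacent ranges U+00A0 to U+00FF and U+0100 to U+017F are merged into one, which matches the same characters.

```python
import re

# Printable ASCII, Latin-1 Supplement, and Latin Extended-A.
LATIN_PATTERN = re.compile(r"[\u0020-\u007e\u00a0-\u017f]+")


def is_latin_text(value: str) -> bool:
    """Return True if the whole string consists of characters
    from the ranges above (and is not empty)."""
    return LATIN_PATTERN.fullmatch(value) is not None
```

Here `is_latin_text("Crème brûlée")` and `is_latin_text("Straße")` succeed and Cyrillic text is rejected, but `is_latin_text("<b>")` also succeeds, because `<` and `>` fall inside the ASCII range. Line breaks (`\r`, `\n`) are not in any of the ranges, so multi-line values would be rejected unless they are added.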