I send a string from my HTML page to my Java HttpServlet. In the request, the Chinese characters arrive as what look like ASCII codes:
"& #21487;& #20197;& #21578;& #35785;& #25105;" (without the spaces)
How can I transform this string into Unicode?
HTML code:
<html>
<head>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
    <title>Find information</title>
    <link rel="stylesheet" type="text/css" href="layout.css">
</head>
<body>
<form id="lookupform" name="lookupform" action="LookupServlet" method="post" accept-charset="UTF-8">
    <table id="lookuptable" align="center">
        <tr>
            <label>Question:</label>
            <td><textarea cols="30" rows="2" name="lookupstring" id="lookupstring"></textarea></td>
        </tr>
    </table>
    <input type="submit" name="Look up" id="lookup" value="Look up"/>
</form>
</body>
</html>
Java code:
request.setCharacterEncoding("UTF-8");
javax.servlet.http.HttpSession session = request.getSession();
LoginResult lr = (LoginResult) session.getAttribute("loginResult");
String[] question = request.getParameterValues("lookupstring");
If I print question[0] then I get this value: "& #21487;& #20197;& #21578;& #35785;& #25105;"
There is no such thing as ASCII codes that display Chinese characters. ASCII does not represent Chinese characters.
If you already have a Java string, it already represents every character (Latin, Chinese, and so on) as Unicode internally. What you can do is encode that string into bytes, using UTF-8 or UTF-16:
String s = "可以告诉我"; // EDIT: this line won't display correctly on systems without fonts for Chinese characters
// The same string written with Unicode escapes:
// String s = "\u53ef\u4ee5\u544a\u8bc9\u6211";
byte[] utfString = s.getBytes("UTF-8"); // getBytes returns a byte array, not a single byte
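To make the distinction between characters and encoded bytes concrete, here is a minimal sketch that encodes the same five-character string both ways and compares the sizes. The class name EncodeDemo and its helper methods are made up for illustration; only String.getBytes and StandardCharsets come from the JDK.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: encodes a Java string into UTF-8 and UTF-16 bytes.
final class EncodeDemo {

    // Encode the string as UTF-8 (no checked exception with StandardCharsets).
    static byte[] toUtf8(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    // Encode the string as big-endian UTF-16 without a byte order mark.
    static byte[] toUtf16BE(String text) {
        return text.getBytes(StandardCharsets.UTF_16BE);
    }

    public static void main(String[] args) {
        String s = "\u53ef\u4ee5\u544a\u8bc9\u6211"; // the five characters from the question

        System.out.println(s.length());          // 5 chars in the string
        System.out.println(toUtf8(s).length);    // 15 bytes: 3 bytes per character
        System.out.println(toUtf16BE(s).length); // 10 bytes: 2 bytes per character
    }
}
```

The string itself never changes; only the byte representation you ask for does.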
Now that I look at your updated question, you might be looking for the StringEscapeUtils class from Apache Commons Text, which unescapes HTML entities into a Java string. In Commons Text (and Commons Lang 3) the method is called unescapeHtml4; in the older Commons Lang 2.x it was unescapeHtml:
String s = StringEscapeUtils.unescapeHtml4("& #21487;& #20197;& #21578;& #35785;& #25105;"); // without the spaces
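If adding a library is not an option, decimal references like the ones in the question can also be decoded with the JDK alone. The following is only a sketch under that assumption: the class NumericEntityDecoder and its decode method are hypothetical names, and it handles decimal references of the form & #NNNN; (without the space) only, not hexadecimal references or named entities such as & amp;.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: decodes decimal numeric character references only.
final class NumericEntityDecoder {

    // Matches "&#" followed by 1 to 7 decimal digits and a semicolon.
    private static final Pattern DECIMAL_REF = Pattern.compile("&#(\\d{1,7});");

    static String decode(String input) {
        Matcher m = DECIMAL_REF.matcher(input);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (m.find()) {
            // Copy the text between the previous match and this one unchanged.
            out.append(input, last, m.start());
            int codePoint = Integer.parseInt(m.group(1));
            if (Character.isValidCodePoint(codePoint)) {
                out.appendCodePoint(codePoint);
            } else {
                // Leave references outside the Unicode range as they were.
                out.append(m.group());
            }
            last = m.end();
        }
        // Copy whatever follows the last match.
        out.append(input, last, input.length());
        return out.toString();
    }

    public static void main(String[] args) {
        String raw = "&#21487;&#20197;&#21578;&#35785;&#25105;";
        String decoded = decode(raw);
        System.out.println(decoded.equals("\u53ef\u4ee5\u544a\u8bc9\u6211")); // true
    }
}
```

In the servlet this would be called as NumericEntityDecoder.decode(question[0]). For anything beyond decimal references, the library method is the more robust choice.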
A Java String contains Unicode characters. Any decoding from bytes has already taken place by the time the string was constructed.
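As a small illustration of that point, the sketch below builds strings from the same three bytes with two different charsets. The class name DecodeDemo and its helper are made up for this example; the byte values are the UTF-8 encoding of the first character in the question (U+53EF).

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: decoding happens when the String is constructed.
final class DecodeDemo {

    // Interpret the raw bytes as UTF-8 and build a String from them.
    static String decodeUtf8(byte[] raw) {
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // UTF-8 encoding of U+53EF, the first character of the question's text.
        byte[] raw = {(byte) 0xE5, (byte) 0x8F, (byte) 0xAF};

        String right = decodeUtf8(raw);
        String wrong = new String(raw, StandardCharsets.ISO_8859_1);

        System.out.println(right.equals("\u53ef")); // true: one Chinese character
        System.out.println(wrong.length());         // 3: each byte became its own char
    }
}
```

Once the wrong charset has been applied at construction time, the resulting String holds the wrong characters, which is why the charset has to be right at the point where bytes become a String.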