I have a file with the following content, in which some characters are UTF-8 hex-encoded inside a string literal:
<root>
<element type=\"1\">\"Hello W\xC3\x96rld\"</element>
</root>
I want to read the file, decode the UTF-8 hex-encoded characters into the actual Unicode characters they represent, and write the result to a new file. Given the above content, the new file should look like this when opened in a text editor with UTF-8 encoding:
<root>
<element type=\"1\">\"Hello WÖrld\"</element>
</root>
Notice the double quotes are still escaped and the UTF-8 hex encoded \xC3\x96 has now become Ö (U+00D6 LATIN CAPITAL LETTER O WITH DIAERESIS).
I have got code that is partially working, as follows:
#! /usr/bin/perl -w
use strict;
use Encode::Escape;
while (<>)
{
    # STDOUT is redirected to a new file.
    print decode 'unicode-escape', $_;
}
The problem, however, is that decode 'unicode-escape', $_ also decodes all the other escape sequences, such as \".  So in the end, I get the following:
<root>
<element type="1">"Hello WÖrld"</element>
</root>
I have tried reading the file in UTF-8 encoding and/or using Unicode::Escape::unescape, such as
open(my $UNICODESFILE, "<:encoding(UTF-8)", shift(@ARGV));
Unicode::Escape::unescape($line);
but neither of them decodes the \xhh escape sequences.
Basically, all I want is the behavior of decode 'unicode-escape', $_, but restricted to \xhh escape sequences, leaving every other escape sequence untouched.
Is this possible?  Is using decode 'unicode-escape', $_ appropriate for this case?  Any other way?  Thanks!
Find runs of consecutive \xNN escapes and decode only those, leaving everything else in the line alone, I guess:
s{((?:\\x[0-9A-Fa-f]{2})+)}{decode 'unicode-escape', $1}ge
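If pulling in Encode::Escape is not desirable, the same idea can be sketched with core Perl only. This version assumes the file is read and written as raw bytes, so each \xHH escape is simply replaced by the byte it names; the bytes C3 96 then show up as Ö when the output is opened as UTF-8. The helper name decode_hex_escapes is made up for this example, and the sketch does not check that the resulting bytes are valid UTF-8, nor does it treat an escaped backslash before x (as in \\x41) specially.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the question): replace each \xHH escape
# with the single raw byte it names. Other escapes such as \" are not
# matched by the pattern, so they pass through unchanged.
sub decode_hex_escapes {
    my ($line) = @_;
    $line =~ s/\\x([0-9A-Fa-f]{2})/chr(hex($1))/ge;
    return $line;
}

# Read the files named on the command line (or STDIN) as raw bytes and
# write raw bytes; redirect STDOUT to the new file.
binmode STDOUT;
while (my $line = <>) {
    print decode_hex_escapes($line);
}
```

It would be run the same way as the original script, with STDOUT redirected to the new file. Working on bytes means any UTF-8 text already present in the input is copied through untouched rather than being re-encoded.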