On a request URL, I can get the query string ?dir=Documents%20partag%C3%A9s or ?dir=Documents%20partag%E9s. I think the first one is UTF-8 and the second is ASCII.
The real string is: Documents partagés
I have a PHP script (saved as UTF-8), and I want to detect whether the query string is ASCII or UTF-8 and, if it is ASCII, convert it to UTF-8.
I tried the mb_ functions, but the raw query string is always detected as ASCII and the urldecoded version of the query string as UTF-8.
How can I achieve this? Note that Wikipedia does something similar: it re-encodes %E9 as %C3%A9 itself.
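E9 is 233 in decimal. It is not a valid ASCII byte (ASCII covers 0-127 only), but it is é in ISO-8859-1 (Latin-1), so your second query string is Latin-1 rather than ASCII. mb_convert_encoding accepts a list of candidate source encodings as its third argument (e.g. UTF-8 and ISO-8859-1) and detects which one fits the input before converting.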
This should fix it:
mb_convert_encoding($str, 'UTF-8', 'UTF-8,ISO-8859-1');
With the following script:
$str1 = 'Documents%20partag%E9s';
$str2 = 'Documents%20partag%C3%A9s';
var_dump(mb_convert_encoding(urldecode($str1), 'UTF-8', 'UTF-8,ISO-8859-1'));
var_dump(mb_convert_encoding(urldecode($str2), 'UTF-8', 'UTF-8,ISO-8859-1'));
I get:
string(19) "Documents partagés"
string(19) "Documents partagés"
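If you would rather not depend on how the candidate list is detected, here is a minimal sketch of the same idea with an explicit check. It assumes the mbstring extension is loaded and that any value that is not valid UTF-8 is ISO-8859-1; the helper name query_value_to_utf8 is hypothetical, not a built-in function.

```php
<?php
// Hypothetical helper (not part of PHP or mbstring): normalise a raw,
// still percent-encoded query value to UTF-8.
function query_value_to_utf8(string $raw): string
{
    $decoded = urldecode($raw);

    // Valid UTF-8 (which includes plain ASCII) is returned unchanged.
    if (mb_check_encoding($decoded, 'UTF-8')) {
        return $decoded;
    }

    // Assumption: anything that is not valid UTF-8 is ISO-8859-1.
    return mb_convert_encoding($decoded, 'UTF-8', 'ISO-8859-1');
}

var_dump(query_value_to_utf8('Documents%20partag%E9s'));
var_dump(query_value_to_utf8('Documents%20partag%C3%A9s'));
```

Both calls should give the same 19-byte UTF-8 string. Keep in mind that values in $_GET have already been URL-decoded by PHP, so for those you would skip the urldecode() step and apply only the check and the conversion.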