
to unicode or not to unicode

Developer https://www.devze.com 2023-04-09 07:40 Source: Internet

I'm getting a value from the registry. This value might have double-byte characters in it. I will later have to transfer it across the network to a C# client for display. C# is all Unicode. The registry function returns MBCS if you call the non-Unicode version.

What should I use?

string result(cbData, '\0');
RegQueryValueExA(h_sub_key, "DisplayName", NULL, NULL, (LPBYTE) &result[0], &cbData);

or

string result(cbData, '\0');
RegQueryValueExW(h_sub_key, L"DisplayName", NULL, NULL, (LPBYTE) &result[0], &cbData);


Using Unicode whenever possible will make your life easier. The registry stores strings as Unicode natively and converts them to MBCS on the fly when you use RegQueryValueExA. Why would you want to do an unneeded conversion?

Converting to UTF-8 from UTF-16 might make sense for information going over the network, but if you control both ends of the connection it wouldn't be necessary.
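
If UTF-8 is chosen for the wire, the conversion could look roughly like the sketch below. It is plain portable C++ so the logic is visible; the function name Utf16ToUtf8 and the choice to replace unpaired surrogates with U+FFFD are assumptions made for illustration. On Windows the usual route is WideCharToMultiByte with CP_UTF8, and the contents of a std::wstring there are the same UTF-16 code units this function takes as a std::u16string.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: encodes UTF-16 code units as UTF-8 bytes.
// Unpaired surrogates are replaced with U+FFFD (an assumption of this sketch).
std::string Utf16ToUtf8(const std::u16string& in) {
    std::string out;
    out.reserve(in.size() * 3);  // upper bound: 3 bytes per UTF-16 code unit
    for (std::size_t i = 0; i < in.size(); ++i) {
        std::uint32_t cp = in[i];
        if (cp >= 0xD800 && cp <= 0xDBFF) {
            // High surrogate: must be followed by a low surrogate.
            if (i + 1 < in.size() && in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF) {
                cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
                ++i;  // consume the low surrogate as well
            } else {
                cp = 0xFFFD;
            }
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            cp = 0xFFFD;  // low surrogate with no preceding high surrogate
        }

        if (cp < 0x80) {
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}
```

On the C# side, bytes produced this way should be readable with Encoding.UTF8.GetString, provided both ends agree on how the string length is framed.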


No, that's not the way it works. The string you get back from the first snippet is encoded according to the current system code page. It could be a double-byte encoding; it could be anything. The big problem, of course, is that the C# code on the other end of that connection has no way to guess what the code page might be.

So do not use the first snippet. The second one gets you the string in UTF-16, the native encoding used in Windows, but result needs to be a std::wstring. UTF-16 is also the encoding used by C#, so you could send the binary string as is. That's not typically a good idea, though; a text format such as XML is the popular choice. It is up to you to set the wire format.
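
A minimal sketch of the wide-character read, assuming the value is a REG_SZ or REG_EXPAND_SZ string. The names ReadRegistryString and StripTrailingNulls are made up for illustration, and the Windows-specific part is wrapped in #ifdef _WIN32 only so the snippet compiles elsewhere too. Note that cbData counts bytes, not characters, and that the stored value may or may not include a terminating null.

```cpp
#include <cstddef>
#include <string>
#include <utility>

#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper: drops any terminating nulls stored with the value.
void StripTrailingNulls(std::wstring& s) {
    while (!s.empty() && s.back() == L'\0') {
        s.pop_back();
    }
}

#ifdef _WIN32
// Hypothetical helper: reads a string value as UTF-16 into a std::wstring.
// Returns false if the value is missing, is not a string, or changed size
// between the two calls.
bool ReadRegistryString(HKEY key, const wchar_t* name, std::wstring& out) {
    DWORD type = 0;
    DWORD cbData = 0;  // size in BYTES, not characters

    // First call: no buffer, just ask how many bytes are needed.
    if (RegQueryValueExW(key, name, nullptr, &type, nullptr, &cbData) != ERROR_SUCCESS) {
        return false;
    }
    if (type != REG_SZ && type != REG_EXPAND_SZ) {
        return false;
    }

    // Round up so an odd byte count still fits in whole wchar_t units.
    std::wstring buf((cbData + sizeof(wchar_t) - 1) / sizeof(wchar_t), L'\0');
    if (buf.empty()) {
        out.clear();
        return true;
    }

    // Second call: fetch the data into the wide buffer.
    cbData = static_cast<DWORD>(buf.size() * sizeof(wchar_t));
    if (RegQueryValueExW(key, name, nullptr, &type,
                         reinterpret_cast<LPBYTE>(&buf[0]), &cbData) != ERROR_SUCCESS) {
        return false;
    }

    buf.resize(cbData / sizeof(wchar_t));
    StripTrailingNulls(buf);
    out = std::move(buf);
    return true;
}
#endif
```

With h_sub_key from the question, a call would look like ReadRegistryString(h_sub_key, L"DisplayName", result), with result declared as std::wstring.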
