POV-Ray: Newsgroups: povray.off-topic: Lots of statistics: Re: C# WTF list


From: Invisible
Date: 15 Aug 2012 07:08:10
Message: <502b831a$1@news.povray.org>

>>> - The "char" type works with Unicode. Well done. Oh, but wait... It only
>>> stores 16 bits, and yet Unicode actually requires 21 bits to represent a
>>> single code-point. So this "Unicode character" only actually covers the
>>> Basic Multilingual Plane. FAIL!
>>
>> Oh great. Apparently "char" doesn't store a code-point at all, it stores
>> a code-unit.
>>
>> For anything in the BMP, these are effectively the same thing. For
>> anything outside that range, *you* must manually write the code to
>> decode UTF-16 into actual code-points (which then do not fit into a
>> "char").
>
> Uh... why does this come as a surprise to you?

I guess I'm used to using a programming language where a Char is... 
well... any valid Unicode code-point, and once you set the encoding of a 
file handle, the library does all necessary encoding and decoding, 
whether it's UTF-8, UTF-16, Latin-1 or whatever.

Still, I suppose it's better than char = unsigned byte. :-P
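
For reference, the manual decoding step complained about above is fairly small. Here is a rough sketch in Python (not C#), working on plain integers so the arithmetic is visible. The function name is made up for illustration, and passing unpaired surrogates through unchanged is an assumption; stricter code would probably reject them.

```python
def decode_utf16_units(units):
    """Turn a list of 16-bit UTF-16 code units into a list of code points."""
    code_points = []
    i = 0
    while i < len(units):
        unit = units[i]
        is_high = 0xD800 <= unit <= 0xDBFF
        has_low = i + 1 < len(units) and 0xDC00 <= units[i + 1] <= 0xDFFF
        if is_high and has_low:
            # Surrogate pair: 10 bits from each half, offset by 0x10000.
            low = units[i + 1]
            code_points.append(0x10000 + ((unit - 0xD800) << 10) + (low - 0xDC00))
            i += 2
        else:
            # BMP value, or an unpaired surrogate passed through as-is
            # (an assumption of this sketch, not a recommendation).
            code_points.append(unit)
            i += 1
    return code_points


# "A" followed by U+1F600, which needs two code units in UTF-16.
units = [0x0041, 0xD83D, 0xDE00]
decoded = decode_utf16_units(units)
```

Three code units go in and two code points come out, and the second one does not fit in 16 bits.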
