  Re: Haskell raving  
From: Alain
Date: 2 Nov 2007 14:32:06
Message: <472b7b36$1@news.povray.org>
Orchid XP v7 enlightened us on 2007/11/01 17:15:
> Warp wrote:
> 
>>   UTF-8 encoding "wastes" some bits (in order to use less bits for the
>> most used western characters) and requires at most 4 bytes per character
>> (even though the characters requiring more than 3 bytes are very rarely
>> used).
> 
> And thus, like any decent variable-length encoding scheme, it tries to 
> assign short codes to common symbols. (Although UTF-8 probably fails 
> horribly for, say, Japanese text. I don't actually know...)
For Japanese and Chinese, it averages around 3 bytes per character. That's not so 
bad after all, as each character in those scripts represents a whole word, and some 
even represent a whole phrase or some complex concept.
A 1000-glyph text in Chinese would be, roughly, a 1000- to 5000-word text in 
English!
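
The 3-bytes-per-character figure is easy to check. Here is a minimal sketch in Python; the helper name `avg_utf8_bytes_per_char` and the sample strings are made up for illustration and are not from the thread. It encodes each sample as UTF-8 and divides the byte count by the number of code points.

```python
# Illustrative sketch: average UTF-8 bytes per character for sample strings.
# The helper name and the samples are assumptions chosen for this example.

def avg_utf8_bytes_per_char(text: str) -> float:
    """Return the mean number of UTF-8 bytes per code point in text."""
    if not text:
        raise ValueError("text must not be empty")
    return len(text.encode("utf-8")) / len(text)

samples = {
    "English": "Hello",
    "Japanese": "日本語のテキスト",  # kanji, hiragana and katakana
    "Chinese": "汉字",
}

for name, text in samples.items():
    print(f"{name}: {avg_utf8_bytes_per_char(text):.2f} bytes per character")
```

Real text that mixes in ASCII digits, spaces or punctuation would come out a little under 3, since those stay at 1 byte each.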
> 
>>> UTF-16 another...  and raw storage the worst idea ever!
>>
>>   Why would raw storage be the worst idea? There are several advantages.
>>   The disadvantage is, of course, an increased memory requirement.
> 
> You win some, you lose some. Programming is all about these kinds of 
> compromises. :-)
Nearly all characters for European languages fit in 1 or 2 bytes, and I 
think that this also includes Arabic and Cyrillic. The Asian glyphs use the bulk 
of the 3- and 4-byte codes.
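
To make the size classes concrete, here is a rough sketch in Python of how the UTF-8 length follows from the code point value. The function name `utf8_length` and the choice of example characters are mine, not from the thread. The boundaries (U+0080, U+0800, U+10000) are the ones UTF-8 defines; the sketch ignores the surrogate range, which cannot be encoded at all.

```python
# Illustrative sketch: UTF-8 length of a single code point, by range.
# The function name and example characters are assumptions for this example.

def utf8_length(code_point: int) -> int:
    """Return how many bytes UTF-8 uses for the given code point."""
    if code_point < 0 or code_point > 0x10FFFF:
        raise ValueError("not a Unicode code point")
    if code_point < 0x80:       # ASCII
        return 1
    if code_point < 0x800:      # Latin supplements, Greek, Cyrillic, Arabic, ...
        return 2
    if code_point < 0x10000:    # rest of the Basic Multilingual Plane, incl. most CJK
        return 3
    return 4                    # supplementary planes

# Latin, accented Latin, Cyrillic, Arabic, euro sign, CJK, CJK Extension B
examples = ["A", "\u00e9", "\u044f", "\u0639", "\u20ac", "\u65e5", "\U00020000"]

for ch in examples:
    # Cross-check the range table against Python's own encoder.
    assert utf8_length(ord(ch)) == len(ch.encode("utf-8"))
    print(f"U+{ord(ch):04X} -> {utf8_length(ord(ch))} byte(s)")
```

Note that the euro sign (U+20AC) lands in the 3-byte range, so "1 or 2 bytes" holds for the letters of European alphabets rather than for every symbol used in European text.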

-- 
Alain
-------------------------------------------------
If you're ever about to be mugged by a couple of clowns, don't hesitate - go for 
the juggler.



Copyright 2003-2023 Persistence of Vision Raytracer Pty. Ltd.