  Re: Haskell raving  
From: Alain
Date: 2 Nov 2007 14:20:20
Message: <472b7874@news.povray.org>
Orchid XP v7 enlightened us on 2007/11/01 15:03:
> Warp wrote:
>> Alain <ele### [at] netscapenet> wrote:
>>> It's about 12 BITS per character, on average, not BYTES! That's 
>>> using UTF8 encoding. About 16 BITS per character if using UTF16 
>>> encoding.
>>> UTF8 is only 7 BITS per character if you stick to the standard 
>>> ASCII character set, but it gets bigger if you also use extended 
>>> ASCII or characters from foreign alphabets.
>>
>>   Except that if each single character is indeed garbage-collected, that
>> requires quite a lot of memory per character (compared to the size of the
>> character).
> 
> I can't actually find documentation to hand to clarify whether it's 12 
> bits or 12 bytes per character. (My strong suspicion is that it *is* 12 
> *bytes* - since, after all, a single Unicode code point is 24 bits 
> already.)
> 
> The situation is actually worse than it looks. All this lazy evaluation 
> magic is implemented by storing program state around the place, so an 
> "unevaluated" string probably takes up more space still...
Using UTF8 encoding, a single character can be 1, 2, 3 or 4 BYTES long. 
Standard ASCII characters are 1 BYTE. Extended ASCII and some more use 2 bytes.
The top BITS of the first byte give the length: 0xxxxxxx = 1 byte code, 
110xxxxx = 2 byte code, 1110xxxx = 3 byte code, 11110xxx = 4 byte code. Every 
byte after the first one looks like 10xxxxxx.
Using UTF16 encoding, most characters are 2 BYTES long, which covers 65536 
code points, not all of them being printable. Anything outside that range takes 
4 BYTES (a surrogate pair).
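
For anyone who wants to check the byte counts, here is a minimal sketch in Python (not from the thread, and unrelated to how Haskell stores strings internally). The function name utf8_sequence_length and the sample characters are my own choices for illustration. It reads the length of a UTF8 sequence from the top bits of its first byte, as defined by the UTF8 standard, and compares encoded sizes in UTF8 and UTF16.

```python
def utf8_sequence_length(lead_byte: int) -> int:
    """Return the length in bytes of a UTF-8 sequence, given its first byte.

    Hypothetical helper for illustration; the bit patterns are those of
    the UTF-8 standard.
    """
    if (lead_byte & 0b10000000) == 0b00000000:  # 0xxxxxxx: plain ASCII
        return 1
    if (lead_byte & 0b11100000) == 0b11000000:  # 110xxxxx
        return 2
    if (lead_byte & 0b11110000) == 0b11100000:  # 1110xxxx
        return 3
    if (lead_byte & 0b11111000) == 0b11110000:  # 11110xxx
        return 4
    # 10xxxxxx is a continuation byte; anything else is not a valid lead byte
    raise ValueError(f"not a valid UTF-8 lead byte: {lead_byte:#04x}")


# Sample characters: 'A', e-acute, the euro sign, and a musical G clef
samples = ["A", "\u00e9", "\u20ac", "\U0001d11e"]

for ch in samples:
    utf8 = ch.encode("utf-8")
    utf16 = ch.encode("utf-16-le")  # little-endian, no byte-order mark
    # The length read from the lead byte must match the real encoded length
    assert utf8_sequence_length(utf8[0]) == len(utf8)
    print(f"U+{ord(ch):04X}: UTF-8 = {len(utf8)} bytes, UTF-16 = {len(utf16)} bytes")
```

The four samples take 1, 2, 3 and 4 bytes in UTF8, and 2, 2, 2 and 4 bytes in UTF16, so the last one shows that UTF16 is not always 2 bytes per character.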

-- 
Alain
-------------------------------------------------
Error in operator: add beer

