On 01.11.2007 20:00, Orchid XP v7 wrote:
> No - it's Unicode. 24 bits per character. ;-)
Unicode is just a bunch of tables of glyphs (lots of tables, lots of
glyphs).
Code points outgrew 16 bits a while ago; the range now runs up to U+10FFFF, which needs 21 bits.
The real question is how you encode all of these.
UTF-8 is one way (the popular one these days),
UTF-16 is another... and raw storage is the worst idea ever!
UTF-8 uses 8 bits for the ASCII range, 16 for most French accented
letters, and usually up to 24 bits (3 x 8) for classical Japanese...
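
A quick way to check those sizes yourself. This is only a sketch in Python, not anything from the Haskell libraries discussed in this thread; the helper name `utf8_bits` and the three sample characters are my own picks for illustration.

```python
# Sample characters, written as escapes so the source file's own
# encoding cannot change what is being measured.
samples = {
    "ASCII letter": "a",                 # U+0061
    "French accented letter": "\u00e9",  # U+00E9, e with acute accent
    "Japanese kanji": "\u65e5",          # U+65E5
}


def utf8_bits(ch: str) -> int:
    """Return the number of bits UTF-8 uses to encode the given text."""
    return 8 * len(ch.encode("utf-8"))


for label, ch in samples.items():
    print(f"{label}: U+{ord(ch):04X} takes {utf8_bits(ch)} bits in UTF-8")
```

The accented letter is the precomposed code point here; a decomposed form (base letter plus combining accent) would come out larger.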
>
> But yes, the standard Haskell string type is geared to flexibility, not
> performance. See my ByteString comments...
Fixed-size Unicode... if only they would stop adding more tables!
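
For a feel of what fixed size costs, here is a rough Python sketch comparing the same text under UTF-8, UTF-16 and UTF-32 (the fixed-width one). The function name `encoded_sizes` is made up for this post, and the little-endian codec names are used only so that no byte order mark is counted.

```python
def encoded_sizes(text: str) -> dict:
    """Byte counts for the same text under three encodings (no BOM)."""
    return {
        "utf-8": len(text.encode("utf-8")),
        "utf-16": len(text.encode("utf-16-le")),
        "utf-32": len(text.encode("utf-32-le")),
    }


# Plain ASCII text is where a fixed four bytes per character hurts most.
print(encoded_sizes("hello"))
```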
--
The superior man understands what is right;
the inferior man understands what will sell.
-- Confucius