Warp enlightened us on 2007/11/02 17:10:
> Alain <ele### [at] netscapenet> wrote:
>>> And thus, like any decent variable-length encoding scheme, it tries to
>>> assign short codes to common symbols. (Although UTF-8 probably fails
>>> horribly for, say, Japanese text. I don't actually know...)
>> For Japanese and Chinese, it averages around 3 bytes per character. That's
>> not so bad after all, as each character in those scripts represents a whole
>> word, and some even represent a whole phrase or a complex concept.
>
> UTF-16 is better because it uses 2 bytes for the vast majority of the most
> commonly used kanji and other symbols used in Japanese.
>
But then you have no room left for the Chinese ones, and you still need
Vietnamese, Korean, Hindi, Sanskrit, Latin, Cyrillic, Arabic, Inuktitut, math
symbols,...
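
For anyone who wants to check the byte counts discussed above, here is a rough sketch in Python. The function name encoded_sizes and the sample strings are my own, picked only for illustration; it simply encodes a string both ways and counts bytes, so the figures depend entirely on the text you feed it.

```python
def encoded_sizes(text):
    """Return the character count and the encoded size in bytes for UTF-8 and UTF-16."""
    return {
        "chars": len(text),
        "utf8": len(text.encode("utf-8")),
        # "utf-16-le" is used so that no byte-order mark is added to the count.
        "utf16": len(text.encode("utf-16-le")),
    }

# Hypothetical sample strings, chosen only to illustrate the three cases.
samples = {
    "japanese": "日本語のテキスト",  # kanji and kana, all in the Basic Multilingual Plane
    "english": "plain text",         # ASCII only
    "rare_kanji": "\U0002000B",      # one CJK character outside the BMP
}

for name, text in samples.items():
    print(name, encoded_sizes(text))
```

With these samples, the Japanese string takes 3 bytes per character in UTF-8 and 2 in UTF-16, the ASCII string takes 1 and 2, and the character outside the Basic Multilingual Plane takes 4 bytes in both, since UTF-16 has to fall back on a surrogate pair.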
--
Alain
-------------------------------------------------
Make yourself a better person and know who you are before you try and know
someone else and expect them to know you.