On 19-01-04 at 21:04, clipka wrote:
> On 04.01.2019 at 19:18, Alain wrote:
>
>> Will it be possible to directly use UTF-8 characters ?
>> After all, if you can directly enter characters like à é è ô ç (direct
>> access) or easily like €(altchar+e) ñ(altchar+ç,n) from your keyboard
>> as I just did, you should be able to use them instead of the
>> cumbersome codes.
>
> Short answer: The `\uXXXX` notation won't be necessary. I just used it
> to avoid non-ASCII characters in my post.
>
>
> Looooong answer:
>
>
> It depends on what you're talking about.
>
> First, let's get an elephant - or should I say mammoth - out of the
> room: The editor component of the Windows GUI. It's old and crappy, and
> doesn't support UTF-8 at all. It does support Windows-1252 though (at
> least on my system; I guess it may depend on what locale you have
> configured in Windows), which has all the characters you mentioned.
>
>
> Now if you are using a different editor, using verbatim "UTF-8
> characters" should be no problem: Enter the characters, save the file as
> UTF-8, done.
>
> The characters will be encoded directly as UTF-8, and the parser will
> work with them just fine (provided you're only using them in string
> literals or comments); no need for `\uXXXX` notation.
>
>
> Alternatively, you could enter the same characters in the same editor,
> and save the file as "Windows-1252" (or maybe called "ANSI" or
> "Latin-1"), or enter them in POV-Ray for Windows and just save the file
> without specifying a particular encoding (because you can't).
>
> In that case the characters will be encoded as Windows-1252, and in most
> cases the parser will also work with them just fine (again, string
> literals or comments only); again no need for `\uXXXX` notation.
>
> What the parser will do in such a case is first convert the
> Windows-1252-encoded characters to Unicode, and then proceed in just the
> same way.
>
>
> For example:
>
> #declare MyText = "a€b"; // a Euro sign between `a` and `b`
>
> will create a string containing `a` (U+0061) followed by a Euro sign
> (U+20AC) followed by `b` (U+0062), no matter whether the file uses UTF-8
> encoding or Windows-1252 encoding. In both cases, the parser will
> interpret the thing between `a` and `b` as U+20AC, even though in a
> UTF-8 encoded file that thing is represented by the byte sequence hex
> E2,82,AC while in a Windows-1252 encoded file it is represented by the
> single byte hex 80.
Nice.
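
For anyone who wants to check the byte values quoted above outside POV-Ray, here is a minimal Python sketch. It is only an illustration: it assumes Python's built-in `utf-8` and `cp1252` codecs as stand-ins for the two file encodings and says nothing about how the POV-Ray parser is implemented internally.

```python
# Illustration only: uses Python's built-in codecs, not POV-Ray's parser.
text = "a\u20acb"  # 'a', Euro sign (U+20AC), 'b'

# Encode the same string the way an editor would when saving the file.
utf8_bytes = text.encode("utf-8")      # Euro sign becomes three bytes
cp1252_bytes = text.encode("cp1252")   # Euro sign becomes one byte

print(utf8_bytes.hex(" "))     # 61 e2 82 ac 62
print(cp1252_bytes.hex(" "))   # 61 80 62

# Decoding each byte sequence with its matching codec gives back
# the same Unicode code points in both cases.
decoded_utf8 = utf8_bytes.decode("utf-8")
decoded_cp1252 = cp1252_bytes.decode("cp1252")
code_points = [f"U+{ord(c):04X}" for c in decoded_utf8]
print(code_points)             # ['U+0061', 'U+20AC', 'U+0062']
```

Both decoded strings compare equal, which matches the description in the quoted post: different bytes on disk, same characters once converted to Unicode.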