POV-Ray : Newsgroups : povray.beta-test : POV-Ray v3.7 charset behaviour : Re: POV-Ray v3.7 charset behaviour Server Time
28 Apr 2024 19:19:08 EDT (-0400)
  Re: POV-Ray v3.7 charset behaviour  
From: clipka
Date: 25 Jun 2018 14:00:06
Message: <5b312da6@news.povray.org>
Am 25.06.2018 um 17:12 schrieb Kenneth:

>> If present, the signature is (hex) `EF BB BF`...
> 
> ..... so I assume that the error message I see (the "illegal ef value") is the
> first part of that hex code. It appears to be, anyway.

Exactly.

>> So Wordpad is without fault here (these days, at any rate). It is
>> POV-Ray that is to blame -- or, rather, whoever added support for
>> signatures in UTF-8 files, because while they did add such support for
>> the scene file proper, they completely forgot to do the same for include
>> files.
...
> So I assume that NO one has had any luck trying to #include a UTF-8 -encoded
> file, regardless of the text-editing app that was used.

UTF-8 encoded files /without/ a signature are fine(*). Unfortunately,
Notepad and WordPad can't write those.

(* For certain definitions of "fine", that is; the `global_settings {
charset FOO }` mechanism isn't really useful for mixing files with
different encodings. Also, the editor component of POV-Ray for Windows
doesn't do UTF-8.)

> Just from my own research into all this stuff, it seems that dealing with text
> encoding is a real headache for software developers. What a complex mess.

It's gotten better. The apparent increase in brain-wrecking is just due
to the fact that nowadays it has become worth /trying/. Historically,
the only way to stay sane in a small software project was to just
pretend that there wasn't such a thing as different file encodings,
hoping that the users could work around the fallout. Now proper handling
of text encoding can be as simple as supporting UTF-8 with and without
signature - no need to even worry about plain ASCII, being merely a
special case of UTF-8 without signature.

It is only when you want to implement /some/ support for legacy
encodings, and do it in a clean way, that things still get a bit tricky.

In POV-Ray for Windows, I consider it pretty evident that besides ASCII
and UTF-8 the parser should also support whatever encoding the editor
module is effectively using. On my machine that's Windows-1252, but I
wouldn't be surprised if that depended on the operating system's locale
settings.


Post a reply to this message

Copyright 2003-2023 Persistence of Vision Raytracer Pty. Ltd.