  Unicode for POVRay  
From: Jon A. Cruz
Date: 4 Jun 1999 12:16:07
Message: <3757ED9F.76FA510F@geocities.com>
Ron Parker wrote:

> On 4 Jun 1999 09:58:51 -0500, Ron Parker wrote:
> >No, you should tweak it to work in whatever the current encoding is.
> >Not that there's any distinction, for now, but what if someone adds
> >a multibyte encoding later?  Keep in mind that that means the encoding
> >has to be specified first, because it affects parsing, which is not like
> >all the other global_settings.
>
> Doh!  Of course there's a distinction.  A single-byte encoding could
> easily contain the first byte of a double- or triple-byte UTF-8 character,
> so you'd have to be sensitive to encoding type even with the current choices.
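To make that concrete, the same two bytes mean one thing under UTF-8 and another under a single-byte encoding. A tiny standalone C example (illustration only, not POV-Ray code):

/* The byte pair 0xC3 0xA9 is one character in UTF-8 but two characters
 * in a Latin-1-style single-byte encoding. */
#include <stdio.h>

int main(void)
{
    const unsigned char bytes[] = { 0xC3, 0xA9 };   /* "é" encoded as UTF-8 */
    unsigned int cp;

    /* Single-byte view: two separate characters. */
    printf("single-byte: U+%04X U+%04X\n", bytes[0], bytes[1]);

    /* UTF-8 view: 0xC3 is a lead byte (110xxxxx), 0xA9 a continuation
     * byte (10xxxxxx); together they decode to one code point. */
    cp = ((bytes[0] & 0x1Fu) << 6) | (bytes[1] & 0x3Fu);
    printf("UTF-8:       U+%04X\n", cp);

    return 0;
}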

But here's where the problems start to creep in.

If you start to allow arbitrary encodings, which do you use? In the past, programs have commonly used the default multibyte encoding of the platform they run on. That's nice for an isolated user, but it breaks down when you start to go global.

The old multibyte TrueType patch did just this. For exactly that reason, the patch was usable only on systems that were natively multibyte, such as Japanese or Chinese ones. So you'd get different results if you ran on a Chinese system than if you ran on a Japanese one, or it would just fail completely.

Probably the only way to keep .pov files portable and generating identical results on any platform (which I think is one of POV-Ray's design goals) would be to build the encoding support into POV-Ray itself. But you can't include everything, so where do you draw the line?

In my Unicode patch, conversion happens only after text is passed down for generating objects, so the encoding step is deferred (this was mainly to avoid changing all the text handling). However, I did implement a few base encodings.
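The general shape is roughly like this; the names (Text_Encoding, Convert_To_Unicode) are made up for illustration and are not the actual patch source:

/* Rough sketch of the deferred-conversion idea: SDL strings stay as raw
 * bytes until a text object is built, then get converted to code points. */
typedef enum { ENC_ASCII, ENC_CP1252, ENC_UTF8 } Text_Encoding;

/* Convert a raw byte string into an array of Unicode code points.
 * Returns the number of code points written, or -1 on a bad sequence. */
static int Convert_To_Unicode(const unsigned char *in, int in_len,
                              unsigned long *out, Text_Encoding enc)
{
    int i = 0, n = 0;

    while (i < in_len)
    {
        switch (enc)
        {
            case ENC_ASCII:
            case ENC_CP1252:
                /* One byte per character; a real CP-1252 converter needs a
                 * lookup table for the 0x80-0x9F range, omitted here. */
                out[n++] = in[i++];
                break;

            case ENC_UTF8:
                if (in[i] < 0x80)                        /* 1-byte sequence */
                {
                    out[n++] = in[i++];
                }
                else if ((in[i] & 0xE0) == 0xC0 && i + 1 < in_len)
                {                                        /* 2-byte sequence */
                    out[n++] = ((in[i] & 0x1Fu) << 6) | (in[i+1] & 0x3Fu);
                    i += 2;
                }
                else
                {
                    return -1;    /* 3- and 4-byte sequences omitted here */
                }
                break;
        }
    }
    return n;
}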

We could just change things to handle the current US-ASCII plus Unicode (UTF-8, UTF-16). For nicer user support we could then add CP-1252 (Windows) and MacRoman (though the Mac side has lots of problems). Notepad on NT can already save Unicode versions of text files, and for everything else we could push the conversion burden out onto the end user (e.g., if you want to render Klingon text, you are responsible for converting your Klingon into Unicode before sending it off to POV-Ray). If the entire program were changed to be based on Unicode instead of single-byte text, this might help.
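Those Notepad files are tagged with a byte-order mark, so the reader could sniff the encoding up front. A sketch of that idea (Sniff_Encoding and File_Encoding are made-up names, not existing POV-Ray code):

/* Guess a scene file's encoding from a leading byte-order mark. */
#include <stdio.h>

typedef enum { ENC_DEFAULT, ENC_UTF8, ENC_UTF16_LE, ENC_UTF16_BE } File_Encoding;

static File_Encoding Sniff_Encoding(FILE *fp)
{
    unsigned char b[3];
    size_t got = fread(b, 1, 3, fp);

    if (got == 3 && b[0] == 0xEF && b[1] == 0xBB && b[2] == 0xBF)
        return ENC_UTF8;                        /* UTF-8 BOM */

    if (got >= 2 && b[0] == 0xFF && b[1] == 0xFE)
    {
        fseek(fp, 2L, SEEK_SET);                /* keep the byte after the BOM */
        return ENC_UTF16_LE;
    }
    if (got >= 2 && b[0] == 0xFE && b[1] == 0xFF)
    {
        fseek(fp, 2L, SEEK_SET);
        return ENC_UTF16_BE;
    }

    rewind(fp);                                 /* no BOM: use the default */
    return ENC_DEFAULT;
}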

But then... what happens to all the text manipulation routines? Are end-user scripts dependent on one character = one byte? What compatibility issues could this cause? Hmmm....
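For instance, any script that relies on string lengths today is really counting bytes; once UTF-8 is in play, that no longer matches the number of characters. A small standalone illustration (not POV-Ray code; utf8_length is a hypothetical helper):

/* strlen() counts bytes, not characters, once UTF-8 enters the picture. */
#include <stdio.h>
#include <string.h>

/* Count code points by skipping UTF-8 continuation bytes (10xxxxxx). */
static size_t utf8_length(const char *s)
{
    size_t n = 0;
    for (; *s != '\0'; s++)
    {
        if (((unsigned char)*s & 0xC0) != 0x80)
            n++;
    }
    return n;
}

int main(void)
{
    /* "façade" encoded as UTF-8: the c-cedilla takes two bytes. */
    const char *word = "fa\xC3\xA7" "ade";

    printf("bytes (strlen):           %u\n", (unsigned)strlen(word));       /* 7 */
    printf("characters (utf8_length): %u\n", (unsigned)utf8_length(word));  /* 6 */
    return 0;
}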



(followups set to povray.programming)


