POV-Ray: Newsgroups: povray.bugreports: alpha.10064268 outputting strings: Re: alpha.10064268 outputting strings


From: Thorsten
Date: 18 Jun 2020 02:27:01
Message: <5eeb0935$1@news.povray.org>

On 18.06.2020 01:50, William F Pokorny wrote:
> On 6/17/20 5:43 PM, B. Gimeno wrote:
> ...
>>
>> Not sure of anything after reading this:
>>
http://news.povray.org/povray.beta-test/message/%3C5c2e6746%40news.povray.org%3E/#%3C5c2e6746%40news.povray.org%3E

>>
>> ...and that someone was inquiring in a similar way 20 years ago.
>>
http://news.povray.org/povray.beta-test/thread/%3C3BADC84E.1AB6A6C0%40post8.tele.dk%3E/?mtop=176162

> 
> Thanks for those links. I have no recollection of either thread...

Well, I do ;-)

To answer some of the questions, assuming nothing has been changed in 
the meantime ...

Practically, there are four uses of strings in POV-Ray. One is for TTF 
output. Here UTF-8 cannot be used internally; UCS-2/UCS-4 is used instead, 
and the strings are converted for this use. The reason is that UTF-8 is a 
variable-length encoding (as is UTF-16), which is not very efficient to 
decode on the fly. When building the extrusions for text objects that 
would hardly matter, but anyway, that is the way it is.
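
To illustrate what "variable length" costs, here is a minimal Python sketch of 
decoding UTF-8 bytes into fixed-width code points, which is roughly the kind of 
conversion meant above. It is not POV-Ray's code: the name `utf8_to_ucs4` is 
made up for this example, and it assumes mostly well-formed input (it does not 
reject overlong sequences).

```python
def utf8_to_ucs4(data: bytes) -> list:
    """Decode UTF-8 bytes into a list of fixed-width code points.

    Hypothetical helper for illustration only, not taken from POV-Ray.
    """
    code_points = []
    i = 0
    while i < len(data):
        lead = data[i]
        # The lead byte tells us how many bytes this character occupies.
        if lead < 0x80:               # 0xxxxxxx: 1 byte (plain ASCII)
            length, value = 1, lead
        elif lead >> 5 == 0b110:      # 110xxxxx: 2 bytes
            length, value = 2, lead & 0x1F
        elif lead >> 4 == 0b1110:     # 1110xxxx: 3 bytes
            length, value = 3, lead & 0x0F
        elif lead >> 3 == 0b11110:    # 11110xxx: 4 bytes
            length, value = 4, lead & 0x07
        else:
            raise ValueError(f"invalid lead byte at offset {i}")
        if i + length > len(data):
            raise ValueError("truncated sequence at end of input")
        # Each continuation byte (10xxxxxx) contributes 6 more bits.
        for b in data[i + 1:i + length]:
            if b >> 6 != 0b10:
                raise ValueError(f"invalid continuation byte near offset {i}")
            value = (value << 6) | (b & 0x3F)
        code_points.append(value)
        i += length
    return code_points
```

Once the text is in this form, the n-th character is simply element n of the 
list, whereas in UTF-8 finding it means walking the bytes from the start.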

The other main use is for include files and all other files. Here a path 
is needed. 20 years back, opening anything but ASCII-based paths was a 
mess, and to an extent it still is. There was some abstraction put in 
place, and by all means, Unicode, or more precisely the charset, should 
also work here these days.

The next use case is for debug output. This output goes to the console or 
message window in a GUI. Originally (20 years ago), neither supported 
Unicode, hence the implementation: replace any character code >= 128 with 
a space. This eliminates the need to support Unicode or UTF-8 in those 
windows and on terminals.
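
As a rough picture of that replacement rule, here is a small Python sketch. It 
assumes the rule is applied per byte of the encoded string; the name 
`sanitize_for_console` is hypothetical and the code is not taken from POV-Ray.

```python
def sanitize_for_console(data: bytes) -> bytes:
    """Replace every byte value >= 128 with an ASCII space (0x20).

    Hypothetical illustration of the rule described above, not POV-Ray code.
    """
    return bytes(b if b < 128 else 0x20 for b in data)
```

Under that assumption, a character that takes two bytes in UTF-8 ends up as 
two spaces, while the same character in a single-byte charset becomes one.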

And this brings us to a similar use: the output of text files. They 
used to use the same implementation. And from what I am reading, I guess 
nothing has changed.

Oh, and for the 256 character limit, that comes from yet another source: 
The tokenizer responsible for parsing strings reuses the same buffer as 
it uses for parsing tokens. So if this hasn't been changed, that is 
where the limit comes from.
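
How a shared buffer turns into a string length limit can be shown with a toy 
tokenizer. This is only a sketch under assumptions: the class `TinyTokenizer`, 
its methods and the constant `MAX_TOKEN_LENGTH` are invented for illustration 
and say nothing about how the real POV-Ray tokenizer is written.

```python
MAX_TOKEN_LENGTH = 256  # assumed buffer size, chosen to match the limit above


class TinyTokenizer:
    """Toy tokenizer: one fixed-size buffer serves identifiers and strings."""

    def __init__(self, source: str):
        self.source = source
        self.pos = 0
        self.buffer = [""] * MAX_TOKEN_LENGTH  # shared scratch buffer

    def _fill(self, accept) -> str:
        """Copy characters into the shared buffer while accept(ch) is true."""
        n = 0
        while self.pos < len(self.source) and accept(self.source[self.pos]):
            if n >= MAX_TOKEN_LENGTH:
                # The buffer was sized for tokens, so strings hit the same cap.
                raise ValueError("token or string exceeds shared buffer")
            self.buffer[n] = self.source[self.pos]
            n += 1
            self.pos += 1
        return "".join(self.buffer[:n])

    def read_identifier(self) -> str:
        return self._fill(lambda ch: ch.isalnum() or ch == "_")

    def read_string(self) -> str:
        if self.pos >= len(self.source) or self.source[self.pos] != '"':
            raise ValueError("expected opening quote")
        self.pos += 1  # skip opening quote
        text = self._fill(lambda ch: ch != '"')
        if self.pos >= len(self.source):
            raise ValueError("unterminated string")
        self.pos += 1  # skip closing quote
        return text
```

In this sketch, lifting the limit for strings would mean giving `read_string` 
its own growable storage instead of the buffer sized for ordinary tokens.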

In short, as you can see, the string implementation touches many parts 
of POV-Ray, and in order to make much progress, all these points would 
have to be fixed simultaneously.

Thorsten
