POV-Ray: Newsgroups: povray.general: Unicode

From: John M Dlugosz
Subject: Unicode
Date: 12 Sep 1999 15:03:30
Message: <37dbf902@news.povray.org>

I've heard mention of a "Unicode Patch" for POV.  Where can I see more
information on it?

Specifically, is it just to allow Unicode in the TrueType font text
objects, or will it allow me to use Greek letters as variable names and
macro parameters?

--John

From: Ken
Subject: Re: Unicode
Date: 12 Sep 1999 15:11:31
Message: <37DBFA7D.E64C33B@pacbell.net>

"John M. Dlugosz" wrote:
> 
> I've heard mention of a "Unicode Patch" for POV.  Where can I see more
> information on it?
> 
> Specifically, is it just to allow Unicode in the TrueType font text
> objects, or will it allow me to use Greek letters as variable names and
> macro parameters?
> 
> --John

I do not know anything specific about the patch but it can be found here:
http://www.geocities.com/SiliconValley/Network/4453/unipatch/

-- 
Ken Tyler

See my 850+ Povray and 3D Rendering and Raytracing Links at:
http://home.pacbell.net/tylereng/index.html

From: Ron Parker
Subject: Re: Unicode
Date: 12 Sep 1999 23:23:00
Message: <slrn7toa6g.kq.parkerr@linux.parkerr.fwi.com>

On Sun, 12 Sep 1999 14:03:25 -0500, John M. Dlugosz <joh### [at] dlugoszcom> wrote:
>I've heard mention of a "Unicode Patch" for POV.  Where can I see more
>information on it?
>
>Specifically, is it just to allow Unicode in the TrueType font text
>objects, or will it allow me to use Greek letters as variable names and
>macro parameters?

It's only a text object patch; it has no effect on what the parser considers
to be a character or on what the editor will allow you to input.

From: Jon A Cruz
Subject: Re: Unicode
Date: 13 Sep 1999 04:05:29
Message: <37DCAFFA.CF25537D@geocities.com>

Ron Parker wrote:

> On Sun, 12 Sep 1999 14:03:25 -0500, John M. Dlugosz <joh### [at] dlugoszcom> wrote:
> >I've heard mention of a "Unicode Patch" for POV.  Where can I see more
> >information on it?
> >
> >Specifically, is it just to allow Unicode in the TrueType font text
> >objects, or will it allow me to use Greek letters as variable names and
> >macro parameters?
>
> It's only a text object patch; it has no effect on what the parser considers
> to be a character or on what the editor will allow you to input.

There was some talk of wanting to at least allow different languages in the
comments. The main problem is that the text display is fairly isolated, whereas
the parsing is spread all over the code. Given that the Unicode patch was my
first effort to mess with POV-Ray code, I'm not up to speed on wholesale changes.

Of course, since at least one member of the POV-Ray team had been asking me about
this, I might try to look into it if there are actually people who want to use
it.

--
"My new computer's got the clocks, it rocks
But it was obsolete before I opened the box" - W.A.Y.

From: Ron Parker
Subject: Re: Unicode
Date: 13 Sep 1999 09:52:37
Message: <slrn7tq0d6.v8.parkerr@ron.gwmicro.com>

On Mon, 13 Sep 1999 01:04:10 -0700, Jon A. Cruz wrote:
>There was some talk of wanting to at least allow different languages in the
>comments. The main problem is that the text display is fairly isolated, whereas
>the parsing is spread all over the code.

The part you'd want to change, though, is mostly a smallish case statement
in tokenize.c.  At the moment, it treats a-z, A-Z, and _ as the beginning of
a symbol.  You'd want to make it treat anything with the high bit set as the
beginning of a symbol as well.  You'd also need to modify the Read_Symbol code to recognize
and parse correctly characters with the high bit set.  This would allow 
high-bit characters to be used inside of declared symbols, which includes 
macro names and arguments.  There'd probably also be a few small modifications
needed to some error-reporting code, unless you're comfortable with sending 
UTF-8 to the error stream in the event of an undefined symbol or the like.
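
As a rough illustration of the change described above, here is a minimal
sketch in C.  It is not the actual tokenize.c or Read_Symbol code; the names
is_symbol_start, is_symbol_char, and symbol_length are hypothetical, and it
assumes that digits are accepted after the first character of a symbol.

```c
#include <stddef.h>

/* Hypothetical helper (not from tokenize.c): may byte c begin a symbol?
 * In addition to a-z, A-Z and _, any byte with the high bit set is
 * accepted, which covers every byte of a multi-byte UTF-8 sequence. */
static int is_symbol_start(unsigned char c)
{
    return (c >= 'a' && c <= 'z') ||
           (c >= 'A' && c <= 'Z') ||
           c == '_' ||
           (c & 0x80) != 0;
}

/* May byte c continue a symbol?  Assumes digits are allowed after the
 * first byte. */
static int is_symbol_char(unsigned char c)
{
    return is_symbol_start(c) || (c >= '0' && c <= '9');
}

/* Returns the length in bytes of the symbol at the start of s,
 * or 0 if s does not begin with a symbol. */
size_t symbol_length(const char *s)
{
    const unsigned char *p = (const unsigned char *)s;
    size_t n = 0;

    if (!is_symbol_start(p[0]))
        return 0;
    while (p[n] != '\0' && is_symbol_char(p[n]))
        n++;
    return n;
}
```

Note that this treats the symbol as a run of bytes and never decodes the
UTF-8, so an invalid sequence would pass through as part of the name.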

Believe it or not, the tokenizer can already deal with high-bit UTF-8 characters
inside of comments and literal strings.  The built-in editor in the Windows 
version can't, of course, but I'd expect someone using UTF-8 to use something
more suited to Unicode, such as Unipad.

Parsing UCS-2 or UCS-4 would be a lot more difficult, of course.

From: Jon A Cruz
Subject: Re: Unicode
Date: 13 Sep 1999 12:21:03
Message: <37DD247A.11528BC@geocities.com>

Ron Parker wrote:

> On Mon, 13 Sep 1999 01:04:10 -0700, Jon A. Cruz wrote:
> >There was some talk of wanting to at least allow different languages in the
> >comments. The main problem is that the text display is fairly isolated, whereas
> >the parsing is spread all over the code.
>
> The part you'd want to change, though, is mostly a smallish case statement
> in tokenize.c.  At the moment, it treats a-z, A-Z, and _ as the beginning of
> a symbol.  You'd want to make it treat anything with the high bit set as a
> symbol as well.  You'd also need to modify the Read_Symbol code to recognize
> and parse correctly characters with the high bit set.  This would allow
> high-bit characters to be used inside of declared symbols, which includes
> macro names and arguments.  There'd probably also be a few small modifications
> needed to some error-reporting code, unless you're comfortable with sending
> UTF-8 to the error stream in the event of an undefined symbol or the like.
>
> Believe it or not, the tokenizer can already deal with high-bit UTF-8 characters
> inside of comments and literal strings.  The built-in editor in the Windows
> version can't, of course, but I'd expect someone using UTF-8 to use something
> more suited to Unicode, such as Unipad.
>
> Parsing UCS-2 or UCS-4 would be a lot more difficult, of course.

That's a good start, thanks. But there are a few more problems than that :-(

I'd need to go through things to make sure that the "right" thing is done for
functions such as chr(), asc(), etc. There are several different logical
approaches to the problem, but backwards compatibility is one of the biggies.
Also need to handle invalid UTF-8, etc.
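
To make the "invalid UTF-8" point concrete, here is a minimal sketch in C of
decoding one character while rejecting malformed input.  It is only one
possible approach and is not taken from the patch; the function name
utf8_decode is hypothetical, and it assumes the present-day definition of
UTF-8 (at most four bytes, code points up to U+10FFFF, no surrogates).

```c
#include <stddef.h>

/* Hypothetical helper: decode one UTF-8 sequence from s (len bytes
 * available).  On success, stores the code point in *cp and returns the
 * number of bytes consumed (1 to 4).  Returns 0 on invalid input. */
size_t utf8_decode(const unsigned char *s, size_t len, unsigned long *cp)
{
    unsigned long c, min;
    size_t need, i;

    if (len == 0)
        return 0;

    if (s[0] < 0x80) {                     /* plain ASCII */
        *cp = s[0];
        return 1;
    } else if ((s[0] & 0xE0) == 0xC0) {    /* 2-byte sequence */
        need = 1; c = s[0] & 0x1F; min = 0x80;
    } else if ((s[0] & 0xF0) == 0xE0) {    /* 3-byte sequence */
        need = 2; c = s[0] & 0x0F; min = 0x800;
    } else if ((s[0] & 0xF8) == 0xF0) {    /* 4-byte sequence */
        need = 3; c = s[0] & 0x07; min = 0x10000;
    } else {
        return 0;                          /* stray continuation or bad lead byte */
    }

    if (len < need + 1)
        return 0;                          /* truncated sequence */

    for (i = 1; i <= need; i++) {
        if ((s[i] & 0xC0) != 0x80)
            return 0;                      /* not a continuation byte */
        c = (c << 6) | (s[i] & 0x3F);
    }

    if (c < min)
        return 0;                          /* overlong encoding */
    if (c >= 0xD800 && c <= 0xDFFF)
        return 0;                          /* UTF-16 surrogate range */
    if (c > 0x10FFFF)
        return 0;                          /* beyond the Unicode range */

    *cp = c;
    return need + 1;
}
```

An asc()-style function could return the decoded code point for valid input
and fall back to the raw first byte when the decoder returns 0, which is one
way to stay compatible with existing scenes; that fallback is a design
choice, not something settled in this thread.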

Guess I'd better get on it.

--
"My new computer's got the clocks, it rocks
But it was obsolete before I opened the box" - W.A.Y.
