POV-Ray: Newsgroups: povray.pov4.discussion.general: On povr's accidental text{cmap} capability.: On povr's accidental text{cmap} capability.

POV-Ray : Newsgroups : povray.pov4.discussion.general : On povr's accidental text{cmap} capability. : On povr's accidental text{cmap} capability.		Server Time 20 Apr 2024 00:32:22 EDT (-0400)

From: William F Pokorny
Date: 1 Apr 2023 04:08:15
Message: <6427e66f$1@news.povray.org>

Elsewhere while working on the keyword_status() idea - which became 
initially two keywords in word_is() and word_get() - I discovered the 
povr fork picked up Christoph's text object cmap keyword and associated 
code changes due simply to when I created the povr fork.

I admit to not paying attention to 'cmap' when it was being worked on in 
early 2019. I wrongly thought it was freetype fork specific and I tuned 
it out while busy with other code play.

Well, I have it. I took a look over the past couple days. The 'cmap' 
functionality is going to stay in povr. It's impossible to pass on what 
it already offers - even if bugs later turn up.

See attached image. Using v3.8 beta 2 rendering some utf8 text in a 
text{} object at top. On the bottom rendering the same utf8 text with 
povr - identical results with or without the cmap{} block.

I worked first with a cmap{} block and it took the ttfdump utility for 
me to guess probably something like 'cmap { 0,3 charset 0 }' was what I 
needed while using the .../dejavu/DejaVuSans.ttf font file coming with 
Ubuntu 22.04.

I then wondered what would happen if I removed the cmap{} block 
altogether. I was surprised to find it worked just as well. This comes 
to the first character map table in the font being used as the default 
and it being 0,3. Plus the 'cmap{}' related changes when the font is 
read correctly read the right cmap information in the font file.

So! On unix / linux what use is the cmap{} block you ask. Well it might 
or might not be of much use. I don't have enough experience to know.

Do the font files on linux usually have already usable defaults as 
encoded? I'd make a small bet those fonts shipping with linux 
distributions do. Font files coming from, say, a windows environment - 
maybe not. Perhaps the cmap{} functionality will be needed to pick the 
correct internal cmap with those. I'm guessing.

---

Related. The DejaVuSans.ttf font I picked also encodes two indows 
specific character maps too. The following cmap examples also work to 
some degree or other:

cmap { 3,1 charset 0 } // Works as well as linux (Apple) {0,3 charset 0}
cmap { 3,1 charset 1252 } // Works, but less well in general (a).

Anyhow. Going to keep the functionality in povr and play with it. We'll 
see what other issues pop up.

---
Aside: For a short time saw a very strange parsing error more text{} 
related than cmap{}. I started with some example cmap{} code off the 
newsgroup and it initially failed. It seemed somehow related to a 
semicolon following the string declare used within the text{} object! 
The SDL declare looked like:

#declare MyText = "..."; // Semicolon not needed.

However, as I played the parsing error went away and, try as I might, 
I've been unable to reproduce that parsing fail. Maybe something like 
dos line terminations vs unix/linux ones - or some utf8 character 
elsewhere in the scene mangling things. I don't know. Still, for the 
record, I did see what looked to me to be a bogus parsing fail while 
working on a scene with cmap{}.

Bill P.

(a) - With charset 1252 and similar it might be for the restricted set 
of windows characters you are better off for 'reasons.' When it or other 
charsets used, it seems to be dropping utf8 characters outside the first 
256 or something like that. On linux I suspect the charset won't be 
useful unless trying to match some specific windows character set behavior.

Post a reply to this message

Attachments:
Download 'cmapstory.png' (10 KB)

Preview of image 'cmapstory.png'