POV-Ray: Newsgroups: povray.beta-test: Problem with TTF object, danish characters: Re: Problem with TTF object, danish characters

From: Jon A Cruz
Date: 25 Sep 2001 21:21:34
Message: <3BB12D12.E18CB523@geocities.com>

Ron Parker wrote:

> It is UTF-8, but it starts with EF BB BF, the UTF-8 encoding of the FFFE
> endianness indicator (as written on an Intel machine, obviously.  A Motorola
> machine would use EF BF BE)

Actually, UTF-8 is byte-order independent. So the UTF-8 BOM will always be EF BB
BF.
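
A quick way to check this for yourself, as a small Python sketch (my own illustration, not anything from POV-Ray; the variable names are made up). It encodes U+FEFF in UTF-8 and in both UTF-16 byte orders, and also encodes U+FFFE, which is where the EF BF BE sequence actually comes from:

```python
import codecs

BOM = "\ufeff"  # U+FEFF, the character used as the byte order mark

# UTF-8 has exactly one byte sequence per code point, so there is one BOM form.
utf8_bom = BOM.encode("utf-8")

# UTF-16 is where byte order matters: the same character serializes two ways.
utf16_le_bom = BOM.encode("utf-16-le")
utf16_be_bom = BOM.encode("utf-16-be")

# EF BF BE is not a byte-swapped BOM; it is the UTF-8 form of a different
# code point, U+FFFE.
utf8_fffe = "\ufffe".encode("utf-8")

print(utf8_bom.hex(), utf16_le_bom.hex(), utf16_be_bom.hex(), utf8_fffe.hex())
```

So only the UTF-16 forms differ between machines; the UTF-8 bytes are the same everywhere.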

> We could easily interpret the presence of those
> three bytes as an implicit UTF-8 charmap, and infer the endianness of the
> other UTF-8 characters in the file at the same time.

http://www.unicode.org/unicode/faq/utf_bom.html

I had just run into this on some Java-related issues.
Basically, the BOM is a special use of the standard "ZERO WIDTH NO-BREAK SPACE"
character (U+FEFF). Sometimes it might be treated as a BOM (or UTF-8 flag) and
stripped out, but it doesn't have to be. At the beginning of a file, though,
stripping it is probably a good idea.
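
Roughly what I mean by stripping it only at the start of a file, sketched in Python. This is just an illustration under my own assumptions: `strip_utf8_bom` is a hypothetical helper name, and nothing here is how POV-Ray's parser actually does it.

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Drop one leading UTF-8 BOM (EF BB BF) if present (hypothetical helper)."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


# A leading BOM is treated as a signature and removed.
with_bom = codecs.BOM_UTF8 + "text { ttf }".encode("utf-8")
stripped = strip_utf8_bom(with_bom)

# The same bytes in the middle of the data are left alone, since there
# U+FEFF is an ordinary zero width no-break space.
inner = "a\ufeffb".encode("utf-8")
untouched = strip_utf8_bom(inner)

print(stripped, untouched)
```

Python's built-in "utf-8-sig" codec does the same thing when decoding, so `with_bom.decode("utf-8-sig")` gives the text without the leading U+FEFF.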

--
Jon A. Cruz
http://www.geocities.com/joncruz/action.html
