Newsgroups: povray.beta-test
  Re: Problem with TTF object, danish characters  
From: Jon A. Cruz
Date: 25 Sep 2001 21:21:34
Message: <3BB12D12.E18CB523@geocities.com>
Ron Parker wrote:

> It is UTF-8, but it starts with EF BB BF, the UTF-8 encoding of the FFFE
> endianness indicator (as written on an Intel machine, obviously.  A Motorola
> machine would use EF BF BE)

Actually, UTF-8 is byte-order independent. So the UTF-8 BOM will always be EF BB
BF (the UTF-8 encoding of U+FEFF), no matter what kind of machine wrote the file.
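
As a quick check, here is a small Python sketch (just an illustration, nothing from the POV-Ray source) that encodes U+FEFF in a few encodings. The variable names are made up for the example. Only the UTF-16 forms depend on byte order; the UTF-8 form is the same everywhere. EF BF BE is what you would get from encoding U+FFFE, which is a different code point (a noncharacter), not a byte-swapped UTF-8 BOM.

```python
import codecs

BOM_CHAR = "\ufeff"  # U+FEFF, the character used as a byte order mark

# UTF-8 has a single byte sequence for any given code point.
utf8_bom = BOM_CHAR.encode("utf-8")

# UTF-16 is where the byte order actually matters.
utf16_le_bom = BOM_CHAR.encode("utf-16-le")
utf16_be_bom = BOM_CHAR.encode("utf-16-be")

# U+FFFE is a separate code point; its UTF-8 encoding is not a "swapped" BOM.
utf8_fffe = "\ufffe".encode("utf-8")

print(utf8_bom.hex())      # efbbbf
print(utf16_le_bom.hex())  # fffe
print(utf16_be_bom.hex())  # feff
print(utf8_fffe.hex())     # efbfbe
```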


> We could easily interpret the presence of those
> three bytes as an implicit UTF-8 charmap, and infer the endianness of the
> other UTF-8 characters in the file at the same time.

http://www.unicode.org/unicode/faq/utf_bom.html

I had just run into this on some Java-related issues.
Basically, the BOM is a special use of the standard "ZERO WIDTH NO-BREAK SPACE"
character (U+FEFF). Sometimes it might be treated as a BOM (or UTF-8 flag) and
stripped out, but it doesn't have to be. At the beginning of a file, stripping it
is probably a good idea, though.
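
A minimal sketch of that idea in Python, assuming the input is already known to be UTF-8. The helper name strip_leading_bom is hypothetical, made up for this example. It drops a single BOM at the very start and leaves any later U+FEFF alone, since there it is just an ordinary character.

```python
import codecs

def strip_leading_bom(data: bytes) -> bytes:
    """Drop one UTF-8 BOM from the start of data; leave everything else untouched."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data

# "\u00e6\u00f8\u00e5" are the Danish letters ae, o-slash and a-ring.
danish = "\u00e6\u00f8\u00e5"
raw = codecs.BOM_UTF8 + danish.encode("utf-8")

# Strip by hand, then decode as plain UTF-8.
text = strip_leading_bom(raw).decode("utf-8")

# Python's built-in "utf-8-sig" codec does the same thing while decoding.
text_sig = raw.decode("utf-8-sig")

# A U+FEFF in the middle of the data is not a BOM and is kept.
embedded = "a\ufeffb".encode("utf-8")
kept = strip_leading_bom(embedded).decode("utf-8")
```

Data without a BOM passes through unchanged, so the same path works for both kinds of file.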



--
Jon A. Cruz
http://www.geocities.com/joncruz/action.html


