POV-Ray: Newsgroups: povray.unix: build of povray 3.6.1 on Solaris 8 with Sun ONE Studio 11 fails

From: Dennis Clarke
Subject: build of povray 3.6.1 on Solaris 8 with Sun ONE Studio 11 fails
Date: 9 Jun 2006 02:05:00
Message: <web.44890de17c275c36d0ea7c2a0@news.povray.org>

This has taken a LOT of hours.

Suffice it to say that the simple ./configure will not work unless one uses
the non-standard GNU revision of sed.  If I do not include a pile of GNU
tools and strictly stick with the tools in /usr/xpg4/bin and /usr/ccs/bin
then numerous things go wrong.  Looks like the build process is heavily
tied to Linux and GNUish things.

I get errors like :

configure.gnu: editing libtiff/Makefile sed: command garbled:
s,^tiffvers.h:.*,, ; s,${SRCDIR}/tiffvers.h,tiffvers.h,g

I solved that by building tiff-3.8.2 manually and putting the GNU tools at
the end of my PATH; with that I can get past the configure step.

However the compile fails after a ghastly number of warnings :

CC  -O3   -L/usr/openwin/lib -R/usr/openwin/lib  -o povray  svga.o unix.o
xwin.o ../source/libpovray.a ../source/base/libbase.a
../source/frontend/libfrontend.a  -ltiff -ljpeg -lpng12 -lz -lXpm -lsocket
-lnsl  -lSM -lICE -lX11 -lm
Undefined                       first referenced
 symbol                             in file
int UNIX_allow_file_write(const char*,unsigned)
../source/libpovray.a(pov_util.o)
int UNIX_allow_file_read(const char*,unsigned)
../source/libpovray.a(pov_util.o)
ld: fatal: Symbol referencing errors. No output written to povray
gmake[2]: *** [povray] Error 1
gmake[2]: Leaving directory
`/export/medusa/dclarke/build/povray/povray-3.6.1-sparc/unix'
gmake[1]: *** [all-recursive] Error 1
gmake[1]: Leaving directory
`/export/medusa/dclarke/build/povray/povray-3.6.1-sparc'
gmake: *** [all] Error 2
$

Since this looks like the final linking stage I will assume that my
LD_OPTIONS did not get passed along.  The absent "int
UNIX_allow_file_write(const char*,unsigned) " must be in some library
somewhere.

I will try this manually with different directory paths in my -L and/or -R
options.

Dennis

From: Nicolas Calimet
Subject: Re: build of povray 3.6.1 on Solaris 8 with Sun ONE Studio 11 fails
Date: 9 Jun 2006 03:18:34
Message: <448920ca$1@news.povray.org>

> Suffice it to say that the simple ./configure will not work unless one uses
> the non-standard GNU revision of sed.
 > [...]
 > configure.gnu: editing libtiff/Makefile sed: command garbled:
 > s,^tiffvers.h:.*,, ; s,${SRCDIR}/tiffvers.h,tiffvers.h,g

	Sorry, but the problem is not using GNU sed or not, it is that your
stock sed is buggy.

> Looks like the build process is heavily
> tied to Linux and GNUish things.

	Not at all, please read the "Compatibility" section of the INSTALL file.
Solaris is usually known to be a problematic environment.

> CC  -O3   -L/usr/openwin/lib -R/usr/openwin/lib  -o povray  svga.o unix.o
> xwin.o ../source/libpovray.a ../source/base/libbase.a
> ../source/frontend/libfrontend.a  -ltiff -ljpeg -lpng12 -lz -lXpm -lsocket
> -lnsl  -lSM -lICE -lX11 -lm
> Undefined                       first referenced
>  symbol                             in file
> int UNIX_allow_file_write(const char*,unsigned)
> ../source/libpovray.a(pov_util.o)
> int UNIX_allow_file_read(const char*,unsigned)
> ../source/libpovray.a(pov_util.o)
> ld: fatal: Symbol referencing errors. No output written to povray

	This problem has been reported before and is archived here:

http://news.povray.org/povray.bugreports/thread/%3Cugl011lbbalv4oq1q9in1rh157vm6i7j7n%404ax.com%3E/

> Since this looks like the final linking stage I will assume that my
> LD_OPTIONS did not get passed along.  The absent "int
> UNIX_allow_file_write(const char*,unsigned) " must be in some library
> somewhere.

	No, this is a bug in unix.cpp: simply remove the 'const' before
'unsigned int' in the following two function definitions:

int UNIX_allow_file_read (const char *Filename, const unsigned int FileType)
int UNIX_allow_file_write (const char *Filename, const unsigned int FileType)

	so that you get instead:

int UNIX_allow_file_read (const char *Filename, unsigned int FileType)
int UNIX_allow_file_write (const char *Filename, unsigned int FileType)

	- NC
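
Background on why g++ accepted the original code: in standard C++ a
top-level 'const' on a by-value parameter is not part of the function's
type, so a declaration without it and a definition with it name the same
function. The linker error above suggests the Sun compiler encoded the two
spellings differently. A minimal sketch of the language rule follows; the
name allow_file_read_demo and its return policy are hypothetical and are
not POV-Ray's code.

```cpp
#include <cassert>
#include <type_traits>

// Declaration as it might appear in a header: no top-level const.
int allow_file_read_demo(const char *filename, unsigned int file_type);

// Definition with a top-level const on the by-value parameter.
// Standard C++ treats this as the same function as the declaration above.
int allow_file_read_demo(const char *filename, const unsigned int file_type)
{
    assert(filename != nullptr);
    // Hypothetical policy, for illustration only: allow type 0, deny the rest.
    return file_type == 0 ? 1 : 0;
}

// Both spellings denote one and the same function type.
static_assert(std::is_same<int(const char *, unsigned int),
                           int(const char *, const unsigned int)>::value,
              "top-level const on a parameter does not change the signature");
```

Note that the 'const' in 'const char *' is different: it qualifies the
pointed-to characters, is part of the signature, and must stay.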

From: Dennis Clarke
Subject: Re: build of povray 3.6.1 on Solaris 8 with Sun ONE Studio 11 fails
Date: 9 Jun 2006 04:55:01
Message: <web.448936dcd12e44c5d0ea7c2a0@news.povray.org>

Nicolas Calimet <pov### [at] freefr> wrote:
> > Suffice it to say that the simple ./configure will not work unless one uses
> > the non-standard GNU revision of sed.
>  > [...]
>  > configure.gnu: editing libtiff/Makefile sed: command garbled:
>  > s,^tiffvers.h:.*,, ; s,${SRCDIR}/tiffvers.h,tiffvers.h,g
>
>  Sorry, but the problem is not using GNU sed or not, it is that your
> stock sed is buggy.

This ?

$ which sed
/usr/xpg4/bin/sed

This sed is buggy?

I guess I have to go and see if there is a bug filed against it but somehow
I doubt it.

> > Looks like the build process is heavily
> > tied to Linux and GNUish things.
>
>  Not at all, please read the "Compatibility" section of the INSTALL file.
> Solaris is usually known to be a problematic environment.

Problematic ?  Old revs of POV-Ray were much easier to deal with .. back in
1999 or so.  How very odd.

> > CC  -O3   -L/usr/openwin/lib -R/usr/openwin/lib  -o povray  svga.o unix.o
> > xwin.o ../source/libpovray.a ../source/base/libbase.a
> > ../source/frontend/libfrontend.a  -ltiff -ljpeg -lpng12 -lz -lXpm -lsocket
> > -lnsl  -lSM -lICE -lX11 -lm
> > Undefined                       first referenced
> >  symbol                             in file
> > int UNIX_allow_file_write(const char*,unsigned)
> > ../source/libpovray.a(pov_util.o)
> > int UNIX_allow_file_read(const char*,unsigned)
> > ../source/libpovray.a(pov_util.o)
> > ld: fatal: Symbol referencing errors. No output written to povray
>
>  This problem has been reported before and is archived here:

http://news.povray.org/povray.bugreports/thread/%3Cugl011lbbalv4oq1q9in1rh157vm6i7j7n%404ax.com%3E/

Oh excellent .. a documented bug.

Okay .. I see this :

   This is because the declaration in the .h file does not match the
   intended definition in the .cpp file. g++ does not care about the const
   restriction.

Probably something that g++/gcc ignores.
I know that when I tried the build I saw literally thousands of warnings.

> > Since this looks like the final linking stage I will assume that my
> > LD_OPTIONS did not get passed along.  The absent "int
> > UNIX_allow_file_write(const char*,unsigned) " must be in some library
> > somewhere.
>
>  No, this is a bug in unix.cpp: simply remove the 'const' before
> 'unsigned int' in the following two function definitions:
>
> int UNIX_allow_file_read (const char *Filename, const unsigned int FileType)
> int UNIX_allow_file_write (const char *Filename, const unsigned int FileType)
>
>  so that you get instead:
>
> int UNIX_allow_file_read (const char *Filename, unsigned int FileType)
> int UNIX_allow_file_write (const char *Filename, unsigned int FileType)
>

Let me give that a try and see how things go.  Hopefully I do not need to do
a "make distclean" and start over.  Actually, since I ran the "./configure"
from a directory other than the source directory I can just kill it and
start over :

Since I am doing builds for two architectures I am not building "in place"
but from another directory at the same level as the source.

$ pwd
/export/medusa/dclarke/build/povray/povray-3.6.1-sparc
$ cd ..
$ rm -rf povray-3.6.1-sparc
$ mkdir povray-3.6.1-sparc
$ cd povray-3.6.1-sparc

I'll see if I can find and edit that source :

$ find ../povray-3.6.1 -type f -name unix.cpp
../povray-3.6.1/unix/unix.cpp

Okay .. so I then do ... at line 1896 there ...

/**************** buggy line ****************
int UNIX_allow_file_read (const char *Filename, const unsigned int FileType)
 ********************************************/
int UNIX_allow_file_read (const char *Filename, unsigned int FileType)

Then again at line 1972 ...

/******************** buggy line ****************
int UNIX_allow_file_write (const char *Filename, const unsigned int FileType)
 ************************************************/
int UNIX_allow_file_write (const char *Filename, unsigned int FileType)

I then run ../povray-3.6.1/configure etc etc again and all goes well.

Now things fail again at this point because /usr/xpg4/bin/make somehow does
not work.  Neither does the version of make in /usr/ccs/bin/make but,
strangely, GNU make ver 3.81 does seem to work magically even though these
other two ( highly standards compliant versions ) seem to fail.

But when I run gmake the whole thing seems to build while emitting scads of
warnings about everything one can imagine.

Looks like we have linkage !

CC  -O3   -L/usr/openwin/lib -R/usr/openwin/lib  -o povray  svga.o unix.o
xwin.o ../source/libpovray.a ../source/base/libbase.a
../source/frontend/libfrontend.a  -ltiff -ljpeg -lpng12 -lz -lXpm -lsocket
-lnsl  -lSM -lICE -lX11 -lm
gmake[2]: Leaving directory
`/export/medusa/dclarke/build/povray/povray-3.6.1-sparc/unix'
gmake[2]: Entering directory
`/export/medusa/dclarke/build/povray/povray-3.6.1-sparc'
cat ../povray-3.6.1/povray.ini.in | sed
"s,__POVLIBDIR__,/export/medusa/dclarke/local/sparc/share/povray-3.6,g" >
../povray.ini
gmake[2]: Leaving directory
`/export/medusa/dclarke/build/povray/povray-3.6.1-sparc'
gmake[1]: Leaving directory
`/export/medusa/dclarke/build/povray/povray-3.6.1-sparc'
$ find . -type f -name povray
./unix/povray
$ file ./unix/povray
./unix/povray:  ELF 32-bit MSB executable SPARC32PLUS Version 1, V8+
Required, dynamically linked, not stripped
$ ldd ./unix/povray
        libtiff.so.3 =>  /export/medusa/dclarke/local/sparc/lib/libtiff.so.3
        libjpeg.so.62 =>         /opt/csw/lib/libjpeg.so.62
        libpng12.so.0 =>         /opt/csw/lib/libpng12.so.0
        libz.so =>       /opt/csw/lib/libz.so
        libXpm.so.4.11 =>        /opt/csw/lib/libXpm.so.4.11
        libsocket.so.1 =>        /lib/libsocket.so.1
        libnsl.so.1 =>   /lib/libnsl.so.1
        libSM.so.6 =>    /usr/openwin/lib/libSM.so.6
        libICE.so.6 =>   /usr/openwin/lib/libICE.so.6
        libX11.so.4 =>   /usr/openwin/lib/libX11.so.4
        libm.so.1 =>     /lib/libm.so.1
        libCstd.so.1 =>  /lib/libCstd.so.1
        libCrun.so.1 =>  /lib/libCrun.so.1
        libc.so.1 =>     /lib/libc.so.1
        libdl.so.1 =>    /usr/lib/libdl.so.1
        libmp.so.2 =>    /usr/lib/libmp.so.2
        libXext.so.0 =>  /usr/openwin/lib/libXext.so.0
        /usr/lib/cpu/sparcv8plus/libCstd_isa.so.1
        /usr/platform/SUNW,Sun-Blade-1000/lib/libc_psr.so.1
$

I have the libtiff in an odd location for this "test" build.

Wow .. now comes the really hard part, is there a test scene file that
always reports the same output data byte for byte ?  Some way to confirm
the accurate functionality of the resultant binary that I have here ?

I do have POV-Ray 3.6.1 built on a PowerPC machine here running Fedora Core
4 and I had to build a nearly complete tool chain to get that done.  It
works and if I compare the output from that to this should I expect
agreement if I do not use anti-aliasing options or jitter or any sort of
random noise?

I guess I am asking .. is there a way to verify functionality ?

Lastly .. thank you very much ... you made the long, tortuous and
frustrating process possible.

Dennis

From: Warp
Subject: Re: build of povray 3.6.1 on Solaris 8 with Sun ONE Studio 11 fails
Date: 9 Jun 2006 06:46:58
Message: <448951a1@news.povray.org>

Dennis Clarke <dcl### [at] blastwaveorg> wrote:
> Wow .. now comes the really hard part, is there a test scene file that
> always reports the same output data byte for byte ?  Some way to confirm
> the accurate functionality of the resultant binary that I have here ?

  Different systems and sometimes even different compilers can produce
POV-Ray binaries which produce different results (in the least significant
bits). All kinds of floating point optimizations, for instance, can produce
different results which usually are not visible because the differences are
so minimal, but revealed by a bit-by-bit comparison of the result.
  Floating point arithmetic is slightly fuzzy and different systems and
compilers may generate slightly differing results (and even the same compiler
may produce different results depending on the optimization flags). Thus you
shouldn't expect identical results from a program so heavily dependent on
floating point as POV-Ray compiled in two totally different architectures
with two totally different compilers.
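
One way to act on this when checking a new build is to compare two renders
with a tolerance instead of byte for byte. The sketch below assumes both
images have already been decoded into equal-length arrays of 8-bit samples;
the names DiffSummary, compare_samples and within_tolerance are made up for
illustration and are not part of POV-Ray.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Summary of how two decoded images differ, sample by sample.
struct DiffSummary {
    std::size_t differing;  // number of samples that are not identical
    int max_abs_diff;       // largest absolute difference seen
};

// Compare two sample arrays of equal length (for example RGB bytes).
DiffSummary compare_samples(const std::vector<std::uint8_t> &a,
                            const std::vector<std::uint8_t> &b)
{
    assert(a.size() == b.size());
    DiffSummary s{0, 0};
    for (std::size_t i = 0; i < a.size(); ++i) {
        const int d = std::abs(static_cast<int>(a[i]) - static_cast<int>(b[i]));
        if (d != 0) ++s.differing;
        if (d > s.max_abs_diff) s.max_abs_diff = d;
    }
    return s;
}

// Accept the pair if no sample differs by more than `tolerance` levels.
bool within_tolerance(const DiffSummary &s, int tolerance)
{
    return s.max_abs_diff <= tolerance;
}
```

A tolerance of 1 would accept differences confined to the least significant
bit of a sample, while a tolerance of 0 is the byte-for-byte test.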

-- 
                                                          - Warp

From: Thorsten Froehlich
Subject: Re: build of povray 3.6.1 on Solaris 8 with Sun ONE Studio 11 fails
Date: 9 Jun 2006 07:36:13
Message: <44895d2d$1@news.povray.org>

Nicolas Calimet wrote:
> Solaris is usually known to be a problematic environment.

Indeed, in particular with Sun's toolchain. Just the list of boost-related
bugs fixed last month is revealing:
http://sunsolve.sun.com/search/advsearch.do?collection=PATCH&type=collections&max=50&language=en&queryKey5=121017&toDocument=yes

In general, sadly Sun is slow in getting their act together when it comes to
their development tools. For many years boost would not work at all with
Sun's compilers due to their many unavoidable bugs.

	Thorsten

PS: To everyone but Nicolas as he already knows - parts of boost
(www.boost.org) will be required to build POV-Ray 3.7 .

From: Dennis Clarke
Subject: Re: build of povray 3.6.1 on Solaris 8 with Sun ONE Studio 11 fails
Date: 9 Jun 2006 11:45:00
Message: <web.448996c2d12e44c53bce8e910@news.povray.org>

Warp <war### [at] tagpovrayorg> wrote:
> Dennis Clarke <dcl### [at] blastwaveorg> wrote:
> > Wow .. now comes the really hard part, is there a test scene file that
> > always reports the same output data byte for byte ?  Some way to confirm
> > the accurate functionality of the resultant binary that I have here ?
>
>   Different systems and sometimes even different compilers can produce
> POV-Ray binaries which produce different results (in the least significant
> bits). All kinds of floating point optimizations, for instance, can produce
> different results which usually are not visible because the differences are
> so minimal, but revealed by a bit-by-bit comparison of the result.
>   Floating point arithmetic is slightly fuzzy and different systems and
> compilers may generate slightly differing results (and even the same compiler
> may produce different results depending on the optimization flags). Thus you
> shouldn't expect identical results from a program so heavily dependent on
> floating point as POV-Ray compiled in two totally different architectures
> with two totally different compilers.
>

You are totally right and wrong at the same time :-)

In my opinion.

Please .. let me expound on that before you delete me as a mindless troll
that wanders into mailing lists groping in the dark for anything to beat up on
... I swear that is not me !

I have this belief, and it's just a belief founded on simple theory, that a
computer that complies with well documented specs should be able to take
code and compile it, run it, and produce the exact same results repeatedly
provided that there are no noise components in the process such as
stochastic processes or PRNG/RNG data involved.  One would also expect that
the architecture of the computer would be of no concern.

Yes, a silly dream to be sure but I guess I come from a background of
Fortran and Pascal ( and something we called machine code! ) from the 80's
on mainframes while I worked in the military and aerospace industry.  It
was very common to take code from a given development environment with
tightly defined input data sets and move that code to other systems and
then perform various tests to ensure that the output data matched
precisely.  Needless to say the process of working with floating point was
made more interesting in that we often defined our own floating point spec
( sign + exponent + mantissa + other stuff like an error field etc etc ).
The emergence of commercially available libraries for matrix operations and
Bessel functions etc etc changed all that.

I guess I am trying to say that, as my old professor used to say, working
with floating point is like moving piles of dirt on a beach.  Every time
you move the dirt you pick up some sand and lose some dirt.  It's the nature
of the beast.  Yet somehow we were able to take code from a Honeywell
CP6/DPS8 system and move it to an IBM 3090 with an exact match on the
output data given well defined test sets.

I have a strict belief that code like POV-Ray could perform the exact same
sort of test if we exclude all noise components and random signal
components. I was thinking ( brace yourself ) of a sphere hanging over a
checkerboard with no anti-aliasing and nothing that would create expected
differences in output data.

I could then build the binary using highly standards compliant tools on
multiple architectures and get the exact same data from a given scene file
and set of parameters.

Probably a silly idea but it may be fun to get the output data in a floating
point format as opposed to an image.  Perhaps the OpenEXR image format (
http://www.openexr.com/ ) may be of value with its 32-bit floating point
and 32-bit integer data types.

Sorry for the long post here ... I recognize your name going back many years
in this project and was hoping that you could express your feelings on some
of these ideas.  At the very least, there is the issue of my use of
standard tools from the Sun/Solaris world and watching them issue thousands
of warnings.

Dennis

From: Warp
Subject: Re: build of povray 3.6.1 on Solaris 8 with Sun ONE Studio 11 fails
Date: 9 Jun 2006 12:26:13
Message: <4489a125@news.povray.org>

Dennis Clarke <dcl### [at] blastwaveorg> wrote:
> I have a strict belief that code like POV-Ray could perform the exact same
> sort of test if we exclude all noise components and random signal
> components. I was thinking ( brace yourself ) of a sphere hanging over a
> checkerboard with no anti-aliasing and nothing that would create expected
> differences in output data.

  It's not even a question of randomness or noise. (In fact, if POV-Ray
uses its own randomness algorithm, its result will be the same in all
systems.)

  The problem with floating point numbers is that it depends a lot on how
they are calculated whether two different systems will give identical
results.
  For instance, Intel CPUs (and compatibles) actually use 10 bytes long
floating point numbers internally even though the program stores them
as 8 bytes long in memory. Some operations might, just might give a
different result due to the increased accuracy. Some other processors
might use 8 bytes long floating point numbers internally, others might
use even 16 bytes long ones.
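
How wide the floating point types are, and whether intermediate results are
evaluated in a wider format, can be queried from standard C++. A small
sketch follows; the struct and function names are hypothetical, and the
values returned depend on the platform and compiler flags.

```cpp
#include <cassert>
#include <cfloat>
#include <limits>

// What the current compiler and target report about floating point formats.
struct FloatFormatInfo {
    int double_mantissa_bits;       // 53 for an IEEE 754 64-bit double
    int long_double_mantissa_bits;  // 64 for the 10-byte x87 extended format
    int eval_method;                // value of FLT_EVAL_METHOD
};

FloatFormatInfo float_format_info()
{
    // FLT_EVAL_METHOD: 0 = evaluate in the operand type,
    // 1 = evaluate float as double, 2 = evaluate everything as long double,
    // -1 = indeterminable.
    return FloatFormatInfo{std::numeric_limits<double>::digits,
                           std::numeric_limits<long double>::digits,
                           static_cast<int>(FLT_EVAL_METHOD)};
}
```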
  Some numbers (such as 0.1) cannot be represented accurately with
(binary) floating point and the error grows if such inaccurate numbers
are calculated together (for instance, if you sum 0.1 with itself ten
times you will find out that the result is not *exactly* 1.0 but there's
a difference in the least significant bits of the result).
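
The 0.1 example is easy to reproduce. A minimal sketch, assuming IEEE 754
64-bit doubles; the function name sum_repeated is made up for illustration.

```cpp
#include <cassert>
#include <cmath>

// Add `value` to an accumulator `count` times and return the total.
double sum_repeated(double value, int count)
{
    assert(count >= 0);
    // volatile forces every partial sum to be stored as a 64-bit double,
    // so extended-precision registers cannot hide the rounding.
    volatile double sum = 0.0;
    for (int i = 0; i < count; ++i) {
        sum = sum + value;
    }
    return sum;
}
```

With IEEE doubles, sum_repeated(0.1, 10) compares unequal to 1.0 although it
differs from it by far less than 1e-15, while sum_repeated(0.5, 2) is exactly
1.0 because 0.5 is representable in binary.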

  Also compiler optimizations may affect the accuracy of the computations.
For example, a compiler might take this code:

r1 = a/x; r2 = b/x; r3 = c/x; r4 = d/x;

and optimize it (at least if allowed) into an equivalent of this:

tmp = 1/x; r1 = a*tmp; r2 = b*tmp; r3 = c*tmp; r4 = d*tmp;

  Now, "a/x" often gives a different result than "a*(1/x)" due to how
floating point calculations work. (Technically speaking, the "1/x"
calculation loses information which is not lost when the final result is
calculated with a direct "a/x".)
  There are many other places where the compiler might use shortcuts in
order to create faster code.
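
The difference between the two forms can be checked directly. A sketch
assuming IEEE 754 doubles and default compiler settings (no fast-math
options); the function names are hypothetical.

```cpp
#include <cassert>

// a / x computed with a single rounding.
double divide_directly(double a, double x)
{
    assert(x != 0.0);
    volatile double r = a / x;  // volatile: keep the result in double precision
    return r;
}

// a * (1 / x): the reciprocal is rounded first, then the product is rounded.
double multiply_by_reciprocal(double a, double x)
{
    assert(x != 0.0);
    volatile double recip = 1.0 / x;
    volatile double r = a * recip;
    return r;
}

// Count the integers a in 1..max_numerator for which the two forms disagree.
int count_mismatches(double x, int max_numerator)
{
    int mismatches = 0;
    for (int a = 1; a <= max_numerator; ++a) {
        if (divide_directly(a, x) != multiply_by_reciprocal(a, x)) ++mismatches;
    }
    return mismatches;
}
```

For x = 3 the two forms disagree for some numerators (5 is one of them),
while for x = 2 they always agree because 1/2 is exact in binary.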

  All these differences are in the very least significant bits of the
result. Whether this minuscule difference will end up as a differing
pixel in the resulting image depends a lot on what is done. Some operations
done by a raytracer like POV-Ray may be quite prone to even minuscule
changes in the very least-significant bits of some calculations. For
example a change of 0.00001% in the direction of a ray may in some scenes
cause a big enough difference to produce a pixel which differs in the
least significant bit (and sometimes even more).

-- 
                                                          - Warp

From: Dennis Clarke
Subject: Re: build of povray 3.6.1 on Solaris 8 with Sun ONE Studio 11 fails
Date: 9 Jun 2006 14:15:00
Message: <web.4489b8c5d12e44c53bce8e910@news.povray.org>

Warp <war### [at] tagpovrayorg> wrote:
> Dennis Clarke <dcl### [at] blastwaveorg> wrote:
> > I have a strict belief that code like POV-Ray could perform the exact same
> > sort of test if we exclude all noise components and random signal
> > components. I was thinking ( brace yourself ) of a sphere hanging over a
> > checkerboard with no anti-aliasing and nothing that would create expected
> > differences in output data.
>
>   It's not even a question of randomness or noise. (In fact, if POV-Ray
> uses its own randomness algorithm, its result will be the same in all
> systems.)

  That stands to reason.

>   The problem with floating point numbers is that it depends a lot on how
> they are calculated whether two different systems will give identical
> results.

  I agree with the "how" but not with the "different systems".

  Let's look further down in your message here :

>   For instance, Intel CPUs (and compatibles) actually use 10 bytes long
> floating point numbers internally even though the program stores them
> as 8 bytes long in memory. Some operations might, just might give a
> different result due to the increased accuracy. Some other processors
> might use 8 bytes long floating point numbers internally, others might
> use even 16 bytes long ones.

  This is an issue of implementation of a given floating point
representation and is no longer a concern in post 1992 era computer work.
I would be so bold as to go back even further to the first draft of the
IEEE 754 standards in 1985 and then again in 1987.  Previous to this we
needed to create our own implementation and having done so with portable
code ( or extensive reams of machine code ) we were able to perform
calculations on any architecture with a precise and exactly identical
result.

  This was because we had total control over "how" the computation was
performed in any given architecture and we also had total control over the
internal representation of the actual floating point data right down to the
storage in memory.  These days were long and difficult but the consistency
and quality of the code was very very high.  Computations executed in a
precise manner and with a predictable result even when the results were not
precise.  That's a floating point joke.  I hope you caught it. :-)

  The arrival of technology that implemented the floating point
infrastructure for us in both hardware and software was both a boon and a
bane.  As you clearly pointed out above we see that different architectures
implement vastly different methods of storing floating point data and thus
the calculations are wildly different and not just in the least significant
bits.  Is this a feature of the actual internal register structure of these
chips?  No Sir it was not.  The ability to move binary data around in the
6502 or 6809 or early Intel 8088 or Zilog Z80 was essentially the same
in each case.  We loaded bits, we shifted bits and we ported floating point
implementations to those processors with the precise exact same results in
every case.  It was the arrival of well documented standards such as IEEE
754 as well as their implementations via the early Weitek chips and Intel
8087 FPU that allowed us to move away from the mainframes and workstations
to lower cost hardware.  Lower cost software and significantly different
results at times. I am sure we all remember the fiasco of the Intel Pentium
P90 chip and I have a dual processor system here today that has the flawed
processor.

But I digress.

The real issue is that we seem to have lost fine control of "how" a
calculation is performed even though we have access to standard
implementations of floating point data as well as a multitude of
calculations in hardware.  Even transcendental calculations and logarithms
are implemented in hardware for us today.  It is my hope that log(x) will
always report the precise same result on ALL architectures due to these
standard implementations.

>   Some numbers (such as 0.1) cannot be represented accurately with
> (binary) floating point and the error grows if such inaccurate numbers

<snip>

sorry, I guess I just wanted to say that I have had to write low level
floating point code and that I have a passing familiarity with the issues.
:-)

>   Also compiler optimizations may affect the accuracy of the computations.
> For example, a compiler might take this code:
>
> r1 = a/x; r2 = b/x; r3 = c/x; r4 = d/x;
>
> and optimize it (at least if allowed) into an equivalent of this:
>
> tmp = 1/x; r1 = a*tmp; r2 = b*tmp; r3 = c*tmp; r4 = d*tmp;

YES!  This is the loss of fine control that I refer to.

However, one would assume that the exact same code with the exact same
compiler software vendor would give the exact same results on
differing hardware.  I can report that floating point calculations that I
perform on either Solaris x86 or Sparc give the precise same results
given the same compiler switches.  I can say that I have similar
experiences with IBM 3090 mainframes and AS/400 and even with some embedded
technology.  Essentially the bits will always be the same on any architecture
if we have fine control.  The various manipulations of compiler
optimizations will change all of that.  Quickly.  ( another pun .. sorry )

I was once told by a real honest to goodness computer scientist that
compiler optimizations are a great way to arrive at the wrong answer very
quickly.

There is a man named Goldberg who has written some excellent material on
the issues.  I'll have to look him up.

>   Now, "a/x" often gives a different result than "a*(1/x)" due to how
> floating point calculations work. (Technically speaking, the "1/x"
> calculation loses information which is not lost when the final result is
> calculated with a direct "a/x".)
>   There are many other places where the compiler might use shortcuts in
> order to create faster code.

  Indeed.  I agree entirely.

  However ... you probably saw this coming.

  The IEEE754 floating point standard has a few data types and in every case
and in every calculation we are given the possibility of an "inexact" flag
returned to us.  There are very well defined accuracy requirements on
floating-point operations and in all cases the result of any operation must
return an exact result within the 64-bits of the double extended data type.
There are at least 79 bits of data that we can refer to and 64 bits are
absolutely accurate OR we get a return flag of "inexact".  This allows us
to perform the calculation in some other manner.
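
The "inexact" flag can be observed from C++ through the <cfenv> header. The
sketch below is an illustration under assumptions: it relies on the compiler
not moving the division across the flag calls (the volatile variables are
there to discourage that), and some compilers need extra options for strict
floating point environment access. The function name is hypothetical.

```cpp
#include <cassert>
#include <cfenv>

// Report whether numerator / denominator had to be rounded, according to
// the IEEE 754 inexact status flag.
bool division_is_inexact(double numerator, double denominator)
{
    assert(denominator != 0.0);
    volatile double n = numerator;    // volatile: force a run-time division
    volatile double d = denominator;
    std::feclearexcept(FE_ALL_EXCEPT);  // start from a clean set of flags
    volatile double q = n / d;          // the operation under test
    (void)q;
    return std::fetestexcept(FE_INEXACT) != 0;
}
```

On an IEEE 754 system 1.0/3.0 should raise the flag, because the quotient
has to be rounded, while 1.0/4.0 should not, because 0.25 is exact in binary.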

>   All these differences are in the very least significant bits of the
> result. Whether this minuscule difference will end up as a differing
> pixel in the resulting image depends a lot on what is done.

Well with povray I am willing to try a few experiments on a few
architectures just to see if it's possible to get an exact same data result.
For fun if nothing else.

> Some operations
> done by a raytracer like POV-Ray may be quite prone to even minuscule
> changes in the very least-significant bits of some calculations. For
> example a change of 0.00001% in the direction of a ray may in some scenes
> cause a big enough difference to produce a pixel which differs in the
> least significant bit (and sometimes even more).

The butterfly effect and I agree with that entirely.  I would expect that
the calculation of a surface normal will result in the exact same vector
every time but the compiler optimizations may modify this in the least bits
as you say.  I think that I saw a post ( years and years ago ) in which
someone repeatedly performed a POV-Ray trace of a sphere with each iteration
shrinking in scale by a factor of ten.  Eventually the microscopic sphere
was no longer a sphere anymore.  We can only be so precise with computers
and a given floating point implementation.

Dennis
