POV-Ray: Newsgroups: povray.programming: qtpovray: Re: qtpovray

From: clipka
Date: 21 May 2018 04:10:03
Message: <5b027edb$1@news.povray.org>
On 21.05.2018 at 00:43, dick balaska wrote:

> I have no idea how to get qmake to do super-optimization, i.e. use the
> libraries
> 
> .../platform/libx86avx.a ../platform/libx86avxfma4.a
> .../platform/libx86avx2fma3.a
> 
> There doesn't seem to be a "try/compile" thingy like in autoconf.

Here is how these things are supposed to work:

- The `platform/x86/FOO` directories contain source code hand-optimized
for particular instruction set extensions.

- It does not hurt to build and link _all_ of them: There is code in
place that will choose the best optimization _at run-time_ for whatever
x86 family CPU the binary actually happens to be running on. Most
importantly, that code will make sure that optimized code is _never_
invoked on a CPU that doesn't support the instruction set used therein.

- The reason we're compiling the instruction set specific code into
separate libraries is _not_ that we may want to exclude any of them, but
just that (1) g++ only accepts the hand-optimized code if we allow it to
use the same instruction set extensions in its own optimizations as
well, (2) we _must not_ allow instruction set extensions to be used
anywhere outside these particular files (because the functions therein
are the only ones guarded by run-time CPU detection), and (3) the
automake build process does not provide a mechanism to vary the compiler
settings on a per-file basis, so we need to place them in different
libraries.

In Visual Studio, this is different: There, instruction set switches
just govern compiler optimizations, while the instructions are always
available to hand-optimized code -- unless the compiler is too old to
support them at all, but that can easily be tested at compile time (and
the problematic code disabled accordingly) via preprocessor macros.

- The reason we have instruction set related tests in our autoconf build
process at all is _not_ to test whether the CPU supports them, but
merely whether the compiler does. A test for compiler brand and version
could conceivably do the same job, but in the autoconf framework a
direct test for compiler flags is the easier and more robust route.
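
To make the run-time selection described above a bit more concrete, here is a minimal sketch of the idea. It is not the actual POV-Ray code: the function names, the `Impl` enum and the order of preference between the variants are all made up for illustration, and the "optimized" functions simply forward to the portable one so that the sketch builds on its own. The `__builtin_cpu_init` / `__builtin_cpu_supports` built-ins are GCC (4.8 or later) and clang features for x86; on anything else the sketch falls back to the portable code.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical signature standing in for one of the hand-optimized routines.
using SumFn = double (*)(const double*, std::size_t);

// Portable fallback; safe to call on any CPU.
double sum_portable(const double* v, std::size_t n) {
    double s = 0.0;
    for (std::size_t i = 0; i < n; ++i) s += v[i];
    return s;
}

// Placeholders: in a real build each of these would live in its own library,
// compiled with -mavx, -mavx -mfma4 and -mavx2 -mfma respectively.
double sum_avx(const double* v, std::size_t n)      { return sum_portable(v, n); }
double sum_avxfma4(const double* v, std::size_t n)  { return sum_portable(v, n); }
double sum_avx2fma3(const double* v, std::size_t n) { return sum_portable(v, n); }

enum class Impl { Portable, Avx, AvxFma4, Avx2Fma3 };

// Ask the CPU what it supports. The preference order is an assumption.
Impl detect_best_impl() {
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    if (__builtin_cpu_supports("avx2") && __builtin_cpu_supports("fma"))
        return Impl::Avx2Fma3;
    if (__builtin_cpu_supports("avx") && __builtin_cpu_supports("fma4"))
        return Impl::AvxFma4;
    if (__builtin_cpu_supports("avx"))
        return Impl::Avx;
#endif
    return Impl::Portable;
}

// Map the detected level to the matching function.
SumFn select_sum(Impl impl) {
    switch (impl) {
        case Impl::Avx2Fma3: return &sum_avx2fma3;
        case Impl::AvxFma4:  return &sum_avxfma4;
        case Impl::Avx:      return &sum_avx;
        case Impl::Portable: break;
    }
    return &sum_portable;
}

// The only entry point the rest of the program calls. Detection runs once.
double sum_dispatch(const double* v, std::size_t n) {
    static const SumFn fn = select_sum(detect_best_impl());
    assert(fn != nullptr);
    return fn(v, n);
}
```

The point of the pattern is that callers only ever see `sum_dispatch`, so code built with special instruction set flags is reached exclusively through the guarded function pointer.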


So, to properly build the optimized code, there are two hurdles to overcome:

(1) Make sure the files in question are compiled with the corresponding
compiler flags, differing from the flags used for the other files. I'm
not sure how to do that with qmake, but from what I gather from the
internerds, `SUBDIRS` might be the keyword you want to search for.

The flags in question should be as follows:

/platform/x86/avx       -mavx
/platform/x86/avxfma4   -mavx -mfma4
/platform/x86/avx2fma3  -mavx2 -mfma  [sic!]
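
One way to double-check that a given file really was compiled with the flags listed above is to look at the macros GCC and clang predefine for them (`__AVX__`, `__FMA4__`, `__AVX2__`, `__FMA__`). The following is only a sketch; the `Variant` enum and the helper name are invented for illustration and are not part of the POV-Ray sources.

```cpp
// Turn the compiler's predefined macros into constants.
// -mavx defines __AVX__, -mfma4 defines __FMA4__,
// -mavx2 defines __AVX2__, -mfma defines __FMA__.
#ifdef __AVX__
constexpr bool kHaveAvx = true;
#else
constexpr bool kHaveAvx = false;
#endif

#ifdef __FMA4__
constexpr bool kHaveFma4 = true;
#else
constexpr bool kHaveFma4 = false;
#endif

#ifdef __AVX2__
constexpr bool kHaveAvx2 = true;
#else
constexpr bool kHaveAvx2 = false;
#endif

#ifdef __FMA__
constexpr bool kHaveFma = true;
#else
constexpr bool kHaveFma = false;
#endif

// Hypothetical names for the three optimized directories.
enum class Variant { Avx, AvxFma4, Avx2Fma3 };

// True if this translation unit was compiled with the flags that the
// given variant needs.
constexpr bool built_with_flags_for(Variant v) {
    return v == Variant::Avx     ? kHaveAvx
         : v == Variant::AvxFma4 ? (kHaveAvx && kHaveFma4)
         :                         (kHaveAvx2 && kHaveFma);
}
```

A file in the avx directory could then carry something like `static_assert(built_with_flags_for(Variant::Avx), "compile with -mavx");` so that a misconfigured build fails early. The sketch leaves that line out so that it compiles regardless of the flags in effect.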

(2) Make sure the compiler in general supports the flags in question. My
guess would be that the easiest route with qmake is probably to detect
compiler brand and version, and from that decide which flags to use
(and, accordingly, which of the platform-specific source code files to
include in the build).

For GCC, the version numbers appear to be as follows:

-mfma4    since gcc 4.5
-mavx     since gcc 4.6
-mfma     since gcc 4.7
-mavx2    since gcc 4.7

For clang I have no corresponding information. From the perspective of
the source code, it mimics the GCC signature and version numbers, so
there never was any need to research its own version history with
respect to the flags. (A potential problem with clang to watch out for
is that Apple's Xcode uses its own version of clang with a different
version numbering scheme.)

A very simple solution would be to require gcc 4.7 or later, or a
compatible clang version, as a prerequisite.
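
For what it's worth, the version table and the "gcc 4.7 or later" prerequisite could be written down roughly like this. It is a sketch only: the `Flag` enum and the helper names are invented, the version numbers are the ones listed above, and clang is deliberately left out of the final check because I have no version data for it.

```cpp
// The four compiler flags from the table above.
enum class Flag { Fma4, Avx, Fma, Avx2 };

// Encode major.minor as a single comparable integer.
constexpr int version(int major, int minor) { return major * 100 + minor; }

// Minimum GCC version for each flag, as listed above.
constexpr int min_gcc_version(Flag f) {
    return f == Flag::Fma4 ? version(4, 5)
         : f == Flag::Avx  ? version(4, 6)
         :                   version(4, 7);  // -mfma and -mavx2
}

// True if a GCC of the given version accepts the flag.
constexpr bool gcc_supports(Flag f, int major, int minor) {
    return version(major, minor) >= min_gcc_version(f);
}

// The simple solution: require a GCC that accepts all four flags.
constexpr bool gcc_meets_prerequisite(int major, int minor) {
    return gcc_supports(Flag::Fma4, major, minor)
        && gcc_supports(Flag::Avx,  major, minor)
        && gcc_supports(Flag::Fma,  major, minor)
        && gcc_supports(Flag::Avx2, major, minor);
}

// Enforce the prerequisite for genuine GCC only; clang is skipped here.
#if defined(__GNUC__) && !defined(__clang__)
static_assert(gcc_meets_prerequisite(__GNUC__, __GNUC_MINOR__),
              "gcc 4.7 or later is required for the optimized code");
#endif
```

In a qmake setup the same decision would more likely be made in the project file than in C++, but the logic would be the same: compare the detected version against the table and either enable all three optimized directories or none.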