POV-Ray: Newsgroups: povray.binaries.images: Divide by negative zero: Re: Divide by negative zero

From: William F Pokorny
Date: 3 Nov 2018 11:06:45
Message: <5bddb985$1@news.povray.org>

On 11/3/18 4:36 AM, dick balaska wrote:
> On 11/2/18 10:20 AM, William F Pokorny wrote:
> 
...
> 
> -march=native is interesting.  "Up to 60% performance increase" [1].
> Well that's a hell of a thing if true. 

If you're talking about not using, say, AVX2 and AVX512 instructions, the 
performance can easily be 2-4x worse than the best possible for some code.

> But, you have to know which
> Intel arch you are compiling for.  Skylake/Haswell/Cannonlake/etc are
> all different.
> 

Earlier this year Christoph suggested I think about enabling solver 
code optimized for different architectures as I work on solver 
fixes/updates. I found the gcc/g++ compilers have a cool feature called 
Function Multi-Versioning (FMV). The Linux kernel has been using the 
feature for a long while. In addition to letting you hard-code 
CPU-specific optimizations, it lets you optimize for a collection of 
processors/features with a one-line modification to the code:

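#define MAX 1000000
int a[256], b[256], c[256];

/* One clone of foo() is compiled per listed target, plus a default. */
__attribute__((target_clones("avx2","arch=atom","default")))
void foo(){
     int i,x;
     /* Repeat the vectorizable inner loop MAX times. */
     for (x=0; x<MAX; x++){
         for (i=0; i<256; i++){
             a[i] = b[i] + c[i];
         }
     }
}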

The compiler creates a version of the function for each listed target 
clone, and at run time the best version (by some priority) of the 
function gets used. For later gcc versions and 'gnu' dynamically 
loaded code, the resolution happens once during the load: the pointer to 
the function gets fixed to the 'best' compiled version for the CPU's 
feature set.
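
To make the "resolve once, then call through a pointer" idea concrete, 
here is a rough hand-written sketch of the kind of dispatch that 
target_clones automates. It is only an illustration, not what gcc 
actually generates. It assumes gcc (or a compatible compiler) on x86-64; 
the names add_default, add_avx2, resolve_add and fill_inputs are 
hypothetical helpers made up for this example.

```c
#define N 256
int a[N], b[N], c[N];

/* Baseline version, built with whatever the default target flags are. */
void add_default(void) {
    for (int i = 0; i < N; i++) a[i] = b[i] + c[i];
}

/* Same loop, but the compiler is allowed to use AVX2 in this function. */
__attribute__((target("avx2")))
void add_avx2(void) {
    for (int i = 0; i < N; i++) a[i] = b[i] + c[i];
}

typedef void (*add_fn)(void);

/* Hypothetical resolver: check the CPU once and return the version to use. */
add_fn resolve_add(void) {
    __builtin_cpu_init();  /* gcc x86 builtin; sets up the CPU feature data */
    return __builtin_cpu_supports("avx2") ? add_avx2 : add_default;
}

/* Hypothetical helper: fill the inputs with known values. */
void fill_inputs(void) {
    for (int i = 0; i < N; i++) { b[i] = i; c[i] = 2 * i; }
}
```

A caller would run fill_inputs(), store the result of resolve_add() in 
a function pointer once, and call through that pointer from then on. 
Both versions compute the same result; only the instructions used 
differ. With target_clones the compiler writes the clones and the 
resolver for you.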

Unfortunately I have not yet had time to play with the feature. The FMV 
compiler feature isn't standardized across compilers, though others seem 
to have something like it.

> If that is true, it's too bad there isn't a switch in packages based on
> arch.
> (This took me down a rabbit hole lamenting that debian/Ubuntu is stuck
> on "Hey, it runs on any x64 hardware, even AMD".
> Which made me think of Jr; "If I was still running Slackware, I'd have
> built my own Intel Clear Linux kernel with clang" ;)  )
> 
> [1]
> https://www.phoronix.com/scan.php?page=news_item&px=GCC-8-march-native-Skylake
> 

Bill P.
