POV-Ray: Newsgroups: povray.beta-test: Technical verification build "v3.8.0-beta.668" - Unix package!: Re: Technical verification build "v3.8.0-beta.668"

POV-Ray : Newsgroups : povray.beta-test : Technical verification build "v3.8.0-beta.668" - Unix package! : Re: Technical verification build "v3.8.0-beta.668" - Unix package!		Server Time 23 Apr 2024 15:35:02 EDT (-0400)

From: clipka
Date: 24 Jun 2021 07:19:37
Message: <60d46a49$1@news.povray.org>

Am 24.06.2021 um 05:20 schrieb Thomas Debe:
> Am 24.06.21 um 01:28 schrieb clipka:
> Hallo Christoph !
> 
> 1.) Problem:
> Build povunix-v3.8.9-beta.668.tar.gz Compiler clang-x:
> Optimized Noise-Functions not compiled !
> Clang defines __clang__ Not __GNUC__

(I presume that's a typo and you mean povunix-v3.8.0-beta.668.tar.gz.)

Can you please double-check whether these problems are specific to the 
Unix source package (povunix-v3.8.0-beta.668.tar.gz), or whether they 
also occur when building form the "raw" repository source 
(https://github.com/c-lipka/povray/archive/refs/tags/v3.8.0-pre-beta.668.tar.gz)?

> 
> Solution :
> File :  unix/povconfig/syspovconfig.h
> 
> Z 169ff:  #if defined (__clang__)
>              #define HAVE_ASM_AVX
>              #define HAVE_ASM_AVX2
>              #define HAVE_ASM_FMA3
>              #define HAVE_ASM_FMA4
>           #endif
> 
> Z. 179: // most notably platform-specific optimized implementations.
>          #if defined (__GNUC__) || defined(__clang__)
> 
> 2.) Problem Optimized Noise-Function Compiler: gcc-10
> Povray-Output :
>                 ....
> 
> Dynamic optimizations:
>    CPU detected: AMD,SSE2,AVX,AVX2,FMA3
>    Noise generator: avx-generic (compiler-optimized)
> 
> CPU: AMD Ryzen 2700
> cat /proc/cpuinfo | grep avx :  se4_1 sse4_2 movbe popcnt aes xsave avx ..
>   avx2
>   fma
> 
> So it should work with Intels implementation, but there is an vendor 
> check in :
> 
> platform/x86/cpuid.cpp

Yes - and that is very much deliberate. When AMD provided us with the 
AVX/FMA4 optimized code back in mid-2017, they also did some thorough 
performance testing on a very diverse farm of ~20 AMD and ~25 Intel 
machines, and ended up strongly recommending to specifically *NOT* use 
the AVX2/FMA3 optimized code (which had been provided by Intel years 
earlier) on AMD processors, but rather give preference to the portable 
code, in a variant compiler-optimized for AVX.

That was the recommendation for the Windows builds, anyway, but unless I 
see numbers from extensive and thorough analysis of Linux builds, I 
presume Linux compilers are on a similar level when it comes to 
automatic optimization.

There was even some suspicion that Intel might, back in the days, have 
custom-tailored their optimized code specifically to work poorly on AMD 
machines.


If you're seeing performance improvements with Intel's AVX2/FMA3 
hand-optimized code, then by all means use it; but I would recommend 
that you double-check whether it even does any good at all.


> Solution:
> 
> bool CPUInfo::IsIntel()
> {
>   return gpData->cpuidInfo.vendorId == kCPUVendor_Intel|| kCPUVendor_AMD;
> }

Um... no, that would be broken on multiple levels. For starters, it 
fails to do what you probably intend it to do (it actually makes the 
function always return `true`, even if the vendor is neither Intel nor 
AMD). And even if it worked as you intend, it would break the whole 
purpose of `CPUInfo::IsIntel()` - namely to detect whether the vendor 
*is*, as a matter of fact, *genuine* Intel.

If it should indeed be the case that modern AMD processors also prefer 
Intel's AVX2/FMA3 hand-optimized code, then what we'd really want to 
change is just the matrix in `platform/x86/optimizednoise.cpp`, which 
tells POV-Ray - based on the CPU features (and vendor!) we detect - what 
optimized noise implementation to use.

Post a reply to this message