On 24.06.2021 at 05:20, Thomas Debe wrote:
> On 24.06.21 at 01:28, clipka wrote:
> Hello Christoph!
>
> 1.) Problem:
> Build povunix-v3.8.9-beta.668.tar.gz Compiler clang-x:
> Optimized Noise-Functions not compiled !
> Clang defines __clang__ Not __GNUC__
(I presume that's a typo and you mean povunix-v3.8.0-beta.668.tar.gz.)
Can you please double-check whether these problems are specific to the
Unix source package (povunix-v3.8.0-beta.668.tar.gz), or whether they
also occur when building from the "raw" repository source
(https://github.com/c-lipka/povray/archive/refs/tags/v3.8.0-pre-beta.668.tar.gz)?
>
> Solution :
> File : unix/povconfig/syspovconfig.h
>
> Line 169ff: #if defined (__clang__)
> #define HAVE_ASM_AVX
> #define HAVE_ASM_AVX2
> #define HAVE_ASM_FMA3
> #define HAVE_ASM_FMA4
> #endif
>
> Line 179: // most notably platform-specific optimized implementations.
> #if defined (__GNUC__) || defined(__clang__)
>
> 2.) Problem Optimized Noise-Function Compiler: gcc-10
> Povray-Output :
> ....
>
> Dynamic optimizations:
> CPU detected: AMD,SSE2,AVX,AVX2,FMA3
> Noise generator: avx-generic (compiler-optimized)
>
> CPU: AMD Ryzen 2700
> cat /proc/cpuinfo | grep avx : se4_1 sse4_2 movbe popcnt aes xsave avx ..
> avx2
> fma
>
> So it should work with Intel's implementation, but there is a vendor
> check in:
>
> platform/x86/cpuid.cpp
Yes - and that is very much deliberate. When AMD provided us with the
AVX/FMA4 optimized code back in mid-2017, they also did some thorough
performance testing on a very diverse farm of ~20 AMD and ~25 Intel
machines, and ended up strongly recommending to specifically *NOT* use
the AVX2/FMA3 optimized code (which had been provided by Intel years
earlier) on AMD processors, but rather give preference to the portable
code, in a variant compiler-optimized for AVX.
That was the recommendation for the Windows builds, anyway, but unless I
see numbers from extensive and thorough analysis of Linux builds, I
presume Linux compilers are on a similar level when it comes to
automatic optimization.
There was even some suspicion that Intel might, back in the day, have
custom-tailored their optimized code specifically to work poorly on AMD
machines.
If you're seeing performance improvements with Intel's AVX2/FMA3
hand-optimized code, then by all means use it; but I would recommend
that you double-check whether it even does any good at all.
> Solution:
>
> bool CPUInfo::IsIntel()
> {
> return gpData->cpuidInfo.vendorId == kCPUVendor_Intel|| kCPUVendor_AMD;
> }
Um... no, that would be broken on multiple levels. For starters, it
fails to do what you probably intend it to do (it actually makes the
function always return `true`, even if the vendor is neither Intel nor
AMD). And even if it worked as you intend, it would break the whole
purpose of `CPUInfo::IsIntel()` - namely to detect whether the vendor
*is*, as a matter of fact, *genuine* Intel.
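
To illustrate the first point, here is a minimal stand-alone sketch. The enum and its values are stand-ins I made up for this example (the real definitions in `platform/x86/cpuid.cpp` may differ); the only assumption that matters is that `kCPUVendor_AMD` is non-zero. Because `==` binds tighter than `||`, the proposed expression compares the vendor against Intel and then ORs the result with a constant that is always true.

```cpp
#include <cassert>

// Hypothetical stand-in for the vendor IDs; the real values may differ.
// The sketch only assumes that kCPUVendor_AMD is non-zero.
enum CPUVendor
{
    kCPUVendor_Unrecognized = 0,
    kCPUVendor_Intel        = 1,
    kCPUVendor_AMD          = 2
};

// The expression as proposed. It parses as
//   (vendor == kCPUVendor_Intel) || (kCPUVendor_AMD != 0)
// so the right-hand operand is a non-zero constant and the result is always true.
bool IsIntelAsProposed(CPUVendor vendor)
{
    return vendor == kCPUVendor_Intel || kCPUVendor_AMD;
}

// What was presumably intended: compare the vendor against each constant.
bool IsIntelOrAMD(CPUVendor vendor)
{
    return vendor == kCPUVendor_Intel || vendor == kCPUVendor_AMD;
}

// Usage examples, written as assertions.
void VendorCheckExamples()
{
    assert(IsIntelAsProposed(kCPUVendor_Unrecognized)); // true even for an unknown vendor
    assert(!IsIntelOrAMD(kCPUVendor_Unrecognized));     // the intended check rejects it
    assert(IsIntelOrAMD(kCPUVendor_AMD));
}
```

Note that even the second variant would not belong in a function named `IsIntel()`; it is shown only to make the precedence problem visible.
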
If it should indeed be the case that modern AMD processors also prefer
Intel's AVX2/FMA3 hand-optimized code, then what we'd really want to
change is just the matrix in `platform/x86/optimizednoise.cpp`, which
tells POV-Ray - based on the CPU features (and vendor!) we detect - what
optimized noise implementation to use.
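
As a rough sketch of what such a selection matrix does, consider the following. This is not the actual content of `platform/x86/optimizednoise.cpp`: the struct layout, the function names and all row names except `avx-generic` (which appears in the output quoted above) are hypothetical and chosen for illustration only. The idea is that vendor policy lives in one table row, so the vendor check itself can stay honest.

```cpp
#include <cassert>
#include <string>

// Hypothetical description of the detected CPU.
enum class Vendor { Other, Intel, AMD };

struct CpuFeatures
{
    Vendor vendor;
    bool avx, avx2, fma3, fma4;
};

// One row of the (hypothetical) matrix: a name and a predicate that says
// whether this implementation should be used on the given CPU.
struct NoiseEntry
{
    const char* name;
    bool (*usable)(const CpuFeatures&);
};

// Vendor policy sits here: the hand-optimized AVX2/FMA3 row is limited to Intel.
static bool UseAvx2Fma3(const CpuFeatures& c)   { return c.avx2 && c.fma3 && c.vendor == Vendor::Intel; }
static bool UseAvxFma4(const CpuFeatures& c)    { return c.avx && c.fma4; }
static bool UseAvxGeneric(const CpuFeatures& c) { return c.avx; }
static bool UsePortable(const CpuFeatures&)     { return true; }

// Rows are ordered by preference; the first usable row wins.
static const NoiseEntry kNoiseTable[] = {
    { "avx2fma3-intel", UseAvx2Fma3   },   // hypothetical name
    { "avxfma4-amd",    UseAvxFma4    },   // hypothetical name
    { "avx-generic",    UseAvxGeneric },   // name taken from the quoted output
    { "generic",        UsePortable   },   // hypothetical name
};

std::string SelectNoise(const CpuFeatures& cpu)
{
    for (const NoiseEntry& entry : kNoiseTable)
        if (entry.usable(cpu))
            return entry.name;
    return "generic"; // not reached, since the last row is always usable
}

// Usage example: an AMD CPU with AVX, AVX2 and FMA3 but no FMA4.
void SelectNoiseExample()
{
    CpuFeatures amd = { Vendor::AMD, true, true, true, false };
    assert(SelectNoise(amd) == "avx-generic");
}
```

In a layout like this, letting AMD processors use the AVX2/FMA3 code would mean relaxing the vendor condition in that one row, after benchmarks justify it.
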