Am 27.06.2021 um 14:33 schrieb Thomas Debe:
> Hello !
> Sorry for the delay !
>
> Am 24.06.21 um 13:19 schrieb clipka:
> ..
>> Can you please double-check whether these problems are specific to the
>> Unix source package (povunix-v3.8.0-beta.668.tar.gz), or whether they
>> also occur when building from the "raw" repository source
>> (https://github.com/c-lipka/povray/archive/refs/tags/v3.8.0-pre-beta.668.tar.gz)?
>>
> Yes I can confirm the behavior for clang-10,clang-11 and clang-12.
"Yes" as in "yes, that's specific to the Unix source package", or as in
"yes, they also occur when building from the 'raw' repository source"?
It can't be both, and to investigate the matter it would help to know
which of the two is the case.
> ..
>> Yes - and that is very much deliberate. When AMD provided us with the
>> AVX/FMA4 optimized code back in mid-2017, they also did some thorough
>> performance testing on a very diverse farm of ~20 AMD and ~25 Intel
>> machines, and ended up strongly recommending to specifically *NOT* use
>> the AVX2/FMA3 optimized code (which had been provided by Intel years
>> earlier) on AMD processors, but rather give preference to the portable
>> code, in a variant compiler-optimized for AVX.
>
> The real background was probably a bug in the FMA3 implementation of the
> first Ryzen [1].
No, it was a general lack of performance they also observed for older
FMA3-enabled CPUs. Unless AMD provided us with falsified data, the
numbers were quite clear.
> The Ryzen was the first processor from AMD to support FMA3. And FMA4 is
> not officially supported via the CPU flags.
Ryzen was the first to support FMA4. Which did indeed not survive for
long, at least not officially.
AMD have had FMA3-capable CPUs in their portfolio since 2012, 6 years
earlier.
> In German:
> [1]
> https://www.heise.de/newsticker/meldung/AMD-bestaetigt-FMA3-Bug-bei-Ryzen-3658407.html
That bug did not manifest in performance issues (as far as I know,
anyway), but in total CPU lockups. We have no indication that this bug
was ever triggered by POV-Ray's Intel-optimized noise generator.
Also, I'm rather sure the people we had been dealing with at AMD
expected the Ryzen to support FMA4, so even if the FMA3 bug had been a
known issue back then and they therefore wanted to avoid running into it
in POV-Ray, recommending their AVX/FMA4 optimization would have appeared
to do the job. There would not have been any need to also discourage
AVX2/FMA3 on AMD CPUs in general.
>>> Solution:
>>>
>>> bool CPUInfo::IsIntel()
>>> {
>>> return gpData->cpuidInfo.vendorId == kCPUVendor_Intel||
>>> kCPUVendor_AMD;
>>> }
>>
>> Um... no, that would be broken on multiple levels. For starters, it
>> fails to do what you probably intend it to do (it actually makes the
>> function always return `true`, even if the vendor is neither Intel nor
>> AMD).
>
> Like :
> bool CPUInfo::IsIntel()
> {
> return true;
> }
> ????
Yes, that's what the above code boils down to.
The `==` equality operator binds more tightly than the `||` boolean OR
operator, so the comparison is evaluated first and yields a boolean that
is sometimes true and sometimes false. The expression to the right of
the `||` is just an enum constant though, which is automatically
promoted to its corresponding int value (which is non-zero). Due to its
C heritage, C++ allows that int value to be used as a boolean, in which
case any value other than zero is interpreted as "true".
So effectively you have
    return
        (gpData->cpuidInfo.vendorId == kCPUVendor_Intel) ||
        (kCPUVendor_AMD != 0);
Even if you were to put parentheses around the kCPUVendor* codes, it
would not do what you'd expect:
    return
        gpData->cpuidInfo.vendorId ==
        (kCPUVendor_Intel || kCPUVendor_AMD);
is _not_ asking whether vendorId is any one of these values, but rather
whether it is equal to the integer representation of the boolean OR of
the boolean interpretation of the two enum constants' integer IDs:
    return
        gpData->cpuidInfo.vendorId == (
            ( (kCPUVendor_Intel != 0) || (kCPUVendor_AMD != 0) ) ? 1 : 0
        );
So if at least one of kCPUVendor_Intel and kCPUVendor_AMD has a non-zero
associated int value (which is indeed the case), the function returns
true if and only if the vendor ID is 1. If on the other hand both enum
constants had an associated int value of 0, the function would return
true if and only if the vendor ID was 0.
In C++, what you presumably mean would have to be written as:
    return
        ( gpData->cpuidInfo.vendorId == kCPUVendor_Intel ) ||
        ( gpData->cpuidInfo.vendorId == kCPUVendor_AMD );
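For anyone who wants to try the three variants side by side, here is a
minimal stand-alone sketch. The enum and its numeric values are made up
for illustration and are an assumption, not POV-Ray's actual
definitions; the function names are likewise hypothetical. Expect the
compiler to warn about the first two functions, which is rather the
point.

```cpp
// Hypothetical stand-in for the vendor enum; the real values may differ.
enum CPUVendor
{
    kCPUVendor_Unknown = 0,
    kCPUVendor_Intel   = 1,
    kCPUVendor_AMD     = 2
};

// Proposed variant: parsed as (vendorId == Intel) || (AMD != 0).
constexpr bool IsIntelOrAmdBroken(CPUVendor vendorId)
{
    return vendorId == kCPUVendor_Intel || kCPUVendor_AMD;
}

// Parenthesized variant: compares vendorId against the boolean result
// of (Intel || AMD), i.e. against the integer 1.
constexpr bool IsIntelOrAmdParenthesized(CPUVendor vendorId)
{
    return vendorId == (kCPUVendor_Intel || kCPUVendor_AMD);
}

// Intended test: compare vendorId against each constant separately.
constexpr bool IsIntelOrAmdFixed(CPUVendor vendorId)
{
    return (vendorId == kCPUVendor_Intel) ||
           (vendorId == kCPUVendor_AMD);
}

// The proposed variant accepts a vendor that is neither Intel nor AMD.
static_assert(IsIntelOrAmdBroken(kCPUVendor_Unknown),
              "always true, whatever the vendor");

// With the assumed values, the parenthesized variant only matches ID 1.
static_assert(IsIntelOrAmdParenthesized(kCPUVendor_Intel),
              "matches only because Intel happens to be 1 here");
static_assert(!IsIntelOrAmdParenthesized(kCPUVendor_AMD),
              "AMD (2) is not equal to 1");

// The separate comparisons behave as intended.
static_assert(IsIntelOrAmdFixed(kCPUVendor_Intel), "Intel accepted");
static_assert(IsIntelOrAmdFixed(kCPUVendor_AMD), "AMD accepted");
static_assert(!IsIntelOrAmdFixed(kCPUVendor_Unknown), "others rejected");
```

Note that the parenthesized variant only appears to work for Intel in
this sketch because the assumed Intel ID happens to be 1; with any other
numbering it would match some unrelated vendor, or none at all.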