POV-Ray: Newsgroups: povray.off-topic: C/C++ Data Type Ambiguity Backwards

From: clipka
Subject: C/C++ Data Type Ambiguity Backwards
Date: 21 Aug 2015 08:12:50
Message: <55d715c2$1@news.povray.org>

Okay, I guess everyone who has ever touched C or C++ has at least heard
rumors of this: The standard data types, such as "int", "short int" or
"long int", are anything but. For instance, a "long int" will typically
be 32 bits wide on a 32 bit machine, but 64 bits on a 64 bit machine -
unless you're running Windows, in which case it's still 32 bit. And
"int" will typically be 32 bits wide - unless you're running on a 64 bit
Cray machine, in which case it will be a whopping 64 bit as well. Or on
an embedded computer, in which case it may be as small as 16 bits. Hell,
there are even systems out there where the most fundamental data type,
"char", is not 8 but 16 bits wide!

I learned this lesson about a decade ago (well, the thing about the
"char" data type, anyway; I might have heard about the other type woes
long before), and also that fortunately the header file <limits.h> (C)
or <climits> (C++) provides some relief: It provides a set of macros
that at least tell you what the minimum and maximum values of those
types are. For instance, "UINT_MAX" will tell you the highest number
that fits in /your/ particular "unsigned int", "SHRT_MAX" will tell you
the same for (signed) "short", and so forth.
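
A minimal sketch of how these limits can be inspected on a given toolchain, assuming a C++11 compiler. The helper name `bits_of` and the constant `kUnsignedIntIs32Bit` are made up for illustration; the macros and `std::numeric_limits` are standard.

```cpp
#include <climits>   // CHAR_BIT, UINT_MAX, SHRT_MAX
#include <cstddef>   // std::size_t
#include <limits>    // std::numeric_limits

// Hypothetical helper: storage size of T in bits on this platform.
template <typename T>
constexpr std::size_t bits_of() {
    return sizeof(T) * CHAR_BIT;
}

// The <climits> macros and std::numeric_limits report the same limits.
static_assert(UINT_MAX == std::numeric_limits<unsigned int>::max(),
              "UINT_MAX is the largest unsigned int");
static_assert(SHRT_MAX == std::numeric_limits<short>::max(),
              "SHRT_MAX is the largest short");

// UINT_MAX follows the platform's type, not any file format: it equals
// 2^32-1 only where unsigned int happens to have 32 value bits.
constexpr bool kUnsignedIntIs32Bit = (UINT_MAX == 0xFFFFFFFFu);
```

For example, `bits_of<long>()` gives 32 on 64-bit Windows and 64 on typical 64-bit Linux systems, which is exactly the ambiguity described above.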

Now the data type ambiguity has struck back with a vengeance, right
before my eyes, in the POV-Ray code:

Imagine you need to read a 32 bit integer from a file, and convert it to
a floating point value in the range from 0.0 (corresponding to integer
value 0) to 1.0 (corresponding to integer value 2^32-1). How do you do that?

Well, after reading 4 bytes from the file into a variable that is
supposedly large enough to hold those 4 bytes (we're using an unsigned
int there... whoops), you convert the value straight to floating point
format (giving you a value from 0.0 to 2.0^32-1.0), and then of course
divide by UINT_MAX...

... wait, *WHAT?*

Okay, I can understand how someone might be oblivious enough of the type
issues to shove a 32 bit value into an unsigned int without thinking
twice. But that /constant/ is there exactly because unsigned int is
/not/ guaranteed to be 32 bits wide - and we're seriously using that
very same constant with the /adamant/ presumption that it /is/?

*NOM!*
There. Another bite mark in my desk.

Needless to say, Imma throw this outta the window. (The code, not the
desk. That would be a tad too heavy.)
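
One way the conversion could be written without leaning on the width of "unsigned int" is sketched below. This is not the actual POV-Ray code: the function names are hypothetical, C++11 is assumed, and big-endian byte order in the file is an assumption made purely for the example.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: assemble a 32-bit value from 4 bytes read from a
// file. Big-endian byte order is an assumption made for this sketch.
inline std::uint_least32_t read_u32_be(const unsigned char* p) {
    assert(p != nullptr);
    return (static_cast<std::uint_least32_t>(p[0] & 0xFFu) << 24) |
           (static_cast<std::uint_least32_t>(p[1] & 0xFFu) << 16) |
           (static_cast<std::uint_least32_t>(p[2] & 0xFFu) << 8) |
            static_cast<std::uint_least32_t>(p[3] & 0xFFu);
}

// Map 0 .. 2^32-1 onto 0.0 .. 1.0. The divisor comes from the file
// format (32 bits), not from UINT_MAX, which depends on the platform.
inline double normalize_u32(std::uint_least32_t v) {
    const double kMaxU32 = 4294967295.0;  // 2^32 - 1
    return static_cast<double>(v) / kMaxU32;
}
```

The point of the design is that both the storage type ("at least 32 bits") and the divisor (2^32-1) are derived from the file format rather than from whatever "unsigned int" happens to be on the build machine.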

From: Anthony D Baye
Subject: Re: C/C++ Data Type Ambiguity Backwards
Date: 22 Aug 2015 14:20:01
Message: <web.55d8bc7331c53d072aaea5cb0@news.povray.org>

clipka <ano### [at] anonymousorg> wrote:
> Okay, I guess everyone who has ever touched C or C++ has at least heard
> rumors of this: The standard data types, such as "int", "short int" or
> "long int", are anything but. For instance, a "long int" will typically
> be 32 bits wide on a 32 bit machine, but 64 bits on a 64 bit machine -
> unless you're running Windows, in which case it's still 32 bit. And
> "int" will typically be 32 bits wide - unless you're running on a 64 bit
> Cray machine, in which case it will be a whopping 64 bit as well. Or on
> an embedded computer, in which case it may be as small as 16 bits. Hell,
> there are even systems out there where the most fundamental data type,
> "char", is not 8 but 16 bits wide!
>
> I've learned this lesson about a decade ago (well, the thing about the
> "char" data type, anyway; I might have heard about the other type woes
> long before), and also that fortunately the header file <limits.h> (C)
> or <climits> (C++) provides some relief: It provides a set of macros
> that at least tell you what the minimum and maximum values of those
> types are. For instance, "UINT_MAX" will tell you the highest number
> that fits in /your/ particular "unsigned int", "SHRT_MAX" will tell you
> the same for (signed) "short", and so forth.
>
> Now the data type ambiguity has struck back with a vengeance, right
> before my eyes, in the POV-Ray code:
>
> Imagine you need to read a 32 bit integer from a file, and convert it to
> a floating point value in the range from 0.0 (correspondng to integer
> value 0) to 1.0 (corresponding to integer value 2^32-1). How do you do that?
>
> Well, after reading 4 bytes from the file into a variable that is
> supposedly large enough to hold those 4 bytes (we're using an unsigned
> int there... whoops), you convert the value straight to floating point
> format (giving you a value from 0.0 to 2.0^32-1.0), and then of course
> divide by UINT_MAX...
>
> ... wait, *WHAT?*
>
> Okay, I can understand how someone might be oblivious enough of the type
> issues to shove a 32 bit value into an unsigned int without thinking
> twice. But that /constant/ is there exactly because unsigned int is
> /not/ guaranteed to be 32 bits wide - and we're seriously using that
> very same constant with the /adamant/ presumption that it /is/?
>
> *NOM!*
> There. Another bite mark in my desk.
>
> Needless to say, Imma throw this outta the window. (The code, not the
> desk. That would bee a tad too heavy.)

I've heard about this (Except for the char thing... that's just weird) but never
given it much thought. (Short sighted maybe)

I know that Qt Widgets provides its own fixed width data types, but the C99
stdint.h has them too.

http://en.cppreference.com/w/cpp/types/integer

Regards,
A.D.B.

From: Anthony D Baye
Subject: Re: C/C++ Data Type Ambiguity Backwards
Date: 22 Aug 2015 14:40:01
Message: <web.55d8c12031c53d072aaea5cb0@news.povray.org>

That said: The fixed types with -exactly- 8, 16, 32 or 64 bits appear to be
optional, so there's no guarantee they'll be defined...

that would be aggravating.
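
A minimal sketch of how code could cope with the exact-width types being optional, assuming C++11 and <cstdint>. The names `file_u32` and `kHasExactU32` are hypothetical; the check relies on the rule that the macro UINT32_MAX is defined only when the type uint32_t exists.

```cpp
#include <climits>   // CHAR_BIT
#include <cstdint>   // std::uint32_t (optional), std::uint_least32_t

// UINT32_MAX is defined by <cstdint> only if std::uint32_t exists.
#if defined(UINT32_MAX)
typedef std::uint32_t file_u32;        // exactly 32 bits
constexpr bool kHasExactU32 = true;
#else
typedef std::uint_least32_t file_u32;  // at least 32 bits, always present
constexpr bool kHasExactU32 = false;
#endif

static_assert(sizeof(file_u32) * CHAR_BIT >= 32,
              "file_u32 must be able to hold 32 bits");
```

Code using the fallback type has to mask values to 32 bits itself wherever wrap-around matters.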

http://www.azillionmonkeys.com/qed/pstdint.h
http://www.boost.org/doc/libs/1_38_0/libs/integer/index.html
https://en.wikipedia.org/wiki/C_data_types#Downloads

In all likelihood, you're probably aware of all of these, since your code-fu
is vastly stronger than mine.

Regards,
A.D.B.

From: Stephen
Subject: Re: C/C++ Data Type Ambiguity Backwards
Date: 22 Aug 2015 15:30:35
Message: <55d8cddb$1@news.povray.org>

On 8/22/2015 7:36 PM, Anthony D. Baye wrote:
> In all likelihood, you're probably aware of all of these, since your code-fu
> is vastly stronger than mine.


I bet my unco fu is stronger than his ;-)

It is OT after all. :-)

-- 

Regards
     Stephen

From: clipka
Subject: Re: C/C++ Data Type Ambiguity Backwards
Date: 23 Aug 2015 00:19:55
Message: <55d949eb$1@news.povray.org>

On 22.08.2015 at 20:16, Anthony D. Baye wrote:

> I've heard about this (Except for the char thing... that's just weird) but never
> given it much thought. (Short sighted maybe)

Would that be signed or unsigned short sighted? :-P

> I know that Qt Widgets provides it's own fixed width data types, but the C99
> stdint.h had them too.

C++11 <cstdint> has them as well. (*)

Alas, C++03 is based on pre-99 C, so it doesn't have them. So POV-Ray
does its best to try and figure out matching types, but may have to fall
back to relying on the guaranteed minimum widths of the char, short,
int, long and long long types.
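
A rough sketch of what such a fallback can look like with nothing but the <climits> macros (so it also works before C++11). This is an illustration only, not POV-Ray's actual configuration code, and the typedef name `u32_least` is made up.

```cpp
#include <climits>  // USHRT_MAX, UINT_MAX

// Pick the first standard unsigned type that can hold 32 bits.
#if USHRT_MAX >= 0xFFFFFFFF
typedef unsigned short u32_least;
#elif UINT_MAX >= 0xFFFFFFFF
typedef unsigned int u32_least;
#else
typedef unsigned long u32_least;  // long is guaranteed to have at least 32 bits
#endif
```

The result is only an "at least 32 bits" type; whether it is exactly 32 bits wide still depends on the platform.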


----------------------------
(*)

BTW, it should be noted that actually <stdint.h> or <cstdint> does *not*
necessarily have those standard fixed width data types "intN_t" and
"uintN_t" with N={8,16,32,64}, as they are only mandatory if the runtime
environment happens to support integers of that exact width; in ILP64
environments, for instance, you could theoretically miss out on N=32
(presuming short is 16 bits wide).

Also, all the "intN_t" are unavailable if the runtime environment does
not use two's complement format for negative integers.

The only things we're guaranteed to get are "int_leastN_t",
"uint_leastN_t", "int_fastN_t" and "uint_fastN_t", with N={8,16,32,64},
which are /at least/ N bits wide, with the added guarantee that the
"*_leastN_t" variants are the smallest and the "*_fastN_t" variants the
fastest types that fit the bill.
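
The guarantees for the always-present types can be written down as compile-time checks. This is a small sketch assuming C++11; it uses only the standard type names from <cstdint>.

```cpp
#include <climits>   // CHAR_BIT
#include <cstdint>   // std::uint_least32_t, std::uint_fast32_t, ...

// Both families are at least N bits wide ...
static_assert(sizeof(std::uint_least32_t) * CHAR_BIT >= 32,
              "uint_least32_t holds at least 32 bits");
static_assert(sizeof(std::uint_fast32_t) * CHAR_BIT >= 32,
              "uint_fast32_t holds at least 32 bits");

// ... and the "least" type is never larger than the "fast" type,
// because it is the smallest type that satisfies the width requirement.
static_assert(sizeof(std::uint_least16_t) <= sizeof(std::uint_fast16_t),
              "least variant is the smallest candidate");
```

On many 64-bit Linux toolchains the "fast" 16 and 32 bit variants are in fact 64 bits wide, so neither family should be used where an exact width is required.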

Also note that "int_leastN_t" and "int_fastN_t" may be unable to
represent -2^(N-1) (in contrast to "intN_t" which, if present, is
guaranteed to represent exactly the range from -2^(N-1) to 2^(N-1)-1).

So yeah, <stdint.h> alias <cstdint> helps... a tiny bit.

From: clipka
Subject: Re: C/C++ Data Type Ambiguity Backwards
Date: 23 Aug 2015 00:26:22
Message: <55d94b6e$1@news.povray.org>

On 22.08.2015 at 20:36, Anthony D. Baye wrote:
> That said: The fixed types with -exactly- 8, 16, 32 or 64 bits appear to be
> optional, so there's no guarantee they'll be defined...
> 
> that would be aggravating.

... ah, you noticed that already.
Did you also note the part where it says that the signed variants are
only present if negatives use two's complement format?

One more thing that bugs me is that standards /still/ don't provide a
straightforward way to detect the byte ordering of the standard integer
data types (unless I missed some recent news).
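
Lacking a standard compile-time facility, byte order is commonly probed at run time. The following is a minimal sketch, assuming C++11 and 8-bit bytes; the enum and function names are hypothetical. It copies the object representation with `std::memcpy`, which is well defined in C++.

```cpp
#include <cassert>
#include <climits>   // CHAR_BIT
#include <cstdint>   // std::uint_least32_t
#include <cstring>   // std::memcpy, std::size_t

static_assert(CHAR_BIT == 8, "this sketch assumes 8-bit bytes");

enum ByteOrder { kLittleEndian, kBigEndian, kOtherByteOrder };

// Hypothetical helper: inspect how a known 32-bit pattern is laid out.
inline ByteOrder detect_byte_order() {
    const std::uint_least32_t probe = 0x01020304u;
    const std::size_t n = sizeof probe;
    unsigned char b[sizeof probe];
    std::memcpy(b, &probe, n);
    // Little endian: least significant byte comes first.
    if (b[0] == 4 && b[1] == 3 && b[2] == 2 && b[3] == 1)
        return kLittleEndian;
    // Big endian: least significant byte comes last.
    if (b[n - 4] == 1 && b[n - 3] == 2 && b[n - 2] == 3 && b[n - 1] == 4)
        return kBigEndian;
    return kOtherByteOrder;  // mixed endian or something more exotic
}
```

For file I/O it is usually simpler to avoid the question altogether and assemble values from individual bytes with shifts, which works the same on any host.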

From: Anthony D Baye
Subject: Re: C/C++ Data Type Ambiguity Backwards
Date: 23 Aug 2015 04:45:00
Message: <web.55d987bc31c53d072aaea5cb0@news.povray.org>

clipka <ano### [at] anonymousorg> wrote:
> On 22.08.2015 at 20:16, Anthony D. Baye wrote:
>
> > I've heard about this (Except for the char thing... that's just weird) but never
> > given it much thought. (Short sighted maybe)
>
> Would that be signed or unsigned short sighted? :-P
>

Definitely unsigned.  My hindsight is perfectly clear.

Regards,
A.D.B.

From: Orchid Win7 v1
Subject: Re: C/C++ Data Type Ambiguity Backwards
Date: 23 Aug 2015 04:54:50
Message: <55d98a5a$1@news.povray.org>

On 21/08/2015 01:12 PM, clipka wrote:
> Okay, I guess everyone who has ever touched C or C++ has at least heard
> rumors of this: The standard data types, such as "int", "short int" or
> "long int", are anything but.

If I'm remembering my history right, C was basically invented to write 
Unix in. From the very beginning, it was a programming language 
*specifically designed* for system programming.

You know, the kind of programming where knowing exactly how many bits 
you're dealing with is 100% critical.

And yet, this is one of the few programming languages on Earth which 
doesn't guarantee how many bits are in a particular data type, and 
provides no way to specify what you actually want.

Does that seem weird to anybody else??

From: clipka
Subject: Re: C/C++ Data Type Ambiguity Backwards
Date: 23 Aug 2015 08:52:59
Message: <55d9c22b@news.povray.org>

On 23.08.2015 at 10:54, Orchid Win7 v1 wrote:
> On 21/08/2015 01:12 PM, clipka wrote:
>> Okay, I guess everyone who has ever touched C or C++ has at least heard
>> rumors of this: The standard data types, such as "int", "short int" or
>> "long int", are anything but.
> 
> If I'm remembering my history right, C was basically invented to write
> Unix in. From the very beginning, it was a programming language
> *specifically designed* for system programming.
> 
> You know, the kind of programming where knowing exactly how many bits
> you're dealing with is 100% critical.

That's only true /some/ of the time; and for those cases, C has bit
fields, which provide far more fine-grained control than any
guaranteed-exact-size type system could ever give you.
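
As an illustration of the bit field syntax, here is a small sketch modelled on the first octet of an IPv4 header (4-bit version, 4-bit header length). The struct and function names are hypothetical. Note that the standard leaves the order and packing of bit fields within their storage unit to the implementation, so such a struct cannot portably be copied straight onto the wire.

```cpp
#include <cassert>

// Hypothetical illustration: two 4-bit fields sharing one storage unit.
struct Ipv4FirstOctet {
    unsigned int version : 4;  // values 0 .. 15
    unsigned int ihl     : 4;  // header length in 32-bit words, 0 .. 15
};

inline Ipv4FirstOctet make_first_octet(unsigned int version, unsigned int ihl) {
    assert(version < 16 && ihl < 16);  // each field has room for 4 bits only
    Ipv4FirstOctet o;
    o.version = version;
    o.ihl = ihl;
    return o;
}
```

A typical IPv4 header without options would be built with `make_first_octet(4, 5)`.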

More to the point, system programming is the kind of programming where
performance is 100% critical /all/ of the time - and where you therefore
want to use the machine's native data types almost everywhere, rather
than some guaranteed-exact-size type system that might impose an
unnecessary overhead on your particular machine.

Also, it was designed back in the times when "portability" wasn't equal
to "interchangeability"; who cared whether your system used the same
inode size as anyone else - you wouldn't physically mount its hard drive
into another machine anyway. You wouldn't even physically mount your
removable storage media on any other machine. You only /had/ that one
machine.

Networking - yeah, that might have been a bit tedious; but back then
that was only an ever so tiny portion (and as mentioned before bit
fields would be your friend there; ever tried to assemble a raw IP frame
in Java?); most data transfer to the outside world would have been to
and from terminals, with links that would use character-based data
transfer, and hardware that would automatically trim your smallest
native data type ("char") to whatever bits per character the serial link
was configured to use - which more often than not would have been 7
rather than 8.

> And yet, this is one of the few programming languages on Earth which
> doesn't guarantee how many bits are in a particular data type, and
> provides no way to specify what you actually want.
> 
> Does that seem weird to anybody else??

No, not really, for the above reasons.

From: Le Forgeron
Subject: Re: C/C++ Data Type Ambiguity Backwards
Date: 23 Aug 2015 12:24:55
Message: <55d9f3d7@news.povray.org>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256

On 23/08/2015 10:54, Orchid Win7 v1 wrote:
> On 21/08/2015 01:12 PM, clipka wrote:
>> Okay, I guess everyone who has ever touched C or C++ has at least
>> heard rumors of this: The standard data types, such as "int",
>> "short int" or "long int", are anything but.
> 
> If I'm remembering my history right, C was basically invented to
> write Unix in. From the very beginning, it was a programming
> language *specifically designed* for system programming.
> 

But it had to deal with at least two machines, PDP-11 with 16 bits
capability (the registers were 16 bits, long integers took 2
registers, for 32 bits), and some fancy machines with 9 bits per byte,
providing 18 bits for native integers.

In a time of such varied hardware, adapting the language
was the PITA.

> You know, the kind of programming where knowing exactly how many
> bits you're dealing with is 100% critical.

The C language provided a common minimal set of guarantees:
 at least 8 bits per char (but you can have more)
 at least 16 bits per short
 int is at least as large as short, and long int at least 32 bits.

And btw, signed integer values may or may not use two's complement
(or ones' complement, or anything else).
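
These minimal guarantees can be spelled out as compile-time checks. The sketch below assumes C++11; the constant name `kTwosComplement` is made up, and the `(-1 & 3)` test is a classic trick: it yields 3 for two's complement, 2 for ones' complement and 1 for sign-and-magnitude.

```cpp
#include <climits>  // CHAR_BIT, SHRT_MAX, INT_MAX, LONG_MAX

// Guaranteed minimums: these hold on every conforming implementation.
static_assert(CHAR_BIT >= 8, "char has at least 8 bits");
static_assert(SHRT_MAX >= 32767, "short has at least 16 bits");
static_assert(INT_MAX >= SHRT_MAX, "int is at least as large as short");
static_assert(LONG_MAX >= 2147483647L, "long has at least 32 bits");

// Not guaranteed by the language rules described here: the representation
// of negative values.
constexpr bool kTwosComplement = ((-1 & 3) == 3);
```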

float and double... another story. (I know of a C compiler on a system
which uses 48 bits for one of them, nothing like the usual 32 and 64
bits you may be used to.) IEEE 754 can be used, or not.
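
Whether a given implementation uses the IEEE 754 formats can at least be queried through `std::numeric_limits`. This is a small sketch assuming C++11; the constant names are hypothetical.

```cpp
#include <limits>  // std::numeric_limits

// is_iec559 is true if the type follows IEC 559, i.e. IEEE 754.
constexpr bool kFloatIsIec559  = std::numeric_limits<float>::is_iec559;
constexpr bool kDoubleIsIec559 = std::numeric_limits<double>::is_iec559;

// Number of significand digits in the type's radix:
// 24 for IEEE 754 binary32, 53 for IEEE 754 binary64.
constexpr int kFloatDigits  = std::numeric_limits<float>::digits;
constexpr int kDoubleDigits = std::numeric_limits<double>::digits;
```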

> 
> And yet, this is one of the few programming languages on Earth
> which doesn't guarantee how many bits are in a particular data
> type, and provides no way to specify what you actually want.

And where you have a hell of a time determining whether you are on a little
endian, big endian, mixed endian... until you smash a union into the scope.

> 
> Does that seem weird to anybody else??

You expect control... the goal was to ease porting of larger code,
including the compiler itself. At that time, porting code cost
so much (in time & money).

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2

iJwEAQEIAAYFAlXZ89YACgkQhKAm8mTpkW0CxQP/W5OTI5c0RDb4TABmOkWmbLp1
Puttcb9mydP3uudq+BEKOeesMkOEO0j/r2gBGNAJHQdDAx1KrO8AYmpB2315TMrp
pajSI316MdCiXL+DtkFyGHLqNXnIR7EWm8j6fXQomD8UKuClo3WFzoK50aYntXC9
e3bQ9x3WZBEp2o4TiH0=
=gdYK
-----END PGP SIGNATURE-----
