|
 |
On 22/10/2010 05:33 PM, Warp wrote:
> I'm not so sure I agree with the necessity of 'union'.
>
> 'union' in C is basically a construct which exists to save some bytes
> of memory in some rare circumstances. The price you pay for it is reduced
> type safety: You always have to make sure you are using the object with
> the correct type, or else you'll get garbage. More often than not, unions
> are not used raw, but embedded in structs, where one of the struct members
> tells the type of the union. While in some situations you can use this to
> save memory (especially if you need to instantiate this struct millions of
> times), it introduces the overhead of the type member and having to always
> check it. (Without the union there would be no need to write any conditionals
> based on the type.)
>
> You could have the language always automatically check the proper type
> of the union, but then you'll be having overhead on every single operation
> you are doing with that union (overhead which could in many cases be avoided).
> It would be rare for the memory saving advantage to be worth the overhead.
You have just described Algebraic Data Types.
An ADT is basically like a (possibly singleton) union of (possibly
nullary) structs together with an enum that tells you which struct is
present at run-time.
If you just want a normal struct, you make an ADT with only one
constructor. If you just want an enum, you make an ADT with several
constructors each with zero fields. But an ADT generalises both, as you
can see.
Having to check which struct is present every single time you want to
access its fields _would_ be very annoying, except that this is where
Haskell's "pattern matching" comes in. With it, you can check which
struct is there and copy the field values into variables all in a single
operation. (It's a single primitive operation as far as Haskell is
concerned, although obviously the generated machine code is potentially
many instructions long.)
(I suppose it goes without saying that how all of this is actually
implemented isn't specified. But GHC does things like optimising away
constructor checks for singleton constructors, and making zero-field
constructors into static global constants, etc.)
> As for removing null pointers, that would present problems of its own.
> Null pointers in languages like C, C++ and Java are often used for useful
> things. For example if you have a linked list, a null 'next' pointer is
> a good indication of the element being the last one in the list. Without
> the possibility of having null pointers you would have to device some other
> kind of solution.
Again, Haskell does not allow null pointers. (Then again, Haskell really
deals with "references" rather than "pointers" as such.) This neatly
eliminates an entire class of bugs.
"But null pointers can be useful!" I hear you cry. For this, Haskell has
the "Maybe" type. Basically, if you say "Foo", it will definitely,
always, be a Foo. But if you say "Maybe Foo", then it may or may not be
a Foo - and the type system *requires* you to explicitly check whether
it's "null" or not before you do stuff to it.
In other words, pointers which can be null are distinguished from
pointers that cannot be null (in the type system). And you have to check
at run-time. (If you don't, your program just won't compile.)
Of course, Maybe is just a plain vanilla ADT, defined in plain vanilla
Haskell. It just happens to be part of the standard library.
I gather Eiffel did this, actually. But they retro-fitted it to the
language. So rather than saying "these [and only these] things might be
null", you get to say "these [and only these] things definitely will NOT
be null". Which is kinda backwards [compatible].
One thing null pointers are often used for are error conditions. Like,
you have a parser function that returns either a pointer to a parse
tree, or a null pointer if there was a parse error. In Haskell, another
ADT comes into play: it's called "Either". Instead of saying "this
returns a SyntaxTree", you say "this returns an Either ParseError
SyntaxTree". Meaning that it either returns a ParseError, or a
SyntaxTree. (And you have to explicitly check which one before you do
stuff to it.) Same deal as above, but you can return a _description_ of
the failure rather than just fail.
Apparently there are two kinds of programmers: The ones who look at all
of the above and go "wow, that's a whole lot of unecessary run-time
overhead!", and the ones who go "wow, that's really going to reduce the
amount of time I waste trying to debug my complex application"...
--
http://blog.orphi.me.uk/
http://www.zazzle.com/MathematicalOrchid*
Post a reply to this message
|
 |