POV-Ray : Newsgroups : povray.off-topic : Haskell raving : Re: Haskell raving Server Time
15 Nov 2024 11:16:45 EST (-0500)
  Re: Haskell raving  
From: Invisible
Date: 2 Nov 2007 11:13:46
Message: <472b4cba@news.povray.org>
Warp wrote:

>   Is that really so? Can you refactor an existing solution (eg. in a library)
> so that it will be more efficient (eg. memorywise) without breaking any
> existing programs?

I guess the key to a question like this is to think about what kinds of 
things would possibly *prevent* such a refactor.



I have never written a nontrivial program in C, but in Pascal I 
frequently found that the type system was one such reason. In Pascal 
(and AFAIK in plain C) you *must* expose all implementation details of a 
type in order for anybody else to be able to touch it. This is one of 
the Big Problems that OOP is supposed to fix. (And, IMHO, does fix very 
well indeed.)

Haskell does quite well too in this regard. I can write a module in 
Haskell that exports a type, without telling you a damn thing about what 
the type is. It might just be an alias for some existing type, or it 
could be something really complex. You don't know. You don't care. And I 
can *change* what that type actually is, and you *still* don't need to 
care. All you see is that there exists a type with a specific name, and 
some functions that can do things with it.



We could summarise all that by simply saying that hiding unecessary 
detail is a Good Thing. So perhaps we should look at where details can 
leak out in Haskell...



One place is I/O. Functions that do I/O have to be "marked" as such. 
This is by design; normal functions can be run in any order without 
consequence, whereas I/O operations *usually* need to be run in a very 
precise order. This information is obviously very important, and so is 
exposed at the type level.

What this means is that you can't flip a normal function into an 
I/O-performing function without altering at least the function's type. 
But then, you're turning an "I can run this at any time" function into a 
"it might well matter when I run this" function. Arguably this demands 
that the client code change anyway, in which case the enforced interface 
change is arguably good. Either way, this kind of code change seems 
quite rare in practice. (Certainly rare for performance tuning.)



What other information is leaked? Well, a function's type tells you 
exactly what inputs it depends on. (Unless it does I/O anyway...) So if 
you change the algorithm to take more information into account, the type 
signature must change.

Let us take an example: Suppose you have a function that decodes 
different character sets. You give it a byte, and it translates it to a 
Unicode code-point using some kind of rule.

Now suppose you want to add UTF-8 decoding. Oops! That's a 
variable-length encoding. To decode a byte, you might need to know what 
the previous byte was. But in Haskell, you can't do that.

Now, in C you could just store some status information in a global 
variable somewhere. So in C you can add UTF-8 without having to change 
your public interface. Yay!

Now suppose some client decides to call the function from multiple 
threads processing several input streams. OOPS!! Chaos ensues... The 
*only* way you can fix this is if you alter your public interface. There 
are several possible fixes, but all involve client-visible change.

On the other hand, Haskell would force you to change the public 
interface in the first place. (And again you have several options here.)

To summarise: In both C and Haskell you end up changing the interface if 
you want the thing to work properly. Haskell just forces you to do it 
right at the start, whereas C potentially does strange things until you 
realise what's up. And in both cases, the key to success is to design 
the public interface properly in the first place! (I.e., write a 
function that converts whole strings, not individual octets.)



I said above that leaking unecessary detail is bad. My arguments below 
that could essentially be read as saying "this is leaking detail, but it 
is *necessary* detail, so that's OK". So it's really whether you 
consider these things "necessary" or not.



Another potential problem area is laziness. Being lazy can speed 
programs up no end. It can also slow them down tremendously. 
Deliberately turning off laziness is a fairly common way to improve the 
efficiency of Haskell code. However, this is often an obvservable 
change. Code that works with the lazy version of a library might well 
lock up given a stricter version.

So here is an implementation change that is totally invisible at the 
interface level, yet might cause client code to stop working properly.

(It's actually fairly common to provide libraries with two versions, so 
the client can choose. And because suddenly changing a library's 
implementation can cause problems...)


Post a reply to this message

Copyright 2003-2023 Persistence of Vision Raytracer Pty. Ltd.