Warp wrote:
> Is that really so? Can you refactor an existing solution (eg. in a library)
> so that it will be more efficient (eg. memorywise) without breaking any
> existing programs?
I guess the key to a question like this is to think about what kinds of
things would possibly *prevent* such a refactor.
I have never written a nontrivial program in C, but in Pascal I
frequently found that the type system was one such reason. In Pascal
(and AFAIK in plain C) you *must* expose all implementation details of a
type in order for anybody else to be able to touch it. This is one of
the Big Problems that OOP is supposed to fix. (And, IMHO, does fix very
well indeed.)
Haskell does quite well too in this regard. I can write a module in
Haskell that exports a type, without telling you a damn thing about what
the type is. It might just be an alias for some existing type, or it
could be something really complex. You don't know. You don't care. And I
can *change* what that type actually is, and you *still* don't need to
care. All you see is that there exists a type with a specific name, and
some functions that can do things with it.
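A minimal sketch of what that looks like (all names invented for illustration): the export list names the type `Counter` but not its constructor, so clients can use the operations without ever seeing the representation:

```haskell
-- Sketch of an abstract type; the names here are made up.
-- The export list mentions the type Counter but NOT its constructor,
-- so client code can use the operations below yet cannot inspect
-- (or depend on) the representation.
module Counter (Counter, empty, bump, count, main) where

-- Hidden representation: this could later become a record, a Word,
-- anything at all, without any client code changing.
newtype Counter = Counter Int

empty :: Counter
empty = Counter 0

bump :: Counter -> Counter
bump (Counter n) = Counter (n + 1)

count :: Counter -> Int
count (Counter n) = n

-- Tiny demonstration (main exported only so the sketch runs standalone).
main :: IO ()
main = print (count (bump (bump empty)))
```

Swap the `newtype` for something else tomorrow and nothing outside the module notices; that is the whole point.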
We could summarise all that by simply saying that hiding unecessary
detail is a Good Thing. So perhaps we should look at where details can
leak out in Haskell...
One place is I/O. Functions that do I/O have to be "marked" as such.
This is by design; normal functions can be run in any order without
consequence, whereas I/O operations *usually* need to be run in a very
precise order. This information is obviously very important, and so is
exposed at the type level.
What this means is that you can't flip a normal function into an
I/O-performing function without altering at least the function's type.
But then, you're turning an "I can run this at any time" function into a
"it might well matter when I run this" function. Arguably this demands
that the client code change anyway, in which case the enforced interface
change is a good thing. Either way, this kind of code change seems
quite rare in practice. (Certainly rare for performance tuning.)
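To make that "marking" concrete, here is a minimal sketch (names invented):

```haskell
-- A pure function: the type promises "no side effects", so a caller
-- may run it at any time, in any order, any number of times.
double :: Int -> Int
double n = n * 2

-- An effectful variant: the IO in the type is the "mark". Turning
-- double into something that, say, reads its input from a file
-- necessarily changes the type, and therefore the interface.
doubleFromFile :: FilePath -> IO Int
doubleFromFile path = do
  contents <- readFile path
  pure (double (read contents))

main :: IO ()
main = print (double 21)
```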
What other information is leaked? Well, a function's type tells you
exactly what inputs it depends on. (Unless it does I/O anyway...) So if
you change the algorithm to take more information into account, the type
signature must change.
Let us take an example: Suppose you have a function that decodes
different character sets. You give it a byte, and it translates it to a
Unicode code-point using some kind of rule.
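A sketch of that interface (Latin-1 chosen only because its bytes map directly onto the first 256 code points):

```haskell
import Data.Char (chr)
import Data.Word (Word8)

-- One byte in, one code point out. The type says the answer depends
-- on that single byte and nothing else.
decodeLatin1 :: Word8 -> Char
decodeLatin1 = chr . fromIntegral

main :: IO ()
main = print (decodeLatin1 0xE9)  -- U+00E9
```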
Now suppose you want to add UTF-8 decoding. Oops! That's a
variable-length encoding. To decode a byte, you might need to know what
the previous byte was. But in Haskell, a pure byte-to-code-point
function has no way to remember that.
Now, in C you could just store some status information in a global
variable somewhere. So in C you can add UTF-8 without having to change
your public interface. Yay!
Now suppose some client decides to call the function from multiple
threads processing several input streams. OOPS!! Chaos ensues... The
*only* way you can fix this is if you alter your public interface. There
are several possible fixes, but all involve client-visible change.
On the other hand, Haskell would force you to change the public
interface in the first place. (And again you have several options here.)
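One of those options, sketched very roughly (error handling for malformed input omitted): thread the decoder state explicitly, so every caller, and hence every thread, owns its own state:

```haskell
import Data.Bits (shiftL, (.&.), (.|.))
import Data.Char (chr)
import Data.Word (Word8)

-- Decoder state: between characters, or partway through a multi-byte
-- sequence (continuation bytes still expected, code point so far).
data Utf8State = Ready | Partial Int Int

-- Feed one byte; get the new state, plus a character if one completed.
-- Malformed-input handling is omitted to keep the sketch short.
feedByte :: Utf8State -> Word8 -> (Utf8State, Maybe Char)
feedByte Ready b
  | b < 0x80  = (Ready, Just (chr (fromIntegral b)))           -- ASCII
  | b < 0xE0  = (Partial 1 (fromIntegral b .&. 0x1F), Nothing) -- 2-byte lead
  | b < 0xF0  = (Partial 2 (fromIntegral b .&. 0x0F), Nothing) -- 3-byte lead
  | otherwise = (Partial 3 (fromIntegral b .&. 0x07), Nothing) -- 4-byte lead
feedByte (Partial n acc) b
  | n == 1    = (Ready, Just (chr acc'))
  | otherwise = (Partial (n - 1) acc', Nothing)
  where acc' = (acc `shiftL` 6) .|. (fromIntegral b .&. 0x3F)

main :: IO ()
main =
  let (s1, r1) = feedByte Ready 0xC3   -- first byte of U+00E9
      (_,  r2) = feedByte s1 0xA9      -- second byte completes it
  in print (r1, r2)
```

Because the state is an ordinary value, two threads decoding two streams simply hold two states; the C global-variable bug cannot arise.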
To summarise: In both C and Haskell you end up changing the interface if
you want the thing to work properly. Haskell just forces you to do it
right at the start, whereas C potentially does strange things until you
realise what's up. And in both cases, the key to success is to design
the public interface properly in the first place! (I.e., write a
function that converts whole strings, not individual octets.)
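A sketch of that whole-string shape (hypothetical name, and a placeholder single-byte rule standing in for a real decoder):

```haskell
import Data.Char (chr)
import Data.Word (Word8)

-- Whole input, whole output. The implementation is free to carry
-- whatever decoder state it likes while walking the list; none of
-- it is visible to the caller. (The rule here is a stand-in; a real
-- UTF-8 decoder would thread its multi-byte state internally.)
decodeBytes :: [Word8] -> String
decodeBytes = map (chr . fromIntegral)

main :: IO ()
main = putStrLn (decodeBytes [72, 101, 108, 108, 111])
```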
I said above that leaking unnecessary detail is bad. My arguments above
could essentially be read as saying "this is leaking detail, but it
is *necessary* detail, so that's OK". So it's really a question of
whether you consider these things "necessary" or not.
Another potential problem area is laziness. Being lazy can speed
programs up no end. It can also slow them down tremendously.
Deliberately turning off laziness is a fairly common way to improve the
efficiency of Haskell code. However, this is often an observable
change. Code that works with the lazy version of a library might well
lock up given a stricter version.
So here is an implementation change that is totally invisible at the
interface level, yet might cause client code to stop working properly.
(It's actually fairly common to provide libraries in two versions, so
the client can choose, precisely because suddenly changing a library's
strictness can cause exactly these problems.)
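A classic illustration of the hazard (a deliberately silly "optimisation"):

```haskell
-- Lazy version: produces output on demand, so infinite input is fine.
lazyDoubles :: [Int] -> [Int]
lazyDoubles = map (* 2)

-- A stricter drop-in replacement: same type, same interface, but it
-- forces the whole input before producing anything. Client code that
-- relied on laziness (an infinite list, say) now never returns.
strictDoubles :: [Int] -> [Int]
strictDoubles xs = length xs `seq` map (* 2) xs

main :: IO ()
main = do
  print (take 3 (lazyDoubles [1 ..]))
  -- print (take 3 (strictDoubles [1 ..]))  -- would loop forever
```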