Over Christmas, I read a book called "Clean Code". My employer lent it
to me. No, I don't remember who the authors were.
While I don't agree with absolutely everything the book said, it does
seem to speak a great deal of sense. The book says obvious things like
DRY (don't repeat yourself) and SRP (the single-responsibility
principle). It says things like "name stuff sensibly" and "don't write
500-line methods". But it also says more insightful things.
For example, don't mix multiple levels of abstraction in a single block
of code. Now I don't know about you, but I know for a fact that I have
*totally* written code where I'm like "OK, so I process each item, and
oh wait, I need to dump out some HTML here. Oh well, I'll just quickly
hard-code it in here. It's only small anyway..." It turns out, even for
small things, actually separating stuff out into a few tiny methods with
descriptive names *really does* make it that much easier to read the
code again later. Even if it seems pointless to implement a whole
separate method just for a few lines of code.
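To make that concrete, here's a tiny sketch of my "dump out some HTML"
example in Haskell. (All the names here are mine, invented purely for
illustration.)

```haskell
-- Before: iteration logic and hard-coded HTML mixed in one definition.
reportBefore :: [String] -> String
reportBefore items =
  "<ul>" ++ concatMap (\item -> "<li>" ++ item ++ "</li>") items ++ "</ul>"

-- After: each level of abstraction gets its own tiny, named definition.
report :: [String] -> String
report = unorderedList . map listItem

listItem :: String -> String
listItem item = "<li>" ++ item ++ "</li>"

unorderedList :: [String] -> String
unorderedList rows = "<ul>" ++ concat rows ++ "</ul>"
```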
Possibly the best insight the book has to offer is "if you need to read
the comments to work out what the code does... the code sucks". The
goal, basically, is to write your code in such a way that it's so
utterly *obvious* what it does that there's really no need to write
comments about it. The book goes into quite some detail on circumstances
where it *is* valid to include comments - e.g., to document obscure
quirks of external code you're interfacing to or something. But the main
argument is that you should simplify your code to the point where
comments become superfluous.
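Here's a trivial example of my own (not from the book) of what that
looks like:

```haskell
-- This version needs its comment to be understood:
f :: [Int] -> Int
f xs = length (filter (\x -> x `mod` 2 == 0) xs)  -- count the even ones

-- This version doesn't:
countEvens :: [Int] -> Int
countEvens = length . filter even
```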
Reading through the book, and looking at the examples, and so on, the
thing that impressed me is how widely applicable the advice is. The
book's examples are all in Java, but the advice is (almost) all equally
applicable to
any OO language. But actually, most of this stuff would apply just as
well to Pascal or Bash scripts... or Haskell.
There are a few things I don't agree with. The book asserts that a
method should never have a bool argument, because "if it has a bool
argument, it does two different things, and you should change it into
two separate methods instead". Yeah, and are you going to split every
single client of this method in two as well? Similarly, you
should never pass an enum to a method, because then it does *multiple*
things. Again, this seems silly if all the method does is pass that enum
on to somebody else. And even if it's dealt with locally, I don't see
how this is "fundamentally bad" (although often you can replace a big
switch-block with some nice OO polymorphism instead of using enums).
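For what it's worth, the Haskell analogue of that "switch-block versus
polymorphism" trade-off looks something like this. (Types and names
invented for illustration; a type class plays the role of OO
polymorphism here.)

```haskell
-- "Enum plus switch-block" style: one function, one big case-analysis.
data Shape = Circle Double | Square Double

area :: Shape -> Double
area (Circle radius) = pi * radius * radius
area (Square side)   = side * side

-- "Polymorphism" style: one type per case, dispatch through a class.
class HasArea a where
  areaOf :: a -> Double

newtype CircleT = CircleT Double
newtype SquareT = SquareT Double

instance HasArea CircleT where
  areaOf (CircleT radius) = pi * radius * radius

instance HasArea SquareT where
  areaOf (SquareT side) = side * side
```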
People often complain about Haskell's heavily abbreviated variable
names. The book says that variable names should be descriptive, but it
also states that how descriptive a name needs to be should be
proportional to the size of the variable's scope. So if you have a
global variable (i.e., you are evil), it should have a *really* damned
descriptive name. But if you have a loop counter that's only in scope
for 3 lines of code, it's fine to use something shorter.
Haskell of course *abounds* with 1-line function definitions. If a
variable is only in scope for one single line of code, how much of a
name do you really need? Similarly, when you have a function like "flip"
that takes a 2-argument function and swaps its two arguments, you could
name those arguments as "x" and "y", or you could call them
"first_argument" and "second_argument". But how is that better? The
shorter names are quicker to read, and it's arguably easier to see
what's going on. And that's a common theme in Haskell: often you write
code that's so utterly abstract that it would be difficult to come up
with meaningfully descriptive names in the first place.
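The Prelude's "flip" really is just one line; the "descriptive" version
below it is my own strawman, and it buys you nothing:

```haskell
import Prelude hiding (flip)

-- The Prelude's definition, more or less:
flip :: (a -> b -> c) -> b -> a -> c
flip f x y = f y x

-- The "descriptive" alternative:
flip' :: (a -> b -> c) -> b -> a -> c
flip' function firstArgument secondArgument =
  function secondArgument firstArgument
```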
Heavily abbreviating function names, however, is rather less easy to
justify. Was it *really* necessary to shorten "element" to "elem"?
Similarly, the book demands that names *tell you* something about the
items they refer to. The "max" and "maximum" functions are *clearly*
different functions - but the names yield no clue as to *how* they are
different. There *is* some logic there, once you look it up, but it's
hardly self-describing. ("max" finds the maximum of just two arguments;
"maximum" finds the maximum element of an entire list of items. So one
has a short name, and the other a long name. Logical, but not really
self-describing.)
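The type signatures spell out the difference that the names don't
(these are the standard Prelude signatures, ignoring the later Foldable
generalisation of "maximum"):

```haskell
max     :: Ord a => a -> a -> a   -- the larger of two values
maximum :: Ord a => [a] -> a      -- the largest element of a list
```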
The same criticism could maybe be levelled at the standard class names. We have "Eq",
"Ord" and "Num" rather than "Equal", "Ordered" and "Number". And
"number" is a crappy name anyway; Num defines +, - and *, and also abs,
signum and conversion from Integer. But "/" is defined in Fractional.
(Not "Frac", thankfully.) Then again, designing a decent number
hierarchy is *highly* non-trivial, so...
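For reference, the two classes look like this in the Prelude (give or
take default-method details):

```haskell
class Num a where
  (+), (-), (*) :: a -> a -> a
  negate        :: a -> a
  abs, signum   :: a -> a
  fromInteger   :: Integer -> a

class Num a => Fractional a where
  (/)          :: a -> a -> a
  recip        :: a -> a
  fromRational :: Rational -> a
```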
The "id" function could *easily* have been named "identity" instead.
That would have been way, way more descriptive. But I think the biscuit
has to go to the horrifyingly misleading "return" function, which is
*nothing like* what any programmer experienced in C, C++, C#, Java,
JavaScript, VB, Bash, Pascal, Smalltalk, Tcl... would
expect. That name has such a ubiquitous pre-existing meaning that to
define it to do something totally different seems like madness to me...
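Just to spell out what it actually does: "return" involves no control
flow at all; it merely wraps a value up in a monad.

```haskell
-- return :: Monad m => a -> m a
--
-- There's no early exit and no transfer of control; it just wraps:
wrapped :: Maybe Int
wrapped = return 5    -- Just 5

wrappedIO :: IO Int
wrappedIO = return 5  -- an action that does nothing except yield 5
```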