Over Christmas, I read a book called "Clean Code". My employer lent it
to me. No, I don't remember who the authors were.
While I don't agree with absolutely everything the book said, it does
seem to speak a great deal of sense. The book says obvious things like
DRY and SRP. It says things like "name stuff sensibly" and "don't write
500-line methods". But it also says more insightful things.
For example, don't mix multiple levels of abstraction in a single block
of code. Now I don't know about you, but I know for a fact that I have
*totally* written code where I'm like "OK, so I process each item, and
oh wait, I need to dump out some HTML here. Oh well, I'll just quickly
hard-code it in here. It's only small anyway..." It turns out, even for
small things, actually separating stuff out into a few tiny methods with
descriptive names *really does* make it that much easier to read the
code again later. Even if it seems pointless to implement a whole
separate method just for a few lines of code.
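As a rough sketch of what that separation might look like in Java (the class name, method names and HTML layout below are invented for illustration and do not come from the book), the top-level method only talks about items, while the small helpers are the only places that know about HTML:

```java
import java.util.List;

// Hypothetical example: every name here is an assumption, not the book's code.
final class ItemListPage {
    // High level: reads like a summary of the steps, with no markup in sight.
    static String render(List<String> itemNames) {
        StringBuilder page = new StringBuilder();
        page.append(openList());
        for (String name : itemNames) {
            page.append(renderItem(name));
        }
        page.append(closeList());
        return page.toString();
    }

    // Low level: the only methods that know what the HTML looks like.
    private static String openList() {
        return "<ul>";
    }

    private static String closeList() {
        return "</ul>";
    }

    private static String renderItem(String name) {
        return "<li>" + escape(name) + "</li>";
    }

    // Minimal escaping so item names cannot break the markup.
    private static String escape(String text) {
        return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    public static void main(String[] args) {
        System.out.println(render(List.of("Widget", "A<B")));
    }
}
```

Each helper is only a line or two, which is exactly the "seems pointless" case, yet `render` can now be read without parsing any string concatenation.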
Possibly the best insight the book has to offer is "if you need to read
the comments to work out what the code does... the code sucks". The
goal, basically, is to write your code in such a way that it's so
utterly *obvious* what it does that there's really no need to write
comments about it. The book goes into quite some detail on circumstances
where it *is* valid to include comments - e.g., to document obscure
quirks of external code you're interfacing to or something. But the main
argument is that you should simplify your code to the point where
comments become superfluous.
Reading through the book, and looking at the examples, and so on, the
thing that impressed me is how widely applicable the advice is. The book
is written in Java, but the advice is (almost) all equally applicable to
any OO language. But actually, most of this stuff would apply just as
well to Pascal or Bash scripts... or Haskell.
There are a few things I don't agree with. The book asserts that a
method should never have a bool argument. Because "if it has a bool
argument, it does two different things, and you should change it into
two separate methods instead". Yeah, and are you going to split every
single client of this method into two methods as well?? Similarly, you
should never pass an enum to a method, because then it does *multiple*
things. Again, this seems silly if all the method does is pass that enum
on to somebody else. And even if it's dealt with locally, I don't see
how this is "fundamentally bad" (although often you can replace a big
switch-block with some nice OO polymorphism instead of using enums).
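To make the enum-versus-polymorphism point concrete, here is a minimal Java sketch. The shapes example and all the names in it are assumptions chosen for illustration, not something taken from the book:

```java
// Hypothetical example: none of these names come from the book.
final class ShapeDemo {
    // Before: one method whose behaviour depends on an enum argument.
    enum Kind { CIRCLE, SQUARE }

    static double areaWithSwitch(Kind kind, double size) {
        switch (kind) {
            case CIRCLE: return Math.PI * size * size;
            case SQUARE: return size * size;
            default: throw new IllegalArgumentException("Unknown kind: " + kind);
        }
    }

    // After: each case becomes its own class and the switch disappears.
    interface Shape {
        double area();
    }

    static final class Circle implements Shape {
        private final double radius;
        Circle(double radius) { this.radius = radius; }
        public double area() { return Math.PI * radius * radius; }
    }

    static final class Square implements Shape {
        private final double side;
        Square(double side) { this.side = side; }
        public double area() { return side * side; }
    }

    public static void main(String[] args) {
        Shape shape = new Square(3.0);
        System.out.println(shape.area());
        System.out.println(areaWithSwitch(Kind.SQUARE, 3.0));
    }
}
```

The polymorphic version pays off when the same switch would otherwise be repeated in several methods. For a method that merely forwards the enum to somebody else, the objection above still stands.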
People often complain about Haskell's heavily abbreviated variable
names. The book says that variable names should be descriptive, but it
also states that the level of description should be proportional to the
size of the variable's scope. So if you have a global variable (i.e.,
you are evil), it should have a *really* damned descriptive name. But if
you have a loop counter that's only in scope for 3 lines of code, it's
fine to use something shorter.
Haskell of course *abounds* with 1-line function definitions. If a
variable is only in scope for one single line of code, how much of a
name do you really need? Similarly, when you have a function like "flip"
that takes a 2-argument function and swaps its two arguments, you could
name those arguments as "x" and "y", or you could call them
"first_argument" and "second_argument". But how is that better? The
shorter names are quicker to read, and it's arguably easier to see
what's going on. And that's a common theme in Haskell: often you write
code that's so utterly abstract that it would be difficult to come up
with meaningfully descriptive names in the first place.
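The same trade-off shows up outside Haskell. As a sketch only, here is roughly what "flip" could look like in Java (the `FlipDemo` class and the `subtract` lambda are made up for illustration). The one-letter names live for a single expression, and longer ones would add nothing:

```java
import java.util.function.BiFunction;

// Hypothetical example: a Java rendering of the idea behind Haskell's flip.
final class FlipDemo {
    // f, x and y are each in scope for one expression only.
    static <A, B, C> BiFunction<B, A, C> flip(BiFunction<A, B, C> f) {
        return (x, y) -> f.apply(y, x);
    }

    public static void main(String[] args) {
        BiFunction<Integer, Integer, Integer> subtract = (a, b) -> a - b;
        // flip(subtract).apply(1, 10) evaluates subtract.apply(10, 1).
        System.out.println(flip(subtract).apply(1, 10));
    }
}
```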
Heavily abbreviating function names, however, is rather less easy to
justify. Was it *really* necessary to shorten "element" to "elem"?
Similarly, the book demands that names *tell you* something about the
items they refer to. The "max" and "maximum" functions are *clearly*
different functions - but the names yield no clue as to *how* they are
different. There _is_ some logic there, once you look it up, but it's
hardly self-describing. ("max" finds the maximum of just two arguments,
"maximum" finds the maximum element of an entire list of items. So one
has a short name, and the other a long name. Logical, but not really
self-describing.)
The same could maybe be levelled at the standard class names. We have "Eq",
"Ord" and "Num" rather than "Equal", "Ordered" and "Number". And
"number" is a crappy name anyway; Num defines +, - and *, and also abs,
signum and conversion from Integer. But "/" is defined in Fractional.
(Not "Frac", thankfully.) Then again, designing a decent number
hierarchy is *highly* non-trivial, so...
The "id" function could *easily* have been named "identity" instead.
That would have been way, way more descriptive. But I think the biscuit
has to go to the horrifyingly misleading "return" function, which is
*nothing like* what any programmer experienced in C, C++, C#, Java,
JavaScript, VB, Bash, Pascal, Smalltalk, Tcl... would
expect. That name has such a ubiquitous pre-existing meaning that to
define it to do something totally different seems like madness to me...
On 11/01/14 18:13, Orchid Win7 v1 wrote:
> Over Christmas, I read a book called "Clean Code". My employer lent it
> to me. No, I don't remember who the authors were.
Robert Martin. I have a copy
John
--
Protect the Earth
It was not given to you by your parents
You hold it in trust for your children
On 11/01/2014 07:09 PM, Doctor John wrote:
> On 11/01/14 18:13, Orchid Win7 v1 wrote:
>> Over Christmas, I read a book called "Clean Code". My employer lent it
>> to me. No, I don't remember who the authors were.
>
> Robert Martin. I have a copy
Good work.
Do you agree that it is righteous?
Orchid Win7 v1 <voi### [at] devnull> wrote:
> Possibly the best insight the book has to offer is "if you need to read
> the comments to work out what the code does... the code sucks".
There are many situations where comments are extremely helpful, not only
for others, but for the programmer himself.
For example, the implementation of a complex algorithm is often almost
indecipherable without knowing the algorithm in question, and how it
has been implemented in that particular case. Trying to understand a
complex algorithm by reading (uncommented) code only can be really
laborious and difficult.
Describing the algorithm, however, can make it a lot easier to understand
what's going on and save a lot of work.
--
- Warp
On 11/01/2014 08:13 PM, Warp wrote:
> Orchid Win7 v1<voi### [at] devnull> wrote:
>> Possibly the best insight the book has to offer is "if you need to read
>> the comments to work out what the code does... the code sucks".
>
> There are many situations where comments are extremely helpful, not only
> for others, but for the programmer himself.
>
> For example, the implementation of a complex algorithm is often almost
> indecipherable without knowing the algorithm in question, and how it
> has been implemented in that particular case. Trying to understand a
> complex algorithm by reading (uncommented) code only can be really
> laborious and difficult.
>
> Describing the algorithm, however, can make it a lot easier to understand
> what's going on and save a lot of work.
Indeed. If you're trying to implement the Bellman-Ford algorithm or
something, some comments are probably merited. And the book says that
non-obvious design choices are one of the few valid reasons to write
comments.
I guess the vast majority of Java code (and probably C# and similar
languages) is just yet-another-order-processing-application or similar.
Hell, I work in data analysis, and > 80% of the codebase is just user
management, loading and saving configuration data, and other such
chores. If you're writing mile after mile of that, who needs comments?
If you label everything clearly enough, it'll probably be fine.
(One could argue that more complex algorithms can be made readable by
suitably suggestive labelling... but at some point that stops working,
IMHO.)
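As an illustration of where comments earn their keep, here is a sketch of Bellman-Ford in Java. The `Edge` class, the `UNREACHABLE` sentinel and the method names are assumptions made up for this example. The names say *what* each step does, and the comments are reserved for *why* it works:

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: names and data layout are illustrative assumptions.
final class BellmanFordSketch {
    static final class Edge {
        final int from;
        final int to;
        final long weight;
        Edge(int from, int to, long weight) {
            this.from = from;
            this.to = to;
            this.weight = weight;
        }
    }

    static final long UNREACHABLE = Long.MAX_VALUE;

    static long[] shortestDistances(int vertexCount, List<Edge> edges, int source) {
        long[] distance = new long[vertexCount];
        Arrays.fill(distance, UNREACHABLE);
        distance[source] = 0;

        // Why vertexCount - 1 passes: a shortest path without cycles has at
        // most vertexCount - 1 edges, and after pass k every shortest path
        // of at most k edges is final.
        for (int pass = 1; pass < vertexCount; pass++) {
            relaxAllEdges(distance, edges);
        }

        // Why one extra pass: if anything still improves, the improvement can
        // only come from a negative-weight cycle reachable from the source.
        if (relaxAllEdges(distance, edges)) {
            throw new IllegalArgumentException("Graph contains a negative-weight cycle");
        }
        return distance;
    }

    // Returns true if any distance was improved.
    private static boolean relaxAllEdges(long[] distance, List<Edge> edges) {
        boolean improved = false;
        for (Edge edge : edges) {
            // Skip unreachable vertices so UNREACHABLE + weight cannot overflow.
            if (distance[edge.from] == UNREACHABLE) {
                continue;
            }
            long candidate = distance[edge.from] + edge.weight;
            if (candidate < distance[edge.to]) {
                distance[edge.to] = candidate;
                improved = true;
            }
        }
        return improved;
    }
}
```

No amount of renaming would tell a reader why the loop runs exactly vertexCount - 1 times, which is the sort of thing a comment is still good for.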
On 11/01/2014 19:51, Orchid Win7 v1 wrote:
> On 11/01/2014 07:09 PM, Doctor John wrote:
>> On 11/01/14 18:13, Orchid Win7 v1 wrote:
>>> Over Christmas, I read a book called "Clean Code". My employer lent it
>>> to me. No, I don't remember who the authors were.
>>
>> Robert Martin. I have a copy
>
> Good work.
>
> Do you agree that it is righteous?
It depends what you mean by righteous (the 60s/70s meaning may not
accord with today's meaning). However, it makes a lot of sense. I see it
more as a plea for non-obfuscated code: the programming equivalent of
the Campaign for Plain English.
John
On Sat, 11 Jan 2014 18:13:44 +0000, Orchid Win7 v1 wrote:
> Possibly the best insight the book has to offer is "if you need to read
> the comments to work out what the code does... the code sucks".
That idea doesn't take into account others who may have to read and fix/
modify the code and their skills with the language.
Self-commenting code, of course, is a good goal - but writing code that
is readable to you doesn't mean it's readable to everyone.
Comments help bridge that gap, IMHO.
Jim
--
"I learned long ago, never to wrestle with a pig. You get dirty, and
besides, the pig likes it." - George Bernard Shaw
On 12/01/2014 06:10 AM, Jim Henderson wrote:
> On Sat, 11 Jan 2014 18:13:44 +0000, Orchid Win7 v1 wrote:
>
>> Possibly the best insight the book has to offer is "if you need to read
>> the comments to work out what the code does... the code sucks".
>
> That idea doesn't take into account others who may have to read and fix/
> modify the code and their skills with the language.
>
> Self-commenting code, of course, is a good goal - but writing code that
> is readable to you doesn't mean it's readable to everyone.
>
> Comments help bridge that gap, IMHO.
Consider the following code:
  public SortedList<int> Primes(int max)
  {
      SortedList<int> x = new SortedList<int>();
      SortedList<int> y = new SortedList<int>();
      for (int n=2; n<max; n++) x.Add(n);
      while (x.Size() > 0)
      {
          int p = x[0];
          y.Add(p);
          for (int k=1; k*p < max; k++) x.Remove(k*p);
      }
      return y;
  }
Now it doesn't take a genius to work out what this does, but a few
comments would certainly help make it a bit clearer. There's a couple of
bits that aren't especially obvious, and by throwing in a comment or
two, you could clear things up.
But now suppose we refactor it to look like this:
  public SortedList<int> FindPrimesBelow(int max)
  {
      SortedList<int> candidates = InitialiseCandidates(max);
      SortedList<int> primes = new SortedList<int>();
      while (candidates.Size() > 0)
      {
          int prime = GetNextPrime(candidates);
          primes.Add(prime);
          RemovePrimeAndItsMultiples(prime, candidates, max);
      }
      return primes;
  }

  private SortedList<int> InitialiseCandidates(int max)
  {
      SortedList<int> candidates = new SortedList<int>();
      for (int n=2; n<max; n++)
      {
          candidates.Add(n);
      }
      return candidates;
  }

  private int GetNextPrime(SortedList<int> candidates)
  {
      return candidates[0];
  }

  private void RemovePrimeAndItsMultiples(int prime, SortedList<int> candidates, int max)
  {
      for (int k=1; k*prime < max; k++)
      {
          candidates.Remove(k*prime);
      }
  }
This now makes it pretty much drop-dead obvious what the hell the
algorithm does; it builds a list of the numbers from 2 up to (but not
including) max, and loops over them. At each step, it finds the next
prime, adds it to the list of
primes, and then removes it and all its multiples from the list of
candidates. When the list of candidates becomes empty, it returns the
list of primes. Simples.
There are few places here where a comment would add much of value.
Notice how by replacing candidates[0] with GetNextPrime(), we've made it
obvious what this particular bit of code does, without writing a
comment. It may not be obvious *why* this works, but it is now clear
*what* it does.
This seems to be the main argument of the book. Anything non-obvious,
either put it into a subroutine with a descriptive name, or make it into
a variable whose name says what it's for. Basically, anything that isn't
clear, slap a name on it to clear things up.
Notice RemovePrimeAndItsMultiples(). That's a rather verbose name. Lots
of people would just name this "ProcessPrime()" or something. But that
doesn't tell you what this "processing" is without looking inside the
method body. But by calling it RemovePrimeAndItsMultiples(), you can
tell *exactly* what the method does without ever needing to read its
contents. You may not know *how* it does its job, but you can tell what
its job is.
In particular, the name makes it clear that the method removes multiples
of the prime AND THE PRIME ITSELF. That small but crucial detail wasn't
especially obvious in the original. Sure, you can work it out easily
enough, but it's such an important fact that it merits being called out.
When I first started work at my new job, I was surprised at how long
some of the variable names are. But now I understand what they're trying
(not necessarily successfully) to do: to make it obvious to anyone
reading WTF is going on. Now if only we could do that with our class
names too.
[I'm still bitter that we have two classes both called ItemData, and one
is a field of the other!! Because *that* couldn't possibly lead to
confusion at all...]
I don't have the book in front of me now, but prime number sieving is
actually one of the examples. They even go so far as to remove the "stop
searching after sqrt(max)" optimisation to "make the code simpler".
Because most of the time, being able to understand the code is far more
important than getting maximum performance out of it. Like I said, most
of the code that *I* write all day is just CRUD. It's all GUI stuff that
only needs to be as fast as the human sitting at the keyboard.
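For reference, the optimisation in question looks something like the following Java sketch. It is a conventional boolean-array Sieve of Eratosthenes rather than the list-based version earlier in this post, and the names are assumptions, not the book's:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the "stop after sqrt(max)" optimisation.
final class SieveSketch {
    static List<Integer> findPrimesBelow(int max) {
        boolean[] isComposite = new boolean[Math.max(max, 2)];

        // Any composite n < max has a prime factor p with p * p <= n < max,
        // so crossing off multiples can stop once p * p reaches max.
        for (int p = 2; (long) p * p < max; p++) {
            if (isComposite[p]) {
                continue;
            }
            // Smaller multiples of p were already crossed off by smaller primes.
            for (long multiple = (long) p * p; multiple < max; multiple += p) {
                isComposite[(int) multiple] = true;
            }
        }

        List<Integer> primes = new ArrayList<>();
        for (int n = 2; n < max; n++) {
            if (!isComposite[n]) {
                primes.add(n);
            }
        }
        return primes;
    }

    public static void main(String[] args) {
        System.out.println(findPrimesBelow(30));
    }
}
```

The cutoff needs a comment to justify it, which is presumably part of why dropping it makes the code simpler to follow.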
On Sun, 12 Jan 2014 10:38:16 +0000, Orchid Win7 v1 wrote:
> This seems to be the main argument of the book. Anything non-obvious,
> either put it into a subroutine with a descriptive name, or make it into
> a variable whose name says what it's for. Basically, anything that isn't
> clear, slap a name on it to clear things up.
That doesn't address the situation that it may be obvious to you, but not
to someone else who's reading the code.
In order to be effective at this, you have to put yourself in the shoes
of someone who isn't familiar with what you're doing. I've worked with a
fair number of developers, and they don't always have that ability.
Jim
--
"I learned long ago, never to wrestle with a pig. You get dirty, and
besides, the pig likes it." - George Bernard Shaw
On 11-1-2014 21:22, Orchid Win7 v1 wrote:
> On 11/01/2014 08:13 PM, Warp wrote:
>> Orchid Win7 v1<voi### [at] devnull> wrote:
>>> Possibly the best insight the book has to offer is "if you need to read
>>> the comments to work out what the code does... the code sucks".
>>
>> There are many situations where comments are extremely helpful, not only
>> for others, but for the programmer himself.
>>
>> For example, the implementation of a complex algorithm is often almost
>> indecipherable without knowing the algorithm in question, and how it
>> has been implemented in that particular case. Trying to understand a
>> complex algorithm by reading (uncommented) code only can be really
>> laborious and difficult.
>>
>> Describing the algorithm, however, can make it a lot easier to understand
>> what's going on and save a lot of work.
>
> Indeed. If you're trying to implement the Bellman-Ford algorithm or
> something, some comments are probably merited. And the book says that
> non-obvious design choices are one of the few valid reasons to write
> comments.
>
> I guess the vast majority of Java code (and probably C# and similar
> languages) is just yet-another-order-processing-application or similar.
> Hell, I work in data analysis, and > 80% of the codebase is just user
> management, loading and saving configuration data, and other such
> chores. If you're writing mile after mile of that, who needs comments?
> If you label everything clearly enough, it'll probably be fine.
>
> (One could argue that more complex algorithms can be made readable by
> suitably suggestive labelling... but at some point that stops working,
> IMHO.)
I am still using Knuth's literate programming style to document the
algorithm if it is non-trivial. That way I can have pictures and
formulas and all other explanatory devices for when comments are really
needed.
--
Everytime the IT department forbids something that a researcher deems
necessary for her work there will be another hole in the firewall.