|
![](/i/fill.gif) |
On 12/01/2014 06:10 AM, Jim Henderson wrote:
> On Sat, 11 Jan 2014 18:13:44 +0000, Orchid Win7 v1 wrote:
>
>> Possibly the best insight the book has to offer is "if you need to read
>> the comments to work out what the code does... the code sucks".
>
> That idea doesn't take into account others who may have to read and fix/
> modify the code and their skills with the language.
>
> Self-commenting code, of course, is a good goal - but writing code that
> is readable to you doesn't mean it's readable to everyone.
>
> Comments help bridge that gap, IMHO.
Consider the following code:
public SortedList<int> Primes(int max)
{
SortedList<int> x = new SortedList<int>();
SortedList<int> y = new SortedList<int>();
for (int n=2; n<max; n++) x.Add(n);
while (x.Size() > 0)
{
int p = x[0];
y.Add(p);
for (int k=1; k*p < max; k++) x.Remove(k*p);
}
return y;
}
Now it doesn't take a genius to work out what this does, but a few
comments would certainly help make it a bit clearer. There's a couple of
bits that aren't especially obvious, and by throwing in a comment or
two, you could clear things up.
But now suppose we refactor it to look like this:
public SortedList<int> FindPrimesBelow(int max)
{
SortedList<int> candidates = InitialiseCandidates(max);
SortedList<int> primes = new SortedList<int>();
while (candidates.Size() > 0)
{
int prime = GetNextPrime(candidates);
primes.Add(prime);
RemovePrimeAndItsMultiples(prime, candidates, max);
}
return primes;
}
private SortedList<int> InitialiseCandidates(int max)
{
SortedList<int> candidates = new SortedList<int>();
for (int n=2; n<max; n++)
{
candidates.Add(n);
}
return candidates;
}
private int GetNextPrime(SortedList<int> candidates)
{
return candidates[0];
}
private int RemovePrimeAndItsMultiples(int prime, SortedList<int>
candidates, int max)
{
for (int k=1; k*prime < max; k++)
{
candidates.Remove(k*prime);
}
}
This now makes it pretty much drop-dead obvious what the hell the
algorithm does; it builds a list of the numbers from N to Max, and loops
over them. At each step, it finds the next prime, adds it to the list of
primes, and then removes it and all its multiples from the list of
candidates. When the list of candidates becomes empty, it returns the
list of primes. Simples.
There are few places here where a comment would add much of value.
Notice how by replacing candidates[0] with GetNextPrime(), we've made it
obvious what this particular bit of code does, without writing a
comment. It may not be obvious *why* this works, but it is now clear
*what* it does.
This seems to be the main argument of the book. Anything non-obvious,
either put it into a subroutine with a descriptive name, or make it into
a variable who's name says what it's for. Basically, anything that isn't
clear, slap a name on it to clear things up.
Notice RemovePrimeAndItsMultiples(). That's a rather verbose name. Lots
of people would just name this "ProcessPrime()" or something. But that
doesn't tell you what this "processing" is without looking inside the
method body. But by calling it RemovePrimeAndItsMultiples(), you can
tell *exactly* what the method does without ever needing to read its
contents. You may not know *how* it does its job, but you can tell what
its job is.
In particular, the name makes it clear that the method removes multiples
of the prime AND THE PRIME ITSELF. That small but crucial detail wasn't
especially obvious in the original. Sure, you can work it out easily
enough, but it's such an important fact that it merits being called out.
When I first started work at my new job, I was surprised at how long
some of the variable names are. But now I understand what they're trying
(not necessarily successfully) to do: to make it obvious to anyone
reading WTF is going on. Now if only we could do that with our class
names too.
[I'm still bitter that we have two classes both called ItemData, and one
is a field of the other!! Because *that* couldn't possibly lead to
confusion at all...]
I don't have the book in front of me now, but prime number sieving is
actually one of the examples. They even go so far as to remove the "stop
searching after sqrt(max)" optimisation to "make the code simpler".
Because most of the time, being able to understand the code is far more
important than getting maximum performance out of it. Like I said, most
of the code that *I* write all day is just CRUD. It's all GUI stuff that
only needs to be as fast as the human sitting at the keyboard.
Post a reply to this message
|
![](/i/fill.gif) |