POV-Ray: Newsgroups: povray.off-topic: An interesting read: Re: An interesting read

POV-Ray : Newsgroups : povray.off-topic : An interesting read : Re: An interesting read		Server Time 28 Jul 2024 14:23:31 EDT (-0400)
From: Orchid Win7 v1
Date: 12 Jan 2014 05:38:05
Message: <52d2708d$1@news.povray.org>
On 12/01/2014 06:10 AM, Jim Henderson wrote:
> On Sat, 11 Jan 2014 18:13:44 +0000, Orchid Win7 v1 wrote:
>
>> Possibly the best insight the book has to offer is "if you need to read
>> the comments to work out what the code does... the code sucks".
>
> That idea doesn't take into account others who may have to read and fix/
> modify the code and their skills with the language.
>
> Self-commenting code, of course, is a good goal - but writing code that
> is readable to you doesn't mean it's readable to everyone.
>
> Comments help bridge that gap, IMHO.

Consider the following code:

   public SortedList<int> Primes(int max)
   {
     SortedList<int> x = new SortedList<int>();
     SortedList<int> y = new SortedList<int>();
     for (int n=2; n<max; n++) x.Add(n);
     while (x.Size() > 0)
     {
       int p = x[0];
       y.Add(p);
       for (int k=1; k*p < max; k++) x.Remove(k*p);
     }
     return y;
   }

Now it doesn't take a genius to work out what this does, but a few 
comments would certainly help make it a bit clearer. There's a couple of 
bits that aren't especially obvious, and by throwing in a comment or 
two, you could clear things up.

But now suppose we refactor it to look like this:

   public SortedList<int> FindPrimesBelow(int max)
   {
     SortedList<int> candidates = InitialiseCandidates(max);
     SortedList<int> primes = new SortedList<int>();

     while (candidates.Size() > 0)
     {
       int prime = GetNextPrime(candidates);
       primes.Add(prime);
       RemovePrimeAndItsMultiples(prime, candidates, max);
     }

     return primes;
   }

   private SortedList<int> InitialiseCandidates(int max)
   {
     SortedList<int> candidates = new SortedList<int>();

     for (int n=2; n<max; n++)
     {
       candidates.Add(n);
     }

     return candidates;
   }

   private int GetNextPrime(SortedList<int> candidates)
   {
     return candidates[0];
   }

   private int RemovePrimeAndItsMultiples(int prime, SortedList<int> 
candidates, int max)
   {
     for (int k=1; k*prime < max; k++)
     {
       candidates.Remove(k*prime);
     }
   }

This now makes it pretty much drop-dead obvious what the hell the 
algorithm does; it builds a list of the numbers from N to Max, and loops 
over them. At each step, it finds the next prime, adds it to the list of 
primes, and then removes it and all its multiples from the list of 
candidates. When the list of candidates becomes empty, it returns the 
list of primes. Simples.

There are few places here where a comment would add much of value. 
Notice how by replacing candidates[0] with GetNextPrime(), we've made it 
obvious what this particular bit of code does, without writing a 
comment. It may not be obvious *why* this works, but it is now clear 
*what* it does.

This seems to be the main argument of the book. Anything non-obvious, 
either put it into a subroutine with a descriptive name, or make it into 
a variable who's name says what it's for. Basically, anything that isn't 
clear, slap a name on it to clear things up.

Notice RemovePrimeAndItsMultiples(). That's a rather verbose name. Lots 
of people would just name this "ProcessPrime()" or something. But that 
doesn't tell you what this "processing" is without looking inside the 
method body. But by calling it RemovePrimeAndItsMultiples(), you can 
tell *exactly* what the method does without ever needing to read its 
contents. You may not know *how* it does its job, but you can tell what 
its job is.

In particular, the name makes it clear that the method removes multiples 
of the prime AND THE PRIME ITSELF. That small but crucial detail wasn't 
especially obvious in the original. Sure, you can work it out easily 
enough, but it's such an important fact that it merits being called out.

When I first started work at my new job, I was surprised at how long 
some of the variable names are. But now I understand what they're trying 
(not necessarily successfully) to do: to make it obvious to anyone 
reading WTF is going on. Now if only we could do that with our class 
names too.

[I'm still bitter that we have two classes both called ItemData, and one 
is a field of the other!! Because *that* couldn't possibly lead to 
confusion at all...]

I don't have the book in front of me now, but prime number sieving is 
actually one of the examples. They even go so far as to remove the "stop 
searching after sqrt(max)" optimisation to "make the code simpler". 
Because most of the time, being able to understand the code is far more 
important than getting maximum performance out of it. Like I said, most 
of the code that *I* write all day is just CRUD. It's all GUI stuff that 
only needs to be as fast as the human sitting at the keyboard.
Post a reply to this message