|
![](/i/fill.gif) |
Am 12.01.2014 22:21, schrieb andrel:
> My rule of thumb is that whenever a piece of code does a single thing
> that you can write a specification for that is independent of the
> context, it should be a subroutine/function/method. With a name that
> describes that specification.
> Which would mean that because processing can be specified, and
> generating output can be specified also, and both specifications are
> independent, I would end up with two functions and a wrapper.
>
>
> I am working with another group that has apparently a different opinion
> on writing code. Their code has numerous instances of this code snippet
>
> S=getSmode(T,dep,rep, GEOM.SPECS ,4);
> PHI=lowpassma(GEOM.AMA*S,5);
>
> this is a short version, there are others that do it in one line, take
> subsets and do other stuff as well:
>
> PSIA =lowpassma(SCAN.AMA * getSmode(ones(length(GEOM.VER),1) * ( 1 :
> SCAN.qrsduration ),dep,SCAN.rep,
> GEOM.SPECS,SCAN.scanmode,GEOM),SCAN.lpass);
>
> (never mind that most variables have different names in this place for
> no obvious reason, it still does the same thing. There are many more
> unacceptable things in this snippet of code, but let's stick to the
> naming issue here.)
>
> Question: what do you think this does?
I have no freaking idea.
> Would you create a separate function for this, even if it is just one or
> two lines and how would you name that function?
Absolutely - even though for different reasoning: One of my primary
personal mantras is that no code sequence should appear more than once
in any given program - not even (or, especially not!) with slight
variations.
Probably the most obvious reason for this is a maintainability aspect:
Having the code sequence in just one place means that if it needs to be
changed, it only needs to be changed in one single place. But there are
many other code quality aspects besides:
- If encapsulated in a function, the code sequence gets a better test
coverage. (If you have copies of it in multiple places, one of the
copies might have a typo that the others don't, and if Murphy's right
your test cases will be such that they'd spot the bug in any of the
copies except the one it happens to be in).
- If encapsulated in a function, you can more easily do unit tests on
this particular code sequence.
- If encapsulated in a function, fixing a bug in the code sequence for
one instance where it is used means fixing it for all instances where it
is used.
- If encapsulated in a function, you can more easily document any
assumptions the code sequence relies on. You also have a better chance
of verifying that those assumptions are valid everywhere the code
sequence is used.
- If encapsulated in a function, well-documented and well-tested, you
have a reliable building brick to use in new features.
As for a function's name, I find it very important that it denotes not
/how/ the function does something, nor just /what/ it does, but also
with what /intention/.
For instance, while refactoring the old POV-Ray code I came across
plenty of places where old code computes /some/ scalar measure of an RGB
colour's "intensity": Some just take the value of the brightest colour
component, i.e. max(r,g,b); others take the greyscale value of the
colour, i.e. (r*R)+(g*G)+b*B) with some constant R,G,B. Some use
absolute values, some just keep negative values around. And some add
some more twists to it.
Why are there so many variations on this theme?
Chances are, at least some of these don't always do what they should do.
Some will probably break down for special cases, while some may work
equally well for all cases but are fundamentally unsuited for the
purpose they're used for.
As a first step towards cleaning up all this mess, I ended up with six
different functions:
(1) A function to be called to compute a colour's greyscale value.
(2) A function to be called to compute how intense the most intense
colour channel is.
(3) A function to be called to compute some more or less abstract
"weight" or "importance" from a colour, currently implemented as the
colour's greyscale value. (This reeks of bogosity, and will probably be
replaced by (5) later.)
(4) A function to be called to compute some more or less abstract
"weight" or "importance" from a colour, currently implemented as the
intensity of the most intense colour channel. (This, too, reeks of
bogosity, and will probably be replaced by (6) later.)
(5) A function to be called to compute some more or less abstract
"weight" or "importance" from a colour, currently implemented as the
/absolute/ of the colour's greyscale value. (Less bogus than (3), but
ultimately to be revised as well.)
(6) A function to be called to compute some more or less abstract
"weight" or "importance" from a colour, currently implemented as the
highest /absolute/ intensity of all the colour channels. (Less bogus
than (4), but ultimately to be revised as well.)
Note how the functions (1) and (3), as well as (2) and (4), do exactly
the same thing - yet I decided to separate them, because they do it for
different purposes. (In order to avoid duplicate code, (3) and (4)
actually just call (1) and (2), respectively.)
Also note how (3) to (6) all do whatever they do for apparently the very
same purpose - computing some more or less abstract "weight" or
"importance" from a colour. So the fact that there exist four
functionally different implementations, without any documented reason
whatsoever, is a strong indication that at least three of them are wrong.
After some serious reasoning, I came to the conclusion that all four of
them were wanting, and that the task was best solved by yet another,
seventh function:
(7) A function to be called to compute some more or less abstract
"weight" or "importance" from a colour, implemented as the average of
the absolute of each colour channel.
The intended transition from (3)-(6) to (7) should be an easy job now
that they're all encapsulated into functions, but only because I have
distinguished between (1)-(2) and (3)-(4) right from the start.
Post a reply to this message
|
![](/i/fill.gif) |