Is there a resource that tells how to write good unit tests? Not "given the
tests, how do you put them in a framework", but "how do you know what tests
to write, given the framework"?
My problem is, in part, that I'm not sure I can think of any non-trivial
unit tests that I'd run with any sort of frequency. Let me illustrate with
some of the sorts of routines I've recently written tests for, or have been
unable to write tests for.
1) Given a string, canonicalize it: take out punctuation and stop words, and
lower-case it, to increase the likelihood that two song titles or artists
from different sources will match. Of course I wrote tests for this. And I
got it wrong the first time without knowing it, because I forgot to account
for multiple consecutive stop words. But it's a leaf routine, so the
likelihood that I'll change it once I think it works is almost nil, and the
likelihood that changing something else could break this routine is almost
nil (since I'm using a safe language), so I can't imagine why I'd run any
tests I wrote more than once.
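To make the example concrete, here's a minimal sketch of that kind of
canonicalizer; the stop-word list and the exact rules are hypothetical, but
it shows the consecutive-stop-word case that bit me:

```python
import re

STOP_WORDS = {"the", "a", "an", "of"}  # hypothetical stop-word list

def canonicalize(title):
    """Lower-case a title, strip punctuation, and drop stop words."""
    words = re.sub(r"[^\w\s]", "", title.lower()).split()
    # Filtering the whole word list at once handles multiple consecutive
    # stop words, which a pairwise scan can easily get wrong.
    return " ".join(w for w in words if w not in STOP_WORDS)

print(canonicalize("The Alan Parsons Project"))  # -> "alan parsons project"
```

A test like `canonicalize("A The Of Yesterday!") == "yesterday"` catches the
consecutive-stop-word bug, but once it passes, it passes forever.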
2) I have a table with 6 million songs from iTunes. I have a table with
800,000 ringtones. Both get updated approximately weekly. The metadata for
each is of questionable accuracy. (For example, iTunes has about 4 different
ways of saying "artist", and it's clearly a manual process adding it,
because you can see the same mistakes in different clumps of data. The
ringtone database is in various character sets, but the people building the
database didn't bother to normalize the characters to UTF-8 nor record what
character encoding was used. Just as examples.) I also have a continuous
stream of song/artist information coming in from a third company (MG) as
well as from various individual low-speed streams (like individual radio
stations, web screen scrapes, etc). I want to be able to rapidly match the
data from these low-speed streams against the data in the iTunes and
ringtone tables with the best accuracy possible. How do I write a test to
ensure I'm serving the right answers for given songs, considering the data
changes faster than I can make an exhaustive manual test, and that making up
my own data really doesn't tell me anything of interest? Mind, too, that not
only does one company call it "the alan parsons project" and the other call
it "alan parsons", but we also want to find the rendition of Beatle's
"Yesterday" covered by the Philadelphia Philharmonic if and only if the
actual Beatles song isn't out there, for example.
So this is basically the question of "how do you write unit tests for data
sets in the millions, where the non-obvious errors might be 1% or 0.1% of the
data?"
3) We have a feed of data coming in from MG. Reading the stream is a
destructive operation. The data is vital to our ongoing service, in the
sense that if I consume data while the system is live, people will get the
wrong answer. The only way to read the stream is through an executable
binary blob provided to us by MG. How do I test responding to outages? (I.e.,
this is basically "how do you mock something whose specs are unspecified and
which you can't futz with directly?")
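The usual advice is to put your own thin interface in front of the opaque
reader and mock that interface rather than the blob itself; a sketch, with
all the class and method names hypothetical:

```python
class StreamReader:
    """Thin wrapper interface around the vendor's opaque reader."""
    def read_record(self):
        raise NotImplementedError

class VendorStreamReader(StreamReader):
    """Production implementation: would invoke MG's binary blob."""
    def read_record(self):
        ...  # call into the vendor executable here

class FailingStreamReader(StreamReader):
    """Test double that simulates an outage without touching the
    real, destructive feed."""
    def read_record(self):
        raise ConnectionError("simulated MG outage")

def consume(reader, on_outage):
    """Read one record, falling back to on_outage() if the feed is down."""
    try:
        return reader.read_record()
    except ConnectionError:
        return on_outage()

result = consume(FailingStreamReader(), on_outage=lambda: "fallback")
print(result)  # -> "fallback"
```

This only tests your outage-handling code, not the blob's actual failure
modes, which is exactly the unspecified part.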
4) We output the results to HTML pages. The pages are rather complex, and
include or exclude lots of stuff depending on what kind of result we found,
etc. You could unit-test this by generating known data sets (like "it was a
song from station X, with name and artist Y, with one ringtone perfect
match, another ringtone with the same artist but a different song, and two
songs on iTunes with the same name but different artists, one of which sells
better than the other so we want to list it first") and then looking at the
HTML that comes out. But every change to the templates means different HTML
coming out. I suppose having the data sets and templates would let you
conveniently review what you think is all the possible HTML output, but
that's also not the sort of thing you need to run if you haven't changed the
templates or tests, and whether the answer is "right" is not something you
can check algorithmically.
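One way people sidestep the byte-for-byte fragility is to assert on the
structure of the output rather than its text, e.g. by parsing the HTML and
checking which result blocks appear and in what order. A sketch, assuming a
hypothetical `data-result` attribute in the templates:

```python
from html.parser import HTMLParser

class ResultCollector(HTMLParser):
    """Collects elements tagged with a (hypothetical) data-result attribute."""
    def __init__(self):
        super().__init__()
        self.results = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "data-result" in attrs:
            self.results.append(attrs["data-result"])

rendered = ('<ul><li data-result="itunes-hit"></li>'
            '<li data-result="ringtone-match"></li></ul>')
collector = ResultCollector()
collector.feed(rendered)
# Assert on which results appear and in what order, not on the raw HTML,
# so cosmetic template changes don't break the test.
assert collector.results == ["itunes-hit", "ringtone-match"]
```

That still leaves "does the page look right" unchecked, which, as you say,
isn't algorithmically checkable anyway.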
It just seems to me that in most everything I do, either I have something
impossible to mock, something unreasonable to check extensively by hand (and
which will yield wrong results in a small percentage of cases), or there's
output from the routines that gets interpreted by something else, so there's
a huge number of equivalent results that are all "correct". (How do you
unit-test automatically that your UI javascript does the right thing in all
the browsers you care about?)
The sophisticated data structures I use all come from someone else. I don't
need to test the on-disk B-Tree implementation the SQL database engine uses,
nor do I need unit tests to check that qsort() is actually returning sorted
lists.
My code tends to work with big piles of messy data, including cases where
the result you get isn't obviously correct. The individual methods
themselves are straightforward 90% of the time (as in, call these three
searches and concatenate the resulting lists), and in the 10% of cases where
they're not, I change them only once in a blue moon, and only in ways that
would break the unit tests anyway, such that writing unit tests would be
counter-productive.
--
Darren New, San Diego CA, USA (PST)
Why is there a chainsaw in DOOM?
There aren't any trees on Mars.