On 16/11/2010 03:07 PM, andrel wrote:
> If the data is from a different distribution, you have to know that
> before you can compute anything.
I guess there really are two cases to consider here.
When you want to, say, anti-alias an algorithmic image by super-sampling
it, what you are effectively trying to do is compute the integral of a
discontinuous function. Such a function is not band-limited; in principle it
contains arbitrarily high frequencies. (That's what "discontinuous" means, loosely.)
But if the results of the function are bounded, I guess you should still
be able to compute the minimum and maximum possible values the integral
could have, given the samples you've collected so far. So I guess you
just keep going until this range gets suitably narrow.
OTOH, any real interval contains an (uncountably) infinite number of
points, so a finite set of samples covers none of its length, and the
guaranteed minimum and maximum of the integral never actually narrow.
So then I guess you need to add some kind of probability estimate for
"how evil" the function you're trying to integrate might be...
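One way to put a number on that, as a rough sketch: if the function is bounded in [lo, hi] and the sample points are picked uniformly at random, Hoeffding's inequality limits the probability that the sample mean is off by more than some half-width t. The Python below is only an illustration under those assumptions; `supersample`, `edge` and all the parameter names are made up here, and this is plain Monte Carlo with a Hoeffding bound, not anything specific to a renderer.

```python
import math
import random

def hoeffding_half_width(n, lo, hi, delta):
    """Half-width t with P(|sample mean - true mean| >= t) <= delta,
    for n independent samples of a quantity bounded in [lo, hi]."""
    return (hi - lo) * math.sqrt(math.log(2.0 / delta) / (2.0 * n))

def supersample(f, lo, hi, tol, delta=0.01, batch=64, rng=None,
                max_samples=1_000_000):
    """Estimate the integral of f over [0, 1) by uniform random sampling.

    Sampling stops once the Hoeffding half-width is at most tol
    (or max_samples is reached). Returns (estimate, half_width, n).
    """
    rng = rng or random.Random()
    total = 0.0
    n = 0
    while n < max_samples:
        for _ in range(batch):
            total += f(rng.random())
        n += batch
        if hoeffding_half_width(n, lo, hi, delta) <= tol:
            break
    return total / n, hoeffding_half_width(n, lo, hi, delta), n

def edge(x):
    # Hypothetical "pixel" with a hard edge at x = 0.3; true integral is 0.3.
    return 1.0 if x < 0.3 else 0.0

if __name__ == "__main__":
    estimate, half_width, n = supersample(edge, 0.0, 1.0, tol=0.05)
    print(f"{estimate:.3f} +/- {half_width:.3f} after {n} samples")
```

Note that this half-width depends only on the sample count, not on the values seen, so it is a worst-case figure: it treats every bounded function as maximally "evil". A variance-based stopping rule would usually stop sooner, but it needs extra assumptions about the function.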
The other case is when you're trying to measure something. The thing you
want to measure should theoretically have a single, fixed value, but
each time you measure it you get a certain amount of interference. How
many times do you have to measure it? Can you assume that all
interference, from any source, is normally distributed? Hmm, tricky.
Browsing Wikipedia indicates that both the mean and the standard
deviation are easily biased by a single distant outlier, and that more
robust estimators (the median, for example) are preferable.
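To see how much damage one outlier does, here is a small Python sketch comparing the mean and standard deviation with the median and the median absolute deviation (MAD). The readings are invented for illustration, and `mad` is a helper defined here, not a standard-library function.

```python
import statistics

def mad(values):
    """Median absolute deviation: the median of |x - median(values)|."""
    values = list(values)
    centre = statistics.median(values)
    return statistics.median(abs(x - centre) for x in values)

def summarise(values):
    """Return the classical and the robust location/spread estimates."""
    return {
        "mean": statistics.mean(values),
        "stdev": statistics.stdev(values),
        "median": statistics.median(values),
        "mad": mad(values),
    }

# Made-up readings of a quantity whose "true" value is about 10.
clean = [9.8, 9.9, 10.0, 10.1, 10.2]
dirty = clean + [1000.0]  # one wildly wrong measurement

if __name__ == "__main__":
    for name, data in (("clean", clean), ("dirty", dirty)):
        print(name, summarise(data))
```

With the single bad reading added, the mean and standard deviation move by orders of magnitude, while the median and MAD barely shift.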
Then again, perhaps if you're trying to measure something, what you
actually want is the /histogram/ rather than "the value"...
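For what it's worth, building the histogram is the easy part. A minimal Python sketch, assuming fixed-width bins (the function name `histogram` and the bin width are arbitrary choices made here):

```python
import math
from collections import Counter

def histogram(values, bin_width):
    """Count values per bin; bin k covers [k * bin_width, (k + 1) * bin_width)."""
    return Counter(math.floor(v / bin_width) for v in values)

def render(counts, bin_width):
    """Return one text line per occupied bin, lowest bin first."""
    return [
        f"{k * bin_width:8.2f} | {'#' * counts[k]}"
        for k in sorted(counts)
    ]

if __name__ == "__main__":
    # Made-up measurements, purely for illustration.
    readings = [0.1, 0.2, 1.5, 2.9, 2.1]
    print("\n".join(render(histogram(readings, 1.0), 1.0)))
```

Choosing the bin width is the hard part: too wide hides the shape, too narrow leaves mostly noise.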