POV-Ray: Newsgroups: povray.off-topic: Noise-function statistics?


From: gregjohn
Subject: Noise-function statistics?
Date: 28 Apr 2011 10:35:01
Message: <web.4db97aaac5af6419a00085090@news.povray.org>

I have a situation (at work, if it matters) where I am seeing a population of
points or defects which are not spread out as random pixelated points but rather
as a cloud or noise3d-like function. I am wondering if we could learn more about
the cause or improve our sampling plan by thinking about it with the right
statistical ideal.

Can anyone point me to the proper terminology for this, or primers on ways to
statistically model this, or even what one can say about the physical mechanisms
when it happens this way?


From: Kevin Wampler
Subject: Re: Noise-function statistics?
Date: 28 Apr 2011 14:23:05
Message: <4db9b089@news.povray.org>

I'm not sure if I'll be of any help or not, but I'd think that some more 
details would be useful.  In particular do you have an image of this? 
Also, what's the application?  How is the image captured?  What are the 
images of?



On 4/28/2011 7:33 AM, gregjohn wrote:
> I have a situation (at work, if it matters) where I am seeing a population of
> points or defects which are not spread out as random pixelated points but rather
> as a cloud or noise3d-like function. I am wondering if we could learn more about
> the cause or improve our sampling plan by thinking about it with the right
> statistical ideal.
>
> Can anyone point me to the proper terminology for this, or primers on ways to
> statistically model this, or even what one can say about the physical mechanisms
> when it happens this way?


From: Le Forgeron
Subject: Re: Noise-function statistics?
Date: 28 Apr 2011 14:35:53
Message: <4db9b389@news.povray.org>

On 28/04/2011 16:33, gregjohn wrote:
> I have a situation (at work, if it matters) where I am seeing a population of
> points or defects which are not spread out as random pixelated points but rather
> as a cloud or noise3d-like function. I am wondering if we could learn more about
> the cause or improve our sampling plan by thinking about it with the right
> statistical ideal.
> 
> Can anyone point me to the proper terminology for this, or primers on ways to
> statistically model this, or even what one can say about the physical mechanisms
> when it happens this way?

chi-square tests.

http://en.wikipedia.org/wiki/Chi-square_distribution
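
As a rough illustration of how a chi-square test could apply here: one common
approach is to cut the inspected area into equal cells, count the defects in
each cell, and compare the counts with the equal counts expected from purely
random points. The sketch below assumes the defects are (x, y) positions in a
unit square. The function names, the 5 x 5 grid and the synthetic "clustered"
data are all made up for illustration and are not from the original post.

```python
import random

def quadrat_chi_square(points, n_bins):
    """Chi-square statistic for point counts in an n_bins x n_bins grid
    over the unit square. Returns (statistic, degrees_of_freedom)."""
    counts = [[0] * n_bins for _ in range(n_bins)]
    for x, y in points:
        # Clamp so that a coordinate of exactly 1.0 lands in the last cell.
        i = min(int(x * n_bins), n_bins - 1)
        j = min(int(y * n_bins), n_bins - 1)
        counts[i][j] += 1
    # Under complete spatial randomness every cell expects the same count.
    expected = len(points) / (n_bins * n_bins)
    stat = sum((c - expected) ** 2 / expected for row in counts for c in row)
    return stat, n_bins * n_bins - 1

def make_uniform(n, rng):
    """Hypothetical test data: independent, uniformly scattered points."""
    return [(rng.random(), rng.random()) for _ in range(n)]

def make_clustered(n, n_centers, spread, rng):
    """Hypothetical test data: points scattered around a few random centers."""
    centers = [(rng.random(), rng.random()) for _ in range(n_centers)]
    points = []
    while len(points) < n:
        cx, cy = rng.choice(centers)
        x, y = rng.gauss(cx, spread), rng.gauss(cy, spread)
        if 0.0 <= x < 1.0 and 0.0 <= y < 1.0:  # keep only points inside the square
            points.append((x, y))
    return points

rng = random.Random(0)
stat_uniform, dof = quadrat_chi_square(make_uniform(500, rng), 5)
stat_clustered, _ = quadrat_chi_square(make_clustered(500, 3, 0.03, rng), 5)
print(f"dof={dof}  uniform={stat_uniform:.1f}  clustered={stat_clustered:.1f}")
```

With 24 degrees of freedom the 5% critical value of the chi-square
distribution is roughly 36.4, so a statistic far above that suggests the
points are not spread independently. The result depends on the cell size, so
it is worth trying more than one grid.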


From: gregjohn
Subject: Re: Noise-function statistics?
Date: 28 Apr 2011 18:35:00
Message: <web.4db9eb50715e47a434d207310@news.povray.org>

Thanks, Le Forgeron, I'll read up on chi-square.

Kevin Wampler <wam### [at] uwashingtonedu> wrote:
> I'm not sure if I'll be of any help or not, but I'd think that some more
> details would be useful.  In particular do you have an image of this?
> Also, what's the application?  How is the image captured?  What are the
> images of?
>

Here's an accurate, if morbid, example.

A) Take a large auditorium filled with people.  Have each individual roll a die
or use some other random number picker. If they are over the threshold, you give
them an injection of cold virus. Those are random "point defects".

B) Then take another auditorium. Put a handful of really sick people up in the
rafters and have them sneeze on the crowd below. Some groups below will be under
a sneeze cloud, others won't. The distribution of sick people will now look
something like povray's noise3d function.  If you're sick, it's very likely the
person next to you is sick too; if you're well, your neighbor is probably well.

Now, if you're a statistician who wants to describe auditorium A, I think it's
pretty straightforward.  Your sampling plan can be pretty simple. I might even
say that if you KNOW the population has completely random defects, then you can
be lazy about how thoroughly you randomize your sampling.  But if you've chosen
a lazy sampling plan for A), say just the first two rows, and you end up with
auditorium B), you're making wrong predictions.

So that's the pitfall.  Are there any benefits when you have B)? Are there ways
to test between the "noise3d" function and true random points?
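
One simple way to tell the two auditoriums apart, sketched under some loud
assumptions: simulate both, then measure how often adjacent seats share the
same status and compare that with what independent seats would give at the
same overall sick rate. The "sneeze cloud" here is a crude hard-edged disk
around each sneezer, and every name and number in the code is hypothetical.

```python
import random

def auditorium_a(rows, cols, p, rng):
    """Case A: every seat is infected independently with probability p."""
    return [[rng.random() < p for _ in range(cols)] for _ in range(rows)]

def auditorium_b(rows, cols, n_sneezers, radius, rng):
    """Case B (crude assumption): everyone within `radius` seats of a
    randomly placed sneezer gets sick, everyone else stays well."""
    sneezers = [(rng.uniform(0, rows - 1), rng.uniform(0, cols - 1))
                for _ in range(n_sneezers)]
    return [[any((r - sr) ** 2 + (c - sc) ** 2 <= radius ** 2
                 for sr, sc in sneezers)
             for c in range(cols)] for r in range(rows)]

def neighbor_agreement(grid):
    """Fraction of adjacent seat pairs (left-right and front-back) that
    share the same status, sick or well."""
    rows, cols = len(grid), len(grid[0])
    same = total = 0
    for r in range(rows):
        for c in range(cols):
            if c + 1 < cols:
                same += grid[r][c] == grid[r][c + 1]
                total += 1
            if r + 1 < rows:
                same += grid[r][c] == grid[r + 1][c]
                total += 1
    return same / total

def agreement_if_independent(grid):
    """Agreement expected if seats were independent at the observed sick rate."""
    cells = [s for row in grid for s in row]
    p = sum(cells) / len(cells)
    return p * p + (1 - p) * (1 - p)

rng = random.Random(1)
for name, grid in [("A", auditorium_a(40, 40, 0.25, rng)),
                   ("B", auditorium_b(40, 40, 4, 6.0, rng))]:
    print(name, "observed:", round(neighbor_agreement(grid), 3),
          "if independent:", round(agreement_if_independent(grid), 3))
```

For A the two numbers should be close; for B the observed agreement should
sit well above the independent value. That gap is the clustering signal.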


From: Kevin Wampler
Subject: Re: Noise-function statistics?
Date: 28 Apr 2011 20:01:25
Message: <4db9ffd5$1@news.povray.org>

On 4/28/2011 3:33 PM, gregjohn wrote:
> Here's an accurate, if morbid, example.
>
> A) Take a large auditorium filled with people.  Have each individual roll a die
> or use some other random number picker. If they are over the threshold, you give
> them an injection of cold virus. Those are random "point defects".
>
> B) Then take another auditorium. Put a handful of really sick people up in the
> rafters and have them sneeze on the crowd below. Some groups below will be under
> a sneeze cloud, others won't. The distribution of sick people will now look
> something like povray's noise3d function.  If you're sick, it's very likely the
> person next to you is sick too; if you're well, your neighbor is probably well.
>
> Now, if you're a statistician who wants to describe auditorium A, I think it's
> pretty straightforward.  Your sampling plan can be pretty simple. I might even
> say that if you KNOW the population has completely random defects, then you can
> be lazy about how thoroughly you randomize your sampling.  But if you've chosen
> a lazy sampling plan for A), say just the first two rows, and you end up with
> auditorium B), you're making wrong predictions.
>
> So that's the pitfall.  Are there any benefits when you have B)? Are there ways
> to test between the "noise3d" function and true random points?
>

Ok, I think I see what you're talking about now.  This isn't an area I 
actually know anything about, but for what it's worth here's what I can 
think of off the top of my head.

The issue here is that your distribution of sick people in case B has a 
non-trivial covariance.  More explicitly, consider the probability that 
each person will be sick.  You can represent these probabilities in a 
vector, with one element in the vector for each person giving the 
probability with which that person will be sick.  Since there's 
randomness in who will get sick, this is a random vector, and what you 
care about is how these random vectors are distributed.

There are a few very basic statistical measures which you can use to study
the distribution of your vector.  The mean gives a vector telling you
the expected odds with which each person will get sick, and the
covariance gives you a matrix telling you how strongly the odds of getting
sick are correlated for each pair of people.  In your case A the
covariance matrix is diagonal -- that is knowing one person is sick 
doesn't tell you anything about the odds of other people being sick.  In 
your case B the covariance matrix is not diagonal, since if you know one 
person is sick you'd expect nearby people to have a higher chance of 
being sick.
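
To make the diagonal versus non-diagonal distinction concrete, here is a
minimal sketch that estimates the mean vector and covariance matrix from many
repeated runs of each case. It simplifies things by working with 0/1 sick
indicators for a single row of 10 seats rather than with probabilities, and
the case B model (one sneezer who infects everyone within 2 seats) is an
assumption made up for the example.

```python
import random

def sample_case_a(n, p, rng):
    """One run of case A: each seat is sick independently with probability p."""
    return [1.0 if rng.random() < p else 0.0 for _ in range(n)]

def sample_case_b(n, radius, rng):
    """One run of case B (assumed model): a sneezer at a random position
    infects every seat within `radius` of it."""
    center = rng.uniform(0, n - 1)
    return [1.0 if abs(i - center) <= radius else 0.0 for i in range(n)]

def mean_and_covariance(samples):
    """Sample mean vector and (unbiased) sample covariance matrix."""
    m, n = len(samples), len(samples[0])
    mean = [sum(s[i] for s in samples) / m for i in range(n)]
    cov = [[sum((s[i] - mean[i]) * (s[j] - mean[j]) for s in samples) / (m - 1)
            for j in range(n)] for i in range(n)]
    return mean, cov

rng = random.Random(2)
runs_a = [sample_case_a(10, 0.3, rng) for _ in range(2000)]
runs_b = [sample_case_b(10, 2.0, rng) for _ in range(2000)]
_, cov_a = mean_and_covariance(runs_a)
_, cov_b = mean_and_covariance(runs_b)

# Compare a diagonal entry (variance of seat 4) with an off-diagonal entry
# (covariance between neighboring seats 4 and 5) in each case.
print("A: var(4) =", round(cov_a[4][4], 3), " cov(4,5) =", round(cov_a[4][5], 3))
print("B: var(4) =", round(cov_b[4][4], 3), " cov(4,5) =", round(cov_b[4][5], 3))
```

In case A the off-diagonal entry should come out near zero, while in case B
neighboring seats should show a clearly positive covariance.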

Now, depending on your case you may be able to say some very specific 
things about the expected distribution of your random vector, but a 
standard choice which happens to be computationally tractable is to 
assume that your distribution is a (modified) multivariate Gaussian 
(with a possibly unknown mean and covariance).  I say modified here
because it's only a true Gaussian if the values in the random vector are
unbounded real numbers, but in your case they're probabilities and thus must
be between 0 and 1.  A standard solution is to model the distribution as 
a composition of a Gaussian and a sigmoid.
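
A minimal sketch of the Gaussian-plus-sigmoid idea, assuming a single row of
seats: draw a correlated Gaussian vector, then squash each entry through a
logistic sigmoid so it lands strictly between 0 and 1. The exponential falloff
covariance, the mean of -1 and the helper names are illustrative assumptions,
not something specified in this thread.

```python
import math
import random

def falloff_covariance(n, width):
    """Assumed covariance: correlation decays as exp(-distance / width)."""
    return [[math.exp(-abs(i - j) / width) for j in range(n)] for i in range(n)]

def cholesky(a):
    """Lower-triangular l with l * l^T == a, for a symmetric
    positive-definite matrix given as a list of lists."""
    n = len(a)
    l = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(l[i][k] * l[j][k] for k in range(j))
            if i == j:
                l[i][j] = math.sqrt(a[i][i] - s)
            else:
                l[i][j] = (a[i][j] - s) / l[j][j]
    return l

def sample_probabilities(mean, cov, rng):
    """Draw latent ~ N(mean, cov), then map each entry into (0, 1)."""
    n = len(mean)
    l = cholesky(cov)
    z = [rng.gauss(0.0, 1.0) for _ in range(n)]
    # mean + l z has covariance l l^T = cov
    latent = [mean[i] + sum(l[i][k] * z[k] for k in range(i + 1))
              for i in range(n)]
    return [1.0 / (1.0 + math.exp(-v)) for v in latent]

rng = random.Random(3)
probs = sample_probabilities([-1.0] * 20, falloff_covariance(20, 3.0), rng)
print([round(p, 2) for p in probs])
```

Neighboring entries of the printed vector should tend to move together,
which is the "cloud" behavior, while every value stays a valid probability.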

So, now that we have a model for the distribution of sickness in the 
room, on to your questions.

* Firstly, there are indeed advantages to case B.  Knowing that there is 
a non-trivial covariance lets you predict who will be sick with better 
accuracy using fewer samples.  For instance in the case of "perfect" 
covariance where either everyone is sick or nobody is, then you can tell 
who is sick with just a single sample.  For imperfect covariance you'll
of course need more than one sample, but if you sample well you'll always
be better off than in case A.  This of course assumes that you have some
reasonable idea what the covariance is beforehand, otherwise you
wouldn't know whether it was case A or case B in the first place.

* Secondly, you can solve for the covariance.  If you have lots of data 
(in your example this would be lots of times running the experiment) you 
can solve for it directly, at least in the case where you've assumed a 
Gaussian distribution.  The computation looks a little bit like solving 
for the mean, but gives you a matrix instead of a vector.  If you don't 
have that much data, you could express the covariance matrix in terms of 
a small number of parameters (I think these would be called 
hyperparameters) and then solve for the correct values of these 
hyperparameters in a maximum likelihood sense.  A sensible 
hyperparameter in your example would be the "width" of a falloff 
function saying how much the sickness of nearby people should covary.
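
The hyperparameter idea in the second point can be sketched as follows. This
assumes a plain zero-mean Gaussian (no sigmoid), a single row of 15 positions,
and an exponential falloff covariance whose only unknown is its width. The
data are synthetic, the candidate widths are an arbitrary grid, and all names
are hypothetical. The code scores each candidate width by its Gaussian
log-likelihood and keeps the best one.

```python
import math
import random

def falloff_covariance(n, width):
    """Assumed covariance: correlation decays as exp(-distance / width)."""
    return [[math.exp(-abs(i - j) / width) for j in range(n)] for i in range(n)]

def cholesky(a):
    """Lower-triangular l with l * l^T == a (a symmetric positive definite)."""
    n = len(a)
    l = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(l[i][k] * l[j][k] for k in range(j))
            if i == j:
                l[i][j] = math.sqrt(a[i][i] - s)
            else:
                l[i][j] = (a[i][j] - s) / l[j][j]
    return l

def draw_sample(n, width, rng):
    """One synthetic zero-mean Gaussian vector with the falloff covariance."""
    l = cholesky(falloff_covariance(n, width))
    z = [rng.gauss(0.0, 1.0) for _ in range(n)]
    return [sum(l[i][k] * z[k] for k in range(i + 1)) for i in range(n)]

def log_likelihood(samples, width):
    """Total zero-mean Gaussian log-likelihood of the samples for this width."""
    n = len(samples[0])
    l = cholesky(falloff_covariance(n, width))
    log_det = 2.0 * sum(math.log(l[i][i]) for i in range(n))
    total = 0.0
    for x in samples:
        # Forward substitution solves l y = x, so y . y == x^T cov^-1 x.
        y = []
        for i in range(n):
            y.append((x[i] - sum(l[i][k] * y[k] for k in range(i))) / l[i][i])
        quad = sum(v * v for v in y)
        total += -0.5 * (quad + log_det + n * math.log(2.0 * math.pi))
    return total

rng = random.Random(4)
true_width = 2.0  # used only to generate the synthetic data
data = [draw_sample(15, true_width, rng) for _ in range(200)]
candidates = [0.5, 1.0, 2.0, 4.0, 8.0]
best = max(candidates, key=lambda w: log_likelihood(data, w))
print("maximum-likelihood width on this grid:", best)
```

A grid search keeps the sketch short; with real data a finer grid or a
numerical optimizer over the width would be the natural next step.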

Hopefully this was helpful.  It's hard to know if it's the sort of thing 
you're looking for or not without a better idea what you want to solve.
