POV-Ray : Newsgroups : povray.off-topic : Statistics question
From: Invisible
Subject: Statistics question
Date: 18 Jan 2010 09:29:25
Message: <4b547045$1@news.povray.org>
Suppose you collect a large quantity of numbers, randomly distributed 
over the interval 0..k.

Assuming these numbers are truly random, and the distribution is truly 
uniform, presumably the arithmetic mean should be k/2.

Can anybody tell me what the value of some other statistics should 
hypothetically be? (Geometric mean, harmonic mean, quadratic mean, 
standard deviation...)



From: Darren New
Subject: Re: Statistics question
Date: 18 Jan 2010 11:58:59
Message: <4b549353$1@news.povray.org>
Invisible wrote:
> Can anybody tell me what the value of some other statistics should 
> hypothetically be? (Geometric mean, harmonic mean, quadratic mean, 
> standard deviation...)

I would think if you look up the definitions and then put in the set of 
numbers 1..k, you should come up with the answer, yes? Or is that not how 
statistics works?

http://en.wikipedia.org/wiki/Standard_deviation

-- 
Darren New, San Diego CA, USA (PST)
   Forget "focus follows mouse." When do
   I get "focus follows gaze"?



From: Vincent Le Chevalier
Subject: Re: Statistics question
Date: 18 Jan 2010 12:13:41
Message: <4b5496c5$1@news.povray.org>
Darren New wrote:
> Invisible wrote:
>> Can anybody tell me what the value of some other statistics should 
>> hypothetically be? (Geometric mean, harmonic mean, quadratic mean, 
>> standard deviation...)
> 
> I would think if you look up the definitions and then put in the set of 
> numbers 1..k, you should come up with the answer, yes? Or is that not 
> how statistics works?

I think he is looking for the value predicted from the probability 
distribution, not the estimated value for one particular set of data.

Quadratic mean and standard deviation should be fairly easy to find, as 
here:
http://mathworld.wolfram.com/UniformDistribution.html
Quadratic mean = sqrt(k^2/3)
Standard deviation = sqrt(k^2/12)
(Quite easy to compute with a simple integral, too).

Geometric and harmonic mean are more difficult... Because they are not 
commonly useful as far as I'm aware. But I guess the integrals can't be 
too complex for these either...
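
For what it's worth, here is a sketch of those two integrals, assuming the continuous uniform distribution on (0, k) and taking the geometric mean as exp(E[ln X]) and the harmonic mean as 1/E[1/X]. The symbols G and H are just labels introduced here:

```latex
G = \exp\left( \frac{1}{k} \int_0^k \ln x \, dx \right)
  = \exp\left( \frac{1}{k} \bigl[ x \ln x - x \bigr]_0^k \right)
  = \exp(\ln k - 1)
  = \frac{k}{e}

H = \left( \frac{1}{k} \int_0^k \frac{dx}{x} \right)^{-1},
\qquad \int_0^k \frac{dx}{x} = \infty
\quad \Rightarrow \quad H = 0
```

So under these assumptions the geometric mean comes out as k/e, while the harmonic mean has no useful finite target: the integral diverges at 0, and a sample harmonic mean would be expected to drift slowly toward 0 as the sample grows.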

-- 
Vincent



From: Darren New
Subject: Re: Statistics question
Date: 18 Jan 2010 12:40:15
Message: <4b549cff$1@news.povray.org>
Vincent Le Chevalier wrote:
> Darren New wrote:
>> Invisible wrote:
>>> Can anybody tell me what the value of some other statistics should 
>>> hypothetically be? (Geometric mean, harmonic mean, quadratic mean, 
>>> standard deviation...)
>>
>> I would think if you look up the definitions and then put in the set 
>> of numbers 1..k, you should come up with the answer, yes? Or is that 
>> not how statistics works?
> 
> I think he is looking for the value predicted from the probability 
> distribution, not the estimated value for one particular set of data.

That's what I understood. If he has a linear set of data, wouldn't it 
theoretically be equal quantities of each value, and hence wouldn't the 
statistics work to calculate the value from just one set of numbers per k?

> Quadratic mean and standard deviation should be fairly easy to find, as 
> here:
> http://mathworld.wolfram.com/UniformDistribution.html
> Quadratic mean = sqrt(k^2/3)
> Standard deviation = sqrt(k^2/12)
> (Quite easy to compute with a simple integral, too).

Oh, you mean "closed form". Yes, I guess that might be a little harder to 
figure out without the math to close over the expression. (I'm kind of 
curious where that 1/12'th comes from.)

I often have (unimportant, idle) questions like "how many times can I expect 
to roll a die before I've seen every face" or "... before a 1 comes up" or 
some such. Never bothered to actually write it down somewhere once I looked 
it up, tho. I suspect if I had to figure it out myself, I'd remember it.
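
For reference, a minimal sketch of those two die questions, assuming a fair die with independent rolls and counting the final successful roll. The function names are made up for illustration:

```python
from fractions import Fraction


def expected_rolls_until_face(sides: int = 6) -> Fraction:
    """Expected rolls up to and including the first appearance of one chosen face."""
    # Each roll succeeds with probability 1/sides, so the count of rolls is
    # geometrically distributed with mean `sides`.
    return Fraction(sides, 1)


def expected_rolls_all_faces(sides: int = 6) -> Fraction:
    """Expected rolls until every face has appeared at least once."""
    # Coupon-collector argument: with i faces still unseen, a roll shows a new
    # face with probability i/sides, so that stage takes sides/i rolls on average.
    return sum(Fraction(sides, i) for i in range(1, sides + 1))


if __name__ == "__main__":
    print(expected_rolls_until_face(6))        # rolls until a 1 comes up
    print(float(expected_rolls_all_faces(6)))  # rolls until all six faces seen
```

Exact fractions are used so the result is not blurred by floating-point rounding.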

-- 
Darren New, San Diego CA, USA (PST)
   Forget "focus follows mouse." When do
   I get "focus follows gaze"?



From: Vincent Le Chevalier
Subject: Re: Statistics question
Date: 18 Jan 2010 13:04:46
Message: <4b54a2be$1@news.povray.org>
Darren New wrote:
> 
> That's what I understood. If he has a linear set of data, wouldn't it
>  theoretically be equal quantities of each value, and hence wouldn't
> the statistics work to calculate the value from just one set of
> numbers per k?

Ah yes, I guess that would be like trying to numerically find the value
of the integrals instead of finding the closed form...
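
A rough sketch of that numerical approach, assuming the "perfectly uniform" data set is taken to be the midpoints of n equal-width cells covering (0, k). The function name and the choice of n are illustrative only:

```python
import math


def grid_statistics(k: float, n: int = 100_000) -> dict:
    """Statistics of n evenly spaced midpoints on (0, k)."""
    # Midpoints of n equal-width cells: a stand-in for ideal uniform data.
    xs = [(i + 0.5) * k / n for i in range(n)]
    mean = sum(xs) / n
    quadratic_mean = math.sqrt(sum(x * x for x in xs) / n)
    std_dev = math.sqrt(sum((x - mean) ** 2 for x in xs) / n)
    return {"mean": mean, "quadratic_mean": quadratic_mean, "std_dev": std_dev}


if __name__ == "__main__":
    k = 10.0
    print(grid_statistics(k))
    # Closed forms to compare against: k/2, sqrt(k^2/3), sqrt(k^2/12)
    print(k / 2, math.sqrt(k**2 / 3), math.sqrt(k**2 / 12))
```

The grid values should approach the closed forms as n increases, which is the numerical-integration view of the same integrals.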


>> Standard deviation = sqrt(k^2/12)
> 
> (I'm kind of curious where that 1/12'th comes from.)

Well... It's 1/3 - 1/2 + 1/4 :-D

standard deviation = sqrt(\int_0^k f(x) (x-m)^2 dx), where f is the
probability density and m the mean of the density.

Here f(x) = 1/k and m=k/2, so
\int_0^k f(x) (x-m)^2 dx = \int_0^k (x-k/2)^2/k dx
  = 1/k \int_0^k (x^2 - k x + k^2/4) dx
  = 1/k (k^3/3 - k^3/2 + k^3/4)
  = k^2/3 - k^2/2 + k^2/4
  = k^2 (1/3 - 1/2 + 1/4)
  = k^2/12

(There's probably a shorter way around this one but it works)
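
One candidate for the shorter route, sketched with the identity Var(X) = E[X^2] - m^2 and the same f(x) = 1/k and m = k/2 as above:

```latex
E[X^2] = \int_0^k \frac{x^2}{k} \, dx = \frac{k^2}{3}

\operatorname{Var}(X) = E[X^2] - m^2
  = \frac{k^2}{3} - \frac{k^2}{4}
  = \frac{k^2}{12}
```

This also shows where the quadratic mean comes from: it is sqrt(E[X^2]) = sqrt(k^2/3).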

-- 
Vincent



From: Darren New
Subject: Re: Statistics question
Date: 18 Jan 2010 14:08:27
Message: <4b54b1ab$1@news.povray.org>
Vincent Le Chevalier wrote:
> Ah yes, I guess that would be like trying to numerically find the value
> of the integrals instead of finding the closed form...

Yes. I'm just so clueless about statistics I'm not sure you can just "assume 
the distribution really *is* uniform". :-)

> Well... It's 1/3 - 1/2 + 1/4 :-D

Ah, bits from the powers of the integral. OK. Thanks.

-- 
Darren New, San Diego CA, USA (PST)
   Forget "focus follows mouse." When do
   I get "focus follows gaze"?



From: Invisible
Subject: Re: Statistics question
Date: 19 Jan 2010 08:06:00
Message: <4b55ae38$1@news.povray.org>
Vincent Le Chevalier wrote:

> Quadratic mean and standard deviation should be fairly easy to find, as 
> here:
> http://mathworld.wolfram.com/UniformDistribution.html
> Quadratic mean = sqrt(k^2/3)
> Standard deviation = sqrt(k^2/12)

OK, thanks.

> Geometric and harmonic mean are more difficult... Because they are not 
> commonly useful as far as I'm aware.

Yeah, fair enough. I was just wondering if they had a simple formula.

Basically I'm trying to prove that my pseudo-random number generator 
generates numbers which are at least pseudo-random...



From: Kevin Wampler
Subject: Re: Statistics question
Date: 19 Jan 2010 13:58:03
Message: <4b5600bb$1@news.povray.org>
Invisible wrote:
> Basically I'm trying to prove that my pseudo-random number generator 
> generates numbers which are at least pseudo-random...

Is there any reason you're not using one of the existing test suites for 
this?

http://www.phy.duke.edu/~rgb/General/dieharder.php



From: scott
Subject: Re: Statistics question
Date: 20 Jan 2010 03:12:13
Message: <4b56badd$1@news.povray.org>
>> Quadratic mean = sqrt(k^2/3)
>> Standard deviation = sqrt(k^2/12)
>
> OK, thanks.
>
>> Geometric and harmonic mean are more difficult... Because they are not 
>> commonly useful as far as I'm aware.
>
> Yeah, fair enough. I was just wondering if they had a simple formula.
>
> Basically I'm trying to prove that my pseudo-random number generator 
> generates numbers which are at least pseudo-random...

Obviously the results from your RNG are not going to *exactly* match the 
formulas above, so how do you plan to figure out how far away is acceptable?



From: Invisible
Subject: Re: Statistics question
Date: 20 Jan 2010 04:06:38
Message: <4b56c79e$1@news.povray.org>
scott wrote:

> Obviously the results from your RNG are not going to *exactly* match the 
> formulas above, so how do you plan to figure out how far away is 
> acceptable?

It depends on the sample size. The more random numbers you generate, the 
closer the computed values should match the expected values, although 
still not exactly.

So far, the generator produces histogram and modulus results that differ 
from the expected values by about 0.1% of the sample size, which seems 
fairly reasonable to me.

Of course, it should *also* be possible to statistically compute the 
probability of a difference from the expected value of a given size. 
(E.g., isn't the sample mean supposed to be normally distributed around 
the true mean or something?) But I don't know how to do that yet. So 
far, all I've done is compute numbers and observe that they seem to get 
closer to the expected values when I increase the sample size.
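
A minimal sketch of that idea, assuming the samples are independent draws from Uniform(0, k) and the sample is large enough for the normal approximation (central limit theorem) to be reasonable. The function name is hypothetical, and Python's `random.Random` stands in for the generator under test:

```python
import math
import random


def mean_z_score(samples, k):
    """Standardised distance of the sample mean from k/2 under Uniform(0, k)."""
    n = len(samples)
    sample_mean = sum(samples) / n
    # Standard error of the mean: the population standard deviation
    # sqrt(k^2/12) divided by sqrt(n).
    standard_error = k / math.sqrt(12 * n)
    return (sample_mean - k / 2) / standard_error


if __name__ == "__main__":
    k = 1.0
    rng = random.Random(2010)  # placeholder for the generator being tested
    samples = [rng.uniform(0, k) for _ in range(100_000)]
    print(mean_z_score(samples, k))
```

Under those assumptions the score should behave roughly like a standard normal value, so a magnitude below about 2 is unremarkable and one above 3 or so would be a reason to look closer. This checks only the mean; it says nothing about correlations between successive numbers.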




Copyright 2003-2023 Persistence of Vision Raytracer Pty. Ltd.