POV-Ray : Newsgroups : povray.off-topic : Linear regression
  Linear regression (Message 1 to 10 of 15)
From: Invisible
Subject: Linear regression
Date: 8 Jun 2010 08:26:36
Message: <4c0e36fc$1@news.povray.org>
Hi there. Does anybody here happen to know how to perform linear regression?

As in, I've got a list of data points that ought to lie in a nearly 
perfect straight line, and I want to discover what the slope and 
intercept of that line is. (But each individual point might contain a 
certain amount of measurement error.)

I've had a look on Wikipedia, but it starts talking about chi-squared 
and Student t-distributions and covariance and basically I have no clue 
what the hell any of it means. :-/



From: Mike Raiford
Subject: Re: Linear regression
Date: 8 Jun 2010 09:15:37
Message: <4c0e4279$1@news.povray.org>
On 6/8/2010 7:26 AM, Invisible wrote:
> Hi there. Does anybody here happen to know how to perform linear
> regression?
>
> As in, I've got a list of data points that ought to lie in a nearly
> perfect straight line, and I want to discover what the slope and
> intercept of that line is. (But each individual point might contain a
> certain amount of measurement error.)

It's easy enough to get a line in point-slope form, given two points. 
Could you not integrate the deviation of the points between the 
endpoints and then adjust the location of the line so it falls in the 
middle?

>
> I've had a look on Wikipedia, but it starts talking about chi-squared
> and Student t-distributions and covariance and basically I have no clue
> what the hell any of it means. :-/


-- 
~Mike



From: scott
Subject: Re: Linear regression
Date: 8 Jun 2010 09:23:55
Message: <4c0e446b$1@news.povray.org>
> Hi there. Does anybody here happen to know how to perform linear 
> regression?
>
> As in, I've got a list of data points that ought to lie in a nearly 
> perfect straight line, and I want to discover what the slope and intercept 
> of that line is. (But each individual point might contain a certain amount 
> of measurement error.)
>
> I've had a look on Wikipedia, but it starts talking about chi-squared and 
> Student t-distributions and covariance and basically I have no clue what 
> the hell any of it means. :-/

This page:

http://en.wikipedia.org/wiki/Simple_linear_regression

Gives the formula for the coefficients alpha and beta for the line of best 
fit.  It's pretty self-explanatory if you know how to calculate a summation.

The simplest form for you seems to be:

b = (mean[x*y] - mean[x] * mean[y]) / ( mean[x^2] - mean[x]^2 )

a = mean[y] - b * mean[x]

then your line of best fit is:

y = a + b*x
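
A minimal VBA sketch of those two formulas (VBA because that's the language 
used later in this thread; the Sub name FitLine and its argument layout are 
purely illustrative, not from the linked page):

Public Sub FitLine(X() As Double, Y() As Double, ByRef a As Double, ByRef b As Double)
    Dim I As Long, N As Long
    Dim SumX As Double, SumY As Double, SumXY As Double, SumXX As Double
    N = UBound(X) - LBound(X) + 1
    For I = LBound(X) To UBound(X)
        SumX = SumX + X(I)
        SumY = SumY + Y(I)
        SumXY = SumXY + X(I) * Y(I)
        SumXX = SumXX + X(I) * X(I)
    Next I
    ' b = (mean[x*y] - mean[x] * mean[y]) / (mean[x^2] - mean[x]^2)
    b = (SumXY / N - (SumX / N) * (SumY / N)) / (SumXX / N - (SumX / N) ^ 2)
    ' a = mean[y] - b * mean[x]
    a = SumY / N - b * (SumX / N)
End Sub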



From: Invisible
Subject: Re: Linear regression
Date: 8 Jun 2010 09:29:50
Message: <4c0e45ce@news.povray.org>
scott wrote:

> The simplest form for you seems to be:
> 
> b = (mean[x*y] - mean[x] * mean[y]) / ( mean[x^2] - mean[x]^2 )
> 
> a = mean[y] - b * mean[x]
> 
> then your line of best fit is:
> 
> y = a + b*x

OK. I'll go see if I can get that to work...



From: Invisible
Subject: Re: Linear regression
Date: 8 Jun 2010 09:58:33
Message: <4c0e4c89$1@news.povray.org>
> OK. I'll go see if I can get that to work...

Right, that seems to work quite well...

Any idea how to get a confidence estimate?



From: Gilles Tran
Subject: Re: Linear regression
Date: 8 Jun 2010 11:52:01
Message: <4c0e6721@news.povray.org>

Invisible wrote in message <4c0e4c89$1@news.povray.org>:
>> OK. I'll go see if I can get that to work...
>
> Right, that seems to work quite well...
>
> Any idea how to get a confidence estimate?

Years ago I wrote some VBA code to calculate regressions with all 
parameters. The code was adapted from pseudocode found in books of 
statistics, so parts of it are pure magic: I don't understand how it works. 
I can send it to you if you want.

G.



From: Invisible
Subject: Re: Linear regression
Date: 8 Jun 2010 11:57:07
Message: <4c0e6853$1@news.povray.org>
Gilles Tran wrote:

> Years ago I wrote some VBA code to calculate regressions with all 
> parameters. The code was adapted from pseudocode found in books of 
> statistics, so parts of it are pure magic: I don't understand how it works. 
> I can send it to you if you want.

Well, it could certainly be worth a giggle...

(It's surprising how many companies have this "box of VBA magic" where 
they don't know how it works, but it does.)



From: Kevin Wampler
Subject: Re: Linear regression
Date: 8 Jun 2010 12:07:40
Message: <4c0e6acc$1@news.povray.org>
Invisible wrote:
>> OK. I'll go see if I can get that to work...
> 
> Right, that seems to work quite well...
> 
> Any idea how to get a confidence estimate?

You could take a look at the section labeled "Confidence Intervals" on 
the page that scott linked.
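
For the simple one-variable case the key quantity in that section is the 
standard error of the slope; here is a minimal sketch (my own naming, assuming 
a and b are the intercept and slope already fitted earlier in the thread):

Public Function SlopeStdErr(X() As Double, Y() As Double, a As Double, b As Double) As Double
    ' Standard error of the fitted slope b of a simple linear fit y = a + b*x.
    ' A rough 95% confidence interval is b +/- 2 * SlopeStdErr for largish N;
    ' the exact multiplier is the Student t quantile with N - 2 degrees of freedom.
    Dim I As Long, N As Long
    Dim MeanX As Double, SSE As Double, SXX As Double, Resid As Double
    N = UBound(X) - LBound(X) + 1
    For I = LBound(X) To UBound(X)
        MeanX = MeanX + X(I)
    Next I
    MeanX = MeanX / N
    For I = LBound(X) To UBound(X)
        Resid = Y(I) - (a + b * X(I))     ' residual about the fitted line
        SSE = SSE + Resid * Resid         ' sum of squared residuals
        SXX = SXX + (X(I) - MeanX) ^ 2    ' spread of the x values
    Next I
    SlopeStdErr = Sqr(SSE / (N - 2) / SXX)
End Function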



From: Gilles Tran
Subject: Re: Linear regression
Date: 8 Jun 2010 12:15:08
Message: <4c0e6c8c@news.povray.org>

Invisible wrote in message <4c0e6853$1@news.povray.org>:
> Well, it could certainly be worth a giggle...

Here it is. Have fun figuring it out. Note that it pulls the data from a 
database first. Also, I've removed the code that writes the results back to 
the database.

G.

Public Sub Regression(RelationCode, NEXP)
Dim MaBd As Database, stSQL As String, DataSet As Recordset
Dim I As Integer, J As Integer, K As Integer, NOBS As Integer
Dim S(0 To 9) As Double, SSY As Double, D As Double, A As Double
Dim SPYX(0 To 9) As Double, SPXX(0 To 9, 0 To 9) As Double, VY As Double
Dim VYX(0 To 9) As Double, VXX(0 To 9, 0 To 9) As Double
Dim B(0 To 9) As Double, SB(0 To 9) As Double, R2 As Double, R2ADJ As Double, SY As Double
Dim F As Double, T(0 To 9) As Double, P As Double, PCoeff(0 To 9) As Double
Dim ta As Double, Pa As Double
Dim Unitcode As Integer, DataBasisCode As Integer, stCoeff As String
Dim SST As Double, SSR As Double, SSD As Double, SA As Double
Dim MSD As Double, MSR As Double, M(0 To 9) As Double
Set MaBd = CurrentDb
' NEXP = number of predictors
stSQL = "SELECT DISTINCTROW Y "
For I = 1 To NEXP
    stSQL = stSQL & ", X" & CStr(I)
Next I
stSQL = stSQL & " FROM RELATION_TEMP"
'Debug.Print stSQL
NOBS = 0
Set DataSet = MaBd.OpenRecordset(stSQL)
If DataSet.EOF Then Exit Sub
DataSet.MoveFirst
' Browse the data table
Do Until DataSet.EOF
    ' sum of squares Y
    SSY = SSY + (DataSet(0) ^ 2)
    ' for all variables
    For I = 0 To NEXP
        'sum of data
        S(I) = S(I) + DataSet(I)
    Next I
    ' for all predictors
    For I = 0 To NEXP - 1
        ' sum of products Y * X
        SPYX(I) = SPYX(I) + DataSet(0) * DataSet(I + 1) ' sum of products Y*X
        For J = 0 To NEXP - 1
            ' sum of products X * X
            SPXX(I, J) = SPXX(I, J) + DataSet(I + 1) * DataSet(J + 1)
        Next J
    Next I
    ' number of observations
    NOBS = NOBS + 1
    DataSet.MoveNext
Loop
DataSet.Close
If (NOBS - NEXP - 1 <= 0) Then
    frm!CurrentMessage = "Relation " & RelationCode & " - Not calculable: N obs <= " & Str(NEXP + 1) ' frm is a form in the original application
    Exit Sub
End If
' total sum of squares and variance of Y
SST = (SSY - S(0) ^ 2 / NOBS)
VY = SST / NOBS

' Means of Y and X (M(0)=mean for Y)
For I = 0 To NEXP
    M(I) = S(I) / NOBS
Next I

' covariances of Y and X
For I = 0 To NEXP - 1
    VYX(I) = (SPYX(I) - (S(0) * S(I + 1)) / NOBS) / NOBS
Next I

' covariances of the predictors X
For I = 0 To NEXP - 1
    For J = 0 To NEXP - 1
        VXX(I, J) = (SPXX(I, J) - (S(I + 1) * S(J + 1)) / NOBS) / NOBS
    Next J
Next I

' inversion (Gauss method) of the XX covariance matrix
' the original matrix is destroyed
For K = 0 To NEXP - 1
    D = VXX(K, K)
    If (D = 0) Then End 'null determinant
    VXX(K, K) = 1
    For J = 0 To NEXP - 1
        VXX(K, J) = VXX(K, J) / D
    Next J
    For I = 0 To NEXP - 1
        If (I <> K) Then
            D = VXX(I, K)
            VXX(I, K) = 0
            For J = 0 To NEXP - 1
                VXX(I, J) = VXX(I, J) - D * VXX(K, J)
            Next J
        End If
    Next I

Next K
' calculation of the coefficients

For I = 0 To NEXP - 1
    For J = 0 To NEXP - 1
        B(I) = B(I) + VXX(I, J) * VYX(J)
    Next J

Next I

' calculation of the sum of squares of deviations attributable to the regression
For I = 0 To NEXP - 1
        SSR = SSR + B(I) * VYX(I) * NOBS
Next I

' calculation of the sum of squares of deviations about the regression
SSD = SST - SSR

' R2 and R2 for the degrees of freedom
R2 = 1 - (SSD / SST)
R2ADJ = 1 - (SSD * (NOBS - 1)) / (SST * (NOBS - NEXP - 1))

' mean square attributable to the regression
MSR = SSR / NEXP

' mean square of deviations about the regression
MSD = SSD / (NOBS - NEXP - 1)

' standard error estimate
SY = Sqr(MSD)

' F value and Probability
F = MSR / MSD
P = FISHER(F, NEXP, NOBS - NEXP - 1)

' calculation of the standard error of the regression coefficients
For I = 0 To NEXP - 1
        SB(I) = Sqr(VXX(I, I) * MSD / NOBS)
        T(I) = B(I) / SB(I)
        PCoeff(I) = TSTUDENT(T(I), NOBS - NEXP - 1)
Next I

' constant a
A = M(0)
For I = 0 To NEXP - 1
        A = A - B(I) * M(I + 1)
Next I

' standard error of constant a, t-value and probability
SA = 0
For I = 1 To NEXP
    For J = 1 To NEXP
        SA = SA + M(I) * M(J) * VXX(I - 1, J - 1)
    Next J
Next I

SA = Sqr(MSD * (1 + SA) / NOBS)
ta = A / SA
Pa = TSTUDENT(ta, NOBS - NEXP - 1)

'For i = 0 To NEXP - 1
'    Debug.Print SY
'Next i

End Sub



Public Function TSTUDENT(T, DF)
' Returns the two-tailed probability that a Student's t variable with DF
' degrees of freedom exceeds T in absolute value (i.e. the p-value)
Dim teta As Double, I As Integer, K As Integer, A As Double, P As Double, Q As Double, cs As Double
Dim Pi As Double
Pi = 3.14159265358979
T = Abs(T)
teta = Atn(T / Sqr(DF))
If DF Mod 2 = 0 Then ' even degrees of freedom
    I = DF / 2 - 2
    Q = 1
    For K = 0 To I
       Q = Q * (K * 2 + 1) / (K * 2 + 2)
       cs = Cos(teta) ^ (K * 2 + 2)
       A = A + Q * cs
    Next K
    A = Sin(teta) * (A + 1)
Else    ' odd degrees of freedom
    If DF > 1 Then
        I = (DF - 1) / 2 - 1
        Q = 1
        For K = 0 To I
            If K = 0 Then
                Q = 1
            Else
                Q = Q * (K * 2) / (K * 2 + 1)
            End If
            cs = Cos(teta) ^ (K * 2 + 1)
            A = A + Q * cs
        Next K
        A = (2 / Pi) * (teta + Sin(teta) * A)

    Else
        A = 2 * teta / Pi
    End If
End If
    A = 1 - A
    If (A < 0.0001) Then A = 0
    TSTUDENT = A
'    Debug.Print P
End Function

Public Function FISHER(F, DF1, DF2)
' Returns the upper-tail probability (p-value) of an F-distributed variable
' with DF1 and DF2 degrees of freedom, where F = (X1/DF1)/(X2/DF2)
Dim teta As Double, X As Double, I As Integer, K As Integer, A As Double, B As Double
Dim P As Double, Q As Double, cs As Double, NU1 As Double, NU2 As Double
Dim Pi As Double
Pi = 3.14159265358979
If DF1 Mod 2 = 0 Or DF2 Mod 2 = 0 Then
    X = DF2 / (DF2 + DF1 * F)
    If DF1 Mod 2 = 0 Then
        NU1 = DF1
        NU2 = DF2
    Else
        NU2 = DF1
        NU1 = DF2
        X = 1 - X
    End If
    I = (NU1 - 2) / 2 - 1
    Q = 1
    P = 1
    For K = 0 To I
       Q = Q * (NU2 + K * 2) / (K * 2 + 2)
       P = P + Q * ((1 - X) ^ (K + 1))
    Next K
    If DF1 Mod 2 = 0 Then
        P = (X ^ (NU2 / 2)) * P
    Else
        P = 1 - (X ^ (NU2 / 2)) * P
    End If

Else
    teta = Atn(Sqr(DF1 * F / DF2))
'    Debug.Print "teta=", teta
    NU1 = DF1
    NU2 = DF2
    ' calculation of A
    If NU2 > 1 Then
        I = (NU2 - 1) / 2 - 1
        For K = 0 To I
            If K = 0 Then
                Q = 1
            Else
                Q = Q * (K * 2) / (K * 2 + 1)
            End If
            A = A + Q * Cos(teta) ^ (K * 2 + 1)

        Next K
        A = (2 / Pi) * (teta + Sin(teta) * A)
    Else
        A = 2 * teta / Pi
    End If


    ' calculation of B
    If NU1 > 1 Then
        B = 1
        Q = 1
        If (NU1 > 3) Then
            I = (NU1 - 3) / 2 - 1
            For K = 0 To I
                Q = Q * (NU2 + K * 2 + 1) / (K * 2 + 3)
                B = B + Q * Sin(teta) ^ (K * 2 + 2)
            Next K
        End If
'        Debug.Print "b=", B
        ' note: uses Exp of log-factorials (FACTLN) rather than the factorials
        ' themselves to avoid overflow
        B = (2 / Sqr(Pi)) * Exp(FACTLN((NU2 - 1) / 2) - FACTLN((NU2 - 2) / 2)) * Sin(teta) * (Cos(teta) ^ NU2) * B
    Else
        B = 0
    End If

    P = 1 - A + B

End If
    If (P < 0.0001) Then P = 0
    FISHER = P
End Function


Public Function FACTRL(N)
Dim J As Integer, ntop As Integer
Dim A(1 To 33) As Double
ntop = 0
A(1) = 1
If (N < 0) Then
    End
ElseIf (N <= ntop) Then
    FACTRL = A(N + 1)
ElseIf (N <= 32) Then
    For J = ntop + 1 To N
        A(J + 1) = J * A(J)
    Next J
    ntop = N
    FACTRL = A(N + 1)
Else
    FACTRL = Exp(GAMMALN(N + 1))
End If

End Function

Public Function GAMMALN(xx)
' Holy cow!
Dim ser As Double, stp As Double, tmp As Double
Dim X As Double, Y As Double, cof(1 To 6) As Double
Dim J As Integer
cof(1) = 76.1800917294715
cof(2) = -86.5053203294168
cof(3) = 24.0140982408309
cof(4) = -1.23173957245015
cof(5) = 1.20865097386618E-03
cof(6) = -5.395239384953E-06
stp = 2.506628274631
X = xx
Y = X
tmp = X + 5.5
tmp = (X + 0.5) * Log(tmp) - tmp
ser = 1.00000000019001
For J = 1 To 6
    Y = Y + 1
    ser = ser + cof(J) / Y
Next J
GAMMALN = tmp + Log(stp * ser / X)
End Function

Public Function FACTLN(N)
Dim A(1 To 100) As Double
Dim I As Integer
For I = 1 To 100
    A(I) = -100
Next I
If N < 0 Then End
If N <= 99 Then
    If A(N + 1) < 0 Then A(N + 1) = GAMMALN(N + 1)
    FACTLN = A(N + 1)
Else
    FACTLN = GAMMALN(N + 1)
End If
End Function
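
In case it helps, a guess at how Regression would be driven (the table layout 
is inferred from the SQL it builds; the relation code in the call itself is 
hypothetical):

' RELATION_TEMP is assumed to hold one row per observation, with a column Y
' and predictor columns X1 .. Xn for the relation being fitted. For two
' predictors (X1, X2) and a made-up relation code, e.g. from the Immediate
' window in Access:
Regression "R42", 2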



From: Orchid XP v8
Subject: Re: Linear regression
Date: 8 Jun 2010 13:16:56
Message: <4c0e7b08$1@news.povray.org>
>> Any idea how to get a confidence estimate?
> 
> You could take a look at the section labeled "Confidence Intervals" on 
> the page that scott linked.

Yes. Except, as originally explained, I don't understand it.

-- 
http://blog.orphi.me.uk/
http://www.zazzle.com/MathematicalOrchid*


