

"Kenneth" <kdw### [at] gmailcom> wrote:
> That's a really nice result of fitting a set of data points to a function.
It's the reverse. I'm fitting the function describing a line, circle, and
sphere to the measured data. It's "as close to all of the data points as it can
be _simultaneously_". And so the overall error is minimized.
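To make "the overall error is minimized" concrete, here is a minimal sketch of a least-squares fit of a straight line in plain Python. The data points and the helper names `fit_line` and `sum_sq_residuals` are made up for illustration. It uses the textbook closed-form solution and is not necessarily how the line, circle, and sphere fits discussed here were computed.

```python
# Hypothetical measured points (invented for this example), roughly y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.1]

def fit_line(xs, ys):
    """Return (slope, intercept) minimizing the sum of squared vertical residuals."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def sum_sq_residuals(xs, ys, slope, intercept):
    """Sum of squared differences between the data and the line."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))

slope, intercept = fit_line(xs, ys)
best = sum_sq_residuals(xs, ys, slope, intercept)
print("slope =", slope, "intercept =", intercept, "sum of squares =", best)

# Nudging the fitted line in any direction makes the total squared error larger.
print(sum_sq_residuals(xs, ys, slope + 0.05, intercept) > best)
print(sum_sq_residuals(xs, ys, slope, intercept - 0.05) > best)
```

The line does not pass exactly through any particular point; it is the one line for which the total squared miss over all points is smallest.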
> But
> I wish I understood what all of this was about starting with the fundamental
> idea of the 'sum of least SQUARES' and why 'squaring' the residual data errors
> is used in these techniques.
I never did much in those areas either. And to be honest, it's "dangerous" to
be doing serious work with measured values and not understand the basics of the
statistics. When I worked at American Cyanamid, they had a statistician come in
and give a presentation in which he showed like 8 sets of data that all had
the same standard deviation, and all looked completely different.
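A tiny illustration of that point, as a sketch only: the two data sets below are invented for this example (they are not the statistician's), and they share the same mean and the same population standard deviation while looking nothing alike.

```python
from statistics import mean, pstdev

# Two made-up data sets, for illustration only.
two_clusters = [1, 1, 1, 1, 3, 3, 3, 3]   # values split evenly into two clumps
two_outliers = [2, 2, 2, 2, 2, 2, 0, 4]   # mostly constant, plus two outliers

for name, data in (("two_clusters", two_clusters),
                   ("two_outliers", two_outliers)):
    # Both sets report the same summary numbers despite their different shapes.
    print(name, "mean =", mean(data), "std dev =", pstdev(data))
```

A single summary number cannot tell these apart, which is why it pays to plot the data before trusting the statistics.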
> The general idea of finding the AVERAGE of a set of data points is easy enough
> to understand, as is finding the deviations or 'offsets' of those points from
> the average.
This is an acceptable method, and of course can be found in early treatments of
computing the errors in data sets.
> But why is 'squaring' then used? What does that actually
> accomplish? I have not yet found a simple explanation.
"it makes some of the math simpler" especially when doing multiple dimension
analyses.
(the variance is equal to the expected value of the square of the distribution
minus the square of the mean of the distribution)
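That identity is easy to check numerically. A small sketch with made-up numbers (the list `data` is an assumption for the example): the mean of the squares minus the square of the mean matches the population variance computed directly from the squared deviations.

```python
from statistics import fmean, pvariance

# Made-up sample values, for illustration only.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

mean_value = fmean(data)                       # E[X]
mean_of_squares = fmean([x * x for x in data])  # E[X^2]

shortcut = mean_of_squares - mean_value ** 2    # E[X^2] - (E[X])^2
direct = pvariance(data)                        # average squared deviation from the mean

print("shortcut =", shortcut, "direct =", direct)
```

No comparably simple shortcut exists for the average absolute deviation, which is part of what "simpler math" means here.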
You're also doing a sort of Pythagorean/Euclidean distance calculation, and
that's done with squares rather than absolute values.
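As a rough sketch of the distance idea: treat the observed values and the fitted values as two points in n-dimensional space. The sum of squared residuals is then the squared straight-line (Euclidean) distance between them, whereas the sum of absolute residuals is the "taxicab" distance. The numbers below are made up for illustration.

```python
import math

# Made-up observed values and the values a fitted model predicts for them.
observed = [1.1, 2.9, 5.2, 6.8, 9.1]
fitted = [1.0, 3.0, 5.0, 7.0, 9.0]

residuals = [o - f for o, f in zip(observed, fitted)]

sum_of_squares = sum(r * r for r in residuals)   # squared Euclidean distance
sum_of_abs = sum(abs(r) for r in residuals)      # taxicab distance

# math.dist computes the Euclidean distance between two points.
distance = math.dist(observed, fitted)

print("sum of squares =", sum_of_squares)
print("distance squared =", distance ** 2)
print("sum of absolute residuals =", sum_of_abs)
```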
"Variances are additive for independent random variables"
"Say I toss a fair coin 900 times. What's the probability that the number of
heads I get is between 440 and 455 inclusive? Just find the expected number of
heads (450), and the variance of the number of heads (225 = 15^2),
then find the probability with a normal (or Gaussian) distribution with
expectation 450 and standard deviation 15 is between 439.5 and 455.5."
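The coin example can be worked through in a few lines. This is a sketch using only the Python standard library; the variable names are my own. One fair toss has variance 0.5 * 0.5 = 0.25, and because variances add for independent tosses, 900 tosses give 900 * 0.25 = 225, hence the standard deviation of 15. The exact binomial sum is included only as a cross-check on the normal approximation.

```python
from math import comb, sqrt
from statistics import NormalDist

n_tosses = 900
p_heads = 0.5

var_one_toss = p_heads * (1 - p_heads)    # 0.25 for a fair coin
expected_heads = n_tosses * p_heads        # 450
var_heads = n_tosses * var_one_toss        # 225, by additivity of variances
std_heads = sqrt(var_heads)                # 15

# Normal approximation, with the 439.5 / 455.5 continuity correction from the quote.
approx_dist = NormalDist(mu=expected_heads, sigma=std_heads)
approx = approx_dist.cdf(455.5) - approx_dist.cdf(439.5)

# Exact answer: count the outcomes with 440..455 heads out of 2**900 equally likely ones.
favourable = sum(comb(n_tosses, k) for k in range(440, 456))
exact = favourable / 2 ** n_tosses

print("normal approximation:", approx)
print("exact binomial:      ", exact)
```

Note that the standard deviations themselves do not add: 900 tosses give 15, not 900 * 0.5 = 450. Only the squared quantity has this convenient property.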
"while the absolute value function (unsquared) is continuous everywhere, its
first derivative is not (at x=0). This makes analytical optimization more
difficult"
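A small sketch of what that means in practice, with made-up data. For the sum of squares, setting the derivative -2 * sum(x - c) to zero gives c = mean directly. The sum of absolute values has a kink at every data point, so there is no such formula from calculus; its minimizer turns out to be the median. The brute-force search below over a grid of candidate values (an assumption chosen just for this demo) shows both results.

```python
from statistics import mean, median

# Made-up data with one large outlier, for illustration only.
data = [1, 2, 3, 4, 100]

# Candidate "centre" values 0.0, 0.5, 1.0, ..., 100.0.
candidates = [i / 2 for i in range(0, 201)]

def sum_sq(c):
    """Total squared deviation of the data from c."""
    return sum((x - c) ** 2 for x in data)

def sum_abs(c):
    """Total absolute deviation of the data from c."""
    return sum(abs(x - c) for x in data)

best_sq = min(candidates, key=sum_sq)
best_abs = min(candidates, key=sum_abs)

print("minimizes squares:        ", best_sq, " mean   =", mean(data))
print("minimizes absolute values:", best_abs, " median =", median(data))
```

The outlier drags the least-squares answer far more than the least-absolute one, which is the usual trade-off: squares are easier to optimize, absolute values are more robust.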
https://stats.stackexchange.com/questions/118/whysquarethedifferenceinsteadoftakingtheabsolutevalueinstandarddevia
https://stats.stackexchange.com/questions/46019/whysquaredresidualsinsteadofabsoluteresidualsinolsestimation
"A lot of reasons."

