

That's a really nice result of fitting a set of data points to a function. But
I wish I understood what all of this was about, starting with the fundamental
idea of the 'sum of least SQUARES' and why 'squaring' the residual data errors
is used in these techniques. I don't remember ever being introduced to that
concept in either high school or college maths classes. (But I never took a
course in statistics.) The various internet articles on the subject that I have
read over the years are woefully complex and do not explain the *why* of it.

From the Wikipedia article "Partition of sums of squares":

"The distance from any point in a collection of data, to the mean of the data,
is the deviation. ...If all such deviations are squared, then summed, as in
[equation], this gives the 'sum of squares' for these data."

The general idea of finding the AVERAGE of a set of data points is easy enough
to understand, as is finding the deviations or 'offsets' of those points from
the average. But why is 'squaring' then used? What does that actually
accomplish? I have not yet found a simple explanation.
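
To make the quoted definition concrete, here is a minimal Python sketch of the
computation as I read it. The data values and variable names are made up purely
for illustration; this only shows the arithmetic, not the reasoning behind it.

```python
# Made-up sample data, for illustration only.
data = [2.0, 4.0, 4.0, 6.0, 9.0]

# The average (mean) of the data points.
mean = sum(data) / len(data)

# The deviation ('offset') of each point from the mean.
deviations = [x - mean for x in data]

# The plain sum of the signed deviations; for this data the positive
# and negative offsets cancel exactly.
sum_of_deviations = sum(deviations)

# The "sum of squares": square every deviation, then add them up.
sum_of_squares = sum(d * d for d in deviations)

print("mean:", mean)
print("deviations:", deviations)
print("sum of deviations:", sum_of_deviations)
print("sum of squares:", sum_of_squares)
```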

