Conversation
* On gravity
* On athletic/IQ scores
…esampling-with into correlation_causation
* Need to iterate on this, not add to it.
* Lightly edit the rest.
> (column) vector. Note that we want to find a solution for any such system - there are no conditions other than that the
> number of rows of $\mathbf{x}$ must be the same as the number of columns of $A$, and the number of rows of $\mathbf{y}$
> must be the same as the number of rows of $A$. Note in particular that $m$ need not equal $n$, and if $m=n$ we don't
> require that the determinant of $A$ be non-zero. Let's give examples of the typical situations that one encounters in general,
the determinant is sure to confuse
> 1. The first example represents all systems of equations where $m=n$ with non-zero determinant. In all these cases the equation is *solvable*, for example by using
> Gaussian elimination with partial pivoting. (Not Cramer's rule!)
Those new to linear algebra won't know about Gaussian elimination or pivoting
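For the $m=n$, non-zero-determinant case quoted above, a short numerical sketch may help; the matrix and right-hand side below are hypothetical, and `np.linalg.solve` uses an LU factorization with partial pivoting, i.e. the Gaussian elimination the text mentions:

```python
import numpy as np

# A hypothetical 2x2 system with non-zero determinant (m = n = 2).
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
y = np.array([3.0, 5.0])

# np.linalg.solve performs LU factorization with partial pivoting,
# i.e. Gaussian elimination, under the hood.
x = np.linalg.solve(A, y)

# Verify the result satisfies A x = y.
print(np.allclose(A @ x, y))
```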
> $$
> This equation represents all equations where there are more equations than unknowns, i.e. all *overdetermined* systems.
> Since this system cannot be solved exactly, we look
> instead for a solution that best fits the system in a sense that we'll explain later. Please take our word for it, for now,
"take our word" is my least favorite expression! Perhaps we can give a quick intuitive version of the answer, e.g., we have to pick a value, and it looks like that value will have to be somewhere between 1 and 1.2. A best guess turns out to be , and we'll soon learn why.
> that the solution is given by the *normal equations*,
> $$
> A^T A \mathbf{x} = A^T \mathbf{y}.
> $$
> Here $A^T$ is the transpose of $A$, $A^TA$ is an $n\times n$, square, symmetric matrix and $A^T \mathbf{y}$ is an $n\times 1$ (column)
> vector. Moreover, if the columns of $A$ are linearly independent, it can be shown that $A^TA$ has an inverse,
> a situation that is almost always true.
I'm not sure this is helpful until the reader can comprehend what it says.
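The normal equations in the quoted passage can be checked numerically. A minimal sketch, with made-up data, comparing the normal-equations solution against NumPy's direct least-squares solver:

```python
import numpy as np

# A hypothetical overdetermined system: 4 equations, 2 unknowns.
A = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0],
              [1.0, 4.0]])
y = np.array([1.1, 1.9, 3.2, 3.8])

# Form and solve the normal equations A^T A x = A^T y.
x_normal = np.linalg.solve(A.T @ A, A.T @ y)

# np.linalg.lstsq solves the same least-squares problem directly.
x_lstsq, *_ = np.linalg.lstsq(A, y, rcond=None)

print(np.allclose(x_normal, x_lstsq))  # True
```

In practice `lstsq` is preferred numerically, but the normal equations make the algebra of the chapter concrete.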
> to identify a natural solution among the infinity of available solutions. To find this solution one has to calculate
> the generalized inverse, which would take us too far from our core focus. But it turns out that it can again be cast as an
Again, I think simply mentioning these terms will lead to confusion. We'd need to find an accessible way to explain that a solution is possible, but that you need to go about it carefully.
> It should be obvious that this is indeed a solution of the equation. What is more, it is the solution
> that is the closest to the origin, i.e. out of the infinite number of solutions, this is the one with the
> shortest length.
This type of language is good for the beginner level.
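The minimum-norm solution the quoted passage describes can be sketched with NumPy's pseudo-inverse (`np.linalg.pinv`) playing the role of the generalized inverse; the one-equation system below is hypothetical:

```python
import numpy as np

# A hypothetical underdetermined system: 1 equation, 2 unknowns
# (x1 + x2 = 2 has infinitely many solutions).
A = np.array([[1.0, 1.0]])
y = np.array([2.0])

# The pseudo-inverse (generalized inverse) selects, out of the
# infinitely many solutions, the one closest to the origin.
x = np.linalg.pinv(A) @ y
print(x)        # the minimum-norm solution, [1. 1.]
print(A @ x)    # [2.], so it really does solve the equation
```

Any other solution has the form $(2-t, t)$, and its squared length $(2-t)^2 + t^2$ is smallest at $t=1$, matching the output.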
> Returning to our question above, we want to identify the value of $\mathbf{x}$ that will minimize the errors,
> $e_1, \ldots, e_m$. We are back at the question: minimize in what sense? A generally used measure for the error is,
> $$
> \mathbf{e}^T\mathbf{e} = e_1^2 + \cdots + e_m^2,
I'd swap these around, since the student is more likely to understand the latter.
> Armed with the normal equations we can explain the linear correlation between variables.
> :::{.callout-note}
I like this callout a lot.
> \mathbf{y}^T \mathbf{y}.
> $$
> In order to find the values of $\mathbf{x}$ that will minimize the sum of the squares of the errors, we need to set the
> partial derivatives with respect to all the components, $x_1, \ldots, x_n$, to zero. The detailed calculations are messy
Is it also messy to derive it from e_1^2 + e_2^2 ...?
I.e., can we get a sense of least squares without matrix formulation, and in the end just state that the solution can also be written as ... using matrices?
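Along the lines of the reviewer's suggestion, least squares can be shown without the matrix formulation: sum $e_1^2 + \cdots + e_m^2$ directly for a range of candidate values and pick the smallest. The single-slope model and data below are hypothetical:

```python
import numpy as np

# Hypothetical data: one unknown slope b, model y ≈ b * t.
t = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 1.9, 3.2, 3.8])

def sum_sq_error(b):
    # e_1^2 + ... + e_m^2, no matrix notation needed.
    e = y - b * t
    return np.sum(e ** 2)

# Brute force: scan candidate slopes, keep the one with smallest error.
candidates = np.linspace(0, 2, 2001)
best = min(candidates, key=sum_sq_error)

# The calculus answer: setting d/db of the sum of squares to zero
# gives b = sum(t*y) / sum(t*t).
b_exact = np.sum(t * y) / np.sum(t * t)
print(best, b_exact)  # the two agree to the grid resolution
```

Only at the end need one remark that, written with matrices, `b_exact` is exactly the normal-equations solution.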
Here's the slope with assumed intercept of 0: https://lisds.github.io/textbook/mean-slopes/mean_and_slopes.html

Source at: https://github.com/lisds/textbook/ including datasets.
Incomplete