From Vicki (adjusted):
https://online.stat.psu.edu/stat462/node/170/
Talk about how a data point can be both an outlier (its response behaves differently from the others) and influential (it changes the beta coefficients a fair bit) without being high-leverage.
This nuance could possibly live in a drop-down bonus box in the CS course materials. The distinction between an outlier, a high-leverage point and an influential point is not relevant for everyone, and we don’t have to be super precise with wording in the lecture, but having the “correct” wording written down somewhere might be a good idea?
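A possible starting point for the bonus box: a minimal sketch of the outlier / high-leverage / influential distinction on made-up data. Everything here is an assumption for illustration only: the data values, the helper name `influence_diagnostics`, and the cut-offs (leverage above 2p/n, absolute studentized residual above 2), which are common rules of thumb rather than hard definitions (some texts use 3p/n for leverage).

```python
import numpy as np


def influence_diagnostics(x, y):
    """Fit y = b0 + b1*x by least squares and return
    (beta, leverage, studentized residuals, Cook's distance)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    X = np.column_stack([np.ones_like(x), x])    # design matrix with intercept
    n, p = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    H = X @ np.linalg.inv(X.T @ X) @ X.T         # hat matrix
    leverage = np.diag(H)                        # h_ii: how unusual x_i is
    resid = y - X @ beta
    mse = resid @ resid / (n - p)
    # Internally studentized residuals: how unusual y_i is given x_i
    student = resid / np.sqrt(mse * (1.0 - leverage))
    # Cook's distance: how much the fit moves when point i is dropped
    cooks = student**2 * leverage / (p * (1.0 - leverage))
    return beta, leverage, student, cooks


# Made-up data: ten points close to y = 2 + 3x ...
x_clean = np.arange(1, 11, dtype=float)
noise = np.array([0.3, -0.2, 0.1, -0.4, 0.2, -0.1, 0.4, -0.3, 0.1, -0.1])
y_clean = 2.0 + 3.0 * x_clean + noise

# ... plus one point with an ordinary x value but a far-off y value
x_all = np.append(x_clean, 2.0)
y_all = np.append(y_clean, 25.0)

beta_clean, _, _, _ = influence_diagnostics(x_clean, y_clean)
beta_all, leverage, student, cooks = influence_diagnostics(x_all, y_all)

n, p = len(x_all), 2
i = n - 1  # index of the added point
print("coefficients without the point:", beta_clean)
print("coefficients with the point:   ", beta_all)
print("high leverage (h > 2p/n)?      ", leverage[i] > 2 * p / n)
print("outlier (|r| > 2)?             ", abs(student[i]) > 2)
print("largest Cook's distance?       ", int(np.argmax(cooks)) == i)
```

In this sketch the added point sits inside the range of the other x values, so its leverage stays below the cut-off, yet its y value is far from the trend and the fitted intercept and slope shift noticeably when it is included. With only eleven points, a single outlier like this can be influential even without being high-leverage; in a larger data set the same point would typically move the coefficients much less.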