# An Introduction to Statistics with Klong

## Linear Regression

Here is another x/y set with values rounded to two decimal places. We know that the variables are probably correlated, because of the way in which the set has been created.

```
        XY::(!30),'rndn(;2)'err(20;20;0.15*!30)
[[ 0  0.0 ] [ 1  4.27] [ 2  1.48] [ 3 -4.26] [ 4 -5.28]
 [ 5  0.75] [ 6  0.9 ] [ 7 -1.89] [ 8 -5.27] [ 9  1.35]
 [10  0.91] [11  1.65] [12  2.39] [13  4.3 ] [14  5.63]
 [15  1.66] [16  0.64] [17  0.2 ] [18  4.46] [19 -3.62]
 [20  0.06] [21  7.27] [22  9.77] [23 -2.43] [24  4.78]
 [25  1.99] [26  3.9 ] [27  3.46] [28  7.73] [29  4.35]]
```

Fig.9: Scatter plot of the x/y set

The scatter plot in fig.9 also suggests a correlation. The `lreg` function of `nstat` can be used to compute the slope and intercept coefficients of a regression line through the x/y set:

```
        lreg(XY)
[0.201581757508342594 -1.21793548387096761]
```
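The fit can be reproduced with the usual least-squares formulas. The slope is the sum of the products of the deviations of x and y from their means, divided by the sum of the squared deviations of x. The intercept then follows from the two means. The following is only a sketch of that calculation, not the source of `lreg`. It assumes the session above, with `XY` already defined, and the names `X`, `Y`, `MX`, `MY`, `B`, and `A` are made up for illustration.

```klong
:"illustrative names only; assumes XY from the session above"
X::*'XY                  :"x column"
Y::{x@1}'XY              :"y column"
MX::(+/X)%#X             :"mean of x"
MY::(+/Y)%#Y             :"mean of y"
:"slope: summed products of deviations over summed squared x deviations"
B::(+/(X-MX)*Y-MY)%(+/(X-MX)^2)
:"intercept: the line passes through the point of means"
A::MY-B*MX
B,A
```

For the set shown above, the last expression should agree with the result of `lreg(XY)` up to rounding in the final digits.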

The function returns a tuple containing the slope and intercept values of the regression line, but these details do not have to be memorized, because the `lr` function will take care of it. While `lreg` fits a model to the data, `lr` uses the model to predict values of the Y variable of the x/y set given values of the X variable. In fig.10, `lr` is used to plot the regression line through the set. Its parameters are an independent variable and a linear regression model delivered by `lreg`.

Fig.10: Regression line through the x/y set
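As a small usage sketch, `lr` can be called on a single x value or mapped over all x values of the set. This assumes the session above, with `nstat` loaded and `XY` defined. The names `X` and `L` are only variables chosen here, and the hand-written expression assumes that the model is the slope followed by the intercept, as returned by `lreg` above.

```klong
:"assumes nstat is loaded and XY is defined as above"
L::lreg(XY)              :"model: slope and intercept"
lr(10;L)                 :"predicted y at x=10"
((*L)*10)+L@1            :"same prediction written out: slope*x+intercept"
X::*'XY                  :"x column of the set"
lr(;L)'X                 :"predicted y for every x in the set"
```

With the coefficients shown above, the prediction at x=10 should come to roughly 0.8. The projection `lr(;L)` leaves the first parameter open, so it behaves like a function of x alone.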

The `nstat` module provides two methods for quantifying the correlation between two variables, the covariance (`cov`) and the normalized correlation coefficient (Pearson's r, `cor`). They both expect each variable as a separate data set:

```
        cov(*'XY;{x@1}'XY)
15.1018333333333333
        cor(*'XY;{x@1}'XY)
0.201581757508342604
```
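To make the covariance less of a black box, here is a sketch that computes it by hand as the mean product of the deviations of both variables from their means. It assumes `XY` from the session above, and the names `X`, `Y`, `MX`, `MY`, and `C` are made up for illustration. Dividing by the number of points, rather than by one less, is the variant that reproduces the `cov` result shown above for this set.

```klong
:"illustrative names only; assumes XY from the session above"
X::*'XY                  :"x column"
Y::{x@1}'XY              :"y column"
MX::(+/X)%#X             :"mean of x"
MY::(+/Y)%#Y             :"mean of y"
:"covariance: mean product of the deviations from the means"
C::(+/(X-MX)*Y-MY)%#X
:"covariance divided by the variance of x gives the regression slope"
C%(+/(X-MX)^2)%#X
```

The last line ties the covariance back to the regression: for this set it should give the same slope as `lreg(XY)`.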

Given a model, such as a linear regression, there are several ways to examine the quality of the predictions made by the model. The `nstat` module provides the following measures: the residual sum of squares (RSS), the residual standard error (RSE), the mean squared error (MSE), and the coefficient of determination (r2):

```
        L::lreg(XY)
[0.201581757508342594 -1.21793548387096761]
        rss(XY;lr(;L))
304.98152685205783
        rse(XY;lr(;L))
3.30033292071777115
        mse(XY;lr(;L))
10.1660508950685943
        r2(XY;lr(;L))
0.230445406440760124
```
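The four measures are closely related, which the following sketch tries to show by deriving them from the residuals. It is not the `nstat` implementation. It assumes the session above, with `XY` and `L` defined, and the names `X`, `Y`, `P`, `RSS`, and `TSS` are made up for illustration. The definitions used here are the textbook ones for a line with two fitted coefficients, and they appear to match the values shown above.

```klong
:"illustrative names only; assumes XY and L from the session above"
X::*'XY                  :"x column"
Y::{x@1}'XY              :"observed y values"
P::lr(;L)'X              :"predicted y values"
RSS::+/(Y-P)^2           :"residual sum of squares"
RSS%#XY                  :"MSE: RSS divided by the number of points"
(RSS%(#XY)-2)^0.5        :"RSE: square root of RSS over n-2"
TSS::+/(Y-(+/Y)%#Y)^2    :"total sum of squares around the mean of y"
1-RSS%TSS                :"r2: share of the variation explained by the line"
```

An r2 of about 0.23 means that the regression line accounts for less than a quarter of the variation in y, which fits the widely scattered points in the plot.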