  1. Familiarity with the Conda package, dependency and virtual environment manager. A handy additional reference for Conda is the blog post “The Definitive Guide to Conda Environments” on “Towards Data Science”.
  2. Familiarity with JupyterLab. See here for my post on JupyterLab.
  3. These projects will also run as Python notebooks in VSCode with the Jupyter Notebooks extension. If you do not use VSCode, it is expected that you know how to run notebooks (or can adapt the method to whatever works best for you).

Getting started

Let’s first clone the code from part two into the regression-with-scikit-learn-part-three directory.


At this stage, the docs/linear_regression.ipynb notebook has cells up to the point where we created a regressor from a train/test split and scored it on all of our test data.

Applying cross-validation

In our notebook docs/linear_regression.ipynb, we can add the following to a new cell.
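
As a rough guide to what such a cell can look like, here is a minimal sketch of 5-fold cross-validation with scikit-learn's `cross_val_score`. It is not necessarily the exact code from the notebook. The synthetic data from `make_regression` and the names `X`, `y`, `reg` and `cv_scores` are assumptions for illustration, so swap in the features and target you already loaded in the earlier cells.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

# Assumption: synthetic stand-in data so the cell runs on its own.
# Replace X and y with the features and target from the notebook.
X, y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=42)

reg = LinearRegression()

# cv=5 splits the data into 5 folds. Each fold is held out once for scoring
# while the model is fit on the remaining four folds.
# For regressors the default score is R^2.
cv_scores = cross_val_score(reg, X, y, cv=5)

print(cv_scores)
print("Mean 5-fold R^2: {:.3f}".format(cv_scores.mean()))
```

Each entry in `cv_scores` is the R^2 for one held-out fold, so the mean gives a steadier estimate of model performance than the score from a single train/test split.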


Today’s post demonstrated how to perform k-fold cross-validation with linear regression (in particular, 5-fold cross-validation on our data set).

Resources and further reading

  1. Conda
  2. JupyterLab
  3. Jupyter Notebooks
  4. “The Definitive Guide to Conda Environments”
  5. okeeffed/regression-with-scikit-learn-part-three




