
Prerequisites

  1. Familiarity with the Conda package, dependency and virtual environment manager. A handy additional reference for Conda is the blog post “The Definitive Guide to Conda Environments” on “Towards Data Science”.
  2. Familiarity with JupyterLab. See my earlier post on JupyterLab.
  3. These notebooks will also run in VSCode with the Jupyter Notebooks extension. If you do not use VSCode, you are expected to know how to run notebooks another way (or to adapt the steps to whatever works best for you).
  4. Read “Regression With Scikit Learn (Part One)”

Getting started

Let’s create the regression-with-scikit-learn-part-two project by cloning the work we did yesterday. The required packages will be available in our Conda environment.

Linear regression basics

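The equation that describes the fitted line is y = ax + b, where y is the target, x is the single feature, and a (the slope) and b (the intercept) are the parameters the model learns.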
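To make the roles of the slope and intercept in y = ax + b concrete, here is a minimal sketch using scikit-learn's LinearRegression. The data is made up for illustration (points lying exactly on y = 2x + 1) and is not the housing data from Part One.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Made-up data lying exactly on the line y = 2x + 1 (illustrative only).
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 2.0 * x + 1.0

# scikit-learn expects a 2D feature array of shape (n_samples, n_features).
X = x.reshape(-1, 1)

reg = LinearRegression()
reg.fit(X, y)

a = reg.coef_[0]    # learned slope
b = reg.intercept_  # learned intercept
print(f"a = {a:.2f}, b = {b:.2f}")
```

Because the points sit exactly on a line, the fit should recover a slope close to 2 and an intercept close to 1. With real, noisy data the fit instead picks the a and b that minimise the squared error.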

Higher dimensions of linear regression

So far, our examples have worked in a dimension that is easy to picture, with y calculated from a single feature on the x-axis (in yesterday's example, this was "Number Of Rooms (feature) vs Value Of House (target variable)").

  1. Array with the features.
  2. Array with the target variable.
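As a rough sketch of how these two arrays are used with more than one feature: with n features the line generalises to y = a1*x1 + a2*x2 + ... + an*xn + b, and the fit learns one coefficient per feature plus an intercept. The data, coefficients and variable names below are synthetic and chosen purely for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)

# 1. Array with the features: 100 samples, 3 features (synthetic).
X = rng.normal(size=(100, 3))

# 2. Array with the target variable, built from known coefficients
#    (assumed values, no noise) so we can check what the fit recovers.
true_coef = np.array([1.5, -2.0, 0.5])
true_intercept = 4.0
y = X @ true_coef + true_intercept

reg = LinearRegression().fit(X, y)

print(reg.coef_.shape)   # one coefficient per feature
print(reg.coef_, reg.intercept_)
```

The call to fit is the same as in the single-feature case. Only the shape of the features array changes.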

Applying the train/test split to our dataset

In our file docs/linear_regression.ipynb, we can add the following:
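The notebook's original code is not reproduced here, so the following is only a sketch of what a train/test split followed by a fit could look like. It uses synthetic data from make_regression as a stand-in for the housing dataset, and the test_size, random_state and variable names are assumptions rather than the values used in the repository.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the real dataset (assumption for illustration).
X, y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=42)

# Hold out 30% of the rows for testing; random_state makes the split repeatable.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

# Fit on the training rows only.
reg_all = LinearRegression()
reg_all.fit(X_train, y_train)

# Predict on the held-out rows and score the fit with R^2.
y_pred = reg_all.predict(X_test)
r2 = reg_all.score(X_test, y_test)
print(f"R^2 on the test set: {r2:.3f}")
```

Scoring on rows the model never saw during fitting gives a fairer picture of how it will behave on new data than scoring on the training rows.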

Summary

Today’s post covered the math that describes the line produced by a linear regression fit, and applied a train/test split to our dataset.

Resources and further reading

  1. Conda
  2. JupyterLab
  3. Jupyter Notebooks
  4. “The Definitive Guide to Conda Environments”
  5. okeeffed/regression-with-scikit-learn-part-two
  6. Ordinary Least Squares (OLS)


Senior Engineer @ UsabilityHub. Formerly Culture Amp.

Dennis O'Keeffe
