Prerequisites

  1. Familiarity with the Conda package, dependency and virtual environment manager. A handy additional reference for Conda is the blog post “The Definitive Guide to Conda Environments” on “Towards Data Science”.
  2. Familiarity with JupyterLab. See here for my post on JupyterLab.
  3. These projects also run as Python notebooks in VSCode with the Jupyter Notebooks extension. If you do not use VSCode, you are expected to know how to run notebooks (or to adapt the method to whatever works best for you).
  4. Read “Regression With Scikit Learn (Part One)”

Getting started

Let’s create the regression-with-scikit-learn-part-two project by copying the work we did yesterday. The required packages are already available in our Conda environment.

Linear regression basics

The fitted line is described by the equation y = ax + b, where y is the target variable, x is the single feature, a is the slope of the line and b is the intercept.
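As a rough sketch of where the slope a and intercept b of a line y = ax + b come from, the snippet below fits them with ordinary least squares in plain Python. The helper name fit_line and the data points are made up for illustration; they are not taken from the project's dataset.

```python
def fit_line(xs, ys):
    """Ordinary least squares for a single feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # The slope is the covariance of x and y divided by the variance of x.
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    # The fitted line always passes through the point (mean_x, mean_y).
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up points that lie exactly on the line y = 2x + 1.
rooms = [1.0, 2.0, 3.0, 4.0]
values = [3.0, 5.0, 7.0, 9.0]

a, b = fit_line(rooms, values)
print(f"y = {a}x + {b}")
```

With real data the points will not sit exactly on a line, and a and b are the values that minimise the sum of squared vertical distances between the points and the line.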

Higher dimensions of linear regression

So far, our examples have worked in a dimension that is easy to picture, with y calculated from a single feature on the x-axis (in yesterday's example, this was "Number Of Rooms" (feature) vs "Value Of House" (target variable)).

  1. Array with the features.
  2. Array with the target variable.
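To make the two arrays concrete, here is a minimal sketch with two features instead of one, so the model becomes y = a1*x1 + a2*x2 + b. The numbers are invented so that the true coefficients are known in advance, and the names X, y and model are just the usual scikit-learn conventions rather than anything from the project.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Features array: one row per sample, one column per feature (shape 5 x 2).
X = np.array(
    [
        [1.0, 2.0],
        [2.0, 1.0],
        [3.0, 4.0],
        [4.0, 3.0],
        [5.0, 5.0],
    ]
)

# Target array: one value per sample, built here as y = 3*x1 + 2*x2 + 1.
y = 3.0 * X[:, 0] + 2.0 * X[:, 1] + 1.0

model = LinearRegression()
model.fit(X, y)

# One coefficient per feature, plus a single intercept.
print(model.coef_, model.intercept_)
```

The fit returns one coefficient per column of X, which is all "higher dimensions" means here: the line becomes a plane (or hyperplane), but the call to fit is unchanged.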

Applying the train/test split to our dataset

In our file docs/linear_regression.ipynb, we can add the following:
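The original notebook cell is not reproduced here, so what follows is only a sketch of a train/test split with scikit-learn. It assumes synthetic data standing in for the "number of rooms" feature and the house value target; the split ratio, the random seeds and the variable names are illustrative choices, not the project's actual values.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in data: 100 samples, one feature, roughly linear target.
rng = np.random.default_rng(42)
X = rng.uniform(3.0, 9.0, size=(100, 1))
y = 9.0 * X[:, 0] - 30.0 + rng.normal(0.0, 1.0, size=100)

# Hold back 25% of the samples so the model is scored on unseen data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42
)

# Fit on the training portion only.
reg = LinearRegression()
reg.fit(X_train, y_train)

# score() returns R squared on the held-out test portion.
r_squared = reg.score(X_test, y_test)
print(f"R^2 on test set: {r_squared:.3f}")
```

Scoring on the held-out portion gives a fairer picture of how the fitted line generalises than scoring on the same samples it was trained on.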

Summary

Today’s post covered the math that describes the line produced by a linear regression fit.

Resources and further reading

  1. Conda
  2. JupyterLab
  3. Jupyter Notebooks
  4. “The Definitive Guide to Conda Environments”
  5. okeeffed/regression-with-scikit-learn-part-two
  6. Ordinary Least Squares (OLS)

Senior Engineer @ UsabilityHub. Formerly Culture Amp.

Dennis O'Keeffe
