Machine Learning Categories

Richard Farnworth
Jan 24, 2019


Machine learning is a very broad term, covering a wide range of techniques that differ vastly in complexity and utility. The sheer number of different technologies, each requiring considerable study to fully understand, can intimidate newcomers to the field and make it difficult to know where to start.

To make things easier, we can divide the field into three broad categories:

  • Supervised learning — find out how your data affects a value of interest
  • Unsupervised learning — find interesting patterns or features in your data
  • Reinforcement learning — choose actions and evaluate the consequences to learn how to best achieve a stated goal

Supervised learning

Supervised learning is a class of techniques that attempt to find a relationship between a set of independent variables and a particular target variable in a dataset.

This could be used to help understand what levers you can pull to influence a particular KPI, or to predict the value of a variable under hypothetical conditions.

Let’s imagine you’re a slick city stockbroker looking to make your next big trade. Perhaps data could help you decide which of today’s stocks are worth investing in. Historical stock prices from many of the world’s major exchanges are often freely available going back a few years, along with supporting information such as the volume of shares sold and the daily range. Using supervised learning techniques, you can look for relationships between independent variables that you already know (e.g. opening price, yesterday’s percent change, yesterday’s volume) and the dependent variable you’re interested in (e.g. closing price). The resulting model can then be used to predict the closing price from the information you have at your desk on a Monday morning, which may improve your chances of making a profit by the end of the day.

It’s named supervised learning because the dataset can be thought of as guiding the learning process. As the dataset contains the target variable, a supervised learning algorithm can check its predictions against reality and adjust its parameters to achieve better results.

The target variable can be either numeric (regression) or categorical (classification). Predicting a closing price is a regression problem, while predicting whether a stock will rise or fall is a classification problem.

Popular supervised learning techniques include:
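To make the "check against reality and adjust" loop concrete, here is a minimal sketch in plain Python. It fits a straight line to a handful of made-up data points by gradient descent on the mean squared error. The data, the learning rate and the function name `fit_line` are illustrative assumptions, not taken from any real exchange or library.

```python
def fit_line(xs, ys, learning_rate=0.01, epochs=5000):
    """Fit y = slope * x + intercept by gradient descent on mean squared error."""
    slope, intercept = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Check current predictions against the known target values.
        errors = [(slope * x + intercept) - y for x, y in zip(xs, ys)]
        # Gradients of the mean squared error with respect to each parameter.
        grad_slope = 2 * sum(e * x for e, x in zip(errors, xs)) / n
        grad_intercept = 2 * sum(errors) / n
        # Adjust the parameters in the direction that reduces the error.
        slope -= learning_rate * grad_slope
        intercept -= learning_rate * grad_intercept
    return slope, intercept


# Made-up data generated from y = 2x + 1 (arbitrary units, not real prices).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
slope, intercept = fit_line(xs, ys)
```

Because the targets are known, every pass through the loop can measure how wrong the current line is and nudge the two parameters accordingly. On this toy data the fitted values should end up very close to a slope of 2 and an intercept of 1.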

  • Linear regression — find linear (straight line) relationships between the independent and dependent variables
  • Random forest for classification, an ensemble of decision trees that vote on the answer (it can also be applied to regression)
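As a rough illustration of the ensemble idea behind random forests, the sketch below trains many one-split "decision stumps", each on a bootstrap sample and a randomly chosen feature, and lets them vote. This is a deliberately simplified relative of a random forest, not the full algorithm, and the data, labels and function names are all made up for the example.

```python
import random
from collections import Counter


def best_stump(rows, labels, feature):
    """Find the threshold on one feature that misclassifies the fewest rows."""
    best = None
    for threshold in sorted({row[feature] for row in rows}):
        left = [lab for row, lab in zip(rows, labels) if row[feature] <= threshold]
        right = [lab for row, lab in zip(rows, labels) if row[feature] > threshold]
        if not left or not right:
            continue  # not a real split
        left_label = Counter(left).most_common(1)[0][0]
        right_label = Counter(right).most_common(1)[0][0]
        errors = (sum(lab != left_label for lab in left)
                  + sum(lab != right_label for lab in right))
        if best is None or errors < best[0]:
            best = (errors, feature, threshold, left_label, right_label)
    return best


def train_forest(rows, labels, n_trees=25, seed=0):
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        # Bootstrap sample: draw rows with replacement.
        idx = [rng.randrange(len(rows)) for _ in rows]
        sample_rows = [rows[i] for i in idx]
        sample_labels = [labels[i] for i in idx]
        # Each stump only gets to look at one randomly chosen feature.
        feature = rng.randrange(len(rows[0]))
        stump = best_stump(sample_rows, sample_labels, feature)
        if stump is None:
            # No usable split in this sample: always predict the majority label.
            majority = Counter(sample_labels).most_common(1)[0][0]
            stump = (0, feature, float("inf"), majority, majority)
        forest.append(stump[1:])
    return forest


def predict(forest, row):
    """Majority vote over all stumps."""
    votes = Counter(left if row[f] <= threshold else right
                    for f, threshold, left, right in forest)
    return votes.most_common(1)[0][0]


# Made-up training data: two numeric features and a categorical target.
rows = [[1, 1], [2, 1], [1, 2], [2, 2], [8, 9], [9, 8], [8, 8], [9, 9]]
labels = ["down"] * 4 + ["up"] * 4
forest = train_forest(rows, labels)
low_prediction = predict(forest, [0.5, 0.5])
high_prediction = predict(forest, [10, 10])
```

In practice you would reach for a library implementation with full decision trees. The point here is only that many weak, slightly different models can be combined by voting.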

Unsupervised learning

In unsupervised learning, you don’t have a target variable, but are instead interested in finding out the underlying structure of a dataset.

For example, you might want to find groups of customers with similar attributes for a targeted marketing strategy or find associations between different products (like Amazon’s “People who bought X, also bought Y”).

Unlike supervised learning, there are no correct answers guiding the learning. Instead, the algorithm has a built-in objective or heuristic (e.g. minimising the “Within Cluster Sum of Squares” in K-means clustering) which guides the learning process.
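The within-cluster sum of squares is simple to compute by hand. The sketch below, using made-up points and centroids and a hypothetical helper called `wcss`, scores two different ways of assigning the same points to clusters. The lower score is the tighter grouping.

```python
def wcss(points, assignments, centroids):
    """Sum of squared Euclidean distances from each point to its assigned centroid."""
    total = 0.0
    for point, cluster in zip(points, assignments):
        centroid = centroids[cluster]
        total += sum((p - c) ** 2 for p, c in zip(point, centroid))
    return total


# Made-up 2-D points forming two obvious groups.
points = [(1.0, 1.0), (1.0, 3.0), (9.0, 9.0), (9.0, 11.0)]
centroids = [(1.0, 2.0), (9.0, 10.0)]

sensible = wcss(points, [0, 0, 1, 1], centroids)  # each point sits 1 unit from its centroid
scrambled = wcss(points, [0, 1, 0, 1], centroids)  # two points assigned to the far centroid
```

K-means has no labels to check against, so a score like this is what tells it that the first assignment is better than the second.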

Popular unsupervised learning techniques include:

  • K-means clustering for finding clusters of similar data points
  • Apriori algorithm for extracting association rules, for example between products
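To give a flavour of association rules, here is a small sketch of the first two passes of an Apriori-style search in plain Python. It keeps only items that appear often enough, then counts pairs built from those items, which is the pruning idea Apriori relies on. The shopping baskets, the support threshold and the function names are invented for illustration.

```python
from itertools import combinations


def frequent_pairs(baskets, min_support):
    """Return {pair: support} for item pairs that meet min_support."""
    n = len(baskets)
    # Pass 1: count single items and keep the frequent ones.
    item_counts = {}
    for basket in baskets:
        for item in basket:
            item_counts[item] = item_counts.get(item, 0) + 1
    frequent_items = {item for item, count in item_counts.items()
                      if count / n >= min_support}
    # Pass 2: only count pairs made entirely of frequent items.
    pair_counts = {}
    for basket in baskets:
        for pair in combinations(sorted(basket & frequent_items), 2):
            pair_counts[pair] = pair_counts.get(pair, 0) + 1
    return {pair: count / n for pair, count in pair_counts.items()
            if count / n >= min_support}


def confidence(baskets, antecedent, consequent):
    """Of the baskets containing `antecedent`, the share that also contain `consequent`."""
    with_antecedent = [b for b in baskets if antecedent in b]
    return sum(consequent in b for b in with_antecedent) / len(with_antecedent)


# Made-up shopping baskets.
baskets = [
    {"bread", "butter"},
    {"bread", "butter", "jam"},
    {"bread", "milk"},
    {"butter", "jam"},
    {"bread", "butter", "milk"},
]
pairs = frequent_pairs(baskets, min_support=0.5)
bread_to_butter = confidence(baskets, "bread", "butter")
```

Here bread and butter appear together in three of the five baskets, and three of the four baskets containing bread also contain butter, which is the kind of evidence behind a "people who bought X also bought Y" rule.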

Reinforcement learning

The third class of machine learning algorithm involves controlling how a software agent performs actions to work towards some kind of goal. The algorithm chooses actions for the agent, and uses the feedback from those actions to inform decisions in the future.

A business use case might be to optimise the effectiveness of your email communications. The reinforcement learner could vary the timing, title text, content and media of individual emails, gradually learning what works best with regard to some goal (e.g. email open rate, revenue).

The algorithm balances a trade-off: exploiting what already works to get short-term results, versus experimenting, which may cost something now but teaches it about the environment and helps it formulate better strategies for the future.

Reinforcement learning algorithms are prominent within the area of Artificial Intelligence and have been used to solve problems such as learning to play board games and video games.
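One very simple way to strike this balance is an "epsilon-greedy" rule: most of the time send the variant that looks best so far, but occasionally try a random one. The sketch below applies it to the email example with simulated recipients. The open rates, the parameter values and the function name `run_email_bandit` are assumptions made up for the illustration, and this is a much simpler setting than full reinforcement learning because there is no changing state.

```python
import random


def run_email_bandit(true_open_rates, n_emails=5000, epsilon=0.1, seed=42):
    """Epsilon-greedy choice between email variants, with simulated opens."""
    rng = random.Random(seed)
    n_variants = len(true_open_rates)
    sends = [0] * n_variants
    opens = [0] * n_variants
    for _ in range(n_emails):
        if 0 in sends or rng.random() < epsilon:
            # Explore: pick a variant at random to learn more about it.
            variant = rng.randrange(n_variants)
        else:
            # Exploit: pick the variant with the best observed open rate.
            variant = max(range(n_variants), key=lambda v: opens[v] / sends[v])
        sends[variant] += 1
        # Simulated feedback: the recipient opens with the hidden true rate.
        if rng.random() < true_open_rates[variant]:
            opens[variant] += 1
    return sends, opens


# Hypothetical true open rates for three subject lines (unknown to the learner).
sends, opens = run_email_bandit([0.10, 0.15, 0.35])
most_sent = sends.index(max(sends))
```

Raising `epsilon` makes the learner experiment more, at the cost of sending more emails that it already suspects are worse.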

Some popular reinforcement learning algorithms include:

  • Q-learning
  • SARSA, or State-Action-Reward-State-Action
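The two algorithms differ only in how they estimate the value of what comes next. A minimal sketch of a single update step for each is shown below, using a plain dictionary as the table of action values. The state and action names, the step size `alpha` and the discount `gamma` are illustrative choices.

```python
def q_learning_update(q, state, action, reward, next_state, actions,
                      alpha=0.5, gamma=0.9):
    """Off-policy: bootstrap from the best action available in next_state."""
    best_next = max(q.get((next_state, a), 0.0) for a in actions)
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (reward + gamma * best_next - old)


def sarsa_update(q, state, action, reward, next_state, next_action,
                 alpha=0.5, gamma=0.9):
    """On-policy: bootstrap from the action the agent actually takes next."""
    next_value = q.get((next_state, next_action), 0.0)
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (reward + gamma * next_value - old)


# Made-up value table: in state "s1", going right currently looks better than left.
start = {("s1", "left"): 1.0, ("s1", "right"): 3.0}

q_table = dict(start)
q_learning_update(q_table, "s0", "right", reward=1.0, next_state="s1",
                  actions=["left", "right"])

sarsa_table = dict(start)
sarsa_update(sarsa_table, "s0", "right", reward=1.0, next_state="s1",
             next_action="left")
```

Given the same experience, Q-learning assumes the best follow-up action will be taken, while SARSA uses the follow-up action that was actually chosen (here the weaker "left"), so it ends up with a lower estimate.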


Written by Richard Farnworth

Data scientist, computer programmer and all-round geek with 10 years of using data in finance, retail and legal industries. Based in Adelaide, Australia.
