Ben Meijering - Hello, Machine Learning!

published May 13, 2016, last modified May 17, 2016

Ben Meijering says hello to Machine Learning, at PyGrunn.

See the PyGrunn website for more info about this one-day Python conference in Groningen, The Netherlands.

I am using a computer vision task as an example because it looks nice, but this works just as well for other data.

Machine learning is: teaching computers to learn how to perform tasks.

Task: determine the digit that is in a picture.

Think of tasks in terms of probability. What is the chance that a picture as a whole shows the digit zero? A regression model turns the data into a value between zero and one, so you can read it as a probability. We take a weighted sum of all pixel values and squash the result into that range, for example with a sigmoid.
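A minimal sketch of that idea in plain Python, not the code from the talk: the four-pixel "image", the weights, and the function names are made up for illustration, and the sigmoid is assumed as the squashing function.

```python
import math

def sigmoid(z):
    """Squash any real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def probability_of_zero(pixels, weights, bias=0.0):
    """Weighted sum of all pixel values, mapped to a probability."""
    z = sum(p * w for p, w in zip(pixels, weights)) + bias
    return sigmoid(z)

# Hypothetical 4-pixel "image" and hand-picked weights, for illustration only.
pixels = [0.0, 1.0, 1.0, 0.0]
weights = [-0.5, 0.8, 0.8, -0.5]

p_zero = probability_of_zero(pixels, weights)
print(p_zero)  # a value strictly between 0 and 1
```

In a real model the weights are not picked by hand but learned during training.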

Linear regression model. The inputs are all the pixels. The outputs are the probabilities that the digit is zero, or one, or two, and so on. We apply softmax (instead of sigmoid) to the output nodes, because the classifications are mutually exclusive: a picture shows exactly one digit. This is a constraint we choose for this example task.
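To see why softmax fits mutually exclusive classes, here is a small sketch in plain Python. The three scores are invented example values. Softmax turns one score per class into probabilities that together add up to one, so a higher chance for one digit automatically means a lower chance for the others.

```python
import math

def softmax(scores):
    """Turn one score per class into probabilities that sum to 1."""
    highest = max(scores)  # subtracting the max avoids overflow in exp()
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical output scores for the digits 0, 1 and 2.
scores = [2.0, 1.0, 0.1]
probs = softmax(scores)

print(probs)       # one probability per digit
print(sum(probs))  # adds up to 1, up to floating point rounding
```

With a sigmoid on each output node, every probability would be computed independently and the outputs would not have to add up to one.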

Machine learning libraries:

  • Theano, nice for getting started in machine learning
  • TensorFlow, good for rapid prototyping, used a lot by Google
  • Keras, a unified API to both Theano and TensorFlow. Good, because both libraries have their own strengths and Keras makes it easy to switch.

The code is compiled to GPU instructions, which can be 30 times faster than running on the CPU.

Goal of training: learn the best possible weights. Inputs plus weights plus model (transform) give the outputs. During training we know what the correct answer is, so we can measure how wrong the outputs are. These errors are propagated back through the model, through the network of nodes, so the model can update its weights.
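A rough sketch of such a training loop, in plain Python rather than Theano or Keras. Everything here is an assumption for illustration: the two-pixel toy data, the learning rate, and the function names. The model has a single layer, so "propagating back" is just one step: the error on each output node is the predicted probability minus the correct answer, and each weight moves against that error.

```python
import math

def softmax(scores):
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def predict(pixels, weights):
    """One weighted sum per class, then softmax. weights[k] belongs to class k."""
    scores = [sum(w * p for w, p in zip(row, pixels)) for row in weights]
    return softmax(scores)

def train_step(pixels, label, weights, learning_rate=0.5):
    """Update the weights in place for one example; return the loss before the update."""
    probs = predict(pixels, weights)
    for k, row in enumerate(weights):
        # Error on output node k: predicted probability minus correct answer.
        error = probs[k] - (1.0 if k == label else 0.0)
        for i, p in enumerate(pixels):
            row[i] -= learning_rate * error * p
    return -math.log(probs[label])  # cross-entropy loss

# Toy training data (made up): two 2-pixel "images" with their correct class.
data = [([1.0, 0.0], 0), ([0.0, 1.0], 1)]
weights = [[0.0, 0.0], [0.0, 0.0]]

losses = []
for epoch in range(20):
    losses.append(sum(train_step(x, y, weights) for x, y in data))

print(losses[0], losses[-1])  # the loss goes down as the weights improve
```

Libraries like Theano and TensorFlow work out these weight updates for you, also for models with many layers.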

I do the plotting with the pandas library: show how well the model is performing after each iteration over the training data. Our first model starts out bad, and improves only a little bit. A second model, with an extra layer of nodes, performs much better at first and still improves afterwards.

So: keep adding layers for more accuracy? No, because you run out of memory. Instead we can use a convolutional network, which is inspired by the visual cortex. In this model we use fewer connections: each node in the second layer is connected to only three nodes of the previous layer, instead of to all of them. And we use the same three weights everywhere.
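A one-dimensional sketch of that idea in plain Python. The input values and the three kernel weights are invented for illustration. Each output node looks at only three neighbouring inputs, and all output nodes share the same three weights.

```python
def conv1d(inputs, kernel):
    """Slide the kernel over the inputs.

    Each output node sees only len(kernel) neighbouring inputs,
    and every output node reuses the same kernel weights.
    """
    width = len(kernel)
    return [
        sum(inputs[start + j] * kernel[j] for j in range(width))
        for start in range(len(inputs) - width + 1)
    ]

# Hypothetical row of 6 pixel values and a 3-weight kernel.
inputs = [0.0, 0.0, 1.0, 1.0, 0.0, 0.0]
kernel = [-1.0, 0.0, 1.0]

outputs = conv1d(inputs, kernel)
print(outputs)  # [1.0, 1.0, -1.0, -1.0]

# A fully connected layer from 6 inputs to 4 outputs needs 6 * 4 = 24 weights.
# This layer needs only the 3 shared weights.
print(len(inputs) * len(outputs), len(kernel))  # 24 3
```

The saving in weights is what keeps the memory use down. For pictures the same trick is applied in two dimensions.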

Machine learning is a very powerful tool for solving all sorts of tasks. And it is a creative endeavour: how do you compose your models?

Overfitting: the model knows the training data too well, performing well on this data and poorly on other data. How to deal with this? Split the data into data for training and data for testing, so you can see how the model does on data it has not seen before. Or use dropout: obscure part of the image when training.
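Both ideas can be sketched in a few lines of plain Python. This is an illustration under assumptions, not the code from the talk: the function names, the 20% test fraction, and the dropout rate are made up. Scaling the surviving values by 1 / (1 - rate) is common practice ("inverted dropout"), not something mentioned in the talk.

```python
import random

def train_test_split(samples, test_fraction=0.2, seed=0):
    """Shuffle a copy of the samples and hold back a fraction for testing."""
    rng = random.Random(seed)
    shuffled = samples[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

def dropout(values, rate=0.5, rng=random):
    """Set each value to zero with probability `rate`.

    Surviving values are scaled up so the expected sum stays the same
    (assumption: inverted dropout).
    """
    keep = 1.0 - rate
    return [v / keep if rng.random() < keep else 0.0 for v in values]

# Ten stand-in samples; in practice these would be (picture, digit) pairs.
samples = list(range(10))
train, test = train_test_split(samples)
print(len(train), len(test))  # 8 2

# Obscure roughly half of an 8-pixel "image" during training.
dropped = dropout([1.0] * 8, rate=0.5, rng=random.Random(1))
print(dropped)  # each value is either 0.0 or 2.0
```

The test data is never used for updating the weights. Dropout is only applied during training, not when the model is used for predictions.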

Slides are at

My website: