Weblog
That one word
Klokgelui column for the Overschiese Krant.
Lately I have been walking around with a melody in my head: 'Nothing is stronger than that one word.' Perhaps you are humming it too. Which word was it again? Nothing is stronger than that one word: calendar! WhatsApp! Facebook! Success!
Isn't it true? You are working, or talking with a friend or with your child, and then a message comes in and you are distracted. Literally led astray: you let yourself be led by your phone. Your phone or your calendar is no longer a handy helper, but your boss.
Life is a never-ending game. In football you win and lose matches, and at the end of the season Feyenoord is the champion. In life you have successful periods, and at other times things go against you or it is even a great drama, and at the end you die. I believe there is life after death, but that is not my point now. At the end of your life you are not honoured on the Coolsingel. So you do not have to be the first, the best, the fastest, the most beautiful, the smartest. You do not have to have sent the most messages to be crowned the winner in life. There is no pink jersey for whoever responds fastest to messages, or for whoever has the most followers on Facebook or Twitter.
If you are afraid of missing something, you miss the most important thing. Then you follow whatever grabs your attention and miss that which calmly and modestly waits until you have time.
So seek out the silence. Not the silence before the Twitter storm, but silence for the sake of silence. If silence feels dull, like boredom, you need to practise more and build it up slowly.
When Jesus walked the earth, he took time after the busyness of the day to talk with his father. He did not let himself be swayed by the political or social trends of the day. He was strong because he built on the foundation of his relationship with God the father and valued what the maker of life valued.
Yes, nothing is stronger than that one Word: Jesus.
Keynote Google - Machine Learning APIs for Python Developers
Keynote talk from Google about Machine Learning APIs for Python Developers, at PyGrunn.
See the PyGrunn website for more info about this one-day Python conference in Groningen, The Netherlands.
Lee Boonstra and Dmitriy Novakovskiy give this talk; they work for Google Cloud, one of the gold sponsors of PyGrunn.
Python at Google
Google loves Python. :) It is widely used internally and externally. We sponsor conferences. We have open source libraries, like the Google Data Python Client Library and libraries for YouTube, App Engine, etcetera. We use Python for build systems, report generation, log analysis, etc.
How can you use Google Cloud Platform for your app or website? You can deploy at scale. You can embed intelligence powered by machine learning; we provide multiple pre-trained models. You can use serverless data processing and analytics.
Machine learning
Let me explain it in a simple way. You want to teach something to a kid: what is a car, what is a bike? You point at a car or bike and explain what it is called. With machines we feed in lots of data and they start to see patterns.
- Artificial intelligence: process of building smarter computers
- Machine learning: process of making a computer learn
Machine learning is much easier.
Our CEO: "We no longer build mobile-first applications, but AI-first."
We have a lot of data, better models, and more computing power. That is why machine learning is happening a lot now.
Google created the open source TensorFlow Python framework for machine learning. And we have hardware to match. We have ready-to-use models for vision, speech, jobs, translation, natural language, and video intelligence.
- Vision API: object recognition (landmarks), sentiment on faces, extract text, detect inappropriate content. Disney game: search with your phone for a chair, and we show a dragon on the chair. Point your camera at a house, and you see a price tag.
- Speech API: speech recognition (write out the transcript, support for 80 languages).
- Natural language API: really understand the text that is written, recognise nouns and verbs and sentiment.
- Translation API: realtime subtitles, automatic language translation using more context than earlier versions.
- Beta video intelligence: label detection, enable video search (in which frame did the dog first appear).
Demo
Go to the Google Cloud console and create a free account to play with. You need to enable the APIs that you want to use. Install the command line tools if you want to run it on your local machine. And pip install google-cloud.
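Calling one of these APIs from Python then looks roughly like this (my own sketch, not from the talk; it assumes the google-cloud-vision client, configured credentials, and a local chair.jpg):

from google.cloud import vision

# Assumes GOOGLE_APPLICATION_CREDENTIALS points at a service account key.
client = vision.ImageAnnotatorClient()
with open('chair.jpg', 'rb') as image_file:
    image = vision.Image(content=image_file.read())

# Ask the Vision API which labels (objects) it recognises in the image.
response = client.label_detection(image=image)
for label in response.label_annotations:
    print(label.description, label.score)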
We use machine learning for example in Gmail to suggest a possible reply to an email you receive.
Walkthrough of machine learning and TensorFlow
Google Cloud Dataflow. Dataflow is a unified programming model for batch or stream data processing. MapReduce-like operations. Parallel workloads. It is open sourced as Apache Beam, and you can run it on Google Cloud Platform.
You put files in Cloud Storage. Process this in batches, with Python and Dataflow. This uses pre-trained machine learning models. Then store results in BigQuery, and visualize the insights in Data Studio.
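As a rough idea of what such a batch job looks like in Python, here is a sketch based on the classic Apache Beam word-count pattern (the bucket paths are made up):

import apache_beam as beam

with beam.Pipeline() as pipeline:
    (pipeline
     | 'Read' >> beam.io.ReadFromText('gs://my-bucket/input/*.txt')
     | 'Split' >> beam.FlatMap(lambda line: line.split())
     | 'PairWithOne' >> beam.Map(lambda word: (word, 1))
     | 'Count' >> beam.CombinePerKey(sum)
     | 'Format' >> beam.Map(lambda pair: '%s: %d' % pair)
     | 'Write' >> beam.io.WriteToText('gs://my-bucket/output/counts'))

The same pipeline can run locally or on Google Cloud Dataflow, depending on the runner you configure.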
Reinout van Rees - Querying Django models: fabulous & fast filtering
Reinout van Rees talks about querying Django models: fabulous & fast filtering, at PyGrunn.
See the PyGrunn website for more info about this one-day Python conference in Groningen, The Netherlands.
Goal: show what is possible. Everything is in the Django documentation. Just remember a few things you see here.
Example case: time registration system. Everyone seems to do this. A Person belongs to a Group. A Booking belongs to a Project and a Person.
The Django ORM gives you a mapping between the database and Python objects.
standard:
Person.objects.all()
basic filtering:
Person.objects.filter(group=1)
specific name:
Person.objects.filter(group__name='Systemen')
case insensitive searching for part of a name:
Person.objects.filter(group__name__icontains='onderhoud')
name starting with:
Person.objects.filter(name__startswith='Reinout')
without group:
Person.objects.filter(group__isnull=True)
Filtering strategy:
- sometimes .exclude() is easier
- you can stack: .filter().filter().filter()
- query sets are lazy: the query is only really executed at the moment you need the results (see the small example after this list)
- just assign the query to a variable, to make complicated queries more understandable
- start with the model you want
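A small illustration of stacking and laziness (the exclude filter is just a made-up example):

persons = Person.objects.filter(group__name='Systemen')
persons = persons.exclude(name__startswith='Reinout')
# Nothing has hit the database yet; this line finally runs the query.
print(persons.count())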
Speed:
select_related: does a big join in SQL so you get one set of results
prefetch_related: does one query for one table, and then one query to get all related items
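For example (the field names project and bookings are assumptions, based on the example models above):

# One SQL query with a join: every Booking comes with its Project already loaded.
Booking.objects.select_related('project')

# Two queries: one for the persons, one for all their bookings,
# stitched together in Python by Django.
Person.objects.prefetch_related('bookings')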
if you need only one or two fields, Django does not need to instantiate a model, but can give you a plain dictionary or list instead:
Person.objects.filter(group__name='Systemen').values('name', 'group__name')
Person.objects.filter(group__name='Systemen').values_list('name', 'group__name')
Person.objects.filter(group__name='Systemen').values_list('group__name', flat=True)
Annotation and aggregation:
- annotate: sum, count, avg
- aggregation
- groupby via values (bit of a weird syntax)
Aggregation gives totals:
from django.db.models import Sum

Booking.objects.filter(
    booked_by__group__name='Systemen'
).aggregate(Sum('hours'))
Annotation adds extra info to each result row:
Booking.objects.filter(
    booked_by__group__name='Systemen'
).annotate(Sum('bookings__hours'))[10].bookings__hours__sum
Group bookings by year, give sums:
Booking.objects.filter(
    booked_on__description__icontains='Zwanger'
).values(
    'booked_by__name', 'year_week__year'
).annotate(Sum('hours'))
Practice this with your own code and data! You'll get the hang of it, get to know your data, and it is fun.
If you need to do special queries, you can create a sub query yourself:
from django.db.models import Q

query = Q(group__name='Systemen')
Person.objects.filter(query)
That way you can write filters that are not available in default Django.
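Q objects can also be combined, for instance with | for OR and ~ for NOT (a small illustration, not from the talk):

from django.db.models import Q

query = Q(group__name='Systemen') | ~Q(name__icontains='test')
Person.objects.filter(query)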
Twitter: @reinoutvanrees
Òscar Vilaplana - Let's make a GraphQL API in Python
Òscar Vilaplana talks about making a GraphQL API in Python, at PyGrunn.
See the PyGrunn website for more info about this one-day Python conference in Groningen, The Netherlands.
This talk is about GraphQL, Graphene, and Python.
I care a lot about value. If something has no value, why are you doing it?
"Our frontend engineers want us to use GraphQL."
Sample case: there are friends, they have plans, the plans are at a location. So various relations between friends, plans and locations.
With REST you usually fetch too much or too little; you need many calls; there is some documentation but no real standard; it is hard to discover; there is no standard client; and a lot of postprocessing and decisions are needed.
So you can try to fix some stuff, giving options to include more data, or not include that many fields. I don't really like it. What can we do?
If you go back to the data, you can see a graph: data that is linked to each other.
"Our frontend engineers want us to use GraphQL. They can just ask for what they want."
In the backend you are trying to decide or guess what the client wants. The client wants a nice-looking website. What we have is a bunch of data in too many boring tables.
GraphQL is a query language for graphs. You can ask stuff like this and get data in this format back:
{
  plans {
    name
    description
    creator {
      name
    }
  }
}
You define possible queries:
type Query {
  plans(limit: Int): [Plan]
}

type Plan {
  ...
}
With Graphene you can do this in Python. And there is Django support in graphene_django, to easily wrap this around some models. It is smart about combining queries.
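A minimal Graphene sketch (the Plan type and the hard-coded data here are mine, not the speaker's actual code):

import graphene

class Plan(graphene.ObjectType):
    name = graphene.String()
    description = graphene.String()

class Query(graphene.ObjectType):
    plans = graphene.List(Plan, limit=graphene.Int())

    def resolve_plans(self, info, limit=None):
        # Normally this would come from a database (graphene_django wraps querysets).
        data = [Plan(name='Dinner', description='Pizza with friends')]
        return data[:limit] if limit else data

schema = graphene.Schema(query=Query)
result = schema.execute('{ plans { name description } }')
print(result.data)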
GraphQL makes it easier to expose data. It is closer to the data, so less waste. Easy to get started.
You can play with the GitHub GraphQL API.
Twitter: @grimborg
Jonathan Barnoud - Looking at molecules using Python
Jonathan Barnoud talks about looking at molecules using Python, at PyGrunn.
See the PyGrunn website for more info about this one-day Python conference in Groningen, The Netherlands.
In this presentation, I will demonstrate how Python can be used throughout the workflow of molecular dynamics simulations. We will see how Python can be used to set up simulations, and how we can visualize simulations in a Jupyter Notebook with NGLView. We will also see the MDAnalysis library to write analysis tools, and datreant to organize the data.
I work at the University of Groningen. I look at fats and proteins, at the level of molecules and atoms. We can simulate them using molecular dynamics. Force is equal to the mass times the acceleration (F = m*a). We need initial positions and initial velocities.
My workflow: prepare system, run a simulation, visualise and analyse in Jupyter notebook, which may need several loops through this system, and then I can write a report.
Preparing a simulation: the topology, the initial coordinates, and the simulation parameters. I use some bash and Python scripts to prepare those text files. These go into the simulation engine, which gives as output a trajectory: how all those molecules move.
There are lots of simulation engines, which need different file formats as input, and give different output formats. So I use Python to create a library that abstracts these differences away.
One such library is MDAnalysis. The main object is a universe, with a topology and trajectory. The universe is full of atoms. Each atom has attributes attached to it, like name, position, mass. Everything is in arrays. You can select atoms: universe.select_atoms('not resname SOL'). Sample code:
for time_step in universe.trajectory[:10]:
    print(universe.atoms[0].position)
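Creating such a universe from the simulation output might look like this (a sketch with made-up file names):

import MDAnalysis as mda

# Topology and trajectory files produced by the simulation engine.
universe = mda.Universe('topology.gro', 'trajectory.xtc')

# Everything except the solvent.
not_solvent = universe.select_atoms('not resname SOL')
print(len(not_solvent), not_solvent.positions.shape)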
NGLView can visualise a universe from MDAnalysis (or output from other engines) inside the notebook, by using a JavaScript library.
Now you may end up with lots of simulation data in lots of directories and files. Your filesystem is now a mess! So we use datreant. (A treant was a talking tree in Dungeons and Dragons.) This helps you discover where the outcome of which simulation is, and to access the data from it.
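A rough sketch of the idea (the directory names and tags are made up, and the exact datreant API may differ from this):

import datreant.core as dtr

# Turn a simulation directory into a Treant and tag it.
sim = dtr.Treant('runs/lipid_bilayer_310K')
sim.tags.add('lipids', 'membrane')

# Later: rediscover every simulation under the project directory.
for treant in dtr.discover('runs'):
    print(treant.abspath, treant.tags)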
To conclude:
- Python is awesome.
- Jupyter is awesome too. [See also the talk about a billion stars earlier today.]
- The Python science stack is awesome as well.
- Each field develops awesome tools based on the above.