Python Users Netherlands meeting 22 October 2014
Summary of the meeting of the Dutch Python Users group on 22 October 2014.
Folkje welcomed us at Byte, a web hosting company. We develop tools that help developers. We focus on quality. Cluster and Magento hosting. Currently working on Hypernode.
A talk about threads, locks, processes and events.
CPU manufacturers have gone to multi-core. So a CPU one year from now may not actually be much faster than now. So: you may need to be smarter when programming. Is your software designed to do many things at the same time? Not one monkey doing many things, but many monkeys each doing their own thing.
You can do multi processing with the multiprocessing module.
Threads: one thread is still doing one thing at a time. That is the threading module.
Green threads, gevent, greenlet library. Basically: threads in user space. They still use one actual thread, it just looks like there are more.
The asyncore library: a socket wrapper built around an asynchronous event loop.
Python 3 has asyncio (sometimes known as tulip).
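To give an idea of what the event loop approach looks like, here is a minimal sketch with asyncio. It assumes a recent Python 3 with the async/await syntax and asyncio.run, which are newer than this talk. The fetch function and its delays are made up for illustration; asyncio.sleep stands in for a slow network call.

```python
import asyncio


async def fetch(name, delay):
    # asyncio.sleep stands in for slow I/O; while waiting, the event loop
    # is free to run other coroutines on the same thread.
    await asyncio.sleep(delay)
    return name


async def main():
    # Both coroutines wait concurrently; gather returns the results
    # in the order the coroutines were passed in.
    return await asyncio.gather(fetch("a", 0.02), fetch("b", 0.01))


results = asyncio.run(main())
print(results)
```

Everything runs in one thread here: the two waits overlap because each await hands control back to the event loop.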
There are difficulties with multithreading. You do not know when the scheduler of the OS will handle which thread. So you have to use locks when you want to do something threadsafe.
- threading.Lock is the default lock. Other threads that want the same lock will be waiting.
- threading.RLock is a re-entrant lock: one thread can acquire the same lock more than once, as long as it releases it as many times as it acquires it.
- threading.Semaphore is a lock with a counter: say it allows four holders at once. When one of them releases it, a waiting thread can acquire the free slot.
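The three lock types above can be sketched roughly like this. It is only an illustration: the counter, the function names and the numbers of threads are my own choices, not from the talk.

```python
import threading
import time

counter = 0
counter_lock = threading.Lock()


def increment(times):
    global counter
    for _ in range(times):
        with counter_lock:  # other threads wait here until the lock is free
            counter += 1


# RLock: the thread that holds it may acquire it again.
rlock = threading.RLock()


def inner():
    with rlock:  # second acquire by the same thread; a plain Lock would deadlock
        return "done"


def outer():
    with rlock:
        return inner()


# Semaphore with a counter of 4: at most four threads in the guarded section.
slots = threading.Semaphore(4)
state_lock = threading.Lock()
active = 0
peak = 0


def limited():
    global active, peak
    with slots:
        with state_lock:
            active += 1
            peak = max(peak, active)
        time.sleep(0.01)  # pretend to use the limited resource
        with state_lock:
            active -= 1


def run_all(target, count, *args):
    # Start `count` threads running `target` and wait for all of them.
    threads = [threading.Thread(target=target, args=args) for _ in range(count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


run_all(increment, 4, 10000)
run_all(limited, 8)
print(counter, outer(), peak)
```

With the lock in place the counter always ends at 40000, and the semaphore keeps the peak number of simultaneous holders at four or fewer, even with eight threads competing.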
In Python you can use queues to share state between threads: see the Queue module (named queue in Python 3). You can block on a Queue until something arrives, instead of checking it every second.
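A small sketch of waiting on a queue, using the Python 3 module name. The None sentinel to stop the consumer is a common convention that I am assuming here, it was not part of the talk.

```python
import queue
import threading

jobs = queue.Queue()
results = []


def consumer():
    while True:
        item = jobs.get()  # blocks until an item arrives, no polling needed
        if item is None:   # sentinel: the producer is done
            break
        results.append(item * 2)


worker = threading.Thread(target=consumer)
worker.start()

for n in range(5):
    jobs.put(n)
jobs.put(None)  # tell the consumer to stop

worker.join()
print(results)
```

The queue does its own locking, so the producer and the consumer need no explicit lock.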
But the GIL (Global Interpreter Lock) holds us back: only one thread can execute Python bytecode at a time. Adding threads can even make things slower, because each thread gets a few interpreter ticks and then Python switches to the next thread. The overhead of many threads bites you in the ass.
You can use multiprocessing and spread the work over multiple CPU cores. Each process has its own GIL. The processes still need to talk to each other, maybe via a database. That is still overhead, but may be faster in your use case.
Consider threads or async (event loop) when using lots of I/O.
Consider multiprocessing when using lots of CPU.
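For the CPU-heavy case, a minimal sketch with multiprocessing.Pool could look like this. The prime counting function and the limits are made up, they just give the cores something to chew on.

```python
from multiprocessing import Pool


def count_primes(limit):
    """CPU-bound work: count the primes below limit by trial division."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


if __name__ == "__main__":
    limits = [20000, 30000, 40000, 50000]
    # Each worker is a separate process with its own interpreter and GIL,
    # so the four calculations can really run at the same time.
    with Pool(processes=4) as pool:
        print(pool.map(count_primes, limits))
```

The arguments and results are pickled and sent between processes, which is the communication overhead mentioned above. The `if __name__ == "__main__"` guard is needed on platforms where worker processes re-import the main module.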
This is a conceptual talk, with discussion. I am interested in what you think about this.
REST: stateless interface, decoupled client and server, strict API. For example Flask, a 'batteries excluded' framework, with Flask-RESTful, an extension to facilitate REST API development. Class-based views that only handle standard HTTP methods like GET and PUT. Flask-RESTful parses the request and decides if it is valid. If not, it may send a Bad Request message back to the user.
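To illustrate the idea of a class-based resource that only answers standard HTTP methods and rejects invalid requests, here is a sketch using only the standard library. Note that this is not the Flask-RESTful API: TodoResource, dispatch and the 'title' field are hypothetical names that I made up to show the pattern.

```python
import json

ALLOWED_METHODS = ("get", "put", "post", "delete")


class TodoResource:
    """Hypothetical resource class with one method per HTTP verb it supports."""

    def __init__(self):
        self.items = {}

    def get(self, item_id, data):
        if item_id not in self.items:
            return 404, {"message": "Not Found"}
        return 200, self.items[item_id]

    def put(self, item_id, data):
        # Validate before touching any state; reject malformed input.
        if not isinstance(data, dict) or not isinstance(data.get("title"), str):
            return 400, {"message": "Bad Request: 'title' (string) is required"}
        self.items[item_id] = {"title": data["title"]}
        return 200, self.items[item_id]


def dispatch(resource, method, item_id, raw_body=None):
    """Route a request to the matching method, or answer with an error status."""
    name = method.lower()
    handler = getattr(resource, name, None) if name in ALLOWED_METHODS else None
    if handler is None:
        return 405, {"message": "Method Not Allowed"}
    data = None
    if raw_body:
        try:
            data = json.loads(raw_body)
        except ValueError:
            return 400, {"message": "Bad Request: body is not valid JSON"}
    return handler(item_id, data)


todos = TodoResource()
print(dispatch(todos, "PUT", "1", '{"title": "write summary"}'))
print(dispatch(todos, "PUT", "2", '{"name": "oops"}'))
print(dispatch(todos, "GET", "1"))
```

The resource holds no per-client state, and the client only has to know the methods and status codes. That is the decoupling that makes it easy to put an Angular front-end on top.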
So what is Angular (or AngularJS)? It is built and hyped by Google, loved by many. Everyone uses it, so everyone uses it... A toolset for building a front-end framework. An HTML compiler. Elegant asynchronous request handling. Automagic page updates through data binding. You can let it get a (REST) resource from the server and on success bind the data.
Hybrid has been done. For example http://django-angular.readthedocs.org: a collection of utilities to ease integration of Angular and Django.
Example: I have created a Tinder-like app to like members of metal bands: http://hologramearth.com/metalfest Really hybrid: I am mixing Python and Angular code in the template there.
- Hybrid could make your application more DRY.
- Initialization can be done on back-end models.
- REST-ful resources where they are needed.
- Back-end updates might run through the whole stack.
- Is everything REST-ful?
- No clear separation of tasks (server, client).
- Be careful not to need, say, 25 asynchronous calls when you could easily prepare it all at once on the back-end.
- Hybrid is useful at the beginning of your project. You can grow from there.
- Combining Jinja templates with Angular, which use the same brackets, is possible as you have shown, but I would choose a different templating language.
And let's connect on LinkedIn, I am new there.
These are two members of the Automation and Configuration Management family. You probably know Fabric already.
I like open source and building communities around open source projects.
JumpScale used to be known as PyLabs. Cloud service platform: you can build a cloud service with it. Go to a web interface, connect machines there, just a model, save it, edit it, machines are there. Easy. Q-Layer (taken over meanwhile) and mothership1.com have been built with this.
You have an agent controller and agents. The demo used Vagrant. There is a command jpackage that knows how to install packages on various Linux systems.
Each JumpScript on the controller is written in Python. It will run scripts on the agents. A script could start a long running process and then you can monitor it on the server, or via a web UI. But you can come up with your own use cases.
Fabric is shell scripting in a Pythonic way. Less cryptic than Ansible. The code can be run on multiple hosts (and can include localhost).
Create a file, fabfile.py is a good default name, and add functions in there. You can call those with fab function1 function2. This will then be executed on the hosts that you have defined. Or override those on the command line.
There is support for running it in parallel on the hosts, but I have not tried it.
If you like one of these projects, make sure you join the communities.