Nathan Van Gheem: Introduction to Python Asyncio
Talk by Nathan Van Gheem at the Plone Conference 2017 in Barcelona.
This is about the Python 3 core asyncio library. "This module provides infrastructure for writing single-threaded concurrent code using coroutines, multiplexing I/O access over sockets and other resources, running network clients and servers, and other related primitives."
The first time I read that I was like: what?
Asynchronous programming using the async and await syntax. The main idea: network activity should not block other code. This is useful because web applications use TCP sockets. It is a way to improve performance and scale web applications. Also think of microservices.
The optimised event loop allows you to handle a larger number of requests per second. You can have long running requests with very little performance impact. With standard Plone that is impossible.
Requirements: Python 3.5 or later for the async/await syntax (the asyncio module itself was introduced in Python 3.4).
How are typical web frameworks like Flask and Django designed? Each request is tied to a thread, so you are limited by the number of threads and processes you run. Threads are expensive (GIL, context switching, CPU). If no threads are available, further requests are blocked, waiting for an open thread. Threads are blocked by network traffic, for example waiting on a database server.
With asyncio, requests can be tied to tasks. You can have lots of tasks per thread, and a task that is waiting for network traffic does not hurt you. But be careful: if anywhere in your code you use the blocking requests library instead of an asyncio-aware client, that call blocks the whole event loop.
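A minimal sketch of the idea, using asyncio.sleep to stand in for waiting on a socket (the function name fake_db_call is made up for illustration): a hundred "requests" that each wait 0.1 seconds finish together in roughly 0.1 seconds, because waiting tasks cost the event loop almost nothing.

```python
import asyncio
import time

async def fake_db_call(delay):
    # Stands in for waiting on network I/O; while one task sleeps,
    # the event loop is free to run the others.
    await asyncio.sleep(delay)
    return delay

async def handle_requests():
    # Each "request" becomes a task on the single event loop thread.
    tasks = [asyncio.ensure_future(fake_db_call(0.1)) for _ in range(100)]
    return [await t for t in tasks]

loop = asyncio.new_event_loop()
start = time.monotonic()
results = loop.run_until_complete(handle_requests())
elapsed = time.monotonic() - start
loop.close()
print(len(results), round(elapsed, 1))  # 100 tasks in ~0.1s, not 10s
```

If fake_db_call used the synchronous requests library instead of awaiting, every task would run one after the other and the whole loop would stall.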
We have Futures. The loop's run_until_complete, together with asyncio.ensure_future, wraps your asynchronous call in a Future object.
You can have long running Tasks. Tasks, futures and coroutines are very similar, in the beginning you don't need to worry about that.
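A small sketch of that wrapping: asyncio.ensure_future turns a coroutine into a Task (a Future subclass) that the loop drives to completion, after which the Future holds the result.

```python
import asyncio

async def compute():
    await asyncio.sleep(0.01)
    return 42

async def main():
    # ensure_future wraps the coroutine in a Task (a Future subclass).
    future = asyncio.ensure_future(compute())
    value = await future  # wait for the Future to resolve
    return future.done(), value

loop = asyncio.new_event_loop()
done, value = loop.run_until_complete(main())
loop.close()
print(done, value)  # True 42
```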
Gotcha: everything must be async. Async functions need to be run by the event loop. If you call one like a normal function, it does nothing. If you don't call an async function using await, it will never be run either.
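You can see the gotcha directly: calling an async function just produces a coroutine object, and nothing executes until it is awaited (or scheduled on the loop).

```python
import asyncio

async def greet():
    return "hello"

async def main():
    coro = greet()  # this does NOT run the function body
    assert type(coro).__name__ == "coroutine"  # nothing has executed yet
    return await coro  # only await (driven by the loop) runs it

loop = asyncio.new_event_loop()
result = loop.run_until_complete(main())
loop.close()
print(result)  # hello
```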
asyncio is single threaded: only one event loop can run in a thread at a time. Running multi-threaded code inside asyncio is unsafe. You can have multiple threads, each with its own event loop. You can get the feel of multiprocessing by using asyncio.gather.
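A quick sketch of asyncio.gather: it schedules several coroutines concurrently on the one event loop thread and returns their results in the order the arguments were passed.

```python
import asyncio

async def work(n):
    await asyncio.sleep(0.01)  # all three sleeps overlap
    return n * n

async def main():
    # gather runs the coroutines concurrently and preserves argument order.
    return await asyncio.gather(work(1), work(2), work(3))

loop = asyncio.new_event_loop()
squares = loop.run_until_complete(main())
loop.close()
print(squares)  # [1, 4, 9]
```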
With an 'executor' you can make synchronous code asynchronous. Typically it is a thread executor. Try to avoid it, but it is a tool that you can use if needed. See concurrent.futures.
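A sketch of the executor escape hatch, assuming some synchronous function you cannot rewrite (blocking_hash here is a made-up stand-in): loop.run_in_executor pushes it onto a concurrent.futures thread pool and gives you back something awaitable.

```python
import asyncio
import hashlib

def blocking_hash(data):
    # Ordinary synchronous code we cannot (or don't want to) rewrite as async.
    return hashlib.sha256(data).hexdigest()

async def main():
    loop = asyncio.get_event_loop()
    # None selects the loop's default executor, a
    # concurrent.futures.ThreadPoolExecutor.
    return await loop.run_in_executor(None, blocking_hash, b"plone")

loop = asyncio.new_event_loop()
digest = loop.run_until_complete(main())
loop.close()
print(digest)
```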
asyncio comes with an amazing subprocess module, so you can await the result of executing a command on the terminal.
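For example, with asyncio.create_subprocess_exec you can spawn a command and await its output without blocking the loop (this sketch assumes a Unix-like system where echo is available):

```python
import asyncio

async def run_cmd(*cmd):
    # Spawn a subprocess without blocking the event loop.
    proc = await asyncio.create_subprocess_exec(
        *cmd, stdout=asyncio.subprocess.PIPE)
    stdout, _ = await proc.communicate()  # await the command's output
    return proc.returncode, stdout.decode().strip()

loop = asyncio.new_event_loop()
code, out = loop.run_until_complete(run_cmd("echo", "hello"))
loop.close()
print(code, out)  # 0 hello
```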
The event loop is pluggable, for example tokio.
More and more libraries are popping up using asyncio:
- aiohttp: client and server library
- aioes for Elasticsearch
- asyncpg for PostgreSQL
- aiosmtpd for SMTP
[See https://github.com/aio-libs for more.]
Debugging is more difficult than with regular sequential programs; using pdb is tricky. aioconsole gives you a Python prompt with an asyncio loop already set up for you.
guillotina uses asyncio.
In Python 3.7 you have an execution context, which is going to be nice.
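This refers to the contextvars module that landed in Python 3.7: each asyncio task runs with its own copy of the context, so per-request state can live in a ContextVar instead of being passed through every function call. A minimal sketch:

```python
import asyncio
import contextvars

request_id = contextvars.ContextVar("request_id")

async def handler(rid):
    request_id.set(rid)        # set in this task's own context copy
    await asyncio.sleep(0.01)  # other tasks run in the meantime
    return request_id.get()    # still sees this task's value

async def main():
    # Two interleaving tasks; neither overwrites the other's value.
    return await asyncio.gather(handler("a"), handler("b"))

loop = asyncio.new_event_loop()
ids = loop.run_until_complete(main())
loop.close()
print(ids)  # ['a', 'b']
```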
Questions and answers:
- You cannot do WSGI with asyncio. But Tornado uses asyncio.
- What was hardest? Wrapping your head around it all.
- Is this only for network calls, or also useful for disk access? There is an add-on for that. I tried it, and it felt like kind of a hack.
- Do you have profiler tools, like seeing if code is blocking too long? See an earlier talk. There is `aiomonitor <https://github.com/aio-libs/aiomonitor>`_.