Jim Fulton - The ZODB
Talk by Jim Fulton at the Plone Conference 2016 in Boston.
See slides at http://j1m.me/plone16
Paul Everitt introduces the talk: The ZODB is still amazing after twenty years. Hierarchical object database including permissions, NoSQL, lots of things. On to Jim.
I am currently working one hundred percent on ZODB. Previously, at Zope Corporation, I could focus on it only part of the time, solving some problems we were having. Zope Corporation no longer exists. I was contracted by ZeroDB, who made this possible. ZeroDB had two products: a database that stores data encrypted at rest, and big-data analysis with Hadoop. They decided to focus on their Hadoop-based product for now. I plan to offer ZODB support and consultancy, so get in contact if you need me.
Is anyone here using ZODB based on NEO? No. NEO is doing some interesting things for highly durable storage. It takes a bit more effort to set up. Poll: about half the people in the room use RelStorage, all use ZEO, a few use ZRS. I really recommend looking at ZRS if you use ZEO. ZRS (Zope Replication Services) 1 was a nightmare, but version 2 is very good. We never made backups with repozo, we just replicated it.
ZEO version 4 used asyncore, by far the oldest async library in Python. It has lots of issues and is deprecated. I had a suspicion that asyncore made ZEO slower. I rewrote most of ZEO to use asyncio instead, and cleaned the code up. In most cases there is a performance improvement.
The ZODB API is synchronous. I have been using async libraries since say 1996. The API could change. Shane added a cool hack to ZServer to avoid waking up the event loop, which is a big performance win.
Transactions should be short. The longer the transaction, the higher the chance of a conflict. Connections are expensive resources, they take memory. If you have long-running work, try doing this asynchronously. But handing this off reliably is tricky.
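To illustrate why longer transactions conflict more often, here is a toy sketch of optimistic concurrency control: a commit fails if the object changed after the transaction read it. The `ToyStore` class, its `ConflictError`, and the `increment` helper are hypothetical names made up for this illustration. They are not ZODB's API, and the real conflict detection is more involved.

```python
class ConflictError(Exception):
    """Raised when an object changed since the transaction read it."""


class ToyStore:
    """Hypothetical in-memory store: each key has a value and a serial."""

    def __init__(self):
        self._data = {}  # key -> (value, serial)

    def read(self, key):
        # Unknown keys start at value 0 with serial 0.
        return self._data.get(key, (0, 0))

    def commit(self, key, new_value, serial_read):
        _, current_serial = self._data.get(key, (0, 0))
        if current_serial != serial_read:
            # Someone else committed between our read and our commit.
            raise ConflictError(key)
        self._data[key] = (new_value, current_serial + 1)


def increment(store, key, retries=3):
    """A short transaction: read, compute, commit, retry on conflict."""
    for _ in range(retries):
        value, serial = store.read(key)
        try:
            store.commit(key, value + 1, serial)
            return value + 1
        except ConflictError:
            continue
    raise ConflictError(key)
```

The wider the gap between `read` and `commit`, the more chances another writer has to bump the serial in between, which is the argument for keeping that gap short.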
Consider using content-aware load balancers, so you don't need all data in memory on all servers. The working set may not even fit in memory.
A challenge for some applications is to get objects loaded fast, especially on startup. (You can often mitigate this using a ZEO client cache.) There were some problems with persistent caches, but they have been stable for a few years. And you can now prefetch items. You tell ZODB to prefetch some items, and then you can forget about the request. ZODB will meanwhile load them asynchronously, so they may already be available later when you really need them.
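The prefetch idea can be sketched with the standard library: ask for some objects, return immediately, and only wait later if the load has not finished yet. The `PrefetchingCache` class and the `load_from_server` callable are hypothetical names for this illustration. This is not how ZODB or ZEO implement prefetching, just the general pattern.

```python
from concurrent.futures import ThreadPoolExecutor


class PrefetchingCache:
    """Hypothetical client-side cache that loads objects in the background."""

    def __init__(self, load_from_server):
        self._load = load_from_server  # slow function: oid -> state
        self._pending = {}             # oid -> Future holding the state
        self._pool = ThreadPoolExecutor(max_workers=4)

    def prefetch(self, *oids):
        """Start loading the given oids and return immediately."""
        for oid in oids:
            if oid not in self._pending:
                self._pending[oid] = self._pool.submit(self._load, oid)

    def get(self, oid):
        """Return the state, blocking only if the load is still running."""
        self.prefetch(oid)
        return self._pending[oid].result()

    def close(self):
        self._pool.shutdown()
```

A caller would issue `cache.prefetch(...)` early in a request for objects it expects to need, do other work, and call `cache.get(oid)` when the object is actually required.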
ZEO now has SSL. ZEO had authentication, but it made the code harder to understand. It has now been removed in favor of SSL. So you can restrict access to the ZODB.
ZeroDB stored the data encrypted, which meant the server could not do conflict resolution. So I added conflict resolution on the client. You can then work with real objects instead of just state. Solving conflicts in BTree splits would be easier then. It reduces processing time on the server. I would like to move conflict resolution up to the ZODB, instead of having it in ZEO.
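For context, this is roughly what state-based conflict resolution looks like today: a class defines `_p_resolveConflict`, which receives three states rather than real objects. The `Counter` class below is a hypothetical example. It is a plain class so the sketch runs without ZODB installed, and it assumes the state is the instance dictionary, which is the default when a persistent class does not customize its state.

```python
class Counter:
    """Counter whose concurrent increments can be merged.

    In a real application this would subclass persistent.Persistent.
    """

    def __init__(self):
        self.value = 0

    def increment(self, amount=1):
        self.value += amount

    def _p_resolveConflict(self, old_state, saved_state, new_state):
        # old_state:   state both transactions started from
        # saved_state: state already committed by the other transaction
        # new_state:   state this transaction is trying to commit
        resolved = dict(new_state)
        # Apply both deltas relative to the common starting point.
        resolved["value"] = (
            saved_state["value"] + new_state["value"] - old_state["value"]
        )
        return resolved
```

Because the method only sees states, anything more structured than a counter gets awkward quickly, which is why resolving conflicts on real objects is attractive.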
Object-level locks. Currently ZEO locks the database for writes during the second phase of the commit process. In that phase it needs to wait for the clients to maybe do conflict resolution. Object-level locks could help here. I got it working, but it mostly did not give a performance win.
ZODB on the server is actually faster with PyPy.
ZeroDB did some interesting experiments. Split a database into multiple virtual databases, one per user, separate invalidations.
Unification of RelStorage, NEO, ZEO. NEO had some patches for ZODB and they are now merged, like a simpler implementation of multiversion concurrency control (MVCC). This is better for RelStorage as well. RelStorage is no longer a special case, and it has a new maintainer in Jason Madden.
Inconsistency between ZEO clients. Scenario: add an object in one zeoclient, next request goes to second zeoclient and it potentially does not have the object yet during a very short timespan. There now is a new server-sync option to force a server round trip before each transaction. That is a cost, but maybe it should be the default.
What have I been doing since my work for ZeroDB? I worked on decent documentation, which had lagged behind for a long time. See http://zodb.org. You can help me improve it by writing documentation, or definitely also by bugging me about documentation that you are missing.
FileStorage2. FileStorage worked out much better than I ever imagined. The main code has probably not changed in twenty years. It is a bit slow. With FileStorage2 we have better, separate packing. It does need external garbage collection, but that is better. Unneeded features are removed: versions and back-pointers. It uses multiple files, so with a pack you can split a file, write newly incoming transactions to the new part and pack the old part.
Byteserver is an alternative ZEO server implementation, written in the Rust language. Rust is very fast, faster than Go mostly. No Global Interpreter Lock like Python has. Byteserver includes a FileStorage2 implementation, new API between server and storage, built for speed rather than pluggability. Initial tests, from this morning, are promising, twice as fast as ZEO.
We used ZooKeeper a lot, which helps keep track of which servers are live and which have disappeared.
Future ZODB ideas:
- more speed. I don't need speed to be the reason people use ZODB, but it should not be a barrier.
- more documentation
- OO conflict resolution
- The ability to subscribe to object updates.
- Integration with external indexes like Elasticsearch, Solr. ZRS could be used for this: look at that stream of data and push the relevant parts to the external index.
- Persistent pandas data frames
- A 'jsonic' API, to be able to look at the data without having the classes. There are some ZODB browsers already.
- ZRS auto fail-over. At Zope Corp we probably only had one or two unexpected fail-overs in all those years.
- Official Docker images would be good. But if that uses Python 3 then your client also needs to be Python 3.
- ZEO authorization.
- Persistent classes?