Jens W. Klein: RelStorage - an alternative ZODB backend
Talk by Jens Klein at the Plone Conference 2015 in Bucharest.
ZODB is pluggable, with lots of options for storage: FileStorage (a Data.fs accessed by a single Zope instance), ZEO, ZRS, DemoStorage, RelStorage.
RelStorage can use MySQL (which gave us problems to set up), Oracle (expensive), or PostgreSQL (just right for us) as its backend.
It can be up to twice as fast as ZEO, mainly because transactions are faster, so you can have more concurrent writes. The limit is always the transaction conflicts.
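To make the point about conflicts concrete, here is a minimal sketch of optimistic conflict detection, the general idea behind a ZODB ConflictError. ToyStorage and its methods are hypothetical names for illustration only, not the ZODB or RelStorage API, and real ZODB can also try to resolve some conflicts instead of failing.

```python
class ConflictError(Exception):
    """Raised when an object changed after the transaction read it."""


class ToyStorage:
    """Hypothetical in-memory storage; not the real ZODB storage API."""

    def __init__(self):
        self._records = {}  # oid -> (serial, value)
        self._last_tid = 0

    def load(self, oid):
        # Return (serial, value); serial 0 means "never written".
        return self._records.get(oid, (0, None))

    def commit(self, oid, read_serial, value):
        current_serial, _ = self.load(oid)
        if current_serial != read_serial:
            raise ConflictError(
                "%s changed since serial %d" % (oid, read_serial))
        self._last_tid += 1
        self._records[oid] = (self._last_tid, value)
        return self._last_tid


storage = ToyStorage()
storage.commit("counter", 0, 1)

# Two clients read the same revision of the same object.
serial_a, value_a = storage.load("counter")
serial_b, value_b = storage.load("counter")

storage.commit("counter", serial_a, value_a + 1)  # first writer wins
try:
    storage.commit("counter", serial_b, value_b + 1)
    second_write_failed = False
except ConflictError:
    second_write_failed = True  # second writer has to retry
```

In this toy model, however fast a commit is, two writers that start from the same revision of one object still collide, while writes to different objects do not.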
PostgreSQL is an active open source project, mature, flexible, fast, standard. Sysadmins can handle it just fine.
You can use RelStorage in history-free mode. With history you can undo transactions in the ZMI. With large sites, history-free makes more sense, because usually so many writes are done that almost anything you want to undo has already been overwritten, so undo will not work.
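A small sketch of why undo stops working on a busy site. It assumes a simplified model with one object per transaction, and ToyHistoryStore is a hypothetical name, not the real storage API. Undo only succeeds while the transaction you want to undo is still the latest change to that object.

```python
class UndoError(Exception):
    """Raised when a transaction can no longer be undone."""


class ToyHistoryStore:
    """Hypothetical history-preserving store, one object per transaction."""

    def __init__(self):
        self._revisions = {}  # oid -> list of (tid, value), oldest first
        self._next_tid = 1

    def store(self, oid, value):
        tid = self._next_tid
        self._next_tid += 1
        self._revisions.setdefault(oid, []).append((tid, value))
        return tid

    def load(self, oid):
        return self._revisions[oid][-1][1]

    def undo(self, oid, tid):
        history = self._revisions[oid]
        if history[-1][0] != tid:
            raise UndoError("modified by a later transaction")
        if len(history) < 2:
            raise UndoError("no earlier revision to restore")
        # Undo is itself a new transaction that restores the previous value.
        return self.store(oid, history[-2][1])


store = ToyHistoryStore()
store.store("page", "draft")
tid_typo = store.store("page", "draft with typo")
store.undo("page", tid_typo)          # still the latest change: works
restored = store.load("page")

tid_old = store.store("page", "edit 1")
store.store("page", "edit 2")         # a later write to the same object
try:
    store.undo("page", tid_old)
    undo_failed = False
except UndoError:
    undo_failed = True                # already overwritten: undo refused
```

A history-free store would keep only the last entry per object, so there would be nothing to restore at all.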
You can use memcached. You need to poll the database for changes at intervals; with ZEO this happens automatically. You can replicate the database with standard tools. You can migrate from RelStorage to FileStorage and back.
Caching. Different threads and instances can share the same memcached, saving memory. Run a separate memcached for each Zope client machine, on that same machine. And do not let this same memcached handle, for example, LDAP.
You should use blobstorage, so check your code, your content types: you do not want files in either FileStorage or RelStorage. [Note from Maurits: PostgreSQL has large object support, which works, but we switched to a shared filesystem blob storage for a client.]
You can split the Zope clients into edit and read-only clients and give them different poll intervals. Use 0.5 to 10 seconds for read-only.
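The effect of the poll interval can be sketched with a toy model. ToyDatabase and PollingClient are hypothetical names, and the real RelStorage polling logic is more involved (it can also consult the cache). The sketch only shows the trade-off: a longer interval means fewer database checks, at the price of a read-only client briefly serving stale data.

```python
class ToyDatabase:
    """Hypothetical shared database that counts commits."""

    def __init__(self):
        self.last_tid = 0
        self.data = {}

    def commit(self, key, value):
        self.last_tid += 1
        self.data[key] = value


class PollingClient:
    """Hypothetical client that looks for changes at most once per interval."""

    def __init__(self, db, poll_interval, clock):
        self.db = db
        self.poll_interval = poll_interval  # seconds; 0 means poll every time
        self.clock = clock                  # callable returning seconds
        self._last_poll = None
        self._seen_tid = 0
        self._snapshot = {}

    def read(self, key):
        now = self.clock()
        if (self._last_poll is None
                or now - self._last_poll >= self.poll_interval):
            self._poll(now)
        return self._snapshot.get(key)

    def _poll(self, now):
        self._last_poll = now
        if self.db.last_tid != self._seen_tid:
            # Something was committed: refresh the local view.
            self._snapshot = dict(self.db.data)
            self._seen_tid = self.db.last_tid


now = [0.0]  # fake clock so the example is deterministic
db = ToyDatabase()
edit_client = PollingClient(db, poll_interval=0, clock=lambda: now[0])
readonly_client = PollingClient(db, poll_interval=5, clock=lambda: now[0])

db.commit("title", "old")
first_read = readonly_client.read("title")   # first read always polls

db.commit("title", "new")
now[0] = 2.0
edit_view = edit_client.read("title")        # polls on every read
stale_view = readonly_client.read("title")   # interval not yet elapsed

now[0] = 5.0
fresh_view = readonly_client.read("title")   # interval elapsed, polls again
```

Here the edit client sees the new title immediately, while the read-only client keeps serving the old one until its five seconds are up.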
Do have a look at configuring and tuning PostgreSQL.
Now lots of time for questions.
Compared to a normal setup, you need to replace the ZEO server with RelStorage and add a memcached server on each machine that has a Zope client. Configure the Zope clients to talk to RelStorage.
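As a rough illustration of that client configuration, the helper below renders a zope.conf fragment. The function name, the DSN, the paths and the memcached address are made up for this example. The option names (blob-dir, shared-blob-dir, keep-history, poll-interval, cache-servers, dsn) are assumed from the RelStorage 1.x documentation, so check them against the release you actually run.

```python
def render_relstorage_section(dsn, blob_dir, cache_servers,
                              poll_interval=0, keep_history=True):
    """Return a zope.conf fragment that points a Zope client at RelStorage.

    Option names are assumed from the RelStorage 1.x documentation.
    """
    lines = [
        "%import relstorage",
        "<zodb_db main>",
        "    mount-point /",
        "    <relstorage>",
        "        blob-dir %s" % blob_dir,
        "        shared-blob-dir true",
        "        keep-history %s" % ("true" if keep_history else "false"),
        "        poll-interval %s" % poll_interval,
        "        cache-servers %s" % " ".join(cache_servers),
        "        <postgresql>",
        "            dsn %s" % dsn,
        "        </postgresql>",
        "    </relstorage>",
        "</zodb_db>",
    ]
    return "\n".join(lines) + "\n"


# Example values only: a read-only client with a local memcached.
readonly_conf = render_relstorage_section(
    dsn="dbname='zodb' user='zope' host='db.example.org'",
    blob_dir="var/blobstorage",
    cache_servers=["127.0.0.1:11211"],
    poll_interval=5,
    keep_history=False,
)
print(readonly_conf)
```

An edit client would get the same fragment with a poll interval of 0. In a buildout-based Plone setup you would normally let the instance recipe generate this section rather than write it by hand.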
We could sprint on it to use more options of Postgres 9, making it even faster.
The original goal, maybe ten years ago, was that some customers required a specific relational database. Things have changed, NoSQL storage is more common now. But how is it possible that it may still be twice as fast as ZEO, as ZEO is quite simple actually? The portal_catalog is part of it probably. The code is hard and takes time to get into. In ZEO things could be tweaked. But for policy reasons it can still be handy to be able to say to an IT department: just give us a Postgres database.
Transactions are still serialized, not concurrent. Switching from pickled format to JSON may make it easier to use more Postgres features.
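To illustrate the pickle versus JSON remark: the sketch below serializes the same made-up object state both ways using only the standard library. It is a simplification, because real ZODB records also carry class information and persistent references that plain JSON has no direct equivalent for.

```python
import json
import pickle

# Hypothetical object state; real ZODB records also carry class information.
state = {"title": "RelStorage talk", "city": "Bucharest", "year": 2015}

as_pickle = pickle.dumps(state, protocol=2)   # opaque bytes for the database
as_json = json.dumps(state, sort_keys=True)   # text the database can parse

# Both formats round-trip the same state in Python.
assert pickle.loads(as_pickle) == state
assert json.loads(as_json) == state

print(repr(as_pickle[:20]))
print(as_json)
```

With the state stored as JSON, PostgreSQL could look inside a record with its own JSON operators (for example `->>` to pull out the city), which it cannot do with a pickle stored as binary data.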
Can't we change ZEO so you can have a memcached between the ZEO client and the ZEO server, like for RelStorage? Then you need less disk cache and can make more use of memory.
Avoid write on read.
Python 3? Does not currently work as far as I know, but should not be that hard. It is just plain Python. ZODB itself works on Python 3.