Plone

published Nov 03, 2021

This is here to serve as contents for the atom/rss feed for Plone, also read by planet.plone.org.

David Glick: Nice blobs! It'd be a shame if anything happened to them...

published Oct 18, 2017

Talk by David Glick at the Plone Conference 2017 in Barcelona.

This is about storing blobs in the cloud. Mostly I will talk about storing them in Amazon S3, but you can store them elsewhere as well.

Earlier this year they had about 750 GB of blob data, growing by a few GB a day, and the hard disk was 90 percent full. It was also hard to keep copies of the site up to date, and keep backups. What to do?

Requirements:

  • move blob data to cloud storage
  • keep reasonable performance
  • no changes at the app level (the Plone Site)
  • smooth migration path: we did not want to bring the site down.

So should we handle it at the app level, storing an S3 URL? But you still need the image locally when you create scales.

Inspiration: the ZODB client cache. Could we keep some blobs in there? I looked further, and Jim Fulton had already created s3blobstorage and s3blobserver. They looked experimental, with not a lot of documentation, though that did not need to be a deal breaker.

So I created collective.s3blobs, which works a bit like those two packages. When the app needs a blob, it looks first in the ZODB, then in a filesystem cache, and then in an S3 bucket. When it fetches a blob from S3, it stores a copy in the filesystem cache, so all recently used blobs end up in that cache, whether they come from the ZODB or from S3. The cache is limited in size.

You can choose which blobs to move with an archive-blobs script, to which you can pass some options. This avoided the need to migrate everything at once.

In your buildout you add a storage-wrapper with collective.s3blobs (see the package readme for the exact syntax):

storage-wrapper =
    %%import collective.s3blobs
    <s3blobcache>
      cache-dir ${buildout:directory}/var/blobcache
      cache-size 100000000
      bucket-name your-s3-bucket
    </s3blobcache>
    %s

This requires a recent version of the plone.recipe.zope2instance recipe.

It was successful. Especially for this site it helps a lot: many large original images are never shown, only their smaller scales.

S3 claims to offer secure storage, so that you do not lose your data. We have set up rules to prevent someone with direct access to S3, outside of Plone, from accidentally deleting the blobs. We also have versioning enabled in S3.

Ideas:

  • Packing needs to be implemented. The standard ZODB packing will not remove blobs from S3. I have some ideas.
  • We could use the CloudFront CDN to serve the images. That has security implications, although for this site it would be fine. But it costs extra, and the client did not need it.

It is in stable production use, but unreleased. Needs more documentation and tests.

Code: https://github.com/collective/collective.s3blobs

Alessandro Pisa: Really Unconventional Migrations

published Oct 18, 2017

Talk by Alessandro Pisa at the Plone Conference 2017 in Barcelona.

Migration is just a matter of state, moving from state A to state B. The customer does not care how you get there, but you should, or it costs you too much time.

We migrated a site from Plone 4.1.3 to 5.0.6, and from a custom UI to Quaive, and from Archetypes to dexterity. The standard path is fine. But we also needed to upgrade Solr and several add-ons. And we had 10 GB of Data.fs and 300 GB of blobs, about 200K ATFiles, and previews, and versions, and about 5000 users.

For the standard path, getting the fresh data would take hours, the Plone upgrade hours, the dexterity migration days, and the add-ons an unknown amount of time. We would have to block 5000 users for a whole weekend or more. That was not really an option, so the target was: less than one day.

Three environments: production on several machines, staging environment, and a migration server that was not very big.

Tip: use a migration view. You get well organised code and fast development with plone.reload, and you can trigger it with curl. This is better than having a script that you run with bin/zeoclient run: you get a well defined, reliable and clean upgrade path.
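
A minimal sketch of such a view (the class, method and step names are mine, not from the talk):

    import logging

    from Products.Five.browser import BrowserView

    logger = logging.getLogger(__name__)


    class UpgradeView(BrowserView):
        """Migration view; register it with ZCML, then trigger it with e.g.
        curl -u admin:secret http://localhost:8080/Plone/@@upgrade
        """

        def __call__(self):
            self.upgrade_plone()
            self.migrate_content()
            return "Upgrade done."

        def upgrade_plone(self):
            # Run the standard Plone upgrade programmatically.
            migration_tool = self.context.portal_migration
            if migration_tool.needUpgrading():
                migration_tool.upgrade()
            logger.info("Plone upgrade finished.")

        def migrate_content(self):
            # Each step is a small method: easy to edit and re-run
            # with plone.reload during development.
            logger.info("Content migration finished.")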

To rsync 300 GB of blobs takes time. Can we start a migration without blobs? Yes, by using experimental.gracefulblobsmissing. How to start:

  • prepare migrating env
  • copy Data.fs and Data.fs.index
  • rsync the blobs in the background.

Some Plone upgrade bottlenecks:

  1. The portal_catalog: if an index is added or needs reindexing, all objects need to be awoken; for us this took 45 minutes.
  2. portal_catalog
  3. portal_catalog...

Solution: go brainless. Clear the portal_catalog and run the standard Plone migration. This takes no time. For us this worked, because we wanted to touch all objects later anyway. But some upgrade steps might depend on having a fully filled catalog.
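
A rough sketch of the trick (assuming plone.api is available; this is only safe because everything gets reindexed later):

    from plone import api

    portal = api.portal.get()
    catalog = portal.portal_catalog
    # Empty the catalog, so the upgrade steps have no brains to update.
    catalog.manage_catalogClear()
    # ... run the standard Plone migration here ...
    # Afterwards, rebuild the catalog from scratch:
    catalog.clearFindAndRebuild()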

Add-ons: you may need newer versions, and they may not yet be compatible. Solution: go straight to the goal. We made a Plone 5.0.6 buildout with only the new code and packages that we wanted to use. That meant we had classes in the Data.fs that no longer existed. We used alias_module from plone.app.upgrade.utils to replace the missing classes with tiny stand-in classes that had just the few methods needed during migration. Before we started, we cleaned up the persistent import and export steps in portal_setup, and we cleaned up the skins. So we focused on the result, not on packages that would not be there anymore.
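
A minimal sketch of that trick (the dotted path and the stand-in class are illustrative, not the code from this migration):

    from plone.app.upgrade.utils import alias_module


    class FakeOldFile(object):
        """Tiny stand-in with just the few methods the migration needs."""


    # Old pickles referring to the removed class now load the stand-in.
    alias_module('collective.oldaddon.content.file.OldFile', FakeOldFile)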

Biggest problem: @@atct_migrator for migrating from Archetypes to dexterity. It basically doubles your blobs, because the old blobs remain, and it took too long for us. We needed to avoid recreating the blobs. So in a migration step, for each object, we deleted it from its parent, set the newly wanted class, and added it back to the parent. Fields needed to be adapted or updated, for example from DateTime to datetime. For each blob in a field, we created a new NamedBlobFile and directly set its _blob attribute to the old blob, so that no new blob had to be created on the file system. We never actually needed to read the blobs, so it was fine that they were still being rsynced in the background.
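
A hedged sketch of the idea (the helper and field names are mine; the real migration also had to handle catalogs, events and more):

    from plone.namedfile.file import NamedBlobFile


    def switch_class_and_keep_blob(parent, obj_id, new_class):
        obj = parent._getOb(obj_id)
        field = obj.getField('file')             # Archetypes field
        old_blob = field.get(obj).getBlob()      # the actual ZODB blob
        filename = field.getFilename(obj)
        content_type = field.getContentType(obj)
        # Low-level delete and re-add, so the ZODB stores the new class path.
        parent._delOb(obj_id)
        obj.__class__ = new_class
        parent._setOb(obj_id, obj)
        # Reuse the existing blob: no new file is written to the file system.
        named = NamedBlobFile(filename=filename, contentType=content_type)
        named._blob = old_blob
        obj.file = named
        return obj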

Solr migration could be done in parallel, in half an hour, followed by atomic updates in about two hours. SearchableText was not touched; reindexing it would have taken too long, with all the parsing of text from PDF and Word files.

In many cases, the standard Plone migration is fine. So if you can, just use it.

Note that we also needed to handle other catalogs, and link integrity, and some other stuff.

Philip: plone.app.contenttypes has a function for changing the class too, and functions for temporarily switching off link integrity.

Paul Roeland: An omnivore's perspective

published Oct 18, 2017

Talk by Paul Roeland at the Plone Conference 2017 in Barcelona.

This is an opinionated view on where we can learn, what we should stop doing, and how choices play out.

What do I mean by an omnivore's perspective? I work for a non-profit organisation. I have used Plone since version 0.99. I build sites, maintain them, renovate them, revive them, keep them, put them to sleep. So I do not build sites and then let someone else handle them. Some sites have content that is over twelve years old. Some survive for three years even though the client promised me they would only be there for a year.

What systems do I use? Plone, Wordpress, Drupal, Wagtail, Mezzanine, Ghost, Pagekit, Sulu, Grav, Hugo. Those were the CMSes, well, sort of. I also use CiviCRM, Mailchimp, Odoo. These are just the ones I have used in the last twelve months.

Some are very good. It helps that some are very new, so they carry no ballast.

Some good trends that I like:

  • design is getting simpler and simpler
  • designers have got it into their heads that there are mobile users
  • testing is done better
  • staging and production are known words for most systems

There is also the bad:

  • developer centric consultantware, where for any change you need to ask a developer, instead of having a lot of power already in your hands like in Plone
  • Some are so simple that they are basically brochureware. That works and looks fine as long as you have at most ten pages. So for a larger NGO: maybe not. 'Does it scale?' is really a good question here. Does it survive three years with 300 new pages a year, including several restructurings?

And of course the ugly:

  • The worst ideas resurface, like the inline editing that we had in Plone.
  • Once you publish it, you cannot change the url ever... Do you work in a real organisation?
  • One way street, also known as data grave. Please give me a way to export the data without resorting to crawling the site.
  • Security as an afterthought.

What Plone has that is really awesome and mostly not anywhere else:

  • placeful content: folders with stuff in them. This means you can find it. You can move a folder somewhere else including all its stuff. Again, a flat structure is fine for ten pages, but not for more.
  • collections: show a list of content somewhere else. Awesome, I want more and better of those.
  • workflow: just unrivalled. Others may have a bit of workflow, but nothing as good.
  • content rules: if you do not use them, you are missing out.

Plone's happy place, where it shines, according to me:

  • long term content, that you are supposed to keep for years and years. Overkill for a site that is just there for a short term action.
  • skilled editors. They do not need to be full time editors, but they should be happy to at least spend half an hour to learn how the system works.
  • commitment from the organisation
  • where you can trust and empower your power users. They are your ambassadors.

Where can Plone improve? In the way we approach content management as a whole. We often get bogged down in details in our daily work. But you are a captain and should act like it: Captain Janeway said 'there is coffee in that nebula, go for it', not 'point the ship at bearing 180.53 and warp 4.623'. Get the big picture. Tell developers (or Plone) what you want, not exactly how you want it.

Content life:

  • 20 percent is spent writing the content outside of the CMS. Say Word or an email or Google docs.
  • 5 percent of time is spent getting it into the CMS.
  • 75 percent is spent re-arranging, re-using, referring, tagging and archiving it. In other words: content management. It is annoying if it takes twenty clicks to get anything done.

Content types:

  • Forget dexterity versus Archetypes for a moment.
  • You have text and images.
  • Embed: youtube, twitter, etc. It is there on basically every site.
  • Snippets and results. Extra things like 'people who bought this article also bought these other three, so please buy them as well'.
  • Office and PDF. People expect to see these in the site. They do not see the difference between a Word file and an HTML page: why does one open in Word and the other in the site? Make it easy for them with some integration. There are ways.
  • Composite pages. This is a hard problem. We have had at least seven attempts; the latest, Mosaic, is on its way, but could be better. Remember: be a captain. Make it possible to say: item A is the most important, show items B and C if there is room, drop items D and E on a phone. What if we all get extremely large screens, or extremely small screens in our glasses? Do you really want to go through all your pages then?

Sub sites:

  • Folder, Composite and Theme.
  • limited navigation
  • handmade is good enough
  • If the sub site is too big, with too many non standard options, make it a separate site.

WYSIWYG:

  • What You See Is What You Get?
  • Actually: What I See Is Most Likely Not What You Get
  • Instead, Markdown is fine for most sites. It is limited, but that is a strength: editors are less tempted to paste a Word doc into it. A preview is nice.

Stop assuming that something will appear in the left column, because on mobile there may be only one column, or it will not appear at all.

Forms:

  • Basically all form frameworks are sh*t.
  • PloneFormGen is Archetypes-based and no one is going to keep fixing its fields, so use collective.easyform.
  • A quick drag-and-drop edit page is nice, but form creation is for power editors, so it is fine if the interface is geeky. Focus on the end user instead: the end result should look good.

System setup:

  • should be repeatable
  • containers are here to stay, whether Docker or something else. If you want to do it in a different way, we can say in docs that you are on your own. We should stop supporting all the different deployment strategies in the world.

Configuration:

  • Readability counts. It does not need to be shiny: I don't configure a site every day for hours.
  • TTW: allow theming overrides. A different color should be possible. Allow overriding the translation strings through the web; let the marketing people do that.
  • I don't want to do Javascript TTW, just on the command line with webpack or whatever. I was always in favour of TTW, but here: no.

Don't reinvent wheels. For example, one of the biggest problems in computer science: writing responsive emails. Integrate something like Mosaico, which has solved this.

My practical wishlist for users:

  • TinyMCE: use Markdown and raw
  • tiles and layout recipes
  • Make 'embed' easy.

Give me more:

  • smart media handling, without letting me handle it manually
  • consistency in UI
  • cleverer collections: can we have something that says 'these three items are relevant for you'?

Team player:

  • Good import and export should be in core. JSON dump is fine.
  • REST and GraphQL: good if we can have both.
  • Office integration, when needed, for intranet sites, although security is hard there.

System:

  • For configuration can we move from XML to YAML? No, not JSON, as that is unreadable and unwritable too.
  • One Plone per Zope would be much easier.
  • A command line interface to do basic maintenance would be great.
  • roundtrip configuration: export config and import in another container. We sort of have this, but it is hard to find.
  • For power users, function is more important than form.
  • Warn me when updates are needed. Should be configurable. Probably not web downloadable updates, for security reasons.

Make it so!

Maybe you totally disagree; that is fine. Come talk to me and we can start a discussion. I am deeply partial to Plone, so let's make it better.

Nathan van Gheem and Ramon Navarro Bosch: guillotina and asyncio

published Oct 18, 2017

Talk by Nathan van Gheem and Ramon Navarro Bosch at the Plone Conference 2017 in Barcelona.

At first we had plone.server as the name, but it was not really Plone anymore, so we were looking for a new name and came up with guillotine. Then Ramon said we should call it guillotina, the Catalan name for it.

Some background. There are web frameworks like Angular and React. Server-side rendering frameworks are dying out: the front end just wants to talk to an API. We love Plone, but we wanted to be able to use it in high-performance situations with modern front ends.

History:

  • 18 years ago, Zope and the ZODB were created: an object oriented application server and database.
  • 16 years ago, Plone was built on top of them.
  • 7 years ago, Pyramid was created.
  • 2 years ago, the Plone REST API was created.
  • A bit more than one year ago, plone.server/guillotina was born.

Guillotina is an evolution, done in the spirit of Plone. It is not necessarily a replacement for Plone. We are taking lessons learned. We are okay with forking some code to change it to our needs. We are not going to provide everything out of the box. Zope and Plone are inspirations, they are in our heads, like the hierarchical data model. It is not a re-implementation of Plone, it is not plone.restapi compatible. It is an asynchronous REST API service.

We have transactions, with conflict resolution policies that reduce conflict errors and give better performance. We want to use the best database systems available: we support PostgreSQL and CockroachDB.

Information is organised in trees of objects. We love that in Plone and decided to keep it. Objects are resources, with schema attributes and annotations, OO inheritance, and static or dynamic behaviors. The data is serialised as JSON.

Other features:

  • Security: similar to how this is in Zope and Plone, but simplified.
  • All code is based on asyncio, for network integration with external indexers, databases, caching and services. It is built on aiohttp.
  • We want the installation to be simple. You can do pip install guillotina or docker run guillotina/guillotina.
  • You can configure how CORS should be handled.
  • It is easy to use web sockets.
  • It is perfect for micro services, for example running in lots of containers with Kubernetes.
  • It is extensible: it uses zope.interface and is based on zope.component, with utilities and adapters.
  • You can use cookiecutter to create a new project.
  • We have a persistent configuration registry for each container. A container is comparable with a Plone Site inside a Zope database.
  • You can point to a static directory for static files and javascript apps.
  • You can mount multiple databases at the same time.
  • We have open sourced our S3 and GCloud storage packages, and ElasticSearch indexing.
  • Automatic API documentation generation using Swagger. With guillotina-swagger you get clickable documentation where you can try everything out.
  • Give a container 200 MB of RAM and it will handle hundreds of requests per second.
  • URLs are like /db/container-name/obj1/obj2.
  • zope.interface is the only package that we kept from Zope/Plone.
  • Only Python 3.6 or higher.
  • Hopefully we get some reusable components that are also used for the Plone REST API.


Devon Bernard: How to design a great API using Flask

published Oct 18, 2017

Talk by Devon Bernard at the Plone Conference 2017 in Barcelona.

At Enlitic we study lots of medical data using computers. I hope to teach you how to create an API that developers and users love. If you are a backend developer, you can see the front end developers as your users. The API should be intuitive, flexible, and reliable.

Intuitive

Your API should not surprise your users.

Hard coded variables are bad. Keep runtime variables in one central place, so users can find them. Make it so that you do not need to juggle with git to avoid accidentally committing a password: search GitHub for 'removed password' and you will find over 300,000 commits with that commit message... Good: create a Config class, with a DevelopmentConfig and a ProductionConfig, and handle the differences there.
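
A minimal sketch of that pattern (class and setting names are illustrative):

    import os


    class Config(object):
        """Shared defaults; secrets come from the environment, not from git."""
        SECRET_KEY = os.environ.get('SECRET_KEY', '')


    class DevelopmentConfig(Config):
        DEBUG = True
        SQLALCHEMY_DATABASE_URI = 'sqlite:///dev.db'


    class ProductionConfig(Config):
        DEBUG = False
        SQLALCHEMY_DATABASE_URI = os.environ.get('DATABASE_URL', '')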

Use an object relational mapper (ORM) instead of writing SQL by hand, so you do not have to constantly worry about SQL injection.
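
A sketch with Flask-SQLAlchemy (the model is illustrative):

    from flask_sqlalchemy import SQLAlchemy

    db = SQLAlchemy()


    class User(db.Model):
        id = db.Column(db.Integer, primary_key=True)
        email = db.Column(db.String(255), unique=True, nullable=False)

    # Inside a request or app context, parameters are bound safely,
    # with no hand-built SQL strings:
    # user = User.query.filter_by(email='someone@example.com').first()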

How to standardise database migrations? A tool like alembic helps: it gives you a way to upgrade and downgrade. But please change alembic's file_template setting so that migration filenames start with a date instead of a random hash.
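
For example (a sketch; see the alembic docs for the available tokens), in alembic.ini:

    # Migration filenames now sort chronologically:
    file_template = %%(year)d-%%(month).2d-%%(day).2d_%%(rev)s_%%(slug)s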

Have a standard database seed that users can load to get a database with standard content.

Use a requirements.txt file that lists all the requirements you need. You can install packages from a git repository, and even from the local file system, which is especially handy when you develop two packages at the same time: you don't need to push every change to package A before you know whether it actually works with package B.
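
A sketch of such a file (package names and URLs are illustrative):

    Flask==0.12.2
    # Straight from a git repository:
    git+https://github.com/example/somepackage.git@v1.2#egg=somepackage
    # Editable install from the local file system, handy during development:
    -e ../packages/other-package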

Does the setup code in your README actually work? If you can replace it with a single command, that would be better.

Create unit tests. You may hear:

  • "But it wastes time." No, you still have to verify your code before you deploy it to production. And you do not just need to verify the new code, also the older code, as it may be affected by your change.
  • "But writing it is boring." Manually clicking to verify that it works, is even more boring.

Database flush versus commit. A flush reserves a placeholder spot in the database; a commit saves records in the database. If you explicitly commit halfway and something goes wrong later, that commit is already in the database. Usually you do not want this: either all changes should be written, or none at all. Also, use flush instead of commit when you need an auto increment id back: the id of the row that will be committed to the database.
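
A sketch, reusing the hypothetical User model from above:

    def create_user(email):
        user = User(email=email)
        db.session.add(user)
        db.session.flush()    # assigns user.id; the transaction stays open
        new_id = user.id      # usable for foreign keys right away
        # ... more changes in the same transaction ...
        db.session.commit()   # write everything at once, at the end
        return new_id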

Flexible

Create an app factory, so there is one way to instantiate your app. This also helps to avoid circular imports, for example when your API code depends on your database code and the other way around.
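
A minimal factory sketch (assuming the Flask-SQLAlchemy db object from above; the config path is illustrative):

    from flask import Flask


    def create_app(config_object='config.ProductionConfig'):
        """The one place where the app is instantiated."""
        app = Flask(__name__)
        app.config.from_object(config_object)
        db.init_app(app)  # binding extensions here avoids circular imports
        return app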

Create blueprints. For example, create one decorator for registering admin views and another for anonymous views. That makes it very clear which view is meant for which kind of user.
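
A sketch (blueprint names and routes are illustrative):

    from flask import Blueprint

    admin = Blueprint('admin', __name__, url_prefix='/admin')
    public = Blueprint('public', __name__)


    @admin.route('/stats')
    def stats():
        return 'admin-only statistics'


    @public.route('/')
    def index():
        return 'hello'

    # In the app factory:
    # app.register_blueprint(admin)
    # app.register_blueprint(public)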

Reliable

How to maximise uptime? Prevent downtime by not shipping bugs: have automated testing and staging servers.

Versioning: have a way to know which version is in use. Keep backwards compatibility. Communicate when you will drop a feature.

Get some analytics about API usage. If one of your front end developers is still calling a deprecated part of the API, detect this and send them an email.

First rule of endpoint design is: be consistent.

Document your API, or it does not exist.

A tool like Postman can show the endpoints and is helpful for debugging: paste in the exact payload that gave an error on production.

Use Python profiling: which lines of your code take the most time?
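
A small sketch with the standard library's cProfile (function-level rather than line-level timing; tools like line_profiler go deeper):

    import cProfile
    import pstats


    def profile_call(func, *args, **kwargs):
        profiler = cProfile.Profile()
        result = profiler.runcall(func, *args, **kwargs)
        stats = pstats.Stats(profiler).sort_stats('cumulative')
        stats.print_stats(10)  # show the ten most expensive calls
        return result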

Caching: store common responses in memory for quick retrieval.
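
A sketch with Flask-Caching (my choice of library; the talk did not name one):

    from flask import Flask, jsonify
    from flask_caching import Cache

    app = Flask(__name__)
    cache = Cache(app, config={'CACHE_TYPE': 'simple'})


    @app.route('/popular')
    @cache.cached(timeout=60)  # recompute at most once per minute
    def popular():
        return jsonify(items=[1, 2, 3])  # stand-in for an expensive query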

Find me on Twitter: @devonwbernard.