Weblog

published Nov 03, 2021, last modified Nov 04, 2021

Philip Bauer: A new hope for migrations and upgrades

published Oct 27, 2021

Talk by Philip Bauer at the online Plone Conference 2021.

I've often argued for in-place migrations and worked hard to make them as easy as possible. The thing is: they are still hard, especially when you add Archetypes/Dexterity, Python 2/3, multilingual content, old add-ons and Volto to the hurdles to overcome. All of this changed when, early this year, a new star was born. It started as an idea to build a small wrapper around serializing and deserializing content using the REST API. Since then collective.exportimport has grown into a powerful tool with a beginner-friendly user interface. I will show how it is used for all kinds of small and large migrations and how it can solve even the edgiest edge cases.

I was tempted to call this talk: a new default for migrations and upgrades. But we should discuss that.

I have given a lot of talks about migrations and upgrades for Plone over the years. The two options so far were:

  • in-place migrations (the default)
  • transmogrifier

What if you could sidestep all those migration steps, and go to your target Plone version in one go? Prompted by Kim Nguyen, I started work on collective.exportimport.

See: https://github.com/collective/collective.exportimport

You can export:

  • Plone 4, 5, 6
  • Archetypes and Dexterity
  • Python 2 and 3
  • plone.app.multilingual and Products.LinguaPlone

Import:

  • Plone 5.2 and 6
  • Python 2 and 3
  • Dexterity

You can use this for migrations, but also for exporting just a part of your Plone Site.

It is built around plone.restapi. This takes care of most of the subtle details that might otherwise go wrong if you write something from scratch yourself. The REST API also makes it easy to customize parts of the process, especially the serializers and deserializers.
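To give a rough idea of the format (my own sketch, not from the talk): each exported item is essentially the plone.restapi serialization of the object, something like this:

# Rough sketch of a single exported item; the keys follow plone.restapi's
# serialization, but the exact fields depend on the content type and add-ons.
item = {
    "@id": "http://localhost:8080/Plone/news/some-news-item",
    "@type": "News Item",
    "UID": "d4f1e2abc123",  # placeholder UID
    "id": "some-news-item",
    "title": "Some news item",
    "description": "A short summary.",
    "review_state": "published",
    "text": {
        "content-type": "text/html",
        "data": "<p>The body text.</p>",
        "encoding": "utf-8",
    },
}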

Since version 1.0 of collective.exportimport came out, we have added:

  • export a complete site, instead of one content type at a time
  • export trees
  • export portlets
  • more options for blobs

Demo.

Since spring this year, we added support for "big data":

  • The export uses generators, and writes one item at a time, so you don't run out of memory on large sites.
  • You can export/import blob-paths so you can use the original blobstorage.
  • For import we use ijson to efficiently load large JSON files, again to avoid running out of memory (see the sketch after this list).
  • Commit after X items (work in progress)
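A minimal sketch of the ijson idea (my code, not the add-on's; the file name and the top-level array layout are assumptions):

# Stream items from a large export file one at a time, instead of loading
# the whole JSON document into memory with json.load().
import ijson

def iter_exported_items(path="Plone.json"):
    with open(path, "rb") as f:
        # "item" is ijson's prefix for the elements of a top-level array
        for item in ijson.items(f, "item"):
            yield item

for count, item in enumerate(iter_exported_items(), start=1):
    # ... create or update the content object here ...
    if count % 1000 == 0:
        print(f"Processed {count} items")  # a good place to commit a transaction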

Exporting a 10 GB Data.fs with 82 GB of blobs resulted in a 643 MB Plone.json file, plus much smaller other JSON files. Exporting took 30 minutes for the content, and 20 minutes for the other stuff, like portlets. Import took 6 hours for the content, and 1 hour for everything else. That is on my current computer. It would actually be slower, but I disable versioning on initial creation, which helps.

It is customizable. I decided not to use the Zope Component Architecture; you subclass the base class and use some hooks. Example of a hook to turn an old HelpCenter class into a standard folder:

def dict_hook_helpcenter(self, item):
    # Called for every exported HelpCenter item; return the modified dict.
    item["@type"] = "Folder"
    item["layout"] = "listing_view"
    return item
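For context, a minimal sketch of where such a hook lives: you subclass the export view from collective.exportimport (check the add-on's README for the exact import path and how to register the view):

# Sketch: the dict_hook_<portal_type> methods live on a subclass of the
# export view. The import path is an assumption; verify it against your
# installed version of collective.exportimport.
from collective.exportimport.export_content import ExportContent

class CustomExportContent(ExportContent):

    def dict_hook_helpcenter(self, item):
        # The hook shown above, in its subclass context.
        item["@type"] = "Folder"
        item["layout"] = "listing_view"
        return item

You then register this class as a browser view in your own package, so you use it instead of the stock @@export_content form.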

TODO:

  • Fix HTML. Between Plone 4 and 5, the HTML in richtext fields has changed a lot. For example, we need to add extra data attributes to link tags, otherwise they are not editable in TinyMCE. Also, the image scales have changed, which needs some changes in the HTML. This is work in progress (a rough sketch of the idea follows after this list).
  • Migrate plone.org.
  • Migrate Classic to Volto: html to draftjs/slate, tiles to blocks.
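To make the first TODO item concrete, here is a rough sketch of the kind of HTML fix-up meant (my guess at the approach, not code from the talk): Plone 5's TinyMCE expects internal links to carry data-linktype and data-val attributes, so something like this could add them to resolveuid links:

# Sketch only: add the data attributes that Plone 5's TinyMCE expects on
# internal links, assuming the hrefs already use the resolveuid/<uid> form.
from bs4 import BeautifulSoup

def fix_internal_links(html):
    soup = BeautifulSoup(html, "html.parser")
    for link in soup.find_all("a", href=True):
        href = link["href"]
        if "resolveuid/" in href:
            uid = href.split("resolveuid/")[-1].split("/")[0]
            link["data-linktype"] = "internal"
            link["data-val"] = uid
    return str(soup)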

I am not sure if collective.exportimport should be the new default. Usually in-place migration is fine, just not so much when migrating to Dexterity or to Python 3.

Watch the next talk by Fred, who will go into more detail.

Katie Shaw: From Opaque to Open: Untangling Apparel Supply Chains with Open Data

published Oct 27, 2021

Keynote talk by Katie Shaw at the online Plone Conference 2021.

The intro from the conference website:

The tragic Rana Plaza building collapse in 2013 revealed to the world how little many global brands knew about where their products were being made. Following the disaster, demand for supply chain disclosure in the apparel sector was heightened. However, in response to these calls for greater transparency, supply chain disclosure has been inconsistent, inaccessible, of poor and varied quality, and stored in siloed databases. The Open Apparel Registry (OAR) was built to address all these data challenges. At its heart, the OAR exists to drive improvements in data quality for the benefit of all stakeholders in the apparel sector. Powered by a sophisticated name- and address-matching algorithm, the tool creates one common, open registry of global facility names and addresses, with an industry standard facility ID.

Join us to learn more about the challenges facing the apparel sector, including low levels of technical exposure and understanding of open data; collaborative work that’s being done to educate the sector on the power of open data, including the launch of the Open Data Standard for the Apparel Sector (ODSAS), and examples of how data from the Open Apparel Registry being freely shared and used is creating meaningful changes in the lives of some of global society’s most oppressed people.

[Note from Maurits: apparel is a fancy word for clothes. I had to look it up.]

So what is the trouble with fashion anyway? Not high fashion, I just mean your daily clothes.

Fashion generates 2.5 trillion dollars in global annual revenues. Two of the richest people in the world are fashion industry magnates. "It takes a garment worker 18 months to earn what a fashion brand CEO earns during lunchtime."

In 2013 a garment factory collapsed in Dhaka. Workers were forced to go to work earlier on that day, although many saw cracks in the building.

Supply chains are complex. The label in your T-shirt may say "made in Vietnam" but parts of it may have been made in a completely different location. Maybe simply the buttons are from an entirely different country.

How can better and open data help? There are databases with factory addresses that we cannot match to an actual location. Visiting the factory to check working conditions, or to train people, is not possible if you cannot even find the building.

See the tool at https://openapparel.org

Each unique facility in the OAR (Open Apparel Registry) is allocated an ID number.

Technically, the biggest issue was: the data. (It's always the data...)

  • no industry-wide standards
  • often extracted from PDF
  • non-structured addresses (5 kilometers after the post office)

We had 50,000 existing facilities. Any new facility uploaded needed to be checked: maybe it is already there. This took far too long. We now use the dedupe Python package for this, which uses fuzzy matching. Much faster.
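A hedged sketch of what using dedupe looks like (based on the dedupe 2.x API; the field definitions, sample records and threshold are my placeholders, not the OAR's actual configuration):

# Sketch only: fuzzy matching of facility records with the dedupe package.
import dedupe

facilities = {
    1: {"name": "ABC Textiles Ltd", "address": "5 km after the post office, Dhaka"},
    2: {"name": "A.B.C. Textile Limited", "address": "5km past the post office, Dhaka"},
}

fields = [
    {"field": "name", "type": "String"},
    {"field": "address", "type": "String"},
]

deduper = dedupe.Dedupe(fields)
deduper.prepare_training(facilities)
dedupe.console_label(deduper)  # interactive labelling of candidate pairs
deduper.train()

# Group records that likely refer to the same facility
clusters = deduper.partition(facilities, threshold=0.5)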

We made an Open Data Standard for the Apparel Sector (ODSAS). We call for Open Data Principles in EU corporate sustainability reporting directive legislation.

Sign on and join: Clean Clothes Campaign, Open Apparel Registry, WikiRate, and more.

I want to share stories of the OAR in action.

Clean Clothes Campaign, where Plonista Paul Roeland works, is a global alliance dedicated to improving working conditions of workers in the apparel supply chain. It uses the OAR data in its Urgent Appeal work, in which it responds to concrete violations reported by workers and unions. For example, a union leader was sacked, but after an appeal by CCC he was reinstated after five days.

Our data is used to map which apparel facilities will be underwater in 2030. Researchers combined our data with sea level projections from the climate panel. The OAR provided a unique data-set for this work.

See our code here: https://github.com/open-apparel-registry

It's time to untangle supply chains!

For further reading, there are also books, especially Fashionopolis by Dana Thomas.

BHRRC (the Business & Human Rights Resource Centre) used our data in a case where workers did not get paid. In our data they found which brands were using this factory, they contacted them, and the case was resolved.

Lightning talks Tuesday

published Oct 26, 2021

Lightning talks on Tuesday at the online Plone Conference 2021.

Michael McFadden - The Many Layers of Radio Free Asia

We have 14 different languages on our site. Everybody wants to display something differently, and it is never enough.

How do we do it in Plone? Browser layers! Add-ons for everybody: a theme for the Koreans, for the Cantonese, etcetera.

English is our base theme. Typical. It changes basic things about Plone, and then the other add-ons register changes for their browser layers.
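For readers who have not used them: a browser layer is just a marker interface applied to the request when an add-on is installed; views and templates registered for that layer override the defaults. A minimal sketch (the name is hypothetical):

# Hypothetical marker interface for one of the language add-ons. It is
# typically applied to the request via a browserlayer.xml GenericSetup step;
# views and templates registered with layer=IKoreanSiteLayer then win over
# the base theme.
from zope.interface import Interface

class IKoreanSiteLayer(Interface):
    """Browser layer for the Korean site customizations."""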

Alexander Loechel - Regulation (EU) 2018/1724

This is about the "Single Digital Gateway" and the "Your Europe" project, a portal approach for citizen-centric services in the European Union: establish a single digital gateway to information, procedures, and so on. It is one of the many digitization efforts of the European Union. 21 procedures should be standardized and fully online in the entire EU, including requesting a birth certificate. Thousands of portals in the EU do this, but the EU wants to centralize it: the same procedures, a common UI. One goal: you don't need to inform twenty institutions that you have moved.

See https://europa.eu/youreurope

Plone is a content integration framework. It can play a role here. Meet on Friday in Open Space L.

Kim Nguyen - Plone in a Box™

I gave a presentation last year too. People want to try Plone, add some content, some image, show it to colleagues. Put it in the cloud.

I put together this repo: https://github.com/collective/plone-in-a-box

Why: make it easy to put Plone on a server, also for non-developers. You do have to create an account, for example at Amazon Web Services or Linode. After two and a half minutes you can have a Plone Site running on a new server. I want to get it working on DigitalOcean as well, and will sprint on it this weekend.

Lucas Aquino - UseCase - Plone in the Brazilian Superior Electoral Court

We started on Plone 3 in 2010, moved to Plone 4 in 2012, and started migrating to Plone 5 in 2021. The site has information on elections, explaining the situation. We had an election day this year. We do support and training.

See the English version of the site: https://english.tse.jus.br

Calvin Hendryx-Parker - Pyenv rocks and you should be using it

  • Do not use sudo.
  • Do not use the system Python, it is for the system.
  • Use pyenv.
# install pyenv itself (here with Homebrew; other install methods exist)
brew install pyenv
# show which version you use by default
pyenv global
# which do you have installed:
pyenv versions
# install one:
pyenv install 3.9.7
# In the current dir use another one:
pyenv local 3.9.5

pyenv plugins: virtualenv and virtualenvwrapper:

mkdir proj1; cd proj1
# create a virtualenv named proj1, based on Python 3.8.6
pyenv virtualenv 3.8.6 proj1
# use that virtualenv automatically whenever you are in this directory
pyenv local proj1

I did not have to activate or deactivate anything.

Paul Grunewald (Dresden University) - Keeping track of your Plone customization - a little helper

See my community post: https://community.plone.org/t/keeping-track-of-customizations/14232. I asked how people keep track of changes. How do you check for updates to files that you have customized, like templates that have received bugfixes upstream?

I have a package collective.patchwatcher, not released yet. You must create an overrides_info.py where you declare which overrides you have.

Run bin/patchwatcher. It shows where updates may be. It even tries to merge upstream changes.

[Interesting! MvR]

Michael McFadden: Plone Outputfilters and TransformChain (how to display things differently)

published Oct 26, 2021

Talk by Michael McFadden at the online Plone Conference 2021.

I work for Radio Free Asia.

Let's jump right into the problem. There should be some kind of 'hook' to modify 'stuff' before we send it to 'the browser'. For example: TinyMCE images, safe-html transforms, Mosaic blocks. Intercept the response, do something with it, and send it on.

The solutions in Plone: plone.outputfilters and plone.transformchain.

plone.outputfilters: tools for modifying field values when you get() them. Demo: add an image and then use this in a news item or document in TinyMCE. Save it, and you see the caption of the image. Where does this come from?

  • story.text.raw has just an image.
  • story.text.output suddenly has a caption.

plone/outputfilters/browser/configure.zcml has a browser page plone.outputfilters_captioned_image. If you don't like it, you can easily override it with jbot, overrides.zcml or a browser layer.
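For reference, the filters themselves are named adapters providing IFilter; a minimal sketch of that contract (the class and its behaviour are made up):

# Sketch of a plone.outputfilters filter: a named (context, request) adapter
# providing IFilter, applied in ascending "order" when a richtext field is
# transformed to its output mimetype.
from plone.outputfilters.interfaces import IFilter
from zope.interface import implementer

@implementer(IFilter)
class ExampleCaptionFilter(object):

    order = 1000  # higher numbers run later

    def __init__(self, context=None, request=None):
        self.context = context
        self.request = request

    def is_enabled(self):
        return True

    def __call__(self, data):
        # Inspect or rewrite the HTML here and return the result.
        return data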

For Radio Free Asia, we want AMP (Accelerated Mobile Pages). This is a special version of a web page that Google uses to quickly show news items, and Google caches it. We need a different caption for this, but only when there is an AMP request.

First try: special browser view, let it add an interface on the request, and let the snippet return something else.

Problem: the output is cached, so you have the original non-AMP answer in a Zope cache.

Plone Outputfilters hooks into the Products.PortalTransforms machinery. It registers its own mimetype and transform for the safe-html transform.

Solution: create a new mimetype text/x-html-safe-for-amp and transform policy. Manually call the transformer in the browser page. We now have two versions of the caption snippet in the cache.
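A hedged sketch of the "manually call the transformer" part, using the standard Products.PortalTransforms API; the mimetype is the one named in the talk, the rest is my guess at the shape of the view code:

# Sketch: run the richtext HTML through the custom AMP-safe transform policy.
# Assumes the text/x-html-safe-for-amp mimetype and its transform have been
# registered elsewhere.
from plone import api

def amp_body(context):
    transforms = api.portal.get_tool("portal_transforms")
    if not context.text:
        return ""
    stream = transforms.convertTo(
        "text/x-html-safe-for-amp",  # the custom target mimetype
        context.text.raw,
        mimetype=context.text.mimeType,
        context=context,
    )
    return stream.getData() if stream is not None else context.text.raw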

On to plone.transformchain. It provides methods to modify the response from a page before it is returned to the browser. It is used by plone.app.theming, plone.app.blocks, plone.protect, plone.app.caching, etc. They are called in a specific order.

At RFA, we use this to convert between alphabets: Uyghur text in Arabic script is transformed into Latin or Cyrillic.
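A minimal sketch of such a transform, using plone.transformchain's ITransform interface (the class name and the conversion itself are placeholders):

# Sketch of a plone.transformchain transform; it is registered as a named
# multi-adapter on (published object, request).
from plone.transformchain.interfaces import ITransform
from zope.interface import implementer

def convert_script(text):
    # Placeholder: a real implementation would transliterate the alphabet.
    return text

@implementer(ITransform)
class ScriptConversionTransform(object):

    # Transforms run in ascending order; pick a value after theming.
    order = 9000

    def __init__(self, published, request):
        self.published = published
        self.request = request

    def transformString(self, result, encoding):
        return None  # None means: not handled here

    def transformUnicode(self, result, encoding):
        return None

    def transformIterable(self, result, encoding):
        body = b"".join(result).decode(encoding or "utf-8")
        return [convert_script(body).encode(encoding or "utf-8")]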

We have some example code in a private repository; send me an email to get access.

Jens Klein: Forging a New Installer

published Oct 26, 2021

Talk by Jens Klein at the online Plone Conference 2021.

The old installer needs updating.

In the past we had the UnifiedInstaller. It tried to do everything in one: at first it even installed Python from scratch (not anymore), it was buildout-based, and it supported as many operating systems as possible. It has become too complex and won't be used for Plone 6.

In the future we want to use the standard tooling. There is tooling for backend Python and frontend Node. Python and Node themselves are widely available. Use pip and npm to install backend and frontend. Use containers / Docker.

So: no installer. But we have tooling.

  • Use pip install Plone
  • Use pip to install add-ons.
  • Use standard Zope scripts to create the instance.
  • Use the Volto generator (yo @plone/volto, installed via npm) to bootstrap your Volto frontend.

You may now say: "Stop! This is a lot to install on my machine!" So we use containers:

docker run plone-[webserver|frontend|backend|database]

Run it in Docker or Kubernetes. Inside it, we run:

  • webserver: nginx (or Traefik, Caddy)
  • frontend: Volto
  • backend: Plone application server
  • database: ZEO + blobstorage, or PostgreSQL (recommended) or MySQL/Oracle

It is easy to start with. You can do local development with containers. It integrates well with CI/CD processes. The container image is a frozen version of the whole application. So you have isolated environments and repeatable deployments.

Workflow example:

  • Two repos or directories with backend and frontend customizations
  • Dockerfile
  • Set up a GitLab CI/CD workflow to build containers and store them in a registry.
  • Auto deploy to testing/staging/live.

You could use Ansible and similar tools (I won't).

Disclaimer: this is work in progress. Official container images are on their way.

This needs help.

  • Try the new setup.
  • Ask questions at community.plone.org
  • Is it inconvenient? Tell us how to improve.
  • Have you found a problem? Create an issue.
  • Have you fixed a problem? Create a pull request.