Weblog

published Nov 03, 2021, last modified Nov 04, 2021

Philip Bauer: A new hope for migrations and upgrades

published Oct 27, 2021

Talk by Philip Bauer at the online Plone Conference 2021.

I've often argued for in-place migrations and worked hard to make them as easy as possible. The thing is: they are still hard, especially when you add Archetypes/Dexterity, Python 2/3, multilingual content, old add-ons and Volto to the hurdles to overcome. All of this changed when, early this year, a new star was born. It started as an idea to build a small wrapper around serializing and deserializing content using the REST API. Since then collective.exportimport has grown into a powerful tool with a beginner-friendly user interface. I will show how it is used for all kinds of small and large migrations and how it can solve even the edgiest edge cases.

I was tempted to call this talk: a new default for migrations and upgrades. But we should discuss that.

I have given a lot of talks about migrations and upgrades for Plone over the years. The two options so far were:

  • in-place migrations (the default)
  • transmogrifier

What if you could sidestep all those migration steps, and go to your target Plone version in one go? Prompted by Kim Nguyen, I started work on collective.exportimport.

See: https://github.com/collective/collective.exportimport

You can export:

  • Plone 4, 5, 6
  • Archetypes and Dexterity
  • Python 2 and 3
  • plone.app.multilingual and Products.LinguaPlone

Import:

  • Plone 5.2 and 6
  • Python 2 and 3
  • Dexterity

You can use this for migrations, but also for exporting just a part of your Plone Site.

It is built around plone.restapi. This takes care of most of the subtle details that might otherwise go wrong if you write something from scratch yourself. The REST API also makes it easy to customize parts of the process, especially the serializers and deserializers.
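To give a rough idea of the format (my own sketch, not from the talk): each exported item is essentially the plone.restapi serialization of the object, something like this:

# Rough sketch of a single exported item; the keys follow plone.restapi's
# serialization, but the exact fields depend on the content type and add-ons.
item = {
    "@id": "http://localhost:8080/Plone/news/some-news-item",
    "@type": "News Item",
    "UID": "d4f1e2abc123",  # placeholder UID
    "id": "some-news-item",
    "title": "Some news item",
    "description": "A short summary.",
    "review_state": "published",
    "text": {
        "content-type": "text/html",
        "data": "<p>The body text.</p>",
        "encoding": "utf-8",
    },
}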

Since version 1.0 of collective.exportimport came out, we have added:

  • export a complete site, instead of one content type at a time
  • export trees
  • export portlets
  • more options for blobs

Demo.

Since spring this year, we added support for "big data":

  • The export uses generators, and writes one item at a time, so you don't run out of memory on large sites.
  • You can export/import blob-paths so you can use the original blobstorage.
  • For import we use ijson to efficiently load large JSON files, again to avoid running out of memory (see the sketch after this list).
  • Commit after X items (work in progress)
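A minimal sketch of the ijson idea (my code, not the add-on's; the file name and the top-level array layout are assumptions):

# Stream items from a large export file one at a time, instead of loading
# the whole JSON document into memory with json.load().
import ijson

def iter_exported_items(path="Plone.json"):
    with open(path, "rb") as f:
        # "item" is ijson's prefix for the elements of a top-level array
        for item in ijson.items(f, "item"):
            yield item

for count, item in enumerate(iter_exported_items(), start=1):
    # ... create or update the content object here ...
    if count % 1000 == 0:
        print(f"Processed {count} items")  # a good place to commit a transaction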

Exporting a 10 GB Data.fs with 82 GB of blobs resulted in a 643 MB Plone.json file, plus much smaller other JSON files. Exporting took 30 minutes for the content, and 20 minutes for the other stuff, like portlets. Import took 6 hours for the content, and 1 hour for everything else. That is on my current computer. It would actually be slower, but I disable versioning on initial creation, which helps.

It is customizable. I decided not to use the Zope Component Architecture; you subclass the base class and use some hooks. Example of a hook to turn an old HelpCenter class into a standard folder:

def dict_hook_helpcenter(self, item):
    # Called for every exported HelpCenter item; return the modified dict.
    item["@type"] = "Folder"
    item["layout"] = "listing_view"
    return item
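For context, a minimal sketch of where such a hook lives: you subclass the export view from collective.exportimport (check the add-on's README for the exact import path and how to register the view):

# Sketch: the dict_hook_<portal_type> methods live on a subclass of the
# export view. The import path is an assumption; verify it against your
# installed version of collective.exportimport.
from collective.exportimport.export_content import ExportContent

class CustomExportContent(ExportContent):

    def dict_hook_helpcenter(self, item):
        # The hook shown above, in its subclass context.
        item["@type"] = "Folder"
        item["layout"] = "listing_view"
        return item

You then register this class as a browser view in your own package, so you use it instead of the stock @@export_content form.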

TODO:

  • Fix HTML. Between Plone 4 and 5, the HTML in richtext fields has changed a lot. For example, we need to add extra data attributes to link tags, otherwise they are not editable in TinyMCE. Also, the image scales have changed, which needs some changes in the HTML. This is work in progress (a rough sketch of the idea follows after this list).
  • Migrate plone.org.
  • Migrate Classic to Volto: html to draftjs/slate, tiles to blocks.
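To make the first TODO item concrete, here is a rough sketch of the kind of HTML fix-up meant (my guess at the approach, not code from the talk): Plone 5's TinyMCE expects internal links to carry data-linktype and data-val attributes, so something like this could add them to resolveuid links:

# Sketch only: add the data attributes that Plone 5's TinyMCE expects on
# internal links, assuming the hrefs already use the resolveuid/<uid> form.
from bs4 import BeautifulSoup

def fix_internal_links(html):
    soup = BeautifulSoup(html, "html.parser")
    for link in soup.find_all("a", href=True):
        href = link["href"]
        if "resolveuid/" in href:
            uid = href.split("resolveuid/")[-1].split("/")[0]
            link["data-linktype"] = "internal"
            link["data-val"] = uid
    return str(soup)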

I am not sure if collective.exportimport should be the new default. Usually in-place migration is fine, just not so much when migrating to Dexterity or to Python 3.

Watch the next talk by Fred, who will go into more detail.

Katie Shaw: From Opaque to Open: Untangling Apparel Supply Chains with Open Data

published Oct 27, 2021

Keynote talk by Katie Shaw at the online Plone Conference 2021.

The intro from the conference website:

The tragic Rana Plaza building collapse in 2013 revealed to the world how little many global brands knew about where their products were being made. Following the disaster, demand for supply chain disclosure in the apparel sector was heightened. However, in response to these calls for greater transparency, supply chain disclosure has been inconsistent, inaccessible, of poor and varied quality, and stored in siloed databases. The Open Apparel Registry (OAR) was built to address all these data challenges. At its heart, the OAR exists to drive improvements in data quality for the benefit of all stakeholders in the apparel sector. Powered by a sophisticated name- and address-matching algorithm, the tool creates one common, open registry of global facility names and addresses, with an industry standard facility ID.

Join us to learn more about the challenges facing the apparel sector, including low levels of technical exposure and understanding of open data; collaborative work that’s being done to educate the sector on the power of open data, including the launch of the Open Data Standard for the Apparel Sector (ODSAS), and examples of how data from the Open Apparel Registry being freely shared and used is creating meaningful changes in the lives of some of global society’s most oppressed people.

[Note from Maurits: apparel is a fancy word for clothes. I had to look it up.]

So what is the trouble with fashion anyway? Not high fashion, I just mean your daily clothes.

Fashion generates 2.5 trillion dollars in global annual revenues. Two of the richest people in the world are fashion industry magnates. "It takes a garment worker 18 months to earn what a fashion brand CEO earns during lunchtime."

In 2013 a garment factory collapsed in Dhaka. Workers were forced to go to work earlier on that day, although many saw cracks in the building.

Supply chains are complex. The label in your T-shirt may say "made in Vietnam" but parts of it may have been made in a completely different location. Maybe simply the buttons are from an entirely different country.

How can better and open data help? There are databases with factory addresses that we cannot match to an actual location. Visiting the factory to check working conditions, or to train people, is not possible if you cannot even find the building.

See the tool at https://openapparel.org

Each unique facility in the OAR (Open Apparel Registry) is allocated an ID number.

Technically, the biggest issue was: the data. (It's always the data...)

  • no industry-wide standards
  • often extracted from PDF
  • non-structured addresses (5 kilometers after the post office)

We had 50,000 existing facilities. Any new facility uploaded needed to be checked: maybe it is already there. This took far too long. We now use the dedupe Python package for this, which uses fuzzy matching. Much faster.
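A hedged sketch of what using dedupe looks like (based on the dedupe 2.x API; the field definitions, sample records and threshold are my placeholders, not the OAR's actual configuration):

# Sketch only: fuzzy matching of facility records with the dedupe package.
import dedupe

facilities = {
    1: {"name": "ABC Textiles Ltd", "address": "5 km after the post office, Dhaka"},
    2: {"name": "A.B.C. Textile Limited", "address": "5km past the post office, Dhaka"},
}

fields = [
    {"field": "name", "type": "String"},
    {"field": "address", "type": "String"},
]

deduper = dedupe.Dedupe(fields)
deduper.prepare_training(facilities)
dedupe.console_label(deduper)  # interactive labelling of candidate pairs
deduper.train()

# Group records that likely refer to the same facility
clusters = deduper.partition(facilities, threshold=0.5)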

We made an Open Data Standard for the Apparel Sector (ODSAS). We call for Open Data Principles in EU corporate sustainability reporting directive legislation.

Sign on and join: Clean Clothes Campaign, Open Apparel Registry, WikiRate, and more.

I want to share stories of the OAR in action.

Clean Clothes Campaign, where Plonista Paul Roeland works, is a global alliance dedicated to improving working conditions of workers in the apparel supply chain. It uses the OAR data in its Urgent Appeal work, in which it responds to concrete violations reported by workers and unions. For example, a union leader was sacked, but after an appeal by CCC he was reinstated after five days.

Our data is used to map which apparel facilities will be underwater in 2030. Researchers combined our data with sea level projections from the climate panel. The OAR provided a unique data-set for this work.

See our code here: https://github.com/open-apparel-registry

It's time to untangle supply chains!

For further reading, there are also books, especially Fashionopolis by Dana Thomas.

BHRRC (the Business & Human Rights Resource Centre) used our data in a case where workers did not get paid. In our data they found which brands were using this factory, they contacted them, and the case was resolved.

Lightning talks Tuesday

published Oct 26, 2021

Lightning talks on Tuesday at the online Plone Conference 2021.

Michael McFadden - The Many Layers of Radio Free Asia

We have 14 different languages on our site. Everybody wants to display something differently, and it is never enough.

How do we do it in Plone? Browser layers! Add-ons for everybody: a theme for the Koreans, for the Cantonese, etcetera.

English is our base theme. Typical. It changes basic things about Plone, and then the other add-ons register changes for their browser layers.
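For readers who have not used them: a browser layer is just a marker interface applied to the request when an add-on is installed; views and templates registered for that layer override the defaults. A minimal sketch (the name is hypothetical):

# Hypothetical marker interface for one of the language add-ons. It is
# typically applied to the request via a browserlayer.xml GenericSetup step;
# views and templates registered with layer=IKoreanSiteLayer then win over
# the base theme.
from zope.interface import Interface

class IKoreanSiteLayer(Interface):
    """Browser layer for the Korean site customizations."""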

Alexander Loechel - Regulation (EU) 2018/1724

This is about the "Single Digital Gateway" and the "Your Europe" project, a portal approach for citizen-centric services in the European Union: establish a single digital gateway to information, procedures, and so on. It is one of the many digitization efforts of the European Union. 21 procedures should be standardized and fully online in the entire EU, including requesting a birth certificate. Thousands of portals in the EU do this, but the EU wants to centralize it: the same procedures, a common UI. One goal: you don't need to inform twenty institutions that you have moved.

See https://europa.eu/youreurope

Plone is a content integration framework. It can play a role here. Meet on Friday in Open Space L.

Kim Nguyen - Plone in a Box™

I gave a presentation last year too. People want to try Plone, add some content, some image, show it to colleagues. Put it in the cloud.

I put together this repo: https://github.com/collective/plone-in-a-box

Why: make it easy to put Plone on a server, also for non-developers. You do have to create an account, for example at Amazon Web Services or Linode. After two and a half minutes you can have a Plone Site running on a new server. I want to get it working on DigitalOcean as well, and will sprint on it this weekend.

Lucas Aquino - UseCase - Plone in the Brazilian Superior Electoral Court

We started on Plone 3 in 2010, moved to Plone 4 in 2012, and started migrating to Plone 5 in 2021. The site has information on elections, explaining the situation. We had an election day this year. We do support and training.

See the English version of the site: https://english.tse.jus.br

Calvin Hendryx-Parker - Pyenv rocks and you should be using it

  • Do not use sudo.
  • Do not use the system Python, it is for the system.
  • Use pyenv.
# install pyenv itself (here with Homebrew; other install methods exist)
brew install pyenv
# show which version you use by default
pyenv global
# which do you have installed:
pyenv versions
# install one:
pyenv install 3.9.7
# In the current dir use another one:
pyenv local 3.9.5

pyenv plugins: virtualenv and virtualenvwrapper:

mkdir proj1; cd proj1
# create a virtualenv named proj1, based on Python 3.8.6
pyenv virtualenv 3.8.6 proj1
# use that virtualenv automatically whenever you are in this directory
pyenv local proj1

I did not have to activate or deactivate anything.

Paul Grunewald (Dresden University) - Keeping track of your Plone customization - a little helper

See my community post: https://community.plone.org/t/keeping-track-of-customizations/14232. I asked how people keep track of changes. How do you check for updates to files that you have customized, like templates that have received bugfixes upstream?

I have a package collective.patchwatcher, not released yet. You must create an overrides_info.py where you declare which overrides you have.

Run bin/patchwatcher. It shows where updates may be. It even tries to merge upstream changes.

[Interesting! MvR]

Michael McFadden: Plone Outputfilters and TransformChain (how to display things differently)

published Oct 26, 2021

Talk by Michael McFadden at the online Plone Conference 2021.

I work for Radio Free Asia.

Let's jump right into the problem. There should be some kind of 'hook' to modify 'stuff' before we send it to 'the browser'. For example: TinyMCE images, safe-html transforms, Mosaic blocks. Intercept the response, do something with it, and send it on.

The solutions in Plone: plone.outputfilters and plone.transformchain.

plone.outputfilters: tools for modifying field values when you get() them. Demo: add an image and then use this in a news item or document in TinyMCE. Save it, and you see the caption of the image. Where does this come from?

  • story.text.raw has just an image.
  • story.text.output suddenly has a caption.

plone/outputfilters/browser/configure.zcml has a browser page plone.outputfilters_captioned_image. If you don't like it, you can easily override it with jbot, overrides.zcml or a browser layer.
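For reference, the filters themselves are named adapters providing IFilter; a minimal sketch of that contract (the class and its behaviour are made up):

# Sketch of a plone.outputfilters filter: a named (context, request) adapter
# providing IFilter, applied in ascending "order" when a richtext field is
# transformed to its output mimetype.
from plone.outputfilters.interfaces import IFilter
from zope.interface import implementer

@implementer(IFilter)
class ExampleCaptionFilter(object):

    order = 1000  # higher numbers run later

    def __init__(self, context=None, request=None):
        self.context = context
        self.request = request

    def is_enabled(self):
        return True

    def __call__(self, data):
        # Inspect or rewrite the HTML here and return the result.
        return data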

For Radio Free Asia, we want AMP (Accelerated Mobile Pages). This is a special version of a web page that Google uses to quickly show news items, and Google caches it. We need a different caption for this, but only when there is an AMP request.

First try: special browser view, let it add an interface on the request, and let the snippet return something else.

Problem: the output is cached, so you have the original non-AMP answer in a Zope cache.

Plone Outputfilters hooks into the Products.PortalTransforms machinery. It registers its own mimetype and transform for the safe-html transform.

Solution: create a new mimetype text/x-html-safe-for-amp and transform policy. Manually call the transformer in the browser page. We now have two versions of the caption snippet in the cache.
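A hedged sketch of the "manually call the transformer" part, using the standard Products.PortalTransforms API; the mimetype is the one named in the talk, the rest is my guess at the shape of the view code:

# Sketch: run the richtext HTML through the custom AMP-safe transform policy.
# Assumes the text/x-html-safe-for-amp mimetype and its transform have been
# registered elsewhere.
from plone import api

def amp_body(context):
    transforms = api.portal.get_tool("portal_transforms")
    if not context.text:
        return ""
    stream = transforms.convertTo(
        "text/x-html-safe-for-amp",  # the custom target mimetype
        context.text.raw,
        mimetype=context.text.mimeType,
        context=context,
    )
    return stream.getData() if stream is not None else context.text.raw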

On to plone.transformchain. It provides methods to modify the response from a page before it is returned to the browser. It is used by plone.app.theming, plone.app.blocks, plone.protect, plone.app.caching, etc. They are called in a specific order.

At RFA, we use this to convert between alphabets: Uyghur text in Arabic script is transformed into Latin or Cyrillic.
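A minimal sketch of such a transform, using plone.transformchain's ITransform interface (the class name and the conversion itself are placeholders):

# Sketch of a plone.transformchain transform; it is registered as a named
# multi-adapter on (published object, request).
from plone.transformchain.interfaces import ITransform
from zope.interface import implementer

def convert_script(text):
    # Placeholder: a real implementation would transliterate the alphabet.
    return text

@implementer(ITransform)
class ScriptConversionTransform(object):

    # Transforms run in ascending order; pick a value after theming.
    order = 9000

    def __init__(self, published, request):
        self.published = published
        self.request = request

    def transformString(self, result, encoding):
        return None  # None means: not handled here

    def transformUnicode(self, result, encoding):
        return None

    def transformIterable(self, result, encoding):
        body = b"".join(result).decode(encoding or "utf-8")
        return [convert_script(body).encode(encoding or "utf-8")]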

We have some example code in a private repository; send me an email to get access.

Jens Klein: Forging a New Installer

published Oct 26, 2021

Talk by Jens Klein at the online Plone Conference 2021.

The old installer needs updating.

In the past we had the UnifiedInstaller. It tried to do everything in one: at first it even installed Python from scratch (not anymore), it was buildout-based, and it supported as many operating systems as possible. It has become too complex and won't be used for Plone 6.

In the future we want to use the standard tooling. There is tooling for backend Python and frontend Node. Python and Node themselves are widely available. Use pip and npm to install backend and frontend. Use containers / Docker.

So: no installer. But we have tooling.

  • Use pip install Plone
  • Use pip to install add-ons.
  • Use standard Zope scripts to create the instance.
  • Use the Volto generator (yo @plone/volto, installed via npm) to bootstrap your Volto frontend.

You may now say: "Stop! This is a lot to install on my machine!" So we use containers:

docker run plone-[webserver|frontend|backend|database]

Run it in Docker or Kubernetes. Inside it, we run:

  • webserver: nginx (or Traefik, Caddy)
  • frontend: Volto
  • backend: Plone application server
  • database: ZEO + blobstorage, or PostgreSQL (recommended) or MySQL/Oracle

It is easy to start with. You can do local development with containers. It integrates well with CI/CD processes. The container image is a frozen version of the whole application. So you have isolated environments and repeatable deployments.

Workflow example:

  • Two repos or directories with backend and frontend customizations
  • Dockerfile
  • Set up a GitLab CI/CD workflow to build containers and store them in a registry.
  • Auto deploy to testing/staging/live.

You could use Ansible and similar tools (I won't).

Disclaimer: this is work in progress. Official container images are on their way.

This needs help.

  • Try the new setup.
  • Ask questions at community.plone.org
  • Is it inconvenient? Tell us how to improve.
  • Have you found a problem? Create an issue.
  • Have you fixed a problem? Create a pull request.