Matt Hamilton, Matt Sital-Singh - Transmogrifier: The good, the bad and the ugly
Talk during the Plone Conference 2012.
This is about content migration. A quick show of hands: most people in the room have heard of transmogrifier. About half have used it in some way.
collective.transmogrifier is the basis, with some companion tools that add recipes or extras, like plone.app.transmogrifier and quintagroup.transmogrifier. Simple cases are very well documented. There have been many talks about it, so Google for them, for example the ones by Lennart Regebro during Plone Conference 2009 and by Clayton Parker.
Most of those talks are about how easy it is. We are going to talk about the rest as well: some things that you should be aware of.
Content migration in itself is hard. 95 percent is straightforward, but then there are the corner cases, which are hard. You may have to stick to a short maintenance window. There are always some changes that people want on top of simply importing the old content. You may think you have everything covered, but seven folders deep some item is not exported for some reason and you only find out too late. The ZODB is a bit harder in that respect than a simple relational database.
There are a lot of recipes, blueprints for how to do the simple cases. Sometimes you only have one shot. For one client we had a weekend to migrate a large Plone 2.5 site to Plone 4. You should be prepared, and do some test runs before that. You need to know how long the migration takes, and you mostly cannot afford mistakes.
We have some code examples that probably should be moved to some of the upstream packages: https://github.com/netsight/netsight.transmogrifier
One of the things I tried to do was divide the migration up into more sensible chunks. So divide a site into useful sections, minimise cross references and clean up later. It could be as simple as first importing the users, then importing the content.
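As a rough illustration of splitting a site into sections, here is a minimal sketch in plain Python. It assumes the export is available as a list of content paths and that the first path segment is a useful boundary; both are assumptions, and `split_by_section` is a hypothetical helper, not part of any transmogrifier package.

```python
from collections import defaultdict


def split_by_section(paths):
    """Group content paths by their first path segment (hypothetical helper)."""
    batches = defaultdict(list)
    for path in paths:
        segments = path.strip("/").split("/")
        batches[segments[0]].append(path)
    return dict(batches)


# Made-up example paths, only for illustration.
paths = ["/news/2012/item-1", "/news/2011/item-2", "/about/contact", "/about"]
batches = split_by_section(paths)

# Each batch could then be migrated in its own run.
for section in sorted(batches):
    print(section, len(batches[section]))
```

Cross references between batches still need a cleanup pass afterwards, which this sketch does not cover.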
Killing the catalog is a good option. As you go along, Plone is merrily indexing and reindexing stuff. Cataloging is a large part of the time. So you could try out switching the catalog off and on again in the transmogrify pipeline.
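To show the idea without a running Plone site, here is a minimal sketch in plain Python. Transmogrifier sections are iterators that wrap the previous section, so the sketch uses generators. `FakeCatalog`, `create_content` and `deferred_indexing` are hypothetical stand-ins; how you actually switch off the real portal_catalog depends on your setup and is not shown here.

```python
class FakeCatalog:
    """Stand-in for the catalog; it only counts indexing calls."""

    def __init__(self):
        self.enabled = True
        self.index_calls = 0

    def index(self, path):
        if self.enabled:
            self.index_calls += 1


def create_content(items, catalog):
    """Hypothetical import step: creating and updating each trigger indexing."""
    for item in items:
        catalog.index(item["_path"])  # triggered on creation
        catalog.index(item["_path"])  # triggered again on field update
        yield item


def deferred_indexing(items, catalog):
    """Switch indexing off while items flow through, then index each once."""
    seen = []
    catalog.enabled = False
    try:
        for item in items:
            seen.append(item["_path"])
            yield item
    finally:
        catalog.enabled = True
    for path in seen:
        catalog.index(path)


source = [{"_path": "/news/a"}, {"_path": "/news/b"}, {"_path": "/about"}]
catalog = FakeCatalog()
for _ in deferred_indexing(create_content(source, catalog), catalog):
    pass
print(catalog.index_calls)
```

Because the generators are lazy, the import step only runs once the wrapper has already switched indexing off, so each item is indexed once at the end instead of on every change.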
Watch out for some corner cases. For example, the modification date of all items will be set to today when you are migrating.
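One way around this is to put the exported date back as the very last step, after everything that edits the object. The sketch below is plain Python with stand-ins: `FakeContent`, `import_items`, `restore_modified` and the `modification_date` key are assumptions for illustration, not the API of Plone or of any transmogrifier blueprint.

```python
from datetime import datetime


class FakeContent:
    """Stand-in for a content object: any edit stamps the current time."""

    def __init__(self):
        self.title = ""
        self.modified = None

    def edit(self, title):
        self.title = title
        self.modified = datetime.now()


def import_items(items):
    """Hypothetical import step; editing the object resets its date."""
    for item in items:
        obj = FakeContent()
        obj.edit(item["title"])
        item["_obj"] = obj
        yield item


def restore_modified(items, key="modification_date"):
    """Last step: put the exported date back without editing again."""
    for item in items:
        original = item.get(key)
        if original is not None:
            item["_obj"].modified = original
        yield item


exported = [
    {"title": "Old news", "modification_date": datetime(2009, 10, 28, 12, 0)},
]
migrated = list(restore_modified(import_items(exported)))
print(migrated[0]["_obj"].modified)
```

In a real site you would also have to make sure that a later reindex does not stamp the date again.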
Watch out for running out of memory. You may need to restart Zope halfway through so you free up memory, depending on which versions you are using.
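Short of restarting Zope, a common pattern is to do some housekeeping every so many items. This is a minimal sketch in plain Python; the `every` helper is hypothetical, and what the callback should do (for instance a ZODB transaction savepoint or a commit) is an assumption that depends on your versions.

```python
def every(items, n, callback):
    """Pass items through and call callback after every n-th item."""
    for count, item in enumerate(items, start=1):
        yield item
        # Runs once the consumer asks for the next item, so the
        # n-th item has already been handled downstream.
        if count % n == 0:
            callback()


calls = []
items = [{"_path": "/item-%d" % i} for i in range(5)]
processed = list(every(items, 2, lambda: calls.append("housekeeping")))
print(len(processed), len(calls))
```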
A migration can be a good time to get rid of some custom content types, when you have created some new types where in practice the standard types are actually good enough.
There are some dark corners. quintagroup.transmogrifier uses Marshall under the hood. This seemed to be okay, but we ran into some XML problems within Marshall, where the export format was different from the import format. Marshall was last touched in 2006, so it is hard to figure out if there was a reason for this. We have fixed some bugs there.
In netsight.transmogrifier we made some more dynamic pipelines, so look at those.
Takeaway: transmogrifier works. Some glue code needs to be published.
Steve: look at collective.jsonify instead of Marshall. Also: transmogrifying from Archetypes to Dexterity works too.
How do you verify that it works? We do manual inspection. You could use web spiders to check things.
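Next to manual inspection, a cheap automated check is to compare the list of paths in the old site with the list in the new one. The sketch below assumes you can get both lists, for example from a catalog query or a spider run; `compare_paths` and the sample paths are made up for illustration.

```python
def compare_paths(exported, imported):
    """Return (missing, unexpected) paths between the old and new site."""
    exported, imported = set(exported), set(imported)
    missing = sorted(exported - imported)
    unexpected = sorted(imported - exported)
    return missing, unexpected


old_site = ["/news", "/news/2012/item-1", "/a/b/c/d/e/f/g/deep-item"]
new_site = ["/news", "/news/2012/item-1"]

missing, unexpected = compare_paths(old_site, new_site)
print(missing)
```

This catches the item that silently went missing seven folders deep, but it says nothing about whether the content of each item came over correctly.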
Some things do not come over, like the portal_redirection tool and portal_archivist with older revisions.