Indexes in catalog.xml considered harmful

published Dec 02, 2009, last modified Mar 16, 2022

Do not add indexes in catalog.xml. Do that in a separate import or upgrade step. Read on for how to do that; I will throw in some general GenericSetup best practices along the way.

Basic use of catalog.xml

Using GenericSetup you can add indexes and metadata columns to the portal_catalog with a catalog.xml file like this:

<?xml version="1.0"?>
<object name="portal_catalog">
  <index name="getSomething" meta_type="KeywordIndex">
    <indexed_attr value="getSomething"/>
  </index>
  <column value="getSomething"/>
</object>

Specifying an index adds an index for getSomething to the portal_catalog so you can search on it. Specifying a column adds getSomething to the metadata of the catalog brains, so you can ask a brain what its value for getSomething is. These are very different use cases, so before you add both an index and a column, consider whether you really need both or whether one of them is enough.
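To make the difference concrete, here is a minimal sketch. The function name values_for is made up for illustration, and 'catalog' is assumed to be the portal_catalog tool with both the getSomething index and the getSomething column in place.

```python
def values_for(catalog, keyword):
    """Search on the getSomething index, then read the getSomething column.

    Hypothetical helper: 'catalog' is assumed to be the portal_catalog tool.
    """
    # The index is what makes this query possible.
    brains = catalog(getSomething=keyword)
    # The metadata column is what makes this attribute available on the
    # brain, without waking up the real object.
    return [brain.getSomething for brain in brains]
```

If you only ever search on getSomething, the index alone is enough; if you only ever display it in listings, the column alone is enough.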

Anyway, specifying a column here is fine. Nothing wrong with it. Do note that when you add a column here this does not make getSomething available in the current brains in the catalog. You will need to do a reindex; a clear and rebuild of the catalog would do it, but it may be enough to find and reindex items of one specific content type that has this field. Depending on your specific situation this may or may not be an issue.
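Reindexing only one content type could look roughly like this. It is a sketch, not a tested recipe: the function name and the 'MyType' portal type in the usage line are made up, 'catalog' is assumed to be the portal_catalog tool, and I am assuming that reindexObject with a list of index names also refreshes the metadata of the brain, which is the default behaviour as far as I know.

```python
def reindex_content_type(catalog, portal_type, idxs):
    """Reindex all catalogued items of one portal type.

    Hypothetical helper: 'catalog' is assumed to be the portal_catalog
    tool, 'idxs' a list of index names to update.  Returns the number
    of items that were reindexed.
    """
    count = 0
    # Unrestricted search, so items are not skipped because of the
    # permissions of whoever runs the step.
    for brain in catalog.unrestrictedSearchResults(portal_type=portal_type):
        obj = brain.getObject()
        obj.reindexObject(idxs=idxs)
        count += 1
    return count
```

You would call it as reindex_content_type(catalog, 'MyType', ['getSomething']).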

What happens with indexes?

What is almost never a good idea, however, is specifying an index here. Doing so creates the index in the portal_catalog. The index is not filled automatically, so you have to reindex it manually (or write some code for that). But what happens the next time you reinstall your product or reapply your profile? The index gets removed and recreated. So the index is empty again and you need to reindex it manually once more. That is not very handy.

This might be fixable in the GenericSetup import handler for catalog.xml. But this is hard to do as it is currently not possible to verify without a doubt that the index that is currently in the portal_catalog has the same configuration as specified in the catalog.xml. For example, the id might be the same but the existing index might be a FieldIndex and catalog.xml might specify a KeywordIndex. This specific check might be doable, but there are other indexes for which this is not so simple.

Import handler

So, what do you do instead? You add an import handler. I have done that in several products, so instead of copy-pasting code from one of those products I might as well copy-paste it from my weblog. :-)

Write an import step in setuphandlers.py:

import logging
from Products.CMFCore.utils import getToolByName
# The profile id of your package:
PROFILE_ID = 'profile-your.package:default'


def add_catalog_indexes(context, logger=None):
    """Method to add our wanted indexes to the portal_catalog.

    @parameters:

    When called from the import_various method below, 'context' is
    the plone site and 'logger' is the portal_setup logger.  But
    this method can also be used as upgrade step, in which case
    'context' will be portal_setup and 'logger' will be None.
    """
    if logger is None:
        # Called as upgrade step: define our own logger.
        logger = logging.getLogger('your.package')

    # Run the catalog.xml step as that may have defined new metadata
    # columns.  We could instead add <depends name="catalog"/> to
    # the registration of our import step in zcml, but doing it in
    # code makes this method usable as upgrade step as well.  Note that
    # this silently does nothing when there is no catalog.xml, so it
    # is quite safe.
    setup = getToolByName(context, 'portal_setup')
    setup.runImportStepFromProfile(PROFILE_ID, 'catalog')

    catalog = getToolByName(context, 'portal_catalog')
    indexes = catalog.indexes()
    # Specify the indexes you want, with ('index_name', 'index_type')
    wanted = (('getSomething', 'FieldIndex'),
              ('getAnother', 'KeywordIndex'),
              )
    indexables = []
    for name, meta_type in wanted:
        if name not in indexes:
            catalog.addIndex(name, meta_type)
            indexables.append(name)
            logger.info("Added %s for field %s.", meta_type, name)
    if indexables:
        logger.info("Indexing new indexes %s.", ', '.join(indexables))
        catalog.manage_reindexIndex(ids=indexables)


def import_various(context):
    """Import step for configuration that is not handled in xml files.
    """
    # Only run step if a flag file is present
    if context.readDataFile('your_package-default.txt') is None:
        return
    logger = context.getLogger('your.package')
    site = context.getSite()
    add_catalog_indexes(site, logger)

If you need to replace an existing FieldIndex with a KeywordIndex this code is not enough, but we ignore that possibility here.
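For the simple case where only the index type differs, a sketch could look like this. The function name ensure_index is made up, 'catalog' is assumed to be the portal_catalog tool, and I am assuming that getIndexObjects, delIndex and addIndex behave as in a standard ZCatalog. Indexes that need extra arguments when they are created are not covered.

```python
def ensure_index(catalog, name, meta_type):
    """Make sure an index with this name and type exists.

    Hypothetical helper: 'catalog' is assumed to be the portal_catalog
    tool.  Returns True when the index was created or replaced and so
    still needs to be reindexed, False when it was already fine.
    """
    existing = {}
    for index in catalog.getIndexObjects():
        existing[index.getId()] = index.meta_type
    if existing.get(name) == meta_type:
        # Same name and same type: leave the filled index alone.
        return False
    if name in existing:
        # Same name but a different type: remove it first.
        catalog.delIndex(name)
    catalog.addIndex(name, meta_type)
    return True
```

The names for which this returns True would then be passed to catalog.manage_reindexIndex(ids=...), just like in add_catalog_indexes.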

The rest should be nothing new, but let's make it clear and explicit by showing everything here.

Register your GenericSetup code

I usually end up moving the registration of GenericSetup profiles, import steps and upgrade steps into a separate zcml file called profiles.zcml. We need to include that in our configure.zcml:

<include file="profiles.zcml" />

We register our profile and our steps in profiles.zcml:
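A profiles.zcml along the following lines would do that. Treat it as a minimal sketch: the titles, the descriptions and the source version 1000 are placeholders I made up here, and the handler paths assume the code above lives in your/package/setuphandlers.py.

```xml
<configure
    xmlns="http://namespaces.zope.org/zope"
    xmlns:genericsetup="http://namespaces.zope.org/genericsetup">

  <!-- The profile itself; title and description are placeholders. -->
  <genericsetup:registerProfile
      name="default"
      title="Your Package"
      directory="profiles/default"
      description="Extension profile for your.package"
      provides="Products.GenericSetup.interfaces.EXTENSION"
      />

  <!-- The import step that calls import_various. -->
  <genericsetup:importStep
      name="your.package.various"
      title="Your Package: miscellaneous import steps"
      description="Configuration that is not handled in xml files."
      handler="your.package.setuphandlers.import_various"
      />

  <!-- The upgrade step; source 1000 is an assumed previous version. -->
  <genericsetup:upgradeStep
      title="Add catalog indexes"
      description="Add and fill the indexes we want."
      source="1000"
      destination="1001"
      handler="your.package.setuphandlers.add_catalog_indexes"
      profile="your.package:default"
      />

</configure>
```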

  
  

  
  

  
  


metadata.xml

Create a profiles/default directory if this does not exist yet. This must have a metadata.xml file like this:

<?xml version="1.0"?>
<metadata>
  <version>1001</version>
</metadata>

The version number should be an integer. This profile version has nothing at all to do with the number in our version.txt or setup.py, but that is a different discussion. The destination number in our last upgrade step registration must match this metadata version.

Flag file

When you apply a GenericSetup profile or (re)install a product, every import step defined by any package is called. One step looks for a catalog.xml file within the profile directory of the profile that is being applied and exits if it is not there; another looks for a skins.xml and exits if it is not there. Our own import step must do the same, otherwise our code is executed far too often, even when our product is not installed.

As seen above, our import_various import handler starts with this check:

if context.readDataFile('your_package-default.txt') is None:
    return

So we must add a file with the name your_package-default.txt in profiles/default. The contents don't really matter; it can be something like this:

Flag file for the import handler of your.package

Proper catalog.xml

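In our case we assume we still want the extra metadata in the catalog brains, so instead of the catalog.xml from the beginning we will have this one, with only the column left:

<?xml version="1.0"?>
<object name="portal_catalog">
  <column value="getSomething"/>
</object>
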
Happy indexing!
