Weblog

published Nov 03, 2021, last modified Nov 04, 2021

How to minimize CPU and memory usage of Zope and Plone applications

published Oct 10, 2007, last modified Jan 30, 2008

Plone conference 2007, Naples. Speaker: Gael Le Mignot Company: Pilot Systems

Why this talk? Plone is powerful but slow. We developers tend to forget about optimization: "Computers are powerful, right?" Yes, but Plone can be slow, so we need to look at that. You can do several things.

  • Use fast algorithms, especially for code that is called often. In one test, switching an algorithm from a list to a set sped things up by a factor of 2700.
  • Trust the base python code, instead of rolling your own functions.
  • Compute once, use often. (So use plone.memoize for instance.)
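To make the list versus set point concrete, here is a minimal sketch that is not from the talk. The function names and data sizes are my own choices for illustration, and the speedup you measure depends on the data, so do not expect the factor of 2700 from the talk.

```python
import timeit


def count_hits_list(haystack, needles):
    """Count needles present in haystack; each lookup scans the list."""
    return sum(1 for n in needles if n in haystack)


def count_hits_set(haystack, needles):
    """Same count, but build a set once so each lookup is a hash probe."""
    lookup = set(haystack)
    return sum(1 for n in needles if n in lookup)


haystack = list(range(2000))
needles = list(range(0, 4000, 4))

list_time = timeit.timeit(lambda: count_hits_list(haystack, needles), number=3)
set_time = timeit.timeit(lambda: count_hits_set(haystack, needles), number=3)
print("list: %.4fs  set: %.4fs" % (list_time, set_time))
```

Both functions return the same answer; only the data structure used for the membership test differs.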

How do you know how much time your code takes?

  • Write a decorator printing the time taken by its function execution.
  • On Unix systems, use resource.getrusage
    • Should you look at system or user time? user time is the time spent in your python/zope code.
    • Unix does not give you per-thread accounting. So you could use just one thread during testing. But that steers you away from real life.
  • Due to caching etcetera a second test will probably give different results.
    • Fresh load strategy: test on a freshly started zope
    • Second test strategy: run the test twice and throw away the first result.
  • DeadlockDebugger: meant for debugging deadlocks. But you can also use it for profiling.
  • While running the benchmarks, do not surf the web or otherwise use your computer.
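A timing decorator along the lines mentioned above could look like the sketch below. It is my own illustration, not code from the talk: the names timed and slow_sum are made up, and resource is only available on Unix. Note that RUSAGE_SELF covers the whole process, which matches the remark that you do not get per-thread accounting.

```python
import functools
import resource
import time


def timed(func):
    """Print wall-clock time and user CPU time for each call of func."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        wall_start = time.time()
        cpu_start = resource.getrusage(resource.RUSAGE_SELF).ru_utime
        try:
            return func(*args, **kwargs)
        finally:
            # ru_utime is user time for the whole process, not per thread.
            cpu = resource.getrusage(resource.RUSAGE_SELF).ru_utime - cpu_start
            wall = time.time() - wall_start
            print("%s: wall %.4fs, user CPU %.4fs" % (func.__name__, wall, cpu))
    return wrapper


@timed
def slow_sum(n):
    return sum(i * i for i in range(n))


# Second test strategy: run twice and only trust the second measurement.
slow_sum(100000)
result = slow_sum(100000)
```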

Write efficient Zope code:

  • Use the catalog, but do not store too much in it, else it gets too big.
  • Page templates and python scripts are slow.
  • Content types, tools and python code in general are faster.
  • Use caching with decorators for slow methods. Do not store zope objects in the cache, but simple python types. Think about what you put in the cache key. Roles? User ids?
  • Think about expiry of the cache, otherwise your memory will run full. Options:
    • keep it for a limited time
    • store a maximum number of objects
    • use LRU (or scoring)
    • restart Zope
    • Have some button you can push to flush the cache.
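The caching advice above can be sketched in plain Python. This is not GenericCache and not plone.memoize; SimpleCache, cached and render_title are hypothetical names I made up to show a time limit, a maximum number of objects with LRU eviction, a flush method, and a cache key that includes the user id.

```python
import functools
import time
from collections import OrderedDict

_MISSING = object()


class SimpleCache(object):
    """Hypothetical in-memory cache: time limit, size limit (LRU), flush."""

    def __init__(self, max_items=100, max_age=60.0):
        self.max_items = max_items
        self.max_age = max_age
        self._data = OrderedDict()  # key -> (timestamp, value)

    def get(self, key, default=None):
        entry = self._data.get(key)
        if entry is None:
            return default
        stamp, value = entry
        if time.time() - stamp > self.max_age:
            del self._data[key]  # kept for a limited time only
            return default
        self._data.move_to_end(key)  # mark as most recently used
        return value

    def set(self, key, value):
        self._data[key] = (time.time(), value)
        self._data.move_to_end(key)
        while len(self._data) > self.max_items:
            self._data.popitem(last=False)  # evict the least recently used

    def flush(self):
        """The 'button you can push' to empty the cache."""
        self._data.clear()


def cached(cache, key_func):
    """Decorator that caches results under a key built by key_func."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args):
            key = (func.__name__,) + tuple(key_func(*args))
            value = cache.get(key, _MISSING)
            if value is _MISSING:
                value = func(*args)
                cache.set(key, value)
            return value
        return wrapper
    return decorator


cache = SimpleCache(max_items=2, max_age=300.0)
calls = []  # records real computations, only to make cache hits visible


@cached(cache, key_func=lambda user_id, path: (user_id, path))
def render_title(user_id, path):
    calls.append((user_id, path))
    return "%s sees %s" % (user_id, path)  # a plain string, not a Zope object
```

Calling render_title twice with the same user and path computes the value only once. Whether the user id belongs in the key depends on whether the result really differs per user.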

Pilot Systems is releasing GenericCache, a GPLed pure python cache, so you may want to look at that.

Conflict errors:

  • With threads and transactions, nothing is written to the ZODB until the end. So when your transaction takes too long, another transaction may commit earlier, which causes your code to run again, slowing you down even more.
  • Try not to change too many objects in the same transaction.
  • But look out for resulting consistency errors. Look for a good point to commit the transaction which will not leave your data in an inconsistent state.
  • Some conflicts can be resolved. For instance: for the most recent access date you can just pick the most recent. Use the _p_resolveConflict method for those cases. It takes old state, saved state and current state as input. It is up to you to use it.
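For the access date example, a conflict resolver could look roughly like this. It is a sketch under assumptions: AccessStamp and last_access are names I made up, and in a real Zope application the class would subclass persistent.Persistent, which I leave out here so the example runs without ZODB installed. The three states are plain dictionaries of the object's attributes.

```python
class AccessStamp(object):
    """Hypothetical content object; would subclass persistent.Persistent."""

    def __init__(self):
        self.last_access = 0.0

    def touch(self, timestamp):
        self.last_access = timestamp

    def _p_resolveConflict(self, old_state, saved_state, new_state):
        # old_state: state both transactions started from.
        # saved_state: state committed by the other transaction.
        # new_state: state this transaction is trying to commit.
        resolved = dict(new_state)
        resolved["last_access"] = max(
            saved_state["last_access"], new_state["last_access"])
        return resolved
```

The method returns the state that should be stored, so the transaction does not need to be retried for this kind of conflict.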

Memory freeing may kick in too late, wasting memory that you know can be freed. This is made difficult by the several layers stacked on top of each other: the Python garbage collector, C libraries and the operating system.

Swapping memory: the size is no real problem. The problem is the frequency with which things are fetched from disk to memory.

  • Do I have a memory problem? Use vmstat 2 on the command line to look at your virtual memory state. Interesting columns in its output: si and so (swap in and swap out) on Linux, pi and po (page in and page out) on FreeBSD.
  • Use the gc (garbage collection) module of python. Get a list of objects, check for uncollectable cycles, ask for manual collection.
  • Monkey patch malloc. Track malloc/free calls with ltrace. In the GNU C library write hooks to malloc.
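The three uses of the gc module mentioned above (listing objects, checking for uncollectable objects, asking for a manual collection) can be tried out with a small sketch. Node, make_cycle and count_instances are names I made up for the illustration.

```python
import gc


class Node(object):
    def __init__(self):
        self.other = None


def make_cycle():
    a, b = Node(), Node()
    a.other, b.other = b, a  # reference cycle: refcounts never drop to zero


def count_instances(cls):
    """Count live objects of exactly this class that the collector tracks."""
    return sum(1 for obj in gc.get_objects() if type(obj) is cls)


gc.collect()
gc.disable()  # keep automatic collection from interfering with the counts
try:
    for _ in range(100):
        make_cycle()
    before = count_instances(Node)
    unreachable = gc.collect()  # manual collection
    after = count_instances(Node)
finally:
    gc.enable()

print("Node objects before: %d, after: %d" % (before, after))
print("unreachable objects found: %d" % unreachable)
print("uncollectable objects in gc.garbage: %d" % len(gc.garbage))
```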

Optimizing memory:

  • Use the catalog, like mentioned already. It is better for your memory than awakening objects.
  • Store big files in the filesystem. This will avoid polluting the ZODB cache. Even better: use Apache to serve them directly. Use tramline. The BlobFile directory can be served by Apache.
  • You can use del to manually remove large objects before running some slow code.

Massive site deployments:

  • Use ZEO. It is easy to set up and lets Zope run in parallel on many CPUs, which makes Zope faster.
  • But: it will slow ZODB access, especially when doing it over a network. Also, the ZODB is on one server, which can become a bottleneck.
  • Use a proxy-cache like squid. This works best for anonymous users. Caching for logged-in users is more difficult.
  • Squid varying: cache based on e.g. the language that the browser includes in the request.
  • Create a second ZODB on a different server by exporting the site as static html or with a zexp or just copying the ZODB. Syncing back is tricky, but we did it with PloneGossip and SignableEvent.

Conclusion: Plone is not doomed to be slow! But optimization has to be part of all steps, from design to code. Fast CPUs and large memory do not solve this problem.

i18n, locales and Plone 3.0

published Sep 25, 2007, last modified Mar 05, 2013

More and more products are developed as python packages instead of Zope products. This means they should not be put in the Products directory anymore, but in the lib/python/ directory of your instance. How you handle translations has changed a bit because of that. Also with Zope 2.10 and higher the translation machinery has changed a bit. So how should you handle translations now?

For the most recent version focusing on Plone 4, see my talk at the Plone Conference 2012.

For a version from 2010 focusing on Plone 3.3 and 4, see a different article on my blog.

How should you handle translations now when dealing with locales, i18n, products, packages and Plone 3.0?

For clarity: I use the following terms here: a product is in the Products dir, a package is in the lib/python dir.

Please correct me if anything I write here is wrong.

locales

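Most translations should now be put in the locales directory of your product or package. This directory must be registered in zcml before it is picked up on Zope startup. So in your configure.zcml add:

<configure
    xmlns="http://namespaces.zope.org/zope"
    xmlns:i18n="http://namespaces.zope.org/i18n">

  <i18n:registerTranslations directory="locales" />

</configure>
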
The locales directory differs a bit from the i18n directory. i18n has for example:

i18n/mydomain.pot
i18n/mydomain-nl.po
i18n/mydomain-de.po

In locales that should be:

locales/mydomain.pot
locales/nl/LC_MESSAGES/mydomain.po
locales/de/LC_MESSAGES/mydomain.po
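If you have many files to move, the mapping between the two layouts can be written down as a small helper. This is only a sketch: locales_path is a name I made up, and it assumes plain language codes such as nl or de in the file names.

```python
import posixpath


def locales_path(i18n_filename, domain):
    """Map e.g. 'i18n/mydomain-nl.po' to 'locales/nl/LC_MESSAGES/mydomain.po'."""
    name = posixpath.basename(i18n_filename)
    if name == domain + ".pot":
        # The template keeps its name and sits at the top of locales.
        return posixpath.join("locales", name)
    prefix, suffix = domain + "-", ".po"
    if not (name.startswith(prefix) and name.endswith(suffix)):
        raise ValueError("not a file for domain %r: %s" % (domain, name))
    lang = name[len(prefix):-len(suffix)]
    if not lang:
        raise ValueError("no language code in %s" % name)
    return posixpath.join("locales", lang, "LC_MESSAGES", domain + ".po")


for old in ["i18n/mydomain.pot", "i18n/mydomain-nl.po", "i18n/mydomain-de.po"]:
    print("%s -> %s" % (old, locales_path(old, "mydomain")))
```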

You can rebuild the .pot file with:

i18ndude rebuild-pot --pot locales/mydomain.pot --create mydomain .

and you sync the .po files with:

i18ndude sync --pot locales/mydomain.pot locales/*/LC_MESSAGES/mydomain.po

Domain plone

If you want to add translations to the plone domain you could add locales/plone.pot and locales/nl/LC_MESSAGES/plone.po (or locales/plone-mydomain.pot and locales/nl/LC_MESSAGES/plone-mydomain.po if you want). That works: your translations for the plone domain are then available. But now the default translations for the plone domain (so those from PloneTranslations) are overridden. This is because your translations for the plone domain are picked up first by the zope translation machinery; the PlacelessTranslationService that normally loads the translations from PloneTranslations then gets ignored for the plone domain. So half your site is in English even though your browser is set to Dutch.

So extra translations for the plone domain should not be done in the locales directory. You can still put them in the i18n directory though. In Plone 3.5 this is likely to change.

As Hanno Schlichting tells me, the same is of course true for the other domains in PloneTranslations, like atcontenttypes, cmfeditions, plonelanguagetool, etcetera.

i18n

If you put an i18n dir in a package, it will be ignored. Doing i18n:registerTranslations for this directory does not work. You can only use an i18n dir in a product.

GenericSetup profiles

Several .xml files in the profiles can handle i18n as well. For instance in a types definition like types/JobPerformanceInterview.xml:



The i18n:domain should be plone and not for instance mydomain as these translations are used in templates of plone itself. In fact, I think that the only use for having these i18n commands in the .xml files is that they can then be extracted by i18ndude (version 3.0 I think, which is best installed in a workingenv).

Compiling .po files

.po files in i18n are compiled on zope startup time by the PlacelessTranslationService. By compiling I mean: turning them into .mo files so they are usable by zope. This automatic compiling does not happen for .po files in packages. So it is better to compile those files yourself. You can do that like this:

# Compile po files; PRODUCTNAME must be set to your domain, e.g. mydomain
for lang in $(find locales -mindepth 1 -maxdepth 1 -type d); do
    if test -d "$lang/LC_MESSAGES"; then
        msgfmt -o "$lang/LC_MESSAGES/${PRODUCTNAME}.mo" "$lang/LC_MESSAGES/${PRODUCTNAME}.po"
    fi
done

Conclusions

  • For both products and packages: put the translations for your domain in the locales directory.
  • Put extra translations for the plone domain in an i18n directory in a product in the Products directory.

Plone and Single Sign On

published Sep 19, 2007, last modified Sep 20, 2007

By Duco Dokter of Goldmund, Wildebeast & Wunderliebe at the Dutch Plone user day, September 19, 2007.

Existential matters

What is it? Authenticating once for multiple applications: one meta-session. There is also Web SSO, specifically for web applications.

Why would you want it? Users like it, even though it adds nothing essential. It is simply convenient. You do need fewer accounts, as with OpenID for example. The security policy is also focused in one central place, so it is handy for policy and administration.

How does it work? There is one source that handles authentication. That source is trusted by the other applications. A reliable protocol has been defined for this relationship.

Plone and SSO

You can arrange for multiple Plone sites to share the same user data, so that logging in to one immediately authenticates you for the other. You can use Plone as a front end for other sites, for example by means of Atom or RSS feeds. You can have other web and non-web applications in the same sessions. Like others, Plone can also use LDAP, although we are not covering that tonight.

CAS is an SSO server built at Yale University. It is an open protocol. Plone can talk to it by means of PlonePAS and CAS4PAS, and optionally PloneCASLogin.

Session A

You visit (make an http request to) Plone site A. You get a so-called challenge from CAS4PAS, which redirects you to the CAS server over https. There you log in. The CAS server sets a cookie and redirects you back to the callback service (so Plone site A) with a ticket. Plone site A then goes back to the CAS server itself with that ticket and asks whether the ticket is valid. If it is, the CAS server removes the ticket, tells Plone site A that everything is in order and returns the netID, the user name of the person who has just logged in. Plone site A then gives a response to the user, with a Plone cookie.

Session B

You visit Plone site B. You choose the login link to the CAS server or are automatically redirected to the authentication. The CAS server recognizes the session based on your cookie. So CAS immediately sends back a ticket, without you having to fill in your user name and password. After that it works the same as in session A.

For the back end, LDAP or SQL is often used.
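The ticket exchange in sessions A and B can be mimicked with a toy simulation. This is only a sketch of the flow described above, not the real CAS protocol and not CAS4PAS code: there is no HTTP, the cookies are just strings, and CasServer, PloneSite and all method names are made up for illustration.

```python
import uuid


class CasServer(object):
    """Toy in-memory stand-in for a CAS server."""

    def __init__(self, passwords):
        self._passwords = passwords  # user name -> password
        self._sessions = {}          # CAS cookie -> user name
        self._tickets = {}           # ticket -> (user name, service url)

    def login(self, username, password):
        """Check credentials and return a CAS session cookie."""
        if self._passwords.get(username) != password:
            raise ValueError("bad credentials")
        cookie = uuid.uuid4().hex
        self._sessions[cookie] = username
        return cookie

    def issue_ticket(self, cookie, service):
        """Known cookie: hand out a ticket without asking for a password."""
        username = self._sessions[cookie]
        ticket = "ST-" + uuid.uuid4().hex
        self._tickets[ticket] = (username, service)
        return ticket

    def validate(self, ticket, service):
        """Tickets are removed on first use, so they cannot be replayed."""
        username, ticket_service = self._tickets.pop(ticket, (None, None))
        if username is None or ticket_service != service:
            return None
        return username


class PloneSite(object):
    def __init__(self, url, cas):
        self.url = url
        self.cas = cas

    def handle_callback(self, ticket):
        """Ask CAS whether the ticket is valid; return a site cookie or None."""
        username = self.cas.validate(ticket, self.url)
        if username is None:
            return None
        return "%s-session-for-%s" % (self.url, username)  # placeholder cookie


cas = CasServer({"maurits": "secret"})
site_a = PloneSite("https://a.example.org", cas)
site_b = PloneSite("https://b.example.org", cas)

# Session A: log in at CAS, come back to site A with a ticket.
cas_cookie = cas.login("maurits", "secret")
ticket_a = cas.issue_ticket(cas_cookie, site_a.url)
cookie_a = site_a.handle_callback(ticket_a)

# Session B: the CAS cookie is enough, no password needed.
ticket_b = cas.issue_ticket(cas_cookie, site_b.url)
cookie_b = site_b.handle_callback(ticket_b)

# A used ticket is gone, so presenting it again fails.
replay = site_a.handle_callback(ticket_a)
```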

Maurits van Rees, BICT

published Jul 19, 2007

I am now a Bachelor of ICT!

Today I got my diploma. I have finished my study of Informatics (specializing in Software Engineering) at the Rotterdam University. So I can now call myself Maurits van Rees, BICT (Bachelor of Information and Communication Technologies).

At the moment I am extremely happy, relieved, proud, joyful and very much in need of a short, well-deserved vacation. Tomorrow I am heading to the Dutch New Wine Summer Conference for a week.

Cheers!

Ing. Maurits van Rees

published Jul 19, 2007

I am now a graduate engineer!

Today I received my diploma. So I have completed my study of Informatics (specialization Software Engineering) at the Hogeschool Rotterdam. That means I may call myself ing. Maurits van Rees. :-) Or, the English way: Maurits van Rees, BICT (Bachelor of Information and Communication Technology).

So at the moment I am rather cheerful, proud, glad, relieved, ecstatic, happy and badly in need of a well-deserved vacation. Tomorrow I leave for a week at the New Wine summer conference.

Cheers!