Weblog
International emails
Sending emails with strange characters to strange people, or at least people with non-ascii names.
Every few months at my employer Zest Software we have an evening of "eten en weten". Literally that is Dutch for "eating and knowing". Let's call it "Food for thought". We eat together and several of us hold presentations on subjects that are in some way related to our work. For example: Django, common Dutch language mistakes, how we use subversion, or local site hooks and the many interesting ways in which they can break when migrating from Plone 2.5 to 3. I managed to squeeze that last one into a lightning talk of a few minutes; you really don't want to know. ;-) (In case you do want to know, take a look at Products.Plone3Cleaners).
It is probably about time for a new "eten en weten" so it is probably also about time I uploaded my talk from last time about international emails. I talked about some base terminology, what can go wrong, pointed to the python email module and showed how to send a complete message, including some details that you can forget as long as you use the proper methods. After all, foreign languages are difficult enough already:

i18n/l10n
Two terms widely used are:
    internationalization: i18n (i, then 18 letters, then n)
    localization: l10n (l, then 10 letters, then n)
Roughly speaking, in a Plone context, internationalization is making sure the content or the UI is translated into several languages. Localization is making sure that 3 May 2009 is 05-03-2009 in the USA and 03.05.2009 in Germany.
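To make the localization example concrete, here is a minimal sketch in Python 3. The two format strings are hard-coded assumptions that mirror the dates above; a real application would take them from locale data rather than spell them out.

```python
from datetime import date

day = date(2009, 5, 3)  # 3 May 2009

# Month first, as in the USA example above.
us_style = day.strftime("%m-%d-%Y")
# Day first with dots, as in the German example above.
german_style = day.strftime("%d.%m.%Y")

print(us_style)      # 05-03-2009
print(german_style)  # 03.05.2009
```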
These two terms are not really the focus here though. The point is: how do you make sure that an email sent from Plone (or any python application really, if you ignore some details) with a Chinese name as From address, a Japanese name as To address, a Russian Subject and a Korean body text is delivered without errors?
Now do not think: "I live and work in America, I only need ascii." Don't you have Spanish colleagues? Some friends from your year abroad at that French university? A few Chinese clients? You could use only ascii, but you might regret that:

utf-8 is not unicode
Repeat after me: "utf-8 is not unicode", "utf-8 is not unicode", "utf-8 is not unicode":
    >>> type('ascii')
    <type 'str'>
    >>> type('utf-8')
    <type 'str'>
    >>> type(u'unicode')
    <type 'unicode'>
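The same distinction, sketched for Python 3, where the names have shifted: what Python 2 calls unicode is str there, and an encoded byte string is bytes. The sample name is only an illustration.

```python
text = "René"                # text: a sequence of unicode code points
data = text.encode("utf-8")  # bytes: one possible encoding of that text

print(type(text).__name__)   # str
print(type(data).__name__)   # bytes

# The text has four characters, but utf-8 needs two bytes for the é.
print(len(text), len(data))  # 4 5

# Decoding with the right character set gives the text back.
roundtrip = data.decode("utf-8")
print(roundtrip == text)     # True
```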
Basics
Sending an email in Plone goes something like this:
    from Products.CMFCore.utils import getToolByName

    charset = portal.getProperty('email_charset', 'ISO-8859-1')
    mailhost = getToolByName(portal, 'MailHost')
    mailhost.send(message=msg, mto=address, mfrom=mfrom,
                  subject=subject, charset=charset)
What can go wrong with that?
Hard to read headers:
From: RenXX Artois
Hard to read body text:
lettere accentate: ò ùâ
Unrecognized addresses:
To: undisclosed recipients
No email body: C
UnicodeDecodeErrors/UnicodeEncodeErrors
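Two of these failures are easy to reproduce. This is a small sketch in Python 3, not taken from the talk: it decodes utf-8 bytes with the wrong character set to get the typical garbled text, and it forces non-ascii text into ascii to trigger an encode error.

```python
name = "René Artois"

# Encode as utf-8 but decode as latin-1: the classic garbled result.
garbled = name.encode("utf-8").decode("iso-8859-1")
print(garbled)  # RenÃ© Artois

# Forcing non-ascii text into ascii raises an error instead.
try:
    name.encode("ascii")
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True
print(ascii_failed)  # True
```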
Parsing/formatting addresses
The To and From fields should have something like this:
    Maurits van Rees <maurits@example.org>
The standard python email package has nice utilities for this:
    >>> from email.Utils import parseaddr
    >>> from email.Utils import formataddr
    >>> formataddr(('Maurits van Rees', 'maurits@example.org'))
    'Maurits van Rees <maurits@example.org>'
    >>> parseaddr('Maurits van Rees <maurits@example.org>')
    ('Maurits van Rees', 'maurits@example.org')
These functions can get confused by strange characters. You can guard against that by parsing the address that you have just formatted and seeing if the parsed information still makes sense:
    from_address = portal.getProperty('email_from_address', '')
    from_name = portal.getProperty('email_from_name', '')
    mfrom = formataddr((from_name, from_address))
    if parseaddr(mfrom)[1] != from_address:
        # formataddr probably got confused
        # by special characters.
        mfrom = from_address
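The same guard can be wrapped in a small function. This is a sketch for Python 3, where the module is spelled email.utils; the function name safe_formataddr and the second address are made up for the example, and no Plone is needed to run it.

```python
from email.utils import formataddr, parseaddr


def safe_formataddr(name, address):
    """Format 'name <address>', falling back to the bare address.

    The fallback is used when parsing the formatted result does not
    give back the address we started with.
    """
    formatted = formataddr((name, address))
    if parseaddr(formatted)[1] != address:
        # formataddr probably got confused by special characters.
        return address
    return formatted


plain = safe_formataddr("Maurits van Rees", "maurits@example.org")
print(plain)  # Maurits van Rees <maurits@example.org>

# A non-ascii name gets encoded, but the address still survives parsing.
accented = safe_formataddr("René Artois", "rene@example.org")
print(parseaddr(accented)[1])  # rene@example.org
```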
Character sets
The python email.Charset module has interesting information about how email headers and body text should be encoded depending on the input character set. Some examples (QP is quoted printable):
    input          header enc   body enc   output conv
    iso-8859-1:    QP           QP         None
    iso-8859-15:   QP           QP         None
    windows-1252:  QP           QP         None
    us-ascii:      None         None       None
    big5:          BASE64       BASE64     None
    euc-jp:        BASE64       None       iso-2022-jp
    iso-2022-jp:   BASE64       None       None
    utf-8:         SHORTEST     BASE64     utf-8
    ...
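You can ask the module for these values yourself. A minimal sketch, assuming Python 3, where the module is spelled email.charset; the describe helper is made up for the example.

```python
from email.charset import BASE64, QP, SHORTEST, Charset

# Readable names for the encoding constants used by the module.
ENCODING_NAMES = {QP: "QP", BASE64: "BASE64", SHORTEST: "SHORTEST", None: "None"}


def describe(charset_name):
    """Return (header encoding, body encoding) names for a character set."""
    charset = Charset(charset_name)
    return (ENCODING_NAMES[charset.header_encoding],
            ENCODING_NAMES[charset.body_encoding])


for name in ("iso-8859-1", "us-ascii", "euc-jp", "utf-8"):
    print(name, describe(name))
```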
If that does not make sense, perhaps this helps:

This information is used when creating email headers:
    >>> from email.Charset import Charset
    >>> latin = Charset('iso-8859-1')
    >>> utf = Charset('utf-8')
    >>> latin.header_encode('René Artois')
    u'=?iso-8859-1?q?Ren=C3=A9_Artois?='
    >>> utf.header_encode('René Artois')
    '=?utf-8?q?Ren=C3=A9_Artois?='
and encoding body text:
    >>> latin.get_body_encoding()
    'quoted-printable'
    >>> latin.body_encode('René Artois')
    'Ren=C3=A9 Artois'
    >>> utf.get_body_encoding()
    'base64'
    >>> utf.body_encode('René Artois')
    'UmVuw6kgQXJ0b2lz\n'
This may look confusing. Surely if you get an email with a text or subject like this it is unreadable? No, your email program should be smart enough to display this to you in a readable fashion. No need for the funny face:

Formatting headers
Instead of using email.Charset for formatting headers you normally use the email.Header module:
    >>> from email.Header import Header
    >>> subject = 'Re: René'.decode('latin-1')
    >>> subject
    u'Re: Ren\xc3\xa9'
    >>> subject = Header(subject, 'latin-1')
    >>> subject
    <email.Header.Header instance at 0x...>
    >>> print subject
    =?iso-8859-1?q?Re=3A_Ren=C3=A9?=
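For comparison, a sketch of the same idea in Python 3, where the module is email.header and the subject is already text, so no decode step is needed. It also decodes the header again to show that a mail program can recover the original subject.

```python
from email.header import Header, decode_header, make_header

subject = "Re: René"

# Encode the subject for use in a mail header.
encoded = Header(subject, "iso-8859-1").encode()
print(encoded)

# What a mail program does on the receiving end: decode it again.
decoded = str(make_header(decode_header(encoded)))
print(decoded)  # Re: René
```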
Formatting the body
You will need to know which character set the body text has, or at least in which character set it can be encoded without errors. This snippet tries three character sets:
    charset = portal.getProperty('email_charset', 'ISO-8859-1')
    for body_charset in 'US-ASCII', charset, 'UTF-8':
        try:
            message = message.encode(body_charset)
        except UnicodeError:
            pass
        else:
            break
If the message only contains ascii characters, then at the end of this snippet the message is encoded in ascii and the body_charset variable is 'US-ASCII'.
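Here is the same fallback as a standalone function, sketched for Python 3. The name encode_body and the default preferred character set are assumptions for the example; in Plone the preferred one would come from the email_charset property.

```python
def encode_body(text, preferred="iso-8859-1"):
    """Encode text with the first character set that works.

    Returns a (bytes, charset name) tuple.
    """
    for charset in ("us-ascii", preferred, "utf-8"):
        try:
            return text.encode(charset), charset
        except UnicodeError:
            continue
    raise ValueError("text cannot be encoded in any candidate character set")


print(encode_body("hello")[1])   # us-ascii
print(encode_body("René")[1])    # iso-8859-1
print(encode_body("Привет")[1])  # utf-8
```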
Send it
We have done all the hard work with the Headers so now we can use the 'send' method:
    # Create the message.
    # 'plain' stands for Content-Type: text/plain
    from email.MIMEText import MIMEText
    msg = MIMEText(message, 'plain', body_charset)
    msg['From'] = email_from
    msg['To'] = email_to
    msg['Subject'] = subject
    msg = msg.as_string()
    mailhost = getToolByName(portal, 'MailHost')
    mailhost.send(message=msg)
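If you want to try the message building part without a Plone site, this sketch does the same in Python 3 with only the standard library. It builds the message and parses it back instead of sending it. The function name build_message, the names and the addresses are made up for the example.

```python
from email import message_from_string
from email.header import Header, decode_header, make_header
from email.mime.text import MIMEText
from email.utils import formataddr


def build_message(body, subject, sender, recipient, charset="utf-8"):
    """Return a complete message as a string.

    sender and recipient are (name, address) tuples.
    """
    msg = MIMEText(body, "plain", charset)
    msg["From"] = formataddr(sender)
    msg["To"] = formataddr(recipient)
    msg["Subject"] = Header(subject, charset)
    return msg.as_string()


raw = build_message(
    body="Привет",
    subject="Тема",
    sender=("René Artois", "rene@example.org"),
    recipient=("Maurits van Rees", "maurits@example.org"),
)

# Parse the raw text again, like a mail program would.
parsed = message_from_string(raw)
body_back = parsed.get_payload(decode=True).decode("utf-8")
subject_back = str(make_header(decode_header(parsed["Subject"])))
print(body_back, subject_back)
```

The raw message contains only ascii characters, which is the whole point of the header and body encodings.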
Using secureSend
It is easier to use the secureSend method. Using the Header class is then not needed, as secureSend takes care of that:
    email_msg = MIMEText(message, 'plain', body_charset)
    mailhost.secureSend(message=email_msg, mto=email_to,
                        mfrom=email_from, subject=subject,
                        charset=header_charset)
Now international email sending should work:

Version numbers in CMFQuickInstaller
A tale of two versions.
Note: this article has been copied to http://plone.org/documentation/how-to/version-numbers-in-cmfquickinstaller where it has more chance of being updated to keep pace with developments.
When you write an add-on package for Plone, there are currently two ways of writing an installer: a GenericSetup (GS) profile and an External Method (an install method in Extensions/install.py). The preference nowadays is for the GS profile; in fact, in the trunk of Plone (which will be Plone 4 in a year or so) a GS profile is the only way.
But meanwhile, when you go to the portal_quickinstaller (QI) in the ZMI, or the Add-On Products section in the Plone Site Setup (same QI, different user interface), you might see some confusing numbers. For example, you may have previously installed Products.FooBar 1.2 in your Plone Site and now you have upgraded it to version 1.3 on the file system and the quick installer tells you it has an upgrade available for "Profile for Foo Bar" from version 1.2 to 700. Or some other unexpected number, like a lower 1.0. What happened to Products.FooBar 1.3 that you just added to your buildout?
Well, first of all, this product apparently has a GS profile and the QI does not show the name of the package (Products.FooBar) but the (hopefully informative) name of the GS profile (Profile for Foo Bar). Fair enough. But what about those version numbers?
What happened here is that you actually do have the correct version 1.3 of the package, but the QI is showing the version of the GS profile, which can be wildly different from the package version. The idea behind the differing numbers is that this new package release only has some minor changes to, for example, some page templates. When there are no changes to the GS profile, there is no need to reinstall the package, or reapply the GS profile, so we keep the profile version the same. Since the package and profile version can be different, it is best to keep them different from the start. For the profile version it is fine to simply use increasing numbers: 1, 2, 3.
In this case, something has changed in the package and caused the profile version to be shown where previously the package version was used; or perhaps those numbers were the same in the previous release. For a better understanding, let's look at how the various CMFQuickInstaller releases handle version numbers. Handy conclusions are at the end.
In all versions:
- If the package has a GS profile, the name of that profile is shown; otherwise the package name is shown.
- If the getProductVersion method of the QI cannot find a version, no version is shown.
- Packages must be in the Products name space or use the five:registerPackage directive in zcml, otherwise they are not shown in the Product Management section of the Zope Control Panel.
- For getting the version number from the version.txt file, the package must be listed in that Product Management section. For getting the version number from setup.py or from the GS profile (the metadata.xml file) this is not necessary.
- Packages that rely on an external method (so no GS profile) for getting installed and that are not in the Product Management section, are not listed in the QI.
- Note that the Products section of the Zope Control Panel only looks for version.txt, at least in Zope 2.10.
Plone 3.0
- Plone 3.0.6 uses QI 2.0.4.
- getProductVersion gets the version from version.txt only.
- The GS Profile version is never shown.
- Packages must be in the Products name space or use the five:registerPackage directive, otherwise they are not shown in the QI. Only with this QI version is this true both for packages with an external method and for packages with a GS profile.
Plone 3.1/3.2
- Plone 3.1.7 and 3.2.2 use QI 2.1.6.
- getProductVersion gets the version from version.txt; when that fails it gets the version from the GS profile.
- Packages with a GS profile are always shown in the QI, also when they do not use five:registerPackage and are not in the Products name space.
- When a GS profile version is found, this version is shown, except that when five:registerPackage is used (or the Products name space) and there is a version.txt, then that version is shown.
Plone 3.3
- Plone 3.3rc2 uses QI 2.1.7.
- getProductVersion gets the version from setup.py; else it gets the version from version.txt; it specifically does not get the version from the GS profile.
- setup.cfg is also used, so if you have tag_build = dev in the egg_info section, the version will have 'dev' behind it.
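The lookup order per QI version can be summarized in a few lines. This is a simplified sketch in Python 3, not the actual CMFQuickInstaller code: the function displayed_version is made up, it assumes the package is registered as a Zope product (so a version.txt can be found), and it ignores the setup.cfg detail.

```python
# Which sources each QI version consults, in order (simplified).
LOOKUP_ORDER = {
    "2.0.4": ("version_txt",),
    "2.1.6": ("version_txt", "gs_profile"),
    "2.1.7": ("setup_py", "version_txt"),
}


def displayed_version(qi_version, setup_py=None, version_txt=None,
                      gs_profile=None):
    """Return the version string the QI would show, or None."""
    available = {
        "setup_py": setup_py,
        "version_txt": version_txt,
        "gs_profile": gs_profile,
    }
    for source in LOOKUP_ORDER[qi_version]:
        if available[source]:
            return available[source]
    return None


# A package without version.txt shows a different number everywhere.
for qi_version in sorted(LOOKUP_ORDER):
    print(qi_version,
          displayed_version(qi_version, setup_py="1.3", gs_profile="700"))
```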
Conclusion
It looks like the only way to have the same version number listed in all Plone 3.x versions is with version.txt. Follow these guidelines and (at least for the studied QI versions) the version number from version.txt is always shown:
- Use the Products name space or use the five:registerPackage directive. (Tip: when using paster to create your package, answer yes to the question 'Are you creating a Zope 2 Product?')
- Use a version.txt.
- In the setup.py use the version number from version.txt as the package version.
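The last guideline can be sketched like this in Python 3. The helper name read_version is made up, and the demo uses a throwaway directory in place of a real package directory so that it runs anywhere.

```python
import os
import tempfile


def read_version(package_dir):
    """Return the stripped contents of version.txt in package_dir."""
    with open(os.path.join(package_dir, "version.txt")) as version_file:
        return version_file.read().strip()


# Demo with a temporary directory standing in for the package directory.
with tempfile.TemporaryDirectory() as demo_dir:
    with open(os.path.join(demo_dir, "version.txt"), "w") as demo_file:
        demo_file.write("1.3\n")
    demo_version = read_version(demo_dir)

print(demo_version)  # 1.3

# In a setup.py this could be used as (hypothetical package layout):
# setup(name="Products.FooBar",
#       version=read_version(os.path.join("Products", "FooBar")),
#       ...)
```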
With this strategy it is possible to use GS profile versions that are entirely different from the version of your package, without confusing users. Specifically, I think I will start using just integers for the GS profile versions, instead of the dotted numbers I mostly use now.
One last conclusion: for Plone 3.0 (QI 2.0) remember that if you only have a GS profile and no external install method, your package will not be shown in the QI so you will have to use the portal_setup tool instead to install your package.
Using Ubuntu 9.04 beta
The next Ubuntu version is almost out. I am using the beta release now.
Code named "Jaunty Jackalope", the next Ubuntu GNU/Linux release 9.04 is scheduled to be officially released in April 2009 (hence the version number). On my work laptop I was still using the 8.04 release. An upgrade this weekend to 8.10 gave me problems with my video card and I decided to start with a fresh install and give the 9.04 beta a try.
In general, things just worked for me. No nasty surprises. The video card (nvidia) worked fine. So a big thumbs up for the Ubuntu community that again delivers quality!
How does it fare in my daily work? I develop Zope/Plone applications for Zest Software. The current stable Plone version (3.x) still needs python 2.4. The biggest change in this Ubuntu version then is that python 2.4 is not installed by default anymore and that several python packages only work for python 2.5 and 2.6. And they conflict with their python 2.4 counterparts. For example, the python imaging (PIL) and ldap modules gave me problems.
Installing the python2.4 package is no problem, just do sudo aptitude install python2.4.
To get PIL to work in my buildouts I created a new file pil.cfg with these contents:
    [buildout]
    extends = buildout.cfg
    # For PIL (python-imaging):
    find-links +=
        http://download.zope.org/distribution
    # PIL(woTK) - python imaging - added to eggs.
    eggs +=
        PILwoTK
Then I ran bin/buildout -c pil.cfg and it worked. (Well, you could just add those lines to your buildout.cfg, and I did something slightly different still, but you get the idea.)
For ldap I created a virtualenv. You may need to specify your python version in the virtualenv call, see virtualenv --help. After that it is just easy_install python-ldap. I got some errors when doing that, which indicated that I needed some more packages. These were the ones I missed and that I added with aptitude (or apt-get):
libldap2-dev libsasl2-dev libssl-dev
I could install PIL in a virtualenv too. Or I could add the virtualenv with PIL/ldap to the PATH or something like that. I'll see what works for me.
When your correctly configured portal tool is not working
Case in point: portal_transforms has a pdf_to_text transform but when indexing a pdf the transform is not found so SearchableText returns no content from the pdf file.
For a customer at Zest Software I am migrating a site from Plone 2.5 to Plone 3.1. In the migrated site I uploaded a pdf file. None of its contents ended up in the SearchableText index. In a fresh Plone Site in the same Zope instance this did work. In the portal_transforms tool the pdf_to_text transform was correctly registered. The mimetypes_registry looked okay. The pdftotext binary was available on the system. So everything looked fine, but did not actually work. What is going on?
Well, it turned out that the portal_transforms in the ZMI was not actually used. A getToolByName call was made which did not give back this tool but a utility. And the utility did not have the pdf_to_text transform. So I went to the Components tab in the ZMI of the Plone Site root. I removed the portal_transforms utility from the xml listed there and applied the changes. This made the pdf_to_text transform available again. Problem solved.
Note that this is the first time I edited the xml on the Components tab, so be careful if you do this: it may have adverse effects that I have not noticed yet; and I can imagine that typos are dangerous here.
So how did this go wrong? I did not explore this further, so I can only guess. I think during the migration from 2.5 to 3.1 the pdftotext binary was not available. Or there was some other reason why the transform did not work. During the migration the utility got added so it missed this transform. I removed the portal_transforms tool and added it again to get the missing transforms. At that point the utility and the tool were not linked to each other anymore. Again, it is a guess.
So, the conclusion of all this: if your portal tool does not work like you think it should, check the Components tab and see if a utility is registered under the same name. Removing it there may help. Note: keep a backup of your Data.fs when you do this and do not try it out on a production website but try it locally first.
Scrambled eggs
Broken python eggs won't hatch. [Update: there are patches.]
As Philipp von Weitershausen says, there is mostly no point to distributing .egg files. Indeed I have been creating source distributions (still called eggs usually) using python setup.py sdist for some months now. But those eggs can have their problems too, though some of those problems are shared with the .egg files.
Broken tar files on python 2.4
Python 2.4 has a tarfile module that breaks in some cases. If you unpack a tar file that contains a directory path with exactly 100 characters the unpacking will fail. Marius Gedminas helpfully fished up some bug reports when I asked about this on the distutils-sig mailing list a few weeks ago. Read that complete thread for a bit more info.
The result is that an sdist of my.package in tar.gz format can work fine in version 1.2, fail in 1.2.3 and work again in 1.2.3.4: the path that has a slash at position 100 in 1.2.3 will have that slash at position 98 in version 1.2 and at 102 in 1.2.3.4.
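The arithmetic is easy to check. In this Python 3 sketch the directory name inside the package is made up and sized so that version 1.2.3 puts the slash exactly at position 100; only the version part of the path changes.

```python
# Hypothetical directory inside the package, 82 characters long.
INNER_DIR = "d" * 82


def slash_position(version):
    """Return the 1-based position of the slash that ends the directory path."""
    path = "my.package-%s/%s/file.txt" % (version, INNER_DIR)
    return path.rindex("/") + 1


positions = {}
for version in ("1.2", "1.2.3", "1.2.3.4"):
    positions[version] = slash_position(version)
    print(version, positions[version])
```

Every extra ".N" in the version number shifts the slash by two characters, which is why a package can break in one release and work again in the next.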
Solution: use the zip format when creating an sdist. You can do that with python2.4 setup.py sdist --formats=zip or create a file setup.cfg next to setup.py with these contents:
    [sdist]
    # guard against tarfile bug by using the zip format
    formats = zip
So far I am still using the tar.gz format, which is the default (on Linux), and I only create .zip files when needed. But I am not sure there is really a good reason to stick to tar.gz.
Setuptools and subversion 1.5
When the server that contains your subversion code is using subversion 1.4 and you have subversion 1.5 on your computer, setuptools fails with a missing python import. See this thread on distutils-sig and the blog from Mr. Topf. Both have some solutions too.
You can hit that problem while easy_installing something or when running buildout. There may be other version combinations or actions that trigger the error. At least I think I saw this with subversion 1.4 too, but could not reproduce it later.
Actually, that bug is only important when installing an existing egg. But creating an egg can also fail, and that has to do with subversion 1.5 too. This calls for a new section.
Missing files with subversion 1.5
I just tried creating an sdist of the new xm.tracker package (an optional package for Products.eXtremeManagement, adding a time tracker).
With subversion 1.4 a perfect tar ball is created. With subversion 1.5 about half the files are missing, for example the version.txt file. Creating a .zip file instead does not help, so it is not the tar file bug mentioned earlier. And in fact, creating a .egg file does not help either. Using python 2.5 instead of 2.4 has no effect.
I have not heard reports of this yet, so there may be something weird in this particular package. If you want to try creating an sdist yourself, do an svn checkout of http://svn.plone.org/svn/collective/xm.tracker/tags/0.2
In this case I fixed it by logging into a server that still had subversion 1.4 and creating the sdist there.
Update: as colleague Mark van Lent writes there is a patched version of setuptools available that fixes this problem. Or use easy_install setuptools==dev06.
Conclusion
Entirely too much can go wrong when creating an sdist. And with .eggs you are not safe from harm either. Like I already said in a previous log entry, you need to check that egg!
So after you create it, try easy_installing it. Or for Zope/Plone packages: create a buildout configuration that uses this released egg and run the automated tests and try to start Zope with it. I just started doing that last week with the released.cfg file in the plonehrm buildout.
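A quick first check, before even installing anything, is to look inside the sdist for files you know must be there. This is a sketch in Python 3 with made-up helper names; the demo builds a tiny tar.gz in a temporary directory so that it runs without a real package.

```python
import os
import tarfile
import tempfile
import zipfile


def archive_names(path):
    """Return the member names of a .zip or (compressed) tar sdist."""
    if zipfile.is_zipfile(path):
        with zipfile.ZipFile(path) as archive:
            return archive.namelist()
    with tarfile.open(path) as archive:
        return archive.getnames()


def missing_files(path, expected):
    """Return the expected file names that are not in the archive."""
    names = archive_names(path)
    return [wanted for wanted in expected
            if not any(name == wanted or name.endswith("/" + wanted)
                       for name in names)]


# Demo: an sdist that lacks version.txt.
with tempfile.TemporaryDirectory() as tmp:
    readme = os.path.join(tmp, "README.txt")
    with open(readme, "w") as readme_file:
        readme_file.write("demo\n")
    sdist = os.path.join(tmp, "my.package-0.2.tar.gz")
    with tarfile.open(sdist, "w:gz") as archive:
        archive.add(readme, arcname="my.package-0.2/README.txt")
    missing = missing_files(sdist, ["README.txt", "version.txt"])

print(missing)  # ['version.txt']
```

This does not replace installing the egg and running the tests, but it catches the missing files problem early.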