Plone
This is here to serve as contents for the atom/rss feed for Plone, also read by planet.plone.org.
Plone Conference 2009 Intro
Short intro about the plone conference 2009.
The Plone Conference 2009 has just started. I am there with my Zest Software colleagues Jean-Paul and Fred. Location is Budapest, capital city of Hungary. It is the biggest Plone conference so far, with about 400 attendees.
Thomas Moroz talked a bit about KARL. He works for the Open Society Institute (OSI) that helped organize the conference. KARL is one of the largest sites built with Plone; a knowledge management system. He will talk about it in more depth later this week.
Python User group Netherlands meeting
The three-monthly meeting of the Dutch python user group was held Thursday 24 September 2009 in Amsterdam, in the Beurs van Berlage, organised by Go2People. First a half-hour presentation about Phatch, then some lightning talks, another half-hour presentation on MonetDB and moving from python 2 to python 3, then a few more lightning talks.
My brother Reinout was also there and of course he wrote a summary too.
Stani: Phatch - Batch photo processing for everyone
Lots of photo data, including metadata. You have ImageMagick. Phatch allows you to resize images, do colour profiling. Combine actions. Prefabricated recipes. You don't need to know how to do graphics manipulation. You can remove metadata so you can upload photos anonymously to sites. You can have a droplet on your desktop; just drop an image on it and it will do the actions you specified. Safe and unsafe mode. Safe restricts you; in unsafe mode you have the full power of python. It's a command line tool, but also a UI. Uses open source fonts. Phatch can do anything Gimp and ImageMagick do and more, as they are the back ends, along with PIL and blender. Its goal is to unite all open source graphics tools. Anyone can submit new objects. Works on Linux and mostly on Windows and Mac, but we need more developers there. About six developers now. Someone has plans to work on Django integration. Link: http://photobatch.stani.be/
Jasper Spaans, Fox-IT: Pycuda - massively parallel computing
Calculating with the GPU (Graphics Processing Unit). GPUs are incredibly fast for parallel computations compared to their price. Cuda is only for Nvidia. OpenCL does that too, and that can also use ATI/AMD. Use it for parallel algorithms. PyCuda is a python interface for cuda. Slides in Dutch: http://jasper.es/talks/pun2009-09
Wim Feijen, Go2People: Bazaar (and mercurial and git)
Bazaar: you can have one or more repositories. You can work and commit locally. It's simple. It tracks movements. You can branch projects. Look at http://github.com (not only for git users) and http://bitbucket.org, for free project hosting.
Remco Wendt, Maykin Media: libraries that you love
Does anyone know of a library that is cool to use? Speak up about it!
- Simon: greenlets; does micro threads; nice for animation, xml parser, http://pypi.python.org/pypi/greenlet
- zc.buildout: separate different project environments, repeatable deployment, http://pypi.python.org/pypi/zc.buildout
- pysvn: bindings for svn in python, http://pysvn.tigris.org/
- Ivo: virtualenv, keep libraries separate, sandbox them, http://pypi.python.org/pypi/virtualenv
- Tim: Paste: WSGI library (and other stuff), http://pypi.python.org/pypi/Paste
- werkzeug: like paste but different - http://pypi.python.org/pypi/Werkzeug
- WebOb, request and response objects: http://pythonpaste.org/webob/
- repoze.who, repoze.what: WSGI authentication and authorization framework, middleware, http://docs.repoze.org/who/ and http://what.repoze.org/docs/1.x/
- Fabric: remote deployment tool, http://docs.fabfile.org/0.9/
- grok: web framework, zope 3, without zcml headaches, http://grok.zope.org/
- scrapy: build web scrapers, http://scrapy.org/
- lamson: smtp server framework in python, http://lamsonproject.org/
- pep8, python style guide checker, finally easy_installable, http://pypi.python.org/pypi/pep8
Jan Jaap Driessen (Health Agency) and Sylvain Viollon (Infrae): Grok
Web framework, based on Zope. Even cavemen can use Zope. Just had a sprint about Grok. Takes the pain out of working with Zope. Zope has lots of layers, configuration, zcml, adapters, whatever. With Grok you do not register your stuff in zcml. Set of default settings. Grok "groks" your code, understands your code without needing a separate configuration language. Grok 1.0 is due for release this or next week. We have been using it for three years already. http://grok.zope.org
Gijs Molenaar, UvA, CWI: MonetDB and python 3
DBMS, developed by CWI, open source, BSD, platform for scientific research, not only SQL, http://www.monetdb.nl. Good for big databases; SkyServer is 6TB. For query intensive apps that are mostly read-only. For complex data models beyond SQL. Sometimes outperforms other databases by a hundred times for big tables.
It is a column based storage database. All columns are stored separately. Flexible kernel, which is easily extensible. SQL front end, but not limited to SQL; can use Xquery.
You can store xml documents in your database. Monet takes the document apart and stores it in several tables.
There are APIs for Ruby, PHP, python, perl (needs a rewrite but nobody wants to touch it), C, java. The CWI at the University of Amsterdam wanted me to rewrite the python API. MonetDB API is the wire protocol. Basic functions: set up connection, execute commands, parse the response, chop queries and responses in blocks. python.sql is the implementation of all DBAPI 2.0 functionality. Bridge between this and MonetAPI (mapi).
There were tests available for the MySQL python API, so I borrowed those for Monet. We looked at supporting python 2 and 3. We did not want to use two trees. We decided to do a 2+3 hybrid, though Guido van Rossum tells us not to do that. In python 3 there is a new IO layer with bytes and strings. New IO layer is used in the mapi layer. We made two versions here, mapi2.py and mapi3.py as it was too hard to combine, with exceptions that were no longer there in python 3. We do a python 3 version check in only two places, like conditionally importing mapi2 or mapi3, so not too bad. Tip: don't use print, use logging.
The hybrid python2-3 model works quite well in our case as the API is quite small. We minimised code duplication. Not advised for all projects.
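The version check mentioned in the talk might look roughly like this. It is only a sketch: the module names mapi2 and mapi3 come from the talk, but the helper function and the logger name are made up for illustration, and the actual import is left as a comment because those modules are not available here.

```python
import logging
import sys

# Hypothetical logger name; the talk only advises logging over print.
log = logging.getLogger("monetdb.example")


def mapi_module_name(version_info=sys.version_info):
    """Return the name of the wire protocol module for this interpreter."""
    return "mapi3" if version_info[0] >= 3 else "mapi2"


name = mapi_module_name()
# In a real package this is where the conditional import would happen,
# for example with importlib.import_module(name).
log.info("would import %s", name)
```

Keeping the check in one small function means the rest of the code never needs to ask which python version it runs on.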
Future work: bug fixes, port the xquery api, currently I am rewriting the C api (mapilite), unit tests in python using ctypes, having problem with structs.
Something completely different. http://pythonic.nl non-profit amateur web hosting provider for python web development. Focus on django.
Question: How mature is MonetDB? Well, there have been cases of data loss, but it has gotten better the last few years and it has been used in various projects.
Tim Molendijk: debugging
I usually just hunt for a bug, kill it, and then forget how I did the debugging, so I do not improve next time. But I am learning. I should probably use the python debugger, but usually I start with adding a few print statements. But with WSGI applications that normally does not work. So yes, next step is (i)pdb. For Django: Django debug toolbar.
Suggestions from the crowd: post mortem debugger, Twill, zc.testbrowser, mechanize, Selenium, winpdb, pydev for eclipse (you can still use vim for writing the code if you have to), werkzeug (for django).
Maurits van Rees: international emails
Short version of a presentation I gave at my employer Zest Software and that I blogged about recently. Suggestion from Jasper Spaans: use lamson (mentioned above), an smtp server framework in python.
International emails
Sending emails with strange characters to strange people, or at least people with non-ascii names.
Every few months at my employer Zest Software we have an evening of "eten en weten". Literally that is Dutch for "eating and knowing". Let's call it "Food for thought". We eat together and several of us hold presentations on subjects that are in some way related to our work. For example: Django, common Dutch language mistakes, how we use subversion, or local site hooks and the many interesting ways in which they can break when migrating from Plone 2.5 to 3. I managed to squeeze that last one into a lightning talk of a few minutes; you really don't want to know. ;-) (In case you do want to know, take a look at Products.Plone3Cleaners).
It is probably about time for a new "eten en weten" so it is probably also about time I uploaded my talk from last time about international emails. I talked about some base terminology, what can go wrong, pointed to the python email module and showed how to send a complete message, including some details that you can forget as long as you use the proper methods. After all, foreign languages are difficult enough already:
i18n/l10n
Two terms widely used are:
    internationalization    i18n
    localization            l10n
Roughly said, in a Plone context, internationalization is making sure the content or the UI is translated into several languages. Localization is making sure that 3 May 2009 is 05-03-2009 in the USA and 03.05.2009 in Germany.
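As a tiny illustration of the localization example, here is the same date written both ways. The format strings are hard-coded purely for the sake of the example; a real application would normally take them from its locale settings.

```python
import datetime

day = datetime.date(2009, 5, 3)

# Month first, as in the USA example.
us_style = day.strftime("%m-%d-%Y")
# Day first with dots, as in the German example.
german_style = day.strftime("%d.%m.%Y")

print(us_style)
print(german_style)
```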
These two terms are not really the focus here though. The point is: how do you make sure that an email sent from Plone (or any python application really, if you ignore some details) with a Chinese name as From address, a Japanese name as To address, a Russian Subject and a Korean body text is delivered without errors.
Now do not think: "I live and work in America, I only need ascii." Don't you have Spanish colleagues? Some friends from your year abroad at that French university? A few Chinese clients? You could use only ascii, but you might regret that:
utf-8 is not unicode
Repeat after me: "utf-8 is not unicode", "utf-8 is not unicode", "utf-8 is not unicode":
    >>> type('ascii')
    <type 'str'>
    >>> type('utf-8')
    <type 'str'>
    >>> type(u'unicode')
    <type 'unicode'>
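The session above is python 2. In python 3 the same distinction shows up as str versus bytes; the following sketch assumes python 3 and uses an escaped é so that it does not depend on the encoding of the source file.

```python
name = "Ren\u00e9"                      # text: a str of Unicode code points
as_utf8 = name.encode("utf-8")          # bytes: one possible encoding
as_latin1 = name.encode("iso-8859-1")   # bytes: another encoding

# Same text, different byte sequences of different lengths.
print(type(name).__name__, len(name))
print(type(as_utf8).__name__, len(as_utf8))
print(type(as_latin1).__name__, len(as_latin1))
print(as_utf8 == as_latin1)
```

The text has four characters, but how many bytes it takes depends on the encoding you pick, which is exactly why utf-8 is not unicode.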
Basics
Sending an email in Plone goes something like this:
    from Products.CMFCore.utils import getToolByName

    charset = portal.getProperty('email_charset', 'ISO-8859-1')
    mailhost = getToolByName(portal, 'MailHost')
    mailhost.send(message=msg, mto=address, mfrom=mfrom,
                  subject=subject, charset=charset)
What can go wrong with that?
Hard to read headers:
From: RenXX Artois
Hard to read body text:
lettere accentate: ò ùâ
Unrecognized addresses:
To: undisclosed recipients
No email body: C
UnicodeDecodeErrors/UnicodeEncodeErrors
Parsing/formatting addresses
The To and From fields should have something like this:
    Maurits van Rees <maurits@example.org>
The standard python email package has nice utilities for this:
    >>> from email.Utils import parseaddr
    >>> from email.Utils import formataddr
    >>> formataddr(('Maurits van Rees', 'maurits@example.org'))
    'Maurits van Rees <maurits@example.org>'
    >>> parseaddr('Maurits van Rees <maurits@example.org>')
    ('Maurits van Rees', 'maurits@example.org')
These functions can get confused by strange characters. You can guard against that by parsing the address that you have just formatted and seeing if the parsed information still makes sense:
    from_address = portal.getProperty('email_from_address', '')
    from_name = portal.getProperty('email_from_name', '')
    mfrom = formataddr((from_name, from_address))
    if parseaddr(mfrom)[1] != from_address:
        # formataddr probably got confused
        # by special characters.
        mfrom = from_address
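The same round-trip guard can be tried outside Plone. This is a sketch for python 3, where the module is called email.utils; the function name safe_from_header is made up for this example.

```python
from email.utils import formataddr, parseaddr


def safe_from_header(name, address):
    """Format 'name <address>', falling back to the bare address."""
    formatted = formataddr((name, address))
    if parseaddr(formatted)[1] != address:
        # The formatted value does not parse back to the same address,
        # so do not trust it.
        return address
    return formatted


print(safe_from_header("Maurits van Rees", "maurits@example.org"))
```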
Character sets
The python email.Charset module has interesting information about how email headers and body text should be encoded depending on the input character set. Some examples (QP is quoted printable):
    input           header enc   body enc   output conv
    iso-8859-1:     QP           QP         None
    iso-8859-15:    QP           QP         None
    windows-1252:   QP           QP         None
    us-ascii:       None         None       None
    big5:           BASE64       BASE64     None
    euc-jp:         BASE64       None       iso-2022-jp
    iso-2022-jp:    BASE64       None       None
    utf-8:          SHORTEST     BASE64     utf-8
    ...
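You can ask the module for these values yourself. The sketch below assumes python 3, where the module is named email.charset; the describe helper and the NAMES mapping are only there to print readable labels.

```python
from email.charset import BASE64, QP, SHORTEST, Charset

# Map the module's constants to the labels used in the table.
NAMES = {None: "None", QP: "QP", BASE64: "BASE64", SHORTEST: "SHORTEST"}


def describe(charset_name):
    """Return (header encoding, body encoding) labels for a charset."""
    cs = Charset(charset_name)
    return (NAMES[cs.header_encoding], NAMES[cs.body_encoding])


for name in ("iso-8859-1", "us-ascii", "euc-jp", "utf-8"):
    print(name, describe(name))

# euc-jp input is converted to iso-2022-jp on output.
print(Charset("euc-jp").output_charset)
```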
If that does not make sense, perhaps this helps:
This information is used when creating email headers:
    >>> from email.Charset import Charset
    >>> latin = Charset('iso-8859-1')
    >>> utf = Charset('utf-8')
    >>> latin.header_encode('René Artois')
    u'=?iso-8859-1?q?Ren=C3=A9_Artois?='
    >>> utf.header_encode('René Artois')
    '=?utf-8?q?Ren=C3=A9_Artois?='
and encoding body text:
    >>> latin.get_body_encoding()
    'quoted-printable'
    >>> latin.body_encode('René Artois')
    'Ren=C3=A9 Artois'
    >>> utf.get_body_encoding()
    'base64'
    >>> utf.body_encode('René Artois')
    'UmVuw6kgQXJ0b2lz\n'
This may look confusing. Surely if you get an email with a text or subject like this it is unreadable? No, your email program should be smart enough to display this to you in a readable fashion. No need for the funny face:
Formatting headers
Instead of using email.Charset for formatting headers you normally use the email.Header module:
    >>> from email.Header import Header
    >>> subject = 'Re: René'.decode('latin-1')
    >>> subject
    u'Re: Ren\xc3\xa9'
    >>> subject = Header(subject, 'latin-1')
    >>> subject
    >>> print subject
    =?iso-8859-1?q?Re=3A_Ren=C3=A9?=
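To convince yourself that an encoded header is still readable by a mail program, you can decode it again. This sketch assumes python 3 (module email.header) and uses an escaped é so that it does not depend on terminal or file encoding.

```python
from email.header import Header, decode_header, make_header

subject = "Re: Ren\u00e9"

# Encode the subject for use in a header; the result is plain ascii.
encoded = Header(subject, "iso-8859-1").encode()
print(encoded)

# Decode it again, roughly as a mail client would.
decoded = str(make_header(decode_header(encoded)))
print(decoded == subject)
```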
Formatting the body
You will need to know which character set the body text has, or at least in which character set it can be encoded without errors. This snippet tries three character sets:
    charset = portal.getProperty('email_charset', 'ISO-8859-1')
    for body_charset in 'US-ASCII', charset, 'UTF-8':
        try:
            message = message.encode(body_charset)
        except UnicodeError:
            pass
        else:
            break
If the message only contains ascii characters, then at the end of this snippet the message is encoded in ascii and the body_charset variable is 'US-ASCII'.
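Here is the same fallback as a small standalone function, so you can see which character set wins for different texts. It is a sketch for python 3; the function name encode_body and its default for the preferred charset are assumptions made for this example.

```python
def encode_body(text, preferred="iso-8859-1"):
    """Encode text with the first charset that can represent it.

    Returns a (encoded bytes, charset name) tuple.
    """
    for charset in ("us-ascii", preferred, "utf-8"):
        try:
            return text.encode(charset), charset
        except UnicodeError:
            # This charset cannot represent the text; try the next one.
            continue
    raise ValueError("text could not be encoded in any candidate charset")


for sample in ("plain text", "Ren\u00e9 Artois", "\u65e5\u672c\u8a9e"):
    encoded, used = encode_body(sample)
    print(used, len(encoded))
```

Trying ascii first keeps simple messages as simple as possible; utf-8 is the last resort that handles nearly everything.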
Send it
We have done all the hard work with the Headers so now we can use the 'send' method:
    # Create the message.
    # 'plain' stands for Content-Type: text/plain
    from email.MIMEText import MIMEText
    msg = MIMEText(message, 'plain', body_charset)
    msg['From'] = email_from
    msg['To'] = email_to
    msg['Subject'] = subject
    msg = msg.as_string()
    mailhost = getToolByName(portal, 'MailHost')
    mailhost.send(message=msg)
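If you want to try the message building part without a Plone site, something like the following should do. It is a sketch for python 3 that stops before actually sending anything; the function name build_message, the example addresses and the utf-8 default are assumptions, not part of the Plone MailHost API.

```python
from email import message_from_string
from email.header import Header, decode_header, make_header
from email.mime.text import MIMEText
from email.utils import formataddr, parseaddr


def build_message(body, subject, sender, recipient, charset="utf-8"):
    """Return a complete message as a string of ascii characters.

    sender and recipient are (name, address) tuples.
    """
    msg = MIMEText(body, "plain", charset)
    msg["From"] = formataddr(sender)
    msg["To"] = formataddr(recipient)
    msg["Subject"] = Header(subject, charset)
    return msg.as_string()


body = "Bonjour Ren\u00e9"
subject = "Re: Ren\u00e9"
raw = build_message(
    body,
    subject,
    ("Maurits van Rees", "maurits@example.org"),
    ("Ren\u00e9 Artois", "rene@example.org"),
)

# Parse the raw text again, roughly as a receiving mail program would.
parsed = message_from_string(raw)
round_trip_body = parsed.get_payload(decode=True).decode("utf-8")
round_trip_subject = str(make_header(decode_header(parsed["Subject"])))
round_trip_to = parseaddr(parsed["To"])[1]
print(round_trip_body == body, round_trip_subject == subject)
```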
Using secureSend
It is easier to use the secureSend method; using the Header class is not needed then, as secureSend takes care of that:
    email_msg = MIMEText(message, 'plain', body_charset)
    mailhost.secureSend(
        message=email_msg,
        mto=email_to,
        mfrom=email_from,
        subject=subject,
        charset=header_charset)
Now international email sending should work:
Using Ubuntu 9.04 beta
The next Ubuntu version is almost out. I am using the beta release now.
Code named "Jaunty Jackalope", the next Ubuntu GNU/Linux release 9.04 is scheduled to be officially released in April 2009 (hence the version number). On my work laptop I was still using the 8.04 release. An upgrade this weekend to 8.10 gave me problems with my video card and I decided to start with a fresh install and give the 9.04 beta a try.
In general, things just worked for me. No nasty surprises. The video card (nvidia) worked fine. So a big thumbs up for the Ubuntu community that again delivers quality!
How does it fare in my daily work? I develop Zope/Plone applications for Zest Software. The current stable Plone version (3.x) still needs python 2.4. The biggest change in this Ubuntu version then is that python 2.4 is not installed by default anymore and that several python packages only work for python 2.5 and 2.6. And they conflict with their python 2.4 counterparts. For example, the python imaging (PIL) and ldap modules gave me problems.
Installing the python2.4 package is no problem, just do sudo aptitude install python2.4.
To get PIL to work in my buildouts I created a new file pil.cfg with these contents:
    [buildout]
    extends = buildout.cfg
    # For PIL (python-imaging):
    find-links += http://download.zope.org/distribution
    # PIL(woTK) - python imaging - added to eggs.
    eggs += PILwoTK
Then I ran bin/buildout -c pil.cfg and it worked. (Well, you could just add those lines to your buildout.cfg, and I did something slightly different still, but you get the idea.)
For ldap I created a virtualenv. You may need to specify your python version in the virtualenv call, see virtualenv --help. After that it is just easy_install python-ldap. I got some errors when doing that, which indicated that I needed some more packages. These were the ones I missed and that I added with aptitude (or apt-get):
libldap2-dev libsasl2-dev libssl-dev
I could install PIL in a virtualenv too. Or I could add the virtualenv with PIL/ldap to the PATH or something like that. I'll see what works for me.
When your correctly configured portal tool is not working
Case in point: portal_transforms has a pdf_to_text transform but when indexing a pdf the transform is not found so SearchableText returns no content from the pdf file.
For a customer at Zest Software I am migrating a site from Plone 2.5 to Plone 3.1. In the migrated site I uploaded a pdf file. None of its contents ended up in the SearchableText index. In a fresh Plone Site in the same Zope instance this did work. In the portal_transforms tool the pdf_to_text transform was correctly registered. The mimetypes_registry looked okay. The pdftotext binary was available on the system. So everything looked fine, but did not actually work. What is going on?
Well, it turned out that the portal_transforms in the ZMI was not actually used. A getToolByName call was made which did not give back this tool but a utility. And the utility did not have the pdf_to_text transform. So I went to the Components tab in the ZMI of the Plone Site root. I removed the portal_transforms utility from the xml listed there and applied the changes. This made the pdf_to_text transform available again. Problem solved.
Note that this is the first time I edited the xml on the Components tab, so be careful if you do this: it may have adverse effects that I have not noticed yet; and I can imagine that typos are dangerous here.
So how did this go wrong? I did not explore this further, so I can only guess. I think during the migration from 2.5 to 3.1 the pdftotext binary was not available. Or there was some other reason why the transform did not work. During the migration the utility got added so it missed this transform. I removed the portal_transforms tool and added it again to get the missing transforms. At that point the utility and the tool were not linked to each other anymore. Again, it is a guess.
So, the conclusion of all this: if your portal tool does not work like you think it should, check the Components tab and see if a utility is registered under the same name. Removing it there may help. Note: keep a backup of your Data.fs when you do this and do not try it out on a production website but try it locally first.