Python Users Netherlands
Dutch Python Users meeting hosted by Nelen & Schuurmans in Utrecht on 21 September 2012.
Why do we use Pyramid in our project? We did not need a lot of standard functionality (so we did not choose Plone). We did not need pluggability (so we did not choose Zope). We had a complete design of our own, so we wanted to be able to customize things from the bottom up. We needed things fast, translatable, and built in Python.
We started with Pylons. We used that for about a year, but it was not maintained very much anymore at that point. We ported to repoze.bfg, with help from Chris McDonough. It took about two months to migrate the whole thing, which was about the time we had estimated. This worked quite well actually.
repoze.bfg got merged into Pylons and became Pyramid. It was easy to migrate to that.
We have fashion portals: for each customer a website with its own design. There is standard functionality, like login, registration, figure analysis and a shop. Each site has extras, like a CMS, magazines, celebrities, faceted search, etc.
We have a standard site. For each client we create a wrapper around it, where we add different routes (say urls) and replace some css or parts of templates.
The styling engine is a library to work with body measurements and clothes. Initially we implemented it in Python. We rewrote it in C++ with Boost.Python, partly so that we would not have to give away the source code in plain text. That was an interesting experience. (Audience: have you looked at bikeshed, which does something similar with C++? No, I don't think it existed yet.)
Internationalization and localization were painful. For us it also had to work in, for example, Chinese, so this was important. We needed translation of content, displaying numbers, times, dates, currencies, etc. Translating urls was interesting, including unicode urls. The design of a Chinese website is very different: much more is visible on the page.
zope.i18n and babel both have problems. Python's POSIX gettext support is not complete.
For data entry we created a central system for tagging of clothes. The design was stolen from the NuPlone (R) skin. It requires a modern browser; specifically, we tell our clients they must use Chrome.
Quality assurance was important. Everyone makes mistakes. It was hard to get our developers in China to write tests. We use tests and reviews, to avoid errors. We gather errors and logs from sites to get information about what still goes wrong on live sites.
What have we learned?
- There is no good (perfect) form library and it is probably not possible to create one. There are too many ways to use forms. We currently use WTForms, but just the schema part. Libraries only work well for a specific use case.
- Internationalization and localization is hard.
- Internationalization is very hard if you cannot read the language of the customer.
- It is important to follow the direction of the platform you are using.
- Pyramid has been great for us. It is very flexible, with a lightweight core.
Disco is a large-scale data analysis platform. It is an implementation of MapReduce, written in Erlang; you write the jobs in Python. We had a project with a lot of data, which was not really doable on a single computer.
- Map: chop up your work into little bits and process each bit on its own.
- Reduce: combine the results of the little bits.
Disco users start jobs in Python scripts. These send the jobs to a server that distributes them.
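To make the two phases concrete, here is a minimal single-process sketch in plain Python. It does not use Disco's API: the names `map_words`, `reduce_counts` and `run_job` are made up for illustration, and on a real cluster the map calls would be spread over many nodes instead of running in one loop.

```python
from collections import defaultdict


def map_words(line):
    """Map phase: turn one input record into (key, value) pairs."""
    for word in line.split():
        yield word.lower(), 1


def reduce_counts(pairs):
    """Reduce phase: combine all values that share a key."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


def run_job(lines):
    """Run both phases locally (hypothetical stand-in for a cluster)."""
    mapped = (pair for line in lines for pair in map_words(line))
    return reduce_counts(mapped)


if __name__ == "__main__":
    print(run_job(["to be or not", "to be"]))
```

Because each line is mapped independently, the map phase is the part that parallelizes easily; the reduce phase only needs the pairs grouped by key.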
For more info about Disco, plus another summary of tonight's talks, see the speaker's blog: http://blog.kollerie.com/2012/09/22/python_users_netherlands_meeting/
I'm not a data mining expert, but I'll show some extremely simple algorithms.
First clustering, with K-Means. Then classification: find the label for a thing.
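As a rough illustration of how simple K-Means can be, here is a sketch in plain Python. It is not the code shown in the talk. Seeding the centroids with the first `k` points and running a fixed number of iterations are assumptions made to keep the example short and deterministic; the helper names are made up.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def nearest(point, centroids):
    """Index of the centroid closest to point."""
    return min(range(len(centroids)),
               key=lambda i: squared_distance(point, centroids[i]))


def kmeans(points, k, iterations=10):
    """Cluster points into k groups; returns (centroids, labels)."""
    # Assumption for this sketch: the first k points seed the centroids.
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        labels = [nearest(p, centroids) for p in points]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(
                    sum(axis) / len(members) for axis in zip(*members)
                )
    return centroids, labels
```

Once the centroids are known, calling `nearest(new_point, centroids)` assigns a new point to a cluster, which is one very crude way to "find the label for a thing".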
Takeaway: venture outside your own field, use your knowledge there, and bring what you learn back to your own field.
We had a project with sensors where we had to test dikes, to see when they would fail and get our feet wet.
We needed to get data out of an API, so get something from a server. You could use urllib or urllib2 for this, but Requests is a much nicer Python library.
    import requests

    try:
        response = requests.get(self.source_url, timeout=...)
    except requests.exceptions.Timeout:
        ...
    # In Requests 1.0 and later json is a method: call response.json() instead.
    if response.json is None:
        ...
We used Django previously, but we only used about ten percent of it. We now use about 90 percent of Pyramid.
Gantt charts do not work. They are static: someone up top comes up with them, and they are outdated by the time they are distributed. So we want progressive planning.
MongoDB is really easy to install. We use that as a basis.
We started with Django in 2009, because I knew it, plus Postgres. The problem was that our use case did not fit Django. We had a RESTful API through Django-Tastypie. Wonderful, but it was very inefficient. It was not at all scalable without hard-coded SQL magic. There were no object-level privileges, no offline working, et cetera. For example, there was some code that saved some state just in case, but that killed our servers because the state consisted of lots and lots of relations.
Why adopt MongoDB? Everything is JSON, which is what we wanted. Deployment is a breeze. It is web-scalable and flexible out of the box. It has very active development and a growing community.
Why MongoEngine? It is small, transparent, active, mature, well-documented and readable, with responsive authors. It connects Python (Django?) and MongoDB.
What does mongoengine-relational offer? It manages changes to relations, automatically updates the other side, memoizes document fields to monitor differences, and much more. As a bonus we needed DocumentCache. It transparently caches documents in a thread-safe, self-attaching DocumentCache on the request object. It is really easy to set up. You define trigger functions like on_change_<fieldname>.
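The idea of memoizing fields and firing `on_change_<fieldname>` triggers can be sketched in plain Python. This is not mongoengine-relational's actual code or API: the `ChangeTracking` mixin, its `memoize` and `fire_changes` methods, and the `(old, new)` hook signature are all assumptions made for illustration.

```python
class ChangeTracking:
    """Hypothetical mixin: snapshot fields and fire on_change_<fieldname>."""

    tracked_fields = ()

    def memoize(self):
        # Remember the current values so later changes can be detected.
        self._memo = {name: getattr(self, name, None)
                      for name in self.tracked_fields}

    def fire_changes(self):
        # Compare with the snapshot and call the matching hook, if defined.
        for name, old in self._memo.items():
            new = getattr(self, name, None)
            hook = getattr(self, "on_change_" + name, None)
            if new != old and hook is not None:
                hook(old, new)
        # Take a fresh snapshot so each change is reported only once.
        self.memoize()


class Task(ChangeTracking):
    tracked_fields = ("owner", "title")

    def __init__(self, owner, title):
        self.owner = owner
        self.title = title
        self.log = []
        self.memoize()

    def on_change_owner(self, old, new):
        # Assumed signature: the hook receives the old and the new value.
        self.log.append((old, new))
```

In a document mapper the snapshot would typically be taken on load and the comparison run on save; here both are called by hand.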
Why mongoengine-privileges? We needed document-level and field-level permissions. We could not find an existing mixin that correctly and transparently handled object-level privileges across relations.
Why TastyMongo? We already had a RESTful API (Tastypie) that worked like a charm in Django. You can use TastyMongo to talk with MongoDB in Pyramid.
We end up writing a lot of throwaway bash scripts. After a few months you improve one of them. Then a colleague improves it some more. So it is not throwaway code after all. So for starters, put it in source code control.
Can it be in Python? Sure. You can create some basic stuff in house. We have something like this in all our scripts:
    from in_house.toolbox import script

    @script(name='defrobnicator')
    def main(script_utils):
        ...
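The `in_house.toolbox` module is not public, so the following is only a guess at what such a decorator could look like, using nothing but the standard library. The `ScriptUtils` class, its attributes and the `--verbose` option are assumptions, not the speaker's implementation.

```python
import argparse
import functools
import logging


class ScriptUtils:
    """Hypothetical bundle of helpers handed to every script."""

    def __init__(self, name, args):
        self.args = args
        self.log = logging.getLogger(name)


def script(name):
    """Decorator factory: wrap main() with shared option parsing and logging."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(argv=None):
            parser = argparse.ArgumentParser(prog=name)
            parser.add_argument("-v", "--verbose", action="store_true")
            args = parser.parse_args(argv)  # argv=None means sys.argv[1:]
            level = logging.DEBUG if args.verbose else logging.INFO
            logging.basicConfig(level=level)
            return func(ScriptUtils(name, args))
        return wrapper
    return decorator


@script(name="defrobnicator")
def main(script_utils):
    script_utils.log.info("defrobnicating")
    return 0
```

The point of the pattern is that every script gets the same options and logging setup for free, so a "throwaway" script starts out with the house conventions already in place.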
You want to prevent reinventing the wheel.
from sh import ifconfig, git, ls, wc
I have not used that yet actually.
Also have a look at plumbum.
from blessings import Terminal
See the speaker's own notes: http://www.logophile.org/blog/2012/09/25/a-less-than-lightning-talk/
    $ ^chmod^chown                      # Run the previous line (with typo) with the improved spelling
    $ sudo !!                           # Run the previous command as sudo
    $ mv !-1:1 !-2:1                    # Get stuff back from previous commands (careful when you used ``rm``)
    $ CTRL-x CTRL-e                     # Start your editor on the current line, save and run the command
    $ shopt -s globstar; ls **/*.py     # List all python files recursively
Bash history cheat sheet: http://www.catonmat.net/download/bash-history-cheat-sheet.pdf
Fabric is a library to do stuff on remote machines. I use Vagrant to easily create a VirtualBox machine with Linux on my Mac, and a small Fabric-like script to run commands on it.
    cd ~/vm/my-vagrant-box/home/me/somewhere
    vc bin/test  # ssh to the vagrant box, go to this dir and run bin/test
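The `vc` script itself was not shown in detail, so here is a sketch of the core idea under stated assumptions: the local tree below the box directory mirrors the VM's filesystem (as the path in the example suggests), and commands are run through `vagrant ssh -c`. The function name `remote_command` and its parameters are hypothetical.

```python
import os
import shlex


def remote_command(cwd, box_root, command):
    """Build the argv that runs command in the matching directory on the VM.

    Assumes the local tree under box_root mirrors the VM's filesystem.
    """
    relative = os.path.relpath(cwd, box_root)
    if relative == os.pardir or relative.startswith(os.pardir + os.sep):
        raise ValueError("%s is not inside %s" % (cwd, box_root))
    if relative == os.curdir:
        remote_dir = "/"
    else:
        remote_dir = "/" + relative.replace(os.sep, "/")
    # Quote every part so spaces in paths or arguments survive the remote shell.
    remote = "cd %s && %s" % (
        shlex.quote(remote_dir),
        " ".join(shlex.quote(part) for part in command),
    )
    return ["vagrant", "ssh", "-c", remote]
```

A wrapper script would then call something like `subprocess.call(remote_command(os.getcwd(), box_root, sys.argv[1:]), cwd=box_root)`, since `vagrant ssh` has to run from the directory that holds the Vagrantfile.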
Thanks to Nelen & Schuurmans for hosting the meeting.