Python Users Netherlands meeting, 24 March 2010
Summary of the meeting of the Python Users group Netherlands.
This PUN meeting is organized and sponsored by Maykin Media and go2people. The location is the ABC Treehouse (American Book Center), Amsterdam. We have this location for free once a month for the next year, so if you have ideas for community projects here, let us know. The next meeting is July 14th. It could be here, but a different spot is fine as well.
As always: two longer presentations (half an hour each), five lightning talks, and drinks afterwards.
Idea: we could use this location for a 'code dojo', a coding practice. Get together with smart people that know about coding and build something nice. Or a pair programming dojo: take a simple problem and work on it with two people.
Student at the Vrije Universiteit Amsterdam, via the Erasmus exchange program; I come from Poland. I am writing my master's thesis on concurrency in Python. Concurrency is about simultaneous computations that potentially interact with each other. They could be running on separate cores or CPUs.
Python has two modules for this: thread (low level, in C), and threading (more high level, more advanced threading primitives, more object oriented).
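As a small illustration of the higher-level threading module, here is a minimal sketch in Python 3 (where the low-level thread module is named _thread). The function name count_with_threads and its parameters are made up for this example; they are not from the talk.

```python
import threading


def count_with_threads(num_threads=4, increments=1000):
    """Let several threads increment one shared counter, guarded by a lock."""
    counter = {"value": 0}
    lock = threading.Lock()

    def worker():
        for _ in range(increments):
            # The lock makes the read-modify-write step atomic.
            with lock:
                counter["value"] += 1

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for thread in threads:
        thread.start()
    for thread in threads:
        # Wait until every worker has finished.
        thread.join()
    return counter["value"]
```

Threads share memory implicitly, which is why the counter needs a lock even though the GIL lets only one thread run Python code at a time.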
A lot of people do not like threading in Python because of the Global Interpreter Lock (GIL). Only a single thread can run Python code at any one time. Other threads can start any time. Every 100 ticks (by default) the interpreter releases the lock so another thread can take over. The lock is also released on blocking IO operations. What this means is that you gain nothing on multicore systems. Performance can even get worse with multiple cores, because the threads battle for the GIL. Also, the latency in response to events is larger because of the GIL. On single core systems the performance with the GIL is quite good. It is a safe environment for C extensions: you can make some handy assumptions about the environment. Also, you do not have to use threading.
Some people have tried to remove the GIL. Greg Stein tried it: the result was slower on single core machines and hardly faster on multicore; python-safethread had similar results. There is PyPy and Unladen Swallow. There is also a policy on dropping the GIL in CPython, which can be used. Other solutions: processes instead of threads; Jython and IronPython; asynchronous programming.
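The "100 ticks" above describes the Python 2 interpreter of the time. In Python 3.2 and later the GIL is handed over on a time basis instead, and the interval can be inspected from the sys module. A minimal sketch, assuming a Python 3 interpreter; the helper name describe_switch_interval is made up for this example.

```python
import sys


def describe_switch_interval():
    """Report how often a running thread is asked to give up the GIL."""
    seconds = sys.getswitchinterval()
    return "GIL switch interval: %.1f ms" % (seconds * 1000)


print(describe_switch_interval())
```

sys.setswitchinterval() changes the value, but that only shifts the trade-off between latency and switching overhead; it does not let two threads run Python code in parallel.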
The multiprocessing module (new in Python 2.6) looks a lot like threading: almost the same, but not entirely. There is no implicit shared memory (global variables, imported modules), so there needs to be some interprocess communication (IPC), and the data you send needs to be serializable (picklable).
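The picklability requirement is easy to check up front. Below is a minimal sketch using only the standard pickle module; the helper is_picklable is a made-up name for this example and is not part of multiprocessing.

```python
import pickle
import threading


def is_picklable(obj):
    """Return True if obj can be serialized with pickle."""
    try:
        pickle.dumps(obj)
    except (pickle.PicklingError, AttributeError, TypeError):
        return False
    return True


# Plain data survives a round trip unchanged.
data = {"job": 1, "values": [1.5, 2.5]}
copy = pickle.loads(pickle.dumps(data))

print(is_picklable(data))              # plain containers are fine
print(is_picklable(threading.Lock()))  # a lock cannot cross a process boundary
print(is_picklable(lambda x: x))       # a lambda cannot be looked up by name
```

Anything that fails this check is a likely source of errors when handed to another process as an argument or through a queue.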
Asynchronous solutions have been present in Python for some time, like the standard asyncore module. The Twisted web server does this explicitly. It has a reactor loop with which you register callbacks; these are called once the processing they wait for is ready. Another option: greenlets, lightweight 'threads' that all run in the same interpreter, inside the same OS thread. See the modules Eventlet, Gevent and PyEvent; quite a lot is happening there recently. Lower level: select, epoll and kqueue, for talking more directly to the operating system. Also look at Kamaelia and Stackless Python. These solutions can be better when you would otherwise need thousands of threads. Some of this is used by big online games, like Eve Online.
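The reactor idea can be sketched in a few lines with the standard selectors module, which wraps select, epoll and kqueue. This is not how Twisted is implemented; it is a toy loop under simple assumptions (every message is a non-empty bytes string, each socket is read once), and the names run_reactor and on_readable are made up for this example.

```python
import selectors
import socket


def run_reactor(messages):
    """Register one callback per socket and call it when data is ready."""
    selector = selectors.DefaultSelector()
    received = []
    sockets = []

    def on_readable(connection):
        # Callback: only called once the socket has data waiting.
        received.append(connection.recv(1024))

    for message in messages:
        writer, reader = socket.socketpair()
        reader.setblocking(False)
        selector.register(reader, selectors.EVENT_READ, on_readable)
        writer.sendall(message)
        sockets.extend([writer, reader])

    pending = len(messages)
    while pending:
        events = selector.select(timeout=1.0)
        if not events:
            break  # nothing became ready in time; give up
        for key, _mask in events:
            key.data(key.fileobj)  # key.data holds the registered callback
            selector.unregister(key.fileobj)
            pending -= 1

    selector.close()
    for sock in sockets:
        sock.close()
    return sorted(received)
```

A single thread serves all sockets here, which is why this style scales to cases where you would otherwise need thousands of threads.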
See the slides.
From Apeldoorn. Started as programmer in Cobol. You need a jcl file (stream) to make an executable program. I made a program to create that jcl automatically.
From Go2People. With @property you turn a method into a property. How does that work? The property is not stored on the object, but on the class. But looking it up on the class does not give the same result as looking it up on the object, which is not what you would expect. This works through descriptors: objects that implement __get__, and optionally __set__ and __delete__. In the attribute search order, Python does some checks to see what should be returned, and this is where the descriptors take their place. Functions are always descriptors. See among others Shalabh Chaturvedi: Python Attributes and Methods.
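A short sketch of these points, with made-up class names (Circle, Upper, Label) that are not from the talk:

```python
class Circle:
    def __init__(self, radius):
        self.radius = radius

    @property
    def diameter(self):
        return 2 * self.radius

    def describe(self):
        return "circle with radius %s" % self.radius


class Upper:
    """A data descriptor: it implements both __get__ and __set__."""

    def __set_name__(self, owner, name):
        self.storage = "_" + name

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self  # looked up on the class, not on an instance
        return getattr(obj, self.storage)

    def __set__(self, obj, value):
        setattr(obj, self.storage, value.upper())


class Label:
    text = Upper()


circle = Circle(2)
# The property lives in the class dict, not in the instance dict.
on_class = Circle.__dict__["diameter"]
# Looked up on the instance, the descriptor returns the computed value.
on_instance = circle.diameter
# Functions are descriptors too: __get__ turns them into bound methods.
bound = Circle.__dict__["describe"].__get__(circle, Circle)

label = Label()
label.text = "hello"  # goes through Upper.__set__
```

Because Upper defines __set__, the assignment never lands in the instance dict under the name text; that is the check in the attribute search order that gives data descriptors priority.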
This was a demo about opening beer bottles without an opener. Always useful information during a PUN. :-)
From http://gijs.pythonic.nl. I am still studying, and doing things with computer vision and Python. Computer vision means: extracting information from images. An image is a matrix of pixels. Things you can do: convert the color space (Hue Saturation Intensity), so you can for example focus on just one colour; thresholding; clustering; line, geometry or edge detection. Usage can be: optical character recognition, human computer interfaces, surveillance, augmented reality, or even an artistic toy. In The Netherlands we have section control (trajectcontrole) on some roads now: it checks if a car keeps to the maximum speed over a certain section of road (basically turning it into a maximum average speed). More info: the book Computer Vision: Algorithms and Applications, by Richard Szeliski from Microsoft Research; see http://research.microsoft.com/en-us/um/people/szeliski/Book/
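Colour space conversion followed by thresholding can be sketched without any vision library. The example below uses the standard colorsys module, so it works in HSV (hue, saturation, value), a close relative of the Hue Saturation Intensity space mentioned above, not the same thing. The image is a plain list of rows of (r, g, b) tuples, and the function name hue_mask and its default thresholds are made up for this example.

```python
import colorsys


def hue_mask(image, target_hue, tolerance=0.05, min_saturation=0.5):
    """Return a 0/1 matrix marking pixels whose hue is close to target_hue.

    Hues are fractions between 0 and 1, as colorsys returns them.
    """
    mask = []
    for row in image:
        mask_row = []
        for r, g, b in row:
            h, s, _v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
            distance = abs(h - target_hue)
            # Hue is circular: 0.99 and 0.01 are close together.
            distance = min(distance, 1.0 - distance)
            keep = distance <= tolerance and s >= min_saturation
            mask_row.append(1 if keep else 0)
        mask.append(mask_row)
    return mask


# A 2x2 test image: red, green, blue and light grey.
image = [
    [(255, 0, 0), (0, 255, 0)],
    [(0, 0, 255), (200, 200, 200)],
]
green_only = hue_mask(image, target_hue=1.0 / 3.0)
```

The saturation check keeps grey pixels out of the mask: they have no meaningful hue, and colorsys reports a hue of 0 for them.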
About OpenCV: Open Computer Vision. It is open source, over 500 functions, originally developed by Intel, for Windows, Linux and Mac. Structure: HighGUI module for the IO and the GUI, make windows, sliders, webcam interface; CXcore module, for datatypes and basic functions; CV module for all the interesting computer vision functions; ML for Machine Learning, for analyzing data that comes out of the CV.
He demoed a program, of just a screen full of code, that captured live images from his webcam and did line recognition.
Three APIs: new python wrapper (native C++), old python wrapper (using swig), PyOpenCV (Boost/Bjam + numpy). The old wrapper is usually available on your OS as a package (aptitude install python-opencv, port install opencv). I have packages for the new wrapper ready for upload. Manual compilation: advantage: full control, some optimizations. Book: Learning OpenCV.
Problems: a lot of people report problems with installations; sometimes buggy, with segmentation faults, in which case you usually need to use the code a bit differently; sometimes undocumented; almost useless mailing lists, far more questions than answers, except when you have a really interesting question.
Presentations and examples can be downloaded.
A virtual environment (virtualenv) is an isolated environment in which you can install packages with pip without affecting packages already installed in other environments. You can for instance use it as an easy sandbox for safely testing an unknown package. virtualenv --no-site-packages dirname creates a virtualenv in the directory dirname. Activate it with cd dirname; source bin/activate. Set export PIP_RESPECT_VIRTUALENV=true somewhere in your ~/.bashrc; this way pip install mypackage will figure out whether you are in a virtualenv and act accordingly. With pip freeze > production_ready.txt you get the list of currently installed packages with their exact versions, which you can reuse later with pip install -r production_ready.txt.
Remark from the crowd: never remove old versions of packages from PyPI or from wherever; it breaks people's software. So please, please never do this.
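Your own scripts can also check whether they run inside a virtualenv. A minimal sketch follows; the function name in_virtualenv is made up for this example, and this is not a claim about how pip itself performs the check.

```python
import sys


def in_virtualenv():
    """Return True when the running interpreter belongs to a virtualenv."""
    # Older virtualenv releases set sys.real_prefix on the interpreter.
    if hasattr(sys, "real_prefix"):
        return True
    # The standard venv module and newer virtualenv releases leave
    # sys.base_prefix pointing at the original installation.
    return getattr(sys, "base_prefix", sys.prefix) != sys.prefix


print(in_virtualenv())
```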
From http://pragmagik.com. pydirs is a simple python object database. First ideas: 'svnodb', an svn object database, but subversion turned out to be a nuisance. So now pydirs. Simple code base, less than 500 lines. Simple explorable storage format with just directories and basic datatypes as text files. Great for debugging as you can use the command line tools. Everything else is stored as pickles. API: typeregistry, database object, session object (get, commit, rollback), pydiritem object. Currently quite stable, and used in production. I hope to release it very soon.
Update: it has been released here: http://johnnydebris.net/pydirs.txt
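To give an idea of what such a storage format could look like, here is a rough sketch: nested dicts become directories, basic datatypes become text files, everything else becomes a pickle. This is not the pydirs code or its API. The function names (store, load, roundtrip), the choice of str, int and float as "basic" types, and the file suffix convention are all assumptions made for this example.

```python
import pickle
import tempfile
from pathlib import Path

# Assumption: these count as "basic datatypes" stored as readable text.
BASIC_TYPES = {"str": str, "int": int, "float": float}


def store(path, value):
    """Write value below path: dicts as directories, the rest as files."""
    path = Path(path)
    if type(value) is dict:
        path.mkdir(parents=True, exist_ok=True)
        for key, item in value.items():
            store(path / key, item)  # keys are assumed to be strings
    elif type(value) in BASIC_TYPES.values():
        # The suffix records the type, e.g. "age.int" containing "42".
        target = path.with_name(path.name + "." + type(value).__name__)
        target.write_text(str(value), encoding="utf-8")
    else:
        target = path.with_name(path.name + ".pickle")
        target.write_bytes(pickle.dumps(value))


def load(path):
    """Read a directory written by store() back into a dict."""
    result = {}
    for entry in sorted(Path(path).iterdir()):
        if entry.is_dir():
            result[entry.name] = load(entry)
            continue
        name, _dot, type_name = entry.name.rpartition(".")
        if type_name == "pickle":
            result[name] = pickle.loads(entry.read_bytes())
        else:
            text = entry.read_text(encoding="utf-8")
            result[name] = BASIC_TYPES[type_name](text)
    return result


def roundtrip(data):
    """Store data in a temporary directory and load it again."""
    with tempfile.TemporaryDirectory() as directory:
        root = Path(directory) / "db"
        store(root, data)
        return load(root)
```

With a layout like this you can inspect or fix a value with ls, cat and an editor, which is the debugging advantage mentioned above.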
See my brother Reinout's weblog for another summary.