Armin Ronacher: Keynote: State of webdev in Python

published May 20, 2011

Summary of keynote at the PyGrunn conference.

Armin Ronacher gives the keynote "State of webdev in Python", at the PyGrunn conference in Groningen, The Netherlands. Organized by Paylogic and Goldmund Wyldebeast & Wunderliebe.

I am a founding member of the Pocoo Team, doing Jinja2, Crossroads, Werkzeug, etc. Python 3 is kind of the elephant in the room that no one is talking about. So Python in 2011 has the big Python 2 versus 3 debate. 'Unladen Swallow' is resting, Python 3.2 was released, the packaging infrastructure is being worked on, including distutils2.

PyPy has become really fast. PyPy is Python written in Python. PyPy trunk is on average 3.7 times faster than the standard CPython, see http://speed.pypy.com. There is only experimental support for the Python C API. Different garbage collection behavior, no reference counting. So a few things will break, depending on what you are using it for. Django, ctypes, pyglet, twisted, all work.

All language development is now happening on Python 3. It adds unicode to the whole stack. The language is cleaned up. It does break backwards compatibility. Most code does not run on both 2 and 3, but a few packages do (lxml, at least at some point). It introduces unicode in exceptions and source compilation as well as identifiers (although I urge you not to try that). Greatly improved I/O API for unicode. Some new language constructs. Implementation was cleaned up a lot.

New constructs: extended iterable unpacking (a, *b = some_iterable), keyword-only arguments, nonlocal, function parameter and return value annotations (use them e.g. for documentation purposes).
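A minimal sketch of these four constructs, to make the list concrete. The names (connect, make_counter, greet) are made up for illustration and do not come from the talk.

```python
# Extended iterable unpacking: the starred name collects the rest as a list.
first, *rest = [1, 2, 3, 4]


# Keyword-only arguments: everything after the bare * must be passed by name.
def connect(host, *, timeout=10):
    return (host, timeout)


# nonlocal: rebind a variable that lives in the enclosing function's scope.
def make_counter():
    count = 0

    def increment():
        nonlocal count
        count += 1
        return count

    return increment


# Annotations: stored on the function object, not enforced by the interpreter.
def greet(name: str, times: int = 1) -> str:
    return ", ".join(["hello " + name] * times)
```

Calling connect("example.org", 3) raises a TypeError, because timeout can only be given by keyword. The annotations are available at runtime as greet.__annotations__, which is what makes them usable for documentation tools.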

print is now a function. Improved syntax for catching and raising exceptions. Ellipsis (...) syntax element.
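A short sketch of what these three changes look like in code. The helper parse_port is a made-up example, not something shown in the talk.

```python
import io

# print is a function, so it takes keyword arguments such as sep and file.
buffer = io.StringIO()
print("a", "b", sep="-", file=buffer)


# Catching uses "as"; raising can chain the original exception with "from".
def parse_port(text):
    try:
        return int(text)
    except ValueError as exc:
        raise RuntimeError("bad port: " + text) from exc


# ... is a literal for the Ellipsis singleton, usable in any expression.
placeholder = ...
```

The chained exception keeps the original ValueError reachable as the __cause__ attribute of the RuntimeError, so the traceback shows both.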

Some different behaviours. More powerful metaclasses. List comprehensions are closer to generators now. Lesson: don't rely on undocumented 'features'.
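One commonly cited instance of the list comprehension change, presumably the kind of thing meant here: in Python 2 the loop variable leaked into the surrounding scope, and code that relied on that breaks in Python 3. A minimal illustration:

```python
x = "outer"
squares = [x * x for x in range(3)]

# In Python 3 the comprehension has its own scope, just like a generator
# expression, so the outer x is untouched. In Python 2 it would now be 2.
still_outer = x
```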

Classic classes are gone. Imports are absolute by default.

Python 2.6 and 2.7 make it possible to write code that you can run through the 2to3 program to turn it into Python 3 code. You can use abstract base classes to check for certain implementations (not: is this a dict, but: is this dict-like).
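A small sketch of the "dict-like, not dict" check. The Settings class is hypothetical. The import path shown is the Python 3 one (collections.abc); on Python 2.6 and 2.7 the same class lives at collections.Mapping.

```python
from collections.abc import Mapping


class Settings(Mapping):
    """A read-only, dict-like object that does not subclass dict."""

    def __init__(self, **values):
        self._values = values

    def __getitem__(self, key):
        return self._values[key]

    def __iter__(self):
        return iter(self._values)

    def __len__(self):
        return len(self._values)


def describe(obj):
    # Ask "does it behave like a mapping?" instead of "is it a dict?".
    if isinstance(obj, Mapping):
        return sorted(obj.keys())
    raise TypeError("expected a mapping")


settings = Settings(debug=True, name="demo")
```

Here isinstance(settings, dict) is False, yet describe(settings) works, and Settings gets methods such as get, keys and items for free from the abstract base class.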

Do you want beauty of code? Use Python 3. Do you want raw speed? Use PyPy.

Numeric libraries work great on Python 3 and benefit from improvements in the language.

Predictions:

Most people will write their code against 2.7 with the intention of supporting PyPy.
Libraries that require the C API will become less common.
We will see libraries that support targeting both Python 2.7 and 3.x.

Now the second part of this talk: Python and the Web.

WSGI has a new specification for Python 3. Some work is done to port implementations to Python 3. It just works; not really an issue anymore.
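For reference, a minimal sketch of a WSGI application as it looks on Python 3, assuming the updated specification (PEP 3333) is the one meant here: status and headers are native strings, the body is bytes. The call_app helper is made up for this example and only uses the standard library's wsgiref.

```python
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    """Minimal WSGI app: native-string status and headers, bytes body."""
    body = "Hello from {}".format(environ["PATH_INFO"]).encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def call_app(app, path="/"):
    """Call a WSGI app directly, without a server, and collect the response."""
    environ = {}
    setup_testing_defaults(environ)  # fills in the required WSGI keys
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(app(environ, start_response))
    return captured["status"], body
```

Because the application is just a callable, the same function can be served by any WSGI server or called directly in a test, as call_app does.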

New developments: Improvements to PyPy support for database adapters. Improvements in template compilation in e.g. Django to take advantage of PyPy's behaviour. Some libraries are being ported over to Python 3.

Python 3 can work. You can start porting libraries over. Issues will only be resolved if you actually try to port. Higher level code is usually easier to port; low level libraries are trickier. Porting is easier if you drop support for 2.6. For porting, see http://bit.ly/python3-now.

WSGI works well in practice on Python 3. Pylons and BFG are now Pyramid, which is a nice introduction to the Zope world. There is less and less framework-specific code out there; it is easier to share code.

At the low level, Werkzeug and WebOb may merge at some point; they are much alike.

Frameworks are good. In new frameworks we can explore new paradigms and concepts. It is surprisingly easy to switch frameworks or parts of them. Frameworks themselves are even merging.

I think PyPy will gain more traction in the Python web world. It may eventually become more popular than CPython. Supporting Python 3 in PyPy should be easier than Python 2.

Things like 0MQ (Zero MQ) may help to have parts of your code in Python 2 and part in Python 3.

Jobert Abma: The Ten Commandments of Security

published May 20, 2011

Summary of talk at the PyGrunn conference.

Jobert Abma, ethical hacker at Online24, talks about the ten commandments of security, at the PyGrunn conference in Groningen, The Netherlands. Organized by Paylogic and Goldmund Wyldebeast & Wunderliebe.

I will discuss ten things you need to think of to get a secure application.

1. Your application is not the only attack vector. There can be weak passwords in other parts of the stack or server. Social engineering can become an issue.

2. Conduct risk assessments to identify risks. Then you start controlling them. You can score a risk on Confidentiality, Integrity, Availability.

3. Only trust your own code. And double check. The platform you are developing on can have security problems.

4. 'Security by design' solves major issues. Application logic is an important part. Centralize validation.

5. Always be aware of technical issues, like CSRF, XSS.

6. Time (mis)management. You don't always get time from your manager to solve security issues, even when you are aware of them.

7. Keep track of design documents and documentation. Is the design secure? Does it still match the current functionality?

8. Process design is one of the most important parts of securing an application. If the checkout process in a web shop is designed so badly that 10,000 euros end up in someone else's bank account each day, that is a problem.

9. Security can clash with usability. 'This email is not in our database' is potentially interesting knowledge for an attacker.

10. Information is power. Encryption on the server side and on the transport layer. If your database gets hacked, does that give the attacker information he can use, like passwords and credit card numbers?

One more thing: handle input as being dangerous. It will save your ass more than once.
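One concrete way to treat input as dangerous, sketched with the standard library's sqlite3 module. The talk did not show this code; the table and the find_user function are made up for illustration. The point is that user input goes in as a bound parameter, never as part of the SQL text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (email TEXT, name TEXT)")
conn.execute("INSERT INTO users VALUES (?, ?)", ("alice@example.org", "Alice"))


def find_user(conn, email):
    # The ? placeholder passes email as data; it is never spliced into the SQL.
    cursor = conn.execute("SELECT name FROM users WHERE email = ?", (email,))
    return cursor.fetchall()


# A classic injection attempt is just an odd-looking email that matches nothing.
hostile = "' OR '1'='1"
```

Had the query been built with string formatting, the hostile value would have turned the WHERE clause into a condition that matches every row.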

Summary: Security is not just a bunch of tricks. It is a process.

Òscar Vilaplana: ØMQ

published May 20, 2011, last modified Jun 07, 2011

Summary of talk at the PyGrunn conference.

Òscar Vilaplana (Paylogic) talks about ØMQ (Zero MQ), at the PyGrunn conference in Groningen, The Netherlands. Organized by Paylogic and Goldmund Wyldebeast & Wunderliebe.

0MQ is sockets as they should be. Bindings are available in many, many languages, including Python. Of course messaging looks simple: you send a message and the guy on the other end receives it! Well, it still requires work. But 0MQ is indeed simple. (In Python: import zmq). Messages are sent in the background. You have a queue. If the receiver is not there, the message just stays in the queue longer.

Messages are strings and they have a length. With multiparts they can have a sender. You can send through TCP or UDP. You can publish and subscribe. You send messages down a pipeline. As infrastructure you can choose a queue, a forwarder or a streamer.

You can poll from a 0MQ socket or a regular TCP socket or stdin.

Code from this talk: http://oscarvilaplana.cat/zmqtalk.tar.gz

Pieter Noordhuis: Redis in Practice

published May 20, 2011

Summary of talk at the PyGrunn conference.

Pieter Noordhuis talks about Redis in practice, at the PyGrunn conference in Groningen, The Netherlands. Organized by Paylogic and Goldmund Wyldebeast & Wunderliebe.

Redis is a key-value store. It can be compared to memcached, but it natively supports strings, lists, sets, sorted sets and hashes. Everything is stored in memory, so that puts a limit on what you can put in it but also makes it very fast. Unlike memcached, it can also persist the data. It supports replication, so you can have one master that you write to and, say, fifty slaves just for reading.

Any blob will do: ascii, utf-8, png. Example:

redis> set str "hello world"
OK
redis> get str
"hello world"

It runs in a single thread: no race conditions or locks; every operation is atomic. This greatly simplifies replication.

Invalidate immediately:

redis> del page:/home
(integer) 1

You can have rate limiting:

INCR limit
EXPIRE limit 60 # iff INCR == 1
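The INCR/EXPIRE pattern above can be sketched in Python. To keep the example runnable without a Redis server, a tiny in-memory class stands in for Redis; FakeRedis, allow_request and the limit of 3 requests are all assumptions for illustration, not part of the talk or of any Redis client library.

```python
import time


class FakeRedis:
    """In-memory stand-in for the two Redis commands used here (hypothetical)."""

    def __init__(self, clock=time.time):
        self._clock = clock
        self._values = {}
        self._expires_at = {}

    def _purge(self, key):
        # Drop the key if its time to live has run out.
        deadline = self._expires_at.get(key)
        if deadline is not None and self._clock() >= deadline:
            self._values.pop(key, None)
            self._expires_at.pop(key, None)

    def incr(self, key):
        self._purge(key)
        self._values[key] = self._values.get(key, 0) + 1
        return self._values[key]

    def expire(self, key, seconds):
        if key in self._values:
            self._expires_at[key] = self._clock() + seconds


def allow_request(store, key, limit=3, window=60):
    count = store.incr(key)
    if count == 1:
        # First hit in this window: start the countdown (the "iff INCR == 1").
        store.expire(key, window)
    return count <= limit
```

Setting the expiry only on the first increment is what makes this a fixed window: the counter disappears 60 seconds after the first request, and the next request starts a fresh count.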

Lists are a natural fit for job queues: RPUSH a new job (push it at the right of the list) and LPOP a job from the left of the list. With PubSub you can set up some notifications when jobs are done.

Sets: unordered sets of unique values. Sorted sets: unique values ordered by a score. ZADD adds an item with a score (or updates its score), ZINCRBY raises or lowers the score, ZREM removes the item. You can use this easily to show currently logged in users, top users for some measurement, etc.

We can do 100,000 gets and sets per second on commodity hardware. With tweaking and better hardware we have heard of 1.5 million per second.

You can have durability through snapshotting: save every N seconds or every N changes.

Getting started: http://redis.io/download, no dependencies, run make and you are done.

Henk Doornbos (Paylogic): Making large, untested code bases testable

published May 20, 2011

Summary of talk at the PyGrunn conference.

Henk Doornbos (Paylogic) talks about making large, untested code bases testable, at the PyGrunn conference in Groningen, The Netherlands. Organized by Paylogic and Goldmund Wyldebeast & Wunderliebe.

Henk Doornbos is a computer scientist, software engineer, architect, consultant and head of the architecture department. The Paylogic system is front and back office for payment. Mostly Python (84406 lines of code) and JavaScript.

There are many abstraction levels: components, classes, features, the code level. Process when doing a support case: staging, testing, bug reports. We use FogBugz as bug tracker. Example: downloading a 'scanware' file was broken. The problem is solved now, but can we check automatically that it has indeed been fixed and does not break again later? Maybe we can execute the bug report as a test? This can be done when you are rigid enough in specifying the bug report ('Given When Then' style: starting state, steps to reproduce, expected behavior). You can write a small language that specifies what should be done when you encounter specific text in a bug report, translating that to Selenium tests (visit this page, click there, check that text is on the resulting page). We use the Robot Framework.

You could do similar things with record and playback. But often the bug report is a missing requirement. We only want to add features that deliver value. Behavior driven development style: to have some fun (goal, value), as a customer (who, actor), I want to buy a ticket (reason). You can write that as:

Given I am buying tickets
When I choose the amount
Then I see the total cost
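The idea of a small language that maps report text to test actions can be sketched in plain Python. This is not Robot Framework's API and not Paylogic's code: the step decorator, the action strings and the /tickets path are all made up for illustration. A real setup would drive Selenium instead of collecting strings.

```python
import re

STEPS = []  # (compiled pattern, handler) pairs, in registration order


def step(pattern):
    """Register a handler for report lines that match pattern."""
    def register(func):
        STEPS.append((re.compile(pattern), func))
        return func
    return register


@step(r"^Given I am buying tickets$")
def open_shop(actions):
    actions.append("open /tickets")


@step(r"^When I choose the amount$")
def choose_amount(actions):
    actions.append("select amount")


@step(r"^Then I see the total cost$")
def check_total(actions):
    actions.append("assert page contains 'Total'")


def translate(report):
    """Turn each line of a report into actions; unknown lines are an error."""
    actions = []
    for line in report.strip().splitlines():
        line = line.strip()
        for pattern, handler in STEPS:
            if pattern.match(line):
                handler(actions)
                break
        else:
            raise ValueError("no step defined for: " + line)
    return actions
```

Failing loudly on an unknown line is deliberate: it forces whoever writes the bug report to stick to the agreed vocabulary, which is the rigidity the approach depends on.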

So in one smooth, agile process you create these products: requirements document, acceptance tests, working code that delivers value, bug reports.