Weblog

published Nov 03, 2021, last modified Nov 04, 2021

Pieter Noordhuis: Redis in Practice

published May 20, 2011

Summary of talk at the PyGrunn conference.

Pieter Noordhuis: Redis in Practice, at the PyGrunn conference in Groningen, The Netherlands. Organized by Paylogic and Goldmund Wyldebeast & Wunderliebe.

Redis is a key-value store. It can be compared to memcached, but it natively supports strings, lists, sets, sorted sets and hashes. Everything is stored in memory, which puts a limit on how much you can put in it but also makes it very fast. Unlike memcached, you can also persist it. It supports replication, so you can have one master that you write to and, say, fifty slaves just for reading.

Any blob will do: ASCII, UTF-8, PNG. Example:

redis> set str "hello world"
OK
redis> get str
"hello world"

It runs in a single thread: no race conditions or locks; every operation is atomic. This greatly simplifies replication.

Invalidate immediately:

redis> del page:/home
(integer) 1
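
In Python with the redis-py client, the page-caching pattern behind that key might look like this sketch (the key naming, the render callable and the 10-minute TTL are illustrative assumptions, not from the talk):

import redis

r = redis.Redis(host="localhost", port=6379)

def get_page(path, render):
    """Serve a page from the Redis cache, rendering it on a miss."""
    key = "page:" + path
    cached = r.get(key)
    if cached is not None:
        return cached
    html = render(path)
    r.set(key, html, ex=600)  # cache for 10 minutes (illustrative TTL)
    return html

def invalidate(path):
    """Invalidation is just deleting the key, effective immediately."""
    r.delete("page:" + path)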

You can have rate limiting:

INCR limit
EXPIRE limit 60 # iff INCR == 1
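
The same pattern in Python with redis-py, as a sketch; the limit of 10 requests per 60-second window is an assumption for illustration:

import redis

r = redis.Redis()

def allow_request(client_id, limit=10, window=60):
    """Return True while the client stays within its rate limit."""
    key = "limit:" + client_id
    count = r.incr(key)        # atomic: Redis runs in a single thread
    if count == 1:
        r.expire(key, window)  # the first hit starts the 60-second window
    return count <= limit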

Lists are a natural fit for job queues: RPUSH a new job (push it onto the right of the list) and LPOP a job from the left of the list. With PubSub you can set up notifications when jobs are done.
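
A sketch of such a queue with redis-py; the queue and channel names are invented, and BLPOP is the blocking variant of LPOP, so the worker does not have to poll:

import redis

r = redis.Redis()

# Producer: push a new job onto the right of the list.
r.rpush("jobs", "resize-image:42")

# Worker: block until a job arrives on the left, then process it.
job = r.blpop("jobs", timeout=5)
if job is not None:
    _queue, payload = job
    print("processing", payload)
    r.publish("jobs:done", payload)  # PubSub notification that the job finished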

Sets are unordered collections of unique values. Sorted sets are ordered sets of unique values: ZADD adds an item with a score (and ZINCRBY increases that score), ZREM removes it. You can easily use this to show currently logged-in users, top users for some measurement, etc.
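
For example, a top-users leaderboard with redis-py; the key and member names are invented, and the mapping-style ZADD call assumes a redis-py 3.x client:

import redis

r = redis.Redis()

r.zadd("top-users", {"alice": 1})   # add a user with an initial score
r.zincrby("top-users", 5, "alice")  # increase the score by 5
r.zrem("top-users", "bob")          # remove a user entirely

# Ten highest-scoring users, best first:
print(r.zrevrange("top-users", 0, 9, withscores=True))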

We can do 100,000 gets and sets per second on commodity hardware. With tweaking and better hardware we have heard of 1.5 million per second.

You can have durability through snapshotting: save every N seconds or every N changes.
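
In redis.conf this is a set of save <seconds> <changes> lines; these were the stock defaults:

# snapshot if at least 1 key changed in 900 seconds,
# 10 keys in 300 seconds, or 10000 keys in 60 seconds
save 900 1
save 300 10
save 60 10000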

Getting started: download from http://redis.io/download; there are no dependencies, so run make and you are done.

Henk Doornbos (Paylogic): Making large, untested code bases testable

published May 20, 2011

Summary of talk at the PyGrunn conference.

Henk Doornbos (Paylogic) talks about making large, untested code bases testable, at the PyGrunn conference in Groningen, The Netherlands. Organized by Paylogic and Goldmund Wyldebeast & Wunderliebe.

Henk is a computer scientist, software engineer, architect and consultant, and heads the architecture department. The Paylogic system is the front and back office for payments, written mostly in Python (84406 lines of code) and JavaScript.

There are many abstraction levels: components, classes, features, the code level. The process when handling a support case: staging, testing, bug reports. We use FogBugz as bug tracker. Example: downloading a 'scanware' file was broken. The problem is solved now, but can we check automatically that it has indeed been fixed and does not break again later? Maybe we can execute the bug report as a test? This can be done if you are rigorous enough in specifying the bug report ('Given When Then' style: starting state, steps to reproduce, expected behavior). You can write a small language that specifies what should be done when specific text is encountered in a bug report, translating it to Selenium tests (visit this page, click there, check that some text is on the resulting page). We use the Robot Framework.
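
A minimal sketch of such a small language in Python; the step phrases, the FakeBrowser stand-in and the mapping are invented for illustration (in practice each action would drive Selenium through the Robot Framework):

class FakeBrowser:
    """Stand-in for a Selenium-driven browser, so the sketch runs on its own."""
    def visit(self, url):
        print("visit", url)
    def click(self, target):
        print("click", target)
    def assert_text(self, text):
        print("check page contains", text)

# Map bug-report phrases to browser actions; the phrases are invented.
STEPS = {
    "I am on the download page": lambda b: b.visit("/download"),
    "I click the scanware link": lambda b: b.click("scanware link"),
    "I receive the scanware file": lambda b: b.assert_text("scanware.zip"),
}

def run_report(lines, browser):
    """Execute a 'Given When Then' bug report as a test."""
    for line in lines:
        _keyword, phrase = line.split(" ", 1)  # drop "Given", "When" or "Then"
        STEPS[phrase](browser)

run_report([
    "Given I am on the download page",
    "When I click the scanware link",
    "Then I receive the scanware file",
], FakeBrowser())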

You could do similar things with record and playback. But often the bug report is really a missing requirement, and we only want to add features that deliver value. Behavior-driven development style: in order to have some fun (the goal, the value), as a customer (who, the actor), I want to buy a ticket (the feature). You can write that as:

  • Given I am buying tickets
  • When I choose the amount
  • Then I see the total cost

So in one smooth, agile process you create these products: a requirements document, acceptance tests, working code that delivers value, and bug reports.

Gideon de Kok: Mobile Architectures

published May 20, 2011

Summary of talk at the PyGrunn conference.

Gideon de Kok talks about Mobile Architectures, at the PyGrunn conference in Groningen, The Netherlands. Organized by Paylogic and Goldmund Wyldebeast & Wunderliebe.

Yes, you should treat mobile clients differently. Mobile applications should still work reasonably when not connected, if possible. You want to get things done quickly: more push, less pull. Best practices are still mostly the same; one difference is that your server should work harder than the client, so no big calculations in JavaScript. Don't let the client pull, but let the server push; this saves battery life.

Improve connectivity: do lots of caching. Store data locally in the browser/app. Refresh your data periodically, and also on user request. Properly expire and delete data. Combine requests so you have fewer problems with latency and connectivity. If the connection fails, just wait: queue requests and send them when connectivity is there again, instead of checking every few seconds. Restrict connectivity: if you don't use it, don't send it. Don't preload things the client may possibly need.
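
A sketch of that queue-and-replay idea in Python; the send and is_online callables are hypothetical placeholders for the app's transport and connectivity check:

from collections import deque

class RequestQueue:
    """Queue outgoing requests and replay them once connectivity returns,
    instead of probing the network every few seconds."""

    def __init__(self, send, is_online):
        self.pending = deque()
        self.send = send            # performs the actual request
        self.is_online = is_online  # reports current connectivity

    def submit(self, request):
        self.pending.append(request)
        self.flush()

    def flush(self):
        # Also call this from a connectivity-changed event.
        while self.pending and self.is_online():
            self.send(self.pending.popleft())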

Make your payload small; one simple trick is of course using gzip. Adapt your payload: send smaller images, or send only alternative text. No more direct database access: create an API to talk to the database, including logic for failing connectivity. Never trust API input: client-side validation is good for connectivity and performance, but you definitely need to check on the server too.
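
The gzip trick in a few lines of Python (the JSON payload is invented; repetitive data like this compresses very well):

import gzip, json

payload = json.dumps({"events": ["page-view"] * 100}).encode("utf-8")
compressed = gzip.compress(payload)  # send with Content-Encoding: gzip
print(len(payload), "->", len(compressed), "bytes")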

Security: encrypt your data streams and store secrets well. Do not store highly sensitive data on the client: you don't want thieves stealing secrets from your phone by reverse engineering.

A mobile-specific API probably has more push and fewer pull calls.

Don't overengineer your SSL connections: if you send too much encrypted data, decrypting it on a mobile device takes too long. It is a trade-off between safety and speed.

Native applications are generally speedier and better at pushing and pulling, relying less on good connectivity than a mobile web site. But when you build native apps as, say, a bank, you suddenly become a software creator instead of just needing to keep a website running. A web app based on HTML5 and CSS3 will mostly work on all devices; with a native app there are many more devices you need to check for compatibility. There are frameworks for this, but the resulting apps will be less effective than an app created specifically for one device.

Luit van Drongelen: Lightweight Python deployment servers

published May 20, 2011, last modified May 23, 2011

Summary of talk at the PyGrunn conference.

Luit van Drongelen talks about Lightweight Python deployment servers, at the PyGrunn conference in Groningen, The Netherlands. Organized by Paylogic and Goldmund Wyldebeast & Wunderliebe.

I code Python for fun (and hopefully eventually for profit). What is wrong with Apache? Well, nginx is much faster: you can serve double the responses with much less memory usage. Apache also has a higher I/O load, e.g. for loading .htaccess files. nginx can natively connect to uWSGI, a fast, lightweight alternative to mod_wsgi. The protocol is called uwsgi (all lowercase). It runs separately from the web server. Tested on many operating systems; not Windows, sadly.

So why use uWSGI instead of e.g. mod_wsgi? It is fast, with a lower memory footprint. It can handle multiple interpreter versions in multiple virtualenvs. It supports the old and new WSGI standards. It can kill misbehaving worker threads, which helps if you are coding those wrongly yourself. It can also handle long-running tasks. Configuration can be in ini files, JSON, environment variables, command line options, etcetera. There is a built-in message-passing system and an embedded (evented/async) HTTP server. It even has a clustering feature (beta at the moment). [Live demo of that feature, which worked yesterday.]
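
A minimal pairing of the two, as a sketch; the socket address, module name and process count are placeholder values, and harakiri is uWSGI's option for killing workers that exceed a timeout:

[uwsgi]
socket = 127.0.0.1:3031
module = myapp:application
processes = 4
master = true
# kill a worker that misbehaves for longer than 30 seconds
harakiri = 30

and on the nginx side:

location / {
    include uwsgi_params;
    uwsgi_pass 127.0.0.1:3031;  # speaks the lowercase uwsgi protocol natively
}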

Some other WSGI servers show great performance too. For me uWSGI performs great and has good features.

Comment from room: look at this WSGI performance comparison.

See the WSGI slides.

Rix Groeneboom (Parasoft): Mijn Overheid: Performance testing in practice

published May 20, 2011

Summary of talk at the PyGrunn conference.

Rix Groeneboom (Parasoft) talks about Mijn Overheid: Performance testing in practice, at the PyGrunn conference in Groningen, The Netherlands. Organized by Paylogic and Goldmund Wyldebeast & Wunderliebe.

('Mijn Overheid' is Dutch for 'My Government'.) I have a new hobby: collecting screen shots. One is a screen shot of 23 March 2010: disruption at DigiD, the central login application of the government, used for example for submitting your tax form. Another one: servers of Cito offline during an exam. Yet another: on the DUO website (the government agency that gives loans to students) you could see data of other students. And: changes made in a test environment showed up on production systems, so people suddenly discovered they were married or had children.

On the Mijn Overheid website you can log in with your DigiD and see all kinds of information about yourself, like home address, home ownership, license plates, speeding tickets. If you agree, the government can send you messages in the system instead of via paper mail. The government wanted a way to know for sure that you have read a message on this website. That system is called the 'berichtenbox', the message box.

We wanted to be able to simulate lots of civilians ('burgers') logging in and using the web site, and see what that does with the load on the servers; what is the 'temperature' of the system? The systems had been tested in isolation, but there were also dependencies between them, and those make it trickier.

Together with the government, we specified performance requirements. We profiled the server implementation (a LAMP setup). Some search queries were very slow as they needed information from several servers; searching for 'gemeente' (municipality) would return almost every item as a search result. We profiled the combination of the MO (Mijn Overheid) website with the GEB (database with information on civilians).

We had only 2 DigiD accounts available to test the load on the system. That is not quite enough. So we faked/stubbed the DigiD server. This way we could let the test MO website talk to our fake DigiD server with much more load, which also made the testing simpler.

We created a platform for SOA and chain testing, written in Java, Jython and Eclipse. Agile and continuous integration, with a command line and a web interface. It could run on Windows, Linux and Solaris, with a Linux VMware image available. External integration with version control, defect tracking and test data generation.

For the load testing we used Python and MySQL. We implemented intelligence, like simple 'wait' steps and generating valid BSN (Dutch social security) numbers. Every request was logged in MySQL; this database also held a shadow administration of user accounts, with extra checks to see if someone had been able to see information of a different account.
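
A valid BSN passes the Dutch '11-test': its nine digits, weighted 9 down to 2 and then -1, must sum to a multiple of 11. A sketch of a generator along those lines (the talk did not show code; this is just the arithmetic):

import random

# Weights for the Dutch '11-test' as applied to BSN numbers.
WEIGHTS = [9, 8, 7, 6, 5, 4, 3, 2, -1]

def is_valid_bsn(number):
    """True if the weighted digit sum of a nine-digit number is divisible by 11."""
    digits = [int(c) for c in number]
    return len(digits) == 9 and sum(w * d for w, d in zip(WEIGHTS, digits)) % 11 == 0

def random_bsn():
    """Draw nine-digit numbers until one passes the 11-test (fine for test data)."""
    while True:
        candidate = "%09d" % random.randrange(10 ** 9)
        if is_valid_bsn(candidate):
            return candidate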

The system was very fast under low load, but at 1000 simultaneous users requests could take 20 seconds. That would be quite an extreme load, but good for testing. We found that the database query took 90 percent of the time. Loading the PHP modules at some point also took 8 seconds; slightly smarter programming fixed that (avoiding unnecessary require_once calls in the interpreter). Plus some optimizations in NFS, Apache and the OS.

We talked about process improvements. The load testing sometimes could not be done on the acceptance server because it was being used to test functionality. We figured out where the actual bottlenecks were, which could be on a different system: it's nice if the load on our server is low, but if the user is still waiting for a reaction from a different system, it does not buy them much.

Summary: we set a performance norm, we found the weakest links in the various systems, and the result was a large number of improvements.