Python Users Netherlands meeting 15 January 2014
Summary of the meeting of the Dutch Python Users group on 15 January 2014.
We were welcomed at the offices of Schuberg Philis in Schiphol-Rijk for food, drinks, presentations, lightning talks and meeting fellow Pythonistas. Thanks for organizing!
Schuberg Philis is doing managed hosting, with high uptime.
Pini Reznik - Ansible, Docker and Django
Pini Reznik works for Ugly Duckling.
In 1995 you had thick clients and thick servers: a well-defined stack of OS, runtime and middleware, in one monolithic physical structure.
In 2014 you often have a thin app on mobile or tablet, plus various web front-ends. You have multiple environments, for example several (parts of) applications and servers running on various versions of Python and different operating systems.
Compare this with lots of different cargo that may need to be moved by trucks, trains, planes, ships, or all of them at some point. Some sixty years ago shipping switched to containers. You put things inside containers in your factory. A crane only supports one container size: it is simplified. Much easier than fitting a few pianos and barrels of oil without a container.
Common challenges in the pipeline:
- development: setting up your environment, especially when you first start at a company or on a new project
- testing: getting a clean environment, with good tests
- acceptance: is it really similar or equal to production?
So: put your software in containers.
What is Docker? http://docker.io says: "Docker is an open-source project to easily create lightweight, portable, self-sufficient containers from any application."
We have a Java project where every test is run in a separate Docker container. This is very quick.
Docker is not a VM (virtual machine). Docker runs directly on the host operating system. A VM has much higher overhead.
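Putting your software in a container starts with a Dockerfile describing the image. A minimal hypothetical sketch (base image, packages and entry point are made up):

# Dockerfile -- build with: docker build -t myapp .
FROM ubuntu
RUN apt-get update && apt-get install -y python
ADD . /app
CMD ["python", "/app/main.py"]

You then version this file together with your code.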
Orchestration: defining and deploying multiple environments. Ansible is an orchestration engine. Key concepts:
- agentless: you only need your certificate or password and you can run on any machine. For most stuff you usually need Python on that machine though.
- language agnostic: you just need to be able to execute code; the code itself does not need to be Python. It can be a shell script, C, anything that is supported by the target machine.
- inventory: smart way to manage all the hosts. Inventory is in /etc/ansible/hosts: here you define that servers A and B are webservers, and servers C, D and E are database servers.
- playbooks: scripts that define hosts, vars, tasks, handlers. What to execute and where to execute it (see the sketch after this list).
- modules: functionality written by somebody, for example for integration with docker.
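As a sketch of how inventory and playbooks fit together (host names and the task are made up):

# /etc/ansible/hosts -- the inventory
[webservers]
a.example.com
b.example.com

[databases]
c.example.com

# site.yml -- a minimal playbook
- hosts: webservers
  tasks:
    - name: make sure nginx is installed
      apt: name=nginx state=present

Running ansible-playbook site.yml then executes the task on both webservers.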
Ad-hoc commands are for example:
ansible all -a "reboot"
ansible webservers -a "reboot"
There is a commercial UI around Ansible. The UI costs money; the rest is free.
Docker + Ansible = software configuration management done right. Everything we need for building our software is now finally in version control.
Demo. If you have a Docker image called ubuntu, this just takes 122 milliseconds:
time docker run ubuntu ls
The Docker module for Ansible is a bit out of date; Docker itself is still in early development.
Roderick Schaefer - Django as your Backbone
Roderick Schaefer is currently working for Schuberg Philis. See also https://wehandle.it. We do mission-critical outsourcing. Say you run a bank and need guaranteed 100 percent uptime. We can do that. We embrace devops. My devops team works on a project called Connect. We love Python and Django.
Old school web development: Django with the MTV pattern (model, template, view), Django's take on MVC.
New school: API driven. We use: Python, Django, jQuery, TastyPie (REST framework), Backbone, Require.js, Underscore, Handlebars, Backbone-TastyPie. It just makes sense to work API driven. With APIs you can securely expose databases. The front-ends consume, process and present. Your average phone is faster than your average few-years-old server.
Introduction to TastyPie. This is a tool you hook up to Django to expose your Django models. It can also expose non-ORM data. It includes authorization and serialization to JSON.
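As a minimal sketch of what such a resource looks like (the Customer model is hypothetical):

from tastypie.authorization import Authorization
from tastypie.resources import ModelResource
from crm.models import Customer  # hypothetical Django model

class CustomerResource(ModelResource):
    class Meta:
        queryset = Customer.objects.all()
        resource_name = 'customer'
        authorization = Authorization()  # wide open; lock this down in production

Registered in urls.py through tastypie.api.Api, this serves the model as JSON at a /api/v1/customer/ style endpoint.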
Backbone. Frontend MVC framework. Uses Underscore.js and friends. Template rendering, routing (note: try to do only routing here, do not corrupt it with other stuff), models, collections, events.
For PATCH support and file upload to work with Backbone plus TastyPie, you need a few changes in your code. There is form generation with Backbone-Forms.js.
A tip when transitioning to API-based development: wrap the JavaScript includes in your base.html in a block, and empty that block in your Backbone-powered apps. Include some legacy-scripts template from the template of your Backbone app, for compatibility with generic stuff like the main navigation.
Challenges are URL reversal and session state awareness.
Check http://TodoMVC.com to get familiar with various MVC frameworks, including Backbone.
Look into Backbone SEO if you want to use this for sites where you do not want a single page but want to be indexable by search engines. It is possible.
And now the lightning talks.
Pawel Lewicki - Plone
Working with gw20e in Amsterdam. Plone is a Python CMS. It has everything you want. It may be scary for most of you, but it does not have to be. Try out the installer, it is quite easy. I show a recorded demo. You have content types, workflow states, you can see changes to pages and revert them.
It is Python. Enterprise ready, scales well. A fully functional application. Lots of add-ons if you want those. Very simple from a user perspective: you edit in the same environment that visitors see. There is a user manual as a book.
Bahadir Cambel - PythonHackers.com
I use lots of languages, but I like Python. But where do you get information? IRC, GitHub, Bitbucket, svn, python.org, blogs. Come to the PythonHackers site.
Learn and share with REPL (Read - Evaluate - Print - Loop). Discover, connect, contribute and be awesome.
Find hackers like you. Discover and discuss open source projects. Write, talk, share code. Build applications together. Read and write articles.
A single platform only for Python, with Python hackers like you. A bit Twitter-ish, with messages and following, channels, a timeline.
What we use technically: Flask plus plugins, memcached, Redis, Postgres, Cassandra, Autobahn, Fabric, CoffeeScript. Future: Apache Kafka, some Clojure.
Programming is a personal expression.
Douwe van der Meij - Concepts in Django
Working at gw20e. A project may have a CRM and a webshop package, with some interconnection. You may want to open source the webshop, but how can you make it work without the CRM?
General concept:
- Producers contain data.
- Consumers need producers.
For this example we have the concept of a Customer. The CRM app produces a Customer; the webshop app consumes a Customer. Define a CustomerConcept class and give the webshop a foreign key to the CustomerConcept.
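A sketch of what that could look like in Django models (hypothetical code, not the actual concepts app from the talk):

from django.db import models

# concepts app: the only shared dependency
class CustomerConcept(models.Model):
    pass

# crm app: produces a Customer tied to the concept
class Customer(models.Model):
    concept = models.OneToOneField('concepts.CustomerConcept', on_delete=models.CASCADE)
    name = models.CharField(max_length=100)

# webshop app: consumes the concept, never imports crm
class Order(models.Model):
    customer = models.ForeignKey('concepts.CustomerConcept', on_delete=models.CASCADE)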
I created a concepts Django app. This makes apps reusable without dependencies. The only dependency is a concept. You may choose your own concept producer, different for various clients.
Related work: Zope Component Architecture.
Ilja Heitlager - MicroPython
I am a Kickstarter fan. Information Officer at Schuberg Philis.
Damien George came up with the MicroPython project on Kickstarter.
PyMite came first: Python on a microcontroller.
MicroPython has a dedicated board. It implements Python 3.3, with a compiler/code generator on the board itself, which makes it much faster. There is library support and a wifi add-on board.
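As a sketch of what Python on the board looks like, in the style of the project's documentation (the board had not shipped yet at this point, so this is an assumption):

import pyb

led = pyb.LED(1)    # one of the on-board LEDs
while True:
    led.toggle()
    pyb.delay(500)  # milliseconds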
Participate on http://www.micropython.org
Expected: April 2014.
Holger Krekel - re-inventing Python packaging & testing
Holger Krekel gives the keynote at PyGrunn, about re-inventing Python packaging and testing.
See the PyGrunn website for more info about this one-day Python conference in Groningen, The Netherlands.
I am @hpk42 on Twitter.
I started programming in 1984. I am going to tell you how distribution and installation worked in those days, as you are too young to know. A friend and I would sit down after school with a magazine. One of us would read hexadecimal numbers from it aloud and the other typed them in. An hour and a half later we could play a pacman game.
Apprentice: "Can anyone tell me why X isn't finished?"
Master: "It takes a long time to write software."
Projects take time. CPython is 22 years old.
Where does all this effort go? Into mathematical algorithms? No: deployment takes a huge bite. Software needs to run on different machines, needs to be configured, tested, packaged, distributed, installed, executed, maintained, monitored, etcetera.
The problem with deployment is the real world. Machines are different, users are different, networks are different, operating systems are different, software versions are different.
There are producers of software. If, as a producer, I change an API or a UI, that creates a danger for my users. This means releasing a new version is dangerous, because for the users deploying the new version is potentially dangerous.
A lot can be solved by automation. Automated tests help. You need to communicate (allow users to track changes, have discussions). Configurations should be versioned so you can go back to earlier versions or at least see what the difference is. You need a packaging system and a deployment system. This may be more important than choosing which language to use.
The modern idea to simplify programming is usually: let's have only one way so it is clear for everyone what to do. Oh, and it should be my way.
Standardization fosters collaboration, even if the standard is not perfect. But tools that come out of this standardization are more important than the standardization document itself.
Are standardized platforms a win? For example the C64/Amiga, iOS, Android, Debian, .NET, or company-wide choices for virtual machines and packaging. This reduces complexity, but increases lock-in. You may not want to bet your whole business on one platform.
Modernism: have one true order. For example, the Principia Mathematica aimed at one system of mathematics that could do everything. Gödel proved this was impossible.
Let's check the koans of Perl and Python. Perl says there is more than one way to do it. Python says there should be one - and preferably only one - obvious way to do it. Both acknowledge that multiple ways exist. You need to take that into account.
A note on the Python standard library: Python includes lots of functionality. This was a good idea in the past. Today, PyPI often provides better APIs, and we can still improve it.
Perl has the CPAN, Comprehensive Perl Archive Network. Lots of good structure in there.
Python is still catching up. Python is growing declarative packaging metadata to replace the executable setup.py file. It is trying to standardize on pip and wheels, but easy_install remains a possibility. Uploading and other PyPI server interaction is hard today. The server is hard to deploy on a laptop. There are no enforced version semantics. It has a brittle protocol. It is hard to move away from setup.py though.
http://pypi-mirrors.org lists about eight mirrors of the official http://pypi.python.org server. Most are out of date or not updating at all. Not good.
Perl and Python are both not living up to their koans. Python has a lot to improve.
What needs to be improved? setuptools and distribute are being merged. The bandersnatch tool is being deployed, which is much better and faster for mirroring. Several PEPs are being discussed and considered. The people proposing these PEPs are talking to each other, so communication is good. New version comparison, new packaging metadata version, new rules on PyPI, etcetera. A lot is happening.
We should be aware of the standardization trap: you try to solve the five existing ways of doing something by adding a sixth way. To avoid this, don't demand that the world changes first before your tool or idea can be used. To a certain degree Python fell into that trap, but that is outside the scope for this talk.
I would like to focus on integration of meta tools. These can configure and invoke existing tools and make them work for most use cases. You can enable and facilitate new technology there.
Testing
Python has lots of testing tools, like nose, py.test, unittest, unittest2, zope.testing, make test, setup.py test.
tox is a "meta" test running tool. Its mission is to standardize testing in Python. It is a bit like a Makefile. It runs the tests with the tools of your choice. It acts as a front-end to CI servers. See http://tox.testrun.org for details.
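A minimal tox.ini sketch, assuming a pytest-based test suite:

[tox]
envlist = py27, py33

[testenv]
deps = pytest
commands = py.test

Running tox then builds a virtualenv per interpreter, installs the package plus its deps, and runs the commands.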
Travis CI is a "meta" test running service. It configures all kinds of dependencies, priming your environment.
devpi
I have a new idea, devpi: support packaging, testing and deployment. The devpi-server part is a new compatible Python index and upload service. The client part has sub commands for managing release and QA workflows.
Why a new index server? The existing solutions lacked, among other things, an automatically tested, extensible code base.
devpi-server is self-updating. It is a selective mirror. It does not try to update all packages on the original PyPI, just the ones that you actually use.
But working with multiple indexes is burdensome, so devpi provides "workflow" subcommands: use to set the current PyPI index, upload to build and upload packages from a checkout, and test to download and test a package. So you can create a package, upload it to a local test PyPI, test the package and then upload it to the real PyPI.
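Roughly, with a made-up index URL and package name, the workflow looks like:

devpi use http://localhost:3141/testuser/dev   # select the index to work against
devpi upload                                   # build and upload from the current checkout
devpi test mypackage                           # download the package and run its tests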
I did the last pytest releases using devpi.
Development plans: MIT licensed, test driven development. Get early adopters.
The main messages from this talk:
- Evolve and build standards, do not impose them.
- Integrate existing solutions, do not add yet another way, if possible.
- Let's share this tooling and collaborate. Maybe you have some tool to reliably create a Debian package from a Python package. Make it available and get feedback and code from others.
Strive for something simpler; see the requests library. Simplicity is usually something that emerges from using a piece of software.
Luuk van der Velden - Best practices for the lone coder syndrome
Luuk van der Velden talks about best practices for the lone coder syndrome, at PyGrunn.
See the PyGrunn website for more info about this one-day Python conference in Groningen, The Netherlands.
I do a PhD at the Center for Neuroscience, University of Amsterdam. I switched from Matlab to Python a few years ago. I am a passionate and critical programmer.
Programming is not a substantial part of most science educations, apart from obvious fields like computer science. Experiments in the sciences generate more and more data, so the demands on computing power and data analysis keep growing.
A PhD student, whom we take as our example of a lone coder, is responsible for his or her own project and does the work alone: experiments, analysis. Collaborations do happen, but they are asymmetric. I can talk to others, but they usually do not program together with me. Or they pass me some Matlab code that I then have to translate into Python.
A PhD takes about four years, so your code needs to keep running for all that time, maybe longer. Development is continuous.
Cutting corners when working on your own is attractive. You are the only one who uses the code, and it works, so why bother improving it for corner cases? High standards demand discipline. So you end up with duplicated code, unreadable code, no documentation, unstructured functionality with no eye for reuse, and code rot.
Then there is the scripting pitfall. Scripting languages like Python are a flexible tool to link different data-producing systems, process data, and create summaries and figures. Pitfalls of typical scripts are: data hiding, hiding of complexity, poor division of functionality (housekeeping versus processing), lack of scalability, and no handles for code reuse.
What a script for scientific analysis should do is define what you want, concisely.
Prototyping is essential for researching a solution. It is used continuously. Consolidation is very different from prototyping. Some things are better left as a prototype.
You should have a hard core of software that is well tested. In your scripts you use this core, instead of copying an old full script. 'Soft' code sits between the hard core and the script, as an interface.
As a scientist you were not educated as a programmer. So you should get educated, and as Python programmers we should educate them. Presently the emphasis is on getting work done, not on programming. Matlab is the default language; it was originally a stripped-down version for teaching students, but everyone kept using it. Closed source software goes against the scientific ethos.
Python offers a full featured scientific computing stack. Python scales with your skills. You can use imperative code, functional, object oriented or meta programming. Python is free, so you can use the latest version without needing to pay for an upgrade like with Matlab.
We can organize courses and workshops, for example Software Carpentry.
Álex González - Python and Scala smoke the peace pipe
Álex González talks about Python and Scala, at PyGrunn.
See the PyGrunn website for more info about this one-day Python conference in Groningen, The Netherlands.
Thrift is an interface definition language. You can use it to work with several languages at the same time. It gives you basic types, transport, protocol, versioning, and processors (input, output).
It lets, for example, your Python client talk to a Scala server, or the other way around.
Types: bool, byte, several integer sizes, string, struct. Also containers: list, set, map (dict in Python). Plus exceptions, and services (which define methods, for example).
Transport:
- TFileTransport uses files.
- TFramedTransport, for non-blocking servers, sends data in chunks.
- TMemoryTransport uses memory for IO.
- TSocket uses blocking sockets.
- TZlibTransport compresses the transport.
Protocols: binary, compact, dense, with and without metadata.
Versioning. For every field in a struct you should add an integer identifier, otherwise you automatically get negative numbers.
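A minimal sketch of a Thrift IDL file with explicit field identifiers (names are made up):

struct Customer {
  1: i32 id,
  2: string name,
  3: optional string email,
}

service CustomerService {
  Customer getCustomer(1: i32 id)
}

The Thrift compiler then generates matching client and server code for each target language.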
Similar things are SOAP, CORBA, COM, Pillar, Protocol buffers (Google's protobuf is really similar to Thrift).
I am @agonzalezro on Twitter. See also http://agonzalezro.github.io
Armin Ronacher - A year with MongoDB
Armin Ronacher talks about MongoDB, at PyGrunn.
See the PyGrunn website for more info about this one-day Python conference in Groningen, The Netherlands.
I do computers, currently at Fireteam. We do the internet for pointy-shooty games.
I started out hating MongoDB a year ago. Then it started making sense after a while, but I ended up not liking it much. But: MongoDB is a pretty okay data store, as Jared Heftly says. We are just really good at finding corner cases.
MongoDB is like a nuclear reactor: if you use it well, it is safe. I said that in October. Currently I am less enthusiastic.
We had a game. Christmas came around. Server load went up a lot.
Why did we pick MongoDB initially? It is schemaless, the database is sharded automatically, it is sessionless. But schemaless is just wrong, MongoDB's sharding is annoying, thinking in records is hard, and there is a reason people use sessions.
MongoDB has several parts: mongod, mongoc, mongos. So it has many moving parts.
First we failed ourselves. We were on Amazon, which is not good for databases. Mongos and mongod were split but on the same server, which meant that they were constantly waiting on each other. We went to two cores and then it was fine. Still, EBS (Elastic Block Store) is not good for IO, so not good for databases. Try writing a lot of data for a minute, just with dd, and you will see what I mean.
MongoDB has no transactions. You can work around this, but we really did need them. It is meant for document-level operations, storing documents within documents, but that did not really work for us. Your mileage may vary.
MongoDB is stateful. It assumes that the data is saved correctly. If you want to be sure, you need to ask it explicitly.
It crashes a lot. We did not update from 2.0 for a while because we would have hit lots of segfaults.
To break your cluster: add a new primary, remove the old primary, but do not shut the old primary down (this step is the bad one!); then on a network partition one of them overrides the config of the other in the mongoc. That happened to us during Christmas.
Schema versus schemaless is like static typing versus dynamic typing. Ever since C# and TypeScript, static typing with an escape hatch to dynamic typing wins. I think almost everyone adds schemas to MongoDB. It is what we do anyway.
getLastError() is just disappointing. Because you have to call it all the time, things are always slower.
There is a lack of joins. This is called a 'feature'. I see people doing joins in their code by hand; the database should be much better at that than the average user. MongoDB does not have Map-Reduce either, except a version that hardly counts.
When using the find or aggregate functions in the API to get records, you can basically get the equivalent of SQL injection: if user input ends up as a key starting with a dollar sign, MongoDB interprets it as an operator.
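A minimal sketch of guarding against this in Python with pymongo (database, collection and field names are made up):

from pymongo import MongoClient

users = MongoClient().app.users

def find_user(name):
    # If 'name' arrives as a parsed structure such as {"$gt": ""},
    # the query would match every document instead of one exact name.
    if not isinstance(name, str):  # reject operator injection
        raise ValueError("name must be a plain string")
    return users.find_one({"name": name})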
Even MySQL supports MVCC, so transactions. MongoDB: no.
MongoDB can only use one index per query, so it is quite limited. Negations never use indexes; not too unreasonable, but very annoying. There is a query optimizer though.
Making MongoDB far less slow on OS X:
mongod --noprealloc --smallfiles --nojournal run
Do not use : or | in your collection names, or it will not work if you try to import it on Windows.
A third of the data volume is keys (the field names). That is just insane, and a reason to use schemas.
A MongoDB cluster needs to boot in a certain order.
MongoDB is a pretty good data dump thing. It is not a SQL database, but you probably want a SQL database, at least until RethinkDB is ready. Probably we would have had similar problems with RethinkDB though.
It is improving. There is a lot of backing from really big companies.
I don't want to do this again. I want to use Postgres. If I ever get data that is so large that Postgres cannot handle it, I have apparently done something successful and I will start doing something else. Postgres has already solved so many problems at the database level that you do not have to come up with solutions yourself at a higher level.
Without a doubt, MongoDB will get better and be the database of choice for some problems.
The project we use it for still runs on MongoDB, and that will probably stay that way.