Speed up your Python code

published Jul 05, 2006, last modified May 18, 2007

by Dr. Stefan Schwarzer (SSchwarzer.com)

Premature optimization is the root of all evil - Donald Knuth (often attributed to C.A.R. Hoare)

  • Slow startup times are okay if you hardly ever start the program.
  • Speed may not be important for a nightly cronjob.
  • What counts is the actual user experience: does the program feel slow?

Plan:

  1. Get the architecture right
  2. if bugs: fix them
  3. while code is too slow:
    • find worst bottleneck by profiling
    • try to optimize, running unit tests
    • if not faster: undo last code changes

Bottlenecks:

  • The cause may simply be the processor, memory, network, I/O, or the database rather than your code.
  • Use OS tools such as time, top, dstat, gkrellm, or xosview to see which program is causing the problem.
  • Python tools: profile (cProfile as of Python 2.5), hotshot, print statements
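
As a minimal sketch of the profiling step, the following runs a small workload under cProfile and prints the most expensive calls. The functions slow_sum and main are hypothetical stand-ins for real code, and the example assumes a current Python 3 interpreter rather than the Python 2.5 mentioned above.

```python
import cProfile
import io
import pstats


def slow_sum(n):
    """Deliberately wasteful: builds a full list just to add it up."""
    return sum([i * i for i in range(n)])


def main():
    # Hypothetical workload; replace with the real entry point.
    return [slow_sum(20000) for _ in range(20)]


def profile_main():
    """Run main() under cProfile and return the report as a string."""
    profiler = cProfile.Profile()
    profiler.enable()
    main()
    profiler.disable()

    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    # Sort by cumulative time and show only the ten most expensive entries.
    stats.sort_stats("cumulative").print_stats(10)
    return out.getvalue()


report = profile_main()
print(report)
```

For a whole script, the same report is available without code changes via python -m cProfile -s cumulative script.py.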

Big O notation

  • describes how running time grows as the dataset increases, and so distinguishes slower from faster algorithms.

Try to avoid O(n^2) and slower algorithms for large sizes of n.
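
To illustrate the difference, here is a sketch of the same task written with quadratic and with roughly linear cost. The function names and the sample data are made up for this example; the timings depend on the machine, so none are quoted.

```python
import timeit


def common_items_quadratic(a, b):
    """O(len(a) * len(b)): 'x in b' scans the list b for every x."""
    return [x for x in a if x in b]


def common_items_linear(a, b):
    """Roughly O(len(a) + len(b)): set lookups are constant time on average."""
    b_set = set(b)
    return [x for x in a if x in b_set]


a = list(range(2000))
b = list(range(1000, 3000))

# Both versions return the same result; only the cost differs.
assert common_items_quadratic(a, b) == common_items_linear(a, b)

print("quadratic:", timeit.timeit(lambda: common_items_quadratic(a, b), number=3))
print("linear:   ", timeit.timeit(lambda: common_items_linear(a, b), number=3))
```

Doubling the size of both lists roughly quadruples the work in the first version but only doubles it in the second.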

Performance may be less important than code readability and maintainability.

Strategies:

  • change algorithms
  • change the architecture
  • avoid nested loops
  • move loop-invariant code out of the loop
  • update only changed data
  • cache instead of recomputing or reloading (but this can be error-prone)
  • conversely, keep large or rarely used data out of the cache: a cache that grows too big can exhaust memory and start eating into disk space (swapping)
  • use multithreading for I/O-bound code

File operations optimization:

  • read file completely and then process if it is small
  • read and process line by line if it is large
  • use database instead of flat files
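
The first two points can be sketched like this. The function names, the search term and the throwaway sample file are assumptions made for the example.

```python
import os
import tempfile


def count_matching_small(path, needle):
    """Small file: read it completely, then process the text in memory."""
    with open(path) as f:
        data = f.read()
    return sum(1 for line in data.split("\n") if needle in line)


def count_matching_large(path, needle):
    """Large file: iterate over the file object, one line in memory at a time."""
    count = 0
    with open(path) as f:
        for line in f:
            if needle in line:
                count += 1
    return count


# Usage with a throwaway sample file.
with tempfile.NamedTemporaryFile("w", suffix=".log", delete=False) as tmp:
    tmp.write("ok\nerror: disk\nok\nerror: net\n")
    sample_path = tmp.name

small_result = count_matching_small(sample_path, "error")
large_result = count_matching_large(sample_path, "error")
os.remove(sample_path)
```

Both functions give the same answer; the second keeps memory use flat no matter how large the file is.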

Python specific:

  • Use python -O (this removes assert statements; the speed gain is usually small)
  • avoid exec and eval
  • avoid from module import *
  • shortcut namespace searches (e.g. opj = os.path.join)
  • change code to use C-coded objects: lists, tuples, dictionaries, sets.
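
The namespace shortcut can be sketched as follows. The alias opj is taken from the list above; the function names and the test data are made up. The gain is typically small and varies between Python versions, so measure before relying on it.

```python
import os.path
import timeit


def join_all_plain(names):
    # Each iteration looks up the global 'os', then '.path', then '.join'.
    return [os.path.join("base", n) for n in names]


def join_all_alias(names):
    opj = os.path.join  # resolved once; 'opj' is a fast local name
    return [opj("base", n) for n in names]


names = ["file%d.txt" % i for i in range(1000)]

# Same result either way; only the name lookups differ.
assert join_all_plain(names) == join_all_alias(names)

print("plain:", timeit.timeit(lambda: join_all_plain(names), number=200))
print("alias:", timeit.timeit(lambda: join_all_alias(names), number=200))
```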

Don't waste developer time on unnecessary optimizations.