Speed up your Python code
by Dr. Stefan Schwarzer (SSchwarzer.com)
"Premature optimization is the root of all evil" - Donald Knuth (often attributed to C.A.R. Hoare)
- Slow startup times are okay if you hardly ever start the program.
- Speed may not be important for a nightly cronjob.
- Actual user experience: does the program feel slow?
Plan:
- Get the architecture right
- if bugs: fix them
- while code is too slow:
  - find worst bottleneck by profiling
  - try to optimize, running unit tests
  - if not faster: undo last code changes
Bottlenecks:
- The bottleneck may not be in your code at all: it may be the processor, memory, network, I/O, or the database.
- Use OS tools, e.g. time, top, dstat, gkrellm, xosview, to see which program the problem is in.
- Python tools: profile (cProfile since Python 2.5), hotshot, print statements
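As an illustration of the profiling step, here is a minimal sketch using cProfile and pstats from the standard library. The functions slow_sum_of_squares, main and profile_report are hypothetical names invented for this example; they stand in for whatever your real program does.

```python
import cProfile
import io
import pstats


def slow_sum_of_squares(n):
    """Deliberately naive helper (hypothetical) that we want to profile."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def main():
    """Hypothetical entry point standing in for the real program."""
    return [slow_sum_of_squares(20000) for _ in range(20)]


def profile_report(func, top=5):
    """Run func under cProfile and return (result, report text)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # Sort by cumulative time so the most expensive call chains come first.
    stats.sort_stats("cumulative").print_stats(top)
    return result, buffer.getvalue()


if __name__ == "__main__":
    _, report = profile_report(main)
    print(report)
```

For a whole script, running "python -m cProfile -s cumulative yourscript.py" gives a similar report without changing the code.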
Big O notation
- Describes how running time grows as the dataset increases, which separates slower from faster algorithms.
- Try to avoid O(n^2) and slower algorithms for large n.
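To make the difference concrete, here is a small sketch (not from the talk) that solves the same task twice: once with a nested loop in O(n^2), once with a set in O(n) on average. The function names are made up for this example, and the set-based version assumes the items are hashable.

```python
def has_duplicates_quadratic(items):
    """O(n^2): compares every pair of elements."""
    for i in range(len(items)):
        for j in range(i + 1, len(items)):
            if items[i] == items[j]:
                return True
    return False


def has_duplicates_linear(items):
    """O(n) on average: one pass, using a set for membership tests."""
    seen = set()
    for item in items:
        if item in seen:
            return True
        seen.add(item)
    return False
```

For a handful of items the two are indistinguishable; the gap only shows once n grows into the thousands.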
Performance may be less important than code readability and maintainability.
Strategies:
- change algorithms
- change the architecture
- avoid nested loops
- move loop-invariant code out of the loop
- update only changed data
- cache instead of recompute or reload (but this can be error-prone)
- conversely, don't cache too much: a large cache can exhaust memory and push the system into swapping to disk
- use multithreading for I/O-bound code
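The two caching points above (cache instead of recomputing, but keep the cache from growing without limit) can be sketched with functools.lru_cache, which is one possible approach and not necessarily what the talk had in mind. The function expensive_lookup is a hypothetical stand-in for a slow computation or a reload from disk.

```python
import functools


@functools.lru_cache(maxsize=128)  # bounded: least recently used entries are evicted
def expensive_lookup(key):
    """Hypothetical stand-in for a slow computation or a reload from disk."""
    return sum(ord(ch) for ch in key) ** 2


first = expensive_lookup("spam")   # computed
second = expensive_lookup("spam")  # served from the cache

# If the underlying data changes, the cached values become stale;
# this is the error-prone part. Clearing the cache is the blunt fix.
expensive_lookup.cache_clear()
```

The maxsize value of 128 is arbitrary here; pick it according to how much memory the cached values take.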
File operations optimization:
- read file completely and then process if it is small
- read and process line by line if it is large
- use database instead of flat files
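A minimal sketch of the first two points, assuming a UTF-8 text file and a made-up task (counting the lines that contain a given string). Both function names are hypothetical.

```python
def count_matching_lines_small(path, needle):
    """Read the whole file at once; fine when it comfortably fits in memory."""
    with open(path, encoding="utf-8") as handle:
        lines = handle.readlines()
    return sum(1 for line in lines if needle in line)


def count_matching_lines_large(path, needle):
    """Iterate line by line; the file is never held in memory as a whole."""
    count = 0
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            if needle in line:
                count += 1
    return count
```

Both return the same result; they differ only in how much of the file is in memory at once.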
Python specific:
- use python -O
- avoid exec and eval
- avoid from module import *
- shortcut namespace searches (e.g. opj = os.path.join)
- change code to use C-coded objects: lists, tuples, dictionaries, sets.
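The last two points can be illustrated with a short sketch. The function names are invented for this example, and the size of the gain depends on the interpreter version, so measure (for instance with timeit) before relying on it.

```python
import os.path


def build_paths_plain(base, names):
    """Looks up os, then path, then join on every iteration."""
    result = []
    for name in names:
        result.append(os.path.join(base, name))
    return result


def build_paths_aliased(base, names):
    """Binds os.path.join to a local name once, before the loop."""
    opj = os.path.join
    return [opj(base, name) for name in names]


def filter_allowed(names, allowed):
    """Uses a set, so each membership test is a hash lookup, not a list scan."""
    allowed_set = set(allowed)
    return [name for name in names if name in allowed_set]
```

The alias only pays off inside hot loops; elsewhere the plain os.path.join call is easier to read.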
Don't waste developer time on unnecessary optimizations.