Surgical Time Tracking in Python – Sicara | Agile Big Data Development

Why the standard python profiler is not what I need?

Let’s profile our piece of python code with cProfile. To do so, load the cProfile module when you run your script:

python -m cProfile load_and_normalize.py

Alternatively, you can profile the “run” function directly in the python code:

import cProfile
cProfile.run('run()')

In the end, you’ll get an output that looks like this:

Why it doesn’t help me:

  • it gives way too much information: it tracks the time of all called functions, including the one called deep in pandas. Hence I got more ~300 function calls. In my example, I am only interested in the execution time of my own functions i.e., run, load, normalize_ticket_price
  • “tottime” is not relevant: this is the actual time spent in the function excluding the time spent in subcalls. As all calls are tracked, it is almost always 0.000.
  • exploring the profiler results is hard: time results are either print to the console or written in a file. You may use pstats or tools such as cProfilev (basic display with sorting facilities), kcachegring or gprof2dot (show call graph) to explore results, but it is heavy.
  • exporting the result is not easy: in my case, I’d like to store the profiler results in a comprehensive way in an external database.

A simple python decorator for tracking time

Instead, I’ll use this simple python decorator for profiling:

It computes the execution time of the decorated function fn and logs the result right after fn finished to execute:

  • the (line 6) decorator sets the meta-attributes (name, docstring, …) of the decorated function from the original function ones.
  • the try ... finally ensures the execution time is always logged, even if the decorated function raised an exception.

To use it, simply decorate the functions that you decided to profile:

Executing the function run will print the following:

[SimpleTimeTracker] load 0.006
[SimpleTimeTracker] compute_max_ticket_price_per_class 0.003
[SimpleTimeTracker] normalize_ticket_price_per_class 0.004
[SimpleTimeTracker] run 0.010
  • it is simple: the decorator is 24 lines long and has no dependencies.
  • it is surgical: I only got timings of the function I decided to profile. This allows you to quickly understand which part of your script is slower.
  • it is easy to export the results: the logging function _log passed to the decorator handles the profiler results. In the example above, it simply prints timings to the console, but I could have store the profiler results to a database.

A better python decorator that profiles partial times

The partial time of a function foo consists of its total execution time i.e., the time delta between entering and exiting foo, minus the execution times of the functions that are called inside the body of foo.

In the example above, the partial execution time of the function run is zero: its total execution time (10ms) minus execution times of normalize_ticket_price (4ms) and load (6ms). It means that if I want to improve the performance of the function run, I may focus on improving load and normalize_ticket_price, as literally no time is spent in the body of the function run (which is obvious when looking at the code).

To profile partial execution times, I improved a little bit the decorator:

Note that we do need a global variable _total_time_call_stack to compute partial execution times.

Here is the profiling results of when running the function run.

[BetterTimeTracker] load 0.006 0.006
[BetterTimeTracker] compute_max_ticket_price_per_class 0.003 0.003
[BetterTimeTracker] normalize_ticket_price_per_class 0.004 0.001
[BetterTimeTracker] run 0.010 0.000

Full code, documentation and examples are available here.

[Edit]

Following this comment on Reddit, I tried pyflame + flamegraph tools for profiling. It’s amazing: it shows which part of the python code takes more time in a concise way and without modifying the code (e.g., adding decorators).

To do so, you need to install pyflame and flamegraph on your computer, and to run the following command:

pyflame -t python your_code.py | flamegraph > profile.svg

You’ll get an SVG file that looks like this:

You can download the SVG file here. In a nutshell:

  • Each bar represents a call to a function;
  • Colors are randomly picked among warm colors 🙂
  • The Y-axis represent call stacks;
  • The X-axis is not the time: pyflame checks every milisecond the current call stack and the bar width represents the overall number of ticks it was in the function.

In this example, you can see most of the time (91.17%) is spent loading pandas. To get more detail about the run function, do ctrl+f to search the SVG and click on the matching element (the tiny purple bar in the right part), you should get something like:

In the end, both pyflame and the profiling decorators allow you retrieve the execution time ratio spent in load and normalize_ticket_price_per_class functions which is about 60/40.

Browse

Article by channel:

Read more articles tagged: Agile