
perf(events-stats): Zerofill very slow from using plus operator (#41241)

The zerofill function in events-stats can be very slow, and the culprit
appears to be the `+` operator on lists. `+` creates a new list containing
copies of both operands, whereas `.extend` modifies the list in place,
which is much more performant.
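To illustrate the difference (a minimal sketch, not part of the commit): `+` always allocates a fresh list and leaves the left operand untouched, while `.extend` mutates the existing list without reallocating it as a whole.

```python
a = [1, 2]
b = [3, 4]

# '+' builds a brand-new list object; 'a' is unchanged.
c = a + b
assert c == [1, 2, 3, 4] and a == [1, 2]
assert c is not a

# '.extend' appends to 'a' in place; no new list is created.
before = id(a)
a.extend(b)
assert a == [1, 2, 3, 4]
assert id(a) == before
```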

I noticed that the `/events-stats` endpoint can be very slow, with the
higher percentiles noticeably worse. After examining some transactions,
it looks like `top_events.transform_results` is the main culprit.


![image](https://user-images.githubusercontent.com/10239353/201209955-58934c16-65b0-4c72-8bc9-b22627eb0a28.png)

Looking at some profiles for this transaction, we see a large portion of
the time was spent inside the zerofill function. The stacks seem to
point at the `dateutil.parser.parse`.


![image](https://user-images.githubusercontent.com/10239353/201210193-cb6d09ef-f773-47f6-8758-a93d5548ab37.png)

After some light benchmarking, it may be a coincidence that this
particular stack was collected, but the culprit does appear to be
nearby. On closer examination, I noticed that we were using `+` to
concatenate lists, which is slower than `.extend` by an order of
magnitude.
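A minimal microbenchmark sketch of the claim (the helper functions and sizes here are illustrative, not from the commit): repeatedly concatenating with `+` recopies the accumulated result on every iteration, giving quadratic behavior, while `.extend` copies each element exactly once.

```python
import timeit

def concat_plus(chunks):
    # 'rv + chunk' allocates a fresh list each iteration, copying all
    # previously accumulated elements again: O(n^2) overall.
    rv = []
    for chunk in chunks:
        rv = rv + chunk
    return rv

def concat_extend(chunks):
    # '.extend' appends in place, so each element is copied once: O(n).
    rv = []
    for chunk in chunks:
        rv.extend(chunk)
    return rv

chunks = [[{"time": i}] for i in range(5_000)]
t_plus = timeit.timeit(lambda: concat_plus(chunks), number=3)
t_extend = timeit.timeit(lambda: concat_extend(chunks), number=3)
print(f"+ operator: {t_plus:.4f}s   .extend: {t_extend:.4f}s")
```

Both functions produce identical results; only the allocation pattern differs, which is why the patch below is a one-line swap.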
Tony Xiao
parent
commit
fb5ff758ce
1 changed file with 1 addition and 2 deletions

+ 1 - 2
src/sentry/snuba/discover.py

@@ -113,8 +113,7 @@ def zerofill(data, start, end, rollup, orderby):
 
     for key in range(start, end, rollup):
         if key in data_by_time and len(data_by_time[key]) > 0:
-            rv = rv + data_by_time[key]
-            data_by_time[key] = []
+            rv.extend(data_by_time[key])
         else:
             rv.append({"time": key})