perf(replays): query directly on replay_id uuid instead of stripped s… (#45715)
Using our base query, removing the string conversion / dash stripping
on the replay_id column reduces peak memory usage by ~26% and improves
speed by ~21%. The queries are pasted at the bottom.
This PR creates a new `Field`, `UUIDField`, because we have to validate /
convert the value before sending our query to ClickHouse. Our minimum
ClickHouse version doesn't have helpful functions for this, so we have
to work around it.
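A minimal sketch of the kind of validation / conversion described above, assuming the field accepts either dashed or dash-stripped IDs and normalizes them to the canonical dashed form before they reach the query. The name `normalize_replay_id` is hypothetical and this is not the `UUIDField` code from this PR.

```python
import uuid


def normalize_replay_id(value: str) -> str:
    """Validate a replay ID and return it in canonical dashed UUID form.

    Hypothetical helper for illustration only; not the actual UUIDField.
    """
    try:
        # uuid.UUID accepts both the 32-character hex form and the dashed form.
        return str(uuid.UUID(value))
    except (ValueError, AttributeError, TypeError) as exc:
        # Reject anything that is not a well-formed UUID before querying.
        raise ValueError(f"Invalid UUID: {value!r}") from exc
```

Doing the normalization in application code means the query can compare against the UUID column directly, rather than converting the column to a string on every row.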
We also now have to strip dashes from replay_ids in the post-processing
of our queries. We could do the same for error_ids and trace_ids in a
future PR if we want that optimization on those fields as well.
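The post-processing step could look roughly like the sketch below. It assumes result rows arrive as dictionaries with a `replay_id` key; both that shape and the function name `strip_replay_id_dashes` are assumptions for illustration, not the code in this PR.

```python
from typing import Any


def strip_replay_id_dashes(rows: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Return copies of the rows with dashes removed from replay_id.

    Hypothetical helper; the row shape is an assumption.
    """
    return [
        # str() covers both uuid.UUID objects and already-stringified IDs.
        {**row, "replay_id": str(row["replay_id"]).replace("-", "")}
        for row in rows
    ]
```

This moves the dash stripping from every scanned row in ClickHouse to only the rows actually returned (10 in the queries below).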
```
SET send_logs_level = 'trace'
```
```
SELECT project_id AS _snuba_project_id, replaceAll (toString (replay_id), '-', '') AS _snuba_replay_id, max(timestamp AS _snuba_timestamp) AS _snuba_finished_at, sum(length(error_ids AS _snuba_error_ids)) AS _snuba_count_errors, groupArray (1) (environment AS _snuba_environment)[1] AS _snuba_agg_environment, _snuba_replay_id, notEmpty (groupArray (is_archived AS _snuba_is_archived)) AS _snuba_isArchived FROM replays_local WHERE (_snuba_project_id IN[11276]) AND (_snuba_timestamp < toDateTime ('2023-03-11T00:30:34', 'Universal')) AND (_snuba_timestamp >= toDateTime ('2022-03-10T00:30:34', 'Universal')) GROUP BY _snuba_project_id, _snuba_replay_id HAVING (min(segment_id AS _snuba_segment_id) = 0) AND (_snuba_finished_at < toDateTime ('2023-03-11T00:30:34', 'Universal')) AND (_snuba_isArchived = 0) ORDER BY _snuba_count_errors ASC LIMIT 0, 10
Peak memory usage (for query): 612.65 MiB.
10 rows in set. Elapsed: 2.912 sec. Processed 7.94 million rows, 341.42 MB (2.73 million rows/s., 117.25 MB/s.)
```
```
SELECT project_id AS _snuba_project_id, replay_id, max(timestamp AS _snuba_timestamp) AS _snuba_finished_at, sum(length(error_ids AS _snuba_error_ids)) AS _snuba_count_errors, groupArray (1) (environment AS _snuba_environment)[1] AS _snuba_agg_environment, notEmpty (groupArray (is_archived AS _snuba_is_archived)) AS _snuba_isArchived FROM replays_local WHERE (_snuba_project_id IN[11276]) AND (_snuba_timestamp < toDateTime ('2023-03-11T00:30:34', 'Universal')) AND (_snuba_timestamp >= toDateTime ('2022-03-10T00:30:34', 'Universal')) GROUP BY _snuba_project_id, replay_id HAVING (min(segment_id AS _snuba_segment_id) = 0) AND (_snuba_finished_at < toDateTime ('2023-03-11T00:30:34', 'Universal')) AND (_snuba_isArchived = 0) ORDER BY _snuba_count_errors ASC LIMIT 0, 10
MemoryTracker: Peak memory usage (for query): 448.65 MiB
10 rows in set. Elapsed: 1.711 sec. Processed 7.94 million rows, 341.42 MB (4.64 million rows/s., 199.60 MB/s.)
```