lib/tasks/migrate · 16f9faa876df1edc1fa2a6519afa5fcebbdc1fc1 · Forest Godfrey / gitlab-ce

Optimise searching for users using short queries · 41bfe82b
authored Feb 15, 2018
This optimises searching for users when using queries consisting out of
one or two characters such as "ab". We optimise such cases by searching
for `LOWER(name)` and `LOWER(username)` instead of using `ILIKE`. Using
`LOWER` produces a _much_ better performing query.

For example, when searching for all users matching the term "a" we'd
produce the following plan:

     Limit  (cost=637.69..637.74 rows=20 width=805) (actual time=41.983..41.995 rows=20 loops=1)
       Buffers: shared hit=8330
       ->  Sort  (cost=637.69..638.61 rows=368 width=805) (actual time=41.982..41.990 rows=20 loops=1)
             Sort Key: (CASE WHEN ((name)::text = 'a'::text) THEN 0 WHEN ((username)::text = 'a'::text) THEN 1 WHEN ((email)::text = 'a'::text) THEN 2 ELSE 3 END), name
             Sort Method: top-N heapsort  Memory: 35kB
             Buffers: shared hit=8330
             ->  Bitmap Heap Scan on users  (cost=75.47..627.89 rows=368 width=805) (actual time=9.452..41.305 rows=277 loops=1)
                   Recheck Cond: (((name)::text ~~* 'a'::text) OR ((username)::text ~~* 'a'::text) OR ((email)::text = 'a'::text))
                   Rows Removed by Index Recheck: 7601
                   Heap Blocks: exact=7636
                   Buffers: shared hit=8327
                   ->  BitmapOr  (cost=75.47..75.47 rows=368 width=0) (actual time=8.290..8.290 rows=0 loops=1)
                         Buffers: shared hit=691
                         ->  Bitmap Index Scan on index_users_on_name_trigram  (cost=0.00..38.85 rows=180 width=0) (actual time=4.369..4.369 rows=4071 loops=1)
                               Index Cond: ((name)::text ~~* 'a'::text)
                               Buffers: shared hit=360
                         ->  Bitmap Index Scan on index_users_on_username_trigram  (cost=0.00..34.41 rows=188 width=0) (actual time=3.896..3.896 rows=4140 loops=1)
                               Index Cond: ((username)::text ~~* 'a'::text)
                               Buffers: shared hit=328
                         ->  Bitmap Index Scan on users_email_key  (cost=0.00..1.94 rows=1 width=0) (actual time=0.022..0.022 rows=0 loops=1)
                               Index Cond: ((email)::text = 'a'::text)
                               Buffers: shared hit=3
     Planning time: 3.912 ms
     Execution time: 42.171 ms

With the changes in this commit we now produce the following plan
instead:

     Limit  (cost=13257.48..13257.53 rows=20 width=805) (actual time=1.567..1.579 rows=20 loops=1)
       Buffers: shared hit=287
       ->  Sort  (cost=13257.48..13280.93 rows=9379 width=805) (actual time=1.567..1.572 rows=20 loops=1)
             Sort Key: (CASE WHEN ((name)::text = 'a'::text) THEN 0 WHEN ((username)::text = 'a'::text) THEN 1 WHEN ((email)::text = 'a'::text) THEN 2 ELSE 3 END), name
             Sort Method: top-N heapsort  Memory: 35kB
             Buffers: shared hit=287
             ->  Bitmap Heap Scan on users  (cost=135.66..13007.91 rows=9379 width=805) (actual time=0.194..1.107 rows=277 loops=1)
                   Recheck Cond: ((lower((name)::text) = 'a'::text) OR (lower((username)::text) = 'a'::text) OR ((email)::text = 'a'::text))
                   Heap Blocks: exact=277
                   Buffers: shared hit=287
                   ->  BitmapOr  (cost=135.66..135.66 rows=9379 width=0) (actual time=0.152..0.152 rows=0 loops=1)
                         Buffers: shared hit=10
                         ->  Bitmap Index Scan on yorick_test_users  (cost=0.00..124.75 rows=9377 width=0) (actual time=0.101..0.101 rows=277 loops=1)
                               Index Cond: (lower((name)::text) = 'a'::text)
                               Buffers: shared hit=4
                         ->  Bitmap Index Scan on index_on_users_lower_username  (cost=0.00..1.94 rows=1 width=0) (actual time=0.035..0.035 rows=1 loops=1)
                               Index Cond: (lower((username)::text) = 'a'::text)
                               Buffers: shared hit=3
                         ->  Bitmap Index Scan on users_email_key  (cost=0.00..1.94 rows=1 width=0) (actual time=0.014..0.014 rows=0 loops=1)
                               Index Cond: ((email)::text = 'a'::text)
                               Buffers: shared hit=3
     Planning time: 0.303 ms
     Execution time: 1.687 ms

Here we can see the new query is 25 times faster compared to the old
query.
41bfe82b
Name	Last commit	Last update
..
add_limits_mysql.rake		Loading commit data...
migrate_iids.rake		Loading commit data...
setup_postgresql.rake		Loading commit data...