BigW Consortium Gitlab

  1. 18 Nov, 2015 11 commits
    • Refactor getting user groups/projects/contributions · 5fcd9986
      Yorick Peterse authored
      This new setup no longer loads any IDs into memory using "pluck",
      instead using SQL UNIONs to merge the various datasets together. This
      results in greatly improved query performance as well as a reduction of
      memory usage.
      
      The old setup was in particular problematic when requesting the
      authorized projects _including_ public/internal projects as this would
      result in roughly 65000 project IDs being loaded into memory. These IDs
      would in turn be passed to other queries.
    • Prefix table names for User UNIONs · bfd9855a
      Yorick Peterse authored
    • Use SQL::Union for User#authorized_groups · 189c40c3
      Yorick Peterse authored
      This removes the need for plucking any IDs into Ruby.
    • Make it easier to re-apply default sort orders · 656d9ff6
      Yorick Peterse authored
      By moving the default sort order into a separate scope (and calling this
      from the default scope) we can more easily re-apply a default order
      without having to specify the exact column/ordering all over the place.
    • Use SQL::Union for User#authorized_projects · 028bd227
      Yorick Peterse authored
      This allows retrieving of the list of authorized projects using a single
      query, without having to load any IDs into Ruby. This in turn also means
      we can remove the method User#authorized_projects_id.
    • Added Gitlab::SQL::Union class · d769596a
      Yorick Peterse authored
      This class can be used to join multiple AcitveRecord::Relation objects
      together using a SQL UNION statement. ActiveRecord < 5.0 sadly doesn't
      support UNION and existing Gems out there don't handle prepared
      statements (e.g. they never incremented the variable bindings).
    • Faster way of obtaining latest event update time · 054f2f98
      Yorick Peterse authored
      Instead of using MAX(events.updated_at) we can simply sort the events in
      descending order by the "id" column and grab the first row. In other
      words, instead of this:
      
          SELECT max(events.updated_at) AS max_id
          FROM events
          LEFT OUTER JOIN projects   ON projects.id   = events.project_id
          LEFT OUTER JOIN namespaces ON namespaces.id = projects.namespace_id
          WHERE events.author_id IS NOT NULL
          AND events.project_id IN (13083);
      
      we can use this:
      
          SELECT events.updated_at AS max_id
          FROM events
          LEFT OUTER JOIN projects   ON projects.id   = events.project_id
          LEFT OUTER JOIN namespaces ON namespaces.id = projects.namespace_id
          WHERE events.author_id IS NOT NULL
          AND events.project_id IN (13083)
          ORDER BY events.id DESC
          LIMIT 1;
      
      This has the benefit that on PostgreSQL a backwards index scan can be
      used, which due to the "LIMIT 1" will at most process only a single row.
      This in turn greatly speeds up the process of grabbing the latest update
      time. This can be confirmed by looking at the query plans. The first
      query produces the following plan:
      
          Aggregate  (cost=43779.84..43779.85 rows=1 width=12) (actual time=2142.462..2142.462 rows=1 loops=1)
            ->  Index Scan using index_events_on_project_id on events  (cost=0.43..43704.69 rows=30060 width=12) (actual time=0.033..2138.086 rows=32769 loops=1)
                  Index Cond: (project_id = 13083)
                  Filter: (author_id IS NOT NULL)
          Planning time: 1.248 ms
          Execution time: 2142.548 ms
      
      The second query in turn produces the following plan:
      
          Limit  (cost=0.43..41.65 rows=1 width=16) (actual time=1.394..1.394 rows=1 loops=1)
            ->  Index Scan Backward using events_pkey on events  (cost=0.43..1238907.96 rows=30060 width=16) (actual time=1.394..1.394 rows=1 loops=1)
                  Filter: ((author_id IS NOT NULL) AND (project_id = 13083))
                  Rows Removed by Filter: 2104
          Planning time: 0.166 ms
          Execution time: 1.408 ms
      
      According to the above plans the 2nd query is around 1500 times faster.
      However, re-running the first query produces timings of around 80 ms,
      making the 2nd query "only" around 55 times faster.
    • Merge branch 'add_lfs_doc' into 'master' · 21a59b23
      Job van der Voort authored
      Documentation for Git LFS
      
      Doc explaining how to use Git LFS, how to use Git LFS with GitLab, known issues and troubleshooting tips.
      
      See merge request !1813
    • missing . · 60e45651
      Job van der Voort authored
    • Updates to the lfs doc. · 6c62f182
      Marin Jankovski authored
  2. 17 Nov, 2015 29 commits