- 03 Feb, 2017 1 commit
-
-
Adam Pahlevi authored
add complete changelog for !8949
-
- 30 Jan, 2017 1 commit
-
-
Adam Pahlevi authored
Don't pass an AR object, use the ID to avoid a deprecation warning

Pass in the ID instead of the AR object to specs for `ProjectDestroyWorker`.
-
- 25 Jan, 2017 1 commit
-
-
Yorick Peterse authored
There were two cases that could be problematic:

1. Because AuthorizedProjectsWorker would sometimes be scheduled in a transaction, it was possible for a job to run/complete before a COMMIT, resulting in it either producing an error or producing no new data.
2. When scheduling jobs the code would not wait until completion. This could lead to a user creating a project and then immediately trying to push to it. Usually this will work fine, but given enough load it might take a few seconds before a user has access.

The first one is problematic, the second one is mostly just annoying (but annoying enough to warrant a solution).

This commit changes two things to deal with this:

1. Sidekiq scheduling now takes place after a COMMIT. This is ensured by scheduling using Rails' after_commit hook instead of doing so in an arbitrary method.
2. When scheduling jobs the calling thread now waits for all jobs to complete.

Solution 2 requires tracking of job completions. Sidekiq provides a way to find a job by its ID, but this involves scanning over the entire queue, which is very inefficient for large queues. As such a more efficient solution is necessary. There are two main Gems that can do this in a more efficient manner:

* sidekiq-status
* sidekiq_status

No, this is not a joke. Both Gems do a similar (but slightly different) thing, and the only difference in their names is a dash vs an underscore. Both Gems however provide far more than just checking if a job has been completed, and both have their problems. sidekiq-status does not appear to be actively maintained, with the last release being in 2015. It also has some issues during testing as API calls are not stubbed in any way. sidekiq_status on the other hand does not appear to be very popular, and introduces a similar amount of code.

Because of this I opted to write a simple home-grown solution. After all, all we need is to store a job ID somewhere so we can efficiently look it up; we don't need extra web UIs (as provided by sidekiq-status) or complex APIs to update progress, etc.

This is where Gitlab::SidekiqStatus comes in handy. This namespace contains some code used for tracking, removing, and looking up job IDs, all without having to scan over an entire queue. Data is removed explicitly, but also expires automatically just in case.

Using this API we can now schedule jobs in a fork-join like manner: we schedule the jobs in Sidekiq, process them in parallel, then wait for completion. By using Sidekiq we can leverage all the benefits such as being able to scale across multiple cores and hosts, retrying failed jobs, etc.

The one downside is that we need to make sure we can deal with unexpected increases in job processing timings. To deal with this the class Gitlab::JobWaiter (used for waiting for jobs to complete) will only wait a number of seconds (30 by default). Once this timeout is reached it will simply return.

For GitLab.com almost all AuthorizedProjectsWorker jobs complete in seconds; only very rarely do we spike to job timings of around a minute. These in turn seem to be the result of external factors (e.g. deploys), in which case a user is most likely not able to use the system anyway.

In short, this new solution should ensure that jobs are processed properly and that in almost all cases a user has access to their resources whenever they need to have access.
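The fork-join waiting described above can be sketched in plain Ruby. This is a minimal illustration, not the code from the commit: `JobStatusStore` is a hypothetical in-memory stand-in for the job tracking (the real storage is not shown in the message), the `JobWaiter` here is a simplified class of the same idea, and returning a boolean from `wait` is an assumption made so the outcome is visible.

```ruby
require 'set'

# Hypothetical in-memory stand-in for the job tracking described above.
# A real deployment would keep this state in a shared store so that
# separate worker processes can see it.
class JobStatusStore
  def initialize
    @running = Set.new
    @lock = Mutex.new
  end

  # Marks a job as started.
  def set(jid)
    @lock.synchronize { @running.add(jid) }
  end

  # Marks a job as finished.
  def unset(jid)
    @lock.synchronize { @running.delete(jid) }
  end

  # True when none of the given job IDs are still tracked as running.
  def all_completed?(jids)
    @lock.synchronize { jids.none? { |jid| @running.include?(jid) } }
  end
end

# Polls the store until every job is done or the timeout elapses.
class JobWaiter
  def initialize(store, jids, interval: 0.01)
    @store = store
    @jids = jids
    @interval = interval
  end

  # Returns true if all jobs completed, false if the timeout was reached
  # (the boolean return value is an assumption of this sketch).
  def wait(timeout = 30)
    deadline = monotonic_now + timeout
    until @store.all_completed?(@jids)
      return false if monotonic_now >= deadline
      sleep(@interval)
    end
    true
  end

  private

  def monotonic_now
    Process.clock_gettime(Process::CLOCK_MONOTONIC)
  end
end
```

The caller registers each job ID before scheduling, each worker unregisters its ID when done, and the calling thread blocks in `wait` for at most the timeout. Looking up a handful of IDs this way avoids scanning a whole queue.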
-
- 08 Jan, 2017 1 commit
-
-
Vincent Wong authored
Addresses: Issue #13810

1. Adds a last_used_at attribute to the Key table/model
2. Updates a key's last_used_at whenever it gets used
3. Displays how long ago an SSH key was last used
-
- 21 Dec, 2016 3 commits
-
-
Kamil Trzcinski authored
-
Markus Koller authored
This adds counters for build artifacts and LFS objects, and moves the preexisting repository_size and commit_count from the projects table into a new project_statistics table. The counters are displayed in the administration area for projects and groups, and are also available through the API for admins (on */all) and normal users (on */owned). The statistics are updated through ProjectCacheWorker, which can now do more granular updates with the new :statistics argument.
-
Adam Niedzielski authored
The button lets the user choose a ".gitlab-ci.yml" template that automatically sets up the deployment of an application. The only currently supported template is the Kubernetes template.
-
- 19 Dec, 2016 2 commits
-
-
Nick Thomas authored
-
Yorick Peterse authored
Prior to this commit the refreshing of authorized projects was done in two steps:

1. Remove existing authorizations
2. Insert a new list of all authorizations

This can lead to a large number of dead tuples, as all rows are replaced every time. For example, if a user with 100 authorizations is given access to a new project this would lead to:

* 100 rows being removed
* 101 new rows being inserted

This commit changes the way this system works so it only removes/inserts what is necessary. Using the above example this would lead to only 1 new row being inserted, with the initial 100 being left untouched.

Fixes https://gitlab.com/gitlab-org/gitlab-ce/issues/25257
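The difference between replacing everything and applying only the necessary changes can be illustrated with a small sketch. The method name `authorization_changes` and the hash shape (project ID mapped to access level) are hypothetical and chosen only for this example; the commit's actual implementation is not shown in the message.

```ruby
# Computes the minimal set of changes needed to go from the stored
# authorizations (`current`) to the freshly calculated ones (`fresh`).
# Both arguments are hashes of project ID => access level (assumed shape).
def authorization_changes(current, fresh)
  # Rows whose project is gone or whose access level changed must be removed.
  remove = current.reject { |project_id, level| fresh[project_id] == level }.keys
  # Rows that are new or whose access level changed must be inserted.
  add = fresh.reject { |project_id, level| current[project_id] == level }
  { remove: remove, add: add }
end
```

For a user with 100 unchanged authorizations plus one new project, `remove` is empty and `add` holds a single entry, so only one row is written.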
-
- 13 Dec, 2016 1 commit
-
- 06 Dec, 2016 1 commit
-
-
Lin Jen-Shin authored
-
- 04 Dec, 2016 1 commit
-
-
Z.J. van de Weg authored
-
- 01 Dec, 2016 2 commits
-
-
Yorick Peterse authored
By passing commit data to this worker we remove the need for querying the Git repository for every job. This in turn reduces the time spent processing each job. The included migration converts jobs from the old format to the new format. For this to work properly it requires downtime, as otherwise workers may start producing errors until they're using a newer version of the worker code.
-
Robert Speicher authored
-
- 25 Nov, 2016 1 commit
-
-
Yorick Peterse authored
When I proposed using serializable transactions I was hoping we would be able to refresh data of individual users concurrently. Unfortunately upon closer inspection it was revealed this was not the case. This could result in a lot of queries failing due to serialization errors, overloading the database in the process (given enough workers trying to update the target table). To work around this we're now using a Redis lease that is cancelled upon completion. This ensures we can update the data of different users concurrently without overloading the database. The code will try to obtain the lease until it succeeds, waiting at least 1 second between retries. This is necessary as we may otherwise end up _not_ updating the data which is not an option.
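The retry-until-obtained pattern described above can be sketched as follows. This is only an illustration under stated assumptions: `InMemoryLease` is a hypothetical single-process stand-in for the Redis lease, and `with_lease` is a made-up helper name. Only the behaviour (keep retrying with a pause, cancel the lease on completion) comes from the commit message.

```ruby
# Hypothetical in-process stand-in for a Redis-backed lease.
class InMemoryLease
  def initialize
    @held = {}
    @lock = Mutex.new
  end

  # Returns true if the lease was obtained, false if someone else holds it.
  def try_obtain(key)
    @lock.synchronize do
      next false if @held[key]

      @held[key] = true
    end
  end

  # Releases the lease so the next caller can obtain it.
  def cancel(key)
    @lock.synchronize { @held.delete(key) }
  end
end

# Retries until the lease is obtained, runs the block, then cancels the
# lease even if the block raises. The work is never skipped.
def with_lease(leases, key, retry_interval: 1)
  sleep(retry_interval) until leases.try_obtain(key)
  begin
    yield
  ensure
    leases.cancel(key)
  end
end
```

Using one lease key per user means refreshes for different users do not block each other, while two refreshes for the same user run one after the other.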
-
- 21 Nov, 2016 2 commits
-
-
Yorick Peterse authored
This refactors repository caching so it's possible to selectively refresh certain caches, instead of just expiring and refreshing everything. To allow this the various methods that were cached (e.g. "tag_count" and "readme") use a similar pattern that makes expiring and refreshing their data much easier.

In this new setup caches are refreshed as follows:

1. After a commit (but before running ProjectCacheWorker) we expire some basic caches such as the commit count and repository size.
2. ProjectCacheWorker will recalculate the commit count and repository size, then refresh a specific set of caches based on the list of files changed in a push payload.

This requires a bunch of changes to the various methods that may be cached. For one, data should not be cached if the branch used or the entire repository does not exist. To prevent all these methods from handling this manually this is taken care of in Repository#cache_method_output. Some methods still manually check for the existence of a repository but this result is also cached.

With selective flushing implemented ProjectCacheWorker no longer uses an exclusive lease for all of its work. Instead this worker only uses a lease to limit the number of times the repository size is updated, as this is a fairly expensive operation.
-
Grzegorz Bizon authored
-
- 18 Nov, 2016 1 commit
-
-
Ahmad Sherif authored
Closes #23150
-
- 17 Nov, 2016 1 commit
-
-
James Lopez authored
-
- 12 Nov, 2016 1 commit
-
-
Oswaldo Ferreira authored
- Also remove unnecessary param
-
- 09 Nov, 2016 1 commit
-
-
Toon Claes authored
It adds a button to the branches page that the user can use to delete all the branches that are already merged. This can be used to clean up all the branches that were not deleted when their MRs were merged. Fixes #21076.
-
- 07 Nov, 2016 1 commit
-
-
Yorick Peterse authored
This moves the code used for processing commits from GitPushService to its own Sidekiq worker: ProcessCommitWorker. Using a Sidekiq worker allows us to process multiple commits in parallel. This in turn will lead to issues being closed faster and cross references being created faster. Furthermore by isolating this code into a separate class it's easier to test and maintain the code. The new worker also ensures it can efficiently check which issues can be closed, without having to run numerous SQL queries for every issue.
-
- 04 Nov, 2016 2 commits
-
-
Jacob Vosmaer authored
-
Jacob Vosmaer authored
-
- 28 Oct, 2016 1 commit
-
-
Frank Groeneveld authored
-
- 25 Oct, 2016 1 commit
-
-
Yorick Peterse authored
This changes ProjectCacheWorker.perform_async so it only schedules a job when no lease for the given project is present. This ensures we don't end up scheduling hundreds of jobs when they won't be executed anyway.
-
- 21 Oct, 2016 2 commits
-
-
Yorick Peterse authored
Dumping too many jobs in the same queue (e.g. the "default" queue) is a dangerous setup. Jobs that take a long time to process can effectively block any other work from being performed given there are enough of these jobs.

Furthermore it becomes harder to monitor the jobs as a single queue could contain jobs for different workers. In such a setup the only reliable way of getting counts per job is to iterate over all jobs in a queue, which is a rather time consuming process.

By using separate queues for various workers we have better control over throughput, we can add weight to queues, and we can monitor queues better. Some workers still use the same queue whenever their work is related. For example, the various CI pipeline workers use the same "pipeline" queue.

This commit includes a Rails migration that moves Sidekiq jobs from the old queues to the new ones. This migration also takes care of doing the inverse if ever needed. This does require downtime as otherwise new jobs could be scheduled in the old queues after this migration completes.

This commit also includes an RSpec test that blacklists the use of the "default" queue and ensures cron workers use the "cronjob" queue.

Fixes gitlab-org/gitlab-ce#23370
-
- 20 Oct, 2016 1 commit
-
-
Yorick Peterse authored
This ensures ProjectCacheWorker jobs for a given project are performed at most once per 15 minutes. This should reduce disk load a bit in cases where there are multiple pushes happening (which should schedule multiple ProjectCacheWorker jobs).
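A lease that expires on its own is one way to get the "at most once per 15 minutes" behaviour. The sketch below is an assumption-laden illustration: `ExpiringLease` is a hypothetical in-memory class, the injectable `clock` exists only to make the behaviour easy to demonstrate, and the real worker's lease implementation is not shown in the message.

```ruby
# Hypothetical in-memory lease that expires after `timeout` seconds.
class ExpiringLease
  def initialize(timeout:, clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) })
    @timeout = timeout
    @clock = clock
    @expires_at = {}
  end

  # Returns true and starts a new lease period if no unexpired lease
  # exists for the key; returns false otherwise.
  def try_obtain(key)
    now = @clock.call
    return false if @expires_at.key?(key) && @expires_at[key] > now

    @expires_at[key] = now + @timeout
    true
  end
end

# Runs the block only when the lease for this project can be obtained.
# Returns :performed or :skipped so callers can see what happened.
def perform_throttled(lease, project_id)
  return :skipped unless lease.try_obtain("project_cache:#{project_id}")

  yield
  :performed
end
```

With `timeout: 15 * 60`, several pushes to the same project within 15 minutes result in a single run, while other projects are unaffected because each has its own key.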
-
- 18 Oct, 2016 1 commit
-
-
Lin Jen-Shin authored
We use BCC here because we don't want to generate these emails a thousand times. Generating them in a loop could be expensive, and the recipients would include all project watchers, so there could be a lot of them.
-
- 17 Oct, 2016 9 commits
-
-
Grzegorz Bizon authored
It may happen that the job meant to remove expired artifacts is executed asynchronously while, in the meantime, the project associated with the given build gets removed by another asynchronous job. In that case we should not remove the artifacts, because the build will be removed anyway when project removal is complete.
-
Nick Thomas authored
The precision with which databases store times varies, so we need tolerances when comparing times in specs. It's better to have the tolerance defined in one place than in several.
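A single shared helper is one way to keep the tolerance in one place. The name `times_match?` and the default of one second are hypothetical; the message does not state the helper or the value actually used.

```ruby
# Compares two Time values, treating them as equal when they differ by
# no more than `tolerance` seconds (hypothetical default of 1 second).
def times_match?(expected, actual, tolerance: 1.0)
  (expected - actual).abs <= tolerance
end
```

A value written with microsecond precision and read back truncated to whole seconds still matches, while a difference of several seconds does not.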
-
Kamil Trzcinski authored
-
Lin Jen-Shin authored
-
Lin Jen-Shin authored
-
- 14 Oct, 2016 1 commit
-
-
Grzegorz Bizon authored
-