BigW Consortium Gitlab

Commit d4c05766 by Achilleas Pipinellis

Refactor GitLab Metrics docs

[ci skip]
parent 790c6868
...@@ -49,13 +49,6 @@ ...@@ -49,13 +49,6 @@
- [Test Clojure applications](ci/examples/test-clojure-application.md) - [Test Clojure applications](ci/examples/test-clojure-application.md)
- Help your favorite programming language and GitLab by sending a merge request with a guide for that language. - Help your favorite programming language and GitLab by sending a merge request with a guide for that language.
## GitLab Metrics
- [Introduction](metrics/introduction.md)
- [GitLab Configuration](metrics/gitlab_configuration.md)
- [InfluxDB Configuration](metrics/influxdb_configuration.md)
- [InfluxDB Schema](metrics/influxdb_schema.md)
## Administrator documentation ## Administrator documentation
- [Custom git hooks](hooks/custom_hooks.md) Custom git hooks (on the filesystem) for when web hooks aren't enough. - [Custom git hooks](hooks/custom_hooks.md) Custom git hooks (on the filesystem) for when web hooks aren't enough.
...@@ -74,6 +67,7 @@ ...@@ -74,6 +67,7 @@
- [Reply by email](incoming_email/README.md) Allow users to comment on issues and merge requests by replying to notification emails. - [Reply by email](incoming_email/README.md) Allow users to comment on issues and merge requests by replying to notification emails.
- [Migrate GitLab CI to CE/EE](migrate_ci_to_ce/README.md) Follow this guide to migrate your existing GitLab CI data to GitLab CE/EE. - [Migrate GitLab CI to CE/EE](migrate_ci_to_ce/README.md) Follow this guide to migrate your existing GitLab CI data to GitLab CE/EE.
- [Git LFS configuration](workflow/lfs/lfs_administration.md) - [Git LFS configuration](workflow/lfs/lfs_administration.md)
- [GitLab Metrics](integration/metrics/introduction.md) Configure GitLab and InfluxDB for measuring performance metrics
## Contributor documentation ## Contributor documentation
......
...@@ -15,6 +15,7 @@ See the documentation below for details on how to configure these services. ...@@ -15,6 +15,7 @@ See the documentation below for details on how to configure these services.
- [OAuth2 provider](oauth_provider.md) OAuth2 application creation - [OAuth2 provider](oauth_provider.md) OAuth2 application creation
- [Gmail actions buttons](gmail_action_buttons_for_gitlab.md) Adds GitLab actions to messages - [Gmail actions buttons](gmail_action_buttons_for_gitlab.md) Adds GitLab actions to messages
- [reCAPTCHA](recaptcha.md) Configure GitLab to use Google reCAPTCHA for new users - [reCAPTCHA](recaptcha.md) Configure GitLab to use Google reCAPTCHA for new users
- [GitLab Metrics](metrics/introduction.md) Configure GitLab and InfluxDB for measuring performance metrics
GitLab Enterprise Edition contains [advanced Jenkins support][jenkins]. GitLab Enterprise Edition contains [advanced Jenkins support][jenkins].
......
# GitLab Configuration
GitLab Metrics is disabled by default. To enable it and change any of its
settings, navigate to the Admin area in **Settings > Metrics**
(`/admin/application_settings`).
The minimum required settings you need to set are the InfluxDB host and port.
Make sure _Enable InfluxDB Metrics_ is checked and hit **Save** to save the
changes.
---
![GitLab Metrics Admin Settings](img/metrics_gitlab_configuration_settings.png)
---
Finally, a restart of all GitLab processes is required for the changes to take
effect:
```bash
# For Omnibus installations
sudo gitlab-ctl restart
# For installations from source
sudo service gitlab restart
```
## Pending Migrations
When any migrations are pending, the metrics are disabled until the migrations
have been performed.
---
Read more on:
- [Introduction to GitLab Metrics](introduction.md)
- [InfluxDB Configuration](influxdb_configuration.md)
- [InfluxDB Schema](influxdb_schema.md)
# InfluxDB Configuration
The default settings provided by [InfluxDB] are not sufficient for a high traffic
GitLab environment. The settings discussed in this document are based on the
settings GitLab uses for GitLab.com. Depending on your own needs, you may need to
further adjust them.
If you are intending to run InfluxDB on the same server as GitLab, make sure
you have plenty of RAM since InfluxDB can use quite a bit depending on traffic.
Unless you are going with a budget setup, it's advised to run it separately.
## Requirements
- InfluxDB 0.9.5 or newer
- A fairly modern version of Linux
- At least 4GB of RAM
- At least 10GB of storage for InfluxDB data
Note that the RAM and storage requirements can differ greatly depending on the
amount of data received/stored. To limit the amount of stored data, users can
look into [InfluxDB Retention Policies][influxdb-retention].
## Installation
Installing InfluxDB is out of the scope of this document. Please refer to the
[InfluxDB documentation].
## InfluxDB Server Settings
Since InfluxDB has many settings that users may wish to customize themselves
(e.g. what port to run InfluxDB on), we'll only cover the essentials.
The configuration file in question is usually located at
`/etc/influxdb/influxdb.conf`. Whenever you make a change in this file,
InfluxDB needs to be restarted.
### Storage Engine
InfluxDB comes with different storage engines and as of InfluxDB 0.9.5 a new
storage engine is available, called [TSM Tree]. All users **must** use the new
`tsm1` storage engine as this [will be the default engine][tsm1-commit] in
upcoming InfluxDB releases.
Make sure you have the following in your configuration file:
```
[data]
dir = "/var/lib/influxdb/data"
engine = "tsm1"
```
### Admin Panel
Production environments should have the InfluxDB admin panel **disabled**. This
feature can be disabled by adding the following to your InfluxDB configuration
file:
```
[admin]
enabled = false
```
### HTTP
HTTP is required when using the [InfluxDB CLI] or other tools such as Grafana,
thus it should be enabled. When enabling it, make sure to _also_ enable
authentication:
```
[http]
enabled = true
auth-enabled = true
```
_**Note:** Before you enable authentication, you might want to [create an
admin user](#create-a-new-admin-user)._
### UDP
GitLab writes data to InfluxDB via UDP and thus this must be enabled. Enabling
UDP can be done using the following settings:
```
[[udp]]
enabled = true
bind-address = ":8089"
database = "gitlab"
batch-size = 1000
batch-pending = 5
batch-timeout = "1s"
read-buffer = 209715200
```
This does the following:
1. Enable UDP and bind it to port 8089 for all addresses.
2. Store any data received in the "gitlab" database.
3. Define a batch of points to be 1000 points in size and allow a maximum of
5 batches _or_ flush them automatically after 1 second.
4. Define a UDP read buffer size of 200 MB.
One of the most important settings here is the UDP read buffer size: if this
value is set too low, packets will be dropped. You must also make sure the OS
buffer size is set to the same value, as the default value is almost never enough.
To set the OS buffer size to 200 MB, on Linux you can run the following command:
```bash
sysctl -w net.core.rmem_max=209715200
```
To make this permanent, add the following to `/etc/sysctl.conf` and restart the
server:
```bash
net.core.rmem_max=209715200
```
It is **very important** to make sure the buffer sizes are large enough to
handle all data sent to InfluxDB as otherwise you _will_ lose data. The above
buffer sizes are based on the traffic for GitLab.com. Depending on the amount of
traffic, users may be able to use a smaller buffer size, but we highly recommend
using _at least_ 100 MB.
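The byte values used above follow from treating 1 MB as 1024 * 1024 bytes. A minimal sketch of that arithmetic, together with a comparison of a current limit against a required size, is shown here. `mb_to_bytes` and `buffer_is_sufficient` are hypothetical helper names introduced for this example; they are not part of GitLab or InfluxDB.

```bash
#!/usr/bin/env bash

# Convert a size in MB to bytes, taking 1 MB as 1024 * 1024 bytes.
mb_to_bytes() {
  echo $(( $1 * 1024 * 1024 ))
}

# Succeed if the current limit (first argument, in bytes) is at least the
# required size (second argument, in bytes).
buffer_is_sufficient() {
  [ "$1" -ge "$2" ]
}

mb_to_bytes 200   # prints 209715200, the value used for read-buffer and net.core.rmem_max
mb_to_bytes 100   # prints 104857600, the recommended minimum
```

On a Linux host, `buffer_is_sufficient "$(sysctl -n net.core.rmem_max)" "$(mb_to_bytes 200)"` would report whether the current kernel limit already covers the InfluxDB `read-buffer` value.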
When enabling UDP, users should take care to not expose the port to the public,
as doing so will allow anybody to write data into your InfluxDB database (as
[InfluxDB's UDP protocol][udp] doesn't support authentication). We recommend either
whitelisting the allowed IP addresses/ranges, or setting up a VLAN and only
allowing traffic from members of said VLAN.
## Create a new admin user
If you want to [enable authentication](#http), you might want to [create an
admin user][influx-admin]:
```
influx -execute "CREATE USER thedude WITH PASSWORD '1234' WITH ALL PRIVILEGES"
```
## Create the `gitlab` database
Once you get InfluxDB up and running, you need to create a database for GitLab.
Make sure you have changed the [storage engine](#storage-engine) to `tsm1`
before creating a database.
_**Note:** If you [created an admin user](#create-a-new-admin-user) and enabled
[HTTP authentication](#http), remember to append the username (`-username thedude`)
and password (`-password 1234`) to the commands below._
Run the following command to create a database named `gitlab`:
```bash
influx -execute 'CREATE DATABASE gitlab'
```
The name **must** be `gitlab`; do not use any other name.
Next, make sure that the database was successfully created:
```bash
influx -execute 'SHOW DATABASES'
```
The output should be similar to:
```
name: databases
---------------
name
_internal
gitlab
```
That's it! Now your GitLab instance should send data to InfluxDB.
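To check the whole path end to end, one option is to send a hand-written point over UDP and then list the measurements. This is only a sketch under a few assumptions: InfluxDB listens on UDP port 8089 on the local machine, bash was built with `/dev/udp` redirection support, and `smoke_test` is a throwaway measurement name chosen for this example (it is not something GitLab writes). `build_point` and `send_point` are hypothetical helpers.

```bash
#!/usr/bin/env bash

# Build one point in InfluxDB line protocol: "measurement,tags fields".
# Arguments: measurement name, comma-separated tags, comma-separated fields.
build_point() {
  printf '%s,%s %s\n' "$1" "$2" "$3"
}

# Send a single point to InfluxDB's UDP listener.
# Arguments: host, port, point. Relies on bash's /dev/udp redirection.
send_point() {
  printf '%s\n' "$3" > "/dev/udp/$1/$2"
}

# Print the example point without sending anything.
build_point smoke_test source=manual value=1
```

Running `send_point 127.0.0.1 8089 "$(build_point smoke_test source=manual value=1)"` and then, after the batch timeout has passed, `influx -database gitlab -execute 'SHOW MEASUREMENTS'` should list `smoke_test` if the UDP listener is working. Add `-username` and `-password` to the `influx` command if authentication is enabled.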
---
Read more on:
- [Introduction to GitLab Metrics](introduction.md)
- [GitLab Configuration](gitlab_configuration.md)
- [InfluxDB Schema](influxdb_schema.md)
[influxdb-retention]: https://docs.influxdata.com/influxdb/v0.9/query_language/database_management/#retention-policy-management
[influxdb documentation]: https://docs.influxdata.com/influxdb/v0.9/
[influxdb cli]: https://docs.influxdata.com/influxdb/v0.9/tools/shell/
[udp]: https://docs.influxdata.com/influxdb/v0.9/write_protocols/udp/
[influxdb]: https://influxdata.com/time-series-platform/influxdb/
[tsm tree]: https://influxdata.com/blog/new-storage-engine-time-structured-merge-tree/
[tsm1-commit]: https://github.com/influxdata/influxdb/commit/15d723dc77651bac83e09e2b1c94be480966cb0d
[influx-admin]: https://docs.influxdata.com/influxdb/v0.9/administration/authentication_and_authorization/#create-a-new-admin-user
...@@ -2,16 +2,16 @@ ...@@ -2,16 +2,16 @@
The following measurements are currently stored in InfluxDB: The following measurements are currently stored in InfluxDB:
* `PROCESS_file_descriptors` - `PROCESS_file_descriptors`
* `PROCESS_gc_statistics` - `PROCESS_gc_statistics`
* `PROCESS_memory_usage` - `PROCESS_memory_usage`
* `PROCESS_method_calls` - `PROCESS_method_calls`
* `PROCESS_object_counts` - `PROCESS_object_counts`
* `PROCESS_transactions` - `PROCESS_transactions`
* `PROCESS_views` - `PROCESS_views`
Here `PROCESS` is replaced with either "rails" or "sidekiq" depending on the Here, `PROCESS` is replaced with either `rails` or `sidekiq` depending on the
process type. In all series any form of duration is stored in milliseconds. process type. In all series, any form of duration is stored in milliseconds.
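As an illustration of this naming scheme, the loop below expands the measurement list into concrete series names for both process types. `series_names` is a hypothetical helper written for this example.

```bash
#!/usr/bin/env bash

# Measurement suffixes listed in the schema document.
measurements="file_descriptors gc_statistics memory_usage method_calls object_counts transactions views"

# Print every series name, replacing PROCESS with "rails" and "sidekiq".
series_names() {
  local process suffix
  for process in rails sidekiq; do
    for suffix in $measurements; do
      echo "${process}_${suffix}"
    done
  done
}

series_names
```

This prints 14 names, from `rails_file_descriptors` through `sidekiq_views`.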
## PROCESS_file_descriptors ## PROCESS_file_descriptors
...@@ -32,13 +32,15 @@ value field `value` contains the number of bytes. ...@@ -32,13 +32,15 @@ value field `value` contains the number of bytes.
## PROCESS_method_calls ## PROCESS_method_calls
This measurement contains the methods called during a transaction along with This measurement contains the methods called during a transaction along with
their durations and a name of the transaction action that invoked the method (if their duration, and a name of the transaction action that invoked the method (if
available). The method call duration is stored in the value field `duration` available). The method call duration is stored in the value field `duration`,
while the method name is stored in the tag `method`. The tag `action` contains while the method name is stored in the tag `method`. The tag `action` contains
the full name of the transaction action. Both the `method` and `action` fields the full name of the transaction action. Both the `method` and `action` fields
are in the following format: are in the following format:
ClassName#method_name ```
ClassName#method_name
```
For example, a method called by the `show` method in the `UsersController` class For example, a method called by the `show` method in the `UsersController` class
would have `action` set to `UsersController#show`. would have `action` set to `UsersController#show`.
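When post-processing exported data, a `ClassName#method_name` value can be split on the `#` separator. The following is a small sketch using bash parameter expansion; `split_action` is a hypothetical helper, and it assumes the value contains the single `#` described above.

```bash
#!/usr/bin/env bash

# Split "ClassName#method_name" into its class and method parts.
split_action() {
  local action=$1
  local class_name=${action%%#*}   # everything before the first '#'
  local method_name=${action#*#}   # everything after the first '#'
  echo "$class_name $method_name"
}

split_action 'UsersController#show'   # prints "UsersController show"
```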
...@@ -55,21 +57,31 @@ This measurement is used to store basic transaction details such as the time it ...@@ -55,21 +57,31 @@ This measurement is used to store basic transaction details such as the time it
took to complete a transaction, how much time was spent in SQL queries, etc. The took to complete a transaction, how much time was spent in SQL queries, etc. The
following value fields are available: following value fields are available:
* `duration`: the total duration of the transaction. | Value | Description |
* `allocated_memory`: the amount of bytes allocated while the transaction was | ----- | ----------- |
running. This value is only reliable when using single-threaded application | `duration` | The total duration of the transaction |
servers. | `allocated_memory` | The amount of bytes allocated while the transaction was running. This value is only reliable when using single-threaded application servers |
* `method_duration`: the total time spent in method calls. | `method_duration` | The total time spent in method calls |
* `sql_duration`: the total time spent in SQL queries. | `sql_duration` | The total time spent in SQL queries |
* `view_duration`: the total time spent in views. | `view_duration` | The total time spent in views |
## PROCESS_views ## PROCESS_views
This measurement is used to store view rendering timings for a transaction. The This measurement is used to store view rendering timings for a transaction. The
following value fields are available: following value fields are available:
* `duration`: the rendering time of the view. | Value | Description |
* `view`: the path of the view, relative to the application's root directory. | ----- | ----------- |
| `duration` | The rendering time of the view |
| `view` | The path of the view, relative to the application's root directory |
The `action` tag contains the action name of the transaction that rendered the The `action` tag contains the action name of the transaction that rendered the
view. view.
---
Read more on:
- [Introduction to GitLab Metrics](introduction.md)
- [GitLab Configuration](gitlab_configuration.md)
- [InfluxDB Configuration](influxdb_configuration.md)
# Introduction to GitLab Metrics # GitLab Metrics
GitLab comes with its own application performance measuring system as of GitLab GitLab comes with its own application performance measuring system as of GitLab
8.4, simply called "GitLab Metrics". GitLab Metrics is available in both the 8.4, simply called "GitLab Metrics". GitLab Metrics is available in both the
Community and Enterprise editions. Community and Enterprise editions.
Apart from this introduction, you are advised to read through the following
documents in order to understand and properly configure GitLab Metrics:
- [GitLab Configuration](gitlab_configuration.md)
- [InfluxDB Configuration](influxdb_configuration.md)
- [InfluxDB Schema](influxdb_schema.md)
## Introduction to GitLab Metrics
GitLab Metrics makes it possible to measure a wide variety of statistics GitLab Metrics makes it possible to measure a wide variety of statistics
including (but not limited to): including (but not limited to):
* The time it took to complete a transaction (a web request or Sidekiq job). - The time it took to complete a transaction (a web request or Sidekiq job).
* The time spent in running SQL queries and rendering HAML views. - The time spent in running SQL queries and rendering HAML views.
* The time spent executing (instrumented) Ruby methods. - The time spent executing (instrumented) Ruby methods.
* Ruby object allocations, and retained objects in particular. - Ruby object allocations, and retained objects in particular.
* System statistics such as the process' memory usage and open file descriptors. - System statistics such as the process' memory usage and open file descriptors.
* Ruby garbage collection statistics. - Ruby garbage collection statistics.
Metrics data is written to [InfluxDB][influxdb] over [UDP](influxdb-udp). Stored Metrics data is written to [InfluxDB][influxdb] over [UDP][influxdb-udp]. Stored
data can be visualized using [Grafana][grafana] or any other application that data can be visualized using [Grafana][grafana] or any other application that
supports reading data from InfluxDB. Alternatively data can be queried using the supports reading data from InfluxDB. Alternatively data can be queried using the
InfluxDB CLI. InfluxDB CLI.
...@@ -24,7 +33,7 @@ InfluxDB CLI. ...@@ -24,7 +33,7 @@ InfluxDB CLI.
Two types of metrics are collected: Two types of metrics are collected:
1. Transaction specific metrics. 1. Transaction specific metrics.
2. Sampled metrics, collected at a certain interval in a separate thread. 1. Sampled metrics, collected at a certain interval in a separate thread.
### Transaction Metrics ### Transaction Metrics
...@@ -41,7 +50,7 @@ metrics are collected at a regular interval. This interval is made up out of two ...@@ -41,7 +50,7 @@ metrics are collected at a regular interval. This interval is made up out of two
parts: parts:
1. A user defined interval. 1. A user defined interval.
2. A randomly generated offset added on top of the interval, the same offset 1. A randomly generated offset added on top of the interval, the same offset
can't be used twice in a row. can't be used twice in a row.
The actual interval can be anywhere between a half of the defined interval and a The actual interval can be anywhere between a half of the defined interval and a
......
# GitLab Configuration
By default GitLab Metrics is disabled. To enable GitLab Metrics and change any
of its settings open a web browser and navigate to
`http://YOUR_GITLAB_HOST/admin/application_settings`, the settings can be found
in the "Metrics" section. A restart of all GitLab processes is required for any
changes to take effect.
## Pending Migrations
When any migrations are pending the metrics are disabled until the migrations
have been performed.
# InfluxDB Configuration
The default settings provided by InfluxDB are not sufficient for a high traffic
GitLab environment. The settings discussed in this document are based on the
settings GitLab uses for GitLab.com, depending on your own needs you may need to
further adjust them.
## Requirements
* InfluxDB 0.9 or newer
* A fairly modern version of Linux
* At least 4GB of RAM
* At least 10GB of storage for InfluxDB data
Note that the RAM and storage requirements can differ greatly depending on the
amount of data received/stored. To limit the amount of stored data users can
look into [InfluxDB Retention Policies][influxdb-retention].
## InfluxDB Server Settings
Since InfluxDB has many settings that users may wish to customize themselves
(e.g. what port to run InfluxDB on) we'll only cover the essentials.
### Storage Engine
InfluxDB comes with different storage engines and as of InfluxDB 0.9 a new
storage engine is available called "tsm1". All users _must_ use the new tsm1
storage engine (this will be the default engine in upcoming InfluxDB engines).
### Admin Panel
Production environments should have the InfluxDB admin panel _disabled_. This
feature can be disabled by adding the following to your InfluxDB configuration
file:
[admin]
enabled = false
### HTTP
HTTP is required when using the InfluxDB CLI or other tools such as Grafana,
thus it should be enabled. When enabling make sure to _also_ enable
authentication:
[http]
enabled = true
auth-enabled = true
### UDP
GitLab writes data to InfluxDB via UDP and thus this must be enabled. Enabling
UDP can be done using the following settings:
[udp]
enabled = true
bind-address = ":8089"
database = "gitlab"
batch-size = 1000
batch-pending = 5
batch-timeout = 1s
read-buffer = 209715200
This does the following:
1. Enable UDP and bind it to port 8089 for all addresses.
2. Store any data received in the "gitlab" database.
3. Define a batch of points to be 1000 points in size and allow a maximum of
5 batches _or_ flush them automatically after 1 second.
4. Define a UDP read buffer size of 200 MB.
One of the most important settings here is the UDP read buffer size as if this
value is set too low packets will be dropped. You must also make sure the OS
buffer size is set to the same value, the default value is almost never enough.
To set the OS buffer size to 200 MB on Linux you can run the following command:
sysctl -w net.core.rmem_max=209715200
To make this permanent, add the following to `/etc/sysctl.conf` and restart the
server:
net.core.rmem_max=209715200
It is **very important** to make sure the buffer sizes are large enough to
handle all data sent to InfluxDB as otherwise you _will_ lose data. The above
buffer sizes are based on the traffic for GitLab.com. Depending on the amount of
traffic users may be able to use a smaller buffer size, but we highly recommend
using _at least_ 100 MB.
When enabling UDP users should take care to not expose the port to the public as
doing so will allow anybody to write data into your InfluxDB database (as
InfluxDB's UDP protocol doesn't support authentication). We recommend either
whitelisting the allowed IP addresses/ranges, or setting up a VLAN and only
allowing traffic from members of said VLAN.
[influxdb-retention]: https://docs.influxdata.com/influxdb/v0.9/query_language/database_management/#retention-policy-management