# Configuring Redis for GitLab HA

You can choose to install and manage Redis yourself, or you can use the one
that comes bundled with Omnibus GitLab packages.

> **Note:** Redis does not require authentication by default. See
  [Redis Security](http://redis.io/topics/security) documentation for more
  information. We recommend using a combination of a Redis password and tight
  firewall rules to secure your Redis service.

<!-- START doctoc generated TOC please keep comment here to allow auto update -->
<!-- DON'T EDIT THIS SECTION, INSTEAD RE-RUN doctoc TO UPDATE -->
**Table of Contents**

- [Configure your own Redis server](#configure-your-own-redis-server)
- [Configure Redis using Omnibus](#configure-redis-using-omnibus)
- [Experimental Redis Sentinel support](#experimental-redis-sentinel-support)
- [Redis Sentinel support](#redis-sentinel-support)
  - [Prerequisites](#prerequisites)
  - [Redis setup](#redis-setup)
    - [Existing single-machine installation](#existing-single-machine-installation)
    - [Installation from source](#installation-from-source)
    - [Omnibus packages](#omnibus-packages)
  - [Configuring Sentinel](#configuring-sentinel)
    - [How sentinel handles a failover](#how-sentinel-handles-a-failover)
    - [Sentinel setup](#sentinel-setup)
      - [Community Edition](#community-edition)
      - [Enterprise Edition](#enterprise-edition)
  - [GitLab setup](#gitlab-setup)
- [Troubleshooting](#troubleshooting)
  - [Redis replication](#redis-replication)
  - [Sentinel](#sentinel)
    - [Omnibus GitLab](#omnibus-gitlab)
    - [Install from Source](#install-from-source)

<!-- END doctoc generated TOC please keep comment here to allow auto update -->

## Configure your own Redis server

If you're hosting GitLab on a cloud provider, you can optionally use a
managed service for Redis. For example, AWS offers a managed ElastiCache service
that runs Redis.

## Configure Redis using Omnibus

If you don't want to bother setting up your own Redis server, you can use the
one bundled with Omnibus. In this case, you should disable all services except
Redis.

1. Download/install Omnibus GitLab using **steps 1 and 2** from
   [GitLab downloads](https://about.gitlab.com/downloads). Do not complete other
   steps on the download page.
1. Create/edit `/etc/gitlab/gitlab.rb` and use the following configuration.
   Be sure to change the `external_url` to match your eventual GitLab front-end
   URL:

    ```ruby
    external_url 'https://gitlab.example.com'

    # Disable all services except Redis
    redis['enable'] = true
    bootstrap['enable'] = false
    nginx['enable'] = false
    unicorn['enable'] = false
    sidekiq['enable'] = false
    postgresql['enable'] = false
    gitlab_rails['enable'] = false
    gitlab_workhorse['enable'] = false
    mailroom['enable'] = false

    # Redis configuration
    redis['port'] = 6379
    redis['bind'] = '0.0.0.0'

    # If you wish to use Redis authentication (recommended)
    redis['password'] = 'redis-password-goes-here'
    ```

1. Run `sudo gitlab-ctl reconfigure` to install and configure Redis.

    > **Note**: This `reconfigure` step will result in some errors.
      That's OK - don't be alarmed.

1. Run `touch /etc/gitlab/skip-auto-migrations` to prevent database migrations
   from running on upgrade. Only the primary GitLab application server should
   handle migrations.

## Experimental Redis Sentinel support

> Experimental Redis Sentinel support was [introduced][ce-1877] in GitLab 8.11.
  Starting with 8.13, Redis Sentinel is no longer experimental.
  If you used it with versions `< 8.13` before, please check the updated
  documentation below.

## Redis Sentinel support

Since GitLab 8.11, you can configure a list of Redis Sentinel servers that
will monitor a group of Redis servers and provide standard failover
support.

To get a better understanding of how to set up Sentinel correctly, please read
the [Redis Sentinel documentation](http://redis.io/topics/sentinel) first, as
failing to configure it correctly can lead to data loss.

Redis Sentinel can handle the most important tasks in an HA environment to help
keep servers online with minimal to no downtime:

- Monitors master and slave instances to see if they are available.
- Promotes a slave to master when the master fails.
- Demotes a master to slave when the failed master comes back online (to prevent
  data-partitioning).
- Can be queried by clients to always connect to the correct master server.

The configuration consists of three parts:

- Set up the Redis Master and Slave nodes
- Set up the Sentinel nodes
- Set up GitLab

### Prerequisites

You need at least `3` independent machines: physical, or VMs running on
distinct physical machines.

If you fail to provision the machines in that specific way, any issue with
the shared environment can bring your entire setup down.

Read carefully how to configure those components below.

### Redis setup

You must have at least `3` Redis servers: `1` Master and `2` Slaves, each on
an independent machine (see explanation above).

They should be configured the same way and with similar server specs, as
in a failover situation, any `Slave` can be elected as the new `Master` by
the Sentinel servers.

With Sentinel, you must define a password to protect access, as both
Sentinel instances and other Redis instances should be able to talk to
each other over the network.

You'll need to define both `requirepass` and `masterauth` on all
nodes. At any time during a failover the Sentinels can reconfigure a node
and change its status from `Master` to `Slave` and vice versa.

Initial `Slave` nodes require an additional `slaveof` setting in `redis.conf`
pointing to the initial `Master`.
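
As a rough illustration of how these settings combine, the following Ruby
sketch builds the authentication and replication lines of `redis.conf` for a
given role. The helper name `redis_conf_lines` and the sample address are
hypothetical and are not part of GitLab or Redis:

```ruby
# Hypothetical helper: builds the redis.conf lines discussed above.
# role is :master or :slave; a slave also needs the initial master's address.
def redis_conf_lines(role:, password:, master_ip: nil, master_port: 6379)
  lines = [
    %(requirepass "#{password}"), # clients and Sentinels must authenticate
    %(masterauth "#{password}")   # used when this node replicates from a master
  ]
  if role == :slave
    raise ArgumentError, 'master_ip is required for a slave' if master_ip.nil?
    lines << "slaveof #{master_ip} #{master_port}"
  end
  lines
end

puts redis_conf_lines(role: :slave,
                      password: 'redis-password-goes-here',
                      master_ip: '10.10.10.10')
```

Both password settings are emitted for every role because a failover can turn
any node into a master or a slave.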

#### Existing single-machine installation

If you already have a single-machine GitLab install running, you will need to
replicate from this machine first, before de-activating the Redis instance
inside it.

Your single-machine install will be the initial `Master`, and the `3` others
should be configured as `Slave` pointing to this machine.

After replication catches up, you will need to stop services in the
single-machine install, to rotate the `Master` to one of the new nodes.

Make the required changes in configuration and restart the new nodes again.

To disable Redis in the single install, edit `/etc/gitlab/gitlab.rb`:

```ruby
redis['enable'] = false
```

#### Installation from source

**Configuring Master Redis instance**

You need to make the following changes in `redis.conf`:

1. Define a `bind` address pointing to a local IP that your other machines
   can reach. If you really need to bind to an externally accessible IP, make
   sure you add extra firewall rules to prevent unauthorized access:

   ```conf
   # By default, if no "bind" configuration directive is specified, Redis listens
   # for connections from all the network interfaces available on the server.
   # It is possible to listen to just one or multiple selected interfaces using
   # the "bind" configuration directive, followed by one or more IP addresses.
   #
   # Examples:
   #
   # bind 192.168.1.100 10.0.0.1
   # bind 127.0.0.1 ::1
   # The following binds to all interfaces (comments must be on their own line):
   bind 0.0.0.0
   ```

1. Define a `port` to force Redis to listen on TCP so other machines can
   connect to it:

   ```conf
   # Accept connections on the specified port, default is 6379 (IANA #815344).
   # If port 0 is specified Redis will not listen on a TCP socket.
   port 6379
   ```

1. Set up password authentication (use the same password on all nodes):

    ```conf
    requirepass "redis-password-goes-here"
    masterauth "redis-password-goes-here"
    ```

1. Restart the Redis services for the changes to take effect.

**Configuring Slave Redis instance**

1. Follow the same instructions as for the master, with this extra change in `redis.conf`:

   ```conf
   # IP and port of the master Redis server
   slaveof 10.10.10.10 6379
   ```

1. Restart the Redis services for the changes to take effect.

#### Omnibus packages

You need to install the Omnibus GitLab package on `3` independent machines.

**Configuring Master Redis instance**

You will need to configure the following:

1. Define a `redis['bind']` address pointing to a local IP that your other machines
   can reach. If you really need to bind to an externally accessible IP, make
   sure you add extra firewall rules to prevent unauthorized access.
1. Define a `redis['port']` so Redis can listen for TCP requests, which will
   allow other machines to connect to it.
1. Set up password authentication with `redis['master_password']` (use the same
   password on all nodes).

In `/etc/gitlab/gitlab.rb`:

```ruby
## Redis TCP support (will disable UNIX socket transport)
redis['bind'] = '0.0.0.0' # or specify an IP to bind to a single one
redis['port'] = 6379
redis['password'] = 'redis-password-goes-here'
redis['master_password'] = 'redis-password-goes-here'
```

Reconfigure Omnibus GitLab for the changes to take effect: `sudo gitlab-ctl reconfigure`

**Configuring Slave Redis instances**

You need to make the same changes listed for the `Master` instance,
with an additional `Slave` section as in the example below:

```ruby
redis['bind'] = '0.0.0.0' # or specify an IP to bind to a single one
redis['port'] = 6379
redis['password'] = 'redis-password-goes-here'
redis['master_password'] = 'redis-password-goes-here'

## Slave redis instance
redis['master'] = false
redis['master_ip'] = '10.10.10.10' # IP of master Redis server
redis['master_port'] = 6379 # Port of master Redis server
```

Reconfigure Omnibus GitLab for the changes to take effect: `sudo gitlab-ctl reconfigure`

---

Now that the Redis servers are all set up, let's configure the Sentinel
servers.

If you are not sure if your Redis servers are working and replicating
correctly, please read the [Redis replication](#redis-replication)
troubleshooting section and fix any issues before proceeding with the Sentinel setup.

### Configuring Sentinel

You must have at least `3` Redis Sentinel servers, each on
an independent machine. You can configure them on the same
machines where you've configured the other Redis servers.

This number is required for the consensus algorithm to be effective
in the case of a failure. **You should always have an `odd` number
of Sentinel nodes provisioned**.

#### How sentinel handles a failover

If a `quorum` of Sentinels agree that the `master` is not reachable, the
Sentinels will try to elect a temporary `Leader`. The **Majority** of the
Sentinels must agree to start a failover.

If you don't have the **Majority** of the Sentinels online (for example, during
a network partition), a failover **will not be started**.

For example, for a cluster of `3` Sentinels, at least `2` must agree on a
`Leader`. If you have a total of `5`, at least `3` must agree on a `Leader`.

The `quorum` is only used to detect failure, not to elect the `Leader`.
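
The distinction between the `quorum` and the **Majority** can be sketched in
Ruby as two separate checks. This is only a simplified model of the rules
described above, not Sentinel's actual implementation, and the function names
are made up for illustration:

```ruby
# Number of Sentinels that must authorize a failover: a strict majority.
def majority(total_sentinels)
  total_sentinels / 2 + 1
end

# Hypothetical helper modelling the two separate conditions described above:
# - at least `quorum` Sentinels must report the master as down (detection)
# - a majority of all Sentinels must be reachable to elect a Leader (failover)
def failover_starts?(total:, quorum:, reporting_down:, reachable:)
  reporting_down >= quorum && reachable >= majority(total)
end

puts majority(3) # => 2
puts majority(5) # => 3
# The quorum of 2 is met, but only 2 of 5 Sentinels can talk to each other:
puts failover_starts?(total: 5, quorum: 2, reporting_down: 2, reachable: 2) # => false
```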

The official [Sentinel documentation](http://redis.io/topics/sentinel#example-sentinel-deployments)
also lists different network topologies and warns against situations like
network partitions and how they can affect the state of the HA solution. Make
sure you read it carefully and understand the implications for your current
setup.

GitLab Enterprise Edition provides an [automated way to set up and run](#enterprise-edition) the Sentinel daemon.

#### Sentinel setup

##### Community Edition

With GitLab Community Edition, you need to install, configure, execute, and
monitor Sentinel from source. The Omnibus GitLab Community Edition package does
not support Sentinel configuration.

A minimal configuration file (`sentinel.conf`) should contain the following:

```conf
# Bind to all interfaces, or change to a specific IP
bind 0.0.0.0
# Default Sentinel port
port 26379
# The master must be declared before other settings that reference its name
sentinel monitor gitlab-redis 10.0.0.1 6379 2
sentinel auth-pass gitlab-redis redis-password-goes-here
sentinel down-after-milliseconds gitlab-redis 10000
sentinel config-epoch gitlab-redis 0
sentinel leader-epoch gitlab-redis 0
```
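
If you maintain several Sentinel nodes by hand, it can help to generate this
file from one set of values. The Ruby sketch below is one possible way to do
that. The helper `sentinel_conf` and its parameters are hypothetical, and it
only covers the directives shown above:

```ruby
# Hypothetical helper: renders a minimal sentinel.conf like the one above.
# It rejects a quorum larger than the number of Sentinels, since such a
# quorum could never be reached.
def sentinel_conf(master_name:, master_ip:, password:, quorum:, sentinel_count:,
                  master_port: 6379, sentinel_port: 26379, down_after_ms: 10_000)
  unless (1..sentinel_count).cover?(quorum)
    raise ArgumentError, "quorum must be between 1 and #{sentinel_count}"
  end

  # `sentinel monitor` comes first so the master name is already defined
  # when the following directives refer to it.
  <<~CONF
    port #{sentinel_port}
    sentinel monitor #{master_name} #{master_ip} #{master_port} #{quorum}
    sentinel auth-pass #{master_name} #{password}
    sentinel down-after-milliseconds #{master_name} #{down_after_ms}
  CONF
end

puts sentinel_conf(master_name: 'gitlab-redis', master_ip: '10.0.0.1',
                   password: 'redis-password-goes-here',
                   quorum: 2, sentinel_count: 3)
```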

##### Enterprise Edition

To set up Sentinel, edit the `/etc/gitlab/gitlab.rb` file:

```ruby
## When you install Sentinel on a separate machine, you need to control which
## other services will be running on it. Take a look at the following variables
## and enable or disable them to fit your strategy:

## Enable the Redis and Sentinel services
redis['enable'] = true
sentinel['enable'] = true

## Disable all other services
bootstrap['enable'] = false
nginx['enable'] = false
unicorn['enable'] = false
sidekiq['enable'] = false
postgresql['enable'] = false
gitlab_workhorse['enable'] = false
gitlab_rails['enable'] = false
mailroom['enable'] = false

## Configure Redis
redis['master_name'] = 'gitlab-redis' # must be the same in every sentinel node
redis['master_ip'] = '10.0.0.1' # ip of the initial master redis instance
redis['master_port'] = 6379 # port of the initial master redis instance
redis['master_password'] = 'your-secure-password-here' # the same value defined in redis['password'] in the master instance

## Configure Sentinel
# sentinel['port'] = 26379 # uncomment to change default port

## Quorum must reflect the number of voting Sentinels it takes to start a failover.
## The value must NOT be greater than the number of Sentinels.
##
## The quorum can be used to tune Sentinel in two ways:
## 1. If the quorum is set to a value smaller than the majority of Sentinels
##    we deploy, we are basically making Sentinel more sensitive to master failures,
##    triggering a failover as soon as even just a minority of Sentinels is no longer
##    able to talk with the master.
## 1. If the quorum is set to a value greater than the majority of Sentinels, we are
##    making Sentinel able to failover only when there are a very large number (larger
##    than majority) of well connected Sentinels which agree about the master being down.
sentinel['quorum'] = 2

## Consider unresponsive server down after x amount of ms.
# sentinel['down_after_milliseconds'] = 10000

## Specifies the failover timeout in milliseconds. It is used in many ways:
##
## - The time needed to re-start a failover after a previous failover was
##   already tried against the same master by a given Sentinel, is two
##   times the failover timeout.
##
## - The time needed for a slave replicating to a wrong master according
##   to a Sentinel current configuration, to be forced to replicate
##   with the right master, is exactly the failover timeout (counting since
##   the moment a Sentinel detected the misconfiguration).
##
## - The time needed to cancel a failover that is already in progress but
##   did not produce any configuration change (SLAVEOF NO ONE yet not
##   acknowledged by the promoted slave).
##
## - The maximum time a failover in progress waits for all the slaves to be
##   reconfigured as slaves of the new master. However even after this time
##   the slaves will be reconfigured by the Sentinels anyway, but not with
##   the exact parallel-syncs progression as specified.
# sentinel['failover_timeout'] = 60000
```

---

The final part is to inform the main GitLab application server of the Redis
master and the new Sentinel servers.

### GitLab setup

You can enable or disable Sentinel support at any time in new or existing
installations. From the GitLab application perspective, all it requires is
the correct credentials for the master Redis and for a few Sentinel nodes.

It doesn't require a list of all Sentinel nodes, as in case of a failure,
the application will need to query only one of them.
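
To illustrate why a partial list is enough, here is a simplified Ruby model of
how a client can walk its Sentinel list until one of them answers. It is a
sketch and not the `redis-rb` implementation: each Sentinel is represented by
a lambda, where a real client would send the
`SENTINEL get-master-addr-by-name` command over the network:

```ruby
# Hypothetical sketch: each "sentinel" is modelled as a callable that either
# returns [host, port] of the current master or raises when unreachable.
def resolve_master(sentinels, master_name)
  sentinels.each do |sentinel|
    begin
      address = sentinel.call(master_name)
      return address if address
    rescue IOError, SystemCallError
      next # this Sentinel is unreachable, try the next one
    end
  end
  raise 'No sentinels available'
end

down = ->(_name) { raise Errno::ECONNREFUSED }
up   = ->(_name) { ['10.0.0.2', 6379] }

p resolve_master([down, up], 'gitlab-redis') # => ["10.0.0.2", 6379]
```

The first reachable Sentinel is sufficient, and the error is raised only when
none of the listed Sentinels respond.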

>**Note:**
The following steps should be performed in the [GitLab application server](gitlab.md).

**For source-based installations**

1. Edit `/home/git/gitlab/config/resque.yml` following the example in
   `/home/git/gitlab/config/resque.yml.example`, and uncomment the sentinels
   line, changing to the correct server credentials.
1. [Restart GitLab][restart] for the changes to take effect.

**For Omnibus installations**

1. Edit `/etc/gitlab/gitlab.rb` and add/change the following lines:

    ```ruby
    redis['master_name'] = "gitlab-redis"
    redis['master_password'] = 'redis-password-goes-here'
    gitlab_rails['redis_sentinels'] = [
      {'host' => '10.10.10.1', 'port' => 26379},
      {'host' => '10.10.10.2', 'port' => 26379},
      {'host' => '10.10.10.3', 'port' => 26379}
    ]
    ```

1. [Reconfigure] GitLab for the changes to take effect.

## Troubleshooting

There are a lot of moving parts that need to be handled carefully
in order for the HA setup to work as expected.

Before proceeding with the troubleshooting below, check your firewall
rules:

- Redis machines
   - Accept TCP connections on port `6379`
   - Connect to the other Redis machines via TCP on port `6379`
- Sentinel machines
   - Accept TCP connections on port `26379`
   - Connect to other Sentinel machines via TCP on port `26379`
   - Connect to the Redis machines via TCP on port `6379`
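
One way to verify these rules from a given machine is a small Ruby script
that tries to open each connection. This is a minimal sketch using only the
standard library. The helper names are made up, and a successful TCP connect
only shows that the port is reachable, not that Redis or Sentinel is healthy:

```ruby
require 'socket'

# Returns true if a TCP connection to host:port can be opened within
# `timeout` seconds, false otherwise.
def port_open?(host, port, timeout: 2)
  Socket.tcp(host, port, connect_timeout: timeout) { true }
rescue SystemCallError, SocketError, IOError
  false
end

# Checks every [host, port] pair and returns the ones that could not be reached.
def unreachable(targets, timeout: 2)
  targets.reject { |host, port| port_open?(host, port, timeout: timeout) }
end
```

For example, running `unreachable([['10.0.0.1', 6379], ['10.0.0.1', 26379]])`
on a Sentinel machine (the addresses are placeholders) would return the pairs
that are blocked from that machine.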

### Redis replication

You can check if everything is correct by connecting to each server using
the `redis-cli` application and sending the `INFO` command.

If authentication was correctly defined, it should fail with a
`NOAUTH Authentication required` error. Try to authenticate with the
previously defined password with `AUTH redis-password-goes-here` and
try the `INFO` command again.

Look for the `# Replication` section where you should see some important
information like the `role` of the server.

When connected to a `master` redis, you will see the number of connected
`slaves`, and a list of each with connection details:

```
# Replication
role:master
connected_slaves:1
slave0:ip=10.133.5.21,port=6379,state=online,offset=208037514,lag=1
master_repl_offset:208037658
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:206989083
repl_backlog_histlen:1048576
```

When it's a `slave`, you will see details of the master connection and whether
it's `up` or `down`:

```
# Replication
role:slave
master_host:10.133.1.58
master_port:6379
master_link_status:up
master_last_io_seconds_ago:1
master_sync_in_progress:0
slave_repl_offset:208096498
slave_priority:100
slave_read_only:1
connected_slaves:0
master_repl_offset:0
repl_backlog_active:0
repl_backlog_size:1048576
repl_backlog_first_byte_offset:0
repl_backlog_histlen:0
```
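
If you want to check these fields on several servers, the output can be parsed
with a few lines of Ruby. The sketch below works on the raw text of the
`# Replication` section. The `replication_ok?` rule is an assumption about
what "healthy" means here, not an official check:

```ruby
# Parses the text returned by `INFO replication` into a Hash, skipping the
# "# Replication" header and blank lines.
def parse_info(text)
  text.each_line.with_object({}) do |line, info|
    line = line.strip
    next if line.empty? || line.start_with?('#')
    key, value = line.split(':', 2)
    info[key] = value
  end
end

# Hypothetical health check built on the fields shown above.
def replication_ok?(info)
  case info['role']
  when 'master' then info['connected_slaves'].to_i > 0
  when 'slave'  then info['master_link_status'] == 'up'
  else false
  end
end

sample = <<~INFO
  # Replication
  role:slave
  master_host:10.133.1.58
  master_port:6379
  master_link_status:up
INFO

info = parse_info(sample)
puts info['role']          # => slave
puts replication_ok?(info) # => true
```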

### Sentinel

#### Omnibus GitLab

If you get an error like: `Redis::CannotConnectError: No sentinels available.`,
there may be something wrong with your configuration files or it can be related
to [this issue][gh-531].

You must make sure you are defining the same value in `redis['master_name']`
and `redis['master_password']` as you defined for your sentinel node.

The way the redis connector `redis-rb` works with sentinel is a bit
non-intuitive. We try to hide the complexity in omnibus, but it still requires
a few extra configs.

#### Install from Source

If you get an error like: `Redis::CannotConnectError: No sentinels available.`,
there may be something wrong with your configuration files or it can be related
to [this issue][gh-531].

The way you have to configure `resque.yml` and `sentinel.conf` is a bit
non-intuitive, but otherwise `redis-rb` will not work properly.

The `master-group-name` (`gitlab-redis`) defined in `sentinel.conf`
**must** be used as the hostname in GitLab (`resque.yml` for source installations
or `gitlab_rails['redis_*']` in Omnibus):

```conf
# sentinel.conf:
sentinel monitor gitlab-redis 10.10.10.10 6379 2
sentinel down-after-milliseconds gitlab-redis 10000
sentinel config-epoch gitlab-redis 0
sentinel leader-epoch gitlab-redis 0
```

```yaml
# resque.yml
production:
  url: redis://:myredispassword@gitlab-redis/
  sentinels:
    -
      host: slave1.example.com # or use ip
      port: 26380 # point to sentinel, not to redis port
    -
      host: slave2.example.com # or use ip
      port: 26381 # point to sentinel, not to redis port
```
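
Because a mismatch between these two files is easy to miss, a quick
consistency check can help. The following Ruby sketch compares the host part
of the URL in `resque.yml` with the name used in `sentinel monitor`. The
helper names are hypothetical and the script only reads text you pass to it:

```ruby
require 'uri'
require 'yaml'

# Extracts the master group name from a `sentinel monitor` line.
def monitored_master(sentinel_conf)
  line = sentinel_conf.each_line.find { |l| l.strip.start_with?('sentinel monitor ') }
  line && line.split[2]
end

# Returns true when the host part of the Redis URL in resque.yml equals the
# master group name that Sentinel monitors.
def names_match?(resque_yml, sentinel_conf, env = 'production')
  url = YAML.safe_load(resque_yml).dig(env, 'url')
  URI.parse(url).host == monitored_master(sentinel_conf)
end

resque = <<~RESQUE
  production:
    url: redis://:myredispassword@gitlab-redis/
RESQUE

puts names_match?(resque, "sentinel monitor gitlab-redis 10.10.10.10 6379 2\n") # => true
```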

When in doubt, please read the [Redis Sentinel documentation](http://redis.io/topics/sentinel).

---

To make sure your configuration is correct:

1. SSH into your GitLab application server
1. Enter the Rails console:

    ```
    # For Omnibus installations
    sudo gitlab-rails console

    # For source installations
    sudo -u git rails console production
    ```

1. Run in the console:

    ```ruby
    redis = Redis.new(Gitlab::Redis.params)
    redis.info
    ```

    Keep this screen open and try to simulate a failover below.

1. To simulate a failover on master Redis, SSH into the Redis server and run:

    ```bash
    # port must match your master redis port; add `-a <password>` if
    # authentication is enabled
    redis-cli -h localhost -p 6379 DEBUG sleep 60
    ```

1. Then back in the Rails console from the first step, run:

    ```
    redis.info
    ```

    You should see a different port after a few seconds delay
    (the failover/reconnect time).
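
Instead of re-running the command by hand, you could poll until the value
changes. The helper below is a generic sketch with a made-up name. It does not
handle connection errors that may be raised while the failover is in progress:

```ruby
# Calls the block every `interval` seconds until its result differs from
# `initial`, and returns the new value; returns nil after `timeout` seconds.
def wait_for_change(initial, timeout: 30, interval: 1)
  deadline = Time.now + timeout
  while Time.now < deadline
    current = yield
    return current unless current == initial
    sleep interval
  end
  nil
end

# In the Rails console this could be used as follows (an assumption, not
# taken from the GitLab documentation):
#   before = redis.info['tcp_port']
#   wait_for_change(before) { redis.info['tcp_port'] }
```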

---

Read more on high-availability configuration:

1. [Configure the database](database.md)
1. [Configure NFS](nfs.md)
1. [Configure the GitLab application servers](gitlab.md)
1. [Configure the load balancers](load_balancer.md)

[ce-1877]: https://gitlab.com/gitlab-org/gitlab-ce/merge_requests/1877
[restart]: ../restart_gitlab.md#installations-from-source
[reconfigure]: ../restart_gitlab.md#omnibus-gitlab-reconfigure
[gh-531]: https://github.com/redis/redis-rb/issues/531
[gh-534]: https://github.com/redis/redis-rb/issues/534