BigW Consortium Gitlab

what_requires_downtime.md 5.88 KB
Newer Older
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33
# What requires downtime?

When working with a database certain operations can be performed without taking
GitLab offline, others do require a downtime period. This guide describes
various operations and their impact.

## Adding Columns

On PostgreSQL you can safely add a new column to an existing table as long as it
does **not** have a default value. For example, this query would not require
downtime:

```sql
ALTER TABLE projects ADD COLUMN random_value int;
```

Add a column _with_ a default however does require downtime. For example,
consider this query:

```sql
ALTER TABLE projects ADD COLUMN random_value int DEFAULT 42;
```

This requires updating every single row in the `projects` table so that
`random_value` is set to `42` by default. This requires updating all rows and
indexes in a table. This in turn acquires enough locks on the table for it to
effectively block any other queries.

As of MySQL 5.6 adding a column to a table is still quite an expensive
operation, even when using `ALGORITHM=INPLACE` and `LOCK=NONE`. This means
downtime _may_ be required when modifying large tables as otherwise the
operation could potentially take hours to complete.

34 35 36 37 38 39 40 41
Adding a column with a default value _can_ be done without requiring downtime
when using the migration helper method
`Gitlab::Database::MigrationHelpers#add_column_with_default`. This method works
similar to `add_column` except it updates existing rows in batches without
blocking access to the table being modified. See ["Adding Columns With Default
Values"](migration_style_guide.html#adding-columns-with-default-values) for more
information on how to use this method.

42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68
## Dropping Columns

On PostgreSQL you can safely remove an existing column without the need for
downtime. When you drop a column in PostgreSQL it's not immediately removed,
instead it is simply disabled. The data is removed on the next vacuum run.

On MySQL this operation requires downtime.

While database wise dropping a column may be fine on PostgreSQL this operation
still requires downtime because the application code may still be using the
column that was removed. For example, consider the following migration:

```ruby
class MyMigration < ActiveRecord::Migration
  def change
    remove_column :projects, :dummy
  end
end
```

Now imagine that the GitLab instance is running and actively uses the `dummy`
column. If we were to run the migration this would result in the GitLab instance
producing errors whenever it tries to use the `dummy` column.

As a result of the above downtime _is_ required when removing a column, even
when using PostgreSQL.

69 70 71 72 73 74
## Renaming Columns

Renaming columns requires downtime as running GitLab instances will continue
using the old column name until a new version is deployed. This can result
in the instance producing errors, which in turn can impact the user experience.

75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167
## Changing Column Constraints

Generally changing column constraints requires checking all rows in the table to
see if they meet the new constraint, unless a constraint is _removed_. For
example, changing a column that previously allowed NULL values to not allow NULL
values requires the database to verify all existing rows.

The specific behaviour varies a bit between databases but in general the safest
approach is to assume changing constraints requires downtime.

## Changing Column Types

This operation requires downtime.

## Adding Indexes

Adding indexes is an expensive process that blocks INSERT and UPDATE queries for
the duration. When using PostgreSQL one can work arounds this by using the
`CONCURRENTLY` option:

```sql
CREATE INDEX CONCURRENTLY index_name ON projects (column_name);
```

Migrations can take advantage of this by using the method
`add_concurrent_index`. For example:

```ruby
class MyMigration < ActiveRecord::Migration
  def change
    add_concurrent_index :projects, :column_name
  end
end
```

When running this on PostgreSQL the `CONCURRENTLY` option mentioned above is
used. On MySQL this method produces a regular `CREATE INDEX` query.

MySQL doesn't really have a workaround for this. Supposedly it _can_ create
indexes without the need for downtime but only for variable width columns. The
details on this are a bit sketchy. Since it's better to be safe than sorry one
should assume that adding indexes requires downtime on MySQL.

## Dropping Indexes

Dropping an index does not require downtime on both PostgreSQL and MySQL.

## Adding Tables

This operation is safe as there's no code using the table just yet.

## Dropping Tables

This operation requires downtime as application code may still be using the
table.

## Adding Foreign Keys

Adding foreign keys acquires an exclusive lock on both the source and target
tables in PostgreSQL. This requires downtime as otherwise the entire application
grinds to a halt for the duration of the operation.

On MySQL this operation also requires downtime _unless_ foreign key checks are
disabled. Because this means checks aren't enforced this is not ideal, as such
one should assume MySQL also requires downtime.

## Removing Foreign Keys

This operation should not require downtime on both PostgreSQL and MySQL.

## Updating Data

Updating data should generally be safe. The exception to this is data that's
being migrated from one version to another while the application still produces
data in the old version.

For example, imagine the application writes the string `'dog'` to a column but
it really is meant to write `'cat'` instead. One might think that the following
migration is all that is needed to solve this problem:

```ruby
class MyMigration < ActiveRecord::Migration
  def up
    execute("UPDATE some_table SET column = 'cat' WHERE column = 'dog';")
  end
end
```

Unfortunately this is not enough. Because the application is still running and
using the old value this may result in the table still containing rows where
`column` is set to `dog`, even after the migration finished.

In these cases downtime _is_ required, even for rarely updated tables.