# GitLab Architecture Overview

## Software delivery

There are two editions of GitLab: [Enterprise Edition](https://about.gitlab.com/gitlab-ee/) (EE) and [Community Edition](https://about.gitlab.com/gitlab-ce/) (CE). GitLab CE is delivered via Git from the [gitlabhq repository](https://gitlab.com/gitlab-org/gitlab-ce/tree/master). New versions of GitLab are released in stable branches, and the master branch is for bleeding-edge development.

EE releases are available not long after CE releases. GitLab EE is available from a [repository at gitlab.com](https://gitlab.com/subscribers/gitlab-ee). For more information about the release process, see the section 'New versions and upgrading' in the README.

Both EE and CE require two add-on components, gitlab-shell and Gitaly. These components are available from the [gitlab-shell](https://gitlab.com/gitlab-org/gitlab-shell/tree/master) and [gitaly](https://gitlab.com/gitlab-org/gitaly/tree/master) repositories respectively. New versions are usually tags, but staying on the master branch will give you the latest stable version. New releases generally appear around the same time as GitLab CE releases, with the exception of informal security updates deemed critical.

## Physical office analogy

You can imagine GitLab as a physical office.

**The repositories** are the goods GitLab handles.
They can be stored in a warehouse.
This can be either a hard disk or something more complex, such as an NFS filesystem;

**Nginx** acts like the front desk.
Users come to Nginx and request actions to be done by workers in the office;

**The database** is a series of metal file cabinets with information on:
 - The goods in the warehouse (metadata, issues, merge requests etc);
 - The users coming to the front desk (permissions)

**Redis** is a communication board with “cubby holes” that can contain tasks for office workers;

**Sidekiq** is a worker that primarily handles sending out emails.
It takes tasks from the Redis communication board;

**A Unicorn worker** is a worker that handles quick/mundane tasks.
It works with the communication board (Redis).
Its job description:
 - check permissions by checking the user session stored in a Redis “cubby hole”;
 - make tasks for Sidekiq;
 - fetch stuff from the warehouse or move things around in there;

**GitLab-shell** is a third kind of worker that takes orders from a fax machine (SSH) instead of the front desk (HTTP).
GitLab-shell communicates with Sidekiq via the “communication board” (Redis), and asks quick questions of the Unicorn workers either directly or via the front desk.

**Gitaly** is a back desk that specializes in reaching the disks to perform git operations efficiently and keeps a copy of the results of costly operations. All git operations go through Gitaly.

**GitLab Enterprise Edition (the application)** is the collection of processes and business practices that the office is run by.

## System Layout

In the pictures, `~git` refers to the home directory of the `git` user, which is typically `/home/git`.

GitLab is primarily installed within the `/home/git` home directory as the `git` user. The gitlabhq server software resides in this home directory, as do the repositories (though the repository location is configurable).

The bare repositories are located in `/home/git/repositories`. GitLab is a Ruby on Rails application, so the particulars of its inner workings can be learned by studying how a Ruby on Rails application works.

To serve repositories over SSH there's an add-on application called gitlab-shell which is installed in `/home/git/gitlab-shell`.
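
The layout described above can be checked with a small helper. This is a minimal sketch, assuming the default locations named in this document; the function name `check_gitlab_layout` is hypothetical, and the repository location may differ on your install because it is configurable.

```shell
# Report which of the default directories exist under a given home directory.
# Assumption: default layout (gitlab, gitlab-shell, repositories under /home/git).
check_gitlab_layout() {
    home_dir="${1:-/home/git}"
    missing=0
    for dir in gitlab gitlab-shell repositories; do
        if [ -d "$home_dir/$dir" ]; then
            echo "ok       $home_dir/$dir"
        else
            echo "missing  $home_dir/$dir"
            missing=$((missing + 1))
        fi
    done
    # Exit status is non-zero when at least one directory is missing.
    [ "$missing" -eq 0 ]
}

# Usage (not run here): check_gitlab_layout /home/git
```

A non-zero exit status only means the default layout was not found; it does not by itself indicate a broken install.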

### Components

![GitLab Diagram Overview](gitlab_architecture_diagram.png)

_[edit diagram (for GitLab team members only)](https://docs.google.com/drawings/d/1fBzAyklyveF-i-2q-OHUIqDkYfjjxC4mq5shwKSZHLs/edit)_

A typical install of GitLab will be on GNU/Linux. It uses Nginx or Apache as a web front end that proxies requests to the Unicorn web server. By default, communication between Unicorn and the front end is via a Unix domain socket, but forwarding requests via TCP is also supported. The web front end accesses `/home/git/gitlab/public` directly, bypassing the Unicorn server, to serve static pages, uploads (e.g. avatar images or attachments), and precompiled assets. GitLab serves web pages and the [GitLab API](https://gitlab.com/gitlab-org/gitlab-ce/tree/master/doc/api) using the Unicorn web server. It uses Sidekiq as a job queue which, in turn, uses Redis as a non-persistent database backend for job information, metadata, and incoming jobs.
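
To see whether a given Unicorn configuration listens on a Unix domain socket or on TCP, the `listen` directives in `unicorn.rb` can be classified with a short filter. This is a sketch only: the function name `unicorn_listeners` is hypothetical, and the two sample `listen` lines (including the socket path) are assumptions for illustration, not values taken from this document.

```shell
# Read a Unicorn config on stdin and classify each active `listen` directive.
# Targets that start with "/" are treated as Unix sockets, anything else as TCP.
unicorn_listeners() {
    # Commented-out lines are skipped because they do not start with `listen`.
    sed -n 's/^[[:space:]]*listen[[:space:]]*"\([^"]*\)".*/\1/p' |
    while IFS= read -r target; do
        case "$target" in
            /*) echo "unix $target" ;;
            *)  echo "tcp $target" ;;
        esac
    done
}

# Demo on assumed sample lines (real files will differ):
unicorn_listeners <<'EOF'
listen "/home/git/gitlab/tmp/sockets/gitlab.socket", :backlog => 1024
# listen "0.0.0.0:9090"
listen "127.0.0.1:8080", :tcp_nopush => true
EOF
```

On a real install the same filter could be fed with `unicorn_listeners < /home/git/gitlab/config/unicorn.rb`.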

The GitLab web app uses MySQL or PostgreSQL for persistent database information (e.g. users, permissions, issues, other metadata). GitLab stores the bare git repositories it serves in `/home/git/repositories` by default. It also keeps default branch and hook information with the bare repository.

When serving repositories over HTTP/HTTPS GitLab utilizes the GitLab API to resolve authorization and access as well as serving git objects.

The add-on component gitlab-shell serves repositories over SSH. It manages the SSH keys within `/home/git/.ssh/authorized_keys` which should not be manually edited. gitlab-shell accesses the bare repositories through Gitaly to serve git objects and communicates with redis to submit jobs to Sidekiq for GitLab to process. gitlab-shell queries the GitLab API to determine authorization and access.
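
Because `authorized_keys` should not be edited by hand, a read-only inspection is the safer way to see which keys gitlab-shell manages. The sketch below assumes that each managed entry starts with a `command="... gitlab-shell key-N"` option; that entry format and the function name `list_gitlab_key_ids` are assumptions for illustration and should be checked against your own file.

```shell
# Read an authorized_keys file on stdin and print the key-N identifier of
# every entry that appears to be managed by gitlab-shell. Nothing is modified.
list_gitlab_key_ids() {
    sed -n 's/^command="[^"]*gitlab-shell \(key-[0-9][0-9]*\)".*/\1/p'
}

# Usage (not run here): list_gitlab_key_ids < /home/git/.ssh/authorized_keys
```

Entries that do not match the assumed format, such as manually added keys, are simply not listed.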

Gitaly executes git operations from gitlab-shell and Workhorse, and provides an API to the GitLab web app to get attributes from git (e.g. title, branches, tags, other metadata), and to get blobs (e.g. diffs, commits, files).

### Installation Folder Summary

To summarize, here's the [directory structure of the `git` user home directory](../install/structure.md).

### Processes

    ps aux | grep '^git'

GitLab needs several components to operate. Running as system users (i.e. any user that is not the `git` user), it requires a persistent database (MySQL/PostgreSQL) and a Redis database. It also uses Apache httpd or Nginx to proxy requests to Unicorn. As the `git` user it starts Sidekiq and Unicorn (a simple Ruby HTTP server running on port `8080` by default). Under the `git` user there are normally 4 processes: `unicorn_rails master` (1 process), `unicorn_rails worker` (2 processes), `sidekiq` (1 process).
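
The expected process count can be compared against a live system by counting the rows of `ps aux` owned by the `git` user. This is a minimal sketch; the function name `count_user_processes` and the sample output are hypothetical and only mirror the four processes listed above.

```shell
# Count processes owned by a user in `ps aux`-style input read from stdin.
count_user_processes() {
    user="${1:-git}"
    # Skip the header row; the first column of `ps aux` is the owning user.
    awk -v u="$user" 'NR > 1 && $1 == u { n++ } END { print n + 0 }'
}

# Hypothetical, simplified sample resembling `ps aux` on a default install.
sample='USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 900 0.0 0.1 100 100 ? Ss 10:00 0:00 nginx: master process
git 1001 0.0 1.0 100 100 ? Sl 10:00 0:01 unicorn_rails master
git 1002 0.0 1.0 100 100 ? Sl 10:00 0:01 unicorn_rails worker[0]
git 1003 0.0 1.0 100 100 ? Sl 10:00 0:01 unicorn_rails worker[1]
git 1004 0.0 1.0 100 100 ? Sl 10:00 0:02 sidekiq'

printf '%s\n' "$sample" | count_user_processes git   # prints 4
```

On a real host the input would come from `ps aux | count_user_processes git`; a different count is not necessarily an error, since the number of Unicorn workers is configurable.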

### Repository access

Repositories get accessed via HTTP or SSH. HTTP cloning/push/pull utilizes the GitLab API and SSH cloning is handled by gitlab-shell (previously explained).
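
As a rough illustration of the two access paths, a remote URL can be classified by its form. This is a sketch under stated assumptions: the function name `repo_access_path` and the example host `gitlab.example.com` are hypothetical.

```shell
# Classify a git remote URL as HTTP(S) access or SSH access.
repo_access_path() {
    case "$1" in
        http://*|https://*) echo "http" ;;    # resolved through the GitLab API
        ssh://*|*@*:*)      echo "ssh" ;;     # handled by gitlab-shell
        *)                  echo "unknown" ;;
    esac
}

repo_access_path "https://gitlab.example.com/group/project.git"   # prints http
repo_access_path "git@gitlab.example.com:group/project.git"       # prints ssh
```

The HTTP patterns are tested first, so an HTTPS URL that contains credentials and a port is still reported as `http`.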

## Troubleshooting

See the README for more information.

### Init scripts of the services

The GitLab init script starts and stops Unicorn and Sidekiq.

```
/etc/init.d/gitlab
Usage: service gitlab {start|stop|restart|reload|status}
```

Redis (key-value store/non-persistent database)

```
/etc/init.d/redis
Usage: /etc/init.d/redis {start|stop|status|restart|condrestart|try-restart}
```

SSH daemon

```
/etc/init.d/sshd
Usage: /etc/init.d/sshd {start|stop|restart|reload|force-reload|condrestart|try-restart|status}
```

Web server (one of the following)

```
/etc/init.d/httpd
Usage: httpd {start|stop|restart|condrestart|try-restart|force-reload|reload|status|fullstatus|graceful|help|configtest}

/etc/init.d/nginx
Usage: nginx {start|stop|restart|reload|force-reload|status|configtest}
```

Persistent database (one of the following)

```
/etc/init.d/mysqld
Usage: /etc/init.d/mysqld {start|stop|status|restart|condrestart|try-restart|reload|force-reload}

/etc/init.d/postgresql
Usage: /etc/init.d/postgresql {start|stop|restart|reload|force-reload|status} [version ..]
```

### Log locations of the services

Note: the paths below assume that the home directory of the `git` user is `/home/git`.

gitlabhq (includes Unicorn and Sidekiq logs)

- `/home/git/gitlab/log/` contains `application.log`, `production.log`, `sidekiq.log`, `unicorn.stdout.log`, `githost.log` and `unicorn.stderr.log` normally.

gitlab-shell

- `/home/git/gitlab-shell/gitlab-shell.log`

ssh

- `/var/log/auth.log` auth log (on Ubuntu).
- `/var/log/secure` auth log (on RHEL).

nginx

- `/var/log/nginx/` contains error and access logs.

Apache httpd

- [Explanation of Apache logs](https://httpd.apache.org/docs/2.2/logs.html).
- `/var/log/apache2/` contains error and output logs (on Ubuntu).
- `/var/log/httpd/` contains error and output logs (on RHEL).

redis

- `/var/log/redis/redis.log` (log-rotated copies are kept in the same directory).

PostgreSQL

- `/var/log/postgresql/*`

MySQL

- `/var/log/mysql/*`
- `/var/log/mysql.*`
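
Several of the locations above differ between distributions (for example the ssh auth log and the Apache log directory). A small helper can pick whichever candidate exists on the current host. This is a sketch; the function name `first_existing` is hypothetical.

```shell
# Print the first path among the arguments that exists on this host.
# Returns non-zero when none of the candidates exist.
first_existing() {
    for path in "$@"; do
        if [ -e "$path" ]; then
            echo "$path"
            return 0
        fi
    done
    return 1
}

# Usage (not run here):
#   first_existing /var/log/auth.log /var/log/secure      # ssh auth log
#   first_existing /var/log/apache2 /var/log/httpd        # Apache log directory
```

The candidates are checked in the order given, so the preferred location should be listed first.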

### GitLab specific config files

GitLab has configuration files located in `/home/git/gitlab/config/*`. Commonly referenced config files include:

- `gitlab.yml` - GitLab configuration.
- `unicorn.rb` - Unicorn web server settings.
- `database.yml` - Database connection settings.

gitlab-shell has a configuration file at `/home/git/gitlab-shell/config.yml`.

### Maintenance Tasks

[GitLab](https://gitlab.com/gitlab-org/gitlab-ce/tree/master) provides rake tasks that show version information and run a quick check on your configuration to ensure it is configured properly within the application. See [maintenance rake tasks](https://gitlab.com/gitlab-org/gitlab-ce/blob/master/doc/raketasks/maintenance.md).

In a nutshell, do the following:

```
sudo -i -u git
cd gitlab
bundle exec rake gitlab:env:info RAILS_ENV=production
bundle exec rake gitlab:check RAILS_ENV=production
```

Note: It is recommended to log in as the `git` user using `sudo -i -u git` or `sudo su - git`. While the sudo commands provided by gitlabhq work in Ubuntu, they do not always work in RHEL.