BigW Consortium Gitlab

shell_commands.md 7.87 KB
Newer Older
Jacob Vosmaer committed
1 2
# Guidelines for shell commands in the GitLab codebase

3 4 5
This document contains guidelines for working with processes and files in the GitLab codebase.
These guidelines are meant to make your code more reliable _and_ secure.

6 7 8 9
## References

- [Google Ruby Security Reviewer's Guide](https://code.google.com/p/ruby-security/wiki/Guide)
- [OWASP Command Injection](https://www.owasp.org/index.php/Command_Injection)
dosire committed
10
- [Ruby on Rails Security Guide Command Line Injection](http://guides.rubyonrails.org/security.html#command-line-injection)
11

Jacob Vosmaer committed
12 13
## Use File and FileUtils instead of shell commands

14
Sometimes we invoke basic Unix commands via the shell when there is also a Ruby API for doing it. Use the Ruby API if it exists. <http://www.ruby-doc.org/stdlib-2.0.0/libdoc/fileutils/rdoc/FileUtils.html#module-FileUtils-label-Module+Functions>
Jacob Vosmaer committed
15 16 17 18 19 20 21 22 23 24 25 26 27

```ruby
# Wrong
system "mkdir -p tmp/special/directory"
# Better (separate tokens)
system *%W(mkdir -p tmp/special/directory)
# Best (do not use a shell command)
FileUtils.mkdir_p "tmp/special/directory"

# Wrong
contents = `cat #{filename}`
# Correct
contents = File.read(filename)
28 29 30 31 32 33

# Sometimes a shell command is just the best solution. The example below has no
# user input, and is hard to implement correctly in Ruby: delete all files and
# directories older than 120 minutes under /some/path, but not /some/path
# itself.
Gitlab::Popen.popen(%W(find /some/path -not -path /some/path -mmin +120 -delete))
Jacob Vosmaer committed
34 35 36 37
```

This coding style could have prevented CVE-2013-4490.

38 39 40 41 42 43 44 45 46 47
## Always use the configurable git binary path for git commands

```ruby
# Wrong
system(*%W(git branch -d -- #{branch_name}))

# Correct
system(*%W(#{Gitlab.config.git.bin_path} branch -d -- #{branch_name}))
```

Jacob Vosmaer committed
48 49
## Bypass the shell by splitting commands into separate tokens

50
When we pass shell commands as a single string to Ruby, Ruby will let `/bin/sh` evaluate the entire string. Essentially, we are asking the shell to evaluate a one-line script. This creates a risk for shell injection attacks. It is better to split the shell command into tokens ourselves. Sometimes we use the scripting capabilities of the shell to change the working directory or set environment variables. All of this can also be achieved securely straight from Ruby
Jacob Vosmaer committed
51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69

```ruby
# Wrong
system "cd /home/git/gitlab && bundle exec rake db:#{something} RAILS_ENV=production"
# Correct
system({'RAILS_ENV' => 'production'}, *%W(bundle exec rake db:#{something}), chdir: '/home/git/gitlab')

# Wrong
system "touch #{myfile}"
# Better
system "touch", myfile
# Best (do not run a shell command at all)
FileUtils.touch myfile
```

This coding style could have prevented CVE-2013-4546.

## Separate options from arguments with --

70
Make the difference between options and arguments clear to the argument parsers of system commands with `--`. This is supported by many but not all Unix commands.
Jacob Vosmaer committed
71 72 73 74 75 76 77 78 79 80 81

To understand what `--` does, consider the problem below.

```
# Example
$ echo hello > -l
$ cat -l
cat: illegal option -- l
usage: cat [-benstuv] [file ...]
```

82
In the example above, the argument parser of `cat` assumes that `-l` is an option. The solution in the example above is to make it clear to `cat` that `-l` is really an argument, not an option. Many Unix command line tools follow the convention of separating options from arguments with `--`.
Jacob Vosmaer committed
83 84 85 86 87 88 89 90 91 92 93

```
# Example (continued)
$ cat -- -l
hello
```

In the GitLab codebase, we avoid the option/argument ambiguity by _always_ using `--`.

```ruby
# Wrong
94
system(*%W(#{Gitlab.config.git.bin_path} branch -d #{branch_name}))
Jacob Vosmaer committed
95
# Correct
96
system(*%W(#{Gitlab.config.git.bin_path} branch -d -- #{branch_name}))
Jacob Vosmaer committed
97 98 99 100 101 102
```

This coding style could have prevented CVE-2013-4582.

## Do not use the backticks

103
Capturing the output of shell commands with backticks reads nicely, but you are forced to pass the command as one string to the shell. We explained above that this is unsafe. In the main GitLab codebase, the solution is to use `Gitlab::Popen.popen` instead.
Jacob Vosmaer committed
104 105 106

```ruby
# Wrong
107
logs = `cd #{repo_dir} && #{Gitlab.config.git.bin_path} log`
Jacob Vosmaer committed
108
# Correct
109
logs, exit_status = Gitlab::Popen.popen(%W(#{Gitlab.config.git.bin_path} log), repo_dir)
Jacob Vosmaer committed
110 111 112 113 114 115 116 117 118 119 120

# Wrong
user = `whoami`
# Correct
user, exit_status = Gitlab::Popen.popen(%W(whoami))
```

In other repositories, such as gitlab-shell you can also use `IO.popen`.

```ruby
# Safe IO.popen example
121
logs = IO.popen(%W(#{Gitlab.config.git.bin_path} log), chdir: repo_dir) { |p| p.read }
Jacob Vosmaer committed
122 123 124
```

Note that unlike `Gitlab::Popen.popen`, `IO.popen` does not capture standard error.
125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151

## Avoid user input at the start of path strings

Various methods for opening and reading files in Ruby can be used to read the
standard output of a process instead of a file.  The following two commands do
roughly the same:

```
`touch /tmp/pawned-by-backticks`
File.read('|touch /tmp/pawned-by-file-read')
```

The key is to open a 'file' whose name starts with a `|`.
Affected methods include Kernel#open, File::read, File::open, IO::open and IO::read.

You can protect against this behavior of 'open' and 'read' by ensuring that an
attacker cannot control the start of the filename string you are opening.  For
instance, the following is sufficient to protect against accidentally starting
a shell command with `|`:

```
# we assume repo_path is not controlled by the attacker (user)
path = File.join(repo_path, user_input)
# path cannot start with '|' now.
File.read(path)
```

152 153 154 155 156
If you have to use user input a relative path, prefix `./` to the path.

Prefixing user-supplied paths also offers extra protection against paths
starting with `-` (see the discussion about using `--` above).

157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189
## Guard against path traversal

Path traversal is a security where the program (GitLab) tries to restrict user
access to a certain directory on disk, but the user manages to open a file
outside that directory by taking advantage of the `../` path notation.

```
# Suppose the user gave us a path and they are trying to trick us
user_input = '../other-repo.git/other-file'

# We look up the repo path somewhere
repo_path = 'repositories/user-repo.git'

# The intention of the code below is to open a file under repo_path, but
# because the user used '..' she can 'break out' into
# 'repositories/other-repo.git'
full_path = File.join(repo_path, user_input)
File.open(full_path) do # Oops!
```

A good way to protect against this is to compare the full path with its
'absolute path' according to Ruby's `File.absolute_path`.

```
full_path = File.join(repo_path, user_input)
if full_path != File.absolute_path(full_path)
  raise "Invalid path: #{full_path.inspect}"
end

File.open(full_path) do # Etc.
```

A check like this could have avoided CVE-2013-4583.
190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219

## Properly anchor regular expressions to the start and end of strings

When using regular expressions to validate user input that is passed as an argument to a shell command, make sure to use the `\A` and `\z` anchors that designate the start and end of the string, rather than `^` and `$`, or no anchors at all. 

If you don't, an attacker could use this to execute commands with potentially harmful effect.

For example, when a project's `import_url` is validated like below, the user could trick GitLab into cloning from a Git repository on the local filesystem.

```ruby
validates :import_url, format: { with: URI.regexp(%w(ssh git http https)) }
# URI.regexp(%w(ssh git http https)) roughly evaluates to /(ssh|git|http|https):(something_that_looks_like_a_url)/ 
```

Suppose the user submits the following as their import URL:

```
file://git:/tmp/lol
```

Since there are no anchors in the used regular expression, the `git:/tmp/lol` in the value would match, and the validation would pass.

When importing, GitLab would execute the following command, passing the `import_url` as an argument:


```sh
git clone file://git:/tmp/lol
```

Git would simply ignore the `git:` part, interpret the path as `file:///tmp/lol` and import the repository into the new project, in turn potentially giving the attacker access to any repository in the system, whether private or not.