michael orlitzky

POSIX hardlink heartache

Filing CVE-2017-18925 has reawakened in me some unprocessed feelings about hardlinks that I developed while researching CVE-2017-18188. Below I summarize my latest therapy session.

feeling bad about exploits

On UNIX systems, symbolic links (symlinks) can be exploited by bad guys to take over the system. By default, the standard system utilities like chown and chmod all follow symlinks, as do most libc functions like open. If root tries to change the owner or permissions on a file in a user-writable directory, then that user can often trick root into modifying the wrong file by replacing the original target with a symlink. To prevent this, you must take care not to follow symlinks.

A parallel plight plagues hardlinks; although, as we will see, there is no such thing as “following a hardlink”—which makes it rather hard not to do.

feeling bad about program design

Many programs are designed to change ownership and permissions, as root, within a user-writable directory. What do you think chown -R does?

As a prominent member of several anti-systemd chat rooms, I am most deeply troubled by this pattern in OpenRC, whose checkpath helper is used to create or modify files and directories (as root) whenever a service starts or stops; and in opentmpfiles, a cross-platform processor of tmpfiles.d entries.

feeling bad about cross-platform mitigations

Linux has evolved some defenses in this regard, accessible through sysctl:

fs.protected_symlinks
  disables other users’ symlinks in a sticky world-writable directory (like /tmp) unless the symlink and target owners match
fs.protected_hardlinks
  disables the creation of hardlinks to things you can't write to

The (Linux-only) systemd project, for example, relies on these parameters for the safety of its tmpfiles implementation. ¡Pero cuidado! If you're not using systemd, the vanilla Linux kernel does not enable these protections by default.
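If you're curious what your own kernel is doing, the two parameters can be read back through /proc/sys. This is a minimal, Linux-only sketch; the function names are made up for illustration, and a missing file is treated the same as "not enabled."

```c
#include <assert.h>
#include <stdio.h>

/* Read an integer sysctl value through /proc/sys (Linux-specific).
 * Returns the value, or -1 if the file is missing or unreadable. */
static int read_fs_sysctl(const char *path) {
  assert(path);
  FILE *f = fopen(path, "r");
  if (!f) return -1;
  int value = -1;
  if (fscanf(f, "%d", &value) != 1) value = -1;
  fclose(f);
  return value;
}

/* Returns 1 only when both protections are known to be enabled,
 * and 0 when either is disabled or cannot be read. */
static int link_protections_enabled(void) {
  return read_fs_sysctl("/proc/sys/fs/protected_symlinks") > 0
      && read_fs_sysctl("/proc/sys/fs/protected_hardlinks") > 0;
}
```

A result of 0 on a system you care about means that nothing but the program itself stands between root and a malicious link.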

Enter POSIX

The Portable Operating System Interface (POSIX) says how “UNIX” should work. It's not comprehensive, but most popular UNIX implementations (Linux, macOS, BSD…) try to respect the things that are written down in POSIX. Programs that want to run on more than one of those operating systems can generally expect and rely on what POSIX says.

Conversely, cross-platform applications cannot rely upon any behavior that POSIX does not specify. Neither of those sysctl parameters is mentioned in the POSIX standard, so cross-platform applications cannot rely on them for security.

feeling pretty good about avoiding symlinks

Programs written in C can pass the O_NOFOLLOW flag to open (or the analogous AT_SYMLINK_NOFOLLOW flag to fstatat, fchownat, and friends) to avoid following symlinks entirely. Those functions are all booby-trapped, however: the flag only avoids symlinks in the terminal component of a path. If you open /tmp/foo/bar with O_NOFOLLOW and if /tmp/foo is a symlink, it will still be followed.
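The booby trap is easy to trip on purpose. The sketch below builds a throwaway layout under /tmp (the names real, dirlink, and filelink are invented for the demo) and then opens one path with a symlink in the middle and one with a symlink at the end, both with O_NOFOLLOW.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Open relpath (relative to dirfd) with O_NOFOLLOW.
 * Returns 0 on success, or the errno reported by openat(). */
static int try_nofollow(int dirfd, const char *relpath) {
  assert(relpath);
  int fd = openat(dirfd, relpath, O_RDONLY | O_NOFOLLOW);
  if (fd == -1) return errno;
  close(fd);
  return 0;
}

/* Builds real/file, dirlink -> real, and filelink -> real/file in a
 * scratch directory. Returns 1 if the mid-path symlink was followed
 * while the terminal symlink was refused, 0 if not, and -1 if the
 * setup failed. */
static int nofollow_demo(void) {
  char scratch[] = "/tmp/nofollow-demo-XXXXXX";
  int result = -1;

  if (!mkdtemp(scratch)) return -1;
  int dirfd = open(scratch, O_RDONLY | O_DIRECTORY);
  if (dirfd == -1) { rmdir(scratch); return -1; }

  int fd = -1;
  if (mkdirat(dirfd, "real", 0700) == 0)
    fd = openat(dirfd, "real/file", O_WRONLY | O_CREAT | O_EXCL, 0600);
  if (fd != -1) {
    close(fd);
    if (symlinkat("real", dirfd, "dirlink") == 0 &&
        symlinkat("real/file", dirfd, "filelink") == 0) {
      int mid_path = try_nofollow(dirfd, "dirlink/file"); /* symlink in the middle */
      int terminal = try_nofollow(dirfd, "filelink");     /* symlink at the end */
      result = (mid_path == 0) && (terminal != 0);
    }
  }

  /* Clean up; failures here are ignored. */
  unlinkat(dirfd, "filelink", 0);
  unlinkat(dirfd, "dirlink", 0);
  unlinkat(dirfd, "real/file", 0);
  unlinkat(dirfd, "real", AT_REMOVEDIR);
  close(dirfd);
  rmdir(scratch);
  return result;
}
```

POSIX specifies ELOOP for the refused open, but the sketch only checks for "some error" since not every system agrees on the number.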

To work around that, you first have to give up on paths and resign yourself to using file descriptors everywhere instead. That means, for example, replacing open, stat, and chmod, with openat, fstatat, and fchmodat, respectively. Once you've obtained a file descriptor from a path, it always references the same “physical” file (more on that in a second), even if what's at that path changes. Suppose you've obtained a file descriptor for /tmp/foo.jpg. You might use fstat to check its permissions, and fchmod to adjust them if necessary. If someone replaces /tmp/foo.jpg with a symlink to /etc/passwd in the middle of that, it won't hurt: the descriptor you already have is not automatically transmogrified into a descriptor for /etc/passwd.
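Here is a small sketch of that foo.jpg scenario, with a harmless scratch file named victim standing in for /etc/passwd. Both the function names and the scratch layout are assumptions made for the demo.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Create an empty regular file with an exact mode, regardless of umask. */
static int make_file(int dirfd, const char *name, mode_t mode) {
  assert(name);
  int fd = openat(dirfd, name, O_WRONLY | O_CREAT | O_EXCL, 0600);
  if (fd == -1) return -1;
  int rc = fchmod(fd, mode);
  close(fd);
  return rc;
}

/* Returns 1 if a descriptor opened before the path was swapped still
 * refers to the original file and the symlink target is untouched,
 * 0 if not, and -1 if the scratch setup failed. */
static int descriptor_survives_swap(void) {
  char scratch[] = "/tmp/fd-demo-XXXXXX";
  struct stat before, after, victim;
  int result = -1;

  if (!mkdtemp(scratch)) return -1;
  int dirfd = open(scratch, O_RDONLY | O_DIRECTORY);
  if (dirfd == -1) { rmdir(scratch); return -1; }

  if (make_file(dirfd, "victim", 0644) == 0 &&
      make_file(dirfd, "foo.jpg", 0644) == 0) {
    /* Good guy: open the path once, refusing a terminal symlink. */
    int fd = openat(dirfd, "foo.jpg", O_RDONLY | O_NOFOLLOW);
    if (fd != -1 && fstat(fd, &before) == 0) {
      /* Bad guy: replace the path with a symlink to the victim. */
      unlinkat(dirfd, "foo.jpg", 0);
      symlinkat("victim", dirfd, "foo.jpg");

      /* Good guy: operate on the descriptor, never on the path. */
      if (fchmod(fd, 0600) == 0 && fstat(fd, &after) == 0 &&
          fstatat(dirfd, "victim", &victim, 0) == 0) {
        result = before.st_ino == after.st_ino
              && (after.st_mode & 0777) == 0600
              && (victim.st_mode & 0777) == 0644;
      }
    }
    if (fd != -1) close(fd);
  }

  unlinkat(dirfd, "foo.jpg", 0);
  unlinkat(dirfd, "victim", 0);
  close(dirfd);
  rmdir(scratch);
  return result;
}
```

The chmod lands on the file that was opened, and the victim keeps its 0644.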

Second, each path must be opened from the root up, recursively, with O_NOFOLLOW. In each iteration, a directory descriptor that is assumed to have been obtained safely is passed to openat(…, O_NOFOLLOW) to obtain the directory descriptor for the next iteration. The next iteration is then safe as well. Opening the root (/) had better be safe; as a result, this whole process is “safe by induction.” The safe_open(_ex) functions in apply-default-acl demonstrate this.
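The induction can be sketched in a few dozen lines. To be clear, this is not the apply-default-acl code: safe_open_sketch is a hypothetical name, it only accepts absolute paths, it does not support O_CREAT, and it rejects ".." outright to keep the walk strictly downward.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <limits.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical sketch: open an absolute path one component at a time,
 * refusing symlinks at every step. flags must not include O_CREAT.
 * Returns a descriptor, or -1 with errno set. */
static int safe_open_sketch(const char *path, int flags) {
  char buf[PATH_MAX];
  assert(path);
  if (path[0] != '/' || strlen(path) >= sizeof(buf)) {
    errno = EINVAL;
    return -1;
  }
  strcpy(buf, path);

  /* Base case: the root is assumed to be safe. */
  int dirfd = open("/", O_RDONLY | O_DIRECTORY);
  if (dirfd == -1) return -1;

  char *save = NULL;
  char *comp = strtok_r(buf, "/", &save);
  while (comp) {
    char *next = strtok_r(NULL, "/", &save);
    if (strcmp(comp, "..") == 0) {
      close(dirfd);
      errno = EINVAL;
      return -1;
    }
    /* Inductive step: intermediate components must be real
     * directories; only the last one gets the caller's flags. */
    int f = next ? (O_RDONLY | O_DIRECTORY) : flags;
    int fd = openat(dirfd, comp, f | O_NOFOLLOW);
    int saved = errno;
    close(dirfd);
    if (fd == -1) {
      errno = saved;
      return -1;
    }
    dirfd = fd;
    comp = next;
  }
  return dirfd; /* for "/" itself, this is the root descriptor */
}
```

Because every openat starts from a descriptor that was itself opened without following a symlink, swapping out a parent directory mid-walk gets the attacker an error instead of a redirect.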

feeling bad about avoiding hardlinks

To understand why hardlinks cannot be avoided, one must first understand hardlinks, and hardlinks are hard to understand (cf. hard lemonade). To prove that they are hard to understand, I'm going to explain them poorly. Please read Advanced Programming in the UNIX Environment (third edition) by Stevens and Rago instead.

The term “hardlink” is a misnomer. A hardlink is just a name for a file. When you think of a file at the lowest level, you imagine a special little chunk, somewhere on a physical disk, that contains your data. In your filesystem, every one of those special little chunks is pointed to by something called an inode, and each of those inodes is pointed to by one or more named directory entries, each of which (by virtue of its existence) associates a name with your special little chunk. Those names (even if there's only one) are what we call “hardlinks.” Nom would be a better nomer.

If you understood the preceding paragraph, you will realize that being a hardlink to something is a nonsensical concept: every name for a file is a hardlink. It's meaningless to ask if a function or program will “follow hardlinks,” because if it uses filenames, it does.
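If you didn't understand the preceding paragraph, a stat call may help. The sketch below gives one inode two names in a scratch directory and then compares what the system reports for each; the helper names are invented for the demo.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Two stat results describe the same file when device and inode match. */
static int same_file(const struct stat *a, const struct stat *b) {
  assert(a && b);
  return a->st_dev == b->st_dev && a->st_ino == b->st_ino;
}

/* Returns 1 if "original" and "hardlink" turn out to be two equal
 * names for one inode, 0 if not, and -1 if the scratch setup failed. */
static int names_are_equal_peers(void) {
  char scratch[] = "/tmp/names-demo-XXXXXX";
  struct stat a, b;
  int result = -1;

  if (!mkdtemp(scratch)) return -1;
  int dirfd = open(scratch, O_RDONLY | O_DIRECTORY);
  if (dirfd == -1) { rmdir(scratch); return -1; }

  int fd = openat(dirfd, "original", O_WRONLY | O_CREAT | O_EXCL, 0600);
  if (fd != -1) {
    close(fd);
    /* Give the same inode a second name. */
    if (linkat(dirfd, "original", dirfd, "hardlink", 0) == 0 &&
        fstatat(dirfd, "original", &a, AT_SYMLINK_NOFOLLOW) == 0 &&
        fstatat(dirfd, "hardlink", &b, AT_SYMLINK_NOFOLLOW) == 0) {
      /* Nothing in either result marks one name as "the link". */
      result = same_file(&a, &b) && a.st_nlink == 2 && b.st_nlink == 2;
    }
  }

  unlinkat(dirfd, "hardlink", 0);
  unlinkat(dirfd, "original", 0);
  close(dirfd);
  rmdir(scratch);
  return result;
}
```

Both names report the same device, the same inode, and a link count of two. There is no field that says which one came first.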

So, then… how might we detect the names that could lead to a security exploit? The farce oft perpetrated is to count the number of names that a special little chunk has, and to treat anything with more than one name as dangerous. The number of names referencing an inode is stored in its st_nlink field, so we start to worry if st_nlink > 1. Here's an example from systemd-tmpfiles,

static bool hardlink_vulnerable(const struct stat *st) {
  assert(st);

  return !S_ISDIR(st->st_mode) && st->st_nlink > 1 && dangerous_hardlinks();
}

and in the interest of fairness, one from OpenRC's checkpath:

  if ((type != inode_dir) && (st.st_nlink > 1)) {
    eerror("%s: chmod: %s %s", applet, "Too many hard links to", path);
    close(readfd);
    return -1;
  }

Within a program, there are more layers of obfuscation (sometimes called abstraction) involved. Your special little chunk is still pointed to by an inode. But now, every inode is pointed to by a vnode, and every vnode is pointed to by the file table. Each entry in the file table is pointed to by a file descriptor, and within a process, each file descriptor is identified by a number that is generally valid only within that process. If you've done everything else correctly, file descriptors are what you will be using to change ownership or permissions.

Notice how the names associated with your special little chunk are completely absent from the preceding paragraph. This design has some important consequences:

The last item above foreshadows the problem with avoiding “dangerous” hardlinks. Most filesystem operations involve at least two steps:

  1. Checking whether or not it's safe to do a thing
  2. Doing the thing

Between those two steps, you don't want the target of the operation to change, otherwise it may no longer be safe to proceed. This is widespread enough to have its own acronym: TOCTOU. The solution to that problem with symlinks was to use file descriptors everywhere instead of paths. Using file descriptors you can safely answer questions like “is this a directory?,” but you cannot safely decide whether or not a hardlink is “dangerous.” The reason why is contained in that bullet point, quoted for emphasis:

Again, with respect to the st_nlink > 1 test, a special little chunk can become dangerous without changing its contents, inode, vnode, file table entry, or file descriptors.

To the point: a regular user can change a “safe” hardlink into a “dangerous” one after you've checked it but before you operate on it—even if you use file descriptors. Here's a proof-of-concept. First, create an empty file and one hardlink to it (yes, that's the wrong way to think about what's happening):

mjo $ touch original
mjo $ ln original hardlink
mjo $ ls -l
total 4
-rw-r--r-- 2 mjo mjo 0 Dec 5 10:09 hardlink
-rw-r--r-- 2 mjo mjo 0 Dec 5 10:09 original

Now compile and run the following program, which demonstrates how a bad guy can trick a good guy into changing the permissions on the original file through a descriptor for the hardlink. Inline comments explain the process.

#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <stdio.h>

int main(int argc, char** argv) {
  int fd_hardlink;
  struct stat st_hardlink;

  /* Good guy: obtain a file descriptor for the hardlink. His goal
   * would be to ignore the hardlink once he determines that it is
   * dangerous. */
  fd_hardlink = open("hardlink", O_RDONLY);
  printf("Good guy obtains a file descriptor for hardlink...\n");

  /* Bad guy: remove the hardlink before the good guy can perform
   * the "is this a dangerous hardlink?" check. */
  unlink("hardlink");
  printf("Bad guy deletes hardlink...\n");

  /* Good guy: stat the hardlink descriptor to see if the underlying
   * special little chunk has more than one name. It will not, since
   * the bad guy just deleted the second name. */
  fstat(fd_hardlink, &st_hardlink);
  printf("Good guy determines that hardlink has %lu name(s)...\n",
         (unsigned long) st_hardlink.st_nlink);

  /* Good guy: change the permissions on the hardlink descriptor. This
   * succeeds even though the descriptor was obtained from a path that
   * no longer exists; operations on the descriptor still affect the
   * underlying chunk. */
  fchmod(fd_hardlink, 0);
  printf("Good guy zeroes permissions on hardlink "
         "and affects the original.\n");
  return 0;
}

The output shows what's happening…

mjo $ gcc hardlinks-r-stupid.c && ./a.out
Good guy obtains a file descriptor for hardlink...
Bad guy deletes hardlink...
Good guy determines that hardlink has 1 name(s)...
Good guy zeroes permissions on hardlink and affects the original.

and we can check that the original file was affected:

mjo $ ls -l original
---------- 1 mjo mjo 0 Dec 5 10:30 original

So in short: the st_nlink > 1 test isn't safe. And there is no other test, so that's that.

diagnosis

POSIX guarantees that hardlinks can exist. It follows that, on POSIX systems without any non-standard protections, it's unsafe for anyone (but in particular, root) to do anything sensitive in a directory that is writable by another user. Cross-platform programs designed to do so are simply flawed.

What's the alternative? If you're doing something as root in a user-writable directory, drop privileges to that user first. There's no cross-platform way to do that in a shell script, but the setuid and setgid functions are POSIX. Switching users/groups can also eliminate many uses of chown, just as setting the umask appropriately often obviates the need for chmod.
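A minimal sketch of what that might look like, using only setgid and setuid. The names drop_privileges, run_as, and ids_agree are hypothetical, and the sketch deliberately ignores supplementary groups, since clearing them requires a call (setgroups) that POSIX does not provide.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Permanently become uid/gid. The group goes first, because once the
 * user ID has changed we may no longer be allowed to change groups.
 * Supplementary groups are NOT touched. Returns 0 on success. */
static int drop_privileges(uid_t uid, gid_t gid) {
  if (setgid(gid) == -1) return -1;
  if (setuid(uid) == -1) return -1;
  /* Trust, but verify. */
  if (getgid() != gid || getegid() != gid) return -1;
  if (getuid() != uid || geteuid() != uid) return -1;
  /* A non-root target must not be able to become root again. */
  if (uid != 0 && setuid(0) == 0) return -1;
  return 0;
}

/* Run work() in a child process that has first become uid/gid, so
 * the parent keeps its own privileges. Returns the child's exit
 * status (0 on success, 127 if the drop failed), or -1 on error. */
static int run_as(uid_t uid, gid_t gid, int (*work)(void)) {
  assert(work);
  pid_t pid = fork();
  if (pid == -1) return -1;
  if (pid == 0) {
    if (drop_privileges(uid, gid) != 0) _exit(127);
    _exit(work() == 0 ? 0 : 1);
  }
  int status;
  if (waitpid(pid, &status, 0) == -1) return -1;
  return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}

/* Stand-in for the real work: succeeds if real and effective IDs agree. */
static int ids_agree(void) {
  return (getuid() == geteuid() && getgid() == getegid()) ? 0 : 1;
}
```

Whatever the child does in the user's directory, it does with the user's own permissions, so a planted link can only hurt the person who planted it.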

appendix

POSIX actually allows hardlinks to directories as well:

If the -s option is not specified, whether a directory can be linked is implementation-defined.

“Fortunately,” directory hardlinks don't work on Linux, so on Linux you have nothing to worry about. But if you find yourself on a system that does support them, you can be pretty sure that no one has ever thought about handling them securely. Yikes.