posted 2019-03-16
(Wherein we find a proposal for ACLs in POSIX that are better than POSIX ACLs.)
Access Control Lists (ACLs) are a flexible way to grant permissions on files within an operating system. There are three main types you have probably encountered:
On Windows, if you right-click on a file, choose Properties, and then the Security tab, the stuff you see there is the access control list. Some people can do some stuff, other people can't do other stuff. These are presumably described somewhere in the Windows Access Control documentation.
On macOS, the ACLs borrow from another filesystem, version 4 of the Network File System (NFSv4). As a result, they are called “NFSv4 ACLs.” This is not the hard part of the article. They are specified in RFC 7530.
These are usually called just “POSIX ACLs,” because they are not a part of the POSIX standard. They appeared in a draft called POSIX.1e, but that draft was never ratified. Nevertheless, they're what's implemented for Linux. So if you've used filesystem ACLs on Linux, you used these guys.
The first two are essentially equivalent. The design of NFSv4 ACLs was based on Windows ACLs, and both of them are better than the POSIX.1e ACLs that we currently have on Linux. Moreover, the Windows and NFSv4 ACLs are interoperable: permissions can be accurately mapped back and forth between Windows and macOS systems. Not so on Linux, because POSIX.1e ACLs aren't expressive enough. To remedy that, the RichACLs project aims to bring NFSv4 ACLs to Linux.
Nobody uses POSIX ACLs on Linux, for two reasons:
And both of these problems are the result of one dumb design decision, to abuse the group permission bits to store something other than group permissions. The specification for POSIX ACLs starts out great. If you want to grant some user permission to a file, then you add an ACL to that file that says what he can do. For backwards-compatibility with the standard UNIX permission bits, the owner and other permission bits get interpreted as special ACLs that do exactly what the permission bits did:
The permissions specified by the file owner class permission bits correspond to the permissions associated with the ACL_USER_OBJ entry… The permissions specified by the file other class permission bits correspond to the permissions associated with the ACL_OTHER entry.
So far so good. If I sat down right now to write an ACL specification, that might be what I would come up with. And then,
The permissions specified by the file group class permission bits correspond to the permissions associated with the ACL_GROUP_OBJ entry or the permissions associated with the ACL_MASK entry if the ACL contains an ACL_MASK entry.
Derp. That says that the group permission bits might not be group permission bits if an invisible ACL entry is present. In practice, an ACL_MASK entry is always present if the file has an ACL, so the group permission bits always represent a “permissions mask” rather than group permissions on files with ACLs. But, not all files have ACLs. Thus the meaning of the group permission bits changes when the file acquires an ACL.
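To make the double meaning concrete, here is a minimal sketch of the mapping described in the two quotes above. The `Acl` class and `mode_bits` function are hypothetical names for illustration, not part of any real ACL library, and permissions are plain rwx bitmasks (0 to 7).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Acl:
    user_obj: int                # ACL_USER_OBJ: the owner's permissions
    group_obj: int               # ACL_GROUP_OBJ: the owning group's permissions
    other: int                   # ACL_OTHER: everybody else
    mask: Optional[int] = None   # ACL_MASK: only present on files with an ACL


def mode_bits(acl: Acl) -> int:
    """Return the nine permission bits that stat() would report (toy model)."""
    # The middle three bits show the mask if there is one,
    # and the owning group's permissions otherwise.
    group_class = acl.group_obj if acl.mask is None else acl.mask
    return (acl.user_obj << 6) | (group_class << 3) | acl.other


plain = Acl(user_obj=0o6, group_obj=0o4, other=0o4)
with_acl = Acl(user_obj=0o6, group_obj=0o4, other=0o4, mask=0o7)

print(oct(mode_bits(plain)))     # middle digit: the group's permissions
print(oct(mode_bits(with_acl)))  # middle digit: the mask, not the group
```

The owning group has the same permissions in both cases, yet the "group" digit of the mode differs.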
This mistake is so mistaken that it has a name. In database land, it's called a “polymorphic association,” and is the focus of Chapter 7 of the book SQL Antipatterns by Bill Karwin. If you're not familiar with the term, “antipattern” means that it's the opposite of something you should do. I can also cite my (Second Edition) copy of Code Complete by Steve McConnell, which says, in the section titled Using Each Variable for Exactly One Purpose, using big kindergarten letters,
Use each variable for one purpose only… avoid variables with hidden meanings… even if the double use is clear to you, it won't be to someone else.
The Common Weakness Enumeration project calls this innovation a Multiple Interpretation Error or an Interpretation Conflict.
A thorough understanding of filesystem permissions is essential to your security, because on “everything is a file” UNIX, they are your security. However, under threat of the group-bits mask, your existing knowledge of UNIX permissions is no longer valid. This is terrifying for novices, who don't want to learn a new set of complicated rules. This is terrifying for experts, who know that a system has to be easily understood to be secure. And the complexity comes not only from the group bits: almost half of the access check algorithm in the acl(5) man page is special cases for the goddamned mask!
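For a sense of how much of the access check is mask handling, here is a rough paraphrase of the acl(5) algorithm in Python. It is a simplified model and not the kernel's code: the `Acl` class, `contains`, and `access_ok` are hypothetical names, permissions are rwx bitmasks, and treating a missing mask on a named-user entry as "no limit" is my own assumption (a valid ACL with named entries always has a mask).

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set

R, W, X = 4, 2, 1


@dataclass
class Acl:
    owner_uid: int
    owner_gid: int
    user_obj: int
    group_obj: int
    other: int
    users: Dict[int, int] = field(default_factory=dict)   # ACL_USER: uid -> perms
    groups: Dict[int, int] = field(default_factory=dict)  # ACL_GROUP: gid -> perms
    mask: Optional[int] = None                            # ACL_MASK


def contains(perms: int, wanted: int) -> bool:
    return (perms & wanted) == wanted


def access_ok(acl: Acl, uid: int, gids: Set[int], wanted: int) -> bool:
    # 1. The owner: decided by ACL_USER_OBJ alone, the mask is not consulted.
    if uid == acl.owner_uid:
        return contains(acl.user_obj, wanted)
    # 2. A named user: the entry AND the mask must both allow it.
    if uid in acl.users:
        mask_ok = acl.mask is None or contains(acl.mask, wanted)  # assumption
        return contains(acl.users[uid], wanted) and mask_ok
    # 3. The owning group or a named group.
    matching = [acl.group_obj] if acl.owner_gid in gids else []
    matching += [perms for gid, perms in acl.groups.items() if gid in gids]
    if matching:
        if acl.mask is not None:
            return contains(acl.mask, wanted) and any(
                contains(perms, wanted) for perms in matching
            )
        return contains(acl.group_obj, wanted)
    # 4. Everybody else.
    return contains(acl.other, wanted)


acl = Acl(owner_uid=1000, owner_gid=1000, user_obj=R | W, group_obj=R | X,
          other=R, users={33: R | W | X}, mask=R)
print(access_ok(acl, 33, {33}, R))      # named user, allowed by entry and mask
print(access_ok(acl, 33, {33}, W))      # entry says rwx, mask says r--
print(access_ok(acl, 2000, {1000}, X))  # group says r-x, mask says r--
```

Steps 2 and 3 each need their own mask branch, while steps 1 and 4 ignore the mask entirely.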
Since POSIX ACLs redefine the meaning of the group permission bits, any tool that treats group permission bits like group permission bits is going to dick up your ACLs. For example, the cp program breaks default ACLs, because it tries to ensure that the target's group permission bits (which are no longer group permission bits, after the default ACL is applied) match the source. The end result is that all of your default ACLs on the target directory get reduced to whatever the group bits allowed on the source file:
user $ mkdir acl
user $ cd acl
user $ setfacl --default -m user:apache:rwx .
user $ cp /etc/profile ./
user $ getfacl --omit-header ./profile
user:apache:rwx #effective:r--
group::r-x #effective:r--
mask::r--
other::r--
This is usually merely annoying, but can also lead to security vulnerabilities. For example, a program might remove a mask thinking that it's only loosening the group permissions, when in reality all permissions are loosened. The CWE's description of an Interpretation Conflict sums this up nicely:
Product A handles inputs or steps differently than Product B, which causes A to perform incorrect actions based on its perception of B's state.
Yup.
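The cp session above can be imitated with a toy model. It assumes, as described earlier, that the copy ends with the target's group-class bits set from the source file's mode, and that on a file with an ACL those bits land in the mask. The dictionary keys, `set_mode_bits`, and the inherited values for the owner, mask, and other entries are made up for illustration; only `user:apache:rwx` and `group::r-x` come from the session.

```python
R, W, X = 4, 2, 1


def fmt(perms: int) -> str:
    """Render an rwx bitmask the way getfacl does, e.g. 4 -> 'r--'."""
    return "".join(ch if perms & bit else "-" for ch, bit in (("r", R), ("w", W), ("x", X)))


def set_mode_bits(acl: dict, mode: int) -> dict:
    """Model a mode change on a file with an ACL: the group digit goes to the mask."""
    updated = dict(acl)
    updated["user_obj"] = (mode >> 6) & 7
    updated["mask"] = (mode >> 3) & 7   # NOT group_obj
    updated["other"] = mode & 7
    return updated


def effective(acl: dict, entry: str) -> int:
    """Named entries and the owning group are limited by the mask."""
    return acl[entry] & acl["mask"]


# What the new file would inherit from the directory's default ACL (assumed values).
inherited = {"user_obj": R | W, "user:apache": R | W | X,
             "group_obj": R | X, "mask": R | W | X, "other": R}

# The source file is mode 0644, so the "group" digit r-- becomes the mask.
copied = set_mode_bits(inherited, 0o644)

for entry in ("user:apache", "group_obj"):
    print(entry, fmt(copied[entry]), "#effective:" + fmt(effective(copied, entry)))
```

The apache entry still says rwx, but everything it grants beyond r-- is masked away, which matches the getfacl output in the session.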
Other fundamental utilities like mkdir and tar exhibit the same problem, and they can't be fixed. Each program would need to understand how to undo the ACL mask in a way that doesn't compromise security. This can be done (the apply-default-acl program implements it), but it's far too much security-sensitive code to copy & paste into every program that calls chmod(2). No one's going to do it so long as only a tiny fraction of users use POSIX ACLs. And only a tiny fraction of users will ever use POSIX ACLs, because POSIX ACLs are useless if I can't use cp to copy files into a directory with ACLs.
RichACLs are a brand-new implementation of the superior NFSv4-style ACLs on Linux, that nobody is going to use for two reasons:
RichACLs incorporate all of the nice features of Windows and NFSv4 ACLs, but they also borrow you-know-what from POSIX ACLs. And they've gone full retard: with RichACLs, all of the traditional permission bits can act as masks, and everything is controlled by metadata. Watch the richacl(7) man page try to explain this shit:
RichACLs consist of a number of ACL entries, three file masks, and a set of flags specifying attributes of the ACL as a whole (by contrast with the per-ACL-entry flags described below)…
The owner, group, and other file masks further control which permissions the ACL grants, subject to the masked (m) and write_through (w) ACL flags: when the permissions of a file or directory are changed with chmod(2), the file masks are set based on the new file mode, and the masked and write_through ACL flags are set. Likewise, when a new file or directory inherits an ACL from its parent directory, the file masks are set to the intersection between the permissions granted by the inherited ACL and the mode parameter as given to open(2), mkdir(2), and similar, and the masked ACL flag is set. In both cases, the file masks limit the permissions that the ACL will grant…
masked (m)
When set, the file masks define upper limits on the permissions the ACL may grant. When not set, the file masks are ignored.
write_through (w)
When this flag and the masked flag are both set, the owner and other file masks define the actual permissions granted to the file owner and to others instead of defining an upper limit. When the masked flag is not set, the write_through flag has no effect.
If you have no idea what you just read: good, you are perhaps a sane and rational individual. I don't actually know what the fuck is going on, but I'm pretty sure it's more complicated than it used to be with the POSIX ACLs that were already too complicated. At the moment, the richacl git repository contains a separate 18KiB richaclex(7) man page that “…shows how they interact with the POSIX file permission bits.” Okay.
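For what it's worth, here is my most literal reading of the two flag descriptions quoted above, as a sketch. This is not the richacl permission-check algorithm, which is considerably longer. The function name and the `who` argument are hypothetical, and real RichACL permissions are much finer-grained than the rwx bitmasks used here.

```python
R, W, X = 4, 2, 1


def richacl_effective(acl_grant: int, file_mask: int, who: str,
                      masked: bool, write_through: bool) -> int:
    """Combine what the ACL grants with the file mask for one class.

    who is "owner", "group", or "other", and file_mask is the mask for
    that class. This follows only the quoted man page text.
    """
    if not masked:
        # "When not set, the file masks are ignored."
        return acl_grant
    if write_through and who in ("owner", "other"):
        # The mask is the permission, not merely an upper limit.
        return file_mask
    # The mask is an upper limit on what the ACL may grant.
    return acl_grant & file_mask


# The same ACL grant (r--) and the same mask (rw-), four different answers.
print(richacl_effective(R, R | W, "owner", masked=False, write_through=False))
print(richacl_effective(R, R | W, "owner", masked=True, write_through=False))
print(richacl_effective(R, R | W, "owner", masked=True, write_through=True))
print(richacl_effective(R, R | W, "group", masked=True, write_through=True))
```

Even in this reduced form, what a user may do depends on two flags, the caller's class, and a mask, on top of the ACL entries themselves.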
The complexity would be fine if it needed to be implemented only a few times, by technical people. But with ACLs, either
If a lawyer and a paralegal want to share some documents, do you think they're going to be able to read and understand those man pages? Because the only thing I did take away from the word salad in the man page is that calling chmod will still dick up your default ACLs: “when the permissions of a file or directory are changed with chmod(2), the file masks are set based on the new file mode, and the masked and write_through ACL flags are set.” So RichACLs won't work either, and no one will use them.
Casey Schaufler, who was the technical editor on the POSIX.1e draft, gave a talk at the 2018 linux.conf.au conference titled The Twisting, Turning, Narrow Road That Is Security. In it, he describes the rationale behind the group-bits mask. It's worth reproducing in full.
The initial proposal was that, if you had an access control list, you used the access control list. Period. End of sentence. If you had the mode bits but no access control list, you used the mode bits. Everybody would have been happy there; access control lists would have been very simple. But, we were in an era of compatibility, and so… we didn't do that…
Backward-compatibility is a real nuisance on occasion. One of the members of the team said, here's what we have to be able to do:
- Do a stat() to get the mode bits [of some file]
- Set the mode to zero [on that file], so that nobody can access it
- Set the mode bits back [to what they were originally]
If you have an access control list, that behavior still needs to be supported. So chmod 0 has to turn off access, and then chmod back to what it was before has to give you the exact same access you had before, even if you have an access control list. Because that's the way people write programs. It would really be nice if, on occasion, if we could change something… So we ended up with this interesting thing called a mask… in support of this one little use scenario here.
Wow, maybe that guy is dead?
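The "one little use scenario" from the talk looks roughly like this in practice. It is a sketch using only Python's standard library on a throwaway temporary file; `lock_and_restore` is a hypothetical helper name, and it assumes a POSIX system where chmod behaves normally.

```python
import os
import stat
import tempfile


def lock_and_restore(path: str):
    """Run the three steps from the talk and report the modes seen along the way."""
    saved = stat.S_IMODE(os.stat(path).st_mode)       # 1. stat() to get the mode bits
    os.chmod(path, 0)                                 # 2. mode zero: nobody gets in
    try:
        locked = stat.S_IMODE(os.stat(path).st_mode)
        # ... the program would do its private work here ...
    finally:
        os.chmod(path, saved)                         # 3. put the mode bits back
    return saved, locked


with tempfile.NamedTemporaryFile() as tmp:
    os.chmod(tmp.name, 0o640)
    saved, locked = lock_and_restore(tmp.name)
    restored = stat.S_IMODE(os.stat(tmp.name).st_mode)
    print(oct(saved), oct(locked), oct(restored))
```

The program knows nothing about ACLs. For step 2 to really lock everybody out and step 3 to really restore their access, the mode bits have to control the ACL entries somehow, and the mask is what the committee came up with.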
Keep in mind that RichACLs aren't a proposed amendment to the POSIX standard. RichACLs will be used on systems where most of the other software expects POSIX semantics, and that leads to a conflict between ACL permissions and the traditional permission bits. So insofar as possible, RichACLs do have to respect the permission bits, because that's what POSIX currently says. Specifically, POSIX.1-2017 insists that any additional access control mechanism (such as RichACLs) must treat the permission bits as an upper bound:
An additional access control mechanism shall only further restrict the access permissions defined by the file permission bits.
To paraphrase, we are kind of fucked so long as RichACLs are a vague “additional access control mechanism.” If, for example, cp calls chmod(2) to clear the “other” bits, then we have to honor that, regardless of what any ACLs say.
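The upper-bound rule amounts to a bitwise AND between what the mode bits allow and what the additional mechanism would grant. A minimal sketch, with hypothetical function names and rwx bitmasks standing in for real permissions:

```python
R, W, X = 4, 2, 1


def class_bits(mode: int, file_class: str) -> int:
    """Extract the rwx bits for the owner, group, or other class from a mode."""
    shift = {"owner": 6, "group": 3, "other": 0}[file_class]
    return (mode >> shift) & 7


def effective(mode: int, file_class: str, acl_grant: int) -> int:
    """An additional mechanism may only further restrict the permission bits."""
    return class_bits(mode, file_class) & acl_grant


# The ACL wants to give "other" read access...
print(effective(0o644, "other", R))   # mode allows r--, so the grant survives
# ...but once something clears the "other" bits, the ACL's opinion is moot.
print(effective(0o640, "other", R))
```

Under this rule an ACL can take permissions away, but it can never add one that the mode bits don't already allow.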
There are three issues that need to be addressed:
So here's what we do:
A simple, predictable, standard solution that actually works. The mode bits and masks are not mentioned anywhere in the NFSv4 ACL specification, so we still have a faithful implementation of that standard. Oh, and this is exactly how macOS implements them. Copy that shit and be done with it.
The RichACLs implementation is currently an unofficial patch to the Linux kernel, so there is still time to get it right.