From a6f76f23d297f70e2a6b3ec607f7aeeea9e37e8d Mon Sep 17 00:00:00 2001 From: David Howells Date: Fri, 14 Nov 2008 10:39:24 +1100 Subject: CRED: Make execve() take advantage of copy-on-write credentials Make execve() take advantage of copy-on-write credentials, allowing it to set up the credentials in advance, and then commit the whole lot after the point of no return. This patch and the preceding patches have been tested with the LTP SELinux testsuite. This patch makes several logical sets of alteration: (1) execve(). The credential bits from struct linux_binprm are, for the most part, replaced with a single credentials pointer (bprm->cred). This means that all the creds can be calculated in advance and then applied at the point of no return with no possibility of failure. I would like to replace bprm->cap_effective with: cap_isclear(bprm->cap_effective) but this seems impossible due to special behaviour for processes of pid 1 (they always retain their parent's capability masks where normally they'd be changed - see cap_bprm_set_creds()). The following sequence of events now happens: (a) At the start of do_execve, the current task's cred_exec_mutex is locked to prevent PTRACE_ATTACH from obsoleting the calculation of creds that we make. (a) prepare_exec_creds() is then called to make a copy of the current task's credentials and prepare it. This copy is then assigned to bprm->cred. This renders security_bprm_alloc() and security_bprm_free() unnecessary, and so they've been removed. (b) The determination of unsafe execution is now performed immediately after (a) rather than later on in the code. The result is stored in bprm->unsafe for future reference. (c) prepare_binprm() is called, possibly multiple times. (i) This applies the result of set[ug]id binaries to the new creds attached to bprm->cred. Personality bit clearance is recorded, but now deferred on the basis that the exec procedure may yet fail. (ii) This then calls the new security_bprm_set_creds(). This should calculate the new LSM and capability credentials into *bprm->cred. This folds together security_bprm_set() and parts of security_bprm_apply_creds() (these two have been removed). Anything that might fail must be done at this point. (iii) bprm->cred_prepared is set to 1. bprm->cred_prepared is 0 on the first pass of the security calculations, and 1 on all subsequent passes. This allows SELinux in (ii) to base its calculations only on the initial script and not on the interpreter. (d) flush_old_exec() is called to commit the task to execution. This performs the following steps with regard to credentials: (i) Clear pdeath_signal and set dumpable on certain circumstances that may not be covered by commit_creds(). (ii) Clear any bits in current->personality that were deferred from (c.i). (e) install_exec_creds() [compute_creds() as was] is called to install the new credentials. This performs the following steps with regard to credentials: (i) Calls security_bprm_committing_creds() to apply any security requirements, such as flushing unauthorised files in SELinux, that must be done before the credentials are changed. This is made up of bits of security_bprm_apply_creds() and security_bprm_post_apply_creds(), both of which have been removed. This function is not allowed to fail; anything that might fail must have been done in (c.ii). (ii) Calls commit_creds() to apply the new credentials in a single assignment (more or less). Possibly pdeath_signal and dumpable should be part of struct creds. (iii) Unlocks the task's cred_replace_mutex, thus allowing PTRACE_ATTACH to take place. (iv) Clears The bprm->cred pointer as the credentials it was holding are now immutable. (v) Calls security_bprm_committed_creds() to apply any security alterations that must be done after the creds have been changed. SELinux uses this to flush signals and signal handlers. (f) If an error occurs before (d.i), bprm_free() will call abort_creds() to destroy the proposed new credentials and will then unlock cred_replace_mutex. No changes to the credentials will have been made. (2) LSM interface. A number of functions have been changed, added or removed: (*) security_bprm_alloc(), ->bprm_alloc_security() (*) security_bprm_free(), ->bprm_free_security() Removed in favour of preparing new credentials and modifying those. (*) security_bprm_apply_creds(), ->bprm_apply_creds() (*) security_bprm_post_apply_creds(), ->bprm_post_apply_creds() Removed; split between security_bprm_set_creds(), security_bprm_committing_creds() and security_bprm_committed_creds(). (*) security_bprm_set(), ->bprm_set_security() Removed; folded into security_bprm_set_creds(). (*) security_bprm_set_creds(), ->bprm_set_creds() New. The new credentials in bprm->creds should be checked and set up as appropriate. bprm->cred_prepared is 0 on the first call, 1 on the second and subsequent calls. (*) security_bprm_committing_creds(), ->bprm_committing_creds() (*) security_bprm_committed_creds(), ->bprm_committed_creds() New. Apply the security effects of the new credentials. This includes closing unauthorised files in SELinux. This function may not fail. When the former is called, the creds haven't yet been applied to the process; when the latter is called, they have. The former may access bprm->cred, the latter may not. (3) SELinux. SELinux has a number of changes, in addition to those to support the LSM interface changes mentioned above: (a) The bprm_security_struct struct has been removed in favour of using the credentials-under-construction approach. (c) flush_unauthorized_files() now takes a cred pointer and passes it on to inode_has_perm(), file_has_perm() and dentry_open(). Signed-off-by: David Howells Acked-by: James Morris Acked-by: Serge Hallyn Signed-off-by: James Morris --- security/commoncap.c | 152 +++++++++++++++++++++++++-------------------------- 1 file changed, 76 insertions(+), 76 deletions(-) (limited to 'security/commoncap.c') diff --git a/security/commoncap.c b/security/commoncap.c index b5419273f92..51dfa11e8e5 100644 --- a/security/commoncap.c +++ b/security/commoncap.c @@ -167,7 +167,7 @@ int cap_capset(struct cred *new, static inline void bprm_clear_caps(struct linux_binprm *bprm) { - cap_clear(bprm->cap_post_exec_permitted); + cap_clear(bprm->cred->cap_permitted); bprm->cap_effective = false; } @@ -198,15 +198,15 @@ int cap_inode_killpriv(struct dentry *dentry) } static inline int bprm_caps_from_vfs_caps(struct cpu_vfs_cap_data *caps, - struct linux_binprm *bprm) + struct linux_binprm *bprm, + bool *effective) { + struct cred *new = bprm->cred; unsigned i; int ret = 0; if (caps->magic_etc & VFS_CAP_FLAGS_EFFECTIVE) - bprm->cap_effective = true; - else - bprm->cap_effective = false; + *effective = true; CAP_FOR_EACH_U32(i) { __u32 permitted = caps->permitted.cap[i]; @@ -215,16 +215,13 @@ static inline int bprm_caps_from_vfs_caps(struct cpu_vfs_cap_data *caps, /* * pP' = (X & fP) | (pI & fI) */ - bprm->cap_post_exec_permitted.cap[i] = - (current->cred->cap_bset.cap[i] & permitted) | - (current->cred->cap_inheritable.cap[i] & inheritable); + new->cap_permitted.cap[i] = + (new->cap_bset.cap[i] & permitted) | + (new->cap_inheritable.cap[i] & inheritable); - if (permitted & ~bprm->cap_post_exec_permitted.cap[i]) { - /* - * insufficient to execute correctly - */ + if (permitted & ~new->cap_permitted.cap[i]) + /* insufficient to execute correctly */ ret = -EPERM; - } } /* @@ -232,7 +229,7 @@ static inline int bprm_caps_from_vfs_caps(struct cpu_vfs_cap_data *caps, * do not have enough capabilities, we return an error if they are * missing some "forced" (aka file-permitted) capabilities. */ - return bprm->cap_effective ? ret : 0; + return *effective ? ret : 0; } int get_vfs_caps_from_disk(const struct dentry *dentry, struct cpu_vfs_cap_data *cpu_caps) @@ -250,10 +247,9 @@ int get_vfs_caps_from_disk(const struct dentry *dentry, struct cpu_vfs_cap_data size = inode->i_op->getxattr((struct dentry *)dentry, XATTR_NAME_CAPS, &caps, XATTR_CAPS_SZ); - if (size == -ENODATA || size == -EOPNOTSUPP) { + if (size == -ENODATA || size == -EOPNOTSUPP) /* no data, that's ok */ return -ENODATA; - } if (size < 0) return size; @@ -262,7 +258,7 @@ int get_vfs_caps_from_disk(const struct dentry *dentry, struct cpu_vfs_cap_data cpu_caps->magic_etc = magic_etc = le32_to_cpu(caps.magic_etc); - switch ((magic_etc & VFS_CAP_REVISION_MASK)) { + switch (magic_etc & VFS_CAP_REVISION_MASK) { case VFS_CAP_REVISION_1: if (size != XATTR_CAPS_SZ_1) return -EINVAL; @@ -283,11 +279,12 @@ int get_vfs_caps_from_disk(const struct dentry *dentry, struct cpu_vfs_cap_data cpu_caps->permitted.cap[i] = le32_to_cpu(caps.data[i].permitted); cpu_caps->inheritable.cap[i] = le32_to_cpu(caps.data[i].inheritable); } + return 0; } /* Locate any VFS capabilities: */ -static int get_file_caps(struct linux_binprm *bprm) +static int get_file_caps(struct linux_binprm *bprm, bool *effective) { struct dentry *dentry; int rc = 0; @@ -313,7 +310,10 @@ static int get_file_caps(struct linux_binprm *bprm) goto out; } - rc = bprm_caps_from_vfs_caps(&vcaps, bprm); + rc = bprm_caps_from_vfs_caps(&vcaps, bprm, effective); + if (rc == -EINVAL) + printk(KERN_NOTICE "%s: cap_from_disk returned %d for %s\n", + __func__, rc, bprm->filename); out: dput(dentry); @@ -334,18 +334,27 @@ int cap_inode_killpriv(struct dentry *dentry) return 0; } -static inline int get_file_caps(struct linux_binprm *bprm) +static inline int get_file_caps(struct linux_binprm *bprm, bool *effective) { bprm_clear_caps(bprm); return 0; } #endif -int cap_bprm_set_security (struct linux_binprm *bprm) +/* + * set up the new credentials for an exec'd task + */ +int cap_bprm_set_creds(struct linux_binprm *bprm) { + const struct cred *old = current_cred(); + struct cred *new = bprm->cred; + bool effective; int ret; - ret = get_file_caps(bprm); + effective = false; + ret = get_file_caps(bprm, &effective); + if (ret < 0) + return ret; if (!issecure(SECURE_NOROOT)) { /* @@ -353,63 +362,47 @@ int cap_bprm_set_security (struct linux_binprm *bprm) * executables under compatibility mode, we override the * capability sets for the file. * - * If only the real uid is 0, we do not set the effective - * bit. + * If only the real uid is 0, we do not set the effective bit. */ - if (bprm->e_uid == 0 || current_uid() == 0) { + if (new->euid == 0 || new->uid == 0) { /* pP' = (cap_bset & ~0) | (pI & ~0) */ - bprm->cap_post_exec_permitted = cap_combine( - current->cred->cap_bset, - current->cred->cap_inheritable); - bprm->cap_effective = (bprm->e_uid == 0); - ret = 0; + new->cap_permitted = cap_combine(old->cap_bset, + old->cap_inheritable); } + if (new->euid == 0) + effective = true; } - return ret; -} - -int cap_bprm_apply_creds (struct linux_binprm *bprm, int unsafe) -{ - const struct cred *old = current_cred(); - struct cred *new; - - new = prepare_creds(); - if (!new) - return -ENOMEM; - - if (bprm->e_uid != old->uid || bprm->e_gid != old->gid || - !cap_issubset(bprm->cap_post_exec_permitted, - old->cap_permitted)) { - set_dumpable(current->mm, suid_dumpable); - current->pdeath_signal = 0; - - if (unsafe & ~LSM_UNSAFE_PTRACE_CAP) { - if (!capable(CAP_SETUID)) { - bprm->e_uid = old->uid; - bprm->e_gid = old->gid; - } - if (cap_limit_ptraced_target()) { - bprm->cap_post_exec_permitted = cap_intersect( - bprm->cap_post_exec_permitted, - new->cap_permitted); - } + /* Don't let someone trace a set[ug]id/setpcap binary with the revised + * credentials unless they have the appropriate permit + */ + if ((new->euid != old->uid || + new->egid != old->gid || + !cap_issubset(new->cap_permitted, old->cap_permitted)) && + bprm->unsafe & ~LSM_UNSAFE_PTRACE_CAP) { + /* downgrade; they get no more than they had, and maybe less */ + if (!capable(CAP_SETUID)) { + new->euid = new->uid; + new->egid = new->gid; } + if (cap_limit_ptraced_target()) + new->cap_permitted = cap_intersect(new->cap_permitted, + old->cap_permitted); } - new->suid = new->euid = new->fsuid = bprm->e_uid; - new->sgid = new->egid = new->fsgid = bprm->e_gid; + new->suid = new->fsuid = new->euid; + new->sgid = new->fsgid = new->egid; - /* For init, we want to retain the capabilities set - * in the init_task struct. Thus we skip the usual - * capability rules */ + /* For init, we want to retain the capabilities set in the initial + * task. Thus we skip the usual capability rules + */ if (!is_global_init(current)) { - new->cap_permitted = bprm->cap_post_exec_permitted; - if (bprm->cap_effective) - new->cap_effective = bprm->cap_post_exec_permitted; + if (effective) + new->cap_effective = new->cap_permitted; else cap_clear(new->cap_effective); } + bprm->cap_effective = effective; /* * Audit candidate if current->cap_effective is set @@ -425,23 +418,31 @@ int cap_bprm_apply_creds (struct linux_binprm *bprm, int unsafe) */ if (!cap_isclear(new->cap_effective)) { if (!cap_issubset(CAP_FULL_SET, new->cap_effective) || - bprm->e_uid != 0 || new->uid != 0 || - issecure(SECURE_NOROOT)) - audit_log_bprm_fcaps(bprm, new, old); + new->euid != 0 || new->uid != 0 || + issecure(SECURE_NOROOT)) { + ret = audit_log_bprm_fcaps(bprm, new, old); + if (ret < 0) + return ret; + } } new->securebits &= ~issecure_mask(SECURE_KEEP_CAPS); - return commit_creds(new); + return 0; } -int cap_bprm_secureexec (struct linux_binprm *bprm) +/* + * determine whether a secure execution is required + * - the creds have been committed at this point, and are no longer available + * through bprm + */ +int cap_bprm_secureexec(struct linux_binprm *bprm) { const struct cred *cred = current_cred(); if (cred->uid != 0) { if (bprm->cap_effective) return 1; - if (!cap_isclear(bprm->cap_post_exec_permitted)) + if (!cap_isclear(cred->cap_permitted)) return 1; } @@ -477,7 +478,7 @@ int cap_inode_removexattr(struct dentry *dentry, const char *name) } /* moved from kernel/sys.c. */ -/* +/* * cap_emulate_setxuid() fixes the effective / permitted capabilities of * a process after a call to setuid, setreuid, or setresuid. * @@ -491,10 +492,10 @@ int cap_inode_removexattr(struct dentry *dentry, const char *name) * 3) When set*uiding _from_ euid != 0 _to_ euid == 0, the effective * capabilities are set to the permitted capabilities. * - * fsuid is handled elsewhere. fsuid == 0 and {r,e,s}uid!= 0 should + * fsuid is handled elsewhere. fsuid == 0 and {r,e,s}uid!= 0 should * never happen. * - * -astor + * -astor * * cevans - New behaviour, Oct '99 * A process may, via prctl(), elect to keep its capabilities when it @@ -751,4 +752,3 @@ int cap_vm_enough_memory(struct mm_struct *mm, long pages) cap_sys_admin = 1; return __vm_enough_memory(mm, pages, cap_sys_admin); } - -- cgit v1.2.3