Last updated: 2023-09-07
Capabilities
Contents
- Check Process Capabilities
- capsh
- LXC Settings
Check Process Capabilities
# When a process has no additional threads, TID = PID
/proc/PID/task/TID/status
e.g.
cat /proc/$$/task/$$/status
Name: bash
Umask: 0022
State: S (sleeping)
Tgid: 1880
...
CapInh: 0000000000000000
CapPrm: 0000001fffffffff
CapEff: 0000001fffffffff
CapBnd: 0000001fffffffff
CapAmb: 0000000000000000
...
- Inheritable capabilities (CapInh)
- Permitted capabilities (CapPrm)
  capabilities that can be raised into the effective set when needed using syscalls
- Effective capabilities (CapEff)
  capabilities that are checked for each privileged action
- Bounding set (CapBnd)
  superset of capabilities; nothing beyond this set can be gained
- Ambient capabilities set (CapAmb)
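The Cap* fields in the status output can also be read programmatically. The following is a minimal sketch, assuming the usual "Name:<tab>value" layout of /proc/PID/status; the helper names (parse_cap_sets, has_cap, read_own_cap_sets) are made up for this note, and the sample text mirrors the excerpt above with the other fields omitted.

```python
import re

# Sample shaped like the status excerpt above (unrelated fields omitted).
SAMPLE_STATUS = """\
Name:\tbash
CapInh:\t0000000000000000
CapPrm:\t0000001fffffffff
CapEff:\t0000001fffffffff
CapBnd:\t0000001fffffffff
CapAmb:\t0000000000000000
"""

# Matches the five capability lines and captures the field name and hex mask.
CAP_FIELD = re.compile(
    r"^(Cap(?:Inh|Prm|Eff|Bnd|Amb)):[ \t]*([0-9a-fA-F]+)[ \t]*$",
    re.MULTILINE,
)


def parse_cap_sets(status_text):
    """Return a dict such as {"CapEff": 0x1fffffffff, ...} of integer masks."""
    return {name: int(value, 16) for name, value in CAP_FIELD.findall(status_text)}


def has_cap(mask, bit):
    """True if capability number `bit` is set in the mask."""
    return bool((mask >> bit) & 1)


def read_own_cap_sets(path="/proc/self/status"):
    """Parse the capability sets of the current process (Linux only)."""
    with open(path) as fh:
        return parse_cap_sets(fh.read())


if __name__ == "__main__":
    sets = parse_cap_sets(SAMPLE_STATUS)
    for name, mask in sets.items():
        print(f"{name}: {mask:#018x}")
    # Bit 21 is CAP_SYS_ADMIN in linux/capability.h.
    print("CAP_SYS_ADMIN effective:", has_cap(sets["CapEff"], 21))
```

On a Linux host, read_own_cap_sets() applies the same parsing to the running Python process instead of the sample text.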
capsh --decode=0000001fffffffff
0x0000001fffffffff=cap_chown,cap_dac_override,...
getpcaps $$
Capabilities for `1913': = cap_chown,cap_dac_override,...
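What capsh --decode does can be approximated in a few lines: each bit of the hex mask is one capability number. This is only a sketch, not capsh's actual implementation; the name table follows the bit order in linux/capability.h, and the function name decode_caps and the choice to report unknown bits by number are assumptions of this note.

```python
# Capability names indexed by bit number, in the order used by
# include/uapi/linux/capability.h (bit 0 = CAP_CHOWN).
CAP_NAMES = [
    "chown", "dac_override", "dac_read_search", "fowner", "fsetid",
    "kill", "setgid", "setuid", "setpcap", "linux_immutable",
    "net_bind_service", "net_broadcast", "net_admin", "net_raw", "ipc_lock",
    "ipc_owner", "sys_module", "sys_rawio", "sys_chroot", "sys_ptrace",
    "sys_pacct", "sys_admin", "sys_boot", "sys_nice", "sys_resource",
    "sys_time", "sys_tty_config", "mknod", "lease", "audit_write",
    "audit_control", "setfcap", "mac_override", "mac_admin", "syslog",
    "wake_alarm", "block_suspend", "audit_read", "perfmon", "bpf",
    "checkpoint_restore",
]


def decode_caps(hex_mask):
    """Turn a hex mask such as '0000001fffffffff' into a list of cap_* names."""
    mask = int(hex_mask, 16)
    names = []
    bit = 0
    while mask >> bit:  # stop once no higher bits remain
        if (mask >> bit) & 1:
            if bit < len(CAP_NAMES):
                names.append("cap_" + CAP_NAMES[bit])
            else:
                names.append(str(bit))  # bit not in the table: report its number
        bit += 1
    return names


if __name__ == "__main__":
    print(",".join(decode_caps("0000001fffffffff")))
```

The mask 0000001fffffffff has bits 0 to 36 set, so the list runs from cap_chown to cap_block_suspend.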
capsh
- --print display capability relevant state
- --decode=xxx decode a hex string to a list of caps
- --supports=xxx exit 1 if capability xxx unsupported
- --drop=xxx remove xxx,.. capabilities from bset
- --caps=xxx set caps as per cap_from_text()
- --inh=xxx set xxx,.. inheritable set
- --secbits=<n> write a new value for securebits
- --keep=<n> set keep-capability bit to <n>
- --uid=<n> set uid to <n> (hint: id <username>)
- --gid=<n> set gid to <n> (hint: id <username>)
- --groups=g,... set the supplemental groups
- --user=<name> set uid,gid and groups to that of user
- --chroot=path chroot(2) to this path
- --killit=<n> send signal(n) to child
- --forkfor=<n> fork and make child sleep for <n> sec
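Since capsh applies its options in the order given, it can help to assemble the command line in a script. The sketch below only builds an argument list from options listed above (--chroot, --drop, --user, --print); the function capsh_argv and its parameters are hypothetical, and the ordering (drop from the bounding set before switching user) is an assumption about a sensible sequence, not something the notes above prescribe.

```python
import shlex


def capsh_argv(drop=(), user=None, chroot=None, show_state=True):
    """Build an argv list for capsh using only options from the list above."""
    argv = ["capsh"]
    if chroot is not None:
        argv.append(f"--chroot={chroot}")
    if drop:
        # --drop takes a comma-separated list of capability names.
        argv.append("--drop=" + ",".join(drop))
    if user is not None:
        argv.append(f"--user={user}")
    if show_state:
        argv.append("--print")
    return argv


if __name__ == "__main__":
    argv = capsh_argv(drop=["cap_net_raw", "cap_sys_admin"])
    # Print a shell-quoted command line; nothing is executed here.
    print(shlex.join(argv))
```

The resulting list can be passed to subprocess.run() on a host where capsh is installed; dropping from the bounding set normally needs enough privilege to do so.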
Process execution identity
- CAP_SETUID
  Make arbitrary manipulations of process UIDs;
- CAP_SETGID
  Make arbitrary manipulations of process GIDs and supplementary GID list;
audit
cap_audit_control
enable and disable kernel auditing;
change auditing filter rules;
retrieve auditing status and filtering rules.
cap_audit_write (since Linux 2.6.11)
write records to kernel auditing log
cap_sys_resource
- Increase resource limits (see setrlimit(2));
- Override maximum number of consoles on console allocation;
cap_net_raw
* use raw and packet sockets
* bind to any address for transparent proxying
Without it, tcpdump and iptables are affected:
tcpdump -i eth0
tcpdump: eth0: You don't have permission to capture on that device
iptables -nL
iptables v1.4.12: can't initialize iptables table `filter': Permission denied (you must be root) Perhaps iptables or your kernel needs to be upgraded.
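Whether the current process may "use raw and packet sockets" can be probed directly by trying to open one. This is a rough sketch, assuming a raw ICMP socket is a fair stand-in for what tcpdump needs; the function name can_open_raw_socket is made up here, and any OSError (typically PermissionError when CAP_NET_RAW is missing) is treated as "not available".

```python
import socket


def can_open_raw_socket():
    """Return True if a raw ICMP socket can be created, False otherwise."""
    try:
        sock = socket.socket(socket.AF_INET, socket.SOCK_RAW, socket.IPPROTO_ICMP)
    except OSError:
        # PermissionError (a subclass of OSError) is the usual result when
        # the capability is missing; other failures are also reported as False.
        return False
    sock.close()
    return True


if __name__ == "__main__":
    print("raw sockets usable:", can_open_raw_socket())
```

Running it inside a container with net_raw dropped and again on the host shows the difference without needing tcpdump installed.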
cap_net_broadcast (unused)
make socket broadcasts, and listen to multicasts
cap_ipc_lock
lock memory (mlock(2), mlockall(2), mmap(2), shmctl(2)).
---
cap_sys_ptrace
trace arbitrary processes using ptrace(2);
---
cap_fsetid
overrides the following restrictions: that the effective user ID shall
match the file owner ID when setting the S_ISUID and S_ISGID bits on
that file; that the effective group ID (or one of the supplementary
group IDs) shall match the file owner ID when setting the S_ISGID bit
on that file; that the S_ISUID and S_ISGID bits are cleared on
successful return from chown(2) (not implemented).
Well, the Pine "problem" isn't really a problem as long as I leave the CAP_FSETID bit off of it. Pine opens the file $HOME/mail/saved-messages when I look in saved messages, so if $HOME/mail/saved-messages is a symlink to /etc/shadow and pine has the CAP_FSETID capability, it can read /etc/shadow even though you normally wouldn't have read access to it.
---
cap_block_suspend (since Linux 3.5)
block system suspend (epoll(7) EPOLLWAKEUP, /proc/sys/wake_lock).
---
cap_sys_boot
use reboot(2) and kexec_load(2)
By default, LXC does not support rebooting a container from within.
It will simply stop, and the host will not know to start it again.
If you want your container to reboot gracefully, it needs the sys_boot capability.
---
cap_sys_chroot
use chroot(2)
---
cap_sys_time
set system clock (settimeofday(2), stime(2), adjtimex(2)); set real-time (hardware) clock.
---
cap_sys_module
load and unload kernel modules (see init_module(2) and delete_module(2));
---
CAP_SETFCAP
Set file capabilities.
CLI: setcap - set file capabilities
---
CAP_SETPCAP
make changes to the securebits flags
add any capability from the calling thread's bounding set to its inheritable set
drop capabilities from the bounding set
If file capabilities are not supported:
grant or remove any capability in the caller's permitted capability set to or from any other process.
---
cap_mac_admin
# MAC = Mandatory Access Control
allow MAC configuration or state changes
---
cap_mac_override
override Mandatory Access Control (MAC)
CAP_SYS_ADMIN
* Perform a range of system administration operations including: quotactl(2), mount(2), umount(2), swapon(2), swapoff(2),
sethostname(2), and setdomainname(2);
* perform privileged syslog(2) operations (since Linux 2.6.37, CAP_SYSLOG should be used to permit such operations);
* perform VM86_REQUEST_IRQ vm86(2) command;
* perform IPC_SET and IPC_RMID operations on arbitrary System V IPC objects;
* perform operations on trusted and security Extended Attributes (see attr(5));
* use lookup_dcookie(2);
* use ioprio_set(2) to assign IOPRIO_CLASS_RT and (before Linux 2.6.25) IOPRIO_CLASS_IDLE I/O scheduling classes;
* forge UID when passing socket credentials;
* exceed /proc/sys/fs/file-max, the system-wide limit on the number of open files, in system calls that open files
(e.g., accept(2), execve(2), open(2), pipe(2));
* employ CLONE_* flags that create new namespaces with clone(2) and unshare(2);
* call perf_event_open(2);
* access privileged perf event information;
* call setns(2);
* call fanotify_init(2);
* perform KEYCTL_CHOWN and KEYCTL_SETPERM keyctl(2) operations;
* perform madvise(2) MADV_HWPOISON operation;
* employ the TIOCSTI ioctl(2) to insert characters into the input queue of a terminal other than the caller's controlling terminal.
* employ the obsolete nfsservctl(2) system call;
* employ the obsolete bdflush(2) system call;
* perform various privileged block-device ioctl(2) operations;
* perform various privileged filesystem ioctl(2) operations;
* perform administrative operations on many device drivers.
LXC Settings
lxc.cap.drop # space-separated items
My LXC Settings
#### Capabilities ####
# Always keep this line
lxc.cap.drop = sys_module sys_time mac_admin mac_override
lxc.cap.drop = sys_admin
lxc.cap.drop = sys_resource
lxc.cap.drop = sys_rawio
lxc.cap.drop = mknod setuid net_raw
lxc.cap.drop = setfcap setpcap
lxc.cap.drop = sys_pacct sys_ptrace
lxc.cap.drop = audit_control audit_write
lxc.cap.drop = sys_tty_config sys_resource
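To see what a set of lxc.cap.drop lines leaves behind, the entries can be collected and cleared from a starting mask. The sketch below is an illustration only, not how LXC itself applies the setting: it starts from the CapBnd value 0000001fffffffff shown earlier, the helper names parse_cap_drop and remaining_mask are made up for this note, and an empty "lxc.cap.drop =" line is treated as clearing the list, as described in lxc.container.conf(5).

```python
# Capability names indexed by bit number (order of linux/capability.h).
CAP_NAMES = [
    "chown", "dac_override", "dac_read_search", "fowner", "fsetid",
    "kill", "setgid", "setuid", "setpcap", "linux_immutable",
    "net_bind_service", "net_broadcast", "net_admin", "net_raw", "ipc_lock",
    "ipc_owner", "sys_module", "sys_rawio", "sys_chroot", "sys_ptrace",
    "sys_pacct", "sys_admin", "sys_boot", "sys_nice", "sys_resource",
    "sys_time", "sys_tty_config", "mknod", "lease", "audit_write",
    "audit_control", "setfcap", "mac_override", "mac_admin", "syslog",
    "wake_alarm", "block_suspend", "audit_read", "perfmon", "bpf",
    "checkpoint_restore",
]
CAP_BIT = {name: bit for bit, name in enumerate(CAP_NAMES)}

# The drop lines from the settings above.
SAMPLE_CONFIG = """\
lxc.cap.drop = sys_module sys_time mac_admin mac_override
lxc.cap.drop = sys_admin
lxc.cap.drop = sys_resource
lxc.cap.drop = sys_rawio
lxc.cap.drop = mknod setuid net_raw
lxc.cap.drop = setfcap setpcap
lxc.cap.drop = sys_pacct sys_ptrace
lxc.cap.drop = audit_control audit_write
lxc.cap.drop = sys_tty_config sys_resource
"""


def parse_cap_drop(config_text):
    """Collect the capability names listed in lxc.cap.drop lines."""
    dropped = set()
    for raw in config_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and commented-out entries
        key, sep, value = line.partition("=")
        if not sep or key.strip() != "lxc.cap.drop":
            continue
        names = value.split()
        if not names:
            dropped.clear()  # empty value resets the list
            continue
        for name in names:
            if name not in CAP_BIT:
                raise ValueError(f"unknown capability name: {name}")
            dropped.add(name)
    return dropped


def remaining_mask(start_mask, dropped):
    """Clear the bit of every dropped capability from start_mask."""
    mask = start_mask
    for name in dropped:
        mask &= ~(1 << CAP_BIT[name])
    return mask


if __name__ == "__main__":
    dropped = parse_cap_drop(SAMPLE_CONFIG)
    mask = remaining_mask(0x1FFFFFFFFF, dropped)
    print(f"dropped {len(dropped)} capabilities, remaining mask {mask:016x}")
```

The printed mask can be fed to capsh --decode to list what the container would still be allowed.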
Notes
# Ubuntu 12 needs it in order to start
#lxc.cap.drop = sys_admin
# Needed to run reboot inside the VPS
#lxc.cap.drop = sys_boot
# ssh needs it in order to start
#lxc.cap.drop = sys_chroot
# dhcp, iptables and tcpdump need it
#lxc.cap.drop = net_raw
# U16 needs these in order to log in; otherwise login keeps failing
#lxc.cap.drop = audit_control audit_write
# Best to keep it, because things may be missing under /dev otherwise
#lxc.cap.drop = mknod
Remark
Dropping sys_admin and net_admin isn't very practical, and it won't make your container much safer.
Reason: root in the container will be able to re-grant itself any dropped capability.
In lxc.cap.drop, names are written in lowercase without the CAP_ prefix, e.g. CAP_SYS_MODULE should be specified as sys_module
Doc
http://manpages.ubuntu.com/manpages/trusty/en/man7/capabilities.7.html