Last updated: 2021-03-31
Usage
e.g. find the files opened by pid 1725 and save the trace to log.txt
strace -p 1725 -e open -o log.txt
-p pid # Attach to the process with the process ID pid and begin tracing.
-e expr # A qualifying expression that modifies which events to trace or how to trace them
# Format: [qualifier=][!][?]value1[,[?]value2]...
# The default qualifier is "trace". "-e open" means literally "-e trace=open"
-o file # Write the trace output to the file rather than to stderr.
Other useful opts:
-f # Trace child processes as they are created by currently traced processes as a result of the fork(2) system call.
-t # Prefix each line of the trace with the wall clock time
Generate Statistics Report
# -c Count time, calls, and errors for each system call and report a summary on program exit
strace -c ls /home
iredadmin iredapd policyd tim
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
100.00    0.000235         118         2           rt_sigaction
  0.00    0.000000           0        10           read
  0.00    0.000000           0         1           write
  0.00    0.000000           0        12           open
  0.00    0.000000           0        14           close
  0.00    0.000000           0         1           execve
  0.00    0.000000           0         1         1 access
  0.00    0.000000           0         3           brk
  0.00    0.000000           0         2           ioctl
  0.00    0.000000           0         3           munmap
  0.00    0.000000           0         1           uname
  0.00    0.000000           0         9           mprotect
  0.00    0.000000           0         1           rt_sigprocmask
  0.00    0.000000           0         1           getrlimit
  0.00    0.000000           0        26           mmap2
  0.00    0.000000           0         1           stat64
  0.00    0.000000           0        12           fstat64
  0.00    0.000000           0         2           getdents64
  0.00    0.000000           0         1           fcntl64
  0.00    0.000000           0         2         1 futex
  0.00    0.000000           0         1           set_thread_area
  0.00    0.000000           0         1           set_tid_address
  0.00    0.000000           0         1           statfs64
  0.00    0.000000           0         1           set_robust_list
------ ----------- ----------- --------- --------- ----------------
100.00    0.000235                   109         2 total
Application: inspect a hung process
# -y Print paths associated with file descriptor arguments.
strace -y -p pid
poll([{fd=9, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}], 1, 0) = 0 (Timeout)
becomes
poll([{fd=9<socket:[4898187]>, events=POLLIN}], 1, 1000^Cstrace: Process 7579 detached
lsof -i | grep 4898187 # see where it is connected to
Remark
-yy Print protocol specific information associated with socket file descriptors,
and block/character device number associated with device file descriptors.
poll
poll() performs a similar task to select(2): it waits for one of a set of file descriptors to become ready to perform I/O.
POLLIN # There is data to read.
POLLPRI # There is urgent data to read.
POLLRDNORM # Equivalent to POLLIN.
POLLRDBAND # Priority band data can be read (generally unused on Linux).
POLLERR # Error condition (output only).
POLLHUP # Hang up (output only).
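A minimal sketch of the same poll() behaviour, using Python's select.poll as a stand-in for the C call. The socket pair and the variable names are made up for illustration; they are not taken from the traced process.

```python
import select
import socket

# A connected socket pair stands in for the fd a traced process would poll.
reader, writer = socket.socketpair()
reader_fd = reader.fileno()

poller = select.poll()
poller.register(reader_fd, select.POLLIN | select.POLLPRI)

# Nothing has been written yet, so a 0 ms poll returns an empty list.
# In strace this is the "= 0 (Timeout)" case.
idle_events = poller.poll(0)

writer.send(b"ping")

# Now the fd is readable: poll returns a list of (fd, event_mask) tuples.
ready_events = poller.poll(1000)
ready_fd, ready_mask = ready_events[0]
is_readable = bool(ready_mask & select.POLLIN)

reader.close()
writer.close()
```

The timeout argument is in milliseconds, the same unit as the last argument of poll() in the strace output.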
recvfrom()
The recvfrom() function receives data on a socket named by descriptor socket and stores it in a buffer.
The recvfrom() function applies to any datagram socket, whether connected or unconnected.
Error
recvfrom(24, 0x7fadc2311c18, 286, MSG_DONTWAIT, NULL, NULL) = -1 ETIMEDOUT (Connection timed out)
-1
recv returns -1 with errno set to EAGAIN ("Resource temporarily unavailable")
MSG_DONTWAIT (since Linux 2.2)
Enables nonblocking operation; if the operation would block, the call fails with the error EAGAIN or EWOULDBLOCK. This
provides similar behavior to setting the O_NONBLOCK flag (via the fcntl(2) F_SETFL operation), but differs in that
MSG_DONTWAIT is a per-call option, whereas O_NONBLOCK is a setting on the open file description (see open(2)), which will
affect all threads in the calling process as well as other processes that hold file descriptors referring to the same open file description.
EAGAIN with MSG_DONTWAIT is not an error but expected behavior. The documentation of MSG_DONTWAIT states:
If no data is available, then instead of blocking, return immediately with the error EAGAIN.
This means the caller should expect this case and handle it, for example by waiting with poll() and retrying.
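A small sketch of the EAGAIN case, assuming a Unix datagram socket pair that nobody writes to (the pair is only a stand-in for the real socket in the trace). Python reports EAGAIN / EWOULDBLOCK as BlockingIOError.

```python
import errno
import socket

# A datagram socket pair with no pending data.
receiver, sender = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)

try:
    # MSG_DONTWAIT makes only this call nonblocking.
    data, peer = receiver.recvfrom(286, socket.MSG_DONTWAIT)
    result = "data"
except BlockingIOError as exc:
    # No data available: handle it as "try again later", not as a failure.
    would_block = exc.errno in (errno.EAGAIN, errno.EWOULDBLOCK)
    result = "would block" if would_block else "other error"
finally:
    receiver.close()
    sender.close()
```

A real program would normally go back to poll() or select() at this point and retry once the descriptor is readable.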
ETIMEDOUT
The connection timed out during connection establishment, or due to a transmission timeout on an active connection.
There are three possible reasons for seeing ETIMEDOUT:
1. The connection timed out inside recv. This is very unlikely to happen even once, and certainly not several times.
2. You did not check the success of connect, and the connection was never successfully established
(maybe the firewall is dropping the connection attempts?). This is the likely reason.
3. Your sockets implementation is broken. This is very unlikely.
select does not generate ETIMEDOUT; only connect and recv may.
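A sketch of checking the result of connect, as reason 2 suggests. connect_checked is a hypothetical helper, and the missing Unix socket path is only there to force a failure without needing a network; a TCP connect dropped by a firewall would be expected to report ETIMEDOUT in the same place.

```python
import errno
import os
import socket
import tempfile


def connect_checked(sock, address):
    """Hypothetical helper: return None on success, else the errno name."""
    # connect_ex returns 0 or an errno value instead of raising.
    err = sock.connect_ex(address)
    if err == 0:
        return None
    return errno.errorcode.get(err, str(err))


with tempfile.TemporaryDirectory() as tmp:
    # Nothing listens on this path, so the connect must fail.
    missing = os.path.join(tmp, "no-such.sock")
    client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    failure = connect_checked(client, missing)
    client.close()
```

Only call recv on the socket when the helper returned None.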
futex
Example
strace -p 352
futex(0xb773fbd8, FUTEX_WAIT, 355, NULL
This simply means you are tracing the original parent thread, and it is doing nothing but waiting for some other threads to finish.
# Find the thread IDs of a process
ps -efL|grep <Process Name>
# See what a thread is doing
# -f Trace child processes as they are created by currently traced processes.
#    With -p, -f also attaches to all threads of a multi-threaded process.
strace -f -p 15336
[pid 371] futex(0x90008644, FUTEX_WAIT_PRIVATE, 1, {0, 49942441}) = -1 ETIMEDOUT (Connection timed out)
FUTEX_WAIT and FUTEX_WAKE
The _PRIVATE variants (FUTEX_WAIT_PRIVATE, FUTEX_WAKE_PRIVATE) are an optimization done by linux/glibc to make futexes faster when they're not shared between processes.
Glibc will use the _PRIVATE versions of each of the futex calls unless the PTHREAD_PROCESS_SHARED attribute is set on your mutex.
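A sketch of a timed wait on a contended lock, which is the kind of situation that produces a futex wait ending in ETIMEDOUT. It is an assumption, not verified output, that running this under strace -f shows a futex call with a timeout; the exact operation name (for example FUTEX_WAIT_BITSET_PRIVATE) depends on the Python and glibc versions.

```python
import threading

lock = threading.Lock()
lock.acquire()  # held by the main thread, so the worker cannot take it

outcome = {}


def worker():
    # Timed wait on the contended lock; gives up after 50 ms.
    outcome["acquired"] = lock.acquire(timeout=0.05)


thread = threading.Thread(target=worker)
thread.start()
thread.join()
lock.release()
```

The worker gets False back, which is the application-level view of the wait timing out.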
Doc
http://linux.die.net/man/2/futex
flock
flock - apply or remove an advisory lock on an open file
int flock(int fd, int operation);
performs one of the operations described below on the open file descriptor fd.
LOCK_SH
Place a shared lock. More than one process may hold a shared lock for a given file at a given time.
LOCK_EX
Place an exclusive lock. Only one process may hold an exclusive lock for a given file at a given time.
LOCK_UN
Remove an existing lock held by this process.
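A sketch of LOCK_EX and LOCK_UN using Python's fcntl.flock. The temporary file and variable names are made up. LOCK_NB is not covered in these notes; it is used here only so the second attempt fails immediately instead of blocking.

```python
import fcntl
import os
import tempfile

# Two separate open() calls on one file give two independent descriptors.
fd_holder, path = tempfile.mkstemp()
fd_other = os.open(path, os.O_RDWR)

fcntl.flock(fd_holder, fcntl.LOCK_EX)  # exclusive lock via the first fd

try:
    # A second exclusive lock must be refused while the first is held.
    fcntl.flock(fd_other, fcntl.LOCK_EX | fcntl.LOCK_NB)
    second_attempt = "acquired"
except BlockingIOError:
    second_attempt = "blocked"

fcntl.flock(fd_holder, fcntl.LOCK_UN)  # release the first lock

# With the first lock gone, the same request now succeeds.
fcntl.flock(fd_other, fcntl.LOCK_EX | fcntl.LOCK_NB)
after_unlock = "acquired"

os.close(fd_holder)
os.close(fd_other)
os.unlink(path)
```

Without LOCK_NB the second flock call would block, which is what a hung process stuck in flock(fd, LOCK_EX) looks like in strace.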
e.g.
flock(23, LOCK_EX) = 0
fcntl(23, F_SETFD, FD_CLOEXEC) = 0
The operation is determined by cmd.
F_SETFD (int)
Set the file descriptor flags to the value specified by arg.
FD_CLOEXEC
the close-on-exec flag.
If the FD_CLOEXEC bit is set, the file descriptor will automatically be closed during a successful execve(2).
(If the execve(2) fails, the file descriptor is left open.) If the FD_CLOEXEC bit is not set, the file descriptor will remain open across an execve(2).
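A sketch of setting and clearing the close-on-exec flag with fcntl. F_GETFD is not mentioned above; it is the matching call that reads the descriptor flags. The pipe is only a convenient throwaway descriptor.

```python
import fcntl
import os

read_fd, write_fd = os.pipe()

# Read the current descriptor flags, then set FD_CLOEXEC on top of them.
flags = fcntl.fcntl(read_fd, fcntl.F_GETFD)
fcntl.fcntl(read_fd, fcntl.F_SETFD, flags | fcntl.FD_CLOEXEC)
cloexec_set = bool(fcntl.fcntl(read_fd, fcntl.F_GETFD) & fcntl.FD_CLOEXEC)

# Clear the bit again: the fd would now stay open across execve(2).
fcntl.fcntl(read_fd, fcntl.F_SETFD, flags & ~fcntl.FD_CLOEXEC)
cloexec_cleared = not (fcntl.fcntl(read_fd, fcntl.F_GETFD) & fcntl.FD_CLOEXEC)

os.close(read_fd)
os.close(write_fd)
```

The first fcntl F_SETFD call corresponds to the fcntl(23, F_SETFD, FD_CLOEXEC) = 0 line in the trace.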
send(), sendto(), and sendmsg()
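Used to transmit a message to another socket. send() may be used only when the socket is in a connected state; sendto() and sendmsg() can also take a destination address.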
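A sketch of the three calls through Python's socket module, assuming Unix datagram sockets so that no network is needed. The socket path and the payloads are made up.

```python
import os
import socket
import tempfile

# send(): the socket is already connected to its peer.
left, right = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)
left.send(b"via send")
got_send = right.recv(64)

# sendmsg(): several buffers are gathered into one message.
left.sendmsg([b"via ", b"sendmsg"])
got_sendmsg = right.recv(64)
left.close()
right.close()

# sendto(): unconnected socket, destination given on each call.
with tempfile.TemporaryDirectory() as tmp:
    address = os.path.join(tmp, "receiver.sock")
    receiver = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
    receiver.bind(address)
    sender = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
    sender.sendto(b"via sendto", address)
    got_sendto = receiver.recv(64)
    sender.close()
    receiver.close()
```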