I just added another tool to my 0x.tools toolset: xstack.
With xstack, you can passively sample both kernel stacks and user stacks (as long as frame pointers are present) with a pretty minimalistic tool, with no direct impact on your critical application processes!
While xstack itself is just a data extraction tool, you probably want to summarize/profile all those thread states and stack profiles. Here’s an example of piping xstack output to flamelens to immediately display a (terminal) flamegraph from the sampled data:
xstack is a lightweight, completely passive stack profiler for Linux that uses eBPF task iterators to sample thread states and stack traces without injecting any tracepoints, kprobes, or perf events into the system.
Thanks to using modern eBPF sleepable task iterators and the bpf_copy_from_user_task() helper, xstack can read the task states of all other threads in the system and read both their kernel and userspace stacks (where frame pointers are available).
This allows you to do full wall-clock time profiling (not only CPU) without having to inject any tracepoints or probes into the critical path of all the important processes in your system. And with xstack output, you can do CPU profiling without even using perf_events to inject interrupts into the critical paths!
This means that the overhead of running xstack is essentially none! We just read anything we want from shared physical memory and let the memory controller ship bytes around between CPU caches under the hood; no signals, interrupts or interprocessor “remote function calls” are involved.
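To make the wall-clock idea concrete, here is a minimal Python sketch (not part of xstack) that turns sampled thread states into estimated seconds: at a sampling frequency of F Hz, each sample of a thread in a given state stands for roughly 1/F seconds spent in that state. It assumes the CSV column order seen in the example output further below (timestamp, TID, TGID, comm, state, then the stacks); the function name and the abbreviated sample lines are my own, purely for illustration.

```python
from collections import Counter

# Abbreviated lines in the shape of xstack's CSV output (stacks shortened).
# Assumed column order: timestamp, tid, tgid, comm, state, user stack, kernel stack.
SAMPLE = [
    "2025-08-14 17:28:33.276015,57052,56914,ib_log_flush,DISK,os_file_flush_func(int)+0x28,xfs_file_fsync+0x15e",
    "2025-08-14 17:28:33.276625,57054,56914,ib_log_writer,DISK,log_writer(log_t*)+0x440,vfs_write+0x33a",
    "2025-08-14 17:28:34.028444,57052,56914,ib_log_flush,DISK,os_file_flush_func(int)+0x28,xfs_file_fsync+0x15e",
]

def wall_clock_seconds(lines, freq_hz=1.0):
    """Estimate seconds per (comm, state): one sample ~ 1/freq_hz seconds."""
    counts = Counter()
    for line in lines:
        # Data lines start with a timestamp digit; skip a CSV header or blanks.
        if not line[:1].isdigit():
            continue
        # Only the first five fields are needed, so leave the stacks unsplit.
        _ts, _tid, _tgid, comm, state, _stacks = line.rstrip("\n").split(",", 5)
        counts[(comm, state)] += 1
    return {key: n / freq_hz for key, n in counts.items()}

for (comm, state), secs in sorted(wall_clock_seconds(SAMPLE).items()):
    print(f"{comm:<16} {state:<6} {secs:.1f}s")
```

Pass the same frequency you gave to -F as freq_hz. The estimate is statistical, so it only becomes meaningful once you have collected enough samples.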
Anyway, go to the xstack subdirectory in the 0x.tools repo to try it out; it’s a pretty short and simple C program.
Here’s the help output; the example output is further below:
$ ./xstack --help
Usage: xstack [OPTION...]
xstack v3.0.0 by Tanel Poder [0x.tools]
Completely passive stack profiling without injecting any tracepoints
USAGE: xstack -a | -p PID | -t TID [-F HZ] [-i NUM]
EXAMPLES:
xstack -a # Sample all tasks continuously
xstack -p 1234 # Sample process 1234 and its threads
xstack -t 5678 # Sample only thread 5678
xstack -a -F 10 # Sample all tasks at 10 Hz
xstack -a -i 100 # Sample all tasks for 100 iterations
xstack -p $$ -F 5 -i 25 # Sample shell at 5 Hz for 5 seconds
-a, --all Sample all tasks/threads
-F, --freq=HZ Sampling frequency in Hz (default: 1)
-i, --iterations=NUM Number of sampling iterations (default: infinite)
-p, --pid=PID Filter by process ID (TGID)
-q, --quiet Suppress CSV header output
-r, --reverse-stack Reverse stack trace order (innermost first)
-t, --tid=TID Filter by thread ID (PID)
-?, --help Give this help list
--usage Give a short usage message
-V, --version Print program version
Mandatory or optional arguments to long options are also mandatory or optional
for any corresponding short options.
Report bugs to <https://github.com/tanelpoder/0xtools>.
Example output when running sysbench with MySQL on RHEL9 is below. Warning, the output is very wide :-)
$ sudo ./xstack -a | grep DISK
2025-08-14 17:28:33.276015,57052,56914,ib_log_flush,DISK,os_file_flush_func(int)+0x28;0x5585bd1a1d3e;0x5585bd1bea65;log_flusher(log_t*)+0x2d0;0x5585bd1c14d9;0x7f1a8c6dbad4,xlog_wait_on_iclog+0x16b;xfs_log_force_seq+0x8f;xfs_file_fsync+0x15e;__x64_sys_fsync+0x33;do_syscall_64+0x5c;entry_SYSCALL_64_after_hwframe+0x78
2025-08-14 17:28:33.276625,57054,56914,ib_log_writer,DISK,0x5585bd1cf9fc;0x5585bd1957d7;0x5585bd1c9950;log_writer(log_t*)+0x440;0x5585bd1c14d9;0x7f1a8c6dbad4,xfs_ilock_for_iomap+0xca;xfs_buffered_write_iomap_begin+0x100;iomap_iter+0xaf;iomap_file_buffered_write+0x93;xfs_file_buffered_write+0x85;vfs_write+0x33a;__x64_sys_pwrite64+0x90;do_syscall_64+0x5c;entry_SYSCALL_64_after_hwframe+0x78
2025-08-14 17:28:33.278146,57217,56914,connection,DISK,IO_CACHE_ostream::sync()+0x6f;MYSQL_BIN_LOG::sync_binlog_file(bool)+0x5f;MYSQL_BIN_LOG::ordered_commit(THD*, bool, bool)+0x7ba;MYSQL_BIN_LOG::commit(THD*, bool)+0x8b2;ha_commit_trans(THD*, bool, bool)+0x3fc;trans_commit(THD*, bool)+0x5f;mysql_execute_command(THD*, bool)+0x2802;Prepared_statement::execute(THD*, String*, bool)+0x597;Prepared_statement::execute_loop(THD*, String*, bool)+0x149;mysqld_stmt_execute(THD*, Prepared_statement*, bool, unsigned long, PS_PARAM*)+0x145;dispatch_command(THD*, COM_DATA const*, enum_server_command)+0xdd3;do_command(THD*)+0x261;0x5585bc6e5208;0x5585bd4bc866;start_thread+0x31a,xlog_wait_on_iclog+0x16b;xfs_log_force_seq+0x8f;xfs_file_fsync+0x15e;__x64_sys_fdatasync+0x43;do_syscall_64+0x5c;entry_SYSCALL_64_after_hwframe+0x78
2025-08-14 17:28:34.028444,57052,56914,ib_log_flush,DISK,os_file_flush_func(int)+0x28;0x5585bd1a1d3e;0x5585bd1bea65;log_flusher(log_t*)+0x2d0;0x5585bd1c14d9;0x7f1a8c6dbad4,xlog_wait_on_iclog+0x16b;xfs_log_force_seq+0x8f;xfs_file_fsync+0x15e;__x64_sys_fsync+0x33;do_syscall_64+0x5c;entry_SYSCALL_64_after_hwframe+0x78
2025-08-14 17:28:34.028645,57054,56914,ib_log_writer,DISK,0x5585bd1cf9fc;0x5585bd1957d7;0x5585bd1c9950;log_writer(log_t*)+0x440;0x5585bd1c14d9;0x7f1a8c6dbad4,xfs_vn_update_time+0xa5;file_modified_flags+0xc9;xfs_file_write_checks+0x249;xfs_file_buffered_write+0x5f;vfs_write+0x33a;__x64_sys_pwrite64+0x90;do_syscall_64+0x5c;entry_SYSCALL_64_after_hwframe+0x78
2025-08-14 17:28:35.029735,57052,56914,ib_log_flush,DISK,os_file_flush_func(int)+0x28;0x5585bd1a1d3e;0x5585bd1bea65;log_flusher(log_t*)+0x2d0;0x5585bd1c14d9;0x7f1a8c6dbad4,xlog_wait_on_iclog+0x16b;xfs_log_force_seq+0x8f;xfs_file_fsync+0x15e;__x64_sys_fsync+0x33;do_syscall_64+0x5c;entry_SYSCALL_64_after_hwframe+0x78
2025-08-14 17:28:35.029979,57054,56914,ib_log_writer,DISK,0x5585bd1cf9fc;0x5585bd1957d7;0x5585bd1c9950;log_writer(log_t*)+0x440;0x5585bd1c14d9;0x7f1a8c6dbad4,xfs_vn_update_time+0xa5;file_modified_flags+0xc9;xfs_file_write_checks+0x249;xfs_file_buffered_write+0x5f;vfs_write+0x33a;__x64_sys_pwrite64+0x90;do_syscall_64+0x5c;entry_SYSCALL_64_after_hwframe+0x78
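If you want to feed lines like these into a flamegraph tool, one option is to collapse them into the common "folded" format (root-first frames joined by semicolons, followed by a sample count). The Python sketch below is my own illustration, not the script used for the flamelens example above. It assumes the column order visible in the output (timestamp, TID, TGID, comm, state, user stack, kernel stack), that frames are listed leaf-first as they appear here, and that kernel symbols never contain commas, so the last comma on a line separates the user stack from the kernel stack.

```python
import re
from collections import Counter

# Trailing "+0x..." offset on a symbolized frame; raw addresses are left alone.
OFFSET = re.compile(r"\+0x[0-9a-fA-F]+$")

# One abbreviated line in the shape of the output above (stacks shortened).
SAMPLE = [
    "2025-08-14 17:28:33.278146,57217,56914,connection,DISK,"
    "IO_CACHE_ostream::sync()+0x6f;trans_commit(THD*, bool)+0x5f;start_thread+0x31a,"
    "xfs_file_fsync+0x15e;__x64_sys_fdatasync+0x43;do_syscall_64+0x5c",
]

def split_stacks(line):
    """Return (comm, state, user_frames, kernel_frames) for one CSV line."""
    _ts, _tid, _tgid, comm, state, stacks = line.rstrip("\n").split(",", 5)
    # C++ symbols may contain commas ("THD*, bool"), kernel symbols do not,
    # so split the two stack columns at the LAST comma.
    ustack, kstack = stacks.rsplit(",", 1)
    uframes = [f for f in ustack.split(";") if f]
    kframes = [f for f in kstack.split(";") if f]
    return comm, state, uframes, kframes

def fold(lines):
    """Count identical root-first stacks, prefixed by comm and state."""
    counts = Counter()
    for line in lines:
        if not line[:1].isdigit():  # skip a CSV header or blank lines
            continue
        comm, state, uframes, kframes = split_stacks(line)
        # Leaf-first kernel + user frames, reversed, gives user root ... kernel leaf.
        frames = [OFFSET.sub("", f) for f in reversed(kframes + uframes)]
        counts[";".join([comm, state] + frames)] += 1
    return counts

for stack, n in fold(SAMPLE).items():
    print(stack, n)
```

Stripping the offsets makes samples that hit the same function at different instructions aggregate into one frame. If your xstack output is already root-first, skip the reversal.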
Further reading
- xstack reuses some ideas from my more powerful xcapture tool that I published a while ago
- If you want to see how I pull all these tools and techniques together into a dimensional performance analysis method using my upcoming xtop “top for wall-clock time” tool, come and join my xtop launch demo webinar (next Tuesday 19th Aug, 1pm EDT).
That’s all, see you soon!