This should be something useful for Linux kernel explorers and eBPF nerds!
Last year I released a tool called syscallargs that walked through the Linux /sys/kernel/debug/tracing/events/syscalls directory tree and let you query and list the system calls available on your current system from the command line. It also printed out syscall argument names and their datatypes, so it has saved me from opening man pages (or web pages) a few times.
Later I realized that since many other kinds of kernel events and tracepoints are also presented in the same /sys directory, in the same format as syscall events, we can examine all of them using the same approach. So I wrote a new, more general tool called tracepointargs and recently uploaded it to my 0x.tools GitHub repo.
This tool can examine the metadata of all Linux kernel tracepoints that the tracefs mounted under /sys has to offer. It also has a major improvement: it can access and expand the layouts of any kernel structures referenced by the tracepoint API. It won’t do any actual tracing for you, but if you find yourself constantly searching through the kernel source files to see how some tracepoint’s metadata is laid out, it can help.
For example, you can focus on a single tracepoint if you know its name, or use a regular expression to match many. The other arguments are similar to my previous syscallargs tool, but now you can access any tracepoint, like the random “filemap” one below:
$ sudo ./tracepointargs filemap/file_check_and_advance_wb_err -lit
filemap/file_check_and_advance_wb_err // 531
  0: (struct file *) file
  1: (unsigned long) i_ino
  2: (dev_t) s_dev
  3: (errseq_t) old
  4: (errseq_t) new
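If you are curious where a listing like this comes from: every tracepoint directory in tracefs has a `format` file with `name:`, `ID:` and one `field:` line per argument. Here is a minimal Python sketch of reading that layout. It is not the actual tracepointargs code; the `parse_format` helper is hypothetical, the offset/size values in the sample are placeholders, and skipping the `common_*` header fields is my assumption about how to get to the argument list shown above.

```python
# Illustrative sample in the layout of events/<subsystem>/<event>/format.
# The offset/size/signed values are placeholders, not copied from a real system.
SAMPLE_FORMAT = """\
name: file_check_and_advance_wb_err
ID: 531
format:
\tfield:unsigned short common_type;\toffset:0;\tsize:2;\tsigned:0;
\tfield:int common_pid;\toffset:4;\tsize:4;\tsigned:1;

\tfield:struct file * file;\toffset:8;\tsize:8;\tsigned:0;
\tfield:unsigned long i_ino;\toffset:16;\tsize:8;\tsigned:0;
\tfield:dev_t s_dev;\toffset:24;\tsize:4;\tsigned:0;
\tfield:errseq_t old;\toffset:28;\tsize:4;\tsigned:0;
\tfield:errseq_t new;\toffset:32;\tsize:4;\tsigned:0;

print fmt: (omitted)
"""


def parse_format(text):
    """Return (event_name, event_id, [(type, name), ...]) from format file text."""
    name = None
    event_id = None
    fields = []
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("name:"):
            name = line.split(":", 1)[1].strip()
        elif line.startswith("ID:"):
            event_id = int(line.split(":", 1)[1])
        elif line.startswith("field:"):
            # "field:struct file * file;" -> declaration "struct file * file"
            decl = line[len("field:"):].split(";", 1)[0].strip()
            # The last whitespace-separated token is the field name (may carry [N]).
            ctype, fname = decl.rsplit(None, 1)
            if fname.startswith("common_"):
                continue  # assumption: hide the header fields shared by all events
            fields.append((ctype, fname))
    return name, event_id, fields


if __name__ == "__main__":
    name, event_id, fields = parse_format(SAMPLE_FORMAT)
    print(f"{name} // {event_id}")
    for i, (ctype, fname) in enumerate(fields):
        print(f"  {i}: ({ctype}) {fname}")
```

On a real system you would read the text from the tracepoint's `format` file under the events directory instead of using the embedded sample, which typically requires root.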
We see that the first argument points to a struct file, so can we see exactly what’s behind it without browsing around? Yes! The -a option accesses and expands any known Linux kernel structs that these tracepoints use:
$ sudo ./tracepointargs filemap/file_check_and_advance_wb_err -lita
Error: --expand-structs (-a) requires --vmlinux
Oops! I am showing this error deliberately: you’ll first need to create a vmlinux.h file with all the interesting structure definitions from your kernel, using bpftool. You can install it via your package manager if needed; it’s pretty standard on modern Linuxes:
$ bpftool btf dump file /sys/kernel/btf/vmlinux format c > ~/vmlinux.h
Now we have a C header file containing lots of struct definitions, extracted from your currently running kernel! (In the future I might bundle a bunch of such header files for common kernel versions and platforms in this repo in advance, but I’m not focusing on that right now.)
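To give a feel for what "looking up a struct in vmlinux.h" involves, here is a small Python sketch that pulls one `struct <name> { ... }` definition out of header text by counting braces. The `extract_struct` helper and the tiny stand-in header are hypothetical; I am not claiming this is how tracepointargs reads the file, only that something along these lines is enough for simple cases.

```python
import re

# Tiny stand-in for vmlinux.h, using two small structs that appear later in this post.
SAMPLE_HEADER = """\
struct llist_node {
    struct llist_node *next;
};

struct callback_head {
    struct callback_head *next;
    void (*func)(struct callback_head *);
};
"""


def extract_struct(header_text, struct_name):
    """Return the text of 'struct <name> { ... }', or None if it is not defined."""
    pattern = r"^struct\s+" + re.escape(struct_name) + r"\s*\{"
    m = re.search(pattern, header_text, re.MULTILINE)
    if m is None:
        return None
    depth = 0
    # Start at the opening brace matched by the pattern and walk to its partner.
    for pos in range(m.end() - 1, len(header_text)):
        ch = header_text[pos]
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return header_text[m.start():pos + 1]
    return None  # unbalanced braces, give up


if __name__ == "__main__":
    print(extract_struct(SAMPLE_HEADER, "callback_head"))
```

Counting brace depth rather than stopping at the first `}` matters once a struct contains anonymous unions or nested struct bodies, which a generated vmlinux.h has plenty of.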
Let’s try again and access the newly created vmlinux.h file for struct expansion:
$ sudo ./tracepointargs filemap/file_check_and_advance_wb_err -lita --vmlinux ~/vmlinux.h
filemap/file_check_and_advance_wb_err // 531
  0: (struct file *) file
       struct file {
           struct callback_head f_task_work;
           struct llist_node f_llist;
           unsigned int f_iocb_flags;
       }
  1: (unsigned long) i_ino
  2: (dev_t) s_dev
  3: (errseq_t) old
  4: (errseq_t) new
We can see inside this structure now! But the file struct contains further structs. Can we recursively follow these nested structs and expand them too? Yes! The -f option does exactly that:
$ sudo ./tracepointargs filemap/file_check_and_advance_wb_err -litaf --vmlinux ~/vmlinux.h
filemap/file_check_and_advance_wb_err // 531
  0: (struct file *) file
       struct file {
           struct callback_head f_task_work;
               struct callback_head {
                   struct callback_head *next;
                   struct callback_head /* recursive reference */
                   void (*func)(struct callback_head *);
                   struct callback_head /* recursive reference */
               }
           struct llist_node f_llist;
               struct llist_node {
                   struct llist_node *next;
                   struct llist_node /* recursive reference */
               }
           unsigned int f_iocb_flags;
       }
  1: (unsigned long) i_ino
  2: (dev_t) s_dev
  3: (errseq_t) old
  4: (errseq_t) new
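The interesting part of recursive expansion is knowing when to stop: `struct callback_head` points to itself, so a naive walk would never finish. Below is a rough Python sketch of a depth-first expansion with a guard for structs that are already being expanded. The `STRUCTS` table just hardcodes the members from the output above, and the `expand` function is hypothetical; the real tool may well handle this differently.

```python
import re

# Member declarations copied from the output above; a real run would get
# these from vmlinux.h rather than a hardcoded table.
STRUCTS = {
    "file": [
        "struct callback_head f_task_work;",
        "struct llist_node f_llist;",
        "unsigned int f_iocb_flags;",
    ],
    "callback_head": [
        "struct callback_head *next;",
        "void (*func)(struct callback_head *);",
    ],
    "llist_node": [
        "struct llist_node *next;",
    ],
}


def expand(name, structs, indent=0, active=()):
    """Return output lines for struct `name`, expanding nested structs depth-first."""
    pad = "  " * indent
    lines = [f"{pad}struct {name} {{"]
    for member in structs[name]:
        lines.append(f"{pad}  {member}")
        for ref in re.findall(r"struct\s+(\w+)", member):
            if ref == name or ref in active:
                # Already being expanded further up the chain: note it and stop.
                lines.append(f"{pad}  struct {ref} /* recursive reference */")
            elif ref in structs:
                lines.extend(expand(ref, structs, indent + 1, active + (name,)))
            # Structs we have no definition for are left unexpanded.
    lines.append(f"{pad}}}")
    return lines


if __name__ == "__main__":
    print("\n".join(expand("file", STRUCTS)))
```

Tracking the chain of structs currently being expanded (instead of a global "seen" set) means the same struct can still be expanded again when it shows up in a different, unrelated member.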
Now we “see through” all the structures and what’s underneath them. This makes things like building new eBPF stuff a little easier!
You can omit the tracepoint name completely or use a permissive regular expression to see lots of tracepoints and the structs they come with. Here are a couple of examples from the system-call world (where my previous syscallargs tool did not dive into the related data structure layouts itself).
Who hasn’t wrestled with remembering the exact layout of the iocb struct? :-)
$ sudo ./tracepointargs io_submit --vmlinux ~/vmlinux.h -litaf
syscalls/sys_enter_io_submit // 1002
  0: (int) __syscall_nr
  1: (aio_context_t) ctx_id
  2: (long) nr
  3: (struct iocb * *) iocbpp
       struct iocb {
           __u64 aio_data;
           __u32 aio_key;
           __kernel_rwf_t aio_rw_flags;
           __u16 aio_lio_opcode;
           __s16 aio_reqprio;
           __u32 aio_fildes;
           __u64 aio_buf;
           __u64 aio_nbytes;
           __s64 aio_offset;
           __u64 aio_reserved2;
           __u32 aio_flags;
           __u32 aio_resfd;
       }
Here’s a longer one from io_uring:
$ sudo ./tracepointargs syscalls/sys_enter_io_uring_setup --vmlinux ~/vmlinux.h -litaf
syscalls/sys_enter_io_uring_setup // 1282
  0: (int) __syscall_nr
  1: (u32) entries
  2: (struct io_uring_params *) params
       struct io_uring_params {
           __u32 sq_entries;
           __u32 cq_entries;
           __u32 flags;
           __u32 sq_thread_cpu;
           __u32 sq_thread_idle;
           __u32 features;
           __u32 wq_fd;
           __u32 resv;
           struct io_sqring_offsets sq_off;
               struct io_sqring_offsets {
                   __u32 head;
                   __u32 tail;
                   __u32 ring_mask;
                   __u32 ring_entries;
                   __u32 flags;
                   __u32 dropped;
                   __u32 array;
                   __u32 resv1;
                   __u64 user_addr;
               }
           struct io_cqring_offsets cq_off;
               struct io_cqring_offsets {
                   __u32 head;
                   __u32 tail;
                   __u32 ring_mask;
                   __u32 ring_entries;
                   __u32 overflow;
                   __u32 cqes;
                   __u32 flags;
                   __u32 resv1;
                   __u64 user_addr;
               }
       }
If you run this tool for all possible (thousands of) tracepoints on your machine, you’ll see some pretty big and deep layers of structures attached and available to some tracepoints…
Of course you can just get a terse list of tracepoints and their arguments, without examining what’s behind their datatypes (and without needing the vmlinux.h file) too:
$ sudo ./tracepointargs 'block/*rq*' -t
block/block_getrq((dev_t) dev, (sector_t) sector, (unsigned int) nr_sector, (char) rwbs[8], (char) comm[16]);
block/block_rq_complete((dev_t) dev, (sector_t) sector, (unsigned int) nr_sector, (int) error, (char) rwbs[8], (__data_loc char[]) cmd);
block/block_rq_error((dev_t) dev, (sector_t) sector, (unsigned int) nr_sector, (int) error, (char) rwbs[8], (__data_loc char[]) cmd);
block/block_rq_insert((dev_t) dev, (sector_t) sector, (unsigned int) nr_sector, (unsigned int) bytes, (char) rwbs[8], (char) comm[16], (__data_loc char[]) cmd);
block/block_rq_issue((dev_t) dev, (sector_t) sector, (unsigned int) nr_sector, (unsigned int) bytes, (char) rwbs[8], (char) comm[16], (__data_loc char[]) cmd);
block/block_rq_merge((dev_t) dev, (sector_t) sector, (unsigned int) nr_sector, (unsigned int) bytes, (char) rwbs[8], (char) comm[16], (__data_loc char[]) cmd);
block/block_rq_remap((dev_t) dev, (sector_t) sector, (unsigned int) nr_sector, (dev_t) old_dev, (sector_t) old_sector, (unsigned int) nr_bios, (char) rwbs[8]);
block/block_rq_requeue((dev_t) dev, (sector_t) sector, (unsigned int) nr_sector, (char) rwbs[8], (__data_loc char[]) cmd);
Here’s a part of the same data as above, just formatted differently:
$ sudo ./tracepointargs 'block/*rq*' -lit
block/block_getrq // 1250
  0: (dev_t) dev
  1: (sector_t) sector
  2: (unsigned int) nr_sector
  3: (char) rwbs[8]
  4: (char) comm[16]

block/block_rq_complete // 1262
  0: (dev_t) dev
  1: (sector_t) sector
  2: (unsigned int) nr_sector
  3: (int) error
  4: (char) rwbs[8]
  5: (__data_loc char[]) cmd

block/block_rq_error // 1261
  0: (dev_t) dev
  1: (sector_t) sector
  2: (unsigned int) nr_sector
  3: (int) error
  4: (char) rwbs[8]
  5: (__data_loc char[]) cmd
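Both views are the same (type, name) pairs rendered two ways, which is easy to mimic if you want to post-process tracepoint metadata yourself. A minimal Python sketch follows, using the `block/block_getrq` fields from the output above. The function names are hypothetical and the two-space indent in the listing form is my own choice, not something the tool guarantees.

```python
# One tracepoint from the listings above: name, event ID and (type, name) pairs.
GETRQ_NAME = "block/block_getrq"
GETRQ_ID = 1250
GETRQ_FIELDS = [
    ("dev_t", "dev"),
    ("sector_t", "sector"),
    ("unsigned int", "nr_sector"),
    ("char", "rwbs[8]"),
    ("char", "comm[16]"),
]


def format_terse(name, fields):
    """One line per tracepoint, shaped like a C function prototype."""
    args = ", ".join(f"({ctype}) {fname}" for ctype, fname in fields)
    return f"{name}({args});"


def format_listing(name, event_id, fields):
    """Header line with the event ID, then one numbered line per argument."""
    lines = [f"{name} // {event_id}"]
    lines += [f"  {i}: ({ctype}) {fname}" for i, (ctype, fname) in enumerate(fields)]
    return "\n".join(lines)


if __name__ == "__main__":
    print(format_terse(GETRQ_NAME, GETRQ_FIELDS))
    print(format_listing(GETRQ_NAME, GETRQ_ID, GETRQ_FIELDS))
```

The terse form is handy for grepping across many tracepoints, while the numbered form makes it easier to refer to an argument by position.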
That’s all! Hopefully this saves you a few seconds here and there.