fprobes and fprobe events implementation from Codasip #19

Open
martin-kaiser wants to merge 13 commits into CHERI-Alliance:codasip-cheri-riscv-6.18 from martin-kaiser:fprobes_codasip
@martin-kaiser

Here's the fprobes implementation from Codasip that was reviewed and merged internally.

The first part updates the code that defines and attaches probes (each of which has an entry and an exit function), the KUnit tests for fprobes, and a demo module.

The second part modifies fprobe events, i.e. the code that parses a config string from /sys/kernel/tracing/dynamic_events and creates an fprobe and a dynamic event.

The config defines the function to which the probe is attached and the data that the probe should read. The probe has access to function arguments, return value, stack content and memory. Memory locations are defined by an absolute address or a symbol name, both of which may have an additional offset. The acquired data is presented to userspace as a dynamic event.

Cherifying the code revealed an OOB write when resolving a symbol name.

Most of the parsing code is shared between the different probe types. This pull request modifies only the parts that are used by fprobes.

The idea is to read stack and registers as capabilities. For now, the metadata is not stored in the dynamic event. For the memory access to a user-provided address + offset, our only option is to fabricate a capability.

I suppose that I've tested a good part of the possible combinations for the dynamic events.

martin-kaiser and others added 13 commits April 13, 2026 09:58
If fprobe_entry does not fill the allocated fgraph_data completely, the
unused part does not have to be zeroed.

fgraph_data is a short-lived part of the shadow stack. The preceding
length field allows locating the end regardless of the content.

Signed-off-by: Martin Kaiser <martin@kaiser.cx>
Use DIV_ROUND_UP in two places where the division + round up is
hand coded.

Signed-off-by: Martin Kaiser <martin@kaiser.cx>
Code that registers with the function graph tracer may request a private
data area to store data between function entry and return. This area is
allocated on the return stack, the caller requests a size in bytes.

The fgraph code calculates the number of return stack words for the
requested number of bytes. Fix this calculation for cheri, where the
return stack words are uintptr_t instead of unsigned long.

Signed-off-by: Martin Kaiser <martin.kaiser@codasip.com>
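
The word-count calculation described in the commit above can be illustrated with a small user-space sketch. This is not the kernel code: `ret_stack_word_t` and `bytes_to_words` are hypothetical names chosen for illustration, and `DIV_ROUND_UP` is redefined locally in the usual kernel form. The point is that the divisor has to be the size of one return stack word, which on cheri is `sizeof(uintptr_t)` rather than `sizeof(unsigned long)`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local copy of the usual round-up division helper. */
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/*
 * Hypothetical name for the element type of the return stack.
 * uintptr_t is capability-sized on cheri; on common non-cheri 64-bit
 * targets it has the same size as unsigned long.
 */
typedef uintptr_t ret_stack_word_t;

/* Number of return stack words needed to hold size_bytes bytes. */
static size_t bytes_to_words(size_t size_bytes)
{
	return DIV_ROUND_UP(size_bytes, sizeof(ret_stack_word_t));
}

/* Small self-check: partial words are rounded up, full words are not. */
static void bytes_to_words_selfcheck(void)
{
	assert(bytes_to_words(0) == 0);
	assert(bytes_to_words(1) == 1);
	assert(bytes_to_words(sizeof(ret_stack_word_t)) == 1);
	assert(bytes_to_words(sizeof(ret_stack_word_t) + 1) == 2);
}
```

Dividing by `sizeof(unsigned long)` instead would reserve more words than needed wherever a return stack word is wider than an `unsigned long`.
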
fprobe allocates a private data chunk from fgraph. This chunk starts with
an fprobe header. The header contains a struct fprobe pointer and the
length of the following private data.

On 64-bit systems, the two header fields are packed into one unsigned
long variable. An architecture may provide functions for packing and
unpacking. The generic versions take the pointer and replace the
higher bits with the length field. This assumes that the highest bits are
all 1 for a pointer to kernel data.

On a cheri system, replacing the highest bits of an address may move the
capability out of bounds. Fall back to using a struct for the two header
elements.

(We could also define cheri versions for the pack/unpack functions, but
 that's probably not worth the effort.)

Signed-off-by: Martin Kaiser <martin.kaiser@codasip.com>
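
The two header layouts described in the commit above can be sketched in user space as follows. This is an illustration under assumptions, not the kernel's implementation: the 4-bit size field, the macro names and the function names are all hypothetical, and addresses are modelled as plain `uint64_t` values. The packed variant only round-trips for addresses whose top bits are all 1, which is the assumption the commit message mentions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout: the top SIZE_BITS bits carry the data size in words. */
#define SIZE_BITS   4
#define SIZE_SHIFT  (64 - SIZE_BITS)
#define ADDR_MASK   ((UINT64_C(1) << SIZE_SHIFT) - 1)
/* Assumption: a pointer to kernel data has all of its top SIZE_BITS bits set. */
#define MSB_PATTERN (~ADDR_MASK)

/* Packed variant: overwrite the top bits of the address with the size. */
static uint64_t pack_header(uint64_t fp_addr, unsigned int size_words)
{
	return ((uint64_t)size_words << SIZE_SHIFT) | (fp_addr & ADDR_MASK);
}

/* Restore the address by putting the assumed all-ones pattern back. */
static uint64_t unpack_fp(uint64_t hdr)
{
	return (hdr & ADDR_MASK) | MSB_PATTERN;
}

static unsigned int unpack_size(uint64_t hdr)
{
	return (unsigned int)(hdr >> SIZE_SHIFT);
}

/*
 * Struct fallback: the pointer is stored unmodified next to the size,
 * so its value (and, on cheri, its bounds) is never touched.
 */
struct header_fallback {
	void *fp;
	unsigned int size_words;
};

static struct header_fallback make_header(void *fp, unsigned int size_words)
{
	struct header_fallback hdr = { .fp = fp, .size_words = size_words };
	return hdr;
}
```

The packed word is not a usable pointer while it is stored, which is why rewriting the top address bits is a problem once the value is a capability with bounds.
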
Apply the usual fixups to make the code work with cheri.

Use a fake pointer to calculate the key for the hash tables.

An fprobe's private data area is allocated from the return stack, which
is an array of uintptr_t.

The return address is read from a capability-sized register. Pass it to
the fprobe handlers as a capability.

Remove the packed attribute from struct __fprobe_header for cheri to
ensure that the struct is capability-aligned even if it's used in an
array.

Signed-off-by: Martin Kaiser <martin.kaiser@codasip.com>
Downgrade two capabilities to addresses to fix clang-tidy warnings.

Signed-off-by: Martin Kaiser <martin@kaiser.cx>
Enable fprobes, they have been adapted for cheri.

Keep the dynamic_events interface to fprobes disabled. This part needs
more fixups for cheri.

Signed-off-by: Martin Kaiser <martin.kaiser@codasip.com>
Update the fprobe_example demo module for cheri.

The entry and exit handlers get a full capability for ret_ip.

Addresses are converted to fake pointers for printing.

Signed-off-by: Martin Kaiser <martin.kaiser@codasip.com>
clang-tidy complains about a const struct btf_param * that is cast to
const char **.

kernel/trace/trace_probe.c:1744:11: warning: CHERI: Incompatible
pointer target types in cast [cheri-PtrToIntCast]

 1744 |                         return (const char **)params;

It seems that params is initialised with NULL and used only in this
return statement. Remove params and return NULL directly.

Signed-off-by: Martin Kaiser <martin.kaiser@codasip.com>
fetch_store_symstring sets __dest = base + offset, __dest points to
maxlen bytes of memory.

__dest is passed to sprint_symbol, which expects an input buffer of
KSYM_SYMBOL_LEN bytes. sprint_symbol writes a 0-byte at the end of this
input buffer.

Introduce a temporary buffer to fix this oob write.

Signed-off-by: Martin Kaiser <martin.kaiser@codasip.com>
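
The temporary-buffer approach from the commit above can be sketched in user space like this. Everything here is a stand-in: `SYMBOL_BUF_LEN` replaces `KSYM_SYMBOL_LEN` with an arbitrary value, `fake_sprint_symbol` is a hypothetical substitute that mimics only the behaviour the commit message describes (it terminates the buffer at its last byte), and `store_symstring` is an illustrative name, not the kernel function.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Assumption: stand-in for KSYM_SYMBOL_LEN; the real value differs. */
#define SYMBOL_BUF_LEN 128

/*
 * Hypothetical stand-in for sprint_symbol(). It assumes buf holds
 * SYMBOL_BUF_LEN bytes and writes a 0-byte at the very end of it.
 */
static int fake_sprint_symbol(char *buf, unsigned long addr)
{
	int len = snprintf(buf, SYMBOL_BUF_LEN, "sym_%lx+0x0/0x10", addr);

	buf[SYMBOL_BUF_LEN - 1] = '\0';
	return len;
}

/*
 * Format into a full-size temporary buffer, then copy at most maxlen
 * bytes (terminator included) into dest. Returns the bytes stored.
 */
static size_t store_symstring(char *dest, size_t maxlen, unsigned long addr)
{
	char tmp[SYMBOL_BUF_LEN];
	size_t len;

	if (maxlen == 0)
		return 0;

	fake_sprint_symbol(tmp, addr);
	len = strlen(tmp);
	if (len >= maxlen)
		len = maxlen - 1;	/* truncate to fit the destination */
	memcpy(dest, tmp, len);
	dest[len] = '\0';
	return len + 1;
}
```

Passing `dest` directly to the formatting function would let it write its terminator `SYMBOL_BUF_LEN - 1` bytes past the start of `dest`, regardless of `maxlen`.
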
Update trace_fprobe for cheri. This is the code that creates fprobes from
input strings written to /sys/kernel/tracing/dynamic_events.

Treat the stack as an array of capabilities in
ftrace_regs_get_kernel_stack_nth.

Fix a size check for the shadow stack. On cheri, it is a uintptr_t array.

The fentry_trace_entry_head and fexit_trace_entry_head structs are stored
in the ringbuffer. Addresses in those structs must be downgraded.

The parsing of dynamic_events' input strings creates a sequence of fetch
operations which are processed in four stages.

We use a capability in stage 1 where data is read from the stack, from a
register or from struct task_struct. The metadata aren't used at the
moment, we're planning to pass them to the generated dynamic events in
the future.

edata in stage 1 is data that was collected at function entry and stored
on the shadow stack. When it's processed at function exit, it must be
read as a uintptr_t array.

Stages 2-4 may access memory at a user-provided address + offset. Our only
option is to fabricate a capability for each access.

Signed-off-by: Martin Kaiser <martin.kaiser@codasip.com>
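
Two of the points in the commit above, reading the stack as an array of capability-sized words and downgrading to a plain address before a value is stored in a ringbuffer record, can be sketched as follows. All names (`stack_nth`, `entry_record`, `make_record`) are hypothetical, and the bounds check takes an explicit depth instead of consulting real register state. On a non-cheri build `uintptr_t` and `unsigned long` usually have the same size, so the sketch only shows where the conversions sit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Read the n-th word of a stack whose entries are uintptr_t
 * (capability-sized on cheri). Returns 0 when n is out of range.
 */
static uintptr_t stack_nth(const uintptr_t *sp, size_t depth_words,
			   unsigned int n)
{
	if (n >= depth_words)
		return 0;
	return sp[n];
}

/*
 * Hypothetical record layout: records that end up in the ringbuffer
 * carry plain addresses, not capabilities.
 */
struct entry_record {
	unsigned long ip;
};

/* Downgrade: keep only the address part of the value. */
static struct entry_record make_record(uintptr_t ip_cap)
{
	struct entry_record rec = { .ip = (unsigned long)ip_cap };
	return rec;
}
```
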
Update the parameters for probe_mem_read and probe_mem_read_user to match
the common versions that we've modified for cheri.

Signed-off-by: Martin Kaiser <martin.kaiser@codasip.com>
Fprobe events should now be working on cheri. Enable them.

Signed-off-by: Martin Kaiser <martin.kaiser@codasip.com>