connect()
static __always_inline int trace_ret_generic(u32 id, struct pt_regs *ctx, u64 types, u32 scope)
{
if (skip_syscall())
return 0;
sys_context_t context = {};
args_t args = {};
if (ctx == NULL)
return 0;
if (load_args(id, &args) != 0)
return 0;
init_context(&context);
context.event_id = id;
context.argnum = get_arg_num(types);
context.retval = PT_REGS_RC(ctx);
// skip if No such file/directory or if there is an EINPROGRESS
// EINPROGRESS error, happens when the socket is non-blocking and the connection cannot be completed immediately.
if (context.retval == -2 || context.retval == -115)
{
return 0;
}
if (context.retval >= 0 && drop_syscall(scope))
{
return 0;
}
set_buffer_offset(DATA_BUF_TYPE, sizeof(sys_context_t));
bufs_t *bufs_p = get_buffer(DATA_BUF_TYPE);
if (bufs_p == NULL)
return 0;
save_context_to_buffer(bufs_p, (void *)&context);
save_args_to_buffer(types, &args);
events_perf_submit(ctx);
return 0;
}
Overview
- The function starts by checking whether skip_syscall() returns a truthy value. If it does, the function returns 0 and exits early.
- Two local variables, context and args, are declared and initialized. context is of type sys_context_t and args is of type args_t.
- If the ctx parameter is NULL, the function returns 0 and exits early.
- The function calls load_args, passing id and a pointer to args. If load_args returns a non-zero value (indicating an error), the function returns 0 and exits early.
- The init_context function is called to initialize the context variable.
- Several fields of the context structure are set: event_id is set to id, argnum is set to the result of get_arg_num(types), and retval is set to the return value of the system call, obtained from PT_REGS_RC(ctx).
- If retval is -2 or -115 (the "No such file or directory" and EINPROGRESS errors, respectively), the function returns 0 and exits early.
- If retval is greater than or equal to 0 and drop_syscall(scope) returns a truthy value, the function returns 0 and exits early.
- The set_buffer_offset function is called to set the buffer offset for DATA_BUF_TYPE to sizeof(sys_context_t).
- The get_buffer function is called to retrieve a pointer to the buffer of type DATA_BUF_TYPE. If the pointer is NULL, the function returns 0 and exits early.
- The save_context_to_buffer function is called to save the context structure to the buffer pointed to by bufs_p.
- The save_args_to_buffer function is called to save the arguments (args) to the buffer based on the provided types.
- The events_perf_submit function is called, passing ctx as an argument. Finally, the function returns 0.
Overall, this function seems to be part of a tracing mechanism for system calls.
Breakdown of the function
static __always_inline int trace_ret_generic(u32 id, struct pt_regs *ctx, u64 types, u32 scope)
- static: limits the visibility of the function to the translation unit (source file) where it is defined, so it cannot be called from other source files.
- __always_inline: a compiler-specific attribute that asks the compiler to inline the function at every call site, regardless of the optimization level set by the user.
- int: the return type of the function, which in this case is an integer.
- trace_ret_generic: the name of the function.
- (u32 id, struct pt_regs *ctx, u64 types, u32 scope): the function parameters. It expects four arguments:
  - id of type u32 (unsigned 32-bit integer)
  - ctx of type struct pt_regs * (a pointer to a structure of type pt_regs)
  - types of type u64 (unsigned 64-bit integer)
  - scope of type u32 (unsigned 32-bit integer)
- The function is intended to be used for tracing and returning a value of type int.
sys_context_t context = {};
1.1 Required definitions
typedef struct __attribute__((__packed__)) sys_context
{
u64 ts;
u32 pid_id;
u32 mnt_id;
u32 host_ppid;
u32 host_pid;
u32 ppid;
u32 pid;
u32 uid;
u32 event_id;
u32 argnum;
s64 retval;
char comm[TASK_COMM_LEN];
} sys_context_t;
The line sys_context_t context = {}; initializes a variable named context of type sys_context_t using an empty initializer {}, which sets every member to zero.
__attribute__((__packed__)): This attribute indicates that the structure should be packed tightly, without any padding between members. The structure's memory layout is therefore compact and contains no unused bytes inserted for alignment.
The members of the sys_context structure are defined as follows:
- u64 ts: An unsigned 64-bit integer representing a timestamp.
- u32 pid_id: An unsigned 32-bit integer identifying the PID namespace.
- u32 mnt_id: An unsigned 32-bit integer identifying the mount namespace.
- u32 host_ppid: An unsigned 32-bit integer representing the host's parent process identifier (PPID).
- u32 host_pid: An unsigned 32-bit integer representing the host's process identifier (PID).
- u32 ppid: An unsigned 32-bit integer representing the parent process identifier (PPID).
- u32 pid: An unsigned 32-bit integer representing the process identifier (PID).
- u32 uid: An unsigned 32-bit integer representing the user identifier (UID).
- u32 event_id: An unsigned 32-bit integer representing the event identifier.
- u32 argnum: An unsigned 32-bit integer representing the number of arguments.
- s64 retval: A signed 64-bit integer representing the return value of a system call.
- char comm[TASK_COMM_LEN]: An array of characters holding the command name associated with the process. TASK_COMM_LEN is a predefined constant specifying the maximum length of the command name.
Overall, this structure stores context information related to a system call or process, such as timestamps, process identifiers, user identifiers, event information, and return values. The __attribute__((__packed__)) attribute ensures that the structure's memory layout is tightly packed without any padding.
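To make the effect of packing concrete, here is a minimal user-space sketch that declares the same member list twice, once packed and once not, so the two sizes can be compared. The fixed-width types from stdint.h stand in for u64/u32/s64, and TASK_COMM_LEN is assumed to be 16 (the value the Linux kernel uses); the struct names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TASK_COMM_LEN 16 /* assumption: the value used by the Linux kernel */

/* Same member list as sys_context_t, packed */
typedef struct __attribute__((__packed__)) packed_context {
    uint64_t ts;
    uint32_t pid_id, mnt_id, host_ppid, host_pid, ppid, pid, uid;
    uint32_t event_id, argnum;
    int64_t retval;
    char comm[TASK_COMM_LEN];
} packed_context_t;

/* Same member list without the attribute, for comparison */
typedef struct unpacked_context {
    uint64_t ts;
    uint32_t pid_id, mnt_id, host_ppid, host_pid, ppid, pid, uid;
    uint32_t event_id, argnum;
    int64_t retval;
    char comm[TASK_COMM_LEN];
} unpacked_context_t;
```

Under these assumptions the packed layout is 8 + 9 * 4 + 8 + 16 = 68 bytes, with retval at offset 44. Without packing, a typical 64-bit ABI aligns retval to 8 bytes, which inserts padding before it and makes the structure larger.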
- This check ensures that the function can safely handle the case when the ctx pointer is NULL and avoids potential issues or crashes that may occur when trying to access or manipulate data through a null pointer
if (ctx == NULL)
return 0;
- ctx is a pointer of type struct pt_regs *. It is checked against NULL, which would indicate that it doesn't point to a valid memory location.
- If ctx is indeed NULL, the function trace_ret_generic exits early and returns 0.
- In summary, the load_args function retrieves previously saved arguments from a BPF map (args_map) using a key built from the event_id and the value stored in tgid (the lower 32 bits of bpf_get_current_pid_tgid()). It copies the retrieved arguments to the provided args structure and removes the entry from the map. If the lookup fails, it returns -1.
if (load_args(id, &args) != 0)
return 0;
3.1 Required definition
static __always_inline int load_args(u32 event_id, args_t *args)
{
u32 tgid = bpf_get_current_pid_tgid();
u64 id = ((u64)event_id << 32) | tgid;
args_t *saved_args = bpf_map_lookup_elem(&args_map, &id);
if (saved_args == 0)
{
return -1; // missed entry or not a container
}
args->args[0] = saved_args->args[0];
args->args[1] = saved_args->args[1];
args->args[2] = saved_args->args[2];
args->args[3] = saved_args->args[3];
args->args[4] = saved_args->args[4];
args->args[5] = saved_args->args[5];
bpf_map_delete_elem(&args_map, &id);
return 0;
}
The statement if (load_args(id, &args) != 0) return 0; is another early-exit check in the trace_ret_generic function. Here's what it does:
The code calls the load_args function, passing id and a pointer to the args structure (&args) as arguments.
The result of load_args is compared to 0 using the inequality operator !=. As the definition above shows, load_args returns 0 on success and -1 on failure.
If the result is not 0, trace_ret_generic exits early and returns 0.
This check handles the case where load_args finds no saved arguments for the current call. By returning 0, the function stops without submitting an event.
The load_args function works as follows:
- The function takes two parameters: event_id of type u32 and args of type args_t *, a pointer to an args_t structure.
- It calls bpf_get_current_pid_tgid(), which returns a 64-bit value holding the thread group ID in the upper 32 bits and the thread ID in the lower 32 bits. Assigning the result to the u32 variable tgid keeps only the lower 32 bits, so despite its name the variable holds the thread ID.
- The variable id is created by shifting event_id left by 32 bits and combining it with tgid using a bitwise OR.
- The bpf_map_lookup_elem function is called, passing the address of args_map and the address of id. It attempts to look up an element in the args_map BPF map using id as the key.
- If the return value saved_args is 0 (indicating a missed entry or not a container), the function returns -1 to indicate an error.
- If a valid saved_args element is found in the map, its six elements are copied to the corresponding elements of args using assignments.
- After the values are copied, the bpf_map_delete_elem function is called to remove the element from args_map using id as the key.
- Finally, the function returns 0 to indicate successful loading of arguments.
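The key construction in load_args can be tried out in user space. The sketch below is not the original code: make_args_key is a hypothetical helper that takes the 64-bit value bpf_get_current_pid_tgid() would return as a plain parameter, then applies the same truncation, shift and OR.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper mirroring the key construction in load_args.
 * pid_tgid plays the role of bpf_get_current_pid_tgid()'s return value. */
static inline uint64_t make_args_key(uint32_t event_id, uint64_t pid_tgid)
{
    /* Converting to a 32-bit type keeps only the lower 32 bits */
    uint32_t low = (uint32_t)pid_tgid;

    /* event_id occupies the upper half of the key, the task value the lower */
    return ((uint64_t)event_id << 32) | low;
}
```

For example, with event_id 42 and a pid_tgid value whose upper half is 1234 and lower half is 5678, the key is (42 << 32) | 5678; the upper half of pid_tgid does not take part in the key.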
- Calls a function named init_context and passes the address of the context variable as an argument (&context).
init_context(&context);
- init_context is a function that initializes the context structure with default values or performs some necessary setup.
- Because it receives the address of the context variable, init_context can modify the contents of the context structure directly.
- The function starts by getting the current task's task_struct pointer using bpf_get_current_task() and assigns it to the local variable task.
- The timestamp (ts) of the context structure is set to the current time in nanoseconds using bpf_ktime_get_ns().
- The host_ppid field of the context structure is set by calling the get_task_ppid function, passing the task pointer.
- The host_pid field of the context structure is set by shifting the result of bpf_get_current_pid_tgid() by 32 bits to the right.
- The uid field of the context structure is set to the current user identifier (UID) obtained from bpf_get_current_uid_gid().
- The command name (comm) of the current task is retrieved using bpf_get_current_comm and stored in the context->comm array with a size of sizeof(context->comm).
Finally, the function returns 0 to indicate successful initialization of the context structure.
The updated init_context function initializes the ts, host_ppid, host_pid, uid, and comm fields of the sys_context_t structure based on the current task's information.
4.1 Required Definition
static __always_inline u32 get_task_ppid(struct task_struct *task)
{
struct task_struct *parent = READ_KERN(task->parent);
return READ_KERN(parent->pid);
}
Defines a function get_task_ppid that takes a pointer to a struct task_struct named task as an argument. It is marked with the __always_inline attribute, indicating that it should be inlined by the compiler.
Inside the function, it performs the following steps:
- It declares a local variable parent of type struct task_struct *.
- It reads the value of task->parent using the READ_KERN macro, which reads from kernel memory.
- It assigns the value read from task->parent to the parent variable.
- It reads the value of parent->pid using the READ_KERN macro.
- It returns the value read from parent->pid as the result of the function.
Overall, the get_task_ppid function retrieves the parent process ID (PPID) of a given task by accessing the parent field of the task_struct and reading its pid field.
4.2 Required Definition
#define GET_FIELD_ADDR(field) &field
#define READ_KERN(ptr) \
({ \
typeof(ptr) _val; \
__builtin_memset((void *)&_val, 0, sizeof(_val)); \
bpf_probe_read((void *)&_val, sizeof(_val), &ptr); \
_val; \
})
Provides two macros: GET_FIELD_ADDR and READ_KERN.
The GET_FIELD_ADDR(field) macro takes an lvalue expression as an argument and returns its address. For example, for a structure variable s with a field named my_field, GET_FIELD_ADDR(s.my_field) expands to &s.my_field.
The READ_KERN(ptr) macro reads the value of the expression ptr (for example, task->parent) from kernel memory. It uses the bpf_probe_read function so that an invalid access results in an error return rather than a fault. The macro is defined using a GCC extension called a statement expression, denoted by ({ … }). Within the macro:
a. It declares a local variable _val with the same type as ptr.
b. It uses __builtin_memset to zero out the memory occupied by _val.
c. It calls bpf_probe_read to copy sizeof(_val) bytes from the address of ptr (&ptr) into _val.
d. Finally, the last expression, _val, becomes the value of the whole macro.
Because _val is zeroed first and the return value of bpf_probe_read is not checked, a failed read yields a zero value.
The purpose of the READ_KERN macro is to provide a safe mechanism to read values from kernel memory within the eBPF program, using the bpf_probe_read function.
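The statement-expression pattern can be tried outside the kernel. The following is only a user-space sketch, not the original macro: fake_probe_read is a hypothetical stand-in for bpf_probe_read implemented with memcpy, and struct fake_task is a made-up two-field replacement for task_struct. It relies on the same GCC/Clang extensions (statement expressions and __typeof__) as the original.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for bpf_probe_read(dst, size, src).
 * The real helper can fail; this one always succeeds. */
static inline int fake_probe_read(void *dst, unsigned int size, const void *src)
{
    memcpy(dst, src, size);
    return 0;
}

/* Same shape as READ_KERN: declare, zero, copy, yield the value */
#define READ_SKETCH(expr)                                      \
    ({                                                         \
        __typeof__(expr) _val;                                 \
        memset(&_val, 0, sizeof(_val));                        \
        fake_probe_read(&_val, sizeof(_val), &(expr));         \
        _val;                                                  \
    })

/* Made-up replacement for struct task_struct */
struct fake_task {
    struct fake_task *parent;
    int pid;
};

/* Mirrors get_task_ppid: read the parent pointer, then the parent's pid */
static inline int fake_get_ppid(struct fake_task *task)
{
    struct fake_task *parent = READ_SKETCH(task->parent);
    return READ_SKETCH(parent->pid);
}
```

Given a child task whose parent has pid 1, fake_get_ppid returns 1. In the real macro the copy goes through bpf_probe_read, which is what makes dereferencing kernel pointers acceptable to the BPF verifier.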
- These assignments populate the event_id, argnum, and retval fields of the sys_context_t structure with information about the traced syscall.
context.event_id = id;
context.argnum = get_arg_num(types);
context.retval = PT_REGS_RC(ctx);
- The event_id field is assigned the value of the id variable. This value identifies the specific event or syscall being traced.
- The argnum field is assigned the result of the get_arg_num function, which takes the types parameter as an argument.
- The retval field is assigned the return value of the traced syscall, obtained from PT_REGS_RC(ctx). PT_REGS_RC is a macro that extracts the return value from the pt_regs structure pointed to by the ctx parameter.
5.1 Required definition
static __always_inline int get_arg_num(u64 types)
{
unsigned int i, argnum = 0;
#pragma unroll
for (i = 0; i < MAX_ARGS; i++)
{
if (DEC_ARG_TYPE(i, types) != NONE_T)
argnum++;
}
return argnum;
}
The get_arg_num function calculates the number of arguments based on the types parameter, which is of type u64 (unsigned 64-bit integer).
Here's how the function works:
- The function declares two variables: i, the loop counter, and argnum, which counts the number of non-NONE_T argument types.
- The #pragma unroll directive asks the compiler to unroll the loop. In eBPF code this is commonly done so that the verifier sees straight-line code, since older kernels reject programs that contain loops. The pragma is a hint, and its effect may vary depending on the compiler.
- The loop iterates MAX_ARGS times (a constant defined as 6), starting from 0 and incrementing i by 1 in each iteration.
- Inside the loop, the function checks whether the argument type at index i, obtained from DEC_ARG_TYPE(i, types), is not equal to NONE_T. If so, argnum is incremented.
- After the loop completes, the function returns the final value of argnum, which is the number of non-NONE_T argument types encountered.
In summary, the get_arg_num function iterates over the argument indices and counts the number of non-NONE_T argument types encoded in the provided types value.
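The counting logic can be reproduced in plain C. This is a sketch under assumptions rather than the original code: NONE_T is assumed to be 0 (its definition is not shown in this article), count_args is a hypothetical name, and DEC_ARG_TYPE is written with extra parentheses around its arguments.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_ARGS 6
#define NONE_T 0ULL /* assumption: the "no argument" type is encoded as 0 */
#define DEC_ARG_TYPE(n, type) (((type) >> (8 * (n))) & 0xFF)

/* Hypothetical user-space counterpart of get_arg_num */
static inline int count_args(uint64_t types)
{
    int argnum = 0;

    /* One byte per argument slot; only the first MAX_ARGS bytes are examined */
    for (unsigned int i = 0; i < MAX_ARGS; i++) {
        if (DEC_ARG_TYPE(i, types) != NONE_T)
            argnum++;
    }
    return argnum;
}
```

With types equal to 0x01100F (bytes 0x0F, 0x10 and 0x01 in slots 0 to 2), count_args returns 3. A value that only has bits set in bytes 6 or 7 counts as zero arguments, because the loop stops at MAX_ARGS.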
5.2 Required definition
#define DEC_ARG_TYPE(n, type) ((type >> (8 * n)) & 0xFF)
The macro DEC_ARG_TYPE(n, type) takes two arguments:
- n: the index of the argument type to be extracted.
- type: the input value from which the argument type is extracted.
Here's how the macro works:
- The macro shifts the type value to the right by 8 * n bits. This moves the desired argument type into the least significant byte.
- The & 0xFF operation masks all but the least significant byte, so that only the desired argument type is retained.
- The resulting value is the argument type stored at the specified index.
In summary, the DEC_ARG_TYPE macro extracts the argument type at the given index from the provided type value. The 8 * n bit shift aligns the desired argument type, and the & 0xFF operation masks the value to retain only the least significant byte.
5.3 Required definition
#define MAX_ARGS 6
#define ENC_ARG_TYPE(n, type) type << (8 * n)
#define ARG_TYPE0(type) ENC_ARG_TYPE(0, type)
#define ARG_TYPE1(type) ENC_ARG_TYPE(1, type)
#define ARG_TYPE2(type) ENC_ARG_TYPE(2, type)
#define ARG_TYPE3(type) ENC_ARG_TYPE(3, type)
#define ARG_TYPE4(type) ENC_ARG_TYPE(4, type)
#define ARG_TYPE5(type) ENC_ARG_TYPE(5, type)
This defines several macros related to argument types. Let's go through each of them:
MAX_ARGS: This macro defines the maximum number of arguments as 6.
ENC_ARG_TYPE(n, type): This macro takes two arguments:
- n: the index of the argument type.
- type: the value of the argument type.
The macro left-shifts the type value by 8 * n bits, encoding the argument type at the specified index.
ARG_TYPE0, ARG_TYPE1, ARG_TYPE2, ARG_TYPE3, ARG_TYPE4, ARG_TYPE5: These are convenience macros for encoding argument types at specific indices. Each takes a single argument, type, and uses ENC_ARG_TYPE to encode it at the corresponding index.
For example, to encode an argument type at index 2, use the ARG_TYPE2 macro: ARG_TYPE2(my_argument_type). This left-shifts the my_argument_type value by 8 * 2 bits.
These macros provide a convenient way to encode and manipulate argument types based on their respective indices.
Example:
ARG_TYPE0(SOCK_DOM_T) | ARG_TYPE1(SOCK_TYPE_T) | ARG_TYPE2(INT_T)
This is a bitwise OR of the encoded argument types for index 0, index 1, and index 2. Here's what it means:
- ARG_TYPE0(SOCK_DOM_T): expands to the encoding of the SOCK_DOM_T argument type at index 0. SOCK_DOM_T has a value of 15UL, so ARG_TYPE0(SOCK_DOM_T) results in 15UL << (8 * 0), which is 15UL.
- ARG_TYPE1(SOCK_TYPE_T): expands to the encoding of the SOCK_TYPE_T argument type at index 1. SOCK_TYPE_T has a value of 16UL, so ARG_TYPE1(SOCK_TYPE_T) results in 16UL << (8 * 1), which is 4096UL.
- ARG_TYPE2(INT_T): expands to the encoding of the INT_T argument type at index 2. INT_T has a value of 1UL, so ARG_TYPE2(INT_T) results in 1UL << (8 * 2), which is 65536UL.
The bitwise OR operation (|) is applied between these encoded values, resulting in the final value:
15UL | 4096UL | 65536UL = 69647UL (0x1100F)
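The arithmetic in this example can be checked with a small program. The sketch below uses the type values quoted in the text (SOCK_DOM_T = 15, SOCK_TYPE_T = 16, INT_T = 1); the function example_types is hypothetical, and the macros are written with extra parentheses and an explicit 64-bit cast, which the original definitions do not have.

```c
#include <assert.h>
#include <stdint.h>

/* Parenthesized variants of the macros discussed above */
#define ENC_ARG_TYPE(n, type) ((uint64_t)(type) << (8 * (n)))
#define DEC_ARG_TYPE(n, type) (((type) >> (8 * (n))) & 0xFF)

/* Type values as quoted in the text */
#define INT_T 1UL
#define SOCK_DOM_T 15UL
#define SOCK_TYPE_T 16UL

/* Hypothetical helper building the types value from the example */
static inline uint64_t example_types(void)
{
    return ENC_ARG_TYPE(0, SOCK_DOM_T) |
           ENC_ARG_TYPE(1, SOCK_TYPE_T) |
           ENC_ARG_TYPE(2, INT_T);
}
```

example_types() evaluates to 69647 (0x1100F), and decoding index 1 from that value with DEC_ARG_TYPE gives back 16, the value of SOCK_TYPE_T.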
6. Checks the value of the context.retval variable and returns 0 if it is equal to either -2 (ENOENT, no such file or directory) or -115 (EINPROGRESS). This condition skips further processing when the return value indicates a missing file or directory, or a non-blocking socket whose connection is still in progress.
if (context.retval == -2 || context.retval == -115)
{
return 0;
}
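The magic numbers in this check correspond to errno constants. As a sketch only, the same test can be written with the symbolic names from errno.h; should_skip_retval is a hypothetical helper, and on Linux ENOENT is 2 and EINPROGRESS is 115, which matches the literals used above.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical helper: the retval filter, written with errno names.
 * Syscalls report failure in-kernel as a negated errno value. */
static inline int should_skip_retval(int64_t retval)
{
    return retval == -ENOENT || retval == -EINPROGRESS;
}
```

A successful call (retval 0 or positive) and other errors such as -EACCES are not filtered by this check and continue to the next step.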
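7. The set_buffer_offset function updates the value associated with a key in the bufs_offset map, allowing the buffer offset to be set for a specific buffer type. Here the offset is set to sizeof(sys_context_t), so that argument data is written after the context structure.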
set_buffer_offset(DATA_BUF_TYPE, sizeof(sys_context_t));
7.1 Required definition
#define DATA_BUF_TYPE 0
7.2 Required definition
static __always_inline void set_buffer_offset(int buf_type, u32 off)
{
bpf_map_update_elem(&bufs_offset, &buf_type, &off, BPF_ANY);
}
- Defines a function called set_buffer_offset that takes two parameters: buf_type of type int and off of type u32. The function is declared with the __always_inline attribute, which asks the compiler to inline it.
- The function uses bpf_map_update_elem to update the value associated with the buf_type key in the bufs_offset map. It takes the address of buf_type as the key parameter, the address of off as the value parameter, and the BPF_ANY flag, which means the entry is created if it does not exist and overwritten if it does.
7.3 Required definition
BPF_PERCPU_ARRAY(bufs_offset, u32, 3);
Defines a BPF (Berkeley Packet Filter) per-CPU array named bufs_offset. Here's a breakdown of its components:
BPF_PERCPU_ARRAY: This is a macro provided by the BPF framework. It is used to declare a per-CPU array in the eBPF (extended Berkeley Packet Filter) program.
Per-CPU arrays allow each CPU to have its own separate copy of the array, providing efficient parallel access to the array elements.
- bufs_offset: the name given to the per-CPU array. You can use this name to refer to the array in other parts of the code.
- u32: the type of the elements in the array. In this case, u32 represents an unsigned 32-bit integer.
- 3: the size of the array. It indicates that the bufs_offset array has 3 elements.
Overall, the line of code declares a per-CPU array named bufs_offset with 3 elements, where each element is an unsigned 32-bit integer.
- This line retrieves a buffer of type DATA_BUF_TYPE and assigns its address to the bufs_p pointer variable. The bufs_p pointer can then be used to access and manipulate the buffer contents.
bufs_t *bufs_p = get_buffer(DATA_BUF_TYPE);
if (bufs_p == NULL)
return 0;
The line bufs_t *bufs_p = get_buffer(DATA_BUF_TYPE); declares a pointer variable bufs_p of type bufs_t * and initializes it with the result of the get_buffer function call. The argument passed to get_buffer is DATA_BUF_TYPE.
After obtaining the pointer bufs_p, the code checks whether it is NULL. A NULL pointer means that the buffer lookup failed. In that case, the function returns 0 and exits early.
8.1 Required Definition
#define MAX_BUFFER_SIZE 32768
typedef struct buffers
{
u8 buf[MAX_BUFFER_SIZE];
} bufs_t;
Defines a structure named buffers with a single member buf. Here's a breakdown of its components:
- typedef: creates a new type alias. In this case, it creates the alias bufs_t for the structure struct buffers.
- struct buffers: the name of the structure being defined.
- u8 buf[MAX_BUFFER_SIZE]: declares a member buf, an array of u8 (unsigned 8-bit integer) with MAX_BUFFER_SIZE elements.
- Each instance of bufs_t has a buf member that can hold up to MAX_BUFFER_SIZE (32768) bytes of data.
8.2 Required Definition
static __always_inline bufs_t *get_buffer(int buf_type)
{
return bpf_map_lookup_elem(&bufs, &buf_type);
}
The function get_buffer is defined with the static and __always_inline attributes. It takes an integer buf_type as an argument.
- Inside the function, it calls bpf_map_lookup_elem, passing the address of the bufs map and the address of the buf_type variable. bpf_map_lookup_elem retrieves the value associated with the specified key (buf_type) from the bufs map.
- The function then returns the result of the bpf_map_lookup_elem call, which is a pointer to the bufs_t structure stored under the buf_type key in the bufs map, or NULL if the lookup fails.
- In summary, the get_buffer function looks up and returns a pointer to a buffer of type bufs_t based on the provided buf_type.
BPF_PERCPU_ARRAY(bufs, bufs_t, 3);
- BPF_PERCPU_ARRAY(bufs, bufs_t, 3); declares a per-CPU array named bufs with a capacity of 3 elements, where each element has the type bufs_t.
- A per-CPU array is a special type of array in eBPF programs that provides a separate instance of the array for each CPU core, which allows for concurrent access without synchronization.
- In this case, the per-CPU array bufs holds elements of type bufs_t, a structure with a member buf that is an array of u8. The array size is given by the MAX_BUFFER_SIZE constant defined earlier as 32768, so each element of bufs can hold up to MAX_BUFFER_SIZE bytes of data.
- Because each CPU core works on its own copy, the per-CPU array bufs is suitable for storing and processing data in a multi-core environment.
- In summary, the code snippet saves the contents of the sys_context_t structure into a buffer by calling the save_context_to_buffer function. On success, save_context_to_buffer returns the size of the structure; otherwise, it returns 0 to indicate failure. trace_ret_generic does not check this return value.
save_context_to_buffer(bufs_p, (void *)&context);
static __always_inline int save_context_to_buffer(bufs_t *bufs_p, void *ptr)
{
if (bpf_probe_read(&(bufs_p->buf[0]), sizeof(sys_context_t), ptr) == 0)
{
return sizeof(sys_context_t);
}
return 0;
}
- Once bufs_p is known not to be NULL, the code calls the save_context_to_buffer function. This function copies the contents of the sys_context_t structure pointed to by ptr into the start of the buffer, bufs_p->buf.
- The bpf_probe_read function is used to safely read the memory pointed to by ptr and copy it into bufs_p->buf. If the read succeeds (returns 0), the function returns the size of the sys_context_t structure, sizeof(sys_context_t). Otherwise, it returns 0.
save_args_to_buffer(types, &args);
10.1 Required Definition
static __always_inline int save_args_to_buffer(u64 types, args_t *args)
{
if (types == 0)
{
return 0;
}
bufs_t *bufs_p = get_buffer(DATA_BUF_TYPE);
if (bufs_p == NULL)
{
return 0;
}
#pragma unroll
for (int i = 0; i < MAX_ARGS; i++)
{
switch (DEC_ARG_TYPE(i, types))
{
case NONE_T:
break;
case INT_T:
save_to_buffer(bufs_p, (void *)&(args->args[i]), sizeof(int), INT_T);
break;
case OPEN_FLAGS_T:
save_to_buffer(bufs_p, (void *)&(args->args[i]), sizeof(int), OPEN_FLAGS_T);
break;
case FILE_TYPE_T:
save_file_to_buffer(bufs_p, (void *)args->args[i]);
break;
case PTRACE_REQ_T:
save_to_buffer(bufs_p, (void *)&(args->args[i]), sizeof(int), PTRACE_REQ_T);
break;
case MOUNT_FLAG_T:
save_to_buffer(bufs_p, (void *)&(args->args[i]), sizeof(int), MOUNT_FLAG_T);
break;
case UMOUNT_FLAG_T:
save_to_buffer(bufs_p, (void *)&(args->args[i]), sizeof(int), UMOUNT_FLAG_T);
break;
case STR_T:
save_str_to_buffer(bufs_p, (void *)args->args[i]);
break;
case SOCK_DOM_T:
save_to_buffer(bufs_p, (void *)&(args->args[i]), sizeof(int), SOCK_DOM_T);
break;
case SOCK_TYPE_T:
save_to_buffer(bufs_p, (void *)&(args->args[i]), sizeof(int), SOCK_TYPE_T);
break;
case SOCKADDR_T:
if (args->args[i])
{
short family = 0;
bpf_probe_read(&family, sizeof(short), (void *)args->args[i]);
switch (family)
{
case AF_UNIX:
save_to_buffer(bufs_p, (void *)(args->args[i]), sizeof(struct sockaddr_un), SOCKADDR_T);
break;
case AF_INET:
save_to_buffer(bufs_p, (void *)(args->args[i]), sizeof(struct sockaddr_in), SOCKADDR_T);
break;
case AF_INET6:
save_to_buffer(bufs_p, (void *)(args->args[i]), sizeof(struct sockaddr_in6), SOCKADDR_T);
break;
default:
save_to_buffer(bufs_p, (void *)&family, sizeof(short), SOCKADDR_T);
}
}
break;
case UNLINKAT_FLAG_T:
save_to_buffer(bufs_p, (void *)&(args->args[i]), sizeof(int), UNLINKAT_FLAG_T);
break;
}
}
return 0;
}
The save_args_to_buffer function is responsible for saving the arguments (args) to a buffer. The saving process depends on the types of the arguments specified by the types parameter.
Here's a breakdown of the actions performed by the function:
- If types is 0, indicating that there are no arguments to save, the function returns 0.
- The function obtains a pointer to the buffer by calling get_buffer with DATA_BUF_TYPE as the argument. If the pointer is NULL, indicating an error in obtaining the buffer, the function returns 0.
- The function then iterates over each argument slot using a loop. It uses the DEC_ARG_TYPE macro to extract the argument type at index i from types.
Depending on the argument type, the function performs different actions:
- For INT_T, OPEN_FLAGS_T, PTRACE_REQ_T, MOUNT_FLAG_T, UMOUNT_FLAG_T, SOCK_DOM_T, SOCK_TYPE_T, and UNLINKAT_FLAG_T, it calls save_to_buffer to save the argument value (sizeof(int) bytes) into the buffer.
- For FILE_TYPE_T, it calls save_file_to_buffer to save the file information into the buffer.
- For STR_T, it calls save_str_to_buffer to save the string argument into the buffer.
- For SOCKADDR_T, if the argument is non-zero, it reads the address family from the socket address and, based on the family (AF_UNIX, AF_INET, AF_INET6, or others), calls save_to_buffer to save the corresponding sockaddr structure or, for any other family, only the family value.
- After iterating over all the arguments, the function returns 0.
10.2 Required Definition
The save_to_buffer function is responsible for saving data to a buffer. Here's an explanation of how the code works:
static __always_inline int save_to_buffer(bufs_t *bufs_p, void *ptr, int size, u8 type)
{
// the biggest element that can be saved with this function should be defined here
#define MAX_ELEMENT_SIZE sizeof(struct sockaddr_un)
if (type == 0)
{
return 0;
}
u32 *off = get_buffer_offset(DATA_BUF_TYPE);
if (off == NULL)
{
return -1;
}
if (*off > MAX_BUFFER_SIZE - MAX_ELEMENT_SIZE)
{
return 0;
}
if (bpf_probe_read(&(bufs_p->buf[*off]), 1, &type) != 0)
{
return 0;
}
*off += 1;
if (*off > MAX_BUFFER_SIZE - MAX_ELEMENT_SIZE)
{
return 0;
}
if (bpf_probe_read(&(bufs_p->buf[*off]), size, ptr) == 0)
{
*off += size;
set_buffer_offset(DATA_BUF_TYPE, *off);
return size;
}
return 0;
}
This function takes four parameters: bufs_p, a pointer to the buffer structure (bufs_t); ptr, a pointer to the data to be saved; size, the size of the data to be saved; and type, the type of the data.
- The function first checks whether type is zero. If it is, it returns 0, indicating that no data should be saved.
- Next, it calls get_buffer_offset to retrieve a pointer to the offset value associated with the data buffer type (DATA_BUF_TYPE). If the pointer is NULL, meaning the map lookup failed, it returns -1.
- The function then checks whether the current offset plus the maximum element size (MAX_ELEMENT_SIZE) exceeds the maximum buffer size (MAX_BUFFER_SIZE). If it does, it returns 0, indicating that there is not enough space in the buffer.
- Next, it uses bpf_probe_read to copy the one-byte type tag into the buffer at the current offset. If this operation fails, it returns 0.
- The offset is then incremented by 1 to account for the saved type.
- The function checks again whether the current offset plus the maximum element size exceeds the maximum buffer size. If it does, it returns 0.
- Finally, it uses bpf_probe_read again to copy size bytes from ptr into the buffer at the current offset.
- If this bpf_probe_read succeeds, the offset is incremented by size and set_buffer_offset is called to store the new offset.
The function returns size if the data was successfully saved, -1 if the offset lookup failed, and 0 in every other case.
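The layout this produces, a one-byte type tag followed by the payload, can be illustrated with a user-space sketch. It is not the original function: the buffer and offset live in one hypothetical struct sketch_buf instead of two BPF maps, memcpy replaces bpf_probe_read, the sizes are shrunk to 64 and 16 bytes, and an explicit check on size is added because nothing else in the sketch bounds it.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_BUF_SIZE 64  /* small stand-in for MAX_BUFFER_SIZE */
#define SKETCH_MAX_ELEM 16  /* small stand-in for MAX_ELEMENT_SIZE */

/* Hypothetical container: the original keeps buf and off in separate maps */
struct sketch_buf {
    uint8_t buf[SKETCH_BUF_SIZE];
    uint32_t off;
};

static inline int sketch_save(struct sketch_buf *b, const void *ptr, int size, uint8_t type)
{
    if (type == 0)
        return 0;
    /* Added for the sketch: reject sizes the space checks do not cover */
    if (size < 0 || size > SKETCH_MAX_ELEM)
        return 0;
    if (b->off > SKETCH_BUF_SIZE - SKETCH_MAX_ELEM)
        return 0;

    b->buf[b->off] = type; /* one-byte type tag */
    b->off += 1;

    if (b->off > SKETCH_BUF_SIZE - SKETCH_MAX_ELEM)
        return 0;

    memcpy(&b->buf[b->off], ptr, (size_t)size); /* payload follows the tag */
    b->off += (uint32_t)size;
    return size;
}
```

Starting from offset 8 and saving an int with type tag 1, the tag lands in buf[8], the value in the bytes that follow, and the offset advances by 1 + sizeof(int). A reader on the user-space side could then walk the buffer by reading a tag, deciding the payload length from it, and skipping ahead.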
- Calls events_perf_submit, passing the struct pt_regs pointer ctx, to send the contents of the data buffer to user space.
events_perf_submit(ctx);
11.1 Required definition
static __always_inline int events_perf_submit(struct pt_regs *ctx)
{
bufs_t *bufs_p = get_buffer(DATA_BUF_TYPE);
if (bufs_p == NULL)
return -1;
u32 *off = get_buffer_offset(DATA_BUF_TYPE);
if (off == NULL)
return -1;
void *data = bufs_p->buf;
int size = *off & (MAX_BUFFER_SIZE - 1);
return bpf_perf_event_output(ctx, &sys_events, BPF_F_CURRENT_CPU, data, size);
}
Defines a function events_perf_submit that takes a struct pt_regs pointer ctx as an argument.
Inside the function, it performs the following steps:
- It retrieves a pointer to the buffer of type DATA_BUF_TYPE by calling get_buffer. If the pointer is NULL, indicating that the buffer was not found, it returns -1.
- It retrieves a pointer to the buffer offset by calling get_buffer_offset. If the offset pointer is NULL, it returns -1.
- It points data at the buffer's data array, bufs_p->buf.
- It calculates the size of the data from the offset value, using a bitwise AND with (MAX_BUFFER_SIZE - 1), which keeps the size below MAX_BUFFER_SIZE.
- Finally, it calls bpf_perf_event_output to submit the event, passing the ctx pointer, the sys_events map, the BPF_F_CURRENT_CPU flag, the data pointer, and the size as arguments. It returns the result of this function call.
trace_ret_generic calls events_perf_submit(ctx) as its last step before returning.
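The masking step is easy to check in isolation. The sketch below only reproduces the expression *off & (MAX_BUFFER_SIZE - 1); the function name submit_size is hypothetical. The mask acts as an upper bound because MAX_BUFFER_SIZE is a power of two (32768 = 2^15), and a plausible reason for writing it this way, though the source does not say so, is to give the BPF verifier a provable limit on the size argument.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_BUFFER_SIZE 32768

/* Hypothetical helper: the size computation from events_perf_submit */
static inline int submit_size(uint32_t off)
{
    /* Keeps the low 15 bits, so the result is always in [0, 32767] */
    return (int)(off & (MAX_BUFFER_SIZE - 1));
}
```

Offsets below MAX_BUFFER_SIZE pass through unchanged, so submit_size(68) is 68. An offset of exactly 32768 wraps to 0, so the mask bounds the value rather than clamping it to the maximum.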
11.2 Required definition
static __always_inline u32 *get_buffer_offset(int buf_type)
{
return bpf_map_lookup_elem(&bufs_offset, &buf_type);
}
The get_buffer_offset function retrieves the offset value associated with a specific buffer type. Here's how the code works:
- The function takes the buf_type parameter, which represents the type of the buffer. It then calls bpf_map_lookup_elem to retrieve the offset value associated with that buffer type from the bufs_offset map.
- bpf_map_lookup_elem searches for the given key (&buf_type) in the map (bufs_offset). If the key is found, it returns a non-NULL pointer to the associated offset value.
- Otherwise, it returns NULL, indicating that the offset value for the specified buffer type was not found in the map.