BPF Syscall BPF_LINK_CREATE
command
This syscall command creates a new BPF link. BPF links are the newest and thus preferred way to attach BPF programs to their hook locations within the kernel. This syscall command is intended to replace both BPF_PROG_ATTACH
and other legacy attachment methods such as netlink and perf events.
Return type
If successful, this syscall command returns a file descriptor to the newly created link. The link is a reference-counted object, just like other BPF objects. It is destroyed once no more references to it exist, which can happen if the loader exits without pinning the link or if the pin is deleted. A loader may also choose to forcibly detach the link from its hook point with the BPF_LINK_DETACH command.
The returned file descriptor can be used with the BPF_LINK_UPDATE
and BPF_LINK_DETACH
commands.
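As a rough illustration of issuing the command, the sketch below fills the fields shared by all link types and calls the raw bpf syscall. The helper names fill_link_create and link_create are made up for this example, and it assumes kernel UAPI headers recent enough to define BPF_LINK_CREATE.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <linux/bpf.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: fill only the fields shared by every link type.
 * Everything else in the union stays zeroed. */
void fill_link_create(union bpf_attr *attr, int prog_fd, int target_fd,
                      enum bpf_attach_type attach_type, __u32 flags)
{
    memset(attr, 0, sizeof(*attr));
    attr->link_create.prog_fd = prog_fd;
    attr->link_create.target_fd = target_fd;
    attr->link_create.attach_type = attach_type;
    attr->link_create.flags = flags;
}

/* Hypothetical wrapper: returns the link file descriptor on success,
 * or -1 with errno set on failure. */
int link_create(int prog_fd, int target_fd,
                enum bpf_attach_type attach_type, __u32 flags)
{
    union bpf_attr attr;

    assert(prog_fd >= 0);
    fill_link_create(&attr, prog_fd, target_fd, attach_type, flags);
    return (int)syscall(__NR_bpf, BPF_LINK_CREATE, &attr, sizeof(attr));
}
```

The returned descriptor keeps the link alive for as long as it stays open, so a loader that wants the attachment to outlive the process would pin the link before exiting.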
Attributes
The attributes for this syscall are particularly complex. Ultimately the attach_type
field will determine which fields are used and how.
union bpf_attr {
struct {
union {
__u32 prog_fd;
__u32 map_fd;
};
union {
__u32 target_fd;
__u32 target_ifindex;
};
__u32 attach_type;
__u32 flags;
union {
__u32 target_btf_id;
struct {
__aligned_u64 iter_info;
__u32 iter_info_len;
};
struct {
[...]
} perf_event;
struct {
[...]
} kprobe_multi;
struct {
[...]
} tracing;
struct {
[...]
} netfilter;
struct {
[...]
} tcx;
struct {
[...]
} uprobe_multi;
struct {
[...]
} netkit;
};
} link_create;
};
prog_fd
This field specifies the file descriptor for the BPF program to be linked.
map_fd
This field specifies a BPF map for the BPF program to be linked to.
target_fd
This field specifies the file descriptor of the target you wish to attach the program to. The kind of file descriptor varies per program type.
For cgroup programs (BPF_PROG_TYPE_CGROUP_SKB, BPF_PROG_TYPE_CGROUP_SOCK, BPF_PROG_TYPE_CGROUP_SOCK_ADDR, BPF_PROG_TYPE_SOCK_OPS, BPF_PROG_TYPE_CGROUP_DEVICE, BPF_PROG_TYPE_CGROUP_SYSCTL, BPF_PROG_TYPE_CGROUP_SOCKOPT) the file descriptor should refer to a cgroup directory in the cgroup filesystem, commonly mounted at /sys/fs/cgroup, for example /sys/fs/cgroup/test.slice/. Such a file descriptor can be obtained by calling the open syscall on the desired path.
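Obtaining such a file descriptor could look like the following sketch. The function name open_cgroup_dir is hypothetical, and the O_DIRECTORY and O_CLOEXEC flags are a choice made for this example rather than a requirement stated here.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: open a cgroup directory so that the resulting
 * file descriptor can be passed as target_fd.
 * Returns the descriptor, or -1 with errno set. */
int open_cgroup_dir(const char *path)
{
    assert(path != NULL);
    /* O_DIRECTORY makes the call fail if path is not a directory. */
    return open(path, O_RDONLY | O_DIRECTORY | O_CLOEXEC);
}
```

A caller would pass a path such as /sys/fs/cgroup/test.slice/ and use the result as target_fd.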
For BPF_PROG_TYPE_EXT
programs this should be a file descriptor to another BPF program.
For BPF_PROG_TYPE_LSM
programs with the attach type BPF_LSM_CGROUP
it should also be a cgroup directory as described above.
For BPF_PROG_TYPE_TRACING
programs with the attach type BPF_TRACE_FENTRY
, BPF_TRACE_FEXIT
or BPF_MODIFY_RETURN
this should be a file descriptor to an existing BPF program.
For BPF_PROG_TYPE_LIRC_MODE2 programs this should be a file descriptor to an infrared device in /dev.
For BPF_PROG_TYPE_FLOW_DISSECTOR
and BPF_PROG_TYPE_SK_LOOKUP
programs this should be a file descriptor to a network namespace. Named network namespaces are represented as objects in /var/run/netns; a file descriptor to a namespace can be obtained by calling the open syscall on one of these objects (/var/run/netns/{name}).
target_ifindex
This field specifies the network interface index of the network device to attach the program to. This field is only used for BPF_PROG_TYPE_XDP
programs.
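The following sketch shows one way to fill the attributes for an XDP link when only the device name is known. The helper name fill_xdp_link is hypothetical; resolving the name with if_nametoindex is an assumption of this example, since any method of obtaining the interface index would do.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <linux/bpf.h>
#include <net/if.h>
#include <string.h>

/* Hypothetical helper: prepare link_create attributes for attaching an
 * XDP program to the network device called ifname.
 * Returns 0 on success, -1 if the name cannot be resolved. */
int fill_xdp_link(union bpf_attr *attr, int prog_fd, const char *ifname)
{
    unsigned int ifindex;

    assert(attr != NULL && ifname != NULL);
    ifindex = if_nametoindex(ifname);
    if (ifindex == 0)
        return -1;

    memset(attr, 0, sizeof(*attr));
    attr->link_create.prog_fd = prog_fd;
    /* target_ifindex shares storage with target_fd. */
    attr->link_create.target_ifindex = ifindex;
    attr->link_create.attach_type = BPF_XDP;
    return 0;
}
```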
attach_type
This field specifies the attach type. For more information about possible values and their meaning, see the Attach types section.
flags
This field specifies flags to instruct how to interpret other attributes. See Flags.
target_btf_id
This field specifies the BTF id of the target to attach to, used to specify the kernel function to hook to when attaching BPF_PROG_TYPE_LSM
programs.
iter_info
This field specifies what kind of information an iterator program should iterate over. It is a pointer to an instance of union bpf_iter_link_info.
union bpf_iter_link_info
C structure
union bpf_iter_link_info {
struct {
__u32 map_fd;
} map;
struct {
enum bpf_cgroup_iter_order order;
/* At most one of cgroup_fd and cgroup_id can be non-zero. If
* both are zero, the walk starts from the default cgroup v2
* root. For walking v1 hierarchy, one should always explicitly
* specify cgroup_fd.
*/
__u32 cgroup_fd;
__u64 cgroup_id;
} cgroup;
/* Parameters of task iterators. */
struct {
__u32 tid;
__u32 pid;
__u32 pid_fd;
} task;
};
iter_info_len
This field specifies the length of the given iter_info
structure, for the purposes of compatibility in case new kernels add additional fields.
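To make the relationship between iter_info and iter_info_len concrete, here is a minimal sketch for a map iterator. The helper name fill_map_iter_link is hypothetical, and the use of the BPF_TRACE_ITER attach type is an assumption based on how iterator programs are usually attached.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <linux/bpf.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: describe a map iterator link.
 * info must stay valid until the syscall has returned, because the
 * attribute only stores a pointer to it. */
void fill_map_iter_link(union bpf_attr *attr, union bpf_iter_link_info *info,
                        int prog_fd, int map_fd)
{
    assert(attr != NULL && info != NULL);

    memset(info, 0, sizeof(*info));
    info->map.map_fd = map_fd;

    memset(attr, 0, sizeof(*attr));
    attr->link_create.prog_fd = prog_fd;
    attr->link_create.attach_type = BPF_TRACE_ITER; /* assumed attach type */
    attr->link_create.iter_info = (__u64)(uintptr_t)info;
    /* Size as known to this binary at compile time. */
    attr->link_create.iter_info_len = sizeof(*info);
}
```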
perf_event
union bpf_attr {
struct {
[...]
union {
struct {
__u64 bpf_cookie;
} perf_event;
}
}
}
bpf_cookie
This field is an optional opaque value which is reported back to tracing programs via the bpf_get_attach_cookie
helper.
The idea behind this cookie is that if the same program gets attached to multiple locations in the kernel, this value can be used to distinguish for which attach point the program is called. This ID value can for example be used as the key for a map which contains additional data the program needs or as key when collecting statistics.
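Setting the cookie from user space might look like the sketch below. The helper name fill_perf_event_link is hypothetical; the example assumes that target_fd carries the perf event file descriptor, that the attach type is BPF_PERF_EVENT, and that the installed UAPI headers already contain the perf_event sub-struct.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <linux/bpf.h>
#include <string.h>

/* Hypothetical helper: attach prog_fd to an existing perf event and tag
 * this attachment with a cookie the program can read back at run time. */
void fill_perf_event_link(union bpf_attr *attr, int prog_fd, int perf_fd,
                          __u64 cookie)
{
    assert(attr != NULL);
    memset(attr, 0, sizeof(*attr));
    attr->link_create.prog_fd = prog_fd;
    attr->link_create.target_fd = perf_fd;          /* assumed: perf event fd */
    attr->link_create.attach_type = BPF_PERF_EVENT; /* assumed attach type */
    attr->link_create.perf_event.bpf_cookie = cookie;
}
```

Giving each attachment of the same program a different cookie, for instance an index into a per-attachment statistics map, lets the program tell the attach points apart.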
kprobe_multi
union bpf_attr {
struct {
[...]
union {
struct {
__u32 flags;
__u32 cnt;
__aligned_u64 syms;
__aligned_u64 addrs;
__aligned_u64 cookies;
} kprobe_multi;
}
}
}
This sub-struct is a collection of fields which specify one or multiple kprobe attachment points to attach the same program to multiple locations with a single syscall.
flags
Bitfield of flags, possible values are:
BPF_F_KPROBE_MULTI_RETURN
- When set, the kprobes are created as return probes.
cnt
This field is the number of entries in the syms, addrs and cookies arrays.
syms
This field specifies a list of kernel symbols to attach the kprobe to. The value should be a pointer to an array of pointers to null-terminated strings.
[cnt][]char
This field is mutually exclusive with addrs
. Only one can be used at a time.
addrs
This field specifies a list of kernel addresses to attach the kprobe to. The value should be a pointer to an array of memory addresses.
This field is mutually exclusive with syms
. Only one can be used at a time.
cookies
This field specifies a list of cookie (bpf_cookie) values, one for each attachment point. The value should be a pointer to an array of 64-bit cookie values, or 0 if you do not want to specify cookies.
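The way the arrays are packed into the sub-struct can be sketched as follows. To keep the example compilable without recent kernel headers it uses a local mirror of the kprobe_multi fields shown above; the struct and function names are hypothetical, and real code would write into attr.link_create.kprobe_multi instead.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Local mirror of the kprobe_multi sub-struct (hypothetical name). */
struct kprobe_multi_sketch {
    uint32_t flags;
    uint32_t cnt;
    uint64_t syms;
    uint64_t addrs;
    uint64_t cookies;
};

/* Hypothetical helper: attach by symbol name.
 * syms and cookies (if given) must each hold cnt entries and stay valid
 * until the syscall has returned. */
void fill_kprobe_multi_syms(struct kprobe_multi_sketch *km,
                            const char *const *syms, const uint64_t *cookies,
                            uint32_t cnt, uint32_t flags)
{
    assert(km != NULL && syms != NULL && cnt > 0);
    memset(km, 0, sizeof(*km));
    km->flags = flags;
    km->cnt = cnt;
    km->syms = (uint64_t)(uintptr_t)syms;
    km->addrs = 0; /* mutually exclusive with syms */
    /* A NULL cookies pointer becomes 0, meaning no cookies. */
    km->cookies = (uint64_t)(uintptr_t)cookies;
}
```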
tracing
union bpf_attr {
struct {
[...]
union {
struct {
/* this is overlaid with the target_btf_id above. */
__u32 target_btf_id;
__u64 cookie;
} tracing;
}
}
}
target_btf_id
The definition in tracing
is overlaid with target_btf_id
in memory, and has the same meaning.
cookie
Same as bpf_cookie
but for tracing programs.
netfilter
union bpf_attr {
struct {
[...]
union {
struct {
__u32 pf;
__u32 hooknum;
__s32 priority;
__u32 flags;
} netfilter;
}
}
}
pf
The protocol family, supported values are NFPROTO_IPV4
(2) and NFPROTO_IPV6
(10).
hooknum
The hook number, supported values are NF_INET_PRE_ROUTING
(0), NF_INET_LOCAL_IN
(1), NF_INET_FORWARD
(2), NF_INET_LOCAL_OUT
(3), and NF_INET_POST_ROUTING
(4).
priority
The priority of the hook, lower values are called first. NF_IP_PRI_FIRST
(-2147483648) and NF_IP_PRI_LAST
(2147483647) are not allowed.
flags
A bitmask of flags. Supported flags are:
BPF_F_NETFILTER_IP_DEFRAG
- Enable defragmentation of IP fragments; this hook will only see defragmented packets. If the BPF_F_NETFILTER_IP_DEFRAG flag (v6.6) is set, the priority must be higher than NF_IP_PRI_CONNTRACK_DEFRAG (-400) to ensure the program runs after nf_defrag.
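The priority rules described for netfilter links can be checked before making the syscall. The sketch below encodes only the constraints stated in the text; the function and macro names are local to this example and are not kernel identifiers.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

/* Numeric values quoted in the text; names are local to this sketch. */
#define SKETCH_PRI_FIRST            INT_MIN /* NF_IP_PRI_FIRST */
#define SKETCH_PRI_LAST             INT_MAX /* NF_IP_PRI_LAST */
#define SKETCH_PRI_CONNTRACK_DEFRAG (-400)  /* NF_IP_PRI_CONNTRACK_DEFRAG */

/* Hypothetical pre-check: returns true if priority satisfies the
 * documented constraints for the given defrag setting. */
bool netfilter_priority_ok(int32_t priority, bool ip_defrag)
{
    /* The first and last priorities are not allowed. */
    if (priority == SKETCH_PRI_FIRST || priority == SKETCH_PRI_LAST)
        return false;
    /* With defragmentation enabled the hook must run after nf_defrag. */
    if (ip_defrag && priority <= SKETCH_PRI_CONNTRACK_DEFRAG)
        return false;
    return true;
}
```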
tcx
union bpf_attr {
struct {
[...]
union {
struct {
union {
__u32 relative_fd;
__u32 relative_id;
};
__u64 expected_revision;
} tcx;
}
}
}
relative_fd
The file descriptor of the program or link to attach relative to.
- If BPF_F_BEFORE is set, the program is attached before the program/link indicated by this field.
- If BPF_F_AFTER is set, the program is attached after the program/link indicated by this field.
- If BPF_F_REPLACE is set, the program replaces the program/link indicated by this field.
The above flags are mutually exclusive.
This field is used over relative_id
when BPF_F_ID
is not set.
relative_id
The ID of the program or link to attach relative to.
- If BPF_F_BEFORE is set, the program is attached before the program/link indicated by this field.
- If BPF_F_AFTER is set, the program is attached after the program/link indicated by this field.
- If BPF_F_REPLACE is set, the program replaces the program/link indicated by this field.
The above flags are mutually exclusive.
This field is used over relative_fd
when BPF_F_ID
is set.
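The flag rules for relative attachment can be expressed as a small pre-check. This is a sketch under assumptions: the function names are hypothetical, and the fallback flag values are included only so that the example compiles with older headers and should be verified against the UAPI header in use.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <linux/bpf.h>
#include <stdbool.h>
#include <stdint.h>

/* Fallbacks for older headers; the values are an assumption to verify. */
#ifndef BPF_F_REPLACE
#define BPF_F_REPLACE (1U << 2)
#endif
#ifndef BPF_F_BEFORE
#define BPF_F_BEFORE (1U << 3)
#endif
#ifndef BPF_F_AFTER
#define BPF_F_AFTER (1U << 4)
#endif
#ifndef BPF_F_ID
#define BPF_F_ID (1U << 5)
#endif

/* True if at most one of BEFORE, AFTER and REPLACE is set. */
bool relative_flags_ok(uint32_t flags)
{
    uint32_t pos = flags & (BPF_F_BEFORE | BPF_F_AFTER | BPF_F_REPLACE);

    return (pos & (pos - 1)) == 0; /* zero or exactly one bit set */
}

/* True if relative_id is read instead of relative_fd. */
bool relative_uses_id(uint32_t flags)
{
    return (flags & BPF_F_ID) != 0;
}
```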
expected_revision
The expected
uprobe_multi
union bpf_attr {
struct {
[...]
union {
struct {
__aligned_u64 path;
__aligned_u64 offsets;
__aligned_u64 ref_ctr_offsets;
__aligned_u64 cookies;
__u32 cnt;
__u32 flags;
__u32 pid;
} uprobe_multi;
}
}
}
path
Docs could be improved
This part of the docs is incomplete, contributions are very welcome
offsets
Docs could be improved
This part of the docs is incomplete, contributions are very welcome
ref_ctr_offsets
Docs could be improved
This part of the docs is incomplete, contributions are very welcome
cookies
Docs could be improved
This part of the docs is incomplete, contributions are very welcome
cnt
Docs could be improved
This part of the docs is incomplete, contributions are very welcome
flags
Bitfield of flags, possible values are:
BPF_F_UPROBE_MULTI_RETURN
- When set, the uprobes are created as return probes.
pid
Docs could be improved
This part of the docs is incomplete, contributions are very welcome
netkit
union bpf_attr {
struct {
[...]
union {
struct {
union {
__u32 relative_fd;
__u32 relative_id;
};
__u64 expected_revision;
} netkit;
}
}
}
relative_fd
The file descriptor of the program or link to attach relative to.
- If BPF_F_BEFORE is set, the program is attached before the program/link indicated by this field.
- If BPF_F_AFTER is set, the program is attached after the program/link indicated by this field.
- If BPF_F_REPLACE is set, the program replaces the program/link indicated by this field.
The above flags are mutually exclusive.
This field is used over relative_id
when BPF_F_ID
is not set.
relative_id
The ID of the program or link to attach relative to.
- If BPF_F_BEFORE is set, the program is attached before the program/link indicated by this field.
- If BPF_F_AFTER is set, the program is attached after the program/link indicated by this field.
- If BPF_F_REPLACE is set, the program replaces the program/link indicated by this field.
The above flags are mutually exclusive.
This field is used over relative_fd
when BPF_F_ID
is set.
expected_revision
The expected
Attach types
This section describes the possible values and meanings for the attach_type
attribute. These values are the same as used in the BPF_PROG_ATTACH
command and the expected_attach_type
field of the BPF_PROG_LOAD
command.
The attach type is often used to communicate a specialization of a program type, for example whether the program should attach to ingress or egress. Since the hook locations differ, the capabilities of the program may differ as well. Please check the pages of the program types for details about these attach-type-dependent limitations.