Map type BPF_MAP_TYPE_SOCKMAP
The socket map is a specialized map type that holds network sockets as values.
Usage
This map type can be used to look up a pointer to a socket with the bpf_map_lookup_elem helper function, which can then be passed to helpers such as bpf_sk_assign. Alternatively, a map reference can be used directly in a range of helpers such as bpf_sk_redirect_map, bpf_msg_redirect_map and bpf_sk_select_reuseport. All of the above cases redirect a packet or connection in some way. The details differ depending on the program type and the helper function, so please visit the specific pages for details.
Note
Sockets returned by bpf_map_lookup_elem are ref-counted, so the caller must call bpf_sk_release in all code paths where the returned socket is not NULL before exiting the program. This is enforced by the verifier, which will throw an Unreleased reference error if socket pointers are not released.
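The lookup-and-release rule can be sketched as follows. This is a minimal illustration, assuming an sk_lookup program compiled with clang for the BPF target against libbpf's bpf_helpers.h. The map name sock_map, the program name steer and the fixed key 0 are hypothetical.

```c
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

/* Hypothetical one-slot sockmap; assumed to be filled from user space. */
struct {
    __uint(type, BPF_MAP_TYPE_SOCKMAP);
    __uint(max_entries, 1);
    __type(key, __u32);
    __type(value, __u32);
} sock_map SEC(".maps");

SEC("sk_lookup")
int steer(struct bpf_sk_lookup *ctx)
{
    __u32 key = 0; /* hypothetical slot index */
    struct bpf_sock *sk;
    long err;

    sk = bpf_map_lookup_elem(&sock_map, &key);
    if (!sk)
        return SK_PASS; /* NULL result: nothing to release */

    /* Hand the incoming connection or packet to the socket from the map. */
    err = bpf_sk_assign(ctx, sk, 0);

    /* Drop the reference on every path where sk was not NULL. */
    bpf_sk_release(sk);

    return err ? SK_DROP : SK_PASS;
}

char LICENSE[] SEC("license") = "GPL";
```

Removing the bpf_sk_release call from this sketch should cause the verifier to reject the program with the Unreleased reference error described above.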
This map can also be manipulated from kernel space. The main use case for doing so is to manage the contents of the map automatically from program types that trigger on socket events. This allows one program to manage the contents of the map, and another to do the actual redirecting on packet events.
BPF_PROG_TYPE_SK_MSG and BPF_PROG_TYPE_SK_SKB programs can be attached to this map type. When a socket is inserted into the map, its socket callbacks will be replaced with these programs.
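As a rough illustration of inserting a socket from user space, the sketch below issues the BPF_MAP_UPDATE_ELEM command through the raw bpf() syscall, passing a slot index as the key and the socket's file descriptor as the value. The function names sockmap_update_attr and sockmap_insert are hypothetical, and error handling is left to the caller.

```c
#include <linux/bpf.h>
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Fill the attributes for storing *sock_fd at slot *index of the map. */
static void sockmap_update_attr(union bpf_attr *attr, int map_fd,
                                const __u32 *index, const __u32 *sock_fd)
{
    memset(attr, 0, sizeof(*attr));
    attr->map_fd = (__u32)map_fd;
    attr->key = (__u64)(uintptr_t)index;     /* pointer to the slot index */
    attr->value = (__u64)(uintptr_t)sock_fd; /* pointer to the socket fd */
    attr->flags = BPF_ANY;                   /* create or replace the slot */
}

/* Returns 0 on success, or -1 with errno set. */
static int sockmap_insert(int map_fd, __u32 index, int sock_fd)
{
    union bpf_attr attr;
    __u32 value = (__u32)sock_fd;

    sockmap_update_attr(&attr, map_fd, &index, &value);
    return (int)syscall(__NR_bpf, BPF_MAP_UPDATE_ELEM, &attr, sizeof(attr));
}
```

A caller would typically invoke sockmap_insert(map_fd, 0, client_fd) once a connection is set up; the call needs the privileges required for bpf() on the running system.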
The attach types for the map programs are:
msg_parser program - BPF_SK_MSG_VERDICT
stream_parser program - BPF_SK_SKB_STREAM_PARSER
stream_verdict program - BPF_SK_SKB_STREAM_VERDICT
skb_verdict program - BPF_SK_SKB_VERDICT
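Attaching one of these programs can be sketched with the BPF_PROG_ATTACH command of the raw bpf() syscall, where the attach target is the map's file descriptor. This is a minimal sketch: the function names sockmap_attach_attr and sockmap_attach are hypothetical, and prog_fd is assumed to refer to an already loaded program of a matching type.

```c
#include <linux/bpf.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Fill the attributes for attaching prog_fd to the map behind map_fd. */
static void sockmap_attach_attr(union bpf_attr *attr, int map_fd,
                                int prog_fd, enum bpf_attach_type type)
{
    memset(attr, 0, sizeof(*attr));
    attr->target_fd = (__u32)map_fd;      /* the sockmap is the target */
    attr->attach_bpf_fd = (__u32)prog_fd; /* program to attach */
    attr->attach_type = type;             /* one of the types listed above */
}

/* Returns 0 on success, or -1 with errno set. */
static int sockmap_attach(int map_fd, int prog_fd, enum bpf_attach_type type)
{
    union bpf_attr attr;

    sockmap_attach_attr(&attr, map_fd, prog_fd, type);
    return (int)syscall(__NR_bpf, BPF_PROG_ATTACH, &attr, sizeof(attr));
}
```

For example, sockmap_attach(map_fd, prog_fd, BPF_SK_SKB_STREAM_VERDICT) would request a stream_verdict attachment.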
A sock object may be in multiple maps, but can only inherit a single parse or verdict program. If adding a sock object to a map would result in having multiple parser programs, the update will return an EBUSY error.
Warning
Users are not allowed to attach stream_verdict and skb_verdict programs to the same map.
Attributes
The value_size must always be 4 and the key_size must always be 4.
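A minimal sketch of creating such a map from user space with the raw bpf() syscall, assuming 4-byte keys (the slot index) and 4-byte values (a socket file descriptor). The function names sockmap_create_attr and sockmap_create are hypothetical.

```c
#include <linux/bpf.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Fill the attributes for a sockmap with max_entries slots. */
static void sockmap_create_attr(union bpf_attr *attr, __u32 max_entries)
{
    memset(attr, 0, sizeof(*attr));
    attr->map_type = BPF_MAP_TYPE_SOCKMAP;
    attr->key_size = sizeof(__u32);   /* slot index */
    attr->value_size = sizeof(__u32); /* socket file descriptor */
    attr->max_entries = max_entries;
}

/* Returns a map file descriptor, or -1 with errno set. */
static int sockmap_create(__u32 max_entries)
{
    union bpf_attr attr;

    sockmap_create_attr(&attr, max_entries);
    return (int)syscall(__NR_bpf, BPF_MAP_CREATE, &attr, sizeof(attr));
}
```

Creating the map requires the privileges the running kernel demands for bpf(), so an unprivileged call is expected to fail with -1.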
Syscall commands
The following syscall commands work with this map type:
Helper functions
bpf_map_delete_elem
bpf_map_lookup_elem
bpf_map_update_elem
bpf_msg_redirect_map
bpf_sk_redirect_map
bpf_sk_select_reuseport
bpf_sock_map_update
Flags
BPF_F_NUMA_NODE
When set, the numa_node attribute is respected during map creation.
BPF_F_RDONLY
Setting this flag will make it so the map can only be read via the syscall interface, but not written to.
For details please check the generic description.
BPF_F_WRONLY
Setting this flag will make it so the map can only be written to via the syscall interface, but not read from.
BPF_F_RDONLY_PROG
Setting this flag will make it so the map can only be read via helper functions, but not written to.
For details please check the generic description.
BPF_F_WRONLY_PROG
Setting this flag will make it so the map can only be written to via helper functions, but not read from.
For details please check the generic description.