Program type BPF_PROG_TYPE_SK_SKB

v4.14

Socket SKB programs are called on L4 data streams to parse L7 messages and/or to determine if the L4/L7 messages should be allowed, blocked or redirected.

Usage

Socket SKB programs are attached to BPF_MAP_TYPE_SOCKMAP or BPF_MAP_TYPE_SOCKHASH maps and will be invoked when messages get received on the sockets which are part of the map the program is attached to. The exact purpose of the program differs depending on its attach type.

As BPF_SK_SKB_STREAM_PARSER program

When this attach type is used, the program acts as a stream parser. The idea behind a stream parser is to parse message-based application layer protocols (OSI Layer 7) that are implemented on top of data streams such as TCP.

The job of the program is to parse the L7 data/packet and to tell the kernel how long the L7 message is. This allows the kernel to combine multiple packets from the data stream and return one complete L7 message per recv call, instead of returning individual TCP segments that might each contain only part of an L7 message.

The return value is interpreted as follows:

  • >0 - indicates length of successfully parsed message
  • 0 - indicates more data must be received to parse the message
  • -ESTRPIPE - the current message should not be processed by the kernel; control of the socket returns to userspace, which can proceed to read the messages itself
  • other < 0 - Error in parsing, give control back to userspace assuming that synchronization is lost and the stream is unrecoverable (application expected to close TCP socket)
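
The return values above can be illustrated with a minimal sketch of a parser for a hypothetical length-prefixed protocol. The framing (a 4-byte big-endian payload length followed by the payload), the names frame_len, len_prefix_parser, HDR_LEN, MAX_PAYLOAD and ERR_BAD_FRAME, and the 1 MiB limit are all assumptions made for illustration. The sketch also assumes the header bytes are available in the linear part of the skb. The framing logic is kept in a plain C function so it can be exercised outside the kernel; the BPF entry point is only compiled when targeting BPF.

```c
/* Hypothetical framing: 4-byte big-endian payload length, then the payload. */
#define HDR_LEN 4u
#define MAX_PAYLOAD (1u << 20) /* arbitrary sanity limit for this sketch */
#define ERR_BAD_FRAME (-22)    /* -EINVAL; any negative value other than -ESTRPIPE */

/* Returns the parser result for the bytes in [data, data_end). */
static inline __attribute__((always_inline))
int frame_len(const unsigned char *data, const unsigned char *data_end)
{
    unsigned int payload;

    if (data + HDR_LEN > data_end)
        return 0; /* header incomplete: ask the kernel for more data */

    payload = ((unsigned int)data[0] << 24) | ((unsigned int)data[1] << 16) |
              ((unsigned int)data[2] << 8) | (unsigned int)data[3];
    if (payload > MAX_PAYLOAD)
        return ERR_BAD_FRAME; /* stream is treated as unrecoverable */

    return (int)(HDR_LEN + payload); /* full message length, header included */
}

#ifdef __bpf__ /* defined by clang when compiling for the BPF target */
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

SEC("sk_skb/stream_parser")
int len_prefix_parser(struct __sk_buff *skb)
{
    const unsigned char *data = (const unsigned char *)(long)skb->data;
    const unsigned char *data_end = (const unsigned char *)(long)skb->data_end;

    return frame_len(data, data_end);
}
#endif
```

With this framing, a stream that starts with the bytes 00 00 00 03 yields a message length of 7 as soon as the header is readable, even if the payload has not fully arrived; the kernel then waits until all 7 bytes are available before delivering the message.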

Note

Before v5.10 it was required to have a stream parser attached to a BPF_MAP_TYPE_SOCKMAP if you wanted to use the stream verdict as well. On newer versions this is no longer required.

On older kernels, a no-op parser that simply returns the length of the current skb retains the default behavior, so the verdict program is invoked once per TCP packet.

SEC("sk_skb/stream_parser")
int noop_parser(struct __sk_buff *skb)
{
    return skb->len;
}

As BPF_SK_SKB_STREAM_VERDICT program

When this attach type is used the program acts as a filter, comparable to TC or XDP programs. The program gets called for every message indicated by the parser (or TCP packet if no parser is specified) and returns a verdict.

The return value is interpreted as follows:

  • SK_PASS - The message may pass to the socket or it has been redirected with a helper.
  • SK_DROP - The message should be dropped.

Unlike TC or XDP programs, there is no special redirect return code; helpers such as bpf_sk_redirect_map return SK_PASS on success.
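
As a sketch of a verdict that filters on message content rather than redirecting, the program below drops messages of one message type and passes everything else. The protocol (first byte carries a message type), the blocked value 0xff, and the names classify, type_filter, msg_verdict and BLOCKED_TYPE are hypothetical. The sketch assumes the first byte is in the linear part of the skb and chooses to drop messages it cannot inspect. The decision logic is a plain C function; the BPF entry point is only compiled when targeting BPF.

```c
#define BLOCKED_TYPE 0xff /* hypothetical message type that must not pass */

enum msg_verdict { MSG_DROP = 0, MSG_PASS = 1 };

/* Decide on the message occupying [data, data_end). */
static inline __attribute__((always_inline))
enum msg_verdict classify(const unsigned char *data, const unsigned char *data_end)
{
    if (data + 1 > data_end)
        return MSG_DROP; /* nothing readable: fail closed in this sketch */

    return data[0] == BLOCKED_TYPE ? MSG_DROP : MSG_PASS;
}

#ifdef __bpf__ /* defined by clang when compiling for the BPF target */
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

SEC("sk_skb/stream_verdict")
int type_filter(struct __sk_buff *skb)
{
    const unsigned char *data = (const unsigned char *)(long)skb->data;
    const unsigned char *data_end = (const unsigned char *)(long)skb->data_end;

    /* Translate the local decision into the kernel's verdict codes. */
    return classify(data, data_end) == MSG_PASS ? SK_PASS : SK_DROP;
}
#endif
```

Failing closed on unreadable data is a policy choice of this sketch; a program that should never interfere with unknown traffic could return SK_PASS in that branch instead.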

As BPF_SK_SKB_VERDICT program

v5.13

The non-stream verdict attach type is a replacement for the BPF_SK_SKB_STREAM_VERDICT attach type. It has the same job and uses the same return values. The difference is that the stream verdict variant only supports TCP data streams, while BPF_SK_SKB_VERDICT also supports UDP.

Context

Socket SKB programs are called by the kernel with a __sk_buff context.

This program type may not read from or write to every field of the context, either because doing so might break assumptions in the kernel or because the data isn't available at the point where the program is hooked into the kernel.

Context fields
Field Read Write
len
pkt_type
mark
queue_mapping
protocol
vlan_present
vlan_tci
vlan_proto
priority
ingress_ifindex
ifindex
tc_index
cb
hash
tc_classid
data
data_end
napi_id
family
remote_ip4
local_ip4
remote_ip6
local_ip6
remote_port
local_port
data_meta
flow_keys
tstamp
wire_len
gso_segs
sk
gso_size
tstamp_type
hwtstamp

Attachment

Socket SKB programs are attached to BPF_MAP_TYPE_SOCKMAP or BPF_MAP_TYPE_SOCKHASH using the BPF_PROG_ATTACH syscall (bpf_prog_attach libbpf function).

Each program should be loaded with the same expected attach type that is later used to attach it.

Note

BPF_SK_SKB_STREAM_VERDICT and BPF_SK_SKB_VERDICT are mutually exclusive per map; only one or the other attach type can be used.

Example

Example BPF program:

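// Copyright Red Hat
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

/* Redirect target map; its definition is not part of the original sample and is assumed here. */
struct {
        __uint(type, BPF_MAP_TYPE_SOCKMAP);
        __uint(max_entries, 1);
        __uint(key_size, sizeof(int));
        __uint(value_size, sizeof(int));
} sock_map_rx SEC(".maps");

SEC("sk_skb/stream_verdict")
int bpf_prog_verdict(struct __sk_buff *skb)
{
        __u32 lport = skb->local_port; /* stored in host byte order */
        __u32 idx = 0;

        /* On success the helper returns SK_PASS, which becomes the verdict. */
        if (lport == 10000)
                return bpf_sk_redirect_map(skb, &sock_map_rx, idx, 0);

        return SK_PASS;
}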

Example userspace loader code:

// Copyright Red Hat
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <bpf/bpf.h>

/* Returns the sockmap fd on success, or -1 on failure. */
int create_sample_sockmap(int sock, int parse_prog_fd, int verdict_prog_fd)
{
        int index = 0;
        int map, err;

        map = bpf_map_create(BPF_MAP_TYPE_SOCKMAP, NULL, sizeof(int), sizeof(int), 1, NULL);
        if (map < 0) {
                fprintf(stderr, "Failed to create sockmap: %s\n", strerror(errno));
                return -1;
        }

        /* Attach both programs to the map before adding any socket to it. */
        err = bpf_prog_attach(parse_prog_fd, map, BPF_SK_SKB_STREAM_PARSER, 0);
        if (err) {
                fprintf(stderr, "Failed to attach_parser_prog_to_map: %s\n", strerror(errno));
                goto out;
        }

        err = bpf_prog_attach(verdict_prog_fd, map, BPF_SK_SKB_STREAM_VERDICT, 0);
        if (err) {
                fprintf(stderr, "Failed to attach_verdict_prog_to_map: %s\n", strerror(errno));
                goto out;
        }

        /* Adding the socket makes the attached programs run on its traffic. */
        err = bpf_map_update_elem(map, &index, &sock, BPF_NOEXIST);
        if (err) {
                fprintf(stderr, "Failed to update sockmap: %s\n", strerror(errno));
                goto out;
        }

        /* Keep the fd open: closing the last reference would free the map. */
        return map;

out:
        close(map);
        return -1;
}

Helper functions

Supported helper functions

KFuncs

Supported kfuncs