Program type BPF_PROG_TYPE_CGROUP_SOCK_ADDR

v4.17

cGroup socket address programs are triggered when a process in a cGroup to which the program is attached makes a socket-related syscall. The program can overwrite arguments to the syscall, such as addresses.

Usage

This program type can be used to overwrite arguments to socket-related syscalls or to block the syscall entirely. Which syscall triggers the program depends on the attach type used.

BPF_CGROUP_INET4_BIND and BPF_CGROUP_INET6_BIND

v4.17

This attach type is triggered when a process calls the bind syscall with an IPv4 or IPv6 address respectively. The typical ELF sections used for this attach type are: cgroup/bind4 and cgroup/bind6.

Note

Since v5.12, the second-lowest bit of the return value (the bit with value 2) indicates that the check for the CAP_NET_BIND_SERVICE capability can be skipped. Normally this capability is required when binding to a privileged port (<1024). So when a BPF program rewrites the listening port for a process that lacks the capability, it can set this bit to prevent the kernel from blocking the call.
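
The two bits of the bind return value described in the note above can be modelled in plain userspace C. This is only a sketch for illustration: the macro and function names are hypothetical and are not kernel or libbpf identifiers.

```c
#include <stdbool.h>

/* Hypothetical names for the two bits of a bind program's return value. */
#define VERDICT_ALLOW    (1 << 0) /* bit 0: let the bind call proceed */
#define VERDICT_SKIP_CAP (1 << 1) /* bit 1: skip the CAP_NET_BIND_SERVICE check */

/* True when the return value lets the syscall continue. */
bool verdict_allows(int ret)
{
    return (ret & VERDICT_ALLOW) != 0;
}

/* True when the call is allowed and the capability check is bypassed.
 * The bypass bit has no effect on a call that is blocked anyway. */
bool verdict_skips_cap_check(int ret)
{
    return verdict_allows(ret) && (ret & VERDICT_SKIP_CAP) != 0;
}
```

Under this model, a program returning 1 allows the call with the normal capability check, while returning (1 | 2) allows it and skips the check.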

BPF_CGROUP_INET4_CONNECT and BPF_CGROUP_INET6_CONNECT

v4.17

This attach type is triggered when a process calls the connect syscall with an IPv4 or IPv6 address respectively. The typical ELF sections used for this attach type are: cgroup/connect4 and cgroup/connect6.

BPF_CGROUP_UDP4_SENDMSG and BPF_CGROUP_UDP6_SENDMSG

v4.18

This attach type is triggered when a process calls the sendmsg syscall on a UDP socket with an IPv4 or IPv6 address respectively. The typical ELF sections used for this attach type are: cgroup/sendmsg4 and cgroup/sendmsg6.

BPF_CGROUP_UDP4_RECVMSG and BPF_CGROUP_UDP6_RECVMSG

v5.2

This attach type is triggered when a process calls the recvmsg syscall on a UDP socket with an IPv4 or IPv6 address respectively. The typical ELF sections used for this attach type are: cgroup/recvmsg4 and cgroup/recvmsg6.

BPF_CGROUP_INET4_GETPEERNAME and BPF_CGROUP_INET6_GETPEERNAME

v5.8

This attach type is triggered when a process calls the getpeername syscall with an IPv4 or IPv6 address respectively. The typical ELF sections used for this attach type are: cgroup/getpeername4 and cgroup/getpeername6.

BPF_CGROUP_INET4_GETSOCKNAME and BPF_CGROUP_INET6_GETSOCKNAME

v5.8

This attach type is triggered when a process calls the getsockname syscall with an IPv4 or IPv6 address respectively. The typical ELF sections used for this attach type are: cgroup/getsockname4 and cgroup/getsockname6.

Context

C structure
struct bpf_sock_addr {
    __u32 user_family;      /* Allows 4-byte read, but no write. */
    __u32 user_ip4;         /* Allows 1,2,4-byte read and 4-byte write.
                             * Stored in network byte order.
                             */
    __u32 user_ip6[4];      /* Allows 1,2,4,8-byte read and 4,8-byte write.
                             * Stored in network byte order.
                             */
    __u32 user_port;        /* Allows 1,2,4-byte read and 4-byte write.
                             * Stored in network byte order.
                             */
    __u32 family;           /* Allows 4-byte read, but no write. */
    __u32 type;             /* Allows 4-byte read, but no write. */
    __u32 protocol;         /* Allows 4-byte read, but no write. */
    __u32 msg_src_ip4;      /* Allows 1,2,4-byte read and 4-byte write.
                             * Stored in network byte order.
                             */
    __u32 msg_src_ip6[4];   /* Allows 1,2,4,8-byte read and 4,8-byte write.
                             * Stored in network byte order.
                             */
    __bpf_md_ptr(struct bpf_sock *, sk);
};

user_family

v4.17

This field contains the address family passed to the syscall. Its value is one of AF_* values defined in include/linux/socket.h.

The context allows 4-byte reads from the field, but no writes to it.

user_ip4

v4.17

This field contains the IPv4 address passed to the syscall. Its value is stored in network byte order. This field is only valid for INET4 attach types.

The context allows 1,2,4-byte reads and 4-byte writes.

user_ip6

v4.17

This field contains the IPv6 address passed to the syscall. Its value is stored in network byte order. This field is only valid for INET6 attach types.

This context allows 1,2,4,8-byte reads and 4,8-byte writes.

Note

8-byte wide loads are only supported since v5.3
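
To illustrate the layout, the userspace sketch below splits a textual IPv6 address into four 32-bit words arranged like user_ip6[4], each word in network byte order. The helper name ip6_to_words is hypothetical and exists only for this example.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: parse a textual IPv6 address into four 32-bit words
 * laid out like user_ip6[4]. Each word stays in network byte order.
 * Returns false if the text is not a valid IPv6 address. */
bool ip6_to_words(const char *text, uint32_t words[4])
{
    struct in6_addr addr;

    if (inet_pton(AF_INET6, text, &addr) != 1)
        return false;

    /* The 16 address bytes map directly onto the four words. */
    memcpy(words, addr.s6_addr, sizeof(addr.s6_addr));
    return true;
}
```

For the loopback address ::1, the first three words are zero and the last word equals htonl(1), which is the kind of constant a program would compare user_ip6[3] against.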

user_port

v4.18

This field contains the port number passed to the syscall. Its value is stored in network byte order.

This context allows 1,2,4-byte reads and 4-byte writes.
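
Because user_ip4 and user_port are both stored in network byte order, comparison constants need to be converted before use. The userspace sketch below shows the idea with the standard htonl and htons functions; the helper name is_loopback_port is hypothetical. Inside a BPF program the equivalent macros are bpf_htonl and bpf_htons from bpf/bpf_endian.h, as used in the examples on this page.

```c
#include <arpa/inet.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: does a context-style (address, port) pair, both in
 * network byte order, refer to 127.0.0.1 on the given host-order port? */
bool is_loopback_port(uint32_t user_ip4, uint32_t user_port, uint16_t port)
{
    /* 0x7f000001 is 127.0.0.1 in host byte order. */
    return user_ip4 == htonl(0x7f000001) && user_port == htons(port);
}
```

Comparing the raw field against a host-order constant such as 8080 would only work by accident on big-endian machines, which is why the conversion is applied to the constant.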

family

v4.18

This field contains the address family of the socket. Its value is one of AF_* values defined in include/linux/socket.h.

The context allows 4-byte reads from the field, but no writes to it.

type

v4.18

This field contains the socket type. Its value is one of SOCK_* values defined in include/linux/net.h.

This context allows 4-byte reads from the field, but no writes to it.

protocol

v4.18

This field contains the socket protocol. Its value is one of IPPROTO_* values defined in include/uapi/linux/in.h.

This context allows 4-byte reads from the field, but no writes to it.

msg_src_ip4

v4.18

This field contains an IPv4 address which is the source IP of the message about to be sent. Its value is stored in network byte order.

This field is only valid for the BPF_CGROUP_UDP4_SENDMSG attach type.

This context allows 1,2,4-byte reads and 4-byte writes.

msg_src_ip6

v4.18

This field contains an IPv6 address which is the source IP of the message about to be sent. Its value is stored in network byte order.

This field is only valid for the BPF_CGROUP_UDP6_SENDMSG attach type.

This context allows 1,2,4,8-byte reads and 4,8-byte writes.

Note

8-byte wide loads are only supported since v5.3

sk

v5.3

This field contains a pointer to the socket for which the program was invoked, its type being a struct bpf_sock.

Attachment

cGroup socket address programs are attached to cGroups via the BPF_PROG_ATTACH syscall or via BPF link.
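
As a minimal sketch of the BPF_PROG_ATTACH route, the wrapper below issues the bpf syscall directly. The function name attach_sock_addr_prog is hypothetical, and the choice of the BPF_F_ALLOW_MULTI flag is an assumption made for this example; most real loaders would use a library such as libbpf instead of the raw syscall.

```c
#include <linux/bpf.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical wrapper: attach an already loaded program (prog_fd) to a
 * cGroup whose directory is open as cgroup_fd. Returns 0 on success and
 * -1 on failure with errno set. */
int attach_sock_addr_prog(int cgroup_fd, int prog_fd,
                          enum bpf_attach_type type)
{
    union bpf_attr attr;

    memset(&attr, 0, sizeof(attr));
    attr.target_fd = cgroup_fd;            /* cGroup to attach to */
    attr.attach_bpf_fd = prog_fd;          /* program to attach */
    attr.attach_type = type;               /* e.g. BPF_CGROUP_INET4_BIND */
    attr.attach_flags = BPF_F_ALLOW_MULTI; /* assumption: allow several programs */

    return (int)syscall(SYS_bpf, BPF_PROG_ATTACH, &attr, sizeof(attr));
}
```

The cgroup_fd is typically obtained by opening the cGroup's directory read-only, and prog_fd comes from a successful program load. The attach type passed here must match the one the program was loaded with.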

Example

BPF_CGROUP_INET4_BIND and BPF_CGROUP_INET6_BIND
// SPDX-License-Identifier: GPL-2.0

#include <linux/stddef.h>
#include <linux/bpf.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_endian.h>

static __always_inline int bind_prog(struct bpf_sock_addr *ctx, int family)
{
    struct bpf_sock *sk;

    sk = ctx->sk;
    if (!sk)
        return 0;

    if (sk->family != family)
        return 0;

    if (ctx->type != SOCK_STREAM)
        return 0;

    /* Return 1 (allow) OR'ed with bit 1 (value 2) to indicate
     * that the CAP_NET_BIND_SERVICE check should be bypassed.
     */
    if (ctx->user_port == bpf_htons(111))
        return (1 | 2);

    return 1;
}

SEC("cgroup/bind4")
int bind_v4_prog(struct bpf_sock_addr *ctx)
{
    return bind_prog(ctx, AF_INET);
}

SEC("cgroup/bind6")
int bind_v6_prog(struct bpf_sock_addr *ctx)
{
    return bind_prog(ctx, AF_INET6);
}

char _license[] SEC("license") = "GPL";

BPF_CGROUP_INET4_CONNECT, BPF_CGROUP_INET4_GETSOCKNAME, and BPF_CGROUP_INET4_GETPEERNAME
// SPDX-License-Identifier: GPL-2.0
#include <string.h>
#include <stdbool.h>

#include <linux/bpf.h>
#include <linux/in.h>
#include <linux/in6.h>
#include <sys/socket.h>

#include <bpf/bpf_helpers.h>
#include <bpf/bpf_endian.h>

#include <bpf_sockopt_helpers.h>

char _license[] SEC("license") = "GPL";

struct svc_addr {
    __be32 addr;
    __be16 port;
};

struct {
    __uint(type, BPF_MAP_TYPE_SK_STORAGE);
    __uint(map_flags, BPF_F_NO_PREALLOC);
    __type(key, int);
    __type(value, struct svc_addr);
} service_mapping SEC(".maps");

SEC("cgroup/connect4")
int connect4(struct bpf_sock_addr *ctx)
{
    struct sockaddr_in sa = {};
    struct svc_addr *orig;

    /* Force local address to 127.0.0.1:22222. */
    sa.sin_family = AF_INET;
    sa.sin_port = bpf_htons(22222);
    sa.sin_addr.s_addr = bpf_htonl(0x7f000001);

    if (bpf_bind(ctx, (struct sockaddr *)&sa, sizeof(sa)) != 0)
        return 0;

    /* Rewire service 1.2.3.4:60000 to backend 127.0.0.1:60123. */
    if (ctx->user_port == bpf_htons(60000)) {
        orig = bpf_sk_storage_get(&service_mapping, ctx->sk, 0,
                    BPF_SK_STORAGE_GET_F_CREATE);
        if (!orig)
            return 0;

        orig->addr = ctx->user_ip4;
        orig->port = ctx->user_port;

        ctx->user_ip4 = bpf_htonl(0x7f000001);
        ctx->user_port = bpf_htons(60123);
    }
    return 1;
}

SEC("cgroup/getsockname4")
int getsockname4(struct bpf_sock_addr *ctx)
{
    if (!get_set_sk_priority(ctx))
        return 1;

    /* Expose local server as 1.2.3.4:60000 to client. */
    if (ctx->user_port == bpf_htons(60123)) {
        ctx->user_ip4 = bpf_htonl(0x01020304);
        ctx->user_port = bpf_htons(60000);
    }
    return 1;
}

SEC("cgroup/getpeername4")
int getpeername4(struct bpf_sock_addr *ctx)
{
    struct svc_addr *orig;

    if (!get_set_sk_priority(ctx))
        return 1;

    /* Expose service 1.2.3.4:60000 as peer instead of backend. */
    if (ctx->user_port == bpf_htons(60123)) {
        orig = bpf_sk_storage_get(&service_mapping, ctx->sk, 0, 0);
        if (orig) {
            ctx->user_ip4 = orig->addr;
            ctx->user_port = orig->port;
        }
    }
    return 1;
}

Helper functions

Supported helper functions

KFuncs

Supported kfuncs