Skip to content

Map type BPF_MAP_TYPE_PERCPU_HASH

v4.6

This is the per-CPU variant of the BPF_MAP_TYPE_HASH map type.

This map type is a generic map type with no restrictions on the structure of the key and value. Hash-maps are implemented using a hash table, allowing for lookups with arbitrary keys.

This per-CPU version has a separate hash map for each logical CPU. When accessing the map using most helper function, the hash map assigned to the CPU the eBPF program is currently running on is accessed implicitly.

Since preemption is disabled during program execution, no other programs will be able to concurrently access the same memory. This guarantees there will never be any race conditions and improves the performance due to the lack of congestion and synchronization logic, at the cost of having a large memory footprint.

Note

The bpf_map_lookup_percpu_elem helper can be used to access maps assigned to other logical CPUs which can negate the above mentioned advantages.

Attributes

While the size of the key and value are essentially unrestricted both value_size and key_size must be at least zero and their combined size no larger than KMALLOC_MAX_SIZE. KMALLOC_MAX_SIZE is the maximum size which can be allocated by the kernel memory allocator, its exact value being dependant on a number of factors. If this edge case is hit a -E2BIG error number is returned to the map create syscall.

The max_entries attribute indicates the max entries per-CPU so the actual memory size consumed is also dependant on the logical CPU count of the host.

Syscall commands

The following syscall commands work with this map type:

Helper functions

The following helper functions work with this map type:

Flags

The following flags are supported by this map type.

BPF_F_NO_PREALLOC

v4.6

Hash maps are pre-allocated by default, this means that even a completely empty hash map will use the same amount of kernel memory as a full map.

If this flag is set, pre-allocation is disabled. Users might consider this for large maps since allocating large amounts of memory takes a lot of time during creation and might be undesirable.

Warning

The patch set1 does note that not pre-allocating may cause issues in some edge-cases, which was the original reason for defaulting to pre-allocation.

BPF_F_NUMA_NODE

v4.14

While settings this flag is allowed, only a value of -1 is allowed in the numa_node attribute, which indicates no specific NUMA node. Since each logical CPU has its own hash table, it is impossible to allocate on only a single NUMA node.

BPF_F_RDONLY

v4.15

Setting this flag will make it so the map can only be read via the syscall interface, but not written to.

For details please check the generic description.

BPF_F_WRONLY

v4.15

Setting this flag will make it so the map can only be written to via the syscall interface, but not read from.

For details please check the generic description.

BPF_F_ZERO_SEED

v5.0

Setting this flag will initialize the hash table with a seed of 0.

The hashing algorithm used by the hash table is seeded with a random number by default. This seeding is meant as a mitigation against Denial of Service attacks which could exploit the predictability of hashing implementations.

This random seed makes hash map operations inherently random at access time. This flag was introduced to make performance evaluation more consistent.

Warning

It is not recommended to use this flag in production due to the vulnerability to Denial of Service attacks.

Info

Only users with the CAP_SYS_ADMIN capability can use this flag, CAP_BPF is not enough due to the security risk associated with the flag.

BPF_F_RDONLY_PROG

v5.2

Setting this flag will make it so the map can only be read via helper functions, but not written to.

For details please check the generic description.

BPF_F_WRONLY_PROG

v5.2

Setting this flag will make it so the map can only be written to via helper functions, but not read from.

For details please check the generic description.