Enabling `perf` in Kubernetes with Docker’s default seccomp profile

Have you been trying to profile your Kubernetes applications with perf? Maybe you want to see what all the FlameGraphs fuss is about? If your version of Docker was upgraded within the last year, you’ll likely run into issues.

Starting in Docker v17.06, perf_event_open is blocked by the default seccomp profile, which means running perf inside your container will get you this:

perf_event_open(..., PERF_FLAG_FD_CLOEXEC) failed with unexpected error 1 (Operation not permitted)
perf_event_open(..., 0) failed unexpectedly with error 1 (Operation not permitted)
You may not have permission to collect stats.
Consider tweaking /proc/sys/kernel/perf_event_paranoid:
 -1 - Not paranoid at all
  0 - Disallow raw tracepoint access for unpriv
  1 - Disallow cpu events for unpriv
  2 - Disallow kernel profiling for unpriv

Trying to alter the suggested /proc/sys/kernel/perf_event_paranoid from within the container gets you the expected:

bash: /proc/sys/kernel/perf_event_paranoid: Read-only file system
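Before changing anything, it can help to confirm what the container is actually running under. Here's a minimal sketch, assuming a Linux /proc filesystem; seccomp_mode is a helper name made up for this example. It reads the Seccomp field from a process status file, where 0 means disabled, 1 means strict, and 2 means a filter (such as Docker's profile) is attached.

```shell
# Print the seccomp mode recorded in a /proc status file.
# 0 = disabled, 1 = strict, 2 = filter mode.
# seccomp_mode is a hypothetical helper, not part of perf or Docker.
seccomp_mode() {
    status_file="${1:-/proc/self/status}"
    awk '/^Seccomp:/ { print $2 }' "$status_file"
}

# Only report values when the files are actually present (i.e. on Linux).
if [ -r /proc/self/status ]; then
    echo "seccomp mode: $(seccomp_mode)"
fi
if [ -r /proc/sys/kernel/perf_event_paranoid ]; then
    echo "perf_event_paranoid: $(cat /proc/sys/kernel/perf_event_paranoid)"
fi
```

If the mode comes back as 2, a seccomp filter is in play, which is consistent with the profile (rather than the paranoid setting alone) rejecting the call.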

What to do? You’ll need to enable CAP_SYS_ADMIN. This flag is one of many Linux capabilities, so named for the extra capabilities they grant. These flags grant scoped permission escalations for threads to perform specific tasks, from changing file attributes to altering the system clock. CAP_SYS_ADMIN is a particularly overloaded one, a kitchen sink of privileged operations that ranges from mounting filesystems to setting the hostname. It matters here because Docker’s default seccomp profile only lets perf_event_open through when the container holds it.
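If you want to check whether a process already holds the capability, one rough approach is to decode the CapEff bitmask from /proc/self/status. The sketch below assumes a Linux /proc filesystem and relies on CAP_SYS_ADMIN being capability number 21 in the kernel headers; has_cap is a helper name invented for this example.

```shell
# CAP_SYS_ADMIN is capability number 21 in <linux/capability.h>.
CAP_SYS_ADMIN_BIT=21

# has_cap HEXMASK BIT -> prints "yes" if that bit is set in the mask, else "no".
# has_cap is a hypothetical helper written for this example.
has_cap() {
    mask="$1"
    bit="$2"
    if [ $(( (0x$mask >> $bit) & 1 )) -eq 1 ]; then
        echo yes
    else
        echo no
    fi
}

# Report on the current process when the status file is available.
if [ -r /proc/self/status ]; then
    cap_eff=$(awk '/^CapEff:/ { print $2 }' /proc/self/status)
    if [ -n "$cap_eff" ]; then
        echo "CAP_SYS_ADMIN effective: $(has_cap "$cap_eff" "$CAP_SYS_ADMIN_BIT")"
    fi
fi
```

Running this inside the container before and after the change is a quick way to tell whether the capability actually made it into the process.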

If you’re only working with Docker, you can add --cap-add SYS_ADMIN to your docker run command, as explored here.
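As a rough illustration, the Docker invocation might be wrapped like this. The function name and the image name in the comment are placeholders, and the sketch assumes perf is already installed in the image.

```shell
# Hypothetical helper: start a throwaway container with CAP_SYS_ADMIN added.
# Usage: run_with_sys_admin IMAGE [COMMAND...]
run_with_sys_admin() {
    docker run --rm -it --cap-add SYS_ADMIN "$@"
}

# Example (image name is a placeholder):
#   run_with_sys_admin my-perf-image perf stat -- sleep 1
```

Using --rm means the container with the extra capability is cleaned up as soon as the profiling session ends.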

However, if you’re living that Kubernetes life, you’ll need to enable it using a securityContext. In the container spec of your deployment file, add:

securityContext:
    capabilities:
        add: ["SYS_ADMIN"]

And you’ll be good to go!

Note that you need to strip the CAP_ prefix when adding capabilities in Kubernetes (SYS_ADMIN, not CAP_SYS_ADMIN). You can read more about container privileges in Kubernetes here.
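To show where that securityContext sits in context, here's a sketch that writes out a minimal Deployment manifest. Everything specific in it (the perf-debug name, the app label, the image) is a placeholder for illustration, so swap in your own values.

```shell
# Write a minimal Deployment manifest with CAP_SYS_ADMIN added.
# All names and the image below are placeholders.
cat > perf-debug-deployment.yaml <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: perf-debug
spec:
  replicas: 1
  selector:
    matchLabels:
      app: perf-debug
  template:
    metadata:
      labels:
        app: perf-debug
    spec:
      containers:
        - name: app
          image: example.com/my-app:latest
          securityContext:
            capabilities:
              add: ["SYS_ADMIN"]   # no CAP_ prefix in Kubernetes
EOF

# Then, with cluster access:
#   kubectl apply -f perf-debug-deployment.yaml
```

The securityContext here is set on the container, not the pod, since capabilities are a container-level setting.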

Remember to remove this setting when you’re done using it! perf_event_open is blocked by default because it grants user processes privileged access to the system. Branch deploy your change, use it, then roll back.
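If the capability change was your most recent rollout, one way to back it out is kubectl's rollout undo. This is only a sketch: remove_perf_access is a made-up helper name, and it assumes the previous revision is the one without SYS_ADMIN.

```shell
# Hypothetical helper: revert a Deployment to its previous revision and
# wait for the rollout to finish.
# Usage: remove_perf_access DEPLOYMENT [NAMESPACE]
remove_perf_access() {
    deployment="$1"
    namespace="${2:-default}"
    kubectl rollout undo "deployment/$deployment" -n "$namespace" &&
        kubectl rollout status "deployment/$deployment" -n "$namespace"
}

# Example (names are placeholders):
#   remove_perf_access perf-debug staging
```

If other changes have shipped since, it's safer to remove the securityContext from the manifest and deploy that instead of undoing a revision.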

Alice Goldfuss

Alice Goldfuss is a systems punk with years of experience working on cutting-edge container platforms. She’s an international speaker who enjoys building modern infrastructure at-scale and writing fiction on the weekends.

Alice has written articles, consulted on publications, built communities, and sipped many cups of tea. She hasn’t written a book, but you’ve probably read her tweets (@alicegoldfuss).
