Hey, everyone! I’m Jose, and in this blog post, we’ll explore the challenges of observing and debugging applications in Kubernetes. Following that, I’ll introduce a project that’s ready to take on these challenges and is pushing the boundaries of the eBPF system inspection world: Inspektor Gadget.

Before we jump into the practical aspects of debugging containers, let’s take a moment to understand what eBPF is and its impact on the world of observability.

Intro: eBPF and Observability

eBPF is a kernel technology that lets us extend the kernel’s capabilities by attaching and executing programs inside it, without modifying the kernel’s source code or adding kernel modules. Getting a change into the kernel itself could otherwise take years.

eBPF programs are typically event-driven, meaning they are executed when specific events occur: for instance, when a system call is made, a network packet is received, or a function is called. In each of these cases, an eBPF program can be executed to collect data or perform actions. When it comes to observability, we want our eBPF program to be triggered only when the event we are interested in occurs, so that we collect just the data needed to understand what’s happening in the system.
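To make the attach-and-trigger idea concrete, here is a tiny plain-Python analogy. It is not eBPF and not how eBPF programs are written or loaded; `HookRegistry` and the event names are made up for illustration. It only shows that a program runs when, and only when, the event it is attached to fires.

```python
from collections import defaultdict
from typing import Callable

# Hypothetical names throughout: this mimics the attach-and-trigger model in
# plain Python and is not an eBPF API.
Handler = Callable[[dict], None]


class HookRegistry:
    """Maps an event name (the "hook") to the programs attached to it."""

    def __init__(self) -> None:
        self._hooks: dict[str, list[Handler]] = defaultdict(list)

    def attach(self, event_name: str, program: Handler) -> None:
        self._hooks[event_name].append(program)

    def fire(self, event_name: str, payload: dict) -> int:
        """Run every program attached to event_name; return how many ran."""
        programs = self._hooks.get(event_name, [])
        for program in programs:
            program(payload)
        return len(programs)


collected: list[dict] = []
registry = HookRegistry()
# Only openat system calls are of interest, so only that hook gets a program.
registry.attach("syscall:openat", collected.append)

registry.fire("syscall:openat", {"pid": 101, "path": "/etc/hosts"})
registry.fire("net:packet_received", {"len": 64})  # nothing attached, nothing runs
print(collected)
```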

However, eBPF isn’t just about observability. It has been widely adopted in various areas, particularly networking and security. But for the purpose of this blog post, we’ll focus on observability 🔎.

Challenges in Kubernetes

The intrinsically distributed architecture of applications running in Kubernetes makes them difficult to debug. When there’s an issue, it’s not clear where to start looking or even what tools to use. Most traditional diagnostic tools (e.g., tcpdump, top, ps) and eBPF-powered tools (e.g., BCC) are designed to work at the host and process level. As such, they usually only allow filtering by parameters such as process or user ID.

Additionally, these tools are not container-aware because they have no knowledge of Linux namespaces. This means users must manually correlate the output of these tools with the corresponding container. It also implies that, for network-related issues, users have to move between Linux network namespaces to run tools like tcpdump or ss in the correct context. This process is complex and requires a deep understanding of how container technology works under the hood and, in general, of Linux namespaces.

We can summarise the challenges we found with traditional diagnostic tools and eBPF-powered tools as follows:

  • Lack of container awareness:
    • Data enrichment: We need to manually associate the collected data with the Kubernetes metadata (container, pod, namespace and node) of the process causing the issue.
    • Filtering: We are unable to filter by container or any other Kubernetes metadata.
    • Complexity: We need to move between Linux namespaces to run the tools in the correct context.
  • Deployment: We need to deploy the tools in each node where the application is running.
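To get a feel for the manual correlation described above, here is a minimal sketch, assuming a Linux node where `/proc/<pid>/ns/mnt` is readable. The function names and sample values are made up for illustration; the only real interface used is the `/proc/<pid>/ns/mnt` symlink, whose target looks like `mnt:[4026531840]`. Processes of the same container typically share a mount namespace, so grouping PIDs by that ID is a first step toward mapping host-level output back to containers.

```python
import os
import re
from collections import defaultdict

_NS_LINK = re.compile(r"^(\w+):\[(\d+)\]$")


def parse_ns_link(link: str) -> tuple[str, int]:
    """Split a /proc/<pid>/ns/* link target such as 'mnt:[4026531840]'."""
    match = _NS_LINK.match(link)
    if match is None:
        raise ValueError(f"unexpected namespace link: {link!r}")
    return match.group(1), int(match.group(2))


def mount_ns_of(pid: int) -> int:
    """Return the mount namespace ID of a process (Linux only)."""
    return parse_ns_link(os.readlink(f"/proc/{pid}/ns/mnt"))[1]


def group_by_mount_ns(pid_to_link: dict[int, str]) -> dict[int, list[int]]:
    """Group PIDs by mount namespace ID."""
    groups: dict[int, list[int]] = defaultdict(list)
    for pid, link in sorted(pid_to_link.items()):
        groups[parse_ns_link(link)[1]].append(pid)
    return dict(groups)


# Hypothetical sample data standing in for readlink() results on a node.
sample = {
    1: "mnt:[4026531840]",
    4242: "mnt:[4026532601]",
    4250: "mnt:[4026532601]",
}
print(group_by_mount_ns(sample))
```

Even with this in hand, we would still have to map each mount namespace ID to a container, pod and Kubernetes namespace ourselves, and repeat the exercise on every node.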

Inspektor Gadget: The Journey to Tackle These Challenges

First Steps

Back in 2019, we embarked on a journey to tackle these challenges. We started by collaborating with the BCC project to support filtering by container (using the mount namespace), allowing us to identify events coming from a specific container.

To fully address the remaining challenges, we decided to create a project called Inspektor Gadget. It aims to enable the execution of BCC tools directly on Kubernetes nodes, solving the deployment issue. The very first version of Inspektor Gadget successfully achieved this goal.

Kubernetes Integration

The project gained traction and soon became fully integrated into the Kubernetes ecosystem:

  • Combining output from nodes into a single view.
  • Supporting filtering by pod, namespace, node and/or labels.
  • Enriching data with Kubernetes metadata.
  • Creating the kubectl-gadget plug-in to provide a kubectl-like experience.
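As a rough illustration of the first three points, the sketch below merges per-node event streams into one time-ordered view and filters on Kubernetes metadata. It is a simplified model under stated assumptions, not Inspektor Gadget’s implementation: the `Event` fields, `matches` and `single_view` are hypothetical names, and each per-node stream is assumed to be already sorted by timestamp.

```python
import heapq
from dataclasses import dataclass, field


@dataclass
class Event:
    """An event already enriched with Kubernetes metadata (hypothetical shape)."""
    timestamp: float
    node: str
    namespace: str
    pod: str
    message: str
    labels: dict = field(default_factory=dict)


def matches(event, namespace=None, pod=None, labels=None) -> bool:
    """True if the event satisfies every filter that was provided."""
    if namespace is not None and event.namespace != namespace:
        return False
    if pod is not None and event.pod != pod:
        return False
    wanted = labels or {}
    return all(event.labels.get(k) == v for k, v in wanted.items())


def single_view(per_node_streams, **filters):
    """Merge time-ordered per-node streams into one time-ordered list."""
    merged = heapq.merge(*per_node_streams, key=lambda e: e.timestamp)
    return [e for e in merged if matches(e, **filters)]


# Hypothetical sample events from two nodes.
node_a = [
    Event(1.0, "node-a", "default", "web-1", "exec /bin/sh", {"app": "web"}),
    Event(3.0, "node-a", "kube-system", "coredns-x", "exec /coredns"),
]
node_b = [
    Event(2.0, "node-b", "default", "web-2", "exec /bin/ls", {"app": "web"}),
]

for event in single_view([node_a, node_b], namespace="default"):
    print(event.node, event.pod, event.message)
```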

Covering More Use Cases

At a certain point, we decided to go one step further and start creating tools to cover Kubernetes-specific and more complex use cases that were out of scope for BCC. This led to the creation of several tools, which we started calling “gadgets”. Some of them are:

  • trace dns: Traces all the DNS queries and responses in the cluster. Among other information, it reports whether the query succeeded, whether the response contains an error, the name server used for the lookup and the query-response latency.
  • advise and audit seccomp profiles: The advise gadget generates a seccomp profile based on the system calls that the application actually uses, while the audit gadget reports which system calls the application makes that are not allowed by its seccomp profile.
  • snapshot process and socket: These gadgets take a snapshot of the current state of the running processes and TCP/UDP sockets in the cluster.
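The idea behind the two seccomp gadgets can be sketched in a few lines. This is only an illustration of the logic, not the gadgets’ code or exact output: `advise_profile` and `audit` are hypothetical helpers, the observed system calls are sample data, and the profile uses the JSON layout accepted by container runtimes such as Docker (`defaultAction` plus a list of `syscalls` rules).

```python
import json


def advise_profile(observed_syscalls):
    """Build an allow-list profile from the system calls that were observed."""
    names = sorted(set(observed_syscalls))
    return {
        "defaultAction": "SCMP_ACT_ERRNO",  # deny anything not listed below
        "syscalls": [{"names": names, "action": "SCMP_ACT_ALLOW"}],
    }


def audit(profile, observed_syscalls):
    """Return the observed system calls that the profile does not allow."""
    allowed = {
        name
        for rule in profile.get("syscalls", [])
        if rule.get("action") == "SCMP_ACT_ALLOW"
        for name in rule.get("names", [])
    }
    return sorted(set(observed_syscalls) - allowed)


# Hypothetical system calls seen while the application was running.
profile = advise_profile(["read", "write", "openat", "read"])
print(json.dumps(profile, indent=2))

# Later, the application (or an attacker) tries something new.
violations = audit(profile, ["read", "ptrace", "mount"])
print(violations)
```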

Further Enhancements and Community Growth

We continued to enhance the Inspektor Gadget framework by adding more features, such as:

  • Detection of container creation and deletion: This feature allows us to start monitoring a container as soon as it’s created, without restarting the gadget and without losing the very first events.
  • Further integration with Kubernetes: For network-related gadgets, we added support for converting IP addresses to pod or service names. In this way, users can easily identify the source and destination of the network events.
  • Support for multiple output formats: Apart from the default column-based output, we added support for JSON, JSON pretty, and YAML output formats.
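The last two features can be pictured with the following sketch. It is a simplified stand-in, not the project’s code: the lookup tables, the `resolve` and `render` helpers and the mode names `columns`, `json` and `jsonpretty` are assumptions made for this example, and YAML is left out because Python’s standard library has no YAML encoder.

```python
import json


def resolve(ip, pods_by_ip, services_by_ip):
    """Replace an IP address with a pod or service name when one is known."""
    if ip in pods_by_ip:
        return f"pod/{pods_by_ip[ip]}"
    if ip in services_by_ip:
        return f"svc/{services_by_ip[ip]}"
    return ip  # unknown addresses are left untouched


def render(events, output="columns"):
    """Format a list of event dicts in one of the supported output modes."""
    if output == "json":
        return "\n".join(json.dumps(e, sort_keys=True) for e in events)
    if output == "jsonpretty":
        return "\n".join(json.dumps(e, indent=2, sort_keys=True) for e in events)
    if output != "columns":
        raise ValueError(f"unknown output format: {output}")
    if not events:
        return ""
    headers = list(events[0])
    rows = [[str(e.get(h, "")) for h in headers] for e in events]
    widths = [max(len(h), *(len(r[i]) for r in rows)) for i, h in enumerate(headers)]
    lines = [[h.upper() for h in headers]] + rows
    return "\n".join(
        "  ".join(cell.ljust(w) for cell, w in zip(line, widths)).rstrip()
        for line in lines
    )


# Hypothetical lookup tables, as they might be built from cluster metadata.
pods = {"10.0.0.5": "default/web-1"}
svcs = {"10.96.0.10": "kube-system/kube-dns"}

event = {
    "src": resolve("10.0.0.5", pods, svcs),
    "dst": resolve("10.96.0.10", pods, svcs),
}
print(render([event]))
print(render([event], "json"))
```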

On the community side, Inspektor Gadget has been growing as well: it has an increasing number of contributors and users, and it is now a CNCF sandbox project.

The Sky’s the Limit: Inspektor Gadget’s Evolution

As Inspektor Gadget developed as a project, we started asking ourselves several questions. Why restrict it to Kubernetes? What about standalone containers? What about allowing users to run their own eBPF programs? This led the project to evolve in several ways.

Going Beyond Kubernetes

The challenges described above are not exclusive to Kubernetes; they are also present in standalone containers. This led us to create a version of Inspektor Gadget that can be used with standalone containers and that is fully decoupled from Kubernetes. This is how the ig tool was born.

The ig tool now also supports monitoring processes running directly on the host, outside of any container. This is particularly useful for debugging system-wide issues.

By the way, systemd units support is coming soon as well 🚀.

Running Custom eBPF Programs

At this point, Inspektor Gadget had become a powerful framework that builds, packages, ships, deploys and runs eBPF programs in many different environments. However, it was still limited to the eBPF programs (gadgets) that we provide.

The image-based gadgets idea aims to remove this limitation and make Inspektor Gadget a framework for running any eBPF program, much like Docker does for containers. The image-based gadgets design document provides a detailed explanation of this feature.

Here is a summary of how it works:

  • Build your own eBPF program and package it as an OCI image:

    ig image build -t my-gadget:v1 .
    
  • Manage your gadgets as you do with containers:

    ig image pull|push my-gadget[:TAG]
    ig image list
    
  • Run your gadget:

    ig run my-gadget:v1
    

This feature is already available in ig and kubectl-gadget as experimental. We are working to make the gadgets that we provide available as OCI images as well. You can find more information about the progress in this issue: [EPIC] Implement Image-based Gadgets.

Where is Inspektor Gadget Heading?

Our vision for Inspektor Gadget is to be a uniquely complete tool for eBPF system inspection, with support across Linux host processes, systemd units, containers, and Kubernetes:

Docker is to containers what Inspektor Gadget is to eBPF programs.

Join us in this journey by contributing to the project, providing feedback, or just using it and letting us know your thoughts.