Osquery and containers

Containers on Linux present new detection challenges and opportunities for endpoint security agents. Ryan Mack (Uptycs) looks at ways osquery can be used to extract meaningful information about running containers.

 

Presented at osquery@scale 2021.

Session transcript

Hello, welcome. My name is Ryan. I'm an engineer at Uptycs. I work on osquery and eBPF, with a specific focus on containers. Today I'm going to be talking about how you can leverage osquery in a containerized environment.

I'm going to start by talking a little bit about our goals in a containerized environment. Then I'll cover some of the background that I think is important to understanding the approaches you can use in osquery to inspect containerized environments, go over the tools that are available to you inside osquery, and talk about how we can leverage those tools to make useful detections in a containerized environment. Finally, I'll share some closing thoughts on where we're going next with this technology.

So what is our goal? Our goal is to make it easy to monitor containers and to detect security problems in container-based environments. This means a couple of things. First, it obviously means we want to have as much visibility as we can into everything going on in our containerized environment. It also means that we want to be able to leverage the unique properties of containers, and the way applications are typically deployed in containers, to create detection rules that are potentially higher fidelity than those we have available when operating on a normal Linux host.

Let's start with the background. What are containers? I'm going to give you a really brief history of process isolation on Unix. We go all the way back to slightly before I was born, with the introduction of chroot in Unix V7. Chroot was probably the earliest primitive that we had for process isolation. The idea is that if you have a process that you don't want to see the entire file system, you can constrain its worldview to a subdirectory on your server. Anyone who used to run web servers, or any kind of Linux server that was connected to the internet before containers, is I'm sure familiar with the process of setting up a chroot environment for your process.

Jumping ahead to namespaces: with the concept of everything being part of the file system on Linux, and on Unix in general, Plan 9 introduced the idea of multiple namespaces, meaning different process groups can have a different view of that file system. We can expose different parts of the system to different process groups. That idea came to Linux about a decade later. Linux started with only the idea of multiple namespaces for the file system. So, if you look back at the system calls, or the parameters to the system calls, they're all just called "new namespace". We never had the idea that there were going to be things other than a mount namespace, the file system namespace.

Two big changes came in 2006. The first was cgroups, which is short for control groups. This was a way to limit the resources available to a process group. It allowed you to specify the maximum amounts of CPU, memory, and later even GPU usage available to a particular subset of your processes. This allowed for some level of isolation and fair sharing between different co-tenants on a Unix box. The second was PID namespaces, the addition of multiple process ID spaces on Linux. So when you look at your list of processes inside a container, you might see different numbering and a different subset of the processes than if you ran that on your host. That was introduced to Linux in 2006.

 

After that, additional namespaces were added to support different views of the networking, IPC, and even the list of users that are available on the system. Now that we had all of these different namespaces, different mount points, and things that we needed to configure, LXC, the Linux container project, was our container runtime that actually managed the setup and teardown of all of those namespaces in order to constrain your process group inside a running container.

 

Of course, most of us are familiar with what I would say was the next inflection point in this development, which was Docker in 2013. Docker leveraged LXC, but instead of just focusing on the life cycle of running your process, it really focused on improving the entire development life cycle of your container. It allowed you to create reusable image layers, so that if you wanted to build multiple containers, they could each share common components that you only had to build once. It also really improved and streamlined the distribution of container images, which made container deployment as easy as installing a package on your system. I would say that was probably one of the most important inflection points in the acceleration of container adoption.

And, of course, today, everything is a container. I am presenting these slides from a Google Slides presentation. The web server for that's probably running in a container. I'm recording this video on a web application, probably running in a container. So as you can see, it's been a rapid adoption in the last few years to where most of us are thinking about how to deploy our services inside containers today.

 

So why did I give you this history? Well, because it's an important part of answering the question, "What is a container?" And the answer really has two different ways of thinking about it. The first is all of these kernel features: a process group configured with different namespaces, different cgroups, and a mounting of layered file systems in order to configure the environment where your process runs. In addition to that, we also have the second idea, which is that there's this logical object that lives inside your container runtime. There's an ID that Docker understands to be a container, and an ID that Docker understands to be an image. Where this gets complicated is how these all connect to each other, and that is the challenge we're left with.
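The kernel-side view described above can be inspected directly. As a minimal sketch, assuming a Linux host: each process exposes its namespaces as symbolic links under /proc/<pid>/ns, with targets such as `pid:[4026531836]`. The helper names `parse_ns_link` and `process_namespaces` are illustrative, not part of osquery.

```python
import os
import re

# Namespace links look like "pid:[4026531836]": a kind and an inode number.
NS_LINK = re.compile(r"^(\w+):\[(\d+)\]$")


def parse_ns_link(target):
    """Split a namespace link target into (kind, inode)."""
    match = NS_LINK.match(target)
    if match is None:
        raise ValueError(f"not a namespace link: {target!r}")
    return match.group(1), int(match.group(2))


def process_namespaces(pid="self"):
    """Return {entry_name: inode} for one process.

    Returns an empty dict when /proc/<pid>/ns is unavailable
    (non-Linux system, missing process, or insufficient permissions).
    """
    ns_dir = f"/proc/{pid}/ns"
    result = {}
    try:
        names = os.listdir(ns_dir)
    except OSError:
        return result
    for name in names:
        try:
            _kind, inode = parse_ns_link(os.readlink(os.path.join(ns_dir, name)))
        except (OSError, ValueError):
            continue
        result[name] = inode
    return result
```

Two processes that report the same inode for `pid` share a PID namespace, which is the kind of shared identifier that lets kernel-level data be tied back to a runtime's container object.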

We need tools that are able to reconcile these two ways of thinking about Linux containers. Now, since this is an osquery conference, if you're guessing my answer is osquery, you're right. Osquery provides really comprehensive visibility into things happening inside your containers, in the runtime as well as in the kernel. And it exposes SQL tools that allow you to join these concepts, literally, in your queries.

 

So, what are the tools available to us in osquery for containers? The first is comprehensive container runtime introspection. I'll go into a little more detail on each of these points in a moment, but this basically means that if there's an API in your container runtime, you can expose that information in a query in osquery. Next are system tables that have additional columns with information about the container environment that they're running in. Then there are events tables that have additional container-related information annotated, or decorated, onto your events. And last is what I like to call "container-enabled" system tables. These are tables that typically would operate on the file system mounted on the host, but here you can also leverage additional features to look inside of running containers. We'll look at each of these in a little more detail.

 

What does comprehensive mean? I'm not going to go through each of these, but you can see just by looking at the tab completion in osquery that for Docker, CRI-O, containerd, and LXC, we pretty much have a table representing anything available in the respective container runtime API. Of these, the ones you'll probably use the most include the Docker containers table. What I'm showing here is your ability to leverage the fact that there's a shared ID, the process ID namespace, that you can use to connect a query inspecting your process list with the information that the container runtime understands about that container.
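The join on a shared PID namespace ID can be sketched with Python's built-in sqlite3 module, since osquery queries are SQL. This is an illustration over made-up rows, not osquery itself. The table and column names (`processes`, `process_namespaces`, `docker_containers`, `pid_namespace`) are assumed to resemble osquery's schema and should be checked against the schema of the build you run.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a few invented rows."""
    db = sqlite3.connect(":memory:")
    db.executescript(
        """
        CREATE TABLE processes (pid INTEGER, name TEXT);
        CREATE TABLE process_namespaces (pid INTEGER, pid_namespace TEXT);
        CREATE TABLE docker_containers (
            id TEXT, name TEXT, image TEXT, pid_namespace TEXT
        );
        """
    )
    db.executemany(
        "INSERT INTO processes VALUES (?, ?)",
        [(1, "systemd"), (4321, "httpd"), (4400, "vi")],
    )
    db.executemany(
        "INSERT INTO process_namespaces VALUES (?, ?)",
        [(1, "4026531836"), (4321, "4026532201"), (4400, "4026532201")],
    )
    db.execute(
        "INSERT INTO docker_containers VALUES (?, ?, ?, ?)",
        ("abc123", "/web", "httpd:2.4", "4026532201"),
    )
    return db


# Processes are tied to containers through the PID namespace they share.
JOIN_QUERY = """
SELECT p.pid, p.name, c.name AS container_name, c.image
FROM processes AS p
JOIN process_namespaces AS ns ON ns.pid = p.pid
JOIN docker_containers AS c ON c.pid_namespace = ns.pid_namespace
ORDER BY p.pid
"""


def containerized_processes(db):
    """Return (pid, process name, container name, image) rows."""
    return db.execute(JOIN_QUERY).fetchall()
```

Host processes such as `systemd` drop out of the result because their PID namespace matches no container row.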

 

So, for example here, we're able to connect the processes running inside a container to the name of that container coming from the Docker runtime. Of course, not every process is going to get captured if you're periodically polling the processes table. We want to leverage the events tables to capture very short-lived processes, as well as file and socket events. Those data sources now come from eBPF, which you've seen in a presentation earlier in the conference, and that captures enough container information that we can add these decorations that allow you to understand the container context where these events are happening. In these examples, I'm first showing you process events where a user is running vi inside of the container, and we're able to connect that back to the image name and the container name from the container runtime. Similarly, we're able to detect a file modification and connect it back to the container name and the container ID from the runtime. That's going to be very important as we start to build detection rules, where we want to compare our understanding of what a container should be doing with things the system is actually doing that might be a security problem.
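Decorating an event with container context amounts to a lookup keyed on something the event and the runtime both know. The sketch below is an assumption-laden illustration: events are plain dictionaries, and the field names (`pid_namespace`, `container_id`, `container_name`, `image`) are hypothetical rather than osquery's actual event columns.

```python
def decorate_event(event, containers_by_ns):
    """Return a copy of the event with container context attached.

    containers_by_ns maps a PID namespace ID to a dict describing the
    container as the runtime knows it. Events from host processes get
    None for every container field.
    """
    container = containers_by_ns.get(event.get("pid_namespace"))
    decorated = dict(event)
    decorated["container_id"] = container["id"] if container else None
    decorated["container_name"] = container["name"] if container else None
    decorated["image"] = container["image"] if container else None
    return decorated
```

For example, a process event for `/usr/bin/vi` whose namespace matches a known container comes back carrying that container's name and image, while the original event dictionary is left unchanged.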

 

The next category that I want to talk about is "container-enabled" system tables. The first example here is looking at the /etc/passwd file. On your host, you have an /etc/passwd file, but most likely inside your running containers, you have a different one. These extensions allow you to check the contents of the file system within the scope of your running container to extract useful information about that environment. Here, you'll also see the ability to join the shell history table with the users table, both from inside that running container. In this case, this is actually showing us that, from the information we saw earlier, a user used vi to add a new user and then started using that user to edit files inside the container.
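The users-to-shell-history join can be sketched outside osquery, assuming you already have the text of a container's /etc/passwd and a list of history entries. The standard passwd format has seven colon-separated fields; the `(uid, command)` history shape and the function names are hypothetical.

```python
def parse_passwd(text):
    """Parse passwd-format text into a list of user dicts."""
    users = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split(":")
        # name:password:uid:gid:gecos:home:shell
        if len(fields) != 7 or not fields[2].isdigit() or not fields[3].isdigit():
            continue
        name, _password, uid, gid, _gecos, home, shell = fields
        users.append(
            {"username": name, "uid": int(uid), "gid": int(gid),
             "directory": home, "shell": shell}
        )
    return users


def commands_by_user(users, history):
    """Join history entries (uid, command) to usernames by uid."""
    names = {user["uid"]: user["username"] for user in users}
    return [(names[uid], command) for uid, command in history if uid in names]
```

Running both functions against the container's copy of the file, rather than the host's, is what makes a user that exists only inside the container show up in the result.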

 

Here's another example that I find particularly useful. You can run your package listings, in this example your Debian package listings, simultaneously on the host and in all of the running containers, and extract whatever information you want. In this case, we're just doing a quick summary of the number of packages installed on the host and in the running containers. So that covers, more or less, the kinds of data available to you inside osquery that are relevant to containers.
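The package summary is a grouped count. A minimal sketch using sqlite3 with invented rows follows; the `packages` table and its `scope` column (holding either `host` or a container name) are assumptions made for illustration and do not reflect how osquery labels container-scoped rows.

```python
import sqlite3


def package_summary(rows):
    """Count packages per scope from (scope, name, version) rows."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE packages (scope TEXT, name TEXT, version TEXT)")
    db.executemany("INSERT INTO packages VALUES (?, ?, ?)", rows)
    result = db.execute(
        "SELECT scope, COUNT(*) FROM packages GROUP BY scope ORDER BY scope"
    ).fetchall()
    db.close()
    return result
```

The same grouping could be extended to compare versions of one package across the host and every container.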

 

Now, how do we leverage that to make useful detections in a containerized environment? Well, I want to talk a little bit about how you can think about detections in a container that might be a little different from your host. First, we want to understand why containers are special from a detection point of view.

 

First, containers are immutable. In theory, building your software and installing your packages all happens during the steps when you're building the container image. Once the container is deployed and running, typically you aren't going to be installing new software in there. Next, containers are single purpose. Again, this isn't always true, but we'd like to see containers where each container is running one service and generally has a consistent set of other services that it talks to over time. Next, nobody logs into a container. Typically you deploy your container in production, and you aren't debugging inside of a running container on your production system. And last, containers have well-defined behavior. This is sort of a summary of points two and three here, but it does mean that when your container is doing something different, you can actually characterize those deviations over time. That matters because none of the four things I've just told you is strictly true all of the time, and everyone uses containers a little bit differently.

 

Let's look at examples of how these properties of containers can be turned into detection rules. First: containers are immutable. So here's an example of a detection rule written on process file events coming out of osquery. If, for example, you see a file being created inside your container with one of the execution bits set in its mode, it's possible that someone is introducing a new script or a new binary inside your running container. That's definitely a signal, and it may not be typical for your production environment. Similarly, you don't normally run a package manager like rpm or apt to install new packages in your container after it's running. Here, we're looking at a detection rule based on process events when someone runs a package manager inside of your container. And again, here's an example of a file rule. If someone modifies a file on disk inside your container in one of your system paths, it's quite possible that someone is trying to replace one of your standard system binaries with malware.
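The three immutability rules can be sketched as predicates over event dictionaries. Everything about the event shape here is an assumption for illustration: the field names (`container_id`, `action`, `mode`, `path`), the action values, and the lists of package managers and system paths are hypothetical and would need tuning for a real deployment.

```python
import posixpath

EXEC_BITS = 0o111  # owner, group, and other execute bits
PACKAGE_MANAGERS = {"rpm", "yum", "dnf", "apt", "apt-get", "dpkg"}
SYSTEM_PATHS = ("/bin/", "/sbin/", "/usr/bin/", "/usr/sbin/", "/lib/", "/usr/lib/")


def in_container(event):
    """True when the event carries a non-empty container ID."""
    return bool(event.get("container_id"))


def executable_file_created(event):
    """File event: a new file with any execute bit set."""
    return (
        in_container(event)
        and event.get("action") == "create"
        and (event.get("mode", 0) & EXEC_BITS) != 0
    )


def package_manager_run(event):
    """Process event: the executable is a known package manager."""
    name = posixpath.basename(event.get("path", ""))
    return in_container(event) and name in PACKAGE_MANAGERS


def system_path_modified(event):
    """File event: a write or create under a system directory."""
    return (
        in_container(event)
        and event.get("action") in {"create", "write"}
        and event.get("path", "").startswith(SYSTEM_PATHS)
    )
```

Each predicate returns False for host events, so the same event stream can feed host rules and container rules without the two interfering.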

 

Let's look at the next property we were talking about: containers are single purpose. For example, if you're just running Apache httpd and your image name starts with httpd, you might be surprised if there are other processes running inside that container. So here we've written a detection rule based on process events for the httpd container that fires if it starts to see executables with different names. Obviously your container deployment may be different, but this lets you think about the fact that for a given container image, you can start to characterize the processes that you expect to see in it and provide very high signal when something abnormal is happening.
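A single-purpose rule can be sketched as a per-image allowlist of executable names. The mapping from image-name prefix to expected executables and the event fields `image` and `path` are hypothetical; a real list would come from observing your own images.

```python
import posixpath


def unexpected_process(event, expected_by_image_prefix):
    """True when a process event's executable is not expected for its image.

    expected_by_image_prefix maps an image-name prefix (such as "httpd")
    to the set of executable names that image is expected to run.
    Events from images with no matching prefix are not flagged.
    """
    image = event.get("image", "")
    executable = posixpath.basename(event.get("path", ""))
    for prefix, allowed in expected_by_image_prefix.items():
        if image.startswith(prefix):
            return executable not in allowed
    return False
```

With `{"httpd": {"httpd"}}` as the expectation, an `httpd:2.4` container running `/bin/sh` is flagged, while the web server process itself is not.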

 

Here's another example. If your application only communicates over TCP, this is a rule on the socket events table: if you detect an outgoing connection over UDP, excluding DNS, that's an indication that someone might be communicating with a command and control server from software running inside of your container. The next property is that nobody logs into a container. Again, here we're detecting process events. If any process is invoked with sshd as its parent process, or if you detect any login processes inside your container, you've probably detected someone who has figured out how to log into your container when you weren't expecting it. So I think this is a useful signal in a container environment as well.
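Both rules can be sketched as predicates, under stated assumptions: events are dictionaries with hypothetical fields (`container_id`, `action`, `protocol`, `remote_port`, `path`, `parent_name`), the protocol is the IP protocol number (17 for UDP), and DNS is identified only by destination port 53.

```python
import posixpath

IPPROTO_UDP = 17  # IP protocol number for UDP
DNS_PORT = 53
LOGIN_PROCESSES = {"login"}  # hypothetical list; extend for your environment


def unexpected_udp(event):
    """Socket event: outgoing UDP from a container, other than DNS."""
    return (
        bool(event.get("container_id"))
        and event.get("action") == "connect"
        and event.get("protocol") == IPPROTO_UDP
        and event.get("remote_port") != DNS_PORT
    )


def interactive_login(event):
    """Process event: spawned by sshd, or itself a login process."""
    if not event.get("container_id"):
        return False
    name = posixpath.basename(event.get("path", ""))
    return event.get("parent_name") == "sshd" or name in LOGIN_PROCESSES
```

Matching DNS by port alone is a simplification: traffic to port 53 on an unexpected resolver would pass this rule and might deserve a rule of its own.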

And lastly, all of our applications are going to be different and as you deploy any sort of detection rules, you will over time determine what are the things that are showing up as noise in your alerts. This is just an example that as you characterize your container's behavior, you can slowly build up rule sets to reduce the noise and really ensure that all of your alerts are high signal.
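Building up rule sets to reduce noise can be sketched as a list of suppressions applied to alerts. This is an illustrative assumption about how such tuning might be represented: each suppression is a dictionary of field values, and an alert is dropped when it matches every field of any suppression. The field names are hypothetical.

```python
def matches(alert, suppression):
    """True when the alert has every field value the suppression names."""
    return all(alert.get(key) == value for key, value in suppression.items())


def filter_alerts(alerts, suppressions):
    """Keep only alerts that match none of the suppressions."""
    return [
        alert
        for alert in alerts
        if not any(matches(alert, suppression) for suppression in suppressions)
    ]
```

Keeping suppressions narrow, for example by naming both the rule and the image, avoids silencing the same rule for containers where the behavior is not expected.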

 

Of course, the combination of multiple events is a really good indicator that you've been compromised. This is a screen grab from one of our tools where you can build a detection from correlation across multiple events. We're showing here multiple process events, network events, and file operations, each of which indicates that there's a good chance something is happening, but in this case we've seen 90 such signals within a 15-minute window. That's a very strong indication that someone has actually breached one of your containers. I think this is actually a capture of detonating a malware sample in one of our research environments. So that's how you can build detection rules and leverage them in a container environment to give you really high signal detections.
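Counting signals per container inside a time window can be sketched with a sliding window. This is not the correlation logic of the tool shown in the talk; the function name, the `(timestamp, container_id)` input shape, and the defaults (a 900-second window and a threshold of 90, taken from the figures mentioned above) are assumptions for illustration.

```python
from collections import defaultdict, deque


def correlate(signals, window_seconds=900, threshold=90):
    """Return the containers that reach the threshold within one window.

    signals is an iterable of (timestamp_seconds, container_id) pairs,
    assumed to be sorted by timestamp.
    """
    recent = defaultdict(deque)  # container_id -> timestamps still in window
    flagged = set()
    for timestamp, container_id in signals:
        window = recent[container_id]
        window.append(timestamp)
        # Drop timestamps that have fallen out of the window.
        while window and timestamp - window[0] > window_seconds:
            window.popleft()
        if len(window) >= threshold:
            flagged.add(container_id)
    return flagged
```

A container that produces the same number of signals spread over many hours never fills a single window, so it is not flagged.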

Let me go over again what we've been talking about so far. First of all, our goal is to give good visibility into a containerized environment and to write detection rules that are really high signal and give you confidence that when there's an alert, it's actionable and something you should be paying attention to. We briefly covered the background of how containers evolved on Linux and how that introduced both concepts inside the kernel, like cgroups and namespaces, and the concepts you're more familiar with from your container runtimes. We talked about the tables and tools that are available to you inside osquery when you're operating in a containerized environment, and how we can leverage those tools to build high signal detection rules that give you good, actionable intelligence when there's the possibility of a security breach in your containerized environment.

 

Last, I want to talk a little about where I think we're going next with this work. The future, I think, is evolving in three directions. The first one I would talk about is that we're moving onto more layers of abstraction. Everything we've talked about here is really focused on a host and on that kernel and container runtime environment, but going forward a lot more of this is being managed for you by container orchestration frameworks. So in the future, we definitely need to think more about how we incorporate information coming from Kubernetes or from your cloud provider into your analysis and your detection rules. In addition, part of that abstraction is that we're not necessarily even deploying hosts or node groups anymore. Now we're working with things like Amazon Fargate, where we're just deploying containers and allowing the cloud provider to figure out how to map that onto actual virtual machines. So now we have to think about how we build detection rules and our detection environments inside this host-free environment. And if you take that to the extreme, we're starting to think about how you handle Amazon Lambda, where you're not necessarily even running services anymore; you're possibly just writing a function and pushing it for Amazon to manage the full life cycle.

 

The next direction of evolution is towards more secure container environments. I think this is particularly interesting for us right now. Amazon Fargate runs on a modified hypervisor that is actually much more secure but it eliminates a lot of the approaches that we're currently using for process inspection. For example, you can't run eBPF and you can't even run Audit inside of a container running inside Amazon Fargate. Similarly, Google's introduced gVisor, which is this interesting user space layer between your application and the kernel that provides additional security, but also limits our visibility. So that's the continued direction for research.

 

And last, more intelligent. Everything I've talked about today is around how you can build detection rules for your containers based on your understanding of the expected workloads. Going forward, machine learning is a really powerful tool for understanding the baseline behaviors of your container application, and given that containers are generally single purpose, it's a lot easier to build well-trained models of what's considered normal and abnormal behavior for your containers. These are the kinds of directions I see things going, and hopefully we'll have interesting stuff to talk about in future presentations. Thank you very much for your time. If you have any questions, and you're watching this live at the osquery@scale conference, I'm happy to answer them in our chat room right now. Thank you and have a great day.
