Overview

About this video

What You'll Learn

  1. Use the Kubernetes securityContext seccompProfile field (GA in 1.19) to attach seccomp profiles to pods and containers instead of unvalidated pod annotations.
  2. See how the operator, running as a DaemonSet, copies seccomp profiles from annotated ConfigMaps into the profile directory on every node.
  3. Trace blocked syscalls with strace and generate seccomp profiles by recording a workload with Podman.

Daniel Mangum and Sascha Grunert walk through the Kubernetes seccomp operator: what seccomp is, installing the operator, applying profiles to nginx pods, tracing blocked syscalls with strace, and generating profiles with podman.

Chapters

  1. 0:00 Holding screen
  2. 0:30 Introductions
  3. 3:20 What is seccomp and the seccomp operator
  4. 18:00 Installing the seccomp operator
  5. 20:00 Seccomp profiles
  6. 31:00 Deploying nginx with and without a seccomp profile
  7. 57:00 Switching to Linux because Docker for Mac wasn't working
  8. 1:01:00 Tracing blocked syscalls
  9. 1:04:00 Listing syscalls with strace
  10. 1:09:30 Using podman to generate seccomp profiles

Transcript

Generated from the English captions.

0:30 Introductions

0:42 Hello. Hello. Hey, how's it going? Alright. How are you both today? Doing pretty good. I'm fine. Excited to be joining. I guess this is sort of my second time on the stream. I don't know if the first one counts. It wasn't official. It was impromptu. The best streams, though, right? The impromptu ones where you have no idea what's going on. Right. This is my twentieth time, and I still have no idea how to introduce this stream other than, like, hi. Like

1:21 eventually, I'll come up with some sort of script. But for now, you know, it's working alright. Yeah. You've been getting these streams out super fast. When did you start doing this specific show? Because I've been watching, and it's almost, like, daily at this point. Whereas I think I've been doing streams for about a year now, and I'm at, like, 21 shows. So your pace is infinitely faster than mine. Yeah. I'm averaging three a week right now. Wow. Which is quite a lot. But, I mean, as you two have just

1:55 found out today, there's, like, zero prep. I pretty much just find people that are doing cool shit and say, hey, do you wanna come and join me? And then just, you know, free flow, kinda go through the technology and see what works and what doesn't. It's not like there's a whole lot of prep going into this that's taking loads of time. Fair enough. Not that I should be broadcasting that to other people that are gonna tune in and watch. This is really polished, we all know exactly what we're doing. We rehearsed for weeks.

2:22 Okay. So today's mission is we're gonna talk about some cool Kubernetes stuff. I believe, and you can correct me if I'm wrong here, that this is mostly possible since Kubernetes 1.19, which came out last week. Is that correct? Oh, we are backwards compatible. So we are always trying to bring enhancements to the community which work on previous versions of Kubernetes too. So yeah. But the initial reason for our work was that we found out during the graduation of seccomp to GA in 1.19 that, hey, there are so many enhancements which could probably go with that feature,

3:02 but it probably would be better to put them out of tree. And now we have our little community around it, and we are getting more and more contributors to the project. And, yeah, we're pretty happy that we can provide some valuable content to the community in that way. Yeah. I'm really excited to play with this today, actually. Mhmm. It's one of those things where, you know, security is a big deal, especially when we're talking about cloud computing and Kubernetes, and especially if you've got public-facing traffic of any kind. So, you know, attack vectors and all that

3:20 What is seccomp and the seccomp operator

3:34 other stuff. I'm not gonna pretend I know about security, but I know a few words I can drop in, like attack vector. And the seccomp operator, which is a project that you two have been involved in, I think, is a big component of that. So do you wanna just take a minute to describe what this project is, how it works, and how it helps people be more secure? Dan, do you wanna take it? Sure. Yeah. I can jump in on that. So as Sascha said, seccomp has existed in Kubernetes for quite a while, and in the Linux

4:03 kernel as well. So, basically, with seccomp, you know, when you have a process running in user space, it typically, to do anything meaningful, needs to access some of the functionality in kernel space. And to do that, it uses syscalls. Right? So it says, hi, kernel, you have more privilege than me, please go do this task for me. And so you can imagine, since it has to communicate in this language, you could pretty easily put in a layer there to filter what calls are going through to the kernel. And that's a pretty effective way to, you

4:37 know, establish security boundaries on things, because the functionality of different syscalls is pretty well defined, though there are a lot of them, which we'll probably find out later on in the stream. But being able to restrict what syscalls a process can make is really effective. And especially when you bring in containerization and can apply profiles to that, it can get a lot more effective. So anyway, you've been able to do this for quite a while with annotations on pods and containers in Kubernetes, and you can also do that on pod security policies to enforce these sorts of

5:13 things. Most container runtimes, and for folks who aren't as familiar with Kubernetes, Kubernetes is an orchestration layer that sits on top of a container runtime. Right? So you could swap out the underlying container runtime, which we may talk about a little bit. But most of them ship with kind of a default seccomp profile, which is like, this is, you know, the bare minimum security you could have here. It's not gonna do anything too atrocious. But you won't even see most people enabling that, right, because it is an annotation you have to add.

5:45 So it's kind of like unstructured data. It's not a formal field if you look at the API spec for a pod. So, yeah, you don't see it used a lot in Kubernetes even though it's been around for quite a while. And as Sascha mentioned, as part of the 1.19 release, Sascha and I and Paulo Gomes, who led a big part of this as well, graduated seccomp to GA. So essentially, what that means, and this sounds kind of simple, but if you're familiar with how API changes work in Kubernetes, you'll know it's

6:14 not super simple. We basically just took those annotations and made them a formal part of the security context. So that has a number of benefits. One of the biggest I see is that it's just, you know, a field that's there in the API spec that you can set as part of your security context. It's in the documentation as such, and it's less of an afterthought. So anyway, basically the same functionality for the most part, but just moving that to the actual formal pod spec. And as Sascha mentioned, as part of that,

6:45 we wanted to make it easier to use seccomp. So once again, for folks who aren't as familiar with Kubernetes, typically there are a number of nodes running, and each of those nodes has a container runtime on it. The Kubernetes scheduler is gonna go and take your pods, which are made up of containers, and put those on the nodes, which are gonna run them with the container runtime that they're interfacing with. So you can imagine, if seccomp is something that's at the kernel layer, that is on a per-node basis. Right?

7:17 So you can't just say, I'd like this seccomp profile to be used, and assume that it's present on all nodes. Right now, you have to actually go in, you know, when you're setting up your machine or SSHing into your machine or something like that, and put the profiles in a directory that's specified with a kubelet flag, and make sure those are present on all of your nodes. So for something that you can configure on a pod spec, it's very divorced from the actual running of a workload.
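To make the per-node dependency concrete, here is a minimal Python sketch of how a `localhost/...` profile reference corresponds to a file on each node's disk. The root directory is an assumption (the kubelet's usual default, which a cluster can override), and `node_profile_path` is a hypothetical helper for illustration, not something Kubernetes or the operator provides.

```python
import posixpath

# Assumption: the kubelet's usual default seccomp profile root. The real
# location depends on how the kubelet on each node is configured.
KUBELET_SECCOMP_ROOT = "/var/lib/kubelet/seccomp"


def node_profile_path(profile_ref, root=KUBELET_SECCOMP_ROOT):
    """Map a 'localhost/<relative path>' reference to a file path on the node."""
    prefix = "localhost/"
    if not profile_ref.startswith(prefix):
        raise ValueError("only localhost/ profiles are files on the node's disk")
    relative = posixpath.normpath(profile_ref[len(prefix):])
    # Refuse references that would escape the seccomp root.
    if relative == ".." or relative.startswith("../") or posixpath.isabs(relative):
        raise ValueError("profile path must stay inside the seccomp root")
    return posixpath.join(root, relative)


print(node_profile_path("localhost/profiles/audit.json"))
```

If that file is missing on the node a pod lands on, the profile cannot be applied there, which is the gap the operator is meant to close.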

7:45 Right? It's very much a cluster administrator activity, which, you know, has some pros and cons. Typically, the cluster administrator is gonna be knowledgeable of those things and can do it. But if you want to constantly be changing the seccomp profiles that are available and creating new ones for different processes you're running in different containers, then that's kind of a pain. Right? So let's say I'm running a workload and I'm just a developer in my organization. I need to create my pod. You know, I don't want to have to

8:15 email my cluster administrator and say, hey, will you go make sure there's a seccomp profile on this node? And then I'm gonna put, you know, specific node taints on here to make sure the pod gets scheduled there, and do all that. It's a mixing of responsibility there, and it's kind of moving away from this whole pattern that Kubernetes is trying to enforce. So what we'd like to do is take the seccomp functionality and make it available more at that developer level, or just make it easier for cluster administrators to use and create pod security

8:48 policies. So, anyway, the seccomp operator is kind of typical of Kubernetes operators. It watches an API type. Right now, we're just using ConfigMap, so there's no new API type that's added. And that config map has data in it, which is basically just seccomp profiles. And it runs as a daemon set. So this operator is running on every node, and it actually goes and puts those in the correct directory, which means that you can then say on your pod, you know, use the seccomp profile with this name, which I know exists because I

9:19 can kubectl get my config maps that are seccomp profiles and know which ones are available to me, and then apply them to my workloads. And we can run through doing all of this, but that's kind of the main idea. We'd like to make it really easy to use this really valuable functionality. Okay. There is a whole lot of information there. So what I'm gonna do is make sure that I have understood it correctly, and I'm gonna try and play it back to you now. Sounds good. So, seccomp. Right? Let's remove Kubernetes and containers for now. It's

9:53 a Linux kernel primitive that allows me to define the syscalls that any binary running on that machine is allowed to execute. Right? Exactly. The seccomp profile is something that I have to craft myself, and I put it in a directory that the Linux kernel expects it to live in. And so how do you connect the profile to the binary? Like, setting Kubernetes aside, if I was just doing this on a Linux machine, how would that work? Yeah. I mean, more or less, the profiles themselves are not written in JSON. So, for example, you can use

10:30 libseccomp, which is a C library, and then you can use it to interface with the seccomp BPF filter, and then you can generate or build the filter. For example, you allow only a subset of syscalls for your currently running architecture. And then you have to call something like a filter load, and then the filter gets loaded into the kernel and attached to a process. And, yeah, that's it. So it has to be attached to a running process. Oh, okay. So that was a bit of a disconnect for me then. So I have a running process. I attach

11:05 the profile to it, and that process is then secure. If that exits and is rerun by someone else, would I have to reattach the profile? I think the profile is gone if the process exits. But I think, originally, seccomp had no support for a data structure where you can specify different syscalls. In the first place, it was just something like, okay, we want to use seccomp or we don't want to use it. And if we enable seccomp for a process, then the process can't make new syscalls. So

11:37 that was the main intention, but that wasn't flexible enough. And then they decided to go for BPF. It's not eBPF, so it's not this extended BPF filtering, it's just plain BPF. So you can imagine it as something like a lightweight kernel module which gets loaded at runtime and then, yeah, traces and blocks syscalls. So we have different actions. Right? We can block syscalls, but we can also trace them or log them. That is possible as well. Okay. Then from my naive perspective, is it

12:09 okay to think of seccomp as a firewall for syscalls? Yeah. Probably. Yeah. Okay. The next stage there that Dan went over was that Kubernetes has support for seccomp profiles, and they're handled through annotations on the pod spec. And the challenge with that is that they don't always exist on all the nodes. Is that correct? Yeah. So there's a couple of challenges. The first one is that, you know, annotations are, like, an unstructured bag of data. Right? So it's not immediately clear. You're not gonna get validation on your annotations like you would with,

12:53 you know, the OpenAPI schema of a pod. So that's kind of the first thing. But more in relation to what you were saying, when you create a pod with this annotation, let's say it's pre-1.19, before it moved to the actual security context, if that pod gets scheduled to a node that doesn't have that seccomp profile, then that would be an issue. Right? Because it couldn't apply it. So, typically, well, actually, I guess it depends on the organization. But typically, you'd either have them on all nodes in your cluster or

13:27 you'd have a very specific workload. Maybe this is even more likely, because seccomp isn't super common to have enabled on workloads: you'd have a specific node that you're forcing scheduling to that has the seccomp profiles there. So yeah. Basically, the pod is an abstraction for scheduling, but seccomp doesn't really fit into that abstraction, unfortunately. Right? It's node specific. And you see other things like, let's say you want to use GPUs or something like that, you know, that's on a node. There are other times you have to

14:01 do this. But because this is built into the container runtime, you basically shouldn't have to do that. Right? It's not like it's specific hardware. It needs to be, obviously, a Linux node and that sort of thing. But we're kind of paying a price right now that you shouldn't have to, at least in our opinion, I think. Okay. I'm kinda following along. I think the best way to tackle this is, let's play with it. Right? Yeah. Yeah. I mean, we also have this kind of hierarchy when it comes to

14:34 annotations, or passing data around with annotations. Right? So this is, from my perspective, a bit of a problem for getting users to actually use the feature, because they have to specify an exact annotation. And then they can say, okay, now this profile applies to the pod, but they can also say, okay, it's slash container name, and then you can reference your container name inside of the pod to apply the seccomp profile only to a single container. And this fact already makes it pretty complicated to use. And, yeah, we can

15:08 solve that issue by having a dedicated field in the security context of the container and the pod. Yeah. I think if we jump over to the repo, we could look at some examples. I think we have some pods with annotations and with the security context. Probably helpful to take a look there. The seccomp operator repository, right? Yep. In the Kubernetes SIGs org. Yep. I think Sascha already put a link in here too. There we go. Nice. You can ignore this mind map right now. It's just our brain dump. I'm not sure I want to ignore it.
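The two styles being compared here can be sketched side by side. The Python below builds the same pod twice: once with the pre-1.19 alpha annotation and once with the `seccompProfile` field in `securityContext`. The profile path, pod name, and image are hypothetical placeholders; the annotation key and field names follow the Kubernetes API as of 1.19, and the per-container annotation variant (`container.seccomp.security.alpha.kubernetes.io/<container name>`) is left out for brevity.

```python
import json

# Hypothetical profile path, relative to the node's seccomp profile directory.
PROFILE = "profiles/nginx.json"


def pod_with_annotation(name, image, profile):
    """Pre-1.19 style: seccomp is requested through an unvalidated annotation."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {
            "name": name,
            "annotations": {
                # Applies to every container in the pod.
                "seccomp.security.alpha.kubernetes.io/pod": f"localhost/{profile}",
            },
        },
        "spec": {"containers": [{"name": name, "image": image}]},
    }


def pod_with_security_context(name, image, profile):
    """1.19+ style: seccompProfile is a schema-validated securityContext field."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "securityContext": {
                "seccompProfile": {"type": "Localhost", "localhostProfile": profile}
            },
            "containers": [{"name": name, "image": image}],
        },
    }


# kubectl accepts JSON manifests as well as YAML.
print(json.dumps(pod_with_security_context("nginx", "nginx:1.19.1", PROFILE), indent=2))
```

A typo in the annotation key is silently accepted by the API server, while a typo in the `securityContext` field is rejected by schema validation, which is the practical difference being described.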

15:51 It's very colorful. Later. Later. Okay. So what do you want me to pop open then? I think installation and usage might be good. It's up there at the top. That would make sense. Yeah. Definitely. Alright. So, yeah, we have it on the stream there. And then here, first it's looking at installing the operator itself, and then it's looking at creating a profile. So we talked about beforehand that right now, you'd go and actually put these JSON files in a specific path on your node. And so what we're showing here basically is

16:33 just putting those JSON files in a config map. And what the operator is gonna do is see that, because of the annotation there, that's, you know, seccomp.security.kubernetes.io/profile. So the operator is saying, look at all config maps with this annotation on them, and then it's gonna take all of those profiles inline there and just put them in a directory on all of your nodes. Right? So it's a daemon set. This operator is running on every node. So each one of those processes will go and actually put that on the

17:03 node. And if you scroll down, we show usage of it, and this usage obviously is not specific to the seccomp operator. You could manually put these in there and then do it. But you'll see the top one here is with 1.19, where we have the security context there and seccompProfile is an actual field. And then below that, before 1.19, you have that annotation, which is a little more cumbersome. Right? You're not gonna get that validation and that sort of thing. So, yeah, that's the general idea. And as

17:39 Sascha also mentioned, with a security context, you can have it at the pod level, and then you can also have it more granularly at the container level. And just like anything else in the security context, the container level, the more granular one, is gonna override the top one. But if you're doing that with the annotations here, they're at the same level but overriding each other. So it can be a little bit confusing as well. Here, they're both being applied at the pod level, though. Alright. Got it. So I guess I could just run through these

18:00 Installing the seccomp operator

18:12 steps on my cluster. Right? Yeah. Sounds good. It'll be nice and easy. It's gonna work, and then that's it. So I am running on a Mac. And I still don't know how to look at tabs. There we go. So that means that I have a server version of 1.16. So if I'm working with the Docker for Mac Kubernetes, that means I need to use the annotation-based approach. Right? Yes. You would need to if you're using Kubernetes 1.16. Perfect. Well, let's install this operator. And I appreciate that there's a kubectl

19:03 apply and not create. And this is going to put stuff, I'm assuming, in my kube-system namespace? No, we use our own. Yeah. Exactly. And so you'll just see one pod running here, right, because it's a single-node cluster you have here, presumably. It's a daemon set, so just one will be running here. One thing you could also do is, well, I guess you can see it right there. The other things are created. So you can see that we have a config map, which is for the seccomp operator itself. So, you know, as you

19:46 would want with something that is gonna manage your seccomp for you, we apply seccomp to it as well. So that's what that config map is for. Then you'll also see the default profiles there, which is just a collection of profiles that would be, we think, universally useful, especially right now when people are just kinda getting their feet wet with seccomp. There are some things to try out. So yeah. Alright. So I guess I could just pop this open and take a look at it. Right? Yeah. Go for it. And you're gonna be able to tell me what is

20:00 Seccomp profiles

20:20 actually going on here. Yeah. We're gonna describe every syscall to you and exactly what happens in the kernel. Well, here's a random pop quiz for you. How many syscalls are there? What is it, 200? Are there 300 yet? I don't think so. There's 200 something. Yeah. 200 something. Okay. Doesn't matter then. The seccomp operator ships with its own seccomp profile. Right. Yeah. And, Sascha, you or I could go through how we actually, there is the init container, I think, which is kind of interesting in how this gets applied. So

20:57 we might wanna look at, you know, there's kind of a chicken-and-egg thing. Right? How can the seccomp operator do the things it needs to do with the seccomp profile if it doesn't exist yet to put the seccomp profile there? There's a chicken-and-egg thing we can go through in a little bit. But, Sascha, do you wanna give kind of a rundown of what this profile has here, just the different sections of the JSON? Yeah. First of all, we have our default action. So we just want to

21:25 throw an error in any case for any syscall which isn't one of the syscalls which are allowed. So this is the list down there. Any other syscalls should be forbidden. And then we have our architectures. So we don't have support for something like MIPS or s390 or something like this. We just support x86 architectures for now. And then you have a, yeah, a list of syscalls which are allowed. And if you go a little bit more down, can you go down to the list? There should also be an action

21:55 related to those syscalls. Yes. So it's the action allow, and those syscalls are allowed. We could also do it the other way around. Right? So we could allow all syscalls and deny some of them. But then, if there's a new syscall coming into the system, we would probably allow it. So that's not the intention, not the right security approach from our perspective here. And now we have a little list of syscalls, and some of them are necessary to actually spawn the process. For example, if

22:23 you look at execve, if we weren't allowed to run execve, then we wouldn't be able to actually spawn any process. So runc or the underlying container runtime wouldn't be allowed to actually execute the workload process. So this is something we really have to keep in mind. And there's also something like uname down below, probably. It's just a rough guess. Yeah. There's uname. And this is something which is required, for example, by the Go runtime. So we wrote a Go application, and when we start this Go application, it does something like a uname to

23:02 look up some, yeah, information on which system we are running on and stuff like that. So the runtime has a little garbage collector running in the background, and it's necessary to run uname. If we, for example, deployed something like a C binary which is statically linked, then we probably would not need uname at all, or we wouldn't need syscalls like read or something like this. And it wasn't that easy to craft the profile, but we kinda recorded the syscalls. And even after recording the syscalls, we just

23:36 saw that, okay, there are some syscalls missing. And this also kinda influences how we develop the application. Right? So if we now want to add new features which need new syscalls, then we indirectly have to adapt the seccomp profile as well. And this can be kind of a pain, but yeah. I mean, it's really a security-driven development approach, and I really appreciate that. Yeah. So, Sascha, I think you pointed out something really important there, that some of these syscalls are kind of unrelated to the actual process that's eventually getting run. Right? Some

24:12 of them are specific to it running in a container runtime, etcetera, which I think really emphasizes something we were talking about before we started the stream, about wanting to be able to record the profiles using the seccomp operator. Because if you were just looking at this binary by itself, depending on how you are running it on your machine, you may not need all of these different calls here, or you may need some that aren't present here, etcetera. So, you know, developing one for the specific context you're running in can be

24:45 a bit challenging, which also comes up, right, if you want to, for instance, I work on another open source project for my day job, and we did a stream a while back on crafting a seccomp profile for it. It's a little bit difficult to ship a seccomp profile with your thing that's gonna be generic. Right? Because you don't know the context that people are gonna be running your application in. You don't know the underlying container runtime. You don't know a variety of different things. You don't even know if

25:17 they're running it in Kubernetes, perhaps. So shipping a seccomp profile alongside it is difficult. But we think that just as you install, let's say, a service mesh that has security contexts on some of the pods that are running, you should be able to ship a seccomp profile with that, and that should be able to work. And if for some reason it just absolutely cannot, then we should have an easy out for that. Right? So that's another area where we'd like to make this a more seamless process.
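The profile layout walked through in this chapter (a default action, a list of architectures, and an allowlist of syscalls with an action) can be sketched as follows. This is a trimmed, hypothetical profile in the JSON layout container runtimes consume; the syscall list is illustrative and far too short for a real workload, and `check_profile` plus the partial `KNOWN_ACTIONS` set are made up here for a basic sanity check.

```python
import json

# Hypothetical allowlist profile. Anything not named below falls through to
# defaultAction and fails with an error.
profile = {
    "defaultAction": "SCMP_ACT_ERRNO",
    "architectures": ["SCMP_ARCH_X86_64", "SCMP_ARCH_X86", "SCMP_ARCH_X32"],
    "syscalls": [
        {
            "names": ["execve", "uname", "read", "write", "exit", "exit_group"],
            "action": "SCMP_ACT_ALLOW",
        }
    ],
}

# Partial list of action names, enough for this sketch.
KNOWN_ACTIONS = {
    "SCMP_ACT_ALLOW",
    "SCMP_ACT_ERRNO",
    "SCMP_ACT_LOG",
    "SCMP_ACT_KILL",
    "SCMP_ACT_TRAP",
    "SCMP_ACT_TRACE",
}


def check_profile(candidate):
    """Return a list of structural problems; an empty list means it looks sane."""
    problems = []
    if candidate.get("defaultAction") not in KNOWN_ACTIONS:
        problems.append("defaultAction is missing or unknown")
    for index, rule in enumerate(candidate.get("syscalls", [])):
        if not rule.get("names"):
            problems.append(f"rule {index} lists no syscall names")
        if rule.get("action") not in KNOWN_ACTIONS:
            problems.append(f"rule {index} has a missing or unknown action")
    return problems


print(check_profile(profile))
print(json.dumps(profile, indent=2))
```

Because a ConfigMap carries the profile as an opaque string, a check like this has to happen outside the API server, which is part of the motivation for the CRD discussed later.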

25:50 Okay. Right. It's starting to click. I'm getting there. So I think the first question I've got is, we've just deployed this operator now. What is the process when I get this wrong? Like, say I go and run a pod here which has a profile and is trying to execute a syscall that isn't in it. What kind of errors, what logs am I getting back? Is there something to help me understand what's going on? Yeah. So the first thing during development of such a profile could be to set the

26:30 action not to errno, but to SCMP_ACT_LOG. Right? Mhmm. Which logs the messages into the audit log. And the audit log is auditd, and auditd is not available on every system, we have to admit. And it's also configured differently across systems. But on many default distributions, like Ubuntu, for example, that would result in the action being logged. And you have to look up the syscall name from the syscall number if you just get a number.
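The number-to-name lookup mentioned here can be sketched in a few lines of Python. The audit record below is an illustrative, trimmed example rather than captured output (real records carry more fields and vary by distribution), and the table is a small subset of the x86_64 syscall numbers; `logged_syscall` is a hypothetical helper.

```python
import re

# Partial x86_64 syscall table. A real lookup needs the full table for the
# architecture the node runs on.
X86_64_SYSCALLS = {
    0: "read",
    1: "write",
    59: "execve",
    60: "exit",
    63: "uname",
    231: "exit_group",
}

# Illustrative, trimmed audit record; not real captured output.
SAMPLE = (
    'type=SECCOMP msg=audit(1598000000.000:100): pid=4242 comm="nginx" '
    "arch=c000003e syscall=63 compat=0"
)


def logged_syscall(record):
    """Return the syscall name from a SECCOMP audit record, or None if absent."""
    if "type=SECCOMP" not in record:
        return None
    match = re.search(r"\bsyscall=(\d+)\b", record)
    if match is None:
        return None
    number = int(match.group(1))
    return X86_64_SYSCALLS.get(number, f"unknown({number})")


print(logged_syscall(SAMPLE))
```

Collecting the names that show up while the workload runs under a logging profile gives a starting allowlist, with the caveat raised later that only the code paths actually exercised are covered.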

27:08 Alright. I've got an idea. So before I go through my weird idea, let's pop open the other config map. Right? So we've got the default one. So the idea behind this default one then, and I'll just put words in your mouth and you can correct me, is that this should allow most processes to run, so I guess it's blocking the more extreme or dangerous syscalls. Is that what it does? I mean, with this one, we wanted to provide some default profiles for some common applications. And, for example, this is for NGINX,

27:44 I think. We only have one for NGINX right now. Oh, okay. I think I misunderstood that. I thought this was, like, a default profile that would just block really dangerous syscalls and let everything else through. Oh, no. No. Most container runtimes already ship their own default profile. But on the other hand, we have to say that the default profile in Kubernetes is unconfined, so it's not applying seccomp at all. The runtime's profile is called runtime/default, or the deprecated profile is docker/default. Yeah. That's a really good point. So right now, without the seccomp operator

28:22 at all, you can use the underlying container runtime's default profile, which does exactly what you're talking about, David: it blocks super dangerous things. Honestly, you know, that is enhanced security, but it doesn't really go a super long way in preventing things, because it doesn't want to break your functionality. Right? But that is definitely the first step. I'm glad you pointed that out. That's definitely the first step to using seccomp, just using the runtime default profile. And if you're seeing issues, then you probably have a fairly privileged workload that

28:55 you're trying to run there. Alright. I'm curious. What happens if I don't allow access to this syscall? It depends. It depends on the application. Right? So this is another good point. When recording syscalls, it's not like when you start a process it just makes every syscall ever, especially if it's a long-running process like an operator. The seccomp operator is a great example of this. Depending on the input it receives, it might make different syscalls. So you can't just execute it and be like, alright.

29:29 I know everything that's running. You technically would have to go through, like, every path to be totally sure, well, not literally every path. But basically, the idea is that you need to have some real-world use of it to be able to understand what calls it would make, which is a great case for recording, right, because you can just monitor something in its normal operations and see what happens there. Alright. Awesome. I'm feeling brave now. But, for example, exit is pretty interesting, because it would mean that the process is not allowed

30:07 to terminate itself. That's what I was wondering. Like, what if it's finished? Yeah. And this could result in the process just hanging around. Well, if it tried to make a syscall, I guess it depends on what your policy is. But if it's, you know, an error policy, I believe that the process is gonna get killed. So maybe you're effectively exiting there anyway. Like, maybe we should just remove the exit call from all seccomp profiles and then just

30:39 have it killed when it makes it. But that's a funny one to look at for sure. Yeah. So if the default action is to audit and an application wants to exit, is it just gonna live forever? No. If the default action was to audit, it would log that the exit call is made and allow it to actually exit. So it doesn't prohibit it from happening. Okay. Cool. There we go. Let's deploy NGINX then. Is that the next step here?
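The difference between an error default and a log default can be modeled in a few lines. This Python sketch is a simplification under stated assumptions: it resolves the first matching rule and ignores argument filters and stacked filters, and the helper names are hypothetical. It only shows which action a profile assigns to `exit`, not what the kernel or the process does afterwards.

```python
# Actions under which the syscall still reaches the kernel.
PASSES_THROUGH = {"SCMP_ACT_ALLOW", "SCMP_ACT_LOG"}


def action_for(profile, syscall):
    """Return the action a profile assigns to a syscall (first matching rule)."""
    for rule in profile.get("syscalls", []):
        if syscall in rule["names"]:
            return rule["action"]
    return profile["defaultAction"]


def reaches_kernel(profile, syscall):
    """True if the syscall is executed, whether or not it is also logged."""
    return action_for(profile, syscall) in PASSES_THROUGH


# Hypothetical profile that forgets to allow exit.
no_exit = {
    "defaultAction": "SCMP_ACT_ERRNO",
    "syscalls": [{"names": ["read", "write"], "action": "SCMP_ACT_ALLOW"}],
}
# Same allowlist, but unlisted syscalls are logged instead of rejected.
audit_only = dict(no_exit, defaultAction="SCMP_ACT_LOG")

for label, candidate in (("errno default", no_exit), ("log default", audit_only)):
    print(label, "exit ->", action_for(candidate, "exit"), reaches_kernel(candidate, "exit"))
```

With the log default, `exit` is recorded and still goes through, which matches the answer given on stream.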

31:00 Deploying nginx with and without a seccomp profile

31:09 Yeah. Alright. So I could just do this the really hacky way, and then we edit it and add the profile. Or is there something in the documentation here that's gonna make this, okay, we'll do it this way. I'll follow the documentation. So, NGINX with that. That's not... Oh, the namespace has to be changed, I think. Also the config map name. Oh, so this is the 1.19 syntax, actually. So what we're saying is I can't use that, and I need to use this one.

31:54 So I can just copy the annotation. So, localhost. I'm assuming this operator segment is the operator. My namespace is going to be the operator's, and then default profiles, NGINX. It's NGINX 1.19, the other one. So if I wanted to get that, that's just the details from when I described the config map. Default. Alright. Just trying to make sure that I know how that is built up. So this is the key here. The namespace of the default profiles is coming from the namespace. That's localhost, operator, namespace. Yeah. Config map name, and then the key

33:05 within the ConfigMap. Yeah. So the reason for that long path was to not be able to override the seccomp profiles from a different namespace if I don't have access to it, and things like that, to have security... yeah. The pathing is the security pattern here. Yeah. And as we'll see later on, you can also run the operator itself in a single namespace, basically only watching for ConfigMaps in that namespace. So you could say, like, all of my ConfigMaps will always be in the seccomp-operator namespace, and I don't want

33:38 people with access to other namespaces and the ability to create ConfigMaps to do that. That could become less relevant, as one of the things we're exploring right now is creating a seccomp CRD where you have a specific type that you could grant RBAC on. So there's kind of a trade-off, right, because using a ConfigMap is something people are already familiar with and know how to structure. But you have unstructured data in there now. Right? It's just like a bag of JSON that's not getting validated on creation. And you can't apply RBAC specifically for

34:12 you know, you could use OPA or something to write policies, but you can't apply RBAC to say, like, you can't create seccomp ConfigMaps. Right? You can only say you can't create ConfigMaps or something like that, and, obviously, those are used for many different things. You may not want to do that. Is there a plan to introduce a CRD in future versions? Yeah. There's actually a PR open right now, and I definitely wanna give the person a shout-out. So, Sascha, I didn't remember their name, if you want to drop it. Colleen. Yeah.

34:40 Colleen. So she's been working on that, and there's a PR open on the repo if folks wanna, like, weigh in on that. It's a fairly straightforward implementation, right, because there's already a defined API for a seccomp profile. So it's kind of just adding that in. But there are a couple of decisions about how we want that structured, if you want to be able to put multiple in one CRD, etcetera. So definitely, people are welcome to weigh in on that. Awesome. I'm looking forward to playing with that.
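The long profile path described a moment ago (localhost, operator, namespace, ConfigMap name, key) can be sketched as a small helper. This is an illustration under stated assumptions: the segment order follows what is said on the stream, the example names are hypothetical, and the validation is only there to show why embedding the namespace keeps one namespace from shadowing another's profiles.

```python
def operator_profile_path(namespace, configmap, key):
    """Join the path segments the operator uses for a synced profile.

    Each segment must be a plain name, so a ConfigMap in one namespace
    cannot point at a path that belongs to another namespace.
    """
    for part in (namespace, configmap, key):
        if not part or "/" in part or part in (".", ".."):
            raise ValueError("invalid path segment: %r" % (part,))
    return "/".join(["operator", namespace, configmap, key])


def annotation_value(namespace, configmap, key):
    """Value for the pre-1.19 seccomp pod annotation."""
    return "localhost/" + operator_profile_path(namespace, configmap, key)


if __name__ == "__main__":
    # Hypothetical names, chosen only for the example.
    print(annotation_value("seccomp-operator", "default-profiles", "nginx.json"))
```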

35:16 So what was I doing? We were gonna apply this NGINX pod. There we go. That's my problem with these streams: I just have too many random thoughts in my head, and I digress far too much from what I'm actually supposed to be doing. That's how you end up talking about interesting stuff, though. Right? I hope so. I certainly find that interesting. That's for sure. It's definitely more fun for the people on the stream. Cool. I'm glad. I'm glad. So we have a test pod now that is running with the

35:48 NGINX profile based on the annotation. Nice. Let me just check then. Right? So I can do a port forward to our test pod and just confirm that I actually have NGINX. To me, that's step one here. Nice. I'm gonna go in the database part. Better wrong. Let's see here. Oh, this says Caulfield. Is that what that is? Potentially. Yeah. Yeah. I mean... So I was gonna exec into it to do something that NGINX wouldn't do, and already it's blocked me. So... Yeah. That's a good example

36:46 right there. I mean, you want to restrict your images from things they don't need; you don't want someone to be able to exec into it if they don't need to. But here, that adds another layer. Let's say you wanna use the NGINX pod and you don't want people to exec into it. Apparently, your seccomp profile will disallow you from doing that here. So what I'm gonna do is go add that syscall and just assume I know how to do it. If I get anything wrong, yell at me. And we do have a

37:15 question, if you're feeling brave as well. So I'll pop that up just now while I edit the ConfigMap. Cool. The question is from... Go ahead. The question is from Robert, and he's asking: could the operator log syscalls an app made that were prevented from running by the profile? Yeah. So if you set the default action, or the action on a stanza, to log instead of, you know, errno or allow, then it is gonna log those out. I think that we could make that a little bit easier to consume. Like,

37:49 there are ways to get to that log. For instance, in the example in the Kubernetes docs that we added, I'm running a kind cluster locally. So I just literally tail the syslog on my local machine, and I can see those syscalls logged out there. But, you know, we could potentially make that easier to consume, or you could already export those somewhere else, and probably people frequently are. But building something into the operator that would do that would definitely be interesting as well. I don't think we want to, like,

38:24 shove too much into this operator. Right? Like, it shouldn't be your logging mechanism, for instance. But if there's something around making that easier to consume, I definitely think so. One thing we could do is some sort of eventing on a pod to say, like, oh, this was killed for this reason. Right? It should already say, if we get the pod now, that something happened. It might actually even say that message there. But if there's something we can add to that to make it a little

38:55 more transparent, that'd be a great idea for sure. Awesome. Really good answer. So while you were answering that question, I did edit our ConfigMap, and I added that syscall, which... oh, and I can't even remember the name of it. There we go. getpgrp. Which syscall is that? I'll ask you on the spot. I have no idea. Sascha, save me. Let's just see what... It's related to users and groups somehow, but I'm not exactly sure what it does. So it looks like it's setting and getting process groups.

39:41 Yeah. So it gets a group ID. So I've added that. Now I already have a running pod. I'm assuming that wouldn't be updated in real time. So will I have to kill the pod and redeploy? Exactly. Yeah. Okay. So we can delete that test pod. Yeah. Oh, okay. I expect another syscall now to show up. Yeah. Me too. This is the joy of crafting seccomp profiles. And like I said, in that tutorial, we do it by just logging all of them and then adding them. Well, we

40:22 also give you the completed one so you don't have to do it. But when I was writing that, I was logging them all and, you know, it's like, oh, this one works now. Alright. What's the next one we need to fix? So it can be confusing for sure. Yeah. Well, I just did the gross period to you though, and of course, there's no way that's just full. Some let's give that now. Okay. So... Good. How many of these do you think we're gonna have before I actually get inside of this container? Oh, a few for sure.
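The add-one-syscall-and-retry loop described here amounts to a small edit of the profile JSON each time. A minimal sketch follows, assuming the profile uses the usual `syscalls` list of stanzas with `names` and `action`; the profile content is invented for the example, and the edit returns a copy so the original can be kept for comparison.

```python
import copy


def allow_syscall(profile, name):
    """Return a copy of the profile with one more syscall in its allow list."""
    updated = copy.deepcopy(profile)
    for stanza in updated.setdefault("syscalls", []):
        if stanza.get("action") == "SCMP_ACT_ALLOW":
            names = stanza.setdefault("names", [])
            if name not in names:
                names.append(name)
            return updated
    # No allow stanza yet: create one.
    updated["syscalls"].append({"names": [name], "action": "SCMP_ACT_ALLOW"})
    return updated


if __name__ == "__main__":
    # Hypothetical starting profile.
    profile = {
        "defaultAction": "SCMP_ACT_ERRNO",
        "syscalls": [{"names": ["read", "write"], "action": "SCMP_ACT_ALLOW"}],
    }
    print(allow_syscall(profile, "getpgrp"))
```

As noted on the stream, a pod that is already running keeps the old profile, so the pod has to be deleted and recreated after each change.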

41:04 There's no syscall logged here, though. It cannot set terminal process. Alright. So let's assume... like, we're already understanding why this is a bit painful. Now, we knew this was coming. So why don't... because it could be dozens. Right? There's no point in going through these one by one. Yeah. I mean, for this... there may be a point in doing it for your production workload, but you're right. There could be dozens. Right? Because it's gonna kill on the first one. You could also set it to log, right,

41:42 and potentially get all of those at once. Yeah. So... Okay. That's a really good idea. But I'm curious first. Can we set the default runtime one first? Can you tell me how to do that? Then we'll come back and maybe change the NGINX one to do log instead of errno. Does that make sense? Yeah. Sure. Sascha, do you wanna instruct that one? I don't... we want to change... oh, okay. I thought we would like to change not to errno, but to log, for the NGINX profile. And then we

42:16 could look into the audit log. I'm not sure if auditd is configured on your system, but I would expect so. So we should already see some syscalls coming in from the workload. So what I was curious about was: if we don't wanna use the default ones that are provided, there is the container runtime default. I was wondering if I could apply that and then exec into the container. Then, yeah, this will work. Yeah. We can try it out. Yeah. So how do... I'm assuming the annotation is the same. Right? So... Yes.

42:49 Yeah. So what's... It's just runtime/default. Is that it? That's all I need to get a default. Yes. Yeah. Why is that not in every piece of documentation for deploying pods ever? Because it's an annotation. Maybe. Maybe because no one who's using Kubernetes knows what seccomp is. I'm not sure. That's probably a huge generalization, but I know a lot of people are not familiar with it. But anyway. But this... sorry, Andy. Go. Go. Sure. Yeah. The fun fact is that we're actually running a different application. Right? So we're running on top of Bash now and not

43:31 on top of NGINX. So... yeah. That's what I was really curious about. That's what I wanted to do with the default NGINX profile, because I didn't think I would get blocked from getting inside the container. That was just a value add. That was cool. What I wanted to do is something dangerous. Like, the attack vector that seccomp is helping people with is: if someone penetrates my application and gets access to the container, we wanna stop them doing things that this application isn't supposed to do. So now that I'm in here, is there a way for

44:01 me to introspect that seccomp profile? Is it available in the container under /proc/1? No. I don't think so. The underlying container runtime, in our case probably runc, applied the profile already. So it's not possible for the process at all to inspect the profile now, because the process is not that privileged. We also... yeah. We don't have access to something like this, but what could... So what would it block me from doing? Like, what would be a bad syscall? I mean, runtime default is not very constrained. Right?
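As a side note to the /proc question: the filter contents are not readable from inside the container, but on Linux the kernel does expose the seccomp mode of a process as the `Seccomp:` field of `/proc/<pid>/status`. The sketch below only parses that field; the helper name and the sample text in the test are made up for the example.

```python
# Values documented for the Seccomp field in /proc/<pid>/status.
SECCOMP_MODES = {0: "disabled", 1: "strict", 2: "filter"}


def seccomp_mode(status_text):
    """Extract the seccomp mode from the text of a /proc/<pid>/status file.

    Returns None if the field is absent (for example on non-Linux systems).
    """
    for line in status_text.splitlines():
        if line.startswith("Seccomp:"):
            value = int(line.split(":", 1)[1].strip())
            return SECCOMP_MODES.get(value, "unknown")
    return None


if __name__ == "__main__":
    try:
        with open("/proc/self/status") as handle:
            print(seccomp_mode(handle.read()))
    except OSError:
        print("no /proc/self/status on this system")
```

A result of `filter` only says that some profile is attached; it does not reveal which syscalls the profile allows.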

44:37 Let me just double check. Yeah. I'm taking a look at it right now. Because I'm assuming, right, if I could run ps, I would only see one process there anyway. I mean, I guess... can I send a syscall to the kernel from here? Right? Is that possible? Can I just send a syscall? You could try. Let's see. Significant syscalls. Here, I'll drop this in the chat too. Yeah. Let's do that. I'm curious. Do we have something like nsenter

45:18 in the binary, nsenter, for example? Yeah. I'm assuming we can maybe get some tools. You could use, like, setns or something like that. Probably Debian based. Yes. setns or nsenter should not work. This should be blocked, probably, because it would allow modifying the namespace. Oh, namespace. It's there. No. Is nsenter in the path already? That sounds dangerous. Run a program with different namespaces. I mean, that sounds like something I shouldn't be able to do with the runtime default. I don't think it's actually explicitly blocked,

46:13 though, from what I'm looking at. So let's see. Does it have... wait. Let's look at the ones there. Well, I'll just look at that title then. So we're looking for setns and say it's been to Well, lately, Ubuntu. I'd say that setns is allowed by default, I think. The profile I'm looking at... probably I'm looking at a different profile. So right now, as an attacker, I'm in quite a good position, even with the runtime default. Yeah. It depends on what you're trying to

46:55 do. But like we're saying, it's pretty permissive. Okay. Right. We'll move on. I'm not sure what I can do here to try and trigger that default seccomp profile. Can we see that profile? The one I just put in the chat here looks like the one for Docker. So this has a list of all the blocked ones. I don't know if, like, clone may be available. That one is also part of the allowed syscalls. It is part of the allowed ones? Yeah. It says blocked on the Docker docs here.

47:48 Alright. It's interesting. It also says I can't mount. So in theory, if I were just to, like, do a loopback mount, if I can even remember the syntax for that. Yeah. You can just... yeah. Something like this. I'm never gonna remember that design. Which profile were you looking at, Sascha? I'm looking into the containers/common... Oh, okay. ...profile, which is used by Podman and CRI-O. But it was originally adopted from Docker, which is the JSON. Yeah. Just the link. This one. Nice. Okay. So that works. Yes. But that says

48:49 it's blocking me there, I'm assuming? That's just the permission levels of the user. Right? So it's not actually even getting to making the syscall, I don't think. Is that how you interpret that, Sascha? But if you just leave those arguments away and just do something like mount one to two or something like this, it should probably also not work. Yes. Yeah. It's blocking. Looks like this is blocking. Okay. So we can test that, right, by just deleting our wonderful pod here, removing that one line, and then seeing if I can execute that. Yeah. I don't

49:31 know what I'm trying to show to myself here. I'm just very curious. So I'd have to remove the workload. Right? So one. Oh, two, one, one. Yeah. I think that's the same error. Alright. Oh, well, there goes that plan. So what we're gonna do now is change the NGINX profile to not give me the errno, and then it should allow me to exec in, and then we can understand the default action stuff. So I pop open this default-profiles again. And the default action is this. Right? Mhmm. So if you just change

50:25 errno to log, then when you encounter something, it'll just log it instead of killing the process. Alright. Just click that. Okay. And then I can delete and reapply. Did I run get pods? What about broken? Yeah. Describe the pod and see what the events look like there. And I create some sales tour. So is my profile... Yeah. I'm wondering. Does your profile exist? I'm assuming that's the problem. What have we done here? So, yeah, this is a very explicit allow, or default action. But where would I find the default actions

51:36 if I wanted to see what options I have available here? I believe the only actions are log, errno, and allow. I'm fairly certain. Let me make sure on that, though. Yeah. Those are the actions which are part of the runtime spec, the OCI runtime spec. I always break stuff. Yeah. That's interesting. So did you apply the updated profile before you created this pod? Yeah. Well, yeah, I edited it inline. How does the operator handle changes to the ConfigMap? Do I need to restart the operator? You shouldn't need to.
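The errno-to-log switch being attempted here can be sketched as a guarded edit of the profile. The action names below are the constants used in OCI-style seccomp JSON; the set is an assumption about which names are worth accepting, and it includes a few beyond the three mentioned on the stream (kill, trap, and trace also appear in the OCI runtime spec).

```python
import copy

# Assumed set of action names for this sketch; check the OCI runtime spec
# version your container runtime implements before relying on it.
KNOWN_ACTIONS = {
    "SCMP_ACT_KILL",
    "SCMP_ACT_TRAP",
    "SCMP_ACT_ERRNO",
    "SCMP_ACT_TRACE",
    "SCMP_ACT_ALLOW",
    "SCMP_ACT_LOG",
}


def with_default_action(profile, action):
    """Return a copy of the profile with a different defaultAction."""
    if action not in KNOWN_ACTIONS:
        raise ValueError("unknown seccomp action: %s" % action)
    updated = copy.deepcopy(profile)
    updated["defaultAction"] = action
    return updated


if __name__ == "__main__":
    strict = {"defaultAction": "SCMP_ACT_ERRNO", "syscalls": []}
    print(with_default_action(strict, "SCMP_ACT_LOG"))
```

Rejecting a misspelled action up front is the point of the check: a ConfigMap is unvalidated JSON, so a typo would otherwise only surface when the container fails to start.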

52:30 Will you kubectl describe the ConfigMap? Because we do event on the ConfigMap, so you should be able to see if anything went wrong there. Let's see. So three minutes ago; that would suggest that it was saved successfully. Interesting. Why don't I edit it again? Yeah. And then also look at that pod spec and make sure that it's the right name. For errno, we said it's a a note that I describe again. Five nine four minutes with the a saved seccomp profile. What does that mean? Is it not picking

53:30 up the changes? It's just a weird Docker for Mac thing. We're gonna find the notes. Yeah. Try... Oh, we just remove it. Oh, no. Yeah. Try removing the profile, and then let's see if we can add it again. Or you can just delete the ConfigMap either way. Yeah. Let's delete the ConfigMap rather than doing that. So delete the ConfigMap. Do you have a copy of it locally? I just copied it. Yeah. Oh, cool. You can have default profiles right, and all. And I'll remove all those generated for that.

54:24 The one I need there. Yeah. Make sure the app label? Profile true. You shouldn't need the label, but you do need that annotation there. But, yeah, if you leave it, it probably won't hurt. Yeah. Okay. So we've got this. And our profile, that was good to me. Let's change this back to log then. K. And we'll apply our default profile. Nice. So let's read this gray for a second, by the way. Okay. Cool. So now that we've got this as log, we can now redeploy our NGINX pod, and I may or may not be able

55:15 to exec into that. Oh, that load definitely seems to be... Yeah. That's interesting. Oh, let's see. Air training. That's interesting. I haven't seen this. Should we just ditch Docker for Mac? Is that where we move over to a straight-up Linux box? Maybe that's going to be a bit easier. We could give that a try. Yeah. I'm just curious if Docker for Mac maybe doesn't implement the default action. I mean, is that possible? You're saying it's an OCI spec thing, so it really shouldn't be. Right? Yeah. It shouldn't be an issue to change

56:10 it from errno to log. But, yeah, we could try it on your Linux box if you want. There was something that Sascha said earlier where it logs to auditd. Is that what you... Yeah. But now what if that doesn't exist? Because this is gonna be a LinuxKit VM. Like, I'm not sure if things like auditd would be there. I think it should still work. I don't know. Correct me if I'm wrong, Sascha. I'm just I'm not swinging in. I'm not completely sure about the included bits of the logging mechanism.

56:48 I know it logs to /var/log/messages, and it can also log to, yeah, auditd, but I'm not sure. Interesting. Okay. Yeah. Let's try it on the Linux box and see what happens. Yeah. Alright. So if we close this one, this is my end of XPN. But I came prepared because I figured there might be something that happened with Docker for Mac. Although, I'm not that prepared because I didn't install kind. So I don't think that's gonna be available there. Cube. Otherwise, I'll just take care of it. Yeah. Okay. Let's just get the Kind off.

57:00 Switching to Linux because Docker for Mac wasn't working

57:43 We also have a question. So why don't I leave that to you two while I do this? So Bella is asking, maybe a silly question... there's also a saying, there's a safe space here. Can this operator be used on pod-specific labels? Not right now. But we were thinking... oh, it just depends on what pod-specific labels means. I would interpret it as having something like a filter mechanism for the profiles, to apply the profiles to a pod. We were considering that. But we have no real feature how we could implement it. So one idea

58:29 would be to have something like mutating webhooks, so that we could, yeah, reference seccomp profiles in an easier way directly from the workload, so that the operator can actually modify a pod to apply a profile automatically after recording or something like this. Yeah. The closest you could get to that right now, I guess, is using a pod security policy that enforces... Yep. ...using it, which isn't going to actually add it for you. Right? It's just gonna say, like, oh, you can't create this pod without the seccomp annotation or security context here.

59:08 Alright. Thank you for that question. If anyone else has any more questions, feel free to drop them in the chat. We will do our best to... well, I won't, but, you know, my wonderful guests here will do their best to answer them. So I now have a kind cluster. Is it by magic? I don't have kubectl. Thank you for being too confident. See, you learn so much on this livestream. Right? You learn how to install kubectl. You learn how to do seccomp stuff. You're gonna have to break everything. I am just on fire.

59:47 So... On kind, it should definitely work because we have end-to-end tests running in kind, which also check the audit log. Okay. We're back to where we were. So let's edit it. Not gonna have my autocomplete now. That's gonna drive me crazy. Edit our ConfigMap, default-profiles. We're gonna change this from errno to log. Yeah. We are going to run this pod. X dot yaml. And we are using the default profile, which we've now just modified. And kind spun up 1.18, so we're still gonna use an annotation instead of

1:00:47 the security context. Sounds good. That won't be the same error? Looks like you do. Well, no. So this is a new machine. It won't have the image yet. Oh, gotcha. Yeah. Yeah. We'll connect the image. It's gonna be okay. I'm so impatient. Like, what is next? Although, maybe it's not gonna kill the image because I would expect that to come now, which means maybe there is an error. Maybe... Yeah. Let's see. Yeah. I'm just gonna push it. Code. Oh, no. That's good. Okay. So now it's running. Nice. So with this profile, what was failing

1:01:00 Tracing blocked syscalls

1:01:43 with Docker for Mac was my ability to then exec inside of this container. Yeah. There's no works, but we don't care about that. What we actually care about is: where did this log? Yeah. Now we have to exec into the node. So yes. That's right. We're in. I'm not on the action. Yeah. So we can jump and say it. K. Yeah. Now we can look into /var/log/audit, probably. It's not there because I think kind doesn't ship auditd. You should be able to... actually, just on the machine itself, they should flow through to the syslog. Right?

1:02:35 So you could probably tail the syslog. So if you just do, like, tail... let's see. And if you could grep for test-pod. Yeah. Yeah. That's it. Nice. That's the audit log. Yeah. So there you get the syscall. You've got +1 09271113489. So these are all ones... since we have that allow block, whatever is in that block is gonna take precedence over the default action. Right? So these are all ones that would be blocked with errno. So, you know, if we wanna switch it back to errno, we need to add all of these different

1:03:12 syscalls there. And you can run something like ausyscall and then your syscall number, and then you get the name for it. Yeah. Ah. Okay. If it exists. Well, I don't know what ausyscall is. What is that? Not sure which one. I actually haven't used it before either. Typically, I just use one of those websites, you know, that has, like, the searchable thing. It's probably part of the auditd package. Yeah. Yeah. That's what I got from here. So I see, yeah, auditd there. Nice. Now it should be available.
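The tail, grep, and number-to-name lookup steps above can be sketched as one small script. Everything specific here is an assumption for illustration: the sample log lines are invented, the regular expression relies on audit records carrying a `syscall=<number>` field, and the name table is a tiny hand-picked subset of the x86_64 syscall table rather than a replacement for `ausyscall`.

```python
import re

SYSCALL_FIELD = re.compile(r"\bsyscall=(\d+)\b")

# Small subset of the x86_64 syscall table, for the example only.
X86_64_NAMES = {
    0: "read",
    1: "write",
    3: "close",
    59: "execve",
    60: "exit",
    109: "setpgid",
    111: "getpgrp",
    231: "exit_group",
}


def logged_syscalls(lines, needle=None):
    """Collect distinct syscall numbers from audit-style log lines, in order."""
    numbers = []
    for line in lines:
        if needle is not None and needle not in line:
            continue
        match = SYSCALL_FIELD.search(line)
        if match:
            number = int(match.group(1))
            if number not in numbers:
                numbers.append(number)
    return numbers


def syscall_name(number):
    """Translate a number to a name, falling back to a marker when unknown."""
    return X86_64_NAMES.get(number, "unknown(%d)" % number)


if __name__ == "__main__":
    # Invented sample lines in the general shape of seccomp audit records.
    sample = [
        'audit: type=1326 audit(0.0:1): pid=10 comm="bash" syscall=109 compat=0',
        'audit: type=1326 audit(0.0:2): pid=10 comm="bash" syscall=111 compat=0',
        'audit: type=1326 audit(0.0:3): pid=10 comm="bash" syscall=109 compat=0',
    ]
    print([syscall_name(n) for n in logged_syscalls(sample, needle="bash")])
```

The resulting names are the ones that would have to be added to the allow stanza before switching the default action back to errno.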

1:04:00 Listing syscalls with strace

1:04:01 Yeah. I'm learning. I'm learning. Yeah. Really? Yeah. I just pick random numbers. Yeah. I'm so easily impressed. Okay. That's really cool too. So how do I send a syscall... can I just send a random syscall to the kernel? Is there, like, a way of debugging that? Yeah. Alright. We'll pass on that. I'll stop running, like, random brain thoughts actually. Yeah. So... But for example, you could also do something like, if you have a running or a binary application, something like... okay. strace, for example, is pretty easy to use. So if we run something like

1:04:42 strace -c and then ls... strace is not available. Yeah. We can fix that. Yeah. I know that package then. strace -c and then ls, for example, should give us an idea about which syscalls are required to run this. And if we now run something like strace -c ls /, then there should be... yeah. Then we have a different list of syscalls because it's higher privileged. Interesting. Right? So it really depends on the application and what we want to do. And allowing something like

1:05:23 exec-ing into a container workload is probably good if you think about debugging an application, but we also have to consider that we then have to allow additional resources to actually do something with bash, for example. Right. So does that not make generating... I'm not gonna say easy, but, you know, if I can just do strace -c ls, and then assuming I just do, like... Yeah. Print by how did that not work? I mean, that should work. I know it. Yeah. Yeah. We had some discussions last week about

1:06:12 how we could record those profiles, and the idea was, yeah, we could run something like strace, but strace has a performance implication. And if we, for example, run it without -c, then we get all the details which we don't need. So we also get all the paths which got accessed during the loading of the libraries and things like that. And there is a newer feature in the Linux kernel. I'm not sure if it's that new, but it's called the seccomp notifier. And this is pretty interesting because we can

1:06:41 create some client-server architecture in between with seccomp. So we can have something like a server, which is monitoring the actual application, and the server provides a file descriptor for clients, which could be a different process, in our case probably the operator. And this operator then gets all the syscalls via the seccomp notifier feature. And one idea was... yeah, how could we now bring this into the operator? And there are multiple ways. For example, there's a different container runtime, crun, which now proposes something like a plugin mechanism

1:07:19 so that we can have something like a notifier plugin to record the profiles. But all of those topics are not really standard. So we also have to say, okay, we probably can't support it on every machine because, yeah, not every Linux supports the seccomp notifier, for example. There are also some ideas to do the same as the strace thing. We could compile a little BPF module into something like an OCI hook, and this OCI hook will be triggered right before the workload starts, gets the process ID, and then starts the

1:07:53 recording, and then puts the recorded syscalls into a predefined JSON blob. Yeah. But this would also not work, for example, with containerd, because containerd does not support OCI hooks yet. Or, for example, it would also not work with Docker. So we are just evaluating, and we are thinking about possible solutions around it. But it's not that easy. Right? So... Yeah. This is really high for those profiles. If we want to have something like strace, there is also bpftrace, for example, where we can register tracing points for syscalls on the system,

1:08:32 but then we would record the syscalls on the whole system and would have to look for our process, so our container process. I guess the challenge there, though, is that in order to use strace or BPF to extract syscalls from the application, you have to run through every code path. Right? Because even just with that really trivial ls example that you showed me, the syscalls that were used between one directory and another were different. So if I were to try to create a profile for NGINX and

1:09:02 someone just hadn't hit a certain endpoint that uses, like, a redirect syntax, then you'd never know that syscall was there. Yeah. Is there not... I'm sure there isn't, or maybe there is. But I'm assuming just analyzing the ELF binary itself is the best way to get that information. Is that not possible? Yeah. This could be possible as well. I mean, if the binary is stripped, then we don't have a chance to get the syscalls back. So you said at the start of this stream that you've been working on a way to generate

1:09:30 Using podman to generate seccomp profiles

1:09:39 profiles. Is that something you wanna walk us through? Yeah. I can show it to you if you want. Let me just share the screen here. You should see it now. Right? So I just played around a bit. I have this OCI hook installed, which is just a JSON which executes a binary right before the container workload starts. And I have to run Podman as a privileged user here because otherwise the hook wouldn't be able to record the syscalls. So this is a real drawback.

1:10:15 We have to increase, or rather decrease, the security footprint by doing something like recording those calls. So this could be a problem in the future. But what we have to do now is just add another annotation to the workload, which is this one, io.containers.trace-syscall. And then we can specify an output file. And, for example, for other container runtimes like CRI-O, it would work in the same way. We could also just add this annotation to the Kubernetes workload and then record the profile into a directory. And if we now run something like... let

1:10:50 me just change that. Yeah. We run echo hi without attaching a TTY and without doing anything in bash, then it should succeed. Yeah. This succeeded. And if we now look into the profile, then we can see that we have a generated profile here. So it has the same default action, errno, and some syscalls in. Let me just save that. And now let's do something different. Now we want to attach a TTY and also run echo hi. I just override it, and then we have a look into that. Save it as my second profile.
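The recording runs shown here can be sketched as a command builder. This assumes the tracing hook is installed and reads the `io.containers.trace-syscall` annotation with an `of:<path>` value naming the output file, as in the hook's documentation; the image name and output paths are placeholders, and `sudo` reflects the point made above that the hook needs a privileged user.

```python
import shlex


def podman_trace_command(image, command, output_path, tty=False):
    """Build the argv for a Podman run that records syscalls to a file.

    The annotation only has an effect when the tracing OCI hook is
    installed on the host; otherwise Podman runs the container as usual.
    """
    argv = ["sudo", "podman", "run", "--rm"]
    if tty:
        argv.append("--tty")
    argv += ["--annotation", "io.containers.trace-syscall=of:" + output_path]
    argv.append(image)
    argv += list(command)
    return argv


if __name__ == "__main__":
    # Two runs of the same command, without and with a TTY, written to
    # separate files so the resulting profiles can be compared afterwards.
    print(shlex.join(podman_trace_command("alpine", ["echo", "hi"], "/tmp/first.json")))
    print(shlex.join(podman_trace_command("alpine", ["echo", "hi"], "/tmp/second.json", tty=True)))
```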

1:11:42 And if we now diff those, we can see that we have, for example, a different set of syscalls here, which is pretty interesting. Very cool. And I guess... sorry, Nico. I mean, we could also do something that's just attached to the profile and run something like ls, or what else could we do? We could do something like r and d and remove or we could create a directory, and we could remove it again. But this works. And if we now... Can you click our minus nine one? Oh,

1:12:18 we can try it afterwards. One second. And now it created a third profile. Just diffing our second and our third, we can see that we have, yeah, many more. We have to execute bash, for example. We have to fork a process. We also have to get our current directory, for example. Otherwise, we would not be able to change into a directory. And we also should have... yeah. We have on deal. And we also have mkdir, probably. Yes. So how did you exit that container? Ctrl+D? Yes. And... because I don't see the exit syscall.
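Comparing two recorded profiles, as done on screen here, comes down to a set difference over the allowed syscall names. A minimal sketch, assuming both files use the `syscalls` list with `names` and `action`; the two profiles below are invented and much shorter than real recordings.

```python
def allowed_names(profile):
    """Collect every syscall name the profile explicitly allows."""
    names = set()
    for stanza in profile.get("syscalls", []):
        if stanza.get("action") == "SCMP_ACT_ALLOW":
            names.update(stanza.get("names", []))
    return names


def compare_profiles(first, second):
    """Return (only_in_first, only_in_second) as sorted lists."""
    a, b = allowed_names(first), allowed_names(second)
    return sorted(a - b), sorted(b - a)


if __name__ == "__main__":
    # Invented examples: a plain command versus an interactive shell session.
    plain = {"defaultAction": "SCMP_ACT_ERRNO",
             "syscalls": [{"names": ["execve", "write", "exit_group"], "action": "SCMP_ACT_ALLOW"}]}
    shell = {"defaultAction": "SCMP_ACT_ERRNO",
             "syscalls": [{"names": ["execve", "write", "exit_group", "getcwd", "mkdir"],
                           "action": "SCMP_ACT_ALLOW"}]}
    print(compare_profiles(plain, shell))
```

The same comparison explains a gap discussed later in the stream: a syscall on a code path that was never exercised during recording simply does not appear in the generated profile.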

1:12:59 A completely random observation, but it wasn't there. I mean, I'm not sure if it triggers exit in the back, but let me just check. Yeah. The second and the fourth, probably. Okay. For bash, we need all those syscalls down here, and exit. Exit is part of both, probably. No. It's not. It's exit_group. It's not exit, it's exit_group, but it's part of both. I'm pretty sure that if I press Ctrl+D, then it will do something like detaching from the TTY and then doing something like exit, exit_group, or exit the process. Okay. So here's a

1:13:41 random thought. Can you just run that against NGINX? Oh, yeah. I mean, how accurate would that be to the default profile that is shipping in the seccomp operator? Yeah. I can do that. So NGINX. Okay. It was... And I mean, 1.19. I'm curious now. Right? So from what you've been telling me so far, this --annotation on Podman is using something called an OCI hook. Can you explain that to me? What do OCI hooks enable? Yeah. So the OCI runtime specification has a direct feature baked in, which is

1:14:21 called OCI hooks. And runtimes can support it to, for example, run a prestart hook, which gets executed directly before the container workload starts. And this hook can be any binary, and then we can pass our information down to that hook, like the output path for our profile. But not every container runtime supports it for now. I'm not sure, but I don't think that containerd supports it, and this is going to be a problem for bringing a feature via an OCI hook down to all users.

1:14:59 Yeah. So now we have this NGINX still running. I can just exit it again. And if we look at nginx.json, then it looks like this. I would just like to reformat it. And we can also diff this one with the one in the seccomp operator. I mean, what I would expect to see different here is that there's no accept call because we never browsed to it. So I guess things like that would cause the profiles to be different. I think I have to re-sort this one. Yeah. There you go. There's

1:15:49 no accept. So yeah. Because we never went down that code path. We never actually made a request to it. It's a very cool tool. That's nice. I mean, that would make this so much easier for people. I can just share that. It's this one. It's the OCI seccomp BPF hook. Yeah. It has to run with a higher set of privileges. We need something like CAP_SYS_ADMIN. Yeah. So this is pretty highly privileged from the security perspective. And we also need the kernel headers on disk, which is also kind of strange, but we

1:16:29 have to compile a BPF module on the fly right before executing the hook. And, yeah, therefore, we need the actual kernel headers for the currently running kernel on disk. So is it possible then, say I'm using CRI-O as a runtime for Kubernetes, to just enable that tracing on all my applications for a week, two weeks, and collect profiles? Yeah. This would be possible, but I'm not sure if it's a good idea to do that in production. Right? So I'm not completely sure about the

1:17:04 security implications we introduce with that hook. So we also have another question, which I think is touching on a few things that we've covered today. So Igor is saying it may be hard to debug a failing application with a seccomp profile active, which I think we've kind of already confirmed because I couldn't exec into the container. What are my options? Can I change a profile live, or do I always have to restart the process to make a modification to that? No. I mean, we have to restart the process.

1:17:38 Right? And this is also an implication we had. It's about the general architecture of Kubernetes. Right? If you change an annotation, then it restarts the workload. But, for example, if you change the content of the profile, which is only referenced by text by the annotation, then it won't have any influence on the actual workload. So yeah. We thought about solving this somehow, but it's probably not cool if the operator comes around and restarts your workload just because the seccomp profile updated, something like this. Not that easy a decision.

1:18:18 Yeah. Well, that brings up another point. I mean, a big thing right now is if you actually delete a config map, then it's not going to actually get cleaned up out of the directory. So that profile still exists even if you delete the config map that led to its creation. And you may think, you know, initially, that's not a good thing and, you know, maybe it's not. But what if you delete a config map and there's a pod using that seccomp profile and then later on that pod gets restarted? It's just gonna fail, right, because it's not

1:18:51 gonna be able to access that seccomp profile. So right now, we're thinking about, actually, I muted there, how we want to be able to clean those up. And one way to do that is to, you know, get a list of all the pods and say, alright, I'm going to clean this up if no pod is currently using this seccomp profile. So in that case, you don't get, you know, where a pod gets evicted and then can't restart. Or you could do things like, you know, keep a global list somehow of what pods are there.
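
The "only clean up if no pod is using it" idea can be sketched like this. It is a minimal Python sketch of the check, not the operator's implementation, and it works on plain dictionaries shaped like pod manifests rather than calling the Kubernetes API. The pod names and profile paths are hypothetical. It looks at the alpha seccomp annotations and at the pod-level `securityContext.seccompProfile` field; container-level security contexts are left out to keep it short.

```python
POD_ANNOTATION = "seccomp.security.alpha.kubernetes.io/pod"
CONTAINER_PREFIX = "container.seccomp.security.alpha.kubernetes.io/"
LOCALHOST = "localhost/"


def profiles_in_use(pods):
    """Return the set of node-local profile paths referenced by any pod."""
    used = set()
    for pod in pods:
        annotations = pod.get("metadata", {}).get("annotations") or {}
        for key, value in annotations.items():
            is_seccomp = key == POD_ANNOTATION or key.startswith(CONTAINER_PREFIX)
            if is_seccomp and value.startswith(LOCALHOST):
                used.add(value[len(LOCALHOST):])
        # Pod-level securityContext field, as an alternative to annotations.
        security = pod.get("spec", {}).get("securityContext") or {}
        profile = security.get("seccompProfile") or {}
        if profile.get("type") == "Localhost" and profile.get("localhostProfile"):
            used.add(profile["localhostProfile"])
    return used


def safe_to_delete(profile_path, pods):
    """A profile file may be removed only when no listed pod references it."""
    return profile_path not in profiles_in_use(pods)


# Hypothetical pods and profile paths, for illustration only.
pods = [
    {"metadata": {"name": "web", "annotations": {
        POD_ANNOTATION: "localhost/operator/default/nginx.json"}}},
    {"metadata": {"name": "api"}, "spec": {"securityContext": {
        "seccompProfile": {"type": "Localhost",
                           "localhostProfile": "operator/default/api.json"}}}},
]

print(safe_to_delete("operator/default/nginx.json", pods))
print(safe_to_delete("operator/default/unused.json", pods))
```

The check is only as current as the pod list it is given, which is why the performance and consistency trade-offs of listing pods matter.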

1:19:26 So just trying to do that without, you know, huge performance implications. But in general, listing pods and, you know, getting pods with a certain annotation is not a super big deal, especially, you know, we don't anticipate that people are gonna be deleting seccomp profiles, like, multiple times a second or something like that. So that's likely something we can do, but it may also be something we want to be configurable depending on how folks wanna interact with it. Awesome. So is there anything with the seccomp operator that we haven't covered yet? What am I missing?

1:20:04 I think we've covered most of it in its current state. And we've mentioned a lot of ideas for the future. So if anyone's tuned in today and saying, you know, I need this functionality, definitely open an issue. Definitely show up to our meetings. There's also the seccomp operator channel in Kubernetes Slack. One big thing, which I don't think we touched on yet or maybe I just forgot, is we're thinking about right now renaming this to the security operator. The reason for that being is that you

1:20:36 can pretty easily extend this functionality to some other things, namely the most obvious would be AppArmor, which is kind of similar to seccomp, and we'd like to be able to support that. And I think Sascha is actually leading the charge for making AppArmor GA as well in Kubernetes for an upcoming release, hopefully 1.20. And so as that happens, it'd be a great time to also introduce AppArmor functionality into the operator. And then there's also some other things we can do with security. So in general, seccomp is one of the things we think is

1:21:07 valuable and is underused because it's hard to work with, and we'd like to extend that to other things as well. So, you know, that kind of opens up the scope of the operator quite a bit. But, hopefully, it becomes something that people can kind of, like, trust to help make using security features of Kubernetes a lot easier. Awesome. I guess the next question after that then is what's next for the seccomp operator besides the AppArmor stuff? Is there anything else coming soon? Yeah. So we are working on the CRD,

1:21:42 which is based on the OCI runtime spec's seccomp profile specification, so it should give us a good opportunity to validate the content early on. And then we are working on the recording, but I'm not sure if the recording makes it into a feature anytime soon, or maybe we are trying to target another release for this year. And yeah. Let's see. At least as a proof of concept, we could add it, and then we could try to get some feedback from users. Awesome. Well, I just wanna thank you both

1:22:16 for making seccomp understandable for me and for making it easier for everyone with your hard work. Like, honestly, I didn't expect seccomp to be something that I would kinda be able to get a handle on in such a short amount of time. And I think you've both done a great job of kinda walking through this. That's amazing. Thank you. Thank you. Which means you're both now committed to helping me understand SELinux at a future date, because I have no idea how the hell that works either. Yeah. Well, that's another area where, you know, bringing some of that functionality to

1:22:43 the operator would be awesome. But thanks for having us on. It's been super fun. And outside of this episode, I love all the stuff you're doing with the other folks you've had on as well. So I appreciate you making good content to help people learn and, you know, be aware of things happening in the community. Thank you very much. That's awesome. I feel happy now. Well, I will let you both get back to your days. If anyone has questions, you can reach out to all of us on Twitter. All of our handles are nicely on the screen. Of

1:23:12 course, there's comments on YouTube. And thank you very much for watching, and have a nice day all. Thank you. See you. Bye bye. Cheers. Bye.
