About this video
What You'll Learn
- Rebuild Russell's kubelet manifest set, fix API server flags and probe ports, then verify control plane pods start.
- Track hidden runtime tampering by auditing systemd services, processes, and kubelet config paths to catch unexpected eBPF probes.
- For the community cluster, debug application outages by checking kubeconfig, pod selectors, CoreDNS resolution, and deployment labels.
Rawkode debugs three broken Kubernetes clusters solo: Russell W's bash alias and static manifest trick, Noel Georgi's eBPF kubelet redirect, and a community-destroyed cluster needing CoreDNS repair.
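The probe-port fix listed above can be sketched as a small consistency check. This is a minimal sketch, not what was run in the episode: it assumes the static pod's container spec has already been loaded into a plain dict, and the helper names and the example port values are hypothetical (the transcript only says the ports were wrong, not what they were). The default of 6443 is kube-apiserver's usual secure port.

```python
def secure_port(command, default=6443):
    """Return the value of --secure-port from a container command list."""
    for arg in command:
        if arg.startswith("--secure-port="):
            return int(arg.split("=", 1)[1])
    return default  # flag absent: assume the usual kube-apiserver default


def mismatched_probes(container):
    """Return (expected_port, {probe_name: port}) for httpGet probes that disagree."""
    expected = secure_port(container.get("command", []))
    bad = {}
    for name in ("livenessProbe", "readinessProbe", "startupProbe"):
        probe = container.get(name)
        if probe and "httpGet" in probe:
            port = probe["httpGet"].get("port")
            if port != expected:
                bad[name] = port
    return expected, bad


# Hypothetical container spec, values invented for illustration only.
container = {
    "command": [
        "kube-apiserver",
        "--advertise-address=10.0.0.10",
        "--secure-port=6443",
    ],
    "livenessProbe": {"httpGet": {"path": "/livez", "port": 6444}},
    "readinessProbe": {"httpGet": {"path": "/readyz", "port": 6443}},
}

print(mismatched_probes(container))
```

A probe that points at a port the API server is not listening on keeps failing, so the kubelet keeps restarting an otherwise healthy container.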
Jump to a chapter
- 0:00 Holding screen
- 1:45 Introductions
- 1:50 Introduction and Event Overview
- 4:30 Thanks to Sponsor (Teleport)
- 5:14 Clustered.live Interactive Site & Giveaway Intro
- 7:26 Debugging Russell W's Cluster (No Control Plane)
- 8:00 Rawkode vs Russell W
- 8:33 Examining Kubelet & Static Manifests
- 14:10 Russell W's Cluster: Hint - Check Hints
- 25:28 Russell W's Cluster: Crystal Maze Story & Hints Reveal
- 28:21 Russell W's Cluster: Discovering the Bash Alias Trick
- 30:37 Russell W's Cluster: Finding the Real Static Manifests
- 33:17 Russell W's Cluster: Hint - API Server Configuration Issues
- 35:36 Russell W's Cluster: Fixing API Server Port & Probe Issues
- 38:36 Russell W's Cluster Fixed & Validation
- 42:21 Russell W's Cluster: Recap & Learnings
- 42:50 Rawkode vs Noel Georgi / frezbo
- 42:57 Debugging Noel's Cluster (No Nodes, Missing APIs)
- 48:46 Noel's Cluster: Asking for Hints
- 49:01 Noel's Cluster: Hint - APIs Playing Hide and Seek
- 50:58 Noel's Cluster: Discovering API Runtime Config Flags
- 53:49 Noel's Cluster: Hint - Don't Trust the Kubelet
- 54:09 Noel's Cluster: Diagnosing Port Binding Conflict
- 58:31 Noel's Cluster: Hint - Trust Manifests?
- 1:03:05 Noel's Cluster: Hint - eBPF Involvement
- 1:04:39 Noel's Cluster: Discovering the Malicious eBPF Redirect
- 1:05:38 Noel's Cluster: Fixing the eBPF Trick
- 1:06:31 Noel's Cluster: Recap & Learnings (eBPF)
- 1:07:00 Rawkode vs The Internet
- 1:07:30 Debugging Community Cluster (Destroy-It Challenge)
- 1:08:52 Community Cluster: Correcting KubeConfig & Initial Check
- 1:09:17 Community Cluster: Debugging Application Deployment
- 1:14:20 Community Cluster: Diagnosing Networking/DNS Issues
- 1:16:19 Community Cluster: Fixing CoreDNS Configuration
- 1:17:06 Community Cluster Fixed & Recap
- 1:17:36 Giveaway Announcement & Process
- 1:18:55 Drawing and Announcing Winners
- 1:20:08 Conclusion, Thanks & Future Episodes
Full transcript
Generated from the English captions. Timestamps jump the player to that moment.
1:50 Introduction and Event Overview
1:50 Hello, and welcome to today's episode of Clustered. Today's a bit different than normal. Today is Clustered versus Rawkode versus the community. Not really sure how it's gonna work, but let me share some stuff with you. First, check out Rawkode.live. That is the YouTube channel. Please subscribe. Click the bell. You'll get alerts and notifications for all new episodes of Clustered, as well as all new episodes where I explore cloud native products and technologies with their founders and make this all a little bit easier for us all. There's also a Discord server available at Rawkode.chat. There's nearly 500 of us on there now
2:34 talking all things cloud native, Kubernetes, and everything in between. I started doing something a little bit different lately, and I'm giving away a lot of swag. Some people have asked, what if I just wanna buy a T-shirt and support it? So I have included the link there. So please feel free to check out store.rawco.com. But also, I will be giving away lots of swag over the next weeks and months. So stay tuned, enjoy the episodes, and I hope you get to win something. There's a couple of little links on this site there. So clustered.live
3:11 is a thing that I threw together literally in the last couple of days to try and make our streams more interactive and more engaging and give you an opportunity to win. It was painful fighting with the Twitter API, but we do have something that works, and I'll walk you through that in just a moment. Today's episode is a versus episode with myself. So we have a cluster broken by some regulars. You may know Noel, or frezbo, and Russell W. I had a third cluster, which I asked someone to break, but unfortunately, due to constraints, time, and,
3:52 well, life, it wasn't able to be broken in time for today's episode. So you'll also notice here, there is a Rawkode.live/destroy-it. This is a public KubeConfig to a Kubernetes cluster running on Equinix Metal. My challenge to you lot is go fuck with it. I will be using well, I will be blocking access to it very soon. But for the next thirty minutes, feel free to try and break it as you wish. I also want to thank Teleport for sponsoring Clustered. You know, I've been using Teleport since the first episode. I think it's an absolutely fantastic
4:30 Thanks to Sponsor (Teleport)
4:36 product. It's just wonderful that I get to use it on the stream every week and get to show you all how cool it is. So you should go check it out. They support the show. They're providing the swag. They're the ones that are making all of this, all of these giveaways possible. So thank you to Teleport. To support the show, go to Rawkode.live/teleport. It is a UTM link, but it just means that they have a little bit of feedback that shows, you know, supporting this channel works for them. That's quite a lot of talk for the
5:04 start of this episode. I'm also very nervous about this episode because I'm fixing things alone, so I do have a beer. Okay. Let's take a look at today. This is clustered.live. Let me remove our sponsor message for the time being, where we can click sign in with Twitter. Now there have been a few people that have mentioned that the permissions to access this application are a little permissive, but unfortunately, the Twitter API only really exposes read or read write. There is nothing in between. So I'm really sorry about that, but it is the best that we can
5:14 Clustered.live Interactive Site & Giveaway Intro
5:49 do. I also ask for those permissions because when you enter the competition, it will follow my account, follow the Teleport account, and retweet the tweet. Now I don't see it here because I've already entered, but there are lots of warning messages. I'm not doing this without your permission. I hope you understand that. It's really just a workaround for the Twitter API and the fact that I can't check, painfully cannot check, if you follow an account or if you have retweeted a tweet. I cannot stress it enough. This is not the way that I wanted this to go,
6:20 but it's the best that I could come up with at least in this amount of time. So I'm gonna try and make this better, but if you do want to win some swag today, we have three Rawkode swag packs, we have three Teleport swag packs, and we have 10 copies of Saiyam Pathak's CKS training manual. So enter the competition. Your odds are really, really good of winning. Right? We've got ten, thirteen, 16 prizes, and we don't even have 16 entries yet. So chances are, click that button and you can win something today. Rawkode.live/destroy-it will take you to this gist where there
6:58 is a KubeConfig. Download it, attack the cluster. I'll be getting to that in around thirty minutes' time. And also, I know the permissions on my application are a bit wild. They allow me to update your profile. They allow me to delete your account, I think. There's a whole bunch of things. Again, not by choice, but the code for all of it is open source. You can find that at github.com/rawkode-academy/clustered.live. Cool. Today, we are gonna fix some clusters. So I have three here. We're gonna start with Russell W's cluster. So thank you for taking the time out of
7:26 Debugging Russell W's Cluster (No Control Plane)
7:38 your day to break this. What I'm gonna do is start my timer. There we go. And zoom in. Oh, hold on a minute. Why have I given myself forty-five minutes? Alright. We'll do what I can. I actually have much less than forty-five minutes per cluster because I have three. So really, I've got twenty minutes per cluster. And I can't really keep up with the chat and all the talking, so I'll try and keep my eye out there. Who broke Teleport? Come on, Kevin. You're not allowed to break Teleport. You're also telling me it's a Twitter page.
8:00 Rawkode vs Russell W
8:25 It's absolute shame. Awful. Cannot stand it. Okay. Let's see what we've got. The reason I'm starting with Russell's is because I did look at Noel's and noticed we don't even have any worker nodes. I'm gonna come back to that one. Instead, we'll start with this one. We'll log on to the admin account and I'll export my kubeconfig. Alias k, and I run get nodes. Alright. I have no control plane. Thank you, Russell. Need a bigger drink. Okay. So what do we do when we have no control plane? First thing I'm gonna do is check our kubelet.
8:33 Examining Kubelet & Static Manifests
9:18 I can see the service is loaded, but it is stopped. I'm gonna try just starting that back up. And while that starts, I'm gonna jump into our Kubernetes manifests. Thanks. Thanks, Russell. I hope you scripted that. Have yourself for each break. Yeah. It may go that way now. I don't have any static manifests. They all appear to be the same number of bytes. Think they're all alright. Okay. They're not all the same number of bytes. The letters are the same number of bytes. So I think a, b, c, d, and e are the files I need and everything else
10:32 is noise. Alright. I am gonna move a, b, c, d, e .yaml to /tmp, run ls. Yep. And I don't trust, I don't need these. So we'll call this dumping ground. Gonna move everything to our dumping ground. We're gonna move everything back. YAML to here. Where did that Cilium come from? Good thing that's mine, I'm gonna push that back. Okay. Now we have a, b, c, d, e, and e. Yaml. I'm actually not fussed about the names. I just wanna make sure I've got the right components. So in fact, we'll just *.yaml. So we've got our API server,
11:33 etcd, controller manager, scheduler, kubect. Okay. Hopefully that gets me a control plane online. I guess what we want to see now is do we have an API server? Do we have a controller? Not quite. Do we have an actual working kubelet yet? No. Okay. So our kubelet is broken. I'm gonna take a look at our kubelet logs and it is misconfigured. Okay. We shouldn't have an /etc kubelet file. We're gonna go into our, with the system, systemctl cat kubelet, and we're going to check and make sure there's nothing weird here. ExecStart here has been tampered with,
12:40 and that is in this drop-in here. Our kubelet should be in /usr/bin, /usr/bin/kubelet, maybe /usr/local/bin/kubelet. /usr/bin/kubelet, then daemon-reload, restart kubelet. And we'll take a look at those logs and that looks a bit better. Hopefully those errors go away. Let's see what we've got in the chat so far. Yeah. Well, brought an account then during the team stuff just to make sure that he gave appropriate time to each of the clusters that we have. So it's it's really more of a guide. It's not always enforced, but
13:45 yeah. I try to keep it there. Some love for Russell's break. People laughing. I'm not laughing. That's a lot of YAML. Lots of alright. Okay. Cilium was not you, Russell. Thank you. That helps. I'm not sure why that cilium.yaml was there. You're a gentleman. Okay. So we do have hints if we need them in /root/hints. Okay. I'm curious if we have an API server. We do not. Controller, cube. All right, we've got a few things. So now our API server is failing. We can go to /var/log/containers and we can tail kube-apiserver,
14:10 Russell W's Cluster: Hint - Check Hints
14:36 and it's been shut down. Structural schema condition controller. Interesting. Oh, yeah. They're called funny names. Alright. Let's see. API server, advertise address, allow privileged, authorization mode. All fine, fine, fine. Admission plugins, NodeRestriction is okay. etcd should be alright. I should check that it's running though. And that looks alright. Not sure about the probes. Hard coded IP address is probably a private IP before that machine, not something I often look at. I'm gonna assume it's all right. I'm not sure if we should have a default seccomp profile on API server, but I can't think of this just running on
15:54 a web server. Yeah. It should should be alright. It's not scheduling anything. Okay. That looked alright. So I because I'm confused, I'm gonna take a look cdb at the top. Not probably a good sign, right? Yeah, think I'm just okay. Okay. Let's see, kubectl. Look at this in a bit more detail. When's this last start? This is sixteen o two. Hey, that's a while ago. So we're starting here. Oh, we got a shutdown here. How can I shut down before it starts? Yes. That's funny. I'm not in there. Okay. It could be API server, a control plane.
17:44 It looks looks fine. I can never remember this stuff. Oh, no. It is crictl here. Runtime endpoint. Okay. API server four minutes ago. Let me do logs. Container ID. All right. Okay. So we can do logs, but we need ps and then log on our API servers dot again. Okay. Strange. I guess this is just the pod shows because the pod is created even though the container doesn't exist. I move the file. Just try and encourage the kubelet to reschedule it. K. So the kubelet errors are because it can't speak to the API server, but I'm not seeing one.
20:20 Start. I'm not getting any more logs. I'm gonna restart kubelet. I really wanna see this Kube API server failable too. So I'm just gonna remove it, which I'm gonna regret in five minutes. Watch. And we'll wait for that file to come back, see if we can get some more logs out of that API server. I'm not sure what is happening here at the moment, but I'm doing okay for time, I think. Russell is just happy it took more than five minutes. You know, people always worry when I invite them on to break clusters that their effects
21:28 are too superficial. It just doesn't work that way. It's really hard debugging clusters. I'll see that because I'm doing one. The kubelet is running. We're still almost again. Add our messages here. And I wanna be able to filter that. So let's remove the follow. Search for API. Okay. So it's failing creating a mirror pod. I finally got something that looks like it's useful and it's failing because the pod already exists. Yeah. So we did see that behavior with the crictl ps. That's it was created a few minutes ago. I suspect that's gonna be even
23:09 newer now. No. Is that rmp? So I've removed that pod. Let's see if it comes back. Yep. Nope. No API server logs. Kubelet. What's happening just now? Okay. So, okay. So we've got API server CrashLoopBackOff. And there's a component skipping, failed to start container for the kube-apiserver with a CrashLoopBackOff. Okay. Then why am I not getting logs? That's a good idea, Kevin. I think I will take a hint. I'm gonna take a look at our IPS. I'm gonna take a look this one more time because I feel
25:04 Why are we not getting logs? Okay. Is that a directory? Yep. Okay. So we're past the beach. What's the second hint? Oh, there's a readme. Okay. Welcome to oh, you never told me there was a backstory. Okay. So welcome to the Crystal Maze Cluster. You will need to travel through four zones in order to solve their puzzles. Good luck. I hope it's challenging enough without being too frustrating. Well, you failed. It's frustrating. Let's begin. Picture zone. If you want hints, I'll just dive straight in. Beach. Just a view of the beach. Yeah. We we we've done the beach.
25:28 Russell W's Cluster: Crystal Maze Story & Hints Reveal
26:00 So I don't think this is the industrial networking one. Something with the stairs and then there is a phrase, you can't see the wood for the trees. Yeah. I don't want you to start opening all these hints, Russell. So which one which hint should I look at for the API server break? Give me a heads up. And I'll think. Really annoying me if there's more logs. I need to know why it's breaking. Must be something in here. Okay. Trees. Thank you, Russell. The forest. Oh, we got a couple of hints here. Okay, so many questions, which files do you need
27:59 to edit? However, there's many files in the static manifest directory, but only a few containers being run. Where is the API server? Right, well, I think I fixed the first part of the forest. Ah, sneaky. Very sneaky. Alright. Well, I do not need any of your aliases. Should have paid more attention to them now that I've just deleted them all because I wasn't actually sure what you were doing. It looks like you were just intercepting my ls and cd, which meant I thought you had modified my manifest directory. Yeah. It was far too quick to delete
28:21 Russell W's Cluster: Discovering the Bash Alias Trick
29:30 those aliases. I should have paid attention. Okay. It was intercepting cd and ls. I now know I've got the proper ls. We'll take another hint just because I deleted that stuff too quickly. I shouldn't have to log in or nor log out. So I don't think you'd maybe change the manifest. That's why I did that. The cat on the kubelet, and I was gonna start poking around to see if you'd maybe told it to live in a different location. Let's take a look. There we go. And you know I should have trusted my gut. Okay.
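The alias trick described here, where ls and cd were silently intercepted, can be checked for mechanically. The following is a minimal sketch under stated assumptions: it takes the text printed by the shell's `alias` builtin as input, the function name and the watched command list are hypothetical, and the sample aliases are invented (the transcript does not show what Russell's aliases actually contained).

```python
def shadowing_aliases(alias_output, watched=("ls", "cd", "cat", "kubectl")):
    """Return {name: replacement} for aliases that shadow a watched command.

    alias_output is the text printed by running `alias` in bash,
    one definition per line, e.g.  alias ls='ls --color=auto'
    """
    found = {}
    for line in alias_output.splitlines():
        line = line.strip()
        if not line.startswith("alias "):
            continue
        name, sep, value = line[len("alias "):].partition("=")
        if sep and name in watched:
            found[name] = value.strip("'\"")  # drop the surrounding quotes
    return found


# Invented sample output, for illustration only.
sample = "\n".join([
    "alias k='kubectl'",
    "alias ls='ls /tmp/decoy'",
    "alias cd='cd /tmp/decoy #'",
])

for name, replacement in shadowing_aliases(sample).items():
    print(f"{name} is aliased to: {replacement}")
```

In bash itself, `type ls` reports whether a name resolves to an alias, and prefixing a command with a backslash bypasses alias expansion for that one call.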
30:37 Russell W's Cluster: Finding the Real Static Manifests
31:09 This is the actual static manifest directory. API server. Good. Good. Good. I I I don't think there's anything wrong with that API server. Got me going around in circles here. We've got twenty minutes left. Might as well read all the forest hints while we're here. So this one was the bash aliases. We found the real manifest. So yes, I am frustrated. I did try calling /usr/bin. Okay, so you've basically broken everything is what you're telling me. This is definitely this. That's the same. And I still have Kube API server that I think is all
33:13 right. Oh, you're telling me there's two problems in this API site. Thank you, Russell. Okay. Line by line and stop rushing. So there's definitely a direct command, advertise address. Should be okay. I have IPv4. Okay. User thing. Hoping it'll have to check all the cert paths. Secure port is okay. There we go. Unless that is the port. I've just never noticed that before. I don't even know if that's a fixer. I'm breaking it more. I wonder if I couldn't see logs because of that cd intercept. Maybe that was an old log. I still don't have any API.
33:17 Russell W's Cluster: Hint - API Server Configuration Issues
35:10 Okay. Well, the probe ports, the two issues in that spec. Oh, this is wrong too. All the ports are wrong. Okay. Let's try that. Log yet. Restarting the kubelet. I hate you. Alright. I'm just gonna keep restarting the kubelet till I get an API server. Why is it not typing? Teleport was broken. What have I done? Cool department up. We have an API server at least. And I don't know if I broke Teleport or, Okay. Don't know what happened there. Okay. Let's export. Yes. It pods. I think they've been cordoned. Nope. Okay. So Russell says that's forest and cryptography
38:36 Russell W's Cluster Fixed & Validation
39:14 covered. It's just the industrial stage now. So let's see. These are gonna wait. Worker one, worker one, worker two, worker one. Okay. So I think your readme suggested that everything is on the control plane. I've not touched the worker nodes at all. Okay. Let's see if we've got any 10x rules. We don't. We are, these clusters are all 1.22. No. Started provisioning them the day after release. Okay. So we've got some networking problem. And what we can do, see if we can get any logs from this pod here. Unable to reach the API server that may
40:30 just need, oh, come down. May just need a wee nudge. And nudge. Okay. Let's see if that's something that's happy again. I wonder if that's the same reason the ambassador is potentially broken. I will just start rotating pods. Let's see. Okay. That's definitely healthier. So that means I could do a k get deployments and deployments clustered image two. Let's see if that works. And it did. Let's see what happens. And I've got the dance. So I'm not sure what your industrial problem was, Russell, but it did not stop me from deploying version two. Alright. Cluster fixed. Thank you, Russell. That was infuriating,
42:21 Russell W's Cluster: Recap & Learnings
42:25 but very good. Yeah. The bash aliases, you know, I just don't even I don't think, you know, there's nothing, there's no feedback to let you know that you're working with an alias or anything like that. So yeah, maybe something I should start checking by default in the future. Yeah. And that image, sneaky. Okay. Let's jump on to Noel's cluster, which isn't looking very healthy at the moment. We have no worker nodes. Alias k, kubectl. At least I'm getting real ls. cd is a built-in. Okay. k get nodes. Which kubectl. It's CTL. Looks all right.
42:57 Debugging Noel's Cluster (No Nodes, Missing APIs)
43:53 Version. I've got an API server. But something. We have no apps/v1, no core v1. Okay. So how would this be possible? Well, we're going straight back to our static manifest. We're taking a look at our API server configuration. Disappointed in my cursor did not start halfway down the file. Do we have all our controllers running? Certainly looks like it. Didn't start my timer. There you go. I mean, you could have reset the times on the files, of course, but it looks clean. Double check the image. We're getting bit by that twice. Okay.
45:38 So either I don't have access to these types. Okay. Let's get cluster roles. Hopefully it's not RBAC. Please don't be RBAC. So I have my admin account. Let's take a look at the context for Kubernetes admin. Speaking to the correct cluster. Cluster admin here. Not RBAC. That's good. That was the only lead I had. No. So let's check. Make sure this matches what we can expect it to see from the static pod manifest. No sneaky static admission controllers, no controllers being disabled. Okay. I should have brought two beers to this episode. Okay.
47:24 How are you removing pods? How are you to moving pods? Okay. Let's roll that out. What do we have access to? Unable to retrieve the complete list of server APIs. Apps v1, the server could not. How did you remove apps/v1? And where's core v1? I've never seen that before. I'm impressed. Shit. I mean, it has to be it has to be API server. Unless she did something in the cluster and then out of the cluster. You know what? I think I'm I'm gonna have to reach for a hint.
49:01 Noel's Cluster: Hint - APIs Playing Hide and Seek
49:01 Okay. So the APIs have decided to play hide and seek. Where have they gone? Nasty Goose. That's a hint. This is sneaky. Okay. So you could have modified kubectl, even though it's a binary. May not be the correct one. It does say git tree state clean. Make sure I'm calling this kubectl. I am calling this kubectl. Do we have any other kubectl? Shooter, Ben. This is too funny. There it is. Runtime config: v1 equals false and apps/v1 equals false. Okay. So I can see it here. Runtime, but not there. You've moved the stack there. You've done the
50:58 Noel's Cluster: Discovering API Runtime Config Flags
51:20 same, right? So let's take a look and our config here. It's not been moved. The static pod manifest is the same. How is that runtime config getting in to the API server? What if it's not running as a static pod? Alright. It's definitely not an as fail, right? Okay. So it could be there's two static manifests and one of these files. No. I like this. I could see the problem. Do we trust the kubelet? And Noel's telling me to read the hints again. Okay. The APIs have decided to play hide and seek. Where have they gone? Those are nasty
53:15 gifts. Don't worry. Let me think. I'm trying to think my kubelet is not a kubelet. Know I should go to directory here. Can't really tell what you were building. I should have hidden it from PS as well. This is also a hint. You all are cruel. Okay. So we could be in a position where our API server scheduled by the kubelet just isn't running. That may be the case because got it. Okay. So that's nice. Our kubelet can't actually schedule the API server via containerd because the port has already been bound. So Noel has managed to
54:09 Noel's Cluster: Diagnosing Port Binding Conflict
54:22 I mean, I don't think systemd and I took a look in the systemd directory, but I guess it's been a lot more sneaky than that, which means in theory, I could probably just kill this process. No, it came straight back. I'm gonna stop the kubelet. I'm gonna kill this. It's gone. Install, install, reinstall, kubelet. If you hacked my kubelet. That's taken a little bit longer than I was kind of expecting it to, but that's good or bad. There we go. And then I'm going to restart the kubelet. I'm going to assume you compare join
56:16 kubelet to spin up its own API server before running a static MatterPods. I think that's maybe what you've done. No. Darn it. Okay, I'm gonna stop the kubelet again. I wanna confirm that kubelet is definitely starting this process. Desperation on the Linux command. containerd. I think that is actually just part of CTR. Okay. That's a pretty big hint. Noel says, do you trust the manifest files that you see? I mean, I think so. Okay. So you've not done something through a shell. This better not be eBPF. Do I trust the manifest files I see?
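The mismatch that raised the question, a runtime config flag visible on the running process but absent from the manifest on disk, can be surfaced by diffing the two argument lists. This is a minimal sketch, assuming the process arguments come from a NUL-separated `/proc/<pid>/cmdline` read and the manifest command is already a list of strings; the function names and the exact flag values below are hypothetical stand-ins for what the transcript describes.

```python
def split_cmdline(raw):
    """Split the NUL-separated bytes of /proc/<pid>/cmdline into arguments."""
    return [part.decode() for part in raw.split(b"\0") if part]


def flag_map(args):
    """Map '--key=value' style arguments to {key: value}."""
    flags = {}
    for arg in args:
        if arg.startswith("--"):
            key, _, value = arg[2:].partition("=")
            flags[key] = value
    return flags


def unexpected_flags(manifest_command, running_args):
    """Flags on the running process that the manifest does not declare as-is."""
    declared = flag_map(manifest_command)
    running = flag_map(running_args)
    return {k: v for k, v in running.items() if declared.get(k) != v}


# Invented example data: what the file on disk says versus what is running.
manifest_command = ["kube-apiserver", "--secure-port=6443"]
raw_cmdline = b"kube-apiserver\0--secure-port=6443\0--runtime-config=v1=false,apps/v1=false\0"

print(unexpected_flags(manifest_command, split_cmdline(raw_cmdline)))
```

A non-empty result does not say where the extra flag came from; it only confirms that the file being read is not the file the process was started from.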
58:31 Noel's Cluster: Hint - Trust Manifests?
59:32 Okay. The API server does have an extra line in it. It's kinda what I think you're telling me. But I can't see it. I hate computers. I may have asked stop my QP. The kubelet.conf should just be the user to connect to it. I don't think there would be anything in there. Seems normal. Woodworking is a great idea. Okay. So I have no idea. I'm curious what happens if I add desktop. I don't think it's adding. Let's kill the process to be sure. I think this has to be BPF and there's no way to debug that.
1:02:18 Have I broke it? It's not getting my flags I added either. You've stumped me now. Tell me what it is. I just don't know where to start now. So you're gonna have to I don't know how to use bpftool. And here is the though. Russell, I did add a new param to the YAML. It didn't show up. But I I I don't know. So Noel said he left tool, which means there is an eBPF hack in somewhere. No. No more hints, just tell us. Do I trust the off system into the nodes teleport? Not particularly. You tell me I can SSH
1:03:05 Noel's Cluster: Hint - eBPF Involvement
1:04:00 in manually. You're saying run systemctl cat teleport d. Okay. So you're using /var/tmp, eBPF kit. The source, so there's a Rawkode Kube API server manifest in /var/tmp, which is targeting a file over there. So you've got an eBPF probe that whenever the kubelet tries to access the kube-apiserver YAML inside the manifest directory, you're actually redirecting it to /var/tmp. Was that supposed to happen? Alright. I'm gonna just kill it. It does not like that. Stop the kubelet. Kill the process in /var/tmp. Oh yeah, it does not like me doing that.
1:05:38 Noel's Cluster: Fixing the eBPF Trick
1:05:57 Oh, you have to, okay, got it. So Noel in the chat is saying I'll have to stop the teleport d process with the eBPF kit. That's nicest of eBPF and really difficult to track down. I think it's fair to say that eBPF is the devil's work. Let's jump back over to here just now. I really, what I should have done at some point is run a status and take a look at everything in this list. And I would have spotted our phony teleport d. I think think I would have spotted it. So there we go. Couple of learnings
1:06:31 Noel's Cluster: Recap & Learnings (eBPF)
1:06:59 there. Always take a look at the system process table, the service list, and never let Noel and Russell near a cluster again. Yeah. I don't know what's happening here, so I'm just going to leave. And I'm gonna assume that eBPF kit is not gonna show up here. I'm curious how you had that, but we'll just leave it. That was tricky. Really, really tricky. Good work, Noel. Alright. I gave you all access to this cluster. I don't know if anyone has access to machine. I don't know if anyone's run or no one's SSH channel would give you access to.
1:07:30 Debugging Community Cluster (Destroy-It Challenge)
1:07:46 All right, I have removed. Yeah, Waleed, thanks for joining us. We'll quickly check if anyone did anything to this cluster. And if not, we'll call it a night. There has been a guess. Alright. Guess I'm fixing one more. Alright. We have a kubelet. I have an API server. It's got secure port six six three. I bet there's nothing wrong with this cluster except for me. I didn't do the export properly. Yeah. Well, maybe someone did break something. Okay. Well, let's see why our application is crashing. There's no logs on that, of course. I should know that by now.
1:09:17 Community Cluster: Debugging Application Deployment
1:09:35 Someone changed the label. Sneaky. Although, maybe it's here. So maybe it's a typo in my automation. That would be funny. Let's just bump it up to V2 always. Oh, someone's changed the probes. Can't remember the ports on this application. I think it's eighty eighty. And here, a death by a thousand cuts I can see. And resource limits look okay. And what do we get here? Okay, I can't change the labels, so maybe that is part of my automation. Fix that back. Let's see what happens. Okay, the probes aren't passing. That's typically what the internal server error means.
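A changed label matters because a Deployment or Service only picks up pods whose labels contain every key and value in its selector. A minimal sketch of that equality-based matching follows; the function names, pod names, and label values are hypothetical and not taken from the cluster in the video.

```python
def selector_matches(selector, labels):
    """True when every selector key/value pair is present in the pod's labels."""
    return all(labels.get(key) == value for key, value in selector.items())


def matching_pods(selector, pods):
    """Return the names of pods whose labels satisfy the selector."""
    return [name for name, labels in pods.items() if selector_matches(selector, labels)]


# Invented data: the selector expects app=clustered, one pod label was altered.
selector = {"app": "clustered"}
pods = {
    "clustered-abc": {"app": "clusterd"},
    "postgres-0": {"app": "postgres"},
}

print(matching_pods(selector, pods))  # no pod carries app=clustered

pods["clustered-abc"]["app"] = "clustered"
print(matching_pods(selector, pods))  # the corrected label matches again
```

When nothing matches, the Service has no endpoints, so requests fail even though the pod itself may be running.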
1:11:09 It does look okay. Maybe I should expect to No, it's broken. Okay. Let's see. It should be a node port service. Maybe someone's modified that too. Selector is wrong. Okay. So I think someone actually deleted that deployment and reapplied it with the broken labels. So it's probably not my automation. And we've got an endpoint now. Is that healthier? I think that's healthier. Possibly gonna time out because Postgres does have endpoints. What happens if I curl? Okay. What's wrong with my application? And see, don't even know who to ask for help now. I actually think if we pull up my
1:12:52 automation, I think it just runs on port 80. I don't think it has eighty eighty at all, but I could be could be wrong. No, it's okay. Or 88. Okay, so what is wrong? Clustered. We've got an endpoint. It's running. But I'm not really able to do anything. So let's try getting a shell. Let's see what happens. Let's see if gonna time out from the database or something else. Failed to connect to the database. Okay. So, alright, can it resolve? No. Okay. So get pods all. Where's CoreDNS? It is running. Let's see if it is broken DNS
1:14:20 Community Cluster: Diagnosing Networking/DNS Issues
1:14:54 and everything. I don't know if that's resolving. No. Okay. I have no networking. So it's probably not DNS. Never said that before and possibly that I just can't create a DNS within the cluster. Okay. Host is fine. Cilium appears to be okay. I'm kinda running out of time because I do have a hard stop, but I'll see how far I can get through this. If anybody made changes to this cluster and you're still watching, drop a message in the chat and maybe we can speed this up a little bit. Okay. So wonder if someone just applied cluster policies
1:16:19 Community Cluster: Fixing CoreDNS Configuration
1:16:25 rather than modifying the config map. Because modifying the config map would be a bit mean. There we go. See, sometimes you make it more difficult for yourself by looking into like, oh, they could have broken suddenly when it can just be the really, really, really simple things. Okay. So we've got some swag to give away. And thank you to anyone who broke that cluster. That was a nice easy one, which is appreciated after those two car crashes that we had. And what do we need to do? Well, we need to go to Rawkode Academy clustered
1:17:06 Community Cluster Fixed & Recap
1:17:29 live. So like I said, this repository is all online. We can run chicken dinner. And we'll see we have 19 participants in today's competition. I can stop by the timer now. We don't need that. Right. There we go. And we have 10 winners. So thank you to our sponsor Teleport who are providing 10 copies of Saiyam's CKS book, and they are providing three vouchers for swag. I am also gonna be giving away three Rawkode T-shirts. So in total, we are looking for 16 winners today. The way this is gonna work is the script just doesn't, you know, it's
1:17:36 Giveaway Announcement & Process
1:18:19 it's I need to be able to run it like so. It's just gonna spell 16 names. So what I'm gonna say is the first three names get the Rawkode t shirts. The next three names will get the teleport swag, and the final 10 names will get the CKS book. Hopefully, you all are happy with that. I will make the code around this competition a bit more sophisticated over the next days and weeks, but it just needs a bit more time. This is what I could put on very quickly, and I'm gonna go back to fighting the
1:18:51 Twitter API very soon. So Python three chicken dinner, 16 winners, please. Only three if you're not gonna win. I'm I'm really sorry about that, but that's that's the way the cookie crumbles. Okay. Rawkode T shirts go to Walid. Just me and open source and Crux. We have Teleport Swag for Noel, Kevin, and Rishba, and oh, I won. I'm gonna have to draw one more. The Meric, Russell, Safiya Safiya. I'm not sure. Adriyasha, Roberto, Jason, Philip, Steve, and Atnish. So I will save these to our winners. I will draw one more because I obviously don't want to win.
1:18:55 Drawing and Announcing Winners
1:19:51 I'm gonna move the sponsor message just now. And hopefully we don't get a repeat name. Sid Palas, there you go, Sid DevOps Directive. You have also won a CKS book. So there are our winners. Thank you everyone. That's SASSA. Awesome. No worries. I'll reach out. I have all your Twitter handles, which are stored in Firebase. I will drop you a DM. We will organize your swag, and we will be doing this every week on Clustered, so make sure you come back. We're also looking for more people to compete on Clustered and more teams. So if you wanna come
1:20:08 Conclusion, Thanks & Future Episodes
1:20:29 on and have some fun, regardless of your level of skill or experience with Kubernetes, we'll find a way to get you on and have some fun with this. So drop me a DM on Twitter, say if you want to join solo or teams or both, and we'll do our best to make that happen. I'm gonna give one final thank you to Teleport for their support and for letting us give away some swag. And I am going to say goodbye, and I will see you all later. So thanks for joining us. Have a wonderful day, and
1:20:56 I'll speak to you all soon. Bye.