Watch / Klustered Live
Overview

About this video

What You'll Learn

  1. Diagnose a broken Kubernetes control plane by verifying kubeconfigs, API server flags, etcd certificate paths, and namespace settings.
  2. Debug application deployment failures by fixing RBAC create permissions, ReplicaSet rollout behavior, and image-related configuration mistakes.
  3. Recover cluster stability after storage and scheduling failures by correcting Postgres manifests, resource quotas, and node taint handling.

Two teams, RXM and Raft, debug broken Kubernetes clusters. RXM chase an etcd cert path typo and an API server namespace typo, then a missing Postgres manifest. Raft fix an RBAC ClusterRole missing the create verb and work around a broken scheduler.
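The RBAC failure in the second cluster (a ClusterRole that lacks the create verb) can be illustrated with a small check. This is a minimal sketch, assuming the role's rules have already been loaded into plain Python dicts; the rule contents below are hypothetical and not copied from the episode.

```python
def missing_verbs(rules, api_group, resource, wanted):
    """Return the verbs in `wanted` that no rule grants for api_group/resource."""
    granted = set()
    for rule in rules:
        groups = rule.get("apiGroups", [])
        resources = rule.get("resources", [])
        group_ok = api_group in groups or "*" in groups
        resource_ok = resource in resources or "*" in resources
        if group_ok and resource_ok:
            granted.update(rule.get("verbs", []))
    if "*" in granted:
        return []
    return [verb for verb in wanted if verb not in granted]


# Hypothetical rules resembling a role that lost its create verb.
rules = [
    {
        "apiGroups": ["apps"],
        "resources": ["replicasets"],
        "verbs": ["get", "list", "watch", "update", "delete"],
    }
]
print(missing_verbs(rules, "apps", "replicasets", ["create", "delete"]))
```

With these example rules the function reports `create` as the only missing verb, which matches the kind of "forbidden" error the Raft team chases in the chapters below.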

Chapters

  1. 0:00 Holding screen
  2. 1:21 Welcome and Show Intro
  3. 1:50 Sponsors: Teleport & Equinix Metal
  4. 2:37 Introducing Team 1: RXM
  5. 3:36 Technical Streaming Issue
  6. 6:00 Team 1 Re-introduction (RXM)
  7. 6:44 Starting Debugging Cluster 1 (Raft Tech)
  8. 7:25 Cluster Access & KubeConfig Issues
  9. 10:15 Investigating API Server Connection
  10. 12:13 Checking API Server Manifest Flags
  11. 15:01 Debugging Editor Permissions (Vi)
  12. 16:05 Fixing Editor Permissions
  13. 17:42 Debugging ETCD Connection Failure
  14. 19:36 Investigating ETCD Manifest & Certs
  15. 20:57 Checking ETCD Logs
  16. 25:04 Fixing ETCD Cert Path Typo
  17. 29:21 Checking API Server Logs (Again)
  18. 30:34 Locating Namespace Typo in API Server Config
  19. 32:07 Fixing API Server Namespace Typo
  20. 32:38 API Server & ETCD Pods Running
  21. 32:58 Checking Application Pods (Scaled to Zero)
  22. 37:57 Scaling Up Application Deployment
  23. 38:28 Debugging Pending Pod (Node Taint)
  24. 39:58 Untainting Worker Nodes
  25. 40:45 Fixing Application Image Name Typo
  26. 44:04 Debugging Readiness Probe / Pod Ready
  27. 44:40 Checking Application Status (Database Failure)
  28. 45:04 Debugging Missing Postgres Database
  29. 45:59 Applying Postgres Manifest
  30. 47:51 Cluster 1 Fixed! (RXM Success)
  31. 48:05 RXM Team Debrief
  32. 49:31 Transition to Team 2
  33. 49:38 Sponsor Recap
  34. 50:02 Introducing Team 2: Raft
  35. 51:04 Debugging Cluster 2 (RXM)
  36. 52:15 Initial Cluster Check & App Status (V1 Working)
  37. 52:57 Attempting Application Upgrade to V2
  38. 55:01 Investigating Pod Errors (Old Logs)
  39. 55:40 Checking Deployment Status (V2 Image Set, ReplicaSet Failure)
  40. 56:26 Debugging ReplicaSet Creation Failure (RBAC User Forbidden)
  41. 57:19 Investigating RBAC Permissions
  42. 1:00:27 Checking Cluster Role Bindings
  43. 1:03:49 Investigating ReplicaSet Controller Cluster Role
  44. 1:04:53 Fixing Cluster Role Permissions (Add 'create' verb)
  45. 1:06:30 Forcing Deployment Rollout (ReplicaSet Delete)
  46. 1:07:02 New Error: Resource Quota Exceeded
  47. 1:08:40 Locating and Deleting Resource Quota
  48. 1:10:08 Forcing Rollout Again (ReplicaSet Delete)
  49. 1:10:21 Pod Stuck in Pending (No Scheduling Event)
  50. 1:11:24 Checking Node Status
  51. 1:13:09 Identifying Scheduler Issue
  52. 1:13:35 Implementing NodeName Hack (Bypass Scheduler)
  53. 1:14:14 Editing Deployment to Add NodeName
  54. 1:15:59 Checking Application Status (V2 Working via Hack)
  55. 1:16:33 Discussing the Scheduler Hack
  56. 1:17:11 Raft Team Debrief
  57. 1:18:58 Outro and Thanks
Transcript

Full transcript

Generated from the English captions. Timestamps jump the player to that moment.

1:21 Welcome and Show Intro

1:21 Hello, and welcome back to Rawkode Academy. This is a new episode of Klustered. I have been away for three months, almost three months now. And so I don't remember how to do any of this live stream and stuff. I don't even know if my little video is gonna pop up at the bottom and stuff. It's just been far too long. But today's episode of Klustered, we have two great teams, two very broken clusters, and we're gonna do our best to work through all of those fixes and get things jamming today. Now, before I bring on our first team,

1:50 Sponsors: Teleport & Equinix Metal

1:50 I wanna thank our sponsors for today's episode of Klustered. Teleport have been sponsoring Klustered from the very beginning. We've been using Teleport from the very beginning and they recently started sponsoring us. So mad props, Teleport, it has been great working with you. If you like what you see as we use Teleport to debug and fix these clusters today, you should just check out rawkode.live/teleport to show your support. It really helps and I will thank you much later. Also the hardware has been provided by my former employer, Equinix Metal. If you want to try a bare metal cloud, you go

2:21 to metal.equinix.com. And when you have an account, use the code Rawkode, this will get you $200 of credits to start kicking the tires. These are some pretty nice and beefy machines. So have some fun with them, I do. All right, now we're going to meet our first team for today. Here we go. You are all now live on air. So mind what you see, be polite, all that jazz. Can we start at the top right? We'll start with you, Christian. Just say hello, tell us who you are, and then we'll move round clockwise. Alright. Hello. My name is Christian Laxina, and

2:37 Introducing Team 1: RXM

2:55 I'm with RXM LLC. We are a cloud native training and and engineering company. That's me. Hi. I'm Christopher Hansen. I'm, one of the instructors for RXM. I also I'm a cloud happy to be here. Thanks for having us. Good good evening and good morning. My name is Ron Petty. I am also an employee of RXM. I'm a consultant. We we work on books and trainings and we love Kubernetes and we're we're glad to be here. Thanks for the opportunity. I think I made a mistake. I just went to my YouTube page and I'm streaming to the wrong video. So how

3:36 Technical Streaming Issue

3:40 wonderful is that for a start. Right? Mother shit. The first thing's been hacked. Yeah. But the comments are coming from the the right video. Is this a bug with my streaming software? Good question. Because I'm seeing the comments from my video, but I'm very clearly streaming to part seven of a teleport course. Alright. I'm gonna hit finish. Just change over and see what happens. Alrighty. Should burn my streaming software. Alright. So we get to do our things all over again? Well, I've just pushed the live button again. I'm gonna keep us here, and then we're gonna see what the hell is

4:23 happening. I don't know what. Here we got some comment rolling in now. Russell. Hello. Russell. Yeah. I'm gonna wait till I confirm that it's working and then I'll There we go. But it's off to a fantastic start, I've gotta say. I'm really, really happy with that. Yeah. Thanks for that. It's still not coming through. Do it live. Alright. I'm just gonna post a comment with the other video because my streaming software is YouTube being weird. Now we got a couple of top chats already. Yeah. So, I mean, the people are just tuning in there. I have no idea what's

5:27 going on. The chat from the real video is coming to my software, but the stream is going to a different video. So I'm going to assume this is a YouTube bug. I have no idea. But we're just gonna roll with it now and get started. I posted the video link and it's come through on the same chat, which is even weird. But we're just gonna get started. There's 21 people waiting in the other video and I hope they'll find their way over here. Alright. So I'm gonna I'm gonna I'm gonna be really horrible. I'm sorry. Could you all introduce yourselves

6:00 Team 1 Re-introduction (RXM)

6:02 again and then we'll get started on today's cluster? No worries. I guess I'll go ahead and get started. My name is Christian Laxina. I'm with RXM, and we are a cloud native training and consulting company. Happy to be here. Yeah. Thanks for having us. My name is Christopher Hansen. I'm also a team member at, RXM. I'm a cloud native engineer and instructor. Super excited. Right on. And and the the third wheel. My my name is Ron Petty. I'm a consultant at RXM, and I am also very excited to be here. Alright. Thank you all. Alright. Well, I think we just dive straight

6:44 Starting Debugging Cluster 1 (Raft Tech)

6:44 into this cluster. Let's see if we can get this thing working. So this first cluster is from our second team who are at Raft Tech. I'm going to pop my screen up here and we have our teleport session. I'm going to click connect and open a root terminal on our control plane node. Now, if you can all go to activity and active sessions and join this, and once you're in, if you just type echo, hello, whatever, let me know that you're there. Can see we've got one more person connected, two more, you know, yep, four, there we go. Perfect.

7:23 So we're all in the same session. This would be a good time for you to configure your KubeConfig and test to see if we have an API server. I'll let you all take it from here, but have fun and best of luck. Alrighty. Thanks. Thank you, David. So Alright. Let's go ahead and, get ourselves going. I believe there should be an export statement. Nope. Let's, we don't have an export statement. So let's go ahead and do config view. So, basically, what's happening here right now is it looks like my kubectl is being pointed to the correct well,

7:25 Cluster Access & KubeConfig Issues

8:00 yeah, it's being pointed to the correct file, but I don't seem to actually have, well, the ability to access it. So let's see here. Okay. Oh. Looks like the config is belonging to root. I just happen to be root as well. It's read write. It's not in the path. It's a little weird. which kubectl. Alright. And it's not an alias unless I'm missing it. Nope. Doesn't appear to be. Well, what does that mean? Alright. So it's not an alias. It's so it would normally be, like, /usr/bin/kubectl. Let's be pedantic here. It is there, but no permission. No permissions.

9:02 Interesting. /usr/bin kubectl. which kubectl? We're almost back to life. Maybe. We'll see. Alright. Alright. Now now, Christian, export. Go. Alright. So we'll do we'll then do the export for kubeconfig. So, usually, I believe these are kubeadm clusters. Right? So let's see. If I do ls -l or well, I am root, actually. /etc let's see. Kubernetes. Okay. /etc kubeadm. Is that correct? Ron and Chris, go ahead and yeah. Oh, here. Let me yeah. It should be /etc/kubernetes admin. So there's that. So then if you do export equals That's right. Let's see.

10:00 ADM. Enter. And then kubectl get pods. Oh, it's export KUBECONFIG. There we go. Oh, yep. One one one would break. Yeah. Yeah. So for the audience, what we're doing is, we're actually pointing our kubectl to use the admin kubeconfig. This is actually something you can only do as root. So let's see here. It doesn't look like the admin.conf is working at this point. Interesting. So let's see here. No. Ron, let's see if we act Let let's see if we actually got the correct oh, what is that? Oh, okay. So let's see if we actually have the

10:15 Investigating API Server Connection

10:52 correct environment variable name. What do you think, Ron or Chris? Yeah. But I'm pretty sure that was right. Yeah. It seems that way. I mean, we could be a little more Yeah. Yeah. That's true. Or Etsy Kubernetes. I'm like, I can I can spell? Okay. And I mean, one well, I mean, it's clearly reading it. I I guess I guess I was being dumb there. It is reading it because it wouldn't have the IP address otherwise. Right? So That's Yeah. I'm I'm I'm afraid to oh, no. I I turned off my plug in, so this teleport here should should

11:38 work here. So let's take a over a few. Is that the IP of this machine? Yeah. That's what I was gonna check too. 144492866443. So let's just try Telnet. 136. So the IP address is a BGP advertised address to this machine. I think it's safe to say that it is correct. Yep. So HTTP or yep. And then TLS. So that would imply oh, let's see. Crap. Sorry. So PS, Waxa, Grep, Cube API. I always love to see which parameters people use with PS. Well, nobody seems to use the same one twice. So look at the advertised address there.

12:13 Checking API Server Manifest Flags

12:36 Yeah. That ain't, that's not correct. That's, well, it's, it looks like that's the local address. Yeah. So let's just take a look there. So, 10 Dot 25212133. So that should be okay. Okay. Then it where's the HTTPS privileged stuff? Alright. So many settings. Well, privileged author. I really have a hard time reading these multiline things. Find Where's the setting to enable TLS or disable? I can't Let's see. Oh, yeah. The secure port is actually there. So --secure-port is in there. Ideally, the insecure port should be absent here for you security minded folks. And I don't think I see that. I

13:36 wanna move that if you guys are reading. But yeah. Yeah. Go for it. Chris, what was So oh, I just was looking. I just wanna look at the YAML file. It's a little bit easier to read the flags. It's best to put it through a pager just so that we can follow the Yeah. Line of sight as well rather than we have to scroll to where I think you are. So Fair enough. I'm looking at the flags right now. Yeah. Perfect. Alrighty. So our advertised address is different from what's in the KubeConfig, so it seems.
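The comparison the team makes here, kubeconfig server address versus the API server's advertise address, can be sketched in a few lines. This is an illustrative sketch only: the addresses are made-up documentation IPs, and the command list stands in for the `command:` array of a static pod manifest. A mismatch is a hint rather than proof of a fault, since the kubeconfig may legitimately point at a load-balanced or BGP-advertised address.

```python
from urllib.parse import urlparse


def flag_value(command, name):
    """Return the value of --name=value from a container command list, or None."""
    prefix = f"--{name}="
    for arg in command:
        if arg.startswith(prefix):
            return arg[len(prefix):]
    return None


def server_matches_advertise(kubeconfig_server, command):
    """True when the kubeconfig host equals the --advertise-address flag value."""
    host = urlparse(kubeconfig_server).hostname
    return host == flag_value(command, "advertise-address")


# Hypothetical values for illustration.
command = [
    "kube-apiserver",
    "--advertise-address=10.0.0.5",
    "--secure-port=6443",
]
print(server_matches_advertise("https://203.0.113.10:6443", command))
print(flag_value(command, "secure-port"))
```

Reading flags out of the manifest this way is the same idea as the suggestion in the session to look at the YAML file instead of the long `ps` output.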

14:21 At least it should be. Oh, yes. Yeah. So maybe just switch this, you're saying? Yeah. Give it a shot. Yeah. Yeah. Okay. So what was it? 10 dot I copied it. You can scroll up. Yeah. Not cool. This is I meant This is the 1361444928 numbers over here. Gotta be interesting. No. Stop. Alright. Add it here. Read. Write. What? Uh-oh. What's going on here? I'm gonna sledgehammer this because I'm I'm simple. Oh, come on. Is this the Oh. Is it is AppArmor No. Did they disable did they disable our vi as well? No. The permissions of vi. Oh,

15:01 Debugging Editor Permissions (Vi)

15:21 come on. They they did. Those dirty dirty who would hurt vi? Just use Emacs. What? Well, they'll never get out of it. Alright. /usr/bin. I'm gonna guess it's alternatives. Alright. /etc/alternatives. Let me guess. It's it's gonna go to, like /usr. Oh my god. /usr. And not basic. Oh my god. All these aliases. There we go. chmod 700, /usr. Thirty. I don't I thought I like GoRaft as a team, but I don't know. You mess with Vim. You you mess with my family. Alright. Did that work? Hey. We're alive. Alright. Kubernetes.

16:05 Fixing Editor Permissions

16:24 Copied something into my buffer, so I'm thinking I'm gonna bork this. Oh, no. I locked out. My plug in hold on. Let me it's this plug in. Let me add add a rules, save changes. Ernie, my plug ins my plug in is a VI plug in for Chrome. What was the IP again? Oh, man. There oh, I do have it. Alright. Hey. 6443. Oh, wait. We we we can't use comments. No. That's lovely. Alright. So I get. Let's see. Alright. And 2562. Make sure nothing else has changed in the meantime. 6443. Is refused. Interesting. Stat.

17:32 Nothing is listed on 6443. Yeah. Alright. So what happened there? So I think we need the more observational view here. So Airport 6443. Where'd you say you saw the insecure? Oh, I I don't think it I don't think it was there. Truly. Yeah. It's not there. Let's see. I think I think it was he was just using just talking for education purposes. You know? It shouldn't be on. That's the teacher in in Christian. Yeah. Alright. So we are secure port 6443. So it should be, doing now, usually, doesn't the, Kube API server run on the

17:42 Debugging ETCD Connection Failure

18:30 host network? So can we check if this has, host network true? Yes. One second. Go. And host network is true. Alright. So it's definitely it should be listening on that address and that port. Yeah. Let's take a look at the logs. Oh, wait a minute. We can't connect it, so we have to do tail dash whatever. Oh, it's just tail at the start. Var log Kubernetes. Yeah. Done. Failed to con two, three oh, etcd. A local host. Yeah. Everybody's favorite component. So I'll make sure it's yeah. No. It's not. No etcd. What's going on here?
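The two symptoms seen so far, nothing listening on 6443 and the API server failing to reach etcd on localhost, both come down to whether a TCP port accepts connections. A minimal sketch of that check, assuming it runs on the control plane node itself; 6443 is the secure port from the manifest and 2379 is etcd's default client port.

```python
import socket


def is_listening(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


# API server secure port and etcd client port on the local node.
for port in (6443, 2379):
    state = "open" if is_listening("127.0.0.1", port) else "closed"
    print(f"127.0.0.1:{port} is {state}")
```

A closed etcd port explains a closed API server port, because the API server exits when it cannot reach its datastore, so etcd is the component to fix first.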

19:36 Investigating ETCD Manifest & Certs

19:36 Let's check it for the Do have the manifest is in there? Yep. Yep. Yep. It's there. Let's see if the but let's see if it's actually valid. Or read only or n empty. K. So v or let's cat /etc/kubernetes. Alright. And it's advertising it's not advertising on the local host. It appears to be advertising on the on the node's IP. See that first flag up there, Ron, on advertise client? Oh, that's client URLs. Excuse me. Yeah. So hold on. We should if it's not starting, there should be a log somewhere. Right? So etcd's log should be a pod,

20:39 but we can't get to the API server, or can we? So so this is a static pod. So where do static pod logs go? They do they always go to do we just not think about this? Is it alright. There's at CDs. See, no such directory. Oh, let me guess. We've port our certs. Okay. So let's take a look at this guy. What's in there? Okay. So it'd be peer, that cert. Or well So well, let's see if it's just hidden or something. Or wait a minute. Client is sorry. Let me use my brain here. So

20:57 Checking ETCD Logs

21:45 this is etcd that's failing to start because it can't find it for client client. I guess we could just use a peer cert. What is as long as they all have the same CA, it shouldn't matter. Right? Right. Well, yeah, we don't really care about the clients. Yeah. Right. We don't care about the clients, but let's check that out, the API server real quick. The client cert, though. So Yeah. It's a redial probably because it can't talk to etcd. So this guy let's see. What is he using? API server etcd client. Client cert.

22:30 My god. I'm having a brain fart here. Yeah. No. No worries. So it looks like here, those are all pretty standard for the API server. But it seems like etcd itself, can we check the spec for it? Seems like etcd itself is trying to refer to that client.crt, which clearly doesn't exist. So let's see if we have a client.crt somewhere. Go ahead and type it in. Well, it's not yeah. That's what we're looking at, the /etc/kubernetes. That's this directory. It's not I guess, peer cert is what we're supposed to use, maybe.

23:07 Cert file is client.crt. So I think Oh, point it to Kubernetes. Yes. I don't remember what was in the directory. Oh, what was in the directory was I don't know if you can see highlights here, folks. But Oh, wait. I I got it. I found it. It's /etc/kubernetes/pki, API server. Or, I mean, it should it should be the thing we were ignoring before. Yeah. But this but it looks like this error here is referring to that. So I believe this should be the server.crt. Oh, yeah. Cert file. Duh. Yeah.
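The fault being hunted here is a certificate flag that points at a file that is not on disk. One way to spot that class of error without reading logs is to walk the manifest's command flags and test each certificate or key path. A minimal sketch, assuming the flags have been read into a list of `--flag=value` strings; the demo directory and file names are hypothetical.

```python
import os
import tempfile


def missing_files(command, suffixes=(".crt", ".key")):
    """Return (flag, path) pairs whose cert/key path does not exist on disk."""
    missing = []
    for arg in command:
        if not arg.startswith("--") or "=" not in arg:
            continue
        flag, value = arg.split("=", 1)
        if value.endswith(suffixes) and not os.path.isfile(value):
            missing.append((flag, value))
    return missing


# Demo in a throwaway directory that contains only server.crt.
with tempfile.TemporaryDirectory() as pki:
    open(os.path.join(pki, "server.crt"), "w").close()
    command = [
        "etcd",
        f"--cert-file={os.path.join(pki, 'client.crt')}",       # not on disk
        f"--trusted-ca-file={os.path.join(pki, 'caa.crt')}",     # misspelled name
        f"--peer-cert-file={os.path.join(pki, 'server.crt')}",   # exists
    ]
    for flag, path in missing_files(command):
        print(flag, "->", os.path.basename(path))
```

The check only proves that a path exists; it does not validate that the certificate is the right one or is signed by the expected CA.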

23:45 So that's the server cert. Cert. And if this is a one this is a one node kubeadm master. Correct? It is. Alright. So that means that it will eventually come up. Or it crashed so fast. Yeah. So let's go see. Oops. Here. Anything etcd? No. That's I think that's it. Yeah. Yeah. But I I can never remember if it changes. So we just wanna see that time stamp is updating. Right? But did should it did change. It doesn't align. I can't read it. Alright. It was twenty three, so let's just watch it. Oh, it did up twenty four now. So

24:47 it should be Yep. Should be there. Yeah. Your tail command is using the API server instead of the etcd log. That's not the etcd log? There it is. So what's this now? Ah, caa cert. Too many too many a's. You trying to are you trying to speak the SLA of our certificate? Shall I make sure a, man. Yes. Go for it. Yeah. Okay. The cone the cone of shame is yours. Alright. So caa.crt. Alright. This is a oh, that's sneaky. Alright. Let's go before we leave, let's go

25:04 Fixing ETCD Cert Path Typo

25:28 to the if it was them. Let's see. So I'll take back the Koenig then. Yeah. Peer cert. Peer cert. Alright. So let's go ahead and try it. I do recall that the API server is also trying to connect on local host, but we'll see how that oh, wait. No. It's still One one one problem at a time. Time. Yeah. Yeah. One. Alright. So at time. Or watch this again. Lookup magic. Before. So It's twenty five after, so it should oh, there it goes. It's been deleted. I feel like this watch default setting of two seconds is more than two seconds.

26:13 Right? It should, like, be, like, a little spinner or something to prove that it did something. I guess on the right, it is. I think that's the clock. Yeah. Yeah. There it is. Alright. I guess I'm impatient. I should read the whole thing. Every two seconds. So we're just waiting for it to come back up. Wait. Wait. Wait. Were we we were editing that in place, weren't we? Yes. We were. So remember that bug, KubeCuddle? Yeah. It came up, though. Kubelet. First the first time. Okay. Alright. We we were being positive here. It came up Oh, there it is. Kubelet has that

26:56 bug where it sometimes won't read a file you edit in place. So That's true. Do I look better a little? Alright. Let's see if the API server came to life. Nope. Not yet. So let's check the API server log. Yep. Alright. Oops. Let's just do this tail. Reverse flow. View server. Context deadline exceeded. Come on. That's what you get. You were right. 127. Go for it, Christian. Yeah. The the localhost was the Alright. So, what was the IP again, Ron? Do you still have that new buffer? Oh, go ahead. Sorry. No. I I just

27:45 killed you. No. I didn't. Wait. Wait. Wait. Wait. The it's the etcd-servers flag. Right? So Yep. I think that's normal, guys. Yeah. This one? Localhost is normal. What's normal? For API server. That setting. Oh, connecting to the to the etcd via localhost is normal. So Right. Resolving a problem that's not a problem. Oh, come on. So what I'm gonna do is I'll grab that because there is something that caught my eye inside the the YAML, and it's it's this one here. So client URLs are client connections. It's listening for on this IP address and Alright. Not

28:33 local host. So let's w q out of that. And, Ron, let's do your The local one should work too. I mean, it's the same. Right. Everybody's if everybody's if it's listening to zero zero zero. That's true. I mean So now we have to wait for etcd to come back up. Hey. Oh, there it is. I guess I guess you're right. Alright. Guess I'm on. I'll shut up now. What's Kubernetes? I've never heard of it. I know. Right? So let's Alright. Wait. Yeah. That should been fine. Oh, come on. Get pods. Don't be cruel. This is like killing hello world. Why did

29:15 we kill hello world? What did Hello World do to you? I haven't checked the API server logs. I I'm not convinced you have the API server and etcd talking to each other yet. Yeah. That's right. Well, that log is that's the wrong log file because that's four or five minutes old now. It's the but it's oh, it did sorry. You're right. You're right. Thank you. Thank you. Thank you. Delete faster. Delete. Delete. Oh, yeah. Now I don't know which one is the latest. Number one, right, or something like that because it rotates. Right. What am I looking on here?

29:21 Checking API Server Logs (Again)

29:59 For 1727, just about a couple minutes ago. So 90D. Yeah. Just a 3. Yeah. Yeah. They had a comma one before, but I don't see that now. Whatever. Oh, man. Some somebody hurt our etcd. Storage key not found. Registering master leases. Alright. Type this wrong. kubeadm reset. I will read that read those two error messages slowly. The one above the story chatter. I think you're missing something. So we got this. Oh, so oh. Alright. Our this is this is coming from our admin. No way. Explain it, Ron. Explain what you saw. So for the audience It's kube-system

30:34 Locating Namespace Typo in API Server Config

30:56 is misspelled. Yeah. System or system with no e. Where am I looking? Am I so you're looking here. Nope. Don't think it's in there. Is this not where you set it? It's in here. Right? Like Or Hold on. Let me set that up. I'm looking for the namespace. Works. Isn't it in our kubeconfig file? Am I blind? No. Not unless you set it explicitly. So that would mean it would be going to default. Was I supposed to API server log? Yeah. It's you're looking at our admin.conf here. So Oh, wait a second. Yeah. This is for

31:56 the. Manifest hold on. These manifest a p error, q API error. There we go. There it is. Womp. Someone drop me a womp, please. Alright. How do you have the somebody has a this is lot of typos. Somebody should really was this was this Ron or me typing? Is that what happened? Did we make this kind of a This wasn't a hack job. This this was just my bad skill. We got how many API servers we got going here? We got two of them now. That's not good. Hey. We're in stereo now. Hey. It worked.
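The namespace fault was a near-miss spelling of kube-system that is easy to read past. Near misses like that can be flagged mechanically by comparing names found in a manifest against the well-known namespaces. A minimal sketch using the standard library's fuzzy matcher; the token list is hypothetical, and how tokens are extracted from the manifest is left out.

```python
import difflib

# Namespaces that exist by default on a Kubernetes cluster.
KNOWN_NAMESPACES = ["default", "kube-system", "kube-public", "kube-node-lease"]


def suspicious_namespaces(tokens, known=KNOWN_NAMESPACES, cutoff=0.8):
    """Map each token that is close to, but not equal to, a known namespace."""
    hits = {}
    for token in tokens:
        if token in known:
            continue
        match = difflib.get_close_matches(token, known, n=1, cutoff=cutoff)
        if match:
            hits[token] = match[0]
    return hits


# Hypothetical tokens pulled from a manifest.
print(suspicious_namespaces(["kube-systm", "default", "etcd"]))
```

The cutoff of 0.8 is an arbitrary choice for this sketch; a lower value catches more typos at the cost of more false positives.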

32:38 API Server & ETCD Pods Running

32:43 One of the API servers worked. It's horizontally scaled. That's right. Oh, there's Amazon Just update the and that's it. Just update the pod. Alright. Alright. Someone else take over. I'm too nervous. Alright. I'll go ahead. First off, let's get all our pods and all of our namespaces. Wait. Wait. Yeah. Wait a second. Where's the app? It should be showing up in the get it should be in the default namespace. Where's our default namespace? We are in the default namespace. Oh, wait. Why don't we where's our app and where's Postgres? Let's see. Trickery. I called make it explicit. Do -n

32:58 Checking Application Pods (Scaled to Zero)

33:24 default and check. Alright. Maybe the maybe there's some crafty thing where they've eBPF'd us. I don't think you'd be eBPF. I mean, if you look at your kube-system namespace, you've got one important thing at an address space. Right? Yeah. That's true. So let's yeah. So that means that the deployments And the controller manager is kind of important. Right? Yeah. Oh, yeah. But, well, we don't need the controller manager if we bypass it. Yeah. That and a cluster has also been scaled down all the way too. Same with the STS, I believe.

34:00 The StatefulSet, I mean, for Alright. Well, the controller manager needs to get fixed. That's why these are zero. So let's let's go after that. Alright. Ron, I'll leave you to the go ahead. I heard it I heard it works if you let it restart 200 times. Maybe we should just wait. The two hundredth time. It's like, yes. Oh, come on. Overcome my difficulties. Oh, really? Come on. Here. I can't I can't live like this, like a caveman anymore. Alright. Here. I got I've got your Hold on. I've got that on the buffer. Completion bash. Alright. I'm I'm no I'm no longer

34:40 a caveman. Alright. Describe. Pod. Describe pod. Why can't I just read my mind? Come on. Of course, my mind's empty. So you know? Alright. Image is here. Boom. Boom. Crash. Why are we crashing? Yeah. Let's check our logs on it. Yeah. Why can't you be friendly? You can use Kube CTL logs now, Ron. Savage. Alright. Alright. We got two logs here. Who's who's the newer log? The top guy is the newer log. Right? 17331738. Alright. Now let's go open this guy. Adam. Alright. I'll start walking backwards. Missing time stamp. Sorry. Tell me to go slow if we're

35:55 going slower. I'm just kinda seeing if there's, like, a big cat. I'm looking for ASCII art. Where's the big thing saying you got hacked? I don't I don't see it. How far back in time? We're still at seventeen thirty three. Oh, this is at the top. Port has been deprecated. This flag has no effect. Well, maybe some kind of studying here. What is happening here? Starting. Waiting. Let's see. Did the should maybe check to see if it crashed because it is oh, wait a minute. Can we do, like, -p? Oh, yeah. kubectl logs. Yep.

36:39 Like yeah. Let me use the use the tools. There we go. So That's a big stack. -p is the previous container, the one that's previous crash is what we're looking at. Right. Look at all this Go code. Expository. You wanna pipe it to a pager so we can we can follow along? Oh, it's a oops. Sorry. My bad. I thought I I thought I had it in vi. That was my my bad. Okay. Here we go. Let's check failed. Failed to wait for API server being oh, wait. For API server was down at that point, but that it was down back

37:26 then, I think. That's why you've had to look at the previous logs. Because it's restarted since we were talking about it. Yeah. Maybe now it's healthy. Yeah. Maybe. Do like it when things fix themselves? Right? There we go. Hey. Hey. Round of the clock. We're restart all the through. Yeah. Just have to restart until it started again. You know? Wait. Wait. Automation automation works. Wait. We still don't have a default anything. I know. Right. Where's our Just scale up just scale up the deployment. Alright. Christian, go. It was at zero? Yep. Get deployed. Christian. So alright. So then I'll

37:57 Scaling Up Application Deployment

38:07 do kubectl. Good. Show it no mercy. Deploy one. Deploy clustered. Was it one? Yeah. Yeah. Let's just do one for now. Oh my gosh. K. Forgot a getting close. Gonna check if we have our pod first. Right. It is currently pending. You oh, you cursing schedule, are you? Yeah. Let's just grab this pod here. It's been No. No. It's a taint. Wait a minute. Why is it going? Yeah. What's going to the next one? We don't need to we don't need a machine. It's true. Just get it. Just remove the taint. The clock's ticking. Yeah.

38:28 Debugging Pending Pod (Node Taint)

38:53 So I'm going to so they're currently unschedulable, so that means they're not accepting new loads from the scheduler. So we'll do kubectl uncordon raft was it uncordon node? Raft worker one? I don't think you need to it. It's just uncordon raft worker one, I think. Yeah. There you go. And two. Alright. Oh, we just need one in. Yeah. Just come on. It may be it may have an affinity for the master. Right? Alright. Because it's zero of one nodes. So I'm guessing the the deployment has a, affinity. Yeah. So that's what I'm gonna check for

39:28 by clicking at the YAML. Yeah. And oh, right. Let's go. There you go. There you go. So Wait. Cluster d d d d d. That's a lot of t's right there. So we're gonna have to fix that. That's the way it goes. The show. It's cluster. Alright. You gotta fill that buffer for, you what we tell people. I don't see any affinities. Do you, Chris or Ron? You gotta search, man. Do do you're in VI or less. I'm on less. Hit forward slash. Forward slash. Okay. Forward slash. And then backslash c, lowercase c, and then

39:58 Untainting Worker Nodes

40:11 you said taint. You just look for taint. Right? Yeah. You don't need to do capital t. The the backslash c was case insensitive. Alright. Hold I was Oh, yeah. No. You're looking for a taint is a node taint. You're looking for node selector or an affinity, so look for it would be under the pod spec. So go well, so We're on dot containers. I don't see node. I don't see n. I don't see affinity. Alright. We'll fix the fix the image. Alright. So Watch. That is the name. There you go. That's the So we'll do we'll edit it and Minimum

40:45 Fixing Application Image Name Typo

40:55 replicas unavailable. Because we And then I say forget it. Just untaint the master node and Yeah. That's true. You know? Who cares? Isn't that what we did? No. We yeah. I did the the workers. Oh, man. Come on, man. I said Oh, no. That's still it's still Replicating out of control? Oh, yeah. Just just to do your just remove the master taint. Right. kubectl. What was a taint? I'm sorry, guys. kubectl taint. No. No. And then you can do all. Yeah. kubernetes.io slash. Yeah. So so so so It's node-role.kubernetes.io

41:38 slash master. Okay. So node Why can't autocomplete fix this? Right. So it was node It's 2021. Come on. Alright. There you go. Yep. There we go. No. Okay. Error image pull. Okay. So it's still the thing is still not going through. Plus third? Third. Yeah. Edie. It's d. It's it's it's not no. Wait. It's not e d. It's d. Clustered. They're creating. It's it's creating. We're good. Oh, wait. Did it oh, I I can't spell then. It is easy. I can't either. Right? I was trying to Alright. We got, like, four minutes left, so it'll get pods. It's running.
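The pending pod in this stretch came down to a node taint that the pod did not tolerate. The matching rule can be written out as a short function. This is a simplified sketch of how a toleration is compared with a taint, not the scheduler's actual code, and it ignores details such as `tolerationSeconds`; the taint shown uses the master role key named in the session.

```python
def tolerates(toleration, taint):
    """Simplified check of one toleration against one taint."""
    effect = toleration.get("effect")
    if effect and effect != taint.get("effect"):
        return False
    key = toleration.get("key", "")
    if toleration.get("operator", "Equal") == "Exists":
        # An empty key with Exists tolerates every taint.
        return key == "" or key == taint["key"]
    return key == taint["key"] and toleration.get("value", "") == taint.get("value", "")


def blocking_taints(taints, tolerations):
    """Return NoSchedule/NoExecute taints that no toleration matches."""
    return [
        taint
        for taint in taints
        if taint.get("effect") in ("NoSchedule", "NoExecute")
        and not any(tolerates(tol, taint) for tol in tolerations)
    ]


master_taint = {"key": "node-role.kubernetes.io/master", "effect": "NoSchedule"}

# A pod with no tolerations is blocked by the taint.
print(blocking_taints([master_taint], []))

# A pod that tolerates the key with operator Exists is not blocked.
print(blocking_taints([master_taint], [{"key": master_taint["key"], "operator": "Exists"}]))
```

Removing the taint from the node, as the team does, and adding a toleration to the pod are two routes to the same result; the first changes the node for every workload, the second only changes the one deployment.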

42:32 Alright. Wait. I forgot to start the timer on time, so you've actually got a bit longer. So you're good. Okay. Alright. Go. Go. Go. We gotta do a set image now to version two. Right. So kubectl set... We might drop on step one or two. Deploy clustered. And then what was the name? I don't know the name of the container, dude. I mean, it was... Scroll up. Scroll up, didn't you just show that? The container is clustered. It's just called clustered.

43:07 That's the... Oh, yeah. If it's just the deployment, I think it's the same name. Right? Yeah. Whatever. We'll find out. Oh, wait. Don't get cocky yet. We're so close. Yeah. Dude, good time. Rollout history. Rollout history. See if it's... Oh, you didn't record the rollout history, dude. Come on now. They deprecated it. Oh, wow. That's been awful. There's been a lot of changes made to this one. But, yeah, no, I did not record the rollout history. So let's see what's our next move for this one. Just describe the
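The hunt for the container name above is needed because `kubectl set image deployment/<name> <container>=<image>` addresses a container by name. A rough Python sketch of the same update on a deployment held as a dict; the image strings and the `set_image` helper are hypothetical, and only the container name `clustered` comes from the conversation:

```python
from copy import deepcopy


def set_image(deployment: dict, container: str, image: str) -> dict:
    """Return a copy of `deployment` with `container` switched to `image`."""
    updated = deepcopy(deployment)
    for c in updated["spec"]["template"]["spec"]["containers"]:
        if c["name"] == container:
            c["image"] = image
            return updated
    # Guessing the container name wrong should fail loudly, not silently.
    raise KeyError(f"no container named {container!r}")


# Hypothetical deployment; the registry and tags are placeholders.
deployment = {
    "metadata": {"name": "clustered"},
    "spec": {
        "template": {
            "spec": {
                "containers": [
                    {"name": "clustered", "image": "example.invalid/clustered:v1"}
                ]
            }
        }
    },
}

upgraded = set_image(deployment, "clustered", "example.invalid/clustered:v2")
print(upgraded["spec"]["template"]["spec"]["containers"][0]["image"])
```

Changing the pod template is what triggers a new ReplicaSet, so this one field edit is the whole "upgrade to v2" step.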

43:42 pod and see if it says v two, and then let's open up the browser and see if it worked. Oh, you've got a few more things to fix before it works. You've just not seen them yet. That was the angel of death talking. So it is not ready, though, unfortunately. So it looks like we do have a couple more things. It seems to be failing. It's a readiness probe, if there is one. Delete the readiness probe. Go for it. I like you. Thank you. Yeah. Alright. Go fast. Readiness probe. Yeah. No.

44:04 Debugging Readiness Probe / Pod Ready

44:21 Don't do this. Kids, don't do this. Forget that. Just forget that there was a readiness probe. You know? It's fine. There's also a rogue replica set that seems to be trying to do something. Not sure what's up with that fella. But it's ready now. Woo hoo. Alright. Wait. Wait. We're not done. There could be, like, a service blocker, you know, or something like that. We gotta make sure we can... Okay. So this is where I go to the Teleport page. I click on applications and we

44:40 Checking Application Status (Database Failure)

44:51 have a nice little helper to launch our clustered app. And what you want to see is me dancing. Are we gonna see me... No. I wanna see you dancing. Come on up. Oh, no. Unable to connect to database. Because our Postgres is not up. Yep. It's actually gone. Get deploy. What if the stateful set timed out? Alright. Is it scaling? Well, there is no stateful set. So... Yeah. Was Postgres... Yes. On its own? Yeah. So there's a ClusterIP, but there's no... Yeah. There's no database, unfortunately. That's such a shame.

45:04 Debugging Missing Postgres Database

45:35 So, oh, no. That's such a shame. But we got the pod up and running. So what's next? Yeah. Who needs data, right? Also, what's up with Ambassador being down? That's a bug I need to fix. Don't worry about that. Okay. Yeah. That's fine. So slow down. There's no Postgres pod. Do we have Postgres YAML? Is that in this command? The YAML is from github.com/rawkode/klustered. Oh, okay. So there's a workload directory. Slash rawkode slash cluster. Klustered. Oh, cluster. Right. Klustered. Train node e. And inside the workload directory, you'll find opt

45:59 Applying Postgres Manifest

46:25 Kubernetes, and you'll see opt Kubernetes. Everything there? Yeah. Postgres is there. There you go. Oops. Not blame. I don't wanna blame. That seems dirty, deleting the actual code. Yeah. Right? We should have just deleted the Kubernetes repo. You got the answer. That's not gonna work. What happened? Set pace there. Yeah. Hey. I was gonna say, it's a public repo. You could just kubectl apply it from there. Hey. He wanted to do the caveman art where you blow paint on your hand. Right? Fine. Yeah. That's true. Alright. So let's do kubectl

47:08 apply minus f from that guy. Oh, look at you being all fancy pants. I'll do that. That's right there. Unchanged. Unchanged. Created. Get some pods. PVs and our PVCs. Just make sure that it's not, you know, backed by anything. Alright? Or it should be backed by something. Yeah. Let's go check that repo there. Boot you up, man. Probably has a probe or something, so it's taking a second here. Likely. Yep. Do it. Oh, go ahead. Sorry. Alright. Let me refresh our clustered application.

47:51 Cluster 1 Fixed! (RXM Success)

47:53 Oh, yeah. Hey. You got the dance. Well done. Nice. There you go. Nice. Alright. Good job. There was a lot to fix there, right? Yeah. Did we beat the time? You did. I mean, I started the timer late, so I'm not actually sure how long it was. But I think you were under the forty minutes. I think you were okay. There was twelve minutes left on the timer and I started ten minutes late. So I think you had three minutes left. Alright. You just got a minute. What, you mean

48:05 RXM Team Debrief

48:28 there was one second left and we defused the bomb? Exactly. Yeah. Awesome. Yeah. That's how it went. We'll edit the video to reflect that. Don't you worry. But no, good job. That was good teamwork. You communicated well. You worked through lots and lots of typos, death by a dozen cuts there, almost. But you got to the end. Oh, yeah. Deleting the stateful set, I thought, was particularly sneaky. But there you go. Alright. Awesome. That was a lot of fun. Yeah. That really was. So we should go to the YouTube

48:57 now and watch, hopefully, the reverse penalty. It's a different video because of the small hiccup that we experienced at the start. But if you go to Rawkode.live, you'll be able to see that we have a session live in action. So feel free to jump out of this call, jump over there, and we'll have the Raft team join us in just a moment. So thanks again. Well done. I'll speak to you all soon. Alright. Thanks. Good luck. Go Raft. Thank you. Alright. So the Raft team will be coming to join us in just a moment once

49:28 I'll just boot out all the... There we go. So Raft team, come and say hello. While we wait, just because of the awkward start, I'm going to thank the sponsors again. Teleport, which we've been using since the start of Klustered, is an amazing product. Check it out at rawkode.live/teleport. I really appreciate that. And Equinix Metal, my former employer, has been very gracious and continues to sponsor the hardware that we use on Klustered as well. So if you want to check out a bare metal cloud, you can use the code Rawkode for 200 USD in credits. Go spin up some bare metal.

49:38 Sponsor Recap

50:00 Alright. We have the Raft team. So let's bring them in. Hello. Hello. Hello. How's it going? How was that? Was that fun to watch? It was a blast. Sure was. Were you screaming at the television going, Josh, it's right there? No. That was a great job. It was hard. Alright. Well, thank you again. Can we start with a little bit of introductions? We'll just start with you on the left there, Hitesh, and then we'll move over, and then we'll get started with the next cluster. Sure. So hey, everybody. I'm Hitesh Sharma. So I'm a senior engineer at Raft,

50:02 Introducing Team 2: Raft

50:39 and I've been playing with Kubernetes for about a couple years now. So really excited to see what we have to fix here now. And I'm. I'm the lead R&D engineer at Raft. I've been working with Kubernetes for a little bit over three years now. Really excited to see what they broke in this cluster. Alright. Awesome. Thank you very much. So let's get my screen shared. There we go. I'm going to open a session on the control plane node of this cluster. If you could please go to activity and active sessions and join this session, just give

51:04 Debugging Cluster 2 (RXM)

51:18 me an echo hello to let me know that you're there and then we'll kick things off. There we go. One. Alright. Let me refresh. Remember, it's under activity and active sessions. Got it. Alright. I can see we also have the RXM team in the chat saying, let's go. So they're prepared for this too. There we go. And we got you both here. Sweet. So export your kubeconfig, test for an API server. Best of luck. Take it away. Alright, bro. Go for it. I'll wait ten minutes before I start the timer. That's what I did with the first cluster.

52:13 See what we got. That's alright. I jumped on. We can go home. I think we're done. Alright. K. Everything is running. I hope they remembered to break it. What do you wanna do? Edit the deployment? Let's see. I was gonna check if we can get to the application. Yep. Do you want me to pull it up? Sure. We go to applications. We'll spin up clustered. Yep. Version one does appear to be working as well. Okay. Let's try v two. Just edit the deployment. Let's see. Adding some of the shortcuts. Let's see what we have there.

52:57 Attempting Application Upgrade to V2

53:31 V two? Yep. Let's try. Interesting. A lot of... Alright. So let's see. Check the scheduler. Go for it. You have access. K. That looks good. Just the unexpected end of file. Yeah. Did they put some tabs in the YAML? Like, there's a connection refused too. Yeah. End of file. I mean, these error messages do appear to be from yesterday. Let's see what they did. Oh, wow. Look at that. I think that's just me. Okay. Yeah. There's nothing there. We tried. That's not how we tried. Nice try though. So let's talk about what happened there. Right?

55:40 Checking Deployment Status (V2 Image Set, ReplicaSet Failure)

55:44 You did a kubectl edit of the deployment and you believe you've set the version to be v two. And nothing's happened. So maybe you wanna confirm that? Yeah. Do a describe on the deployment. Did they mess around with RBAC? K. Alright. So that's v two. But zero replicas created for the new replica set. What's responsible for the replica sets? So this one is not ready. This is the new one. Where is my mouse? Non existent. Yeah. Isn't that clicking on the window? What are you trying to

56:26 Debugging ReplicaSet Creation Failure (RBAC User Forbidden)

56:55 do? Describe the new RS. Let's do the... do we do the source completion? Yep. User forbidden. Service account. Now let's take a look at the... I wonder if... What are you wondering? Talk to me. I'm trying to think if the service account should have had labels to access that. So the user we're seeing under the gray isn't the default service account. Because you described the default service account in the default namespace, but this is the replica set controller, part of the controller manager. Yeah. This is from the kube

57:19 Investigating RBAC Permissions

58:39 system. Can you do describe service accounts in kube-system? Now let's look at the... Just get all of them. K. Let's do describe. Same one. But what I'm trying to look for... Isn't it gonna be for the replica set, though? Is there an error around the replica set? The replica set controller? Yeah. K. So the service account doesn't have any permissions, right, in the Kubernetes cluster? What do we have to look at next, if the service account has no permissions? Well... We'll need to look at the binding? Yeah. Exactly. Let's look at the role bindings and see if we can work out what's

1:00:08 going wrong here. Go for it, Brock. Oh, we forgot to use our favorite tool. Yeah. Hey. You know what? Let's stop being a caveman. He thinks there's good one. Yeah. Alright. I gotta do the export thing. Yep. K. Let's get the role binding. Is there a role binding missing then? Is this all for the... no. Sorry. Sounds about right. Is there any role? Say that again? Roles. Yeah. Okay. Let's go back to the error because I feel like we've, yep, maybe gone too deep into one place. Go to the deployment and see your replica sets.

1:00:27 Checking Cluster Role Bindings

1:01:50 Yeah. Cannot create. User system:serviceaccount:kube-system:replicaset-controller. So we just need to give access to the replica set controller. Yeah. We need to give it a role binding. Yeah. So for that, we can do... That's tying this one, right? Yeah. Okay. Do you have that up? Checking real quick. That is evil. Well, I guess the service account is fair. It's making me question some of my knowledge here because I can't remember if the controller manager RBAC rules are hard coded or not. I actually expected to see more from a kubectl get role bindings. Ah, but

1:03:16 it may be a cluster role binding. We didn't look at those. So I would do a get cluster role binding, since those span namespaces. There you go. That's looking much better. Now we can see that we have the replica set controller role binding there, binding to the cluster role of replica set controller. So you're gonna wanna start introspecting those two. The role binding doesn't... so we need to give access in the default namespace, I'm assuming that's correct? This role binding here, I'm gonna assume, binds our replica set controller

1:03:49 Investigating ReplicaSet Controller Cluster Role

1:04:05 service account to this cluster role. So maybe start by just taking a look at that cluster role and see what permissions it has. You wanna do a describe on that? I'm just gonna check something. Go for it. While I'm looking something up, you can go for the replica set controller. Yeah. So cluster role, space. And then, alright. Yeah. That way works too. Seems okay. Create, list, delete, update. I think we're missing a verb on pods. Yeah. Create, aren't we? Do edit. Is it over here? Am I in the right place? Yeah. You'll need to search for

1:04:53 Fixing Cluster Role Permissions (Add 'create' verb)

1:05:23 pods. There we go. It still doesn't seem like it's coming up. Might have to kill it again. Alright. Might have to trigger it again. Or not. Sorry. Deployment. Image. I'm downgrading it back to v one. Okay. And then putting it on v two. It can't create because it did something or thought it did something, but it doesn't look like it is picking up the changes yet. So let's take a look at the logs of the deployment again. And you could just delete that replica set there, and it should force it to recreate it.
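The RBAC fault found above (the replica set controller's cluster role lacking `create` on pods) can be illustrated with a simplified rule check. This Python sketch assumes ClusterRole rules shaped like the `rules` list in a manifest; the verb lists are made up for illustration, and the matching ignores `resourceNames` and non-resource URLs, so it is not the API server's full authorizer:

```python
def rule_allows(rule: dict, verb: str, resource: str, api_group: str = "") -> bool:
    """Check one RBAC rule; "*" acts as a wildcard in each field."""
    def matches(values, wanted):
        return "*" in values or wanted in values

    return (
        matches(rule.get("apiGroups", []), api_group)
        and matches(rule.get("resources", []), resource)
        and matches(rule.get("verbs", []), verb)
    )


def role_allows(rules: list, verb: str, resource: str, api_group: str = "") -> bool:
    """RBAC is additive: any single matching rule grants the request."""
    return any(rule_allows(r, verb, resource, api_group) for r in rules)


# Hypothetical rules resembling the sabotaged role: pods has no "create".
broken_rules = [
    {"apiGroups": ["apps"], "resources": ["replicasets"],
     "verbs": ["get", "list", "update", "watch"]},
    {"apiGroups": [""], "resources": ["pods"],
     "verbs": ["delete", "list", "patch", "watch"]},
]

# The fix applied on stream: add "create" back to the pods rule.
fixed_rules = [dict(r) for r in broken_rules]
fixed_rules[1] = {**fixed_rules[1], "verbs": fixed_rules[1]["verbs"] + ["create"]}

print(role_allows(broken_rules, "create", "pods"))
print(role_allows(fixed_rules, "create", "pods"))
```

On a real cluster, `kubectl auth can-i create pods --as=system:serviceaccount:kube-system:replicaset-controller` asks the API server the same question directly.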

1:06:30 Forcing Deployment Rollout (ReplicaSet Delete)

1:06:54 There you go. I mean, it's still not creating a pod, but... Let's see what this... Oh, different error. There you go. Exceeded quota. Oh, that's fine. We can just delete that other pod, and it should be fine. Yeah. I mean, sneaky, but... No. Pending. That's v two? Well, it's still v one, though. Because that is still gonna deploy v one. Yeah. The deployment is set to v two right now, right? The deployment is set to v two. V two. So replica failure, failed to create. What is this saying? Minimum replicas unavailable.

1:07:02 New Error: Resource Quota Exceeded

1:08:00 But I think this needs to be edited. Yeah. See if there are any gotchas here. And that threshold, period, success. That status, that's not gonna make a difference. So the path is good, which we know, because that was one of the things we had broken. We shouldn't need to edit the image from the replica set, right? That should be picked up from the deployment? Correct. So if you see an error message about quotas, what parameters does Kubernetes expose for quotas? Do you know? I don't remember off the top of my

1:08:40 Locating and Deleting Resource Quota

1:08:50 head. That's gonna be... there was a second one. Or is it the... Unless they tainted all the nodes. No. But then it would complain about a taint if they did it with a taint. There's a subcommand, kubectl api-resources, which will list all the custom resources, or all the resource definitions within a cluster. Not necessarily custom. I had to correct myself there. I don't know if you can do this in k9s. We may have to use vanilla kubectl. Uh-uh. Uh-uh. Yeah. You found a quota. There you go. But it's set to two, so

1:09:41 it should be okay still. But I guess we could increase these to three. Or you could delete the quota. I prefer the shotgun approach. Delete it. I'm not gonna say how, but that might have been how some stuff disappeared from that previous cluster that the previous team there was trying to debug. But do we need to trigger this again? No. It thinks it's on v two. Might have to delete the replica set, though, again. Oh, just to speed up the reconciliation. Oh, something is pending. Yep. Let's see. That seems to be a v two.
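The "exceeded quota" error above comes from an admission check: a new object is rejected when current usage plus the request would pass the quota's hard limit. A small Python sketch of that arithmetic, assuming a pod-count quota of two as read out on stream; the usage figure and the `exceeds_quota` helper are assumptions for illustration:

```python
def exceeds_quota(hard: dict, used: dict, requested: dict) -> list:
    """Return the resource names where used + requested would exceed hard."""
    return sorted(
        name
        for name, amount in requested.items()
        if name in hard and used.get(name, 0) + amount > hard[name]
    )


# Quota of two pods, as read out on stream; usage here is an assumed value.
hard = {"pods": 2}
used = {"pods": 2}
new_pod = {"pods": 1}

print(exceeds_quota(hard, used, new_pod))        # quota blocks the new pod
print(exceeds_quota({"pods": 3}, used, new_pod)) # raising the limit to three
print(exceeds_quota({}, used, new_pod))          # deleting the quota entirely
```

Both remedies discussed, raising the limit to three or deleting the quota, make the check pass; a rolling update needs that headroom because the new pod is created before an old one is removed.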

1:10:21 Pod Stuck in Pending (No Scheduling Event)

1:10:27 Yep. I'm half expecting an ingress mess or something. And why is it pending forever? If you look at the log... So do they have any tolerations on the thing? I don't know. Can you look at the GitHub and see if they messed up this deployment? Yep. There's no logs to it. There's nothing in here in the deployment necessarily. The replica set says that it successfully created. Still in pending mode. Okay. Normally, for something that's stuck in pending mode, we'd see an event for where it was scheduled and an event for the image being pulled. We're

1:11:24 Checking Node Status

1:11:33 not really seeing that. No. It's not pulling images. Wonder why we're not pulling... And it's not gonna say the node check if you look at that. Do a describe on the nodes? I just wanna see, like, if there's anything blocking. K. Up and running. Unscheduled. Unscheduled. Yeah. Oh. Very. Kubelet has sufficient memory available. Disk pressure. They filled up the disk, didn't they? No. This is no disk pressure. I think you're okay. Oh, okay. Oh, no. No. Status is false. Hold on. Hold on. I'm reading this wrong. That node appears to be healthy. That's sufficient memory available.

1:12:45 So what was it going to mean? I think this is... and I think it's too late to come up. Yeah. It doesn't seem like there's anything here that's being overused. If you go back to your pods, let me find names first. Right. So the one thing right now is it's pending with no node. So you can either bypass the scheduler or maybe look at why the scheduler has not assigned it to a node. You got two different options. That was my next thing, to look at the scheduler and see if something

1:13:09 Identifying Scheduler Issue

1:13:23 is messed up there because it's putting this on the control plane, right? Can you put this on the control plane too and see? So this is one of my favorite hacks with Kubernetes, but the scheduler is really, really simple. I mean, you can just modify the deployment and put the node into the spec and then bypass the scheduler altogether. A cool hack. But nobody should ever do it in production, ever. I don't think anybody should do any Kubernetes in production. That there's the answer to everything. I'm checking the scheduler. Right after that spec line, you can literally

1:13:35 Implementing NodeName Hack (Bypass Scheduler)

1:14:09 just put nodeName and then the name of a node. Oh, really? Okay. Yeah. I did not know that. Did I know? Go grab the name of a node. Oh, okay. It should be fairly easy to remember. Wait. Connection refused. I don't like that. Was it camel cased? It is. Yeah. If I've remembered it correctly. We'll find out. Oh, I did not wanna say it. Name is not correct. You'll need to look at another pod. You'll be able to see it there because the scheduler adds it. Look at the pod, you mean? Yeah. Oh, not pod. Look at the YAML.

1:14:14 Editing Deployment to Add NodeName

1:15:03 Yeah. Because it gets added to the pod, you can set it. Oh, no. You have to set it on the pod spec within the deployment YAML. I'm just being silly. We have to go through the deployment, go down to spec, template, spec, and then set nodeName. Oh, I see. What, can we move around the... Yeah. Wrong spec. It won't be the container. It'll be the... Oh, it would be at the... At the same level as containers. Yeah. Was such a good idea. There we go. There we go. Well, something got scheduled. We did.
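The placement confusion above ("wrong spec") is about where `nodeName` lives: on the pod spec at `spec.template.spec`, as a sibling of `containers`, not on the deployment's top-level spec or inside a container. A short Python sketch under the assumption that the deployment is a plain dict; the node name, image, and `pin_to_node` helper are hypothetical:

```python
from copy import deepcopy


def pin_to_node(deployment: dict, node_name: str) -> dict:
    """Return a copy of `deployment` whose pods bypass the scheduler.

    Setting nodeName on the pod template means the pod is bound to that
    node on creation, so no scheduler has to place it.
    """
    updated = deepcopy(deployment)
    pod_spec = updated["spec"]["template"]["spec"]  # same level as containers
    pod_spec["nodeName"] = node_name
    return updated


# Hypothetical deployment and node name, for illustration only.
deployment = {
    "metadata": {"name": "clustered"},
    "spec": {
        "replicas": 1,
        "template": {
            "spec": {
                "containers": [
                    {"name": "clustered", "image": "example.invalid/clustered:v2"}
                ]
            }
        },
    },
}

pinned = pin_to_node(deployment, "control-plane-1")
print(sorted(pinned["spec"]["template"]["spec"].keys()))
```

As the host says, this is a workaround rather than a fix: pods pinned this way skip the scheduler's resource and taint checks, and the broken scheduler is still broken.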

1:15:55 So it's v two. Do we wanna see if the URL is exploding or not? I really hope that horrible hack has not just ended this episode. Got it. It has side effects. I blame you. V two is up. It could just be my browser cache. So I've had to hit refresh 40,000 times before. I think... Is that it? Oh, damn it. We fixed it. It did stop my end. Yep. There we go. Alright. I didn't know that was gonna be the end of the carnage. No. I feel bad. Yeah. Oh, well, it's fixed. I wonder

1:16:33 Discussing the Scheduler Hack

1:16:41 if we're missing stuff, but those are some pretty low downfalls. I mean, we still haven't technically fixed the node, right? We kinda hacked our way. Just, yeah, got something running. So do you want to attempt to fix the scheduler? Or do you wanna just call it? That's it. You've done it. It's good. I don't know. I'm okay with saying it's good. Good enough. Right? It's deployed to production. The site's live. Let's not touch it. Yeah. Alright. Perfect. Well, good job. We flew through that one. Some

1:17:11 Raft Team Debrief

1:17:15 fun things to fix there. The scheduler is always a fun one. I mean, the scheduler does a lot, but it's really easy just to bypass it and move on to something else. So, yeah. Alright. Well, that was fun. Two good clusters. What did you prefer? Watching someone fix yours or fixing someone else's? I preferred the fixing part because, like, you're sitting at the screen just throwing popcorn at it. I'm like, come on. I mean, when we were watching, the whole time, I just wanted to type in the chat. Like, no. No. Look at that namespace.

1:17:50 Look at the namespace. There's a typo there. But, I mean, they did a great job too. I mean, we also had a lot of small cuts on the worker nodes as well, but then the workaround for them was also just to spin up the application on the control plane. So that worked out as well because there was no way the worker nodes were coming up. Ah, right. Okay. Yeah. I don't think I noticed that. Yeah. We had a lot of small little cuts. We'll put it in the PR. But that was sort of our MO.

1:18:22 It's just to fat-finger a bunch of stuff that nobody should ever touch themselves. Right? Like the certs. Getting up gates. Yeah. Especially the certs on the worker nodes. Yeah. Alright. Sweet. Well, thank you for taking time out of your day and week. You know, breaking those clusters is not easy. It takes time. And thank you for joining me live and, you know, typing in front of people, which is also impending doom every time we do it. So thanks for sharing your knowledge, and good job. I'll let you both get back to your day and I'll

1:18:54 say goodbye to everyone else. Thanks. Alright. Two great teams, two great clusters. That was a whole lot of fun. Apologies for the weirdness with the YouTube video at the start. I do not know what happened there. I'm gonna have to dig into that. Thank you to our sponsors again, Teleport and Equinix Metal. We use Teleport to debug these clusters, to fix them, to share terminals, to get access. It's a really neat open source product that everybody, literally everybody, should check out and have running in their production infrastructure. If you want to know more, go to rawkode.live/teleport.

1:18:58 Outro and Thanks

1:19:26 I would really appreciate that. And Equinix Metal provided the hardware. If you want to check out a bare metal cloud for yourself, it is a whole lot of fun. Use the code Rawkode for $200 in credit. Alright. We will be back soon with more Klustered and even more at the Rawkode Academy. Things will be getting back to normal now, almost. I have finished my paternity leave. Lots more coming soon. Thank you for enjoying and watching with us. I'll see you all soon. Thank you. Thank you for watching Rawkode Live.
