Overview

About this video

What You'll Learn

  1. When Kubernetes solves growing microservice complexity better than custom deployment scripts
  2. How DevSpace lets developers test selected service stacks inside remote namespaces
  3. Where Argo CD, Flux, Helm, Kustomize, KEDA, and Crossplane fit

Rachel Sweeney shares Built Technologies' migration to Kubernetes: when (and when not) to adopt it, taming developer experience with DevSpace, and the ecosystem of Argo CD, Flux, Helm, Kustomize, KEDA, and Crossplane that powers their platform.

Chapters

  1. 0:00 Introduction & Guest Background
  2. 1:10 General Challenges of Kubernetes Adoption
  3. 5:06 Challenges for People Adopting Kubernetes
  4. 8:19 When and How to Adopt Kubernetes
  5. 9:33 Why Kubernetes May Not Be Needed (The Fighter Jet Analogy)
  6. 11:57 The Migration Story at Built Technologies
  7. 12:13 Reasons for Built's Migration to Kubernetes (Microservices, Custom Tooling)
  8. 15:30 How Kubernetes Addresses Built's Challenges
  9. 16:04 Improving Developer Environment with DevSpace
  10. 18:00 Leveraging the Kubernetes Ecosystem
  11. 18:55 Essential Kubernetes Tools (Argo CD, Flux, Helm, Kustomize, KEDA)
  12. 21:29 Current Stage of Built's Migration
  13. 23:30 Developer Challenges During Migration
  14. 26:45 Team Structure and Reliability Ownership
  15. 28:17 Tips for Smaller Teams Migrating
  16. 31:49 Lessons Learned and Hurdles Faced
  17. 34:12 Disaster Recovery Plan Status
  18. 34:50 Handling Developer Environment Complexity Explained
  19. 37:58 Is Kubernetes a Net Positive for Built?
  20. 39:39 Conclusion & Guest Plugs
Transcript

Generated from the English captions. Timestamps jump the player to that moment.

0:00 Introduction & Guest Background

0:00 Okay. So welcome to the show, Rachel. Thank you for dedicating some of your time to sit down and have a conversation with me. For the people listening at home, could you tell us a little bit about you? Sure. So, yeah, my name is Rachel Sweeney. I currently work at Built Technologies. I've been in the Kubernetes space for a number of years now. And, yeah, I live in the Maryland, Washington DC area. Happy to, you know, dive into how I got where I am today, but live here with my wife. Live on a boat about four months a

0:32 year. I like having an eclectic lifestyle. So, yeah, I have a lot of varied interests. I mean, I could sit and talk about living on a boat for four months of the year for the next couple of hours, but maybe that's a session for another day. You said you've been in the cloud native Kubernetes space for a number of years now. What led you to where we are today with this terrible mess of Kubernetes? Like, what's your experience? Why were you unfortunate enough to end up in the Kubernetes space? I always looked down on the Kubernetes

1:00 space, and I've I've had so much fun in it. You know? It's it's tough. Right? So I wanna know what made you the engineer you are today and how you ended up here. Yeah. It's a great question. So I did not have, I guess, like, a traditional IT background. I got a degree in Chinese history. I worked as an executive assistant for a number of years, and I spent a lot of time, like, liaisoning with our board of directors and kind of learning how, you know, companies work, how you get value, and stuff like that. And I'd always been big into

1:10 General Challenges of Kubernetes Adoption

1:30 computers since I was probably three or four. You know, I started programming when I was, I don't know, 12 or 13 and taking apart and rebuilding computers and doing all that, like trying to install Linux. And I knew that I wanted to get into the tech space professionally. And as soon as I discovered the whole DevOps philosophy and movement, it just immediately clicked. I was like, there's so much value in this, and it's just, you know, how do we make things more efficient and just, like, string together pipelines, you know, orchestrate containers, and all that good stuff.

2:02 And I very quickly went down that rabbit hole of just, like, how do you use things like Docker or Ansible and Chef and got into Kubernetes and CI/CD pipelines. And I was fortunate to get a job working at the Pew Research Center. And when I was there, I was essentially the only DevOps engineer type person. We had a data science engineer that did a lot of, you know, stuff you would expect with data science engineering. And there was a lot of overlap with his role into DevOps. And we knew that we wanted to migrate to Kubernetes because we

2:36 were having some, I guess, scalability issues with data collection. So I, you know, took on the work of figuring out, like, how do we deploy Kubernetes? How do we migrate workloads over to it and all that good stuff? And after doing that for a couple of years, I ended up going to Fairwinds, which is a Kubernetes centric company. You know, part of Fairwinds is focused on consulting and standing up clusters for clients who are new into that space, and that's initially where I started as an SRE. And from there, it was, like, you know, helping stand up clusters, making sure they're working

3:13 well and how they should be, handling a lot of, like, ad hoc requests. Like, how do I make sure I can, like, track a client source IP all the way through to my application so it doesn't get lost in network address translation, stuff like that. And, eventually, ended up on the software-as-a-service side of things as a tech lead where I was, you know, working to help out with our SaaS product to get best practices in place for a lot of our clients. So, you know, we had a platform that would surface up a lot of different information

3:46 in your clusters, whether it's, like, security or best practices or costs. And so a lot of that was, like, helping clients install Falco in their clusters so that they could get all that information surfaced in one area or, like, how do you right size workloads and all these different things. And then finally, you know, I'm now at Built Technologies as the tech lead to migrate our stack over to Kubernetes for a lot of various different reasons. And so, yeah, that's kind of been the journey that's gotten me there, and it's really just following a lot of interests as

4:19 I dive deep into the Kubernetes rabbit hole and, you know, realize all the friction along the way or the things that are the gotchas that you need to know about. And, yeah, it's been quite the journey. Awesome. Thank you for sharing. It's funny because as you were talking about all that, there was a meme that popped up on Twitter just this week that kinda feels really familiar to the story you're sharing. And it's the meme of, let's adopt Kubernetes and there's a little logo. And then the next bit of it is, like, and here's all the stuff you need after

4:46 it. And there's, like, the Helm logo, the Argo logo, the Flux logo, the Falco logo, the PEXI logo. Like, just these sorts of tools that we have in the cloud native landscape because, unfortunately, just doing kubeadm init is, like, step one of many, many steps that you need to successfully run and operate a Kubernetes cluster. Quite a pickle, I think, we're in these days. So if you could tell me about the challenges you've seen with people adopting Kubernetes. Yeah. So some of the challenges that come up with adopting Kubernetes are, you know, there are a lot

5:06 Challenges for People Adopting Kubernetes

5:16 of unknown unknowns, I guess, if you're not in the Kubernetes space. And, you know, some good examples of that are things like, you know, I could list off a lot of different phrases that you might not be aware of if you haven't been in Kubernetes, but, like, how are your pod topology constraints configured? And if you've never heard that before, you're just like, what on earth is that? And, you know, it's something where you can configure behavior to make sure that your workloads have as much uptime as possible, to be more resilient in case, you know, certain sections of infrastructure

5:46 go down. There's also things like, you know, what are your image pull policies, your network policies. You know, going from Docker to Kubernetes, it's actually less restrictive in terms of syscalls, whereas I think Docker blocks something like a 28 syscalls and Kubernetes is probably in the sixties or so. And so if you're making that migration, are you aware of that? Do you know that maybe you should be locking down some more of these? And if so, why? And just, you know, needing to know all of those as you configure the behavior for the platform.
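
Editor's sketch, not part of the conversation: the "pod topology constraints" mentioned above appear to be what Kubernetes calls topology spread constraints. The Python below is a simplified model of the documented maxSkew rule, where a pod may land in a zone only if that zone's matching-pod count, after placement, stays within max_skew of the current global minimum. The function names and zone names are hypothetical, and the real scheduler considers more than this (node affinity, whenUnsatisfiable, and so on).

```python
from collections import Counter


def skew_if_placed(pod_zones, candidate_zone, all_zones):
    """Skew for one candidate zone: its pod count after placement
    minus the smallest current pod count across all zones."""
    counts = Counter({zone: 0 for zone in all_zones})
    counts.update(pod_zones)  # count existing matching pods per zone
    global_minimum = min(counts.values())
    return counts[candidate_zone] + 1 - global_minimum


def allowed_zones(pod_zones, all_zones, max_skew=1):
    """Zones where one more pod can go without exceeding max_skew."""
    return [
        zone
        for zone in all_zones
        if skew_if_placed(pod_zones, zone, all_zones) <= max_skew
    ]


if __name__ == "__main__":
    # Two pods already in zone-a, one in zone-b, none in zone-c.
    existing = ["zone-a", "zone-a", "zone-b"]
    print(allowed_zones(existing, ["zone-a", "zone-b", "zone-c"]))
```

With max_skew=1 the only acceptable zone in that example is the empty one, which is the behavior that keeps replicas spread out when a zone fails.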

6:19 And there's, you know, plenty of other things like pod disruption budgets. And so I think all of those things are the, like, steep learning curve of Kubernetes where it is a platform that will do exactly as you tell it to. It will behave as you configure it to, but you have to know how to configure it. And so that knowledge gap is often something that a lot of companies struggle with. I think there used to be something more of, like, you know, should I do self-hosted or managed? But I think for most people in the cloud,

6:50 managed is, you know, the way that a lot of people go since it has matured a lot, though there are reasons for self-hosted. And then, yeah, just going back to that skill set piece where your developers now are deploying to this platform, and do they know how to configure things if that's something that they're working in? And, like, how much awareness do they have for Kubernetes? You know, I know at Built, we have a lot of education material available for developers. So as we make this migration to Kubernetes, they're going through initially, you know, courses on

7:23 Pluralsight, now in Udemy, and, a lot of great YouTube videos that you put out and just kind of, like, learning all these pieces and what they need to be aware of and going from there. So I think those are a lot of concerns that you should be aware of going into it and prepared to answer. Yeah. It sounds like there are two key things there I picked up on. It's it's one of them. It's like almost everything in Kubernetes is some sort of toggle or a bit of configuration. You know, like you said, down to the

7:48 topology spread constraints. You've got disruption budgets. You've got limit ranges. You've got quotas. You've got, do you wanna do seccomp filtering? Endless, endless, endless YAML, endless, endless, endless options. And unless you have seen some stuff before, I don't know. Like, do you even know what half of these options are? Like you said, the unknown unknowns. Like, if I had never had to configure these things before, then they're just out of sight, out of mind as well even though they exist. So that again, that's maybe part education problem, but then part you don't know these things until you

8:17 need to know them as well. Right? Like, top of the head question here, you can just say yes or no. But can people be successful with Kubernetes by just having a 12-line deployment.yaml with their image in it? Like, how far is that gonna get them and how deep do you need to go then? I guess it's a scaling problem. I don't know. Yeah. I think, as with just about everything in tech, it depends. You know? It depends on, like, the complexity of what you have and, like, maybe you can start off slowly just like if you

8:19 When and How to Adopt Kubernetes

8:46 know you want to move to Kubernetes, you do just throw one thing up there in the cluster and see how that works. It seems like there are a number of strategies for, like, organizations shifting to Kubernetes, whether you, you know, outsource it to have somebody stand up a cluster for you and make sure you have that, like, rock solid foundation, or maybe you hire a consulting company to come in and kinda close your knowledge gap so that you can deploy things quickly to Kubernetes. Or maybe you do just hire engineers that

9:17 already have those skill sets. And so I think, you know, depending on the scale of what you're doing and what you're trying to accomplish with Kubernetes, you know, it really just depends. And, honestly, if you don't have a ton of scale or you don't have a lot of complexity in your applications, then, you know, maybe consider why are you thinking about Kubernetes in the first place. Because, like you said, there are so many toggles, and it's kind of like, if you have a startup and you have a very simple use case, you're not just gonna, like, hop

9:33 Why Kubernetes May Not Be Needed (The Fighter Jet Analogy)

9:45 in a fighter jet and, you know, toggle every single switch when, like, a simple plane would work to get you to your destination. Hopefully, that analogy made sense. But, yeah, like, maybe you get to the point where you need all the toggles, but initially, probably not. Yeah. I mean, I think you're 100% right, and the analogy is great. You know, I'd take it a step further. Like, you know, if people need to travel 60 miles down the road, go on a train, not even a plane before you jump on a fighter jet. Like, the Cloud Run and Lambda can

10:13 get you a substantial amount of the way depending on the complexity of your application. But, yeah, you're right. Like, it's definitely a fun problem. And I'm gonna kinda just talk about Klustered for ten seconds here because what I thought was interesting is you said, I said you could have 12 lines of YAML. You said that could get you so far. And I think we're both on the same page there. And then you mentioned that you could bring in external help, right, like a company to help you with that migration to Kubernetes. And I was thinking in my head, if

10:38 you give a company that hasn't seen Kubernetes before a 300-line deployment YAML. And let's assume as an example here. Right? There's a preStop hook that does a sleep for thirty seconds. Now if you know what that's there for, great. Right? But if you don't, it's probably the first thing you're going to delete and say, well, what's the purpose? Like, why am I doing this sleep for thirty seconds? That's ridiculous. Let's move it out. But if people have seen Klustered before, they know that pods actually don't really handle failure that well and the preStop can

11:07 actually give you enough time for NGINX to handle the connections that are currently in flight before they get moved through. All these edge cases. Right? Which is why I say Klustered is a great show. People should definitely go check it out. But I don't know these things from operating Kubernetes clusters in production. I know these things from Klustered as well. And I think that's a really cool way for people to learn. Yeah. I'm gonna plug myself. I hope that's allowed within the first five minutes of the show. Yeah. I mean, having been on Klustered, thoroughly enjoyed it, like, wonderful learning opportunity

11:33 and certainly watched a ton just to, you know, level up my own Kubernetes knowledge because so many guests just have such creative ways of bringing the clusters to their knees and, like, could happen to you accidentally. You know, you hopefully won't have somebody in your cluster just, like, you know, toggling everything, you know, in a negative way, but good to know. Alright. So you've mentioned migrations a few times, and that's a part of your new role where you're working at Built. So what I'd love to dive into is just if you're happy to share information on what

11:57 The Migration Story at Built Technologies

12:05 the current landscape is and what your path migrating to Kubernetes looks like, then, hopefully, we can dive into that in a bit more detail too. Sure. Absolutely. So, yeah, it's funny, when I first interviewed at Built, one of the questions I actually asked one of our directors was like, you know, why Kubernetes? Like, it is so hard. Why do you actually wanna migrate to Kubernetes? And, you know, a lot of people will migrate for scalability issues and stuff like that. But one of the big reasons for us moving to Kubernetes is that we started off

12:13 Reasons for Built's Migration to Kubernetes (Microservices, Custom Tooling)

12:36 with a monolithic application as many companies do, and it just, you know, became fairly complex. And we started to break it down into microservices. And now all of those microservices, they have their own dependencies in terms of other microservices. Each microservice has its own dependencies in terms of, like, maybe they need a queue in the cloud, maybe they need, like, a Kinesis stream or a Kafka cluster or something. And so with all of these microservices being broken down and having all their dependencies, there's a certain amount of complexity that has grown in, I guess, the way

13:13 of simplifying. You know, simplifying things makes them more complex, but more bite sized in a way. And so we ended up with custom internal Python tooling to manage these complexities. And so we you know, for example, we have one custom internal Python application that takes care of deploying a stack of infrastructure to a different environment, whether it's like a staging environment, production environment, or demo, and it takes care of, like, all the secrets and config changes across environments, making sure all these pieces are where they need to be, you know, running Terraform. It does some Docker building and stuff. And that

13:51 has, you know, in that situation, it solved some of that complexity, but now it's kind of all been pushed into a Python app that if you are not working at Built, you don't know how it works, and you kinda have to come in and, you know, put things into there. And when things grow or slowly shape in a different direction, you now have to write more code in order to be able to support that. And then, I think the thing that kind of triggered the Kubernetes migration is the complexity of those applications got to a point where

14:22 developers were no longer able to do development on their local machine. And so now if you need, you know, n number of microservices, let's just say 10, it's like a small stack you're working with, they all have all these AWS infrastructure pieces. You know, where do you do that development? And in our case, we have another internal Python tool that stands up, essentially, a virtual machine and, you know, provisions everything you need. All of the Terraform is there, and all of your containers are being run there. And that Python app has just been,

14:56 I guess, time consuming in terms of the number of tickets that it's created for helping developers when things start to break or, you know, some developers don't use it anymore because it's complex and it has its issues. And so that's something where, you know, this Kubernetes team is helping developers move faster since we want them to release features to our customers as fast as they possibly can. And right now, this developer environment is slowing them down, and it's something that we can solve. And so with migrating to Kubernetes, we are able to package up a lot

15:30 How Kubernetes Addresses Built's Challenges

15:32 of that complexity in things like Helm charts. We're also bundling it with Crossplane. So if a microservice needs, you know, five different pieces of infrastructure in AWS, it's all packaged in that Helm chart. And so now you can kind of just build the collection of Helm charts that you need where you can say, you know, I need j, k, l, z, y, and b. Those are the microservices I need. And if you, you know, deploy them all at once, you're gonna get all of the infrastructure that you need. We've got it set up in a namespace for developers, and

16:03 then they're using, you know, DevSpace to be able to do development as if they were doing it locally. And so for us, that has been, for the developers that are on there right now, like, it is a huge time saving effort even for the ones that were able to do development locally, which is quite surprising. But that's been, yeah, that's been the big push to migrate to Kubernetes at Built, and it's been going great so far. So Nice. What I think is interesting there is, like, from the people that I've spoken to in the past, like, when they're moving to

16:04 Improving Developer Environment with DevSpace

16:35 Kubernetes, it's generally because of, like, microservice adoption and they've got all of these containers and they need some sort of orchestration. But it sounds like the journey that Built are on is kind of actually a bit more twofold. You know, it's kind of part taking legacy code, breaking it down, making it more maintainable and moving forward. But also the self-service platforming side of it, like, giving developers that ability to have a Crossplane resource, get a Kinesis thing and GitOps that along with the code that they're shipping. I mean, that's just a really strong enabler for teams. Right?
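
Editor's sketch, not part of the conversation: the idea of picking a handful of microservices and getting everything they depend on can be pictured as a dependency resolution step. The Python below is a minimal illustration under stated assumptions: the service names and the depends_on map are hypothetical, and this is not Built's tooling, nor a Helm, Crossplane, or DevSpace API. It only shows how a selected stack could be expanded into an install order with dependencies first.

```python
def resolve_stack(selected, depends_on):
    """Expand the selected services into an install order in which
    every service appears after the services it depends on."""
    ordered = []
    done = set()

    def visit(service, path=()):
        if service in path:
            cycle = " -> ".join(path + (service,))
            raise ValueError(f"dependency cycle: {cycle}")
        if service in done:
            return
        for dependency in depends_on.get(service, []):
            visit(dependency, path + (service,))
        done.add(service)
        ordered.append(service)

    for service in selected:
        visit(service)
    return ordered


if __name__ == "__main__":
    # Hypothetical dependency map; each name stands for one chart.
    depends_on = {
        "payments": ["ledger", "notifications"],
        "ledger": ["audit"],
        "notifications": [],
        "audit": [],
    }
    print(resolve_stack(["payments"], depends_on))
```

In this picture each resolved name would correspond to one chart that also declares the cloud resources the service needs, so a developer names the services and the rest follows.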

17:05 And like you said, it allows them to deliver features to customers quicker. And I think that's one of those understated things with Kubernetes. It's normally all, oh, we have containers. We have operational complexity. Let's do that. But, actually, if you do it right with the right people and you build a platform, you give a really strong foundation for companies to iterate and evolve faster than their competition. Yeah. That's a great point. And thinking too about how, you know, we're creating new microservices every single month and being able to support that. You know, if we

17:36 automate a lot of that to where a Helm chart just gets created and, you know, the Crossplane APIs are already there, we don't need to be in the loop for these developers. And there are going to be cases where they always are going to need us in the loop. But if we can get, like, 80% of the developers just self-service, like, that is such a powerful thing for getting features out to customers as fast as possible. One other thing I might add really quickly is, you know, one of the situations we had with our custom internal Python tooling is

18:00 Leveraging the Kubernetes Ecosystem

18:09 a lot of the orchestration of containers is something that we were building into that Python app, and things like how do you handle end-to-end encryption with things like mTLS. You know, we've rolled our own version of that, but now we have to maintain it and support it. And it's been really fun seeing the Kubernetes ecosystem grow up and mature where there are so many off-the-shelf solutions that people used to have to code their own solution for. And so it does seem like, you know, if your company is spending more and

18:42 more time coding custom solutions for things that could just be had off the shelf with Kubernetes, like, maybe that's the time to take a look and see, should we be migrating to this since we have these complexities that are growing? Awesome. Just because you talked about how the Kubernetes communities, they're evolving, they're producing software. Like, the mTLS one is a great example. Right? You know, Linkerd came out and just said, hey. Run us in your cluster, and you get automatic mTLS for free, which is wonderful. So, you know, kind of diving in on

18:55 Essential Kubernetes Tools (Argo CD, Flux, Helm, Kustomize, KEDA)

19:10 that a little bit, like, the CNCF landscape has dozens, hundreds of projects depending on how deep you look. Are there three to five off the top of your head that are just must-have installs for all Kubernetes clusters? Yeah. There are so many things in the ecosystem, you know, from, like, edge case things to common core things. And I think some of the big ones that most people are gonna start to encounter is, you know, how do I deploy these things to Kubernetes clusters? What does my CI/CD pipeline look like? You know, maybe if you had a CD tool

19:44 before, it may work well with Kubernetes. It may not, but tools like Argo CD and Flux are two great ones to take a look at. You know, they're designed specifically for deploying to Kubernetes clusters. And so those are really big ones that are gonna help with just all kinds of things even if you want things like, you know, blue green deployments or canary deployments or just how you configure that behavior. You know, they're gonna help you support stuff like that. And so those are things I think that most people end up looking at. Another is how do you actually deploy all

20:18 of this YAML that you have? You know, you're quickly just gonna become a YAML engineer as there's all these toggles to set. And do you want to package that up in something like Helm so that you can, you know, version it, release it, and just have all of these, you know, defaults set in place? Or do you wanna use something like Kustomize so you can just patch and override certain values as needed? Those are two things that I think most people are gonna encounter as well. There's so many other things out there. Like, another one which I think is really cool

20:49 is just tools like, you know, it used to be things like horizontal pod autoscalers and stuff, and you still do have that. But tools like KEDA are really cool where you can, you know, scale out depending on, like, the depth of your queue or something, or there's just so many custom triggers for when you might want your application to scale out. And I think a lot of people will probably end up encountering and using something like that. And if you're not in Kubernetes, you know, how do you solve that issue? Going back to that ecosystem thing, it's like,

21:20 do you end up coding your own version of KEDA that you then have to maintain? And so that's a really fun one that I enjoy as well. Awesome. Alright. Let's dive a little bit deeper into the migration story then. So how far into the Kubernetes migration are you with Built at the moment? Yeah. So we are, you know, we've been working on migrating developers over to the Kubernetes cluster, and we've got, you know, over a hundred different microservices that we need to deploy. And so we've started kind of tackling them one at a time and just figuring out, like, what are

21:29 Current Stage of Built's Migration

21:54 the best practices that we want for these different applications and how do we deploy them? And we've got our clusters. You know, we've got these environments stood up. We have a bar for, you know, what do we need to do in order to bring developers onto it? And then separately, what do we need to do for it to be production ready? And so a lot of those production concerns, we're not focused on them just yet since our biggest bottleneck is how do we get developers as productive as fast as possible? And so we've currently got one full team,

22:25 I wanna say about 30 developers on it, and, you know, it's already saving them. I think, like I think they said there was something in particular that took seven minutes, and now they're able to cut it down to one minute. And it's just, like, something that happens every single day across these 30 developers. And it's just become part of their daily workflow where they could do development locally, but they're all working in Kubernetes because it is a lot faster. And we are also onboarding another team in the coming weeks. We're gonna work with one of the senior engineers there,

22:53 make sure his stack looks the way it should and acts the way it should, and then slowly roll it out to the rest of the developers. So, hopefully, over the next few quarters, we'll just be migrating all of the developers over. And then as we're doing that, building out automations so that we don't have to manually do this every single time of converting their application into a Helm chart and all that and, you know, kind of scripting out what does that translation look like from their application being deployed in ECS before to now being deployed in Kubernetes.
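
Editor's sketch, not part of the conversation: one way to picture "scripting out" the ECS-to-Kubernetes translation is a function that maps a single ECS container definition onto chart values. The input keys below (name, image, cpu, memory, portMappings, environment) are standard ECS task definition fields; the output layout is a hypothetical values structure, not Built's chart schema. ECS expresses CPU in units of 1024 per vCPU, so the sketch converts to Kubernetes millicores, and it treats the ECS hard memory limit (in MiB) as a Kubernetes memory limit.

```python
def ecs_container_to_values(container):
    """Translate one ECS container definition into a values-style dict."""
    values = {
        "name": container["name"],
        "image": container["image"],
        "ports": [m["containerPort"] for m in container.get("portMappings", [])],
        "env": {e["name"]: e["value"] for e in container.get("environment", [])},
        "resources": {"requests": {}, "limits": {}},
    }
    if "cpu" in container:
        # 1024 ECS CPU units == 1 vCPU == 1000 millicores.
        millicores = container["cpu"] * 1000 // 1024
        values["resources"]["requests"]["cpu"] = f"{millicores}m"
    if "memory" in container:
        # ECS "memory" is a hard limit expressed in MiB.
        values["resources"]["limits"]["memory"] = f"{container['memory']}Mi"
    return values


if __name__ == "__main__":
    # Hypothetical container definition for illustration only.
    example = {
        "name": "ledger",
        "image": "registry.example.com/ledger:1.4.2",
        "cpu": 256,
        "memory": 512,
        "portMappings": [{"containerPort": 8080}],
        "environment": [{"name": "LOG_LEVEL", "value": "info"}],
    }
    print(ecs_container_to_values(example))
```

A real translation would also need to cover secrets, health checks, IAM roles, and service discovery, which is presumably where most of the per-service effort goes.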

23:24 And so, yeah, currently in that process of migrating developers over. Okay. So the old system was container based on ECS. You've now built this platform. You're encouraging people to come over and deploy to Kubernetes. You know, I don't wanna say, oh, it's easy. They're already on containers. Right? Because, obviously, the devil's in the details with all of these things. So what have been some of the main challenges that the developers have experienced as you bring them across to the new shiny? Yeah. I think, I guess there's a couple. So one is, you know,

23:30 Developer Challenges During Migration

23:58 what is the incentive for them to go to Kubernetes or, like, why should they? And, you know, in their old environment, it was problematic, some of them gave up on using it. And then we had, like, a share or we have a shared ops environment, and so developers kind of take turns deploying their code to that shared environment to test things out and then roll it back. And so from that standpoint, you know, they won't have to fight over resources anymore in Kubernetes, and so there's a huge draw there for them to get into into Kubernetes,

24:28 It's like a big incentive that they're ready for, they're excited for. And then for the, you know, 30 or so people that are in there right now, I think, initially, a lot of it is just knowing, like, you know, how do I, like, right now, we just have, you know, ephemeral, or not ephemeral, but we have, like, MySQL running in the cluster. It's got, like, an EBS volume backing it, so there is persistence there. But it's still somewhat ephemeral. Like, it's not like an RDS instance that's gonna be there for a long time. And so some of the things are like,

24:59 how do I access that? You know, if I wanna do a migration, what does that look like? And how do I get these endpoints, etcetera? Just like, how do I think about how to do what I was doing before? And then, you know, if I wanna deploy maybe a different microservice or it's not behaving the way I expected it to, how do I reset that and stuff? And so I know there was a sprint or two where I just spent, you know, hours and hours working on documentation to just try and answer all of those questions so that they had

25:26 exactly what they need to, you know, self-service themselves as much as possible. And, you know, I think a few weeks after onboarding those 30 developers, it kinda got quiet, and it's like, does that mean they're not using it? Like, what does that mean? And it's kind of a good sign because they were using it, and it was just kind of working for them. And we'd have questions pop up every now and then, but for the most part, yeah, they're fairly resourceful and have just been able to solve a lot of those issues once we did

25:53 that prep work of making sure that they had everything that we thought they would need. And then if they didn't, you know, we added that to that base of documentation and tooling so that they were able to support themselves. Wow. So you actually provided documentation before migrating people over? A bit of both. Yeah. We, like, we had some very rough skeleton documentation for, like, the brave people that started it first, and then they had, you know, 50 questions or so. And, you know, we took that and kind of just slowly

26:24 added that to the documentation so we didn't have 30 people asking 50 questions and, you know, so on as we onboard more. Yeah. I mean, it's a great idea, one I should probably try myself sometime, but I'm always too keen to do the engineering stuff and I'm always a bit lax on the documentation. A bad trait of my own that I should definitely resolve at some point. I'm curious about them. You know, you've got these target developers on the platform. You used to do SRE. I mean, are you the sole reliability

26:45 Team Structure and Reliability Ownership

26:53 team for this cluster? Is that something you're training up the dev team to do? Is there another team that takes ownership of that? Like, what does that look like within Built? Yeah. So we have, I wanna say, 12 engineers, you know, that typically do DevOps-type stuff or, you know, follow that philosophy. And so the team structure recently is probably about five are, you know, SREs focused on all of that. And then I wanna say five, six others are focused primarily around Kubernetes and getting developers onto Kubernetes while also supporting some of the traditional ways of doing things.

27:32 Most of the engineers that are on that team that's focused on Kubernetes and getting people over have come from companies where they've worked with Kubernetes before. So there might be a handful of people that haven't worked with it before, but most of the people on the team do have experience at other companies, big and small. And so that has been just amazing. Like, they're wonderful people to work with, and they have such amazing ideas that, you know, oftentimes, I would never even have imagined. And so it's been really rewarding working with them.

28:06 And I think, you know, roughly 12 of us are supporting those, maybe a 50 developers while we work up to this to, you know, help them be more productive. Wow. So do you have any tips? You know, I'm assuming, you know, Kubernetes migrations are something that a lot of organizations, a lot of teams are doing. Not all of them are fortunate enough to have, you know, a team of 12 or 15 people to kind of help that be reliable and onboard them. But I'm curious, like, just because it's all at the top of

28:17 Tips for Smaller Teams Migrating

28:34 my head. You know, I'm working with a company right now. There's just five developers. They're migrating to Kubernetes. The whys of doing that, that's another story. We could touch on that later. But, you know, if they wanted some great advice that says, how do we make sure that we don't destroy our own cluster, destroy our data, deploying our own workloads onto it? Like, what are the safeguards that you can encourage them to put in place as someone who's been quite successful, it sounds like, with a migration? Yeah. I think probably just looking at, you know, what are

29:02 the best practices out there and, you know, there's so many good articles and YouTube videos and thoughts about, like, how to get started in this space and just thinking about, like, what are those core things that I need to be concerned with and going from there. If it is something that you are new to, then maybe it is worth kind of just, like, dipping your toe in in terms of getting stuff spun up, making sure you're comfortable with it, and seeing what that looks like. I've always, you know, been fortunate, I guess,

29:31 to, like, reach out to the community, and I think that's a huge thing that's very helpful as well: make sure you are engaged in the community. I think we are very lucky to have so many, you know, really smart people in the community at our fingertips, whether it's on Twitter, on Slack, on Discord. You know, I know when I found your Discord, you know, to shamelessly plug that, but there's just so many wonderful people in there that if you throw a question out there, they're happy to answer. And, you know, there's book clubs you can join and stuff like that.

30:00 And so I feel like just, you know, making sure that you are educating yourselves and focusing on, you know, a lot of those best practices is definitely a good way to start. And that's assuming that you are coming to this new and you don't have, you know, external people that are helping out with that. So I think that's probably the harder way to do it, just having internal people, like, skill up without any outside consultation. So, yeah. I think that's really important. The cloud native and Kubernetes communities, the code itself, the projects, right? They're

30:34 evolving constantly. Like, the Kubernetes and cloud native of today is not the Kubernetes and cloud native of even three, six, nine months ago. Like, everything is just, the wheels are in motion and maybe sometimes a little bit too fast. So I couldn't imagine trying to build a platform without always keeping up to date with what's happening, reading new books, watching new videos. I just don't even think it would be possible. Like, the Kubernetes APIs change fast. So, you know, the project releases. It used to be four, five times a year. It's now four

31:03 times. No. It used to be four times a year. It's now three times a year. Deprecations, as we're all aware, happen pretty fast in the Kubernetes landscape. So, yeah, I don't think you can stick your head in the sand and just say, we have a cluster. The job is done. It's just not gonna work that way. Yeah. It's funny. Every time I look at, like, what are the new things in the CNCF? I'm just, like, always blown away by how many new ones are in there. And you know, when we migrated to, or when

31:27 we started figuring out, like, how do we want developers to work in Kubernetes, there were just so many tools between, like, you know, Telepresence, you have Tilt, you have DevSpace, Skaffold, just so many tools on the scene just for this one little area of how do you solve this problem. Yeah, definitely a lot to keep up with. Yeah. Definitely. Alright. Let's flip that question around then. You know? Sorry. How long have you been at Built? Been about maybe eight, nine months, right around there. Yeah. I'm assuming that a few things have maybe gone wrong in that time. Is there anything that

31:49 Lessons Learned and Hurdles Faced

32:00 comes to top of mind that you could, like, share? Like, any failure stories that we'd be able to learn from? Or have you just been extremely lucky and everything's been, like, five stars all the way through? Yeah. I think what's been challenging is figuring out, you know, there are so many options in the cloud space and which one to go with. And so we've gone through a lot of POCs of trying all these different tools and seeing which ones work best. And sometimes it is hard because they both do such a good job at

32:30 what it is. I think, also, when we first got started with Crossplane, you know, there's a lot to that tool, and, like, it has so much value in what it can deliver once you get everything up and running. But getting it up and running is pretty challenging in and of itself, and that was something that we struggled with for a bit where we're like, you know, how do we do this? How do we solve this and all that? And in that situation, our manager, Jacob, he reached out to AWS, and it turns out that they had a

32:59 program, you know, if you're in AWS, to essentially work with a number of, you know, amazing people, like, I think Carlos Santana and Christina and a number of other wonderful people there. And they helped us, like, get up to speed with Crossplane and, like, what are the best practices there? How should we think about structuring this, and what are the patterns that this will look like? You know, we've come across this technical issue. How do we solve it, or what's the best way to solve it? Some of it is challenging because it is, you know,

33:27 a fairly new tool, and so some of the best practices are still being figured out. But that's one where we probably flailed around for a little while, and then, thankfully, we were able to get some help from AWS in that regard. And that's something that has been tremendously helpful. I'm trying to think because I know it hasn't been entirely smooth sailing. You know, there's certainly things where, like, I wish we'd done them sooner. Some of those being, you know, more automation around things, like getting Helm charts up and running. You know, I probably should have done

33:56 that a little bit sooner, but some of that also has probably just come from, as we migrate them over, we figure out what we can automate and just kind of iterate from there. And so, yeah, it's been a lot of hurdles for, you know, Built's complex infrastructure. Okay. Have you ever executed your disaster recovery plan? So for the ops side of things, we certainly have. When I say ops, I'm just referring to, like, our traditional, like, ECS, custom internal Python for all that. For the developer side of things, you know,

34:12 Disaster Recovery Plan Status

34:30 we've got those two bars of, like, what do we need to get developers on board, what do we need for prod? And the prod threshold is where things like the disaster recovery plan, you know, we certainly need to have that in place and practiced before then. And so at this moment, it's not as high of a priority as getting the developers on board. So short answer is no. So working with developers in the past. Right? They're doing this migration to Kubernetes. Sometimes it's a migration to containers. Their old development environments have changed. Right?

34:50 Handling Developer Environment Complexity Explained

34:59 They don't just do a composer install, a gem install, a gem bundle, or a pip install. They now have all of these containers. And if they're moving to Kubernetes, the chances are they've moved to some sort of service-oriented architecture, maybe even a microservice-oriented architecture. The development environment gets a lot more complicated to the point where working locally maybe isn't an option anymore. So I'm curious, what does the setup look like, and how are you handling that complexity? Yeah. So what we've got right now is we have developers that, in the old environment, it would spin up

35:34 the entire stack of everything that you could potentially develop on at Built, and most developers don't need everything. Sometimes they do, but not always. And so in this new situation, you only spin up in the cluster exactly what you need. So if you want to, you know, do development on microservice a, in order to do that, you have a dependency on microservices like b, c, d, e, f, and all of their AWS infrastructure. But all you care about is microservice a, and that's what you wanna develop on. And so with that, that's where we

36:08 can spin up just that stack of resources in their namespace. And then from there, now that they've got all their dependencies in place, they're going to want to do development on that microservice. And using DevSpace, as soon as they type devspace dev, what it does when it kicks off that script is, it starts off where, like, you know, you've already got your Helm chart installed for the microservice that you want to work on, and DevSpace is essentially just gonna, like, do an upgrade install over that, and it's gonna put some of its magic in there. And it's

36:42 going to sync everything that you have locally that you define that you wanted synced remotely, and it's gonna place that into your remote container. And so at this point, the developer actually has the experience that it is local because they can work in VS Code or IntelliJ or whatever they're using. And as soon as they, like, make changes and save them locally, it's gonna sync it remotely. And then that way, they're able to test their changes against all of those other microservice dependencies, and they can just iterate a lot faster where they've got all this remote stack. And,

37:17 yeah, it just comes off as if it is a, like, local environment. Awesome. That's DevSpace by Loft. Right? That's the same project? Yep. Which I think is in the CNCF now. I wanna say they just, yeah. Yeah. I think it was donated to the CNCF. That's awesome. Very cool. Yeah. It's one of those things that I personally struggle with. Development environments in Kubernetes are definitely challenging. And I think we're really lucky right now to have things like DevSpace and Telepresence and Skaffold. And I'm sure there's many more that I'm forgetting. So yeah. Definitely awesome. We've covered, like,

37:49 challenges. We've covered some benefits, things that you're getting, things that went wrong, things that are going really well. And I'm just curious, you know, you're nine months into this migration. Developers are being onboarded, so I'm assuming the answer is yes, but I'll let you go into more detail. Like, is this a net positive, adopting Kubernetes, going through the trials and tribulations of having services and containers and the cloud with all this ephemerality? At Built, certainly. And, yeah, I think, obviously, that's gonna vary from company to company, but the complexity is not going away, and it's only continuing

37:58 Is Kubernetes a Net Positive for Built?

38:24 to grow. And so, you know, wrangling that complexity is just getting more and more expensive in the way that we were doing things, certainly around developer environments. And so by, you know, doing this work now and allowing developers to have these environments, they're able to move a lot faster, which saves them time, gets more features out a lot faster, and, you know, ultimately just helps out with the business value of, like, why does this company exist? It's to meet the needs of our customers, and the faster we can create features for that, the better it is. And so that

38:57 is a problem that it is, you know, starting to solve today with a lot of these you know, with the teams that are on board. And so I think that's just gonna continue to pay dividends, especially as it matures, the platform gets better, and, you know, new microservices come on board. Maybe we change some of that architecture of how the apps are configured. And, you know, with all of that, just continuing to reduce the time that it takes developers to get features out to customers, this is certainly a net positive. So Wow. I mean, I kinda wanna end the

39:28 episode there because, like, that last thirty seconds is a sound bite the CNCF should be paying you for. I'm getting royalties? Who's to say? Alright. Well, thank you for sharing your story. It's been really insightful, just learning about this migration and the way that you're doing it. I love that you've mentioned tools all the way through. I can just imagine people sitting there writing these down and going to check them out when they get a chance. So thank you so much for that. I'll now give you the opportunity to plug yourself, your blog, your Twitter, your LinkedIn, your company, anything you want. Feel free

39:39 Conclusion & Guest Plugs

39:59 to share away. Sure. Yeah. I guess the first shout out I would give is, you know, just to Built. It's been, you know, an amazing place to work with some incredible talent here. Unfortunately, we're not hiring DevOps people at the moment, but if an availability comes up, I would definitely encourage people to reach out. And I found that I've been learning so much here recently, and it's just really fun. There's so many great people to work with and fun challenges, and so that's really allowed me to thrive, and that's been fun. Also, Fairwinds. You know, if you're entering the

40:30 Kubernetes space, Fairwinds has so many great open source tools. There's things like Polaris. There's things like Goldilocks, Pluto. The list goes on and on. You know? And whether you want to know best practices or you wanna right size containers, there's a lot of great stuff that Fairwinds produces. They also have a great SaaS product, you know, if that's something that, you know, as the complexity grows and you have many clusters, how do you get visibility into that? There's just a lot of wonderful things coming out of Fairwinds to help people manage all of that. And so, yeah, those are definitely things

41:02 I would encourage people to look into and yeah. Alright. Awesome. Well, thank you very much for your time. I hope we speak again soon. Have a great day.
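
Sketch: selecting only the stack a developer needs

The developer-environment discussion at 35:34 describes spinning up only microservice a plus the services it depends on, rather than the entire platform. A minimal Python sketch of that idea follows. It is not Built's tooling: the service names, the DEPENDENCIES map, and the stack_for helper are all hypothetical, chosen only to mirror the "a depends on b, c, d, e, f" example from the conversation.

```python
from collections import deque

# Hypothetical dependency map: service name -> services it calls directly.
# None of these names come from Built; they only mirror the a..f example.
DEPENDENCIES = {
    "a": ["b", "c"],
    "b": ["d"],
    "c": ["d", "e"],
    "d": [],
    "e": ["f"],
    "f": [],
    "g": ["a"],  # depends on "a", but "a" does not need it
    "h": [],     # unrelated service
}


def stack_for(service, dependencies):
    """Return the service plus everything it transitively depends on."""
    if service not in dependencies:
        raise KeyError(f"unknown service: {service}")
    needed = {service}
    queue = deque([service])
    # Breadth-first walk over the dependency graph.
    while queue:
        current = queue.popleft()
        for dep in dependencies[current]:
            if dep not in needed:
                needed.add(dep)
                queue.append(dep)
    return sorted(needed)


# Only a..f are selected; "g" and "h" stay out of the namespace.
print(stack_for("a", DEPENDENCIES))
```

In a real setup the resulting list would feed whatever installs each service into the developer's namespace. The transcript mentions Helm and DevSpace for that step but does not describe the mechanics, so it is left out of the sketch.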
