Hands-on Introduction to Velero | Rawkode Academy

Overview

About this video

What You'll Learn

Install Velero with the CLI against a MinIO-backed S3 storage location
Back up Kubernetes resources and persistent volumes, including selective namespace backups
Restore a deleted workload into a fresh namespace after simulating disaster

Velero maintainer Carlisia Thompson joins David to install Velero against a MinIO S3 backend, back up Kubernetes objects and persistent volumes, then simulate a disaster and restore the workload into a fresh namespace.

Transcript

Generated from the English captions.

0:49 Introductions

0:49 Hello, and welcome to today's episode of Rawkode Live. I'm your host, Rawkode. Today, we are gonna be taking a look at project Velero, a tool for Kubernetes to help make all of our backup and restore lives a lot easier. Now before we begin, there's just a little bit of housekeeping I'd like to take care of. First, please, if you're not subscribed already to the YouTube channel, subscribe now. Click the bell. This helps other people find the content, and it means you get really cool notifications when more episodes are going live. Also, if you are not watching live and

1:11 Conclusion and Farewell

1:19 you're watching this later, you may have questions. You may want to suggest new episodes or technologies you'd like to see covered, or maybe you just want to chat with other cloud native and Kubernetes people. We have an active Discord server with a few hundred people there. Feel free to come and join us and talk technology. And lastly, I would love to thank Equinix Metal, my employer. They allow me the time, energy, and resources to put this show together and provide cloud native learning materials so we can all learn together. So if you wanna check out Equinix Metal,

1:48 you can use the code Rawkode. This is different from previously, so don't fall for the trap. This now gets you 200 US dollars in credits that you can use on the platform, which you can use very, very quickly in under fifty hours, or you can get up to four hundred hours of compute with our smallest instances. So feel free to check that out and let me know how you get on. Alright. Now, today, we're taking a look at project Velero, and I am honored to be joined by Carlisia Thompson, an engineer at VMware and a maintainer of Velero.

2:16 Hi there. How are you? Hi, David. Thank you for having me on the show. Hi, everybody. It's our pleasure to host you today. We're very excited to take a look at Velero. Do you wanna just start? Give us a little bit of an introduction about yourself. Share whatever you wish, and we'll take it from there. Yeah. Sure thing. So I am an engineer. I work for VMware. I'm one of the Velero maintainers. There are a few of us from VMware. There is someone from SUSE, and we always welcome people who are interested in becoming maintainers,

2:54 just for the record. What else? I am a CNCF ambassador. I'm very interested in what's going on with the community. I try to help out when I can. I am a host of the Podlets podcast, which is a panel of hosts that talk about cloud native technologies. What else? I think that's plenty. Yeah. Like, what else? I mean, you listed so much there. You must like being busy. I like to say that I am Brazilian, in case people are wondering where my accent comes from.

3:33 I do live in the US, though. So in California. Well, thank you for sharing all of that with us. We're very excited for today. So why don't we start off then with a little bit about what project Velero is? Yeah. In fact, let's just start with that. What is project Velero? Yes. Project Velero is an open source tool. It's a CLI tool that you install on a Kubernetes cluster, and you can run backups on that cluster. And you can run complete backups of a whole cluster, or you can do selective backups. For example,

3:45 What is Velero?

4:14 if you just wanna back up a particular namespace. You can schedule backups. And then if, lo and behold, something happens, you can pull whatever backup you want and restore it back in that cluster. You can also install Velero in a different cluster and restore a backup on that cluster, which is sort of like doing a migration. And I think it's very easy to use given what it does. It's very well supported. It's a very active project. It's been worked on for years, and it is sponsored by VMware. And VMware is definitely making a lot of

5:05 investments in it. Yeah. That that's great to hear. I think, you know, backup and restore is just one of those things that a lot of people may not think about upfront when they're piling up their Kubernetes platform, but it's like definitely something that you should always have a strategy for because disaster recovery is, you know, all well and good until you need it and you need it to work well. I think Velero is fulfilling a very important Yes. Requirement. I I yeah. I I think so. I agree with that. And a lot of people leave backup for last.

5:39 Let's say if you were exploring Kubernetes, coming on board with Kubernetes, you are thinking of so many different things, and the last thing on your list is backup. And I think it should be the first thing because as you tweak things around, if you just had a backup to restore everything back to where it was before, it would be so much easier. So just having that, feature alone, just be able to okay. Let let's put everything back where it was because we messed up something here as we are learning. You so even for that and Velero is

6:14 so easy to install and run. There is no excuse. You should not be running scripts. Velero does all of that for you. So Exactly. Let's say that the clip from today's episode should be: every Kubernetes cluster has Velero available. I think that should just be table stakes now, but backup recovery is so important. Yeah. And of course, you know, you also have to restore things regularly. And I think some of the more high performing teams that I've worked with or, you know, even just spoken to at KubeCon and other

6:48 events are teams that have got this down to an art form as well. I've seen one organisation, don't know if I can say their name, but maybe I'll tweet it later, who actually do almost immutable upgrades on their cluster by using backup and restore to do the upgrades. Like, they don't do any in-place upgrades. They spin up new clusters and then restore from the old cluster to the new one. And I always thought that was almost a superpower of theirs because it just opens up so many different avenues for that platform.

7:17 Does that make sense? Yeah. It makes Do you see people doing that? Well, there are so many different use cases. For example, the ability to just back up your environment. Let's say I'm the developer. I'm doing something. I back up the environment, and I say, hey, restore this, and you're going to have this thing to run, which is exactly like my thing. Right? And it could be a developer sending things to people to test, or for people to try out, like the marketing department. I don't know. There are so many different use cases that

7:52 you can use something like that for. Yeah. Definitely. Yeah. I couldn't agree more. And you said something there twice that I'm just gonna bring up a third time and hope that it doesn't jinx this, but you said it is very easy to install and I'm looking forward to my backup and I'm looking forward to the easy installation of Velero to my cluster. I have prepared a really simple workload, but hopefully it's visual enough that we can see where Velero fits in, how the backup happens, and then even trigger the restore bit. So what I'm gonna do is

8:25 Installing Velero

8:25 share my screen. This is the velero.io homepage. This will be where anyone at home who wants to try this on their own time should just start. We're gonna dive straight into the documentation. And, I mean, I always ask this question, and then feel really silly afterwards. But I guess the first step is for us to install Velero. Is that right? So the first step is you need to provision the cluster. You have to have the cluster. Yeah. I know it's for the then I probably thought about that. So you have to

9:00 have the cluster up and running. I say that because there are so many tools out there now. Sometimes you might think, well, there are also tools that will provision things for you now. So you need to have that. Then you have to go through the documentation and find the provider you are using, or if you're on prem, and find the documentation for the plug-in that corresponds to that provider. So do you know which provider you are using? It's a bare metal cluster on Equinix Metal.

9:35 Does that complicate things? No. No. No. So in that case, because we need to connect to an S3 or S3-compatible storage, we can use MinIO, unless you have looked ahead and have set up some other storage. I should have. I do have access to AWS. I could really quickly create a bucket. I'm assuming we would just need some really simple credentials to get that running. Yes. You need the access key and the bucket name. Yes. You also said the other option there was MinIO. Now

10:21 MinIO has a pretty stable Helm chart, which we could throw into a namespace on the cluster itself. Right? Because we're doing backup and restore of individual namespaces, I believe. So Yes. So I just wanna say something very important: alerts, red flags. You don't wanna do that in production. You're backing up a cluster, and if that cluster is where you have the storage for your backups and the cluster goes down, you don't have a way to restore it. Right? So do not do that in production.

10:55 You wanna have your S3 buckets somewhere else other than in your cluster environment. Yeah. That's great advice. If we can keep that in mind, you can, yeah, you can have MinIO running anywhere. I think I'll just grab a Docker command and stick it on a host. I'll keep it outside the Kubernetes control plane. Just because you're right, I shouldn't stick it there. So I wouldn't use the Helm chart, but I can just run this on a machine. We'll pretend that it's in a different data center or somewhere else, that it's very safe and secure

11:32 and it'll be fine. Totally fine. I'll just grab an IP address. And there we go. Oh, very good point, actually. I don't wanna install Docker, use containerd. Trickier. There's a binary. Right? Let me. I should have looked ahead. There we go. Because, you know, in my head, I knew we needed an S3-compatible object store, but I just didn't think. Why I also didn't think to ask you. So that makes two of us. Yeah. Well, we'll just grab this binary and run it. And MinIO doesn't really take any setting up, so it'll be

12:33 fine. We'll make it executable. Just run minio server. I think there's a command here rather than me just bashing on keys until it works. Oh yeah. Okay. So, server and then just the directory. Perfect. Cool. So minioadmin and minioadmin, I think, are just the credentials we can use. Yeah. We'll just take it from there and we'll see what happens. What's the worst? That looks good. And now we have another terminal. Okay. Slight segue done. Hopefully, disaster averted. We have our S3-compatible storage. So I have a Kubernetes cluster. I have a workload. It has a persistent volume. It has

13:28 state, and I have something that works and acts like S3. So our next step So with MinIO, the plug-in that we're going to use is the AWS plug-in. Okay. So should I just go into this basic install bit first? So what I would do is go on the search bar, Yep. Up on the left. Search for MinIO. Yes. That. And if people noticed, that page says evaluation. This is not for production. It's a very good way to set up if you wanna evaluate Velero.

14:17 Alright. So I will do. I'm gonna open that. I'm gonna turn to my right because I'm gonna look at that. It's a bit too tiny for me. Oh, sorry. I should have No. It's earlier. If you right click on it and say show controls, you can full-screen the video. Oh. Yeah. I was so annoying. But, yeah, if you right click and show controls, that'll make your life a lot easier. Yeah. Now we get it. Okay. I'll just quickly brew install velero and get the CLI available to me locally.

14:50 Oh, that reminds me. If? We just released the 1.6 version, and I don't think we did the brew. We updated the brew install. Ah. Oh, no. It's there. There you go. 1.6. It is? Yeah. Somebody did it. Great. Yeah. One sec, though, seems to be coming through here. Yep. Yes. So that's what I wanted to to use today. Oh, yeah. This is the That's a just not that. Still version. Right? Okay. Alright. So let's see what we're looking at here. What we're looking at here is we have the client version, and the error is saying

15:34 you don't have Velero installed anywhere that we can find. So there is no server side installation of Velero, which is to be expected because we haven't done that yet. We're going to use the client CLI to do that. Okay. So we just need to configure. So you need to The secret. Pass the so you need to have the AWS, the MinIO credentials in a file or pass it as a I think if you pass it as an environment variable, it would work. And we'll just But I think it's that

16:21 file. It seems to be quite simple. So minio.creds. And mine was minioadmin, minioadmin. That should be that taken care of. Do we need to oh, that's the manual deployment. So we can just ignore that. And I guess what we want is this here. Yes. So just the plugin version is going to be version 1.2.0. So you might wanna paste that into a text editor so you can edit that stuff. Yeah. The secret file needs to point to that secret file you just created. Okey dokey.
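
The credentials file described here can be sketched in a few lines of Python. This is only a minimal sketch: it assumes the file follows the standard AWS shared-credentials layout (which is what the AWS plug-in expects when pointed at MinIO) and uses the minioadmin defaults mentioned on stream. The helper name render_credentials is hypothetical and not part of Velero.

```python
# Sketch: render the credentials file later passed to `velero install --secret-file`.
# Assumption: the AWS plug-in reads the standard AWS shared-credentials layout.


def render_credentials(access_key: str, secret_key: str, profile: str = "default") -> str:
    """Return the text of an AWS-style credentials file with a single profile."""
    return (
        f"[{profile}]\n"
        f"aws_access_key_id = {access_key}\n"
        f"aws_secret_access_key = {secret_key}\n"
    )


# MinIO's out-of-the-box credentials, as used on stream; change them for real use.
creds_text = render_credentials("minioadmin", "minioadmin")
print(creds_text, end="")
```

Saving that text as minio.creds and passing the path to the install command is all the secret handling this evaluation setup needs.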

17:12 Just all my automation from spinning up the cluster. We can get rid of that. Velero install and paste. Okay. We're gonna Alright. So line three. Yes. And bucket name? Should probably create a bucket. Right? So let's just grab the IP address for this. This is our cluster. And I just log in with those credentials. I'll just quickly create one. Yep, new bucket. There we go. I'll call it Velero. And now we have the bucket name. So, oh, good. That's what that's what's there too. I called this Velero.creds. Use volume snapshots false. What's Yeah. It's gonna be false because

18:16 with let's say Okay. Okay. So that's okay. So let's take a pause. Yeah. So far, we have configured the storage, we're looking at the storage where the backups are going to go to. So if and when we need a restore, we'll pull from that storage. Mhmm. The other part of the Velero configuration is persistent storage. So without the persistent storage piece, we can back up and restore all day long all of the Kubernetes objects. Now if you have persistent data anywhere that you also want to back up,

19:05 then you have to set up another piece, which is the volume snapshots. Now, if you were using one of the providers for which we have a plug-in, so the Velero team maintains plug-ins for AWS, GCP, Azure, and there is also a team at VMware that maintains the vSphere plug-in. And there are some plug-ins for other persistent storage. Yes. So we just have to go in the documentation and see what's listed there. Now if you're not using any

19:50 of those providers, if you are, we recommend you use the specific plug-in because it's going to be better. You can use Restic. Restic is also an open source tool. It's a tool that's integrated in Velero, and we use it to do a file system level backup of that persistent storage. So with the plug-in from a provider, we don't do that. What we do is we call the provider's API, and we trigger snapshots. So we just say, you know, okay. The user wants a backup of this persistent storage. Take a snapshot. Take a snapshot.
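
As a rough illustration of the rule of thumb just described, the sketch below picks a volume backup method from a provider name. It is a toy model, not Velero code: the provider set only lists the plug-ins named in the conversation (other storage plug-ins exist), and the function name is hypothetical.

```python
# Sketch: pick how persistent volumes get backed up, following the rule of
# thumb from the conversation. All names here are illustrative only.

# Providers the conversation names as having a volume snapshot plug-in.
SNAPSHOT_PROVIDERS = {"aws", "gcp", "azure", "vsphere"}


def volume_backup_method(provider: str) -> str:
    """Return "snapshot" when a provider plug-in is available, else "restic"."""
    if provider.lower() in SNAPSHOT_PROVIDERS:
        # Velero asks the provider's API to take a snapshot of the volume.
        return "snapshot"
    # No plug-in available: fall back to a file system level copy with Restic.
    return "restic"


for name in ("aws", "equinix-metal"):
    print(name, "->", volume_backup_method(name))
```

For the bare metal cluster used in this episode the function lands on Restic, which matches the choice made on stream.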

20:28 Right. And the information gets stored in the metadata. So we know the ID, and we know where to fetch it from when we wanna do a restore. We do the same thing in reverse. We access the provider's API and say, okay. Give me this snapshot. Alright. So if we don't have that plug-in, we use Restic. And now we have to look at the documentation for how to install Restic with Velero. And by the way, just so people know, if you run the velero install against let's say you configure one way and you want

21:03 to update the configuration, you just run the velero install again and whatever is there is going to update everything. Okay. Just in case, for example, if we run the install, oh, we forgot to install Restic, we could run it again with the Restic command. So let's go on the documentation and see. What I would do is just open a new tab there and search for Restic. Yes. That, the first one. Okay. So we have to Okay. So oh, yeah. I could just do the new version. Flag. Alright. Yeah. So we set the use

22:01 volume snapshots to false because we're not going to have Velero calling any provider's API. So it just creates fewer CRDs, fewer resources related to Velero that we wouldn't need anyway. Okay. So you have the S3 URL set up to your correct IP. Right? The 145.40.c. Okay. You typed very fast, so I didn't see you updating that. And I think that's it. I don't know. Alright. Let's do it. I mean, it's fake confidence. But, I mean, let's do it. Yeah. You're right. So velero install, provider AWS because we're using an S3-compatible API.

22:48 We've got a bucket name. We're using Restic because we don't have the snapshot API provider support, and we've got that set to false here. That all makes sense. We pass in our API creds here for MinIO, and then we connect the dots here. So I will just quickly run this here. Is it gonna use my kubeconfig environment variable to speak to the right cluster? I guess so. We'll find out. I just Sorry. Say that again? I'm assuming it's just gonna use my default context for Oh, okay. Great thing to mention. There is a global flag that you can

23:32 use to point to a specific context if it's not the one you're on. But if you don't use that flag, it's going to install on the context you're on. Okay. I'm okay with that. I've got my context set to there. We also got a comment in, great tip: Raverley UK has been uninstalling and reinstalling Velero to change config. So there we go. Oh, my file name must be wrong. Minion. Minion creds? How did I manage that? Minion creds. I must have been in SaltStack mode there. I have no idea. Maybe you're hungry

24:14 for a filet mignon. Perhaps. Yeah. Okay. It looks like it has deployed everything we need to our cluster. Do you want me to run the logs command to make sure that looks I think yeah. Let's show it. You didn't catch the last o. Okay. I see no errors. So our confidence was just we have installed Velero to our cluster. Does this just get installed into the default namespace then? Velero namespace. So it installs by default on the velero namespace. If you do wanna install it in a custom namespace, you can also do

25:05 that. We have documentation for that. Okay. Awesome. Noel left a comment saying bananas. Obviously has been watching the Minions movie recently. So I do love those movies myself. I'm a sucker for them. Yep. I do too. I love it. They're so cute. Oh, yeah. Definitely. My daughter's only two and a half, but I'm waiting for her just to get to that age where she starts to pay more attention for longer periods of time so I can just, like, watch all the Minions movies all the time. Okay. So Velero and Restic appear to be running.
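
To recap the install step from this chapter, the sketch below assembles the velero install command as an argument list so it can be reviewed before it is run. It is a sketch under stated assumptions rather than the exact command typed on stream: the flag names follow Velero's MinIO evaluation instructions for the 1.6 line, region=minio is the value those instructions use, MINIO_HOST is a placeholder (the real address is not reproduced here), and the helper name velero_install_args is hypothetical.

```python
import shlex


def velero_install_args(s3_url, bucket="velero", secret_file="./minio.creds",
                        plugin_version="v1.2.0", kubecontext=None):
    """Build the argument list for a MinIO-backed `velero install`."""
    backup_location_config = ",".join([
        "region=minio",            # value used in Velero's MinIO instructions
        "s3ForcePathStyle=true",   # address the bucket by path, not by subdomain
        f"s3Url={s3_url}",         # where the S3-compatible endpoint lives
    ])
    args = [
        "velero", "install",
        "--provider", "aws",       # MinIO speaks the S3 API, so the AWS plug-in is used
        "--plugins", f"velero/velero-plugin-for-aws:{plugin_version}",
        "--bucket", bucket,
        "--secret-file", secret_file,
        "--use-volume-snapshots=false",  # no provider snapshot API on this cluster
        "--use-restic",                  # file system level volume backups instead
        "--backup-location-config", backup_location_config,
    ]
    if kubecontext:
        # Global flag: target a context other than the current one.
        args += ["--kubecontext", kubecontext]
    return args


# MINIO_HOST is a placeholder for the machine running MinIO (default port 9000).
args = velero_install_args("http://MINIO_HOST:9000")
print(shlex.join(args))
```

Printing the joined command first makes typos such as a misspelled secret file name easier to catch before anything touches the cluster.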

25:30 Creating a Backup

25:36 They appear to be happy. Is that it? Stream done. We're finished. Right? It just works. That's it. That's it. So we could try creating the backup. Okay. I'll Do you want me to tell you the command or you wanna show the docs? Yeah. Well, I'll pull up the docs, but we can definitely just feel brave. I'm assuming if I just type dash help, I may get a little bit of guidance as well. Yes. Yes. Okay. So we got the velero backup command. Oh, we got a bunch Sorry. So before we do a backup, let's look

26:16 at that backup location. Do a velero get backup-location. Alright. So that backup location, if you're not familiar with Velero, you're going to be wondering, where did that come from? So when we did the install, we passed some parameters. For example, use the AWS plug-in. And the way Velero works is, for each storage location where you want to upload the backups to, you need to have a corresponding

27:14 backup location. That is sort of, like, the abstraction that Velero uses to add configuration for the storage location. So everything that Velero needs to know to make that connection. If you don't create it ahead of time, if you don't pass specific flags at installation, Velero is going to give you a little hand and create a default backup the full name is backup storage location. In the CLI, it's backup-location, but it's the same thing. So we'll create one for you that's named default that's going to be using the parameters that

27:57 you passed in during the install. So we knew you were using the AWS plug-in. So that backup location, and the S3 URL, that little flag, I might have gotten that wrong. So that's part of the configuration. You can look at the CRs that get created, if you're interested. So at any rate, this is a very good command, I really like this command, that gives us the information if the location is available or not. This is the very first thing when I do support. That's

28:37 the first thing I wanna know. Is your backup location connected and up and running? Because if it isn't, you're not gonna be able to upload backups or download backups or restore backups. So that's up and running. Great. You can also use that command to create new ones. You can delete it, and there is always one that will be the default one. And what that means is when you are creating backups and you don't specify a backup location, Velero will use the default. But if you have, let's say, three or however many, you can pass to the backup commands

29:21 the specific backup location name to use for that backup. Same thing with the restore. Wherever you wanna restore from, if you don't specify, Velero will use the default one. And, again, I love that command that just lets you know everything is up and running. And if not, then we dig through the logs and see what's going on. Okay. Awesome. There's a lot to unpack there. I've got a couple of questions. So we can have multiple bucket locations. So we could, in theory, be backing up to AWS S3 and Google's

30:00 own cloud storage thing. Is that something I would have, like, one backup go to multiple locations? Is that a common pattern, or would it be that different backups go to different locations? Absolutely. So you have to set up at least one location for each of those providers if you want to send backups to different providers. It's definitely a good strategy as a means of backing up your backups. Or maybe, I don't know if people do that, but you might wanna have a section of your

30:36 cluster go to, say, GCP and another section, or some namespace, go to AWS. Yeah. In other words, flexibility. So one thing to point out for people who are familiar with Velero: up to 1.5, you could have that, but Velero only accepted one secret key value pair. So if you wanted to switch from one provider to another provider, you had to manually update the Kubernetes secret to the secret you're using because Velero could only handle one secret. I mean yes. So with version 1.6, actually, that is the main feature we added. And there is a blog post also that

31:28 talks about it, and there is documentation that talks about it and, drum roll. I never start with the end. The feature is you can now have multiple secrets with Velero. So let's say you create the secrets on Kubernetes. Let's say you create five secrets. Mhmm. And when you create the backup storage location, you can pass in the name of the secret you want to associate with the backup storage location. So now you can have a secret for AWS, a secret for GCP, and you can just tell Velero, hey. This

32:09 is the secret for this location, and use that. So that's pretty neat. Yeah. Very interesting. Definitely. So the other thing that's jumped out to me here is that, you know, I did an api-resources, and I just kind of grepped it down to the Velero stuff, is that everything that we're working with can just be configured through a custom resource. Which means I can GitOps all of my backup and restore processes as well. Is that right? I am forgetting now what the glitch would be, but Velero has some hooks there that

32:49 even it will not recognize some stuff, and I'm forgetting what the stuff is. Mhmm. Because we keep reconciling to check if there are new backups to be uploaded, for example. I think if you create a backup just manually, let's say, or with GitOps bypassing the Velero CLI, there is something that I'm not remembering, I have to ask my teammates, that Velero will not recognize and Right. Get you in trouble. Okay. Noted. Okay. We got two comments there that I should quickly tackle before we kinda progress there. So Noel

33:32 is saying that he spotted the velero bug command and is just saying kudos. That's an awesome feature. I never Yeah. Thank you. Never noticed that, but oh, yeah. Yeah. Thank you for we love knowing what people love. For version 1.7, we have planned to do a Velero diagnosis tool. It might be called Velero crash or something. It's a bit tricky because, if you run that and it's going to output things from your cluster, it needs to output nothing that would be private. So we're going to be very careful about that. But, basically, you run

34:15 that. If you have a problem with Velero, if you're asking for support or if you just wanted to troubleshoot it yourself, you run that. It will give you all the information you have to look at, like the proper logs, all the proper things that will be relevant. Anyway, I'm just saying we're always looking to make things easier for people to see what's going on. Awesome. Okay. We'll tackle one more, and then we'll continue. But Nils is asking, can Velero back up etcd?

34:48 Yeah. Yep. Yep. Every Kubernetes resource gets backed up by Velero. Awesome. Alright. Everything from etcd. Yeah. Which is great because, you know, we wanna be able to restore everything from the ground up if the worst were to ever happen. It makes a lot of sense. So one thing that may be useful for people to know, people who are not familiar with Velero, is Velero only accesses etcd through the API server. So we're not taking a file system backup of etcd. There are tools that do that, and you can use that in parallel with Velero

35:28 too. But Velero only uses the API, and that is what makes it able to do, for example, selective backups, because with the API, we can exclude things, and there's a lot more flexibility. So just so people are clear, we're not doing a file system backup of the entire etcd. We're just backing up the objects in the database. Okay. Awesome. So I guess our next step here would be to create our first backup. Yeah. Let's do it. Alright. So I did see there was a velero backup command. So I'll just

36:11 Oh, one neat thing or maybe maddening, depends on how your brain works, is with all Velero commands, except I think for the downloads, you can interchange the command. So you can do create backup or backup create, and either one will work. I actually did an entire episode where I tried to build a kubectl plugin to override the Kubernetes verb noun behavior because I really disapprove of it. So I'm glad that I could just type backup create rather than create backup. But I guess personal preference. I like starting with the verb,

36:46 so I stick to that so I don't have to think, wait. What am I writing? Okay. I'll do it your way. So you would do create backup? Yeah. Yeah. Okay. And give it a name. Okay. So backup one. Oh, back one one. Okay. And then I need to do dash help on this to see if I need other flags, or is that gonna do something for me out of the box? That's it. That's all you need. I mean, again, if you wanted to specify a backup storage location to save it to that wasn't the default, then you add the

37:26 flag there. If you wanted to Sorry. I knew. Yeah. So that's just gonna back up everything? Like, that's just the default behavior except everything. Yes. Okay. Yeah. Yeah. And that a good that's a good point. If you wanted to do a selective backup, then there are flags you can use. Backup. Alright. It's not done. Not done. Okay. Sorry. So what is done here is the request for the backup to for the backup operation. So Got it. You can do yeah. Velero describe or you can do Velero get backup. Backup or backups, that's another thing. Single or

38:13 plural doesn't matter. I have backup one. Yeah. So is this just a wrapper on kubectl? I mean, it's just going to the Kubernetes API and requesting resources. Is that correct? Yeah. We are not wrapping the kubectl tool. We are doing our own API calls. Okay. But, yeah, it's only going against the API server and getting that Velero CR that gets created with each backup. There is a CR that gets created, and that gets updated by the reconciling things that happen. And so what

38:55 we are doing is getting information from that resource. Okay. Oh, yeah. So I can actually see here that it looks like I can just describe my backup, backup one, and I'm probably gonna get, yeah, very similar information. Not displayed quite as nice, but very similar. Okay. Let's go back to the Velero command. First thing that jumps out to me here was, first, it's completed, which is great, but also there's a default expiration on my backup. So it's, like, twenty nine days. Yes. Yes. So you can set that TTL. Some people have asked if can you set

39:34 it forever? No. We don't have that ability right now, but some people have asked, can you just, like, make it not expire, basically? But, otherwise, you can override the thirty day default and set your own expiration. Yeah. I guess the forever would be essentially an anti-pattern because you don't wanna take one backup and then say keep it forever. I'm assuming we want this scheduled to run at a regular interval and expire the older backups as we progress through time. So I can understand that constraint. Yeah. Okay. And I guess this also means that you shouldn't

40:09 schedule your backups around once every twenty-nine days just to create a new one. That's probably not ideal either. Is there a cadence that you would recommend? Is it daily? Is it hourly? Does it depend on the workload? Like, what would you do in your own clusters? To be completely transparent, I am not running clusters in production. Right. Got it. And even if I were, I'd be cautious about saying. I don't know. I think you need more data points than one. It's definitely very good to be asking what's the best practice, but I think

40:47 it would vary. Just as an engineer, I would say it should vary depending on what it is that you're backing up. Is it programmer stuff? Is it production? Yeah. That's a great point. My question was extremely naive, but you're right. Like, it depends on the workloads on the cluster. Right? Like, there are some applications, I mean, maybe Kafka would be an example, that do their own replication, so you could maybe back them up at a slightly lower velocity than something else that doesn't.
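
The expiry and cadence points above map onto two CLI options. A minimal sketch, assuming the `--ttl` and `--schedule` flags as documented for the Velero CLI; the backup and schedule names, the durations, and the cron expression are illustrative and were not used in the episode. The `run` helper is a hypothetical dry-run wrapper that only prints each command unless `DRY_RUN=0` is set, so the snippet can be pasted without a cluster.

```shell
#!/usr/bin/env bash
# Hypothetical dry-run wrapper: prints the command unless DRY_RUN=0.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# One-off backup that expires after 7 days instead of the 30-day default.
# (Name and duration are assumptions for illustration.)
run velero backup create weekly-keep --ttl 168h0m0s

# Recurring backup using cron syntax: every day at 01:00,
# each resulting backup kept for 72 hours.
run velero schedule create daily-all --schedule "0 1 * * *" --ttl 72h0m0s
```

How often to run the schedule and how long to keep each backup depends on the workload and on storage cost, as the speakers note; the values here are placeholders.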

41:20 There are too many ifs, buts, whens, and maybes there. So let's just strike that question from all records. The question is not naive at all. It is a very pertinent question that everybody should be asking, because you also don't want to be backing up without thinking, okay, this is data that's being transmitted up to the cloud, being stored. You pay for all of that. So you have to think about it. So it's not naive at all. It's just that it's going to depend on things that I have no idea how to help you

41:56 with. Okay. So, I mean, I'm amazed at how easy this has been so far. We have installed Velero. It's already made a backup. I'm assuming if I just jump over to our MinIO and hit refresh, we have our backup one directory. We've got a whole bunch of stuff in here. We've got the CSI volume snapshots. We've got volume backups, resource lists. I guess this is just all the different components of my cluster backed up individually. Does that mean I can restore these in isolation or everything at once, that I have that option and flexibility?
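
The backup steps covered so far can be sketched as the commands below. This is a hedged recap rather than a capture of the host's terminal: the spelling `backup1` for "backup one", the `velero` install namespace, and the `default-only` backup name are assumptions. The `run` helper is a hypothetical dry-run wrapper that prints commands unless `DRY_RUN=0` is set.

```shell
#!/usr/bin/env bash
# Hypothetical dry-run wrapper: prints the command unless DRY_RUN=0.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

BACKUP_NAME="backup1"   # assumed spelling of "backup one"
VELERO_NS="velero"      # assumed install namespace

# With no filters, the backup request covers everything in the cluster.
run velero backup create "$BACKUP_NAME"

# A selective backup limited to one namespace.
run velero backup create default-only --include-namespaces default

# The create command only submits a request; poll for status afterwards.
run velero backup get
run velero backup describe "$BACKUP_NAME"

# The same status lives on the Backup custom resource that Velero reconciles.
run kubectl -n "$VELERO_NS" get backups.velero.io "$BACKUP_NAME" -o yaml
```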

42:40 Alright. So if you go back to the command line, I wanna show you one feature, and then I'll answer your question. So you can run a Velero download. I forgot the flag. I think it's download backup and then the name of the backup. I could be wrong. Yeah. Let's look at the help, please. Download help. Oh, I think this was... it's an unknown command, download. Oh, that's okay. So Velero backup. I'm sorry. Velero backup, that's the one that doesn't have the switch. Velero backup download and then the name of the backup. There we go. Yeah. Alright. So that downloads

43:30 the whole thing to your local machine, and then you can untar and explore the contents. Now, to answer the question you asked: yes, you can selectively restore. So one best practice for backups in general is to have at least one periodic backup where you're backing up everything. If you can; some organizations are going to have restrictions. They're not gonna want some people backing up some namespaces. But as long as everything gets backed up one way or another. So, ideally, you have a backup of everything. And then when it's time to restore,

44:16 you can decide what it is that you want to restore. And then you can say, okay, I just wanna restore a particular namespace, for example. Okay. So, I mean, there's no point in us restoring this to our cluster. Right? We should probably cause a little bit of chaos. Although, I think I've just identified a mistake that I've maybe made during the setup, which is that I don't think we can actually delete the default namespace. Should I deploy my fake stateful workload to a new namespace? Or can I delete the default namespace? I

44:49 don't think I've ever attempted that before. What you can do is, maybe, do you have a workload installed in that cluster? Excuse me. I do. Yeah. So we've got a time writer, which is every ten seconds writing the time to a file on a persistent volume. Okay. So you can delete that. I mean, I suppose that it's not a high value thing. No. Yeah. I mean, Velero is robust enough. It's just, like, in a demo situation, I don't wanna be troubleshooting your production cluster. No. This is an entirely contrived cluster. So

45:28 let me share with the viewers what I've put in place just to kind of demonstrate what we're doing. So this is just a really simple bash script, but it does have a persistent volume. And then to the persistent volume, we're writing the time every ten seconds. So if I just run a wc -l, you know, this has been running all day. We've got nearly 2,000 time entries on our PVC. So I guess what I would like to see now is that if I delete the persistent volume and the time writer stateful set

46:00 is that we can actually restore those almost 2,000 entries into our cluster as if nothing happened. Is that... Hold on. Because I am forgetting now if, with Restic, we need to specify something during the backup. Uh-huh. Okay. So let me just... Do you want me to open up the backup and poke around a little bit? Or... Either that or we could look at the docs. Yeah. Maybe we should look at the docs. That's what other people would do. I think I'm just the silly one that would go the

46:34 harder way. I think the docs are better. Yeah. Agreed. We might have to create another backup, if there is something special we need to do. Okay. Alright. So the installation requires the use-restic flag. We did that. Mhmm. Okay. To back up. Yes. That section. This is just creating a sample workload with a PVC. It looks like... do we need an annotation on it? Yeah. Oh, no. That's an... it's an... No. Oh, okay. Yeah. We can use the... I always get those two confused. I think what we want is

47:22 the opt-in. Using opt-in pod volume backup. You only have one volume, one pod to back up, right, that has the persistent volume? Yeah. And it looks like there's a flag here. Is that important? Default volumes to Restic, or is that something else? Default volumes. I don't know what default volumes to Restic means. Hold on. I mean, let me also read this. Backup create dash help. Let's just see what it says here. Okay. So it defaults, I think, to true. And... So... Yeah. I think it has backed up. Because the default volumes to Restic is the

48:23 default of true, and it seems to suggest that it will use Restic by default. It should be... yeah. It's not a production workload. I'm okay if I lose my 2,000 timestamps. I'd be a little upset, of course, but I will get over it. Do you wanna look at the content of the backup, see if there's Restic stuff in there? Yeah. Let's do that. So let's just move this to this directory and then load it. Got a whole bunch of stuff. Finder? No. Open. Yeah. That works. Okay. So I'm assuming

49:06 it would be under resources, persistent volumes. Got a JSON file. I think there should be a Restic folder. I don't use Restic all the time. And when I do use it, I forget what should be there. There's no Restic folder? Above resources? We've got a metadata. Yeah. I don't know. I mean, oh, that's our claim. Maybe we don't have the actual data. I mean, I could always, I guess, just grep for... I can't remember the format of my timestamps now. Oh, I used the date command. So I searched for April.
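
The download-and-inspect step above can be sketched as follows. It assumes the `velero backup download` subcommand with its `--output` flag; the archive and directory names are illustrative, and `backup1` is an assumed spelling of the backup name. In the episode, searching the extracted archive for the timestamps found nothing, so treat the search as a check of what the archive contains rather than proof of what was backed up. The `run` helper is a hypothetical dry-run wrapper that prints commands unless `DRY_RUN=0` is set.

```shell
#!/usr/bin/env bash
# Hypothetical dry-run wrapper: prints the command unless DRY_RUN=0.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

BACKUP_NAME="backup1"                  # assumed backup name
ARCHIVE="${BACKUP_NAME}-data.tar.gz"   # illustrative output path
OUT_DIR="${BACKUP_NAME}-contents"      # illustrative extraction directory

# Fetch the backup archive from object storage to the local machine.
run velero backup download "$BACKUP_NAME" --output "$ARCHIVE"

# Unpack it into its own directory.
run mkdir -p "$OUT_DIR"
run tar -xzf "$ARCHIVE" -C "$OUT_DIR"

# List the captured resource types, then search for Restic-related entries.
run ls "$OUT_DIR/resources"
run grep -ril restic "$OUT_DIR"
```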

50:02 I don't think... I don't think there's any data. I don't... yeah. I think there will be a folder with Restic. Let's do the opt-in. Like, from the documentation, let's go to the opt-in and run the kubectl, number one. Yep. So, annotate the pods. Okay. Let's do that. So if I do get pods, yep, still works. We've got a kubectl command here. We're in the default namespace. I guess I could just annotate stateful set time writer, and we're adding... we just have a PVC volume. So I think,

50:48 yeah. We're adding an annotation here. This is backup PVC volume. Done. Okay. So now we create... See number... yeah. Create the backup. So number two, create a backup, and then we can look at just the backup describe command to see if there is Restic stuff in there. Okay. So it's still backing up. I mean, it's taking a bit longer, I guess. I think it's, yeah, it's finished. Okay. So it looks like we can use the custom resource. Get, what was it? Pod volume backup. Try all. Pod volume. Pod volume.

52:11 Yeah. Okay. So I annotated the stateful set, but I haven't restarted the pod. So I probably set this up to fail there, because if I describe the pod, we're actually probably not gonna have the annotation. Yeah. So I should probably just annotate the pod rather than the stateful set. Yeah. But I'm learning, which is important. Okay. Let's create backup three, and we'll describe three. Okay. Give that a few seconds to finish. It seems like it only took around, I think, ten seconds last time to back up the... So the Velero create backup or

53:09 backup create has a wait flag, dash dash wait, if you want to use that. Ah, nice. So it will only return after the backup's created, and you can also Ctrl-C out of that. Just FYI. Ah, nice. Okay. I'm not seeing anything for the pod volume backups. Of course, that doesn't mean that it hasn't backed it up. But based on what I see in the docs here, I kind of expected maybe to see something there. Should I download it again, do you think? Did you run that kubectl bit to see the label for the backup?

54:09 I can run it again. Our backup name was backup three. Yeah. So we're getting an empty list back. And I'll just describe the pod and make sure I didn't mess up either. Oh, that's not a... Okay. Okay. I think it's not working because I'm doing this the hard way and not actually reading the documentation. I don't think this is the type of volume. Is it maybe the name of the volume? I think it's the name. Yeah. Yeah. But that's just me being rather shitty again. Okay. So what was our volume actually called? It's

54:54 called data. So I would just re-annotate. Fourth time's a charm. Fourth time's a charm. Who said third time's a charm? Exactly. Overwrite. We are innovating. Okay. Backup. Let's go to work on this thing. Four. Oh, no. I should've done... all good. I'm glad I got that command wrong because I wanna do the wait flag, dash dash wait. Cool. So I guess now that I've got the annotation correct, the backup volumes annotation takes the list of the PVCs that we want to opt in to backup, which makes much more sense than my random just backup all PVCs.
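
The opt-in that finally worked can be sketched as below. It uses the `backup.velero.io/backup-volumes` annotation from the Velero Restic documentation, whose value is a comma-separated list of volume names as declared in the pod spec (here `data`, the name given in the episode). The pod name `time-writer-0` and the backup name `backup4` are assumptions. The `run` helper is a hypothetical dry-run wrapper that prints commands unless `DRY_RUN=0` is set.

```shell
#!/usr/bin/env bash
# Hypothetical dry-run wrapper: prints the command unless DRY_RUN=0.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

NAMESPACE="default"
POD="time-writer-0"     # assumed pod name for the time writer StatefulSet
VOLUME="data"           # volume name from the pod spec, as stated in the episode
BACKUP_NAME="backup4"   # assumed spelling of "backup four"

# Opt the pod's volume in to Restic backup; --overwrite replaces the
# earlier, incorrect annotation value.
run kubectl -n "$NAMESPACE" annotate pod "$POD" \
  "backup.velero.io/backup-volumes=${VOLUME}" --overwrite

# Create the backup and block until it finishes (Ctrl-C detaches).
run velero backup create "$BACKUP_NAME" --wait
```

An annotation set directly on a pod is lost when the pod is recreated. Placing it in the StatefulSet's pod template (`spec.template.metadata.annotations`) would make it persist, which is also why annotating the StatefulSet object itself had no effect on the running pod.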

55:43 Let's run the kubectl bit for the label, see if it shows up. Okay. So that was this command here. Only my backup's got a new name. Look at that. Look at that. When you do it right, it works. Awesome. Okay. So now let's download it and see the structure. The Restic data should be there. Yep. Thank you, River DK, who's very thankful that I don't prerecord my terminal stuff. No, I do everything very live, to my own success and detriment, depending on how my day is going. So I just created folders with wacky names. Now

56:36 we have bk4, and now I'm hungry for a Burger King, but we'll jump in here, extract this, and we're hoping we'll see under resources something called Restic. Yes. There we go. Yeah. Don't know. It's a JSON file. I don't know if we should be seeing a JSON file or not a JSON file. I'm not entirely sure. You said it's not a file level backup. Right? So the JSON may be embedded there. However it works, I'm not entirely sure. But I think that Restic folder is a good sign.
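
The check that confirmed the volume was picked up can be sketched like this. It assumes the `podvolumebackups` custom resource and the `velero.io/backup-name` label described in the Velero Restic documentation, plus the `--details` flag of `velero backup describe`; the `velero` namespace and the name `backup4` are assumptions. The `run` helper is a hypothetical dry-run wrapper that prints commands unless `DRY_RUN=0` is set.

```shell
#!/usr/bin/env bash
# Hypothetical dry-run wrapper: prints the command unless DRY_RUN=0.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

VELERO_NS="velero"      # assumed install namespace
BACKUP_NAME="backup4"   # assumed spelling of "backup four"

# One PodVolumeBackup should exist per annotated volume in this backup.
# An empty list means no volume was opted in (as with the earlier attempts).
run kubectl -n "$VELERO_NS" get podvolumebackups \
  -l "velero.io/backup-name=${BACKUP_NAME}"

# The detailed description also lists Restic backups when there are any.
run velero backup describe "$BACKUP_NAME" --details
```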

57:21 Yeah. I'm feeling... And these were labeled properly, so I think you can go ahead and crash your workloads. Okay. So I'm a silly Kubernetes operator who comes along and says, hey, that stateful set doesn't look useful. And delete. Oh, I'm in the wrong directory. And I need my kubeconfig. Let's try that one more time. Time writer. Because I'm having a very bad day, I haven't just deleted my stateful set. I am going to delete my persistent volumes too. So, we'll delete the claim first. Oh, I deleted the volume first. Oh, and

58:00 Restoring a Backup

58:12 it should still work. Delete the claim too. Maybe speed that up. And we'll run get. Okay. We have deleted the stateful set. We have deleted the claim and the volume. Those 2,000 lines of very useful timestamps have now disappeared forever. But I guess we wanna restore our now broken production service. Yeah. Okay. I'm gonna just go ahead, and I'll let you know, because the CLI has been so easy to work with that I could probably just do restore backup, backup four. Is that what you want me to do? No. No. Almost. Almost. Almost. It's Velero restore

59:04 dash dash from dash... sorry. No. No. No. Backup. From dash backup and then the name of the backup. Oh, no. Yeah. I guess that works. I never put the equals, but that should work. Almost. No. Wait. What? Maybe it's the equals sign. It looks like it wants us to do a Velero restore... Oh, I'm sorry. Gosh. Yeah. It's a create. Yeah. So you wanna create a restore. Intuitively, I do the same. I think, oh, what I wanna do is restore, so the command should be named restore. But you wanna create

59:50 because, basically, with all the Velero commands, you're not actually creating anything. Things get created or updated through the reconciliation. All that we can do is create a request to the API server. So we want to create a restore request from the backup, and that request is going to be sent, and then things will be done... Okay. Magically. So let's kick that off then, and then we'll tackle a couple of the comments that we've received. Yeah. Let's start it. Okay. So Noel was just, again, reiterating that the CLI user experience is really nice. He's very happy

1:00:36 with that. Oh, did that not pop up? There we go. We got a comment from Hector Gibson, who suggested that I run the grep again on the backup data just to see if it's there. Yeah. I think that's a good idea. Let's see. Because Noel actually posted another comment, which I'll get to in a second, but yeah. So it doesn't look like the data is in the backup that we downloaded, but someone has suggested that perhaps the JSON file just contains a pointer to something that's in the S3 object store. But I guess that we could poke around

1:01:07 MinIO if we wanted to, I'm assuming. So you can also do Velero restore to see the status. Velero restore get. K. Just that. Well, you could've passed the name. So it's in progress still. Yeah. Because we didn't do a selective restore of our broken or missing service, we're restoring everything in the cluster. Is that correct? Mhmm. Okay. Is this safe to do in a live production cluster, or would it be something where you maybe take it offline before doing the restore? Because I'm assuming it's modifying

1:01:56 etcd and doing a whole bunch of stuff there. I'm just curious. I don't know the answer to that question. I think the answer is a safe yes. When you're working with databases, sometimes you wanna do a, what is it called, fsfreeze, back up. Mhmm. And after you have the backup, you want, I think it's called, fs unfreeze. And when you do a restore, you might have to do the same thing. I'm forgetting. Yeah. Just fs unfreeze, freeze, restore,

1:02:39 or otherwise, you should be good to go. Yeah. I'm sure there's stuff in the documentation that I'm just not reading, but it looks like our backup has restored successfully. So does that mean, in theory, if I run get pods, we should see that our stateful set is back as it was before? Should be. Look at that. And it's up and running like magic. I do like magic. And we now have over 2,000 entries. That's awesome. Really, really simple to just get that running, do a backup, and then restore. That's exactly what we wanna see.
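
The restore flow above can be sketched as follows. It uses `velero restore create --from-backup`, the form the speakers arrive at, together with `velero restore get`; the backup name `backup4` and the final kubectl check are assumptions for illustration. The `run` helper is a hypothetical dry-run wrapper that prints commands unless `DRY_RUN=0` is set.

```shell
#!/usr/bin/env bash
# Hypothetical dry-run wrapper: prints the command unless DRY_RUN=0.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

BACKUP_NAME="backup4"   # assumed spelling of "backup four"
NAMESPACE="default"

# Submit a restore request for everything captured in the backup.
run velero restore create --from-backup "$BACKUP_NAME"

# Alternative (not run in the episode): restore a single namespace only.
# run velero restore create --from-backup "$BACKUP_NAME" --include-namespaces "$NAMESPACE"

# The restore is reconciled asynchronously; poll until it completes.
run velero restore get

# Confirm the workload and its claim are back.
run kubectl -n "$NAMESPACE" get statefulsets,pods,pvc
```

As with backups, the CLI only creates a request object; the Velero server performs the restore, which is why the status is checked separately.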

1:03:23 So, let's have a look just quickly. There's all my broken backups, but there's a good one. Okay. I was just curious if we have a pod volume backup here. So I guess this is the bit that's actually got my data. Very cool. Yeah. Yeah. Let's just check. Yeah. Okay. I'm pretty happy with that. That was good. We got a question from Noel. Doesn't Kubernetes complain about the applied revision if the revision was updated after a backup and we try and restore? I have no idea. Something, I guess. I don't know what an applied revision is.

1:04:16 Maybe it's something that I know by a different name. It's not something I am confident in either to try and describe, but I believe that when Kubernetes resources are modified and the revision number goes back in time, it would complain that the update was out of sync and probably wouldn't take place. But I'm assuming Velero just handles these things for us. Yes. Something to experiment with. Do you think he's talking about group versions? No? I'm not sure. Feel free to give us a little bit more context, Noel, if you want us to have a little bit

1:04:55 of a chat about it. And if anyone else, of course, is watching and you have any questions or wanna see something with Velero, you know, now is the time to drop in before we finish up for this session. Okay. Is there anything you would like us to take a look at before we finish this session, or do you think that we've covered enough of the capabilities of what we wanna use Velero for? I think that is the bread and butter, restoring. And then for anything besides that, what I would recommend is going through the docs

1:05:29 to figure out, for example, how to include and exclude things. We have a page for that. You can schedule backups, so it will just continuously be creating backups for you based on what you specify. What else? There is a page for contributing. That's always welcome. Right? Oh, absolutely. There is a GitHub discussion where people can drop questions, and everybody... we try to answer. Anyone is welcome to try to answer questions. We do have a... okay. This is not what you asked, but I'm jumping ahead. You

1:06:24 asked if there is something related to the Velero tool that we should look at. You know, Velero is not complicated to use. That's about it. Yeah. I'm really happy with how simple using the CLI was to get started. I like that it takes care of all of that, you know, essentially toil of actually installing it to my cluster. I didn't need to go find a Helm chart or worry about how to do that. It just kind of... it just worked. And creating the backup and restore, I mean, it's just... I love it when things are just

1:06:58 simple and it's just easy to use. It makes my life easier, especially when it's something as important as backups and restoration. We do want these difficult problems to be handled by the tooling. And I think Velero's doing a great job at that. So I think... I mean, the only comments we're getting right now are a little bit of discussion about my terrible audio quality in previous episodes. So thank you for reminding everybody there. Yeah. And there's a resources page on the Velero website where you can find ways to, like, get in touch with the team,

1:07:36 and there is a community meeting every Tuesday, 9AM Pacific time. Everybody's free to attend and also drop any discussion topic on our HackMD file. It's all linked on that page. Awesome. Alright. Let me pop us back and we will... There we go. So Velero handles backups and restores. We've seen how easy installation was. We've created backups and we restored. All you need is an S3-compatible storage location or API. You've seen how easy it was to get MinIO there. I should have come prepared with that setup, but we got there in the end. I think I went off

1:08:21 the page there a bit too early, but I love the schedule command. It just took cron syntax. I'm excited to start deploying Velero to my clusters now and taking advantage of this. I just wanna say thank you for joining me today. It was a pleasure to kinda go through this and get your insights into Velero. The 1.6 release seems to be really, really cool. I'm glad that that is out and shipped. Is there anything you'd like to share about what's coming next in Velero before we finish up? Oh my gosh. Blank. We are going to do some

1:08:55 so there is a set of tooling that's sort of new. It's called Carvel. It has some... Carvel. Yeah. So ytt and kbld, and it's basically for deploying, building images, templating, and we're going to be compatible with that. And why I'm saying that first is because I'm going to be doing that work. I'm just not remembering what else there is. So it's on for version 1.7, but maybe we should have looked at... so at the very root of our repository on GitHub, there is a ROADMAP.md file. And now that I'm reading it

1:09:38 yeah, so we're going to make artifacts available so people who use those tools can install Velero, deploy Velero using those tools. And we are continuing work on e2e tests for Velero, which is awesome. We're going to do the diagnostic tool that I mentioned. That's gonna be great to have. So we have a CSI plugin for snapshots, and we're going to bring it to general availability. We are going to do plugin versioning... Okay. Which is... we don't have versioning right now for the plugins. Yeah.

1:10:25 IPv6 support. I know the upload progress monitoring is on the desired items, but I think it's going to be implemented for 1.7. Yeah. That's what's on the list. Those are the big ticket items. Again, quite a number of items there. So, you know, people that are watching, if their interest has been piqued by Velero, go and check out the repository. Hopefully, you can start contributing to a really cool project. Alright. Thanks. And yes. Okay. No. No. Go. Please. We have an integration with Tilt. Some people might not know Tilt, but there

1:10:59 is instruction to set it up, and it is ridiculously easy. If you wanna run Velero code, that's where I suggest you start. I've been working with Velero for years, and that's what I use, because it's the easiest and it's faster than anything else. Well, personally, I do have previous videos with the Tilt team and the Carvel team. And strangely enough, the Carvel video is probably one of the worst audio recordings on my side that I've ever done. So anyone that does watch that, the demo of Carvel is very cool. I just...

1:11:33 Oh, I'm gonna watch it today because I have to learn. No. It was a great session. I learned a lot. The Carvel tool is really, really interesting. So alright. Thank you again for joining me today. I hope you have a really good day and a great week, and I will see you again soon. Thank you for joining me today. Same to you, and thank you so much, David. Thank you. Bye bye. Alright. Bye.

Meet the Cast

David Flanagan

@rawkode

Carlisia Thompson

@carlisia

Documentation

Velero resources page

Velero Tilt integration instructions

Code

Velero ROADMAP.md

Additional Resources

The Podlets podcast
