About this video
What You'll Learn
- Build and run Firecracker microVMs by downloading binaries and starting VMs through the REST API socket.
- Explore Firecracker internals like one-process-per-VM architecture, jailer separation, and device emulation for networking.
- Compare full and dirty-page snapshots by capturing, restoring, and validating VM state during the demo.
Radu and Gabriel from AWS demo Firecracker from the ground up. They walk through firecracker and jailer binaries, booting a microVM over the REST API on a Unix socket, networking and device emulation, the Go SDK, plus a hands-on snapshotting walkthrough comparing full and diff snapshots.
Jump to a chapter
- 0:00 Holding screen
- 0:50 Introductions
- 1:22 Introducing Firecracker and Guests
- 2:45 What is Firecracker?
- 2:46 What is Firecracker? (The Elevator Pitch)
- 4:00 Why Firecracker is Fast & Minimal Emulation
- 5:18 Firecracker Use Cases
- 6:30 Installing Firecracker
- 6:38 Preparing for the Hands-on Demo
- 7:08 Hands-on: Getting Started & Downloading Binaries
- 8:52 Understanding Firecracker and Jailer Binaries
- 10:14 Hands-on: Starting Firecracker & API Socket
- 12:30 Running a Firecracker microVM
- 14:11 Firecracker Architecture: One Process Per VM
- 15:47 Hands-on: Downloading Demo VM Images
- 16:57 Building Custom VM Images
- 18:40 Hands-on: Configuring & Starting the VM via API
- 22:02 Viewing the VM Console
- 22:57 Logging into the Demo VM
- 23:42 Networking & Device Emulation Discussion
- 31:15 Firecracker API (Swagger) and Firectl Tool
- 34:00 Feature Demo: Snapshots
- 34:07 Transition to Snapshotting Demo
- 34:45 Snapshotting Demo Introduction
- 37:58 Snapshotting Demo: Initial VM Run
- 39:00 Snapshotting Demo: Modifying VM State
- 39:38 Snapshotting Demo: Saving the Snapshot
- 40:57 Explaining Snapshot Types (Full vs. Dirty Page)
- 42:05 Snapshotting Demo: Comparing Snapshot Sizes
- 43:00 Snapshotting Demo: Restoring the VM
- 44:27 Snapshotting Demo: Verifying Restored State
- 45:00 Getting Involved
- 45:06 Post-Demo Discussion
- 46:01 Contribution & Higher-Level Integrations
- 47:00 Q&A
- 47:47 Audience Q&A
- 1:01:12 Conclusion & Thank You
Full transcript
Generated from the English captions. Timestamps jump the player to that moment.
0:50 Introductions
0:50 Hello, and welcome to today's episode of Rawkode Live. Today, we are taking a look at Firecracker, a tool for microVMs that is fast. Now before we get started, we are just gonna do some housekeeping. First and foremost, please subscribe to the YouTube channel and click the bell. This will get you alerts for all future episodes of Rawkode Live as we cover more and more cloud native technologies. Also, if you'd like to chat or you're not watching this live and have questions, you can join us in the Discord channel that is available at Rawkode.chat. Alright. Now, like I said, today we are
1:22 Introducing Firecracker and Guests
1:24 gonna take a look at Firecracker and to do that, I am joined by Gabriel and Radu from the Firecracker team. Hi there, Gabriel and Radu. How are you both today? Hello. We're good. Yeah. I always do that. I really need to get better. I ask a question and throw it to people and then have them contend to answer first. So, Gabriel, why don't you introduce yourself? Tell us who you are, and then we'll hand it over to Radu. Sure. So first of all, hi, everyone. My name is Gabriel. I work as an SDE here at AWS,
1:57 and I focus on developing Firecracker. And, yeah, as David said, today, we're gonna talk about Firecracker and look at a short demo on how to use Firecracker at the end of the stream. Okay. And I'm Radu, a software dev manager. I've been with Firecracker since the start. I'm really excited about the entire project and even more so about the space. I generally view what we're doing as a safe platform for serverless containers and functions. That's a space that's just barely getting started. So I think it's going to be very
2:40 exciting for many years to come. Awesome. Thank you very much. Okay. So why don't we start with just a little bit about Firecracker? Like, what is the elevator pitch? What is Firecracker? So the elevator pitch, I'd say what we started with, is that you can do everything you can with containers, but you can have multitenant isolation. That's basically what we built it for. And this is something that practically everyone that uses Firecracker is enabled by. Like, the classic example is AWS Lambda functions. And the other thing that we're discovering now is
2:46 What is Firecracker? (The Elevator Pitch)
3:21 that by kind of having this abstraction layer at the virtualization level, very interesting things like snapshot restore and container cloning and things like that start becoming a possibility. And Gabriel will show a bit of that in his demo. So I'd say the elevator pitch is that today, we're enabling multitenant containers. And then it looks like we can facilitate a lot of dense utilization and just very efficient use of hardware going forward. Yeah. I think when people traditionally think of VMs, we think of slow cumbersome
4:00 Why Firecracker is Fast & Minimal Emulation
4:03 processes, but that's not the case with Firecracker. Right? It's fast, it's light. It's, like, ticking all of the boxes for things that we need. Is it true virtualization? Does each of the VMs have its own kernel just like we would traditionally think of? So how did you manage to get it so fast? So there's nothing magic about it. There's only two things to it. The first part is that we just don't emulate much. Of the more heavy devices, it's just network and block device.
4:39 There's also vsock for kind of container control cases. So Firecracker itself is very small. We do rely on Linux KVM, which does a lot of things for us. And then the other thing is that, obviously, there's nothing stopping you from putting a huge application and a huge OS in the guest. But for all our use cases, our customers just use very tiny guest kernels and guest file systems and so on. You need to do both to actually get the fast and small effect. Yeah. So you mentioned that it's powering
5:18 Firecracker Use Cases
5:19 Lambda, or you can power that kind of use case. What other use cases are you seeing adopted outside of AWS? Is that something that people are sharing their stories with you? So there's some that we know about. Yeah. There's a list on our website, so let me just go ahead and read from that. Oh, I've got it here. You want me just to share my screen? Yeah. You can share your screen. So it's, like, right towards the top, the third paragraph or so. So we've made the website open source so people
5:49 can add themselves to that list now, and that's, like, the alphabetically sorted list of people that use Firecracker. Of these, some are, so firecracker-containerd is also an Amazon project. And then I think appfleet, Koyeb, OpenNebula, and all the Weaveworks stuff, FireKube and Ignite, those are actually, you know, other teams, other organizations that are using Firecracker to build their software. And then Kata Containers and, to some extent, also Weave Ignite and UniK, those are building blocks that, again, kind of feed into the ecosystem. All of these things are the use case
6:27 that I mentioned at the start. So, basically, multitenant containers or functions with very low resource overhead. Alright. Awesome. Well, I'm very excited to be playing with it today. So we're gonna run through getting started with Firecracker. In preparation for this, I have spun up an Ubuntu machine on Equinix Metal. This is a bare metal Linux machine, and I can say hello here. This is an unmodified vanilla Ubuntu 20.04 machine. So we'll walk through the entire process of getting Firecracker working on this machine and cover some of those use cases that Radu mentioned as well.
7:08 Hands-on: Getting Started & Downloading Binaries
7:08 So we're gonna work through the getting started guide from the GitHub repository, which people can find at firecracker-microvm/firecracker. You mentioned earlier, Radu, there's kind of a getting started section here in the README and then there's more within the documentation. Like, where should we start? Will it be right on this one here? I think Gabriel has already got a better grasp on it than I do. Yes. So the getting started section is a good place to take a look at how Firecracker works. And if you look, there's a sentence which starts with the
7:53 Firecracker binary, there's also another link there which says you can go to the quick start guide. But do go over the getting started guide first. Okay. So you want me to go to the quick start guide first? No. Start with the getting started section, and then we can go to the quick start guide. Okay. So I'm assuming we probably don't need to build it. We can just grab the release binaries. Yeah. Yeah. Okay. And this is AMD64. There we go. I forgot how to curl for a second there. Okay. So we have two binary files here,
8:52 Understanding Firecracker and Jailer Binaries
8:56 Firecracker 0.24 and jailer. Can you give us a quick TL;DR on what these are? Yep. So the Firecracker binary is Firecracker itself, which is the VMM that launches virtual machines. And the jailer is, you can regard it as a safety feature, which we recommend running together with Firecracker. It has the role of locking Firecracker inside network namespaces so that you can have an extra isolation boundary whenever you're running multiple microVMs on one host. The purpose of this isolation would be to, for example, mitigate the risks whenever a sandbox escape might happen from a microVM,
9:54 and that could, for example, try to bring down all the Firecracker processes. Alright. It's an interesting user and group ID this is downloaded with as well. I don't think I've ever seen it that high before. I'll just ignore that for now. Okay. So we've got our binaries. Let's jump back to our getting started guide. We don't wanna build. Sorry. Can I add one thing? Yeah. Of course. Go. So one thing to mention here is that we definitely wanted to add the virtualization layer as an isolation boundary, but we wanted to keep everything that was already
10:14 Hands-on: Starting Firecracker & API Socket
10:31 kind of a standard practice in the container world. So that's why we also have this jailer, which applies all the usual namespaces and cgroups and seccomp and things like that. So we basically wanted to add a security layer, not replace all the usual ones that Linux offers. Okay. Awesome. Thanks. Alright. Let's click on this quick start guide. I think we grabbed the binaries, so we can skip over that. We also need to make sure that we have KVM available on our kernel. Alright. Let's just run a quick update first. Let's see.
11:21 linux-tools-kvm. There's some modules. Yeah. Let's try linux-tools-kvm. I don't suppose either of you know if that's correct before I start. Yeah. I'm sure it'll be okay. That was a nervous confidence in my voice, but I'm sure it'll be fine. It's added to service. What's it called? Okay. I think you can just check for the /dev/kvm file descriptor. Yep. So that's what we need to see. Right? Yep. Okay. Yeah. Okay. Nice. We've got the binary and now, are the access settings correct for
12:27 the, I think, for the KVM? I think there was a step to, David is running as root. So. Yeah. I'm running as root. So I just assumed I'd be able to do anything. Sure we'll find out pretty quickly. Yeah. Okay. So, let me try and understand what's going on here. So we have to run. Is the Firecracker binary going to run some sort of daemon process that runs on the host? That's kinda what I'm guessing here by the fact that it's gonna create a socket for us to communicate with. Yes. I see you nodding. Okay.
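The check discussed here, that /dev/kvm exists and that the user who will run Firecracker can read and write it, can be sketched in Python. This is a minimal sketch rather than anything taken from the Firecracker docs; the helper name kvm_status and its three return values are made up for illustration.

```python
import os


def kvm_status(path="/dev/kvm"):
    """Describe whether the KVM device node looks usable by this user.

    Returns "missing" if the node does not exist, "ok" if it is readable
    and writable, and "no-access" otherwise.
    """
    if not os.path.exists(path):
        return "missing"
    if os.access(path, os.R_OK | os.W_OK):
        return "ok"
    return "no-access"


if __name__ == "__main__":
    print(kvm_status())
```

Running as root, as in the session above, would normally report "ok" whenever the device node is present; a non-root user typically needs to be granted access to /dev/kvm first.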
12:30 Running a Firecracker microVM
13:10 Yeah. Exactly. Whenever you want to start a Firecracker instance, you first need to run that command there, which says firecracker --api-sock. The point there is to create a UDS socket, which is used to instruct that Firecracker instance to, for example, set the number of vCPUs, set the amount of memory for your microVM, and all the properties that you may want to set for that microVM to run. So now there's an extra step that you need to do. You need to launch another terminal,
13:54 or perhaps you can launch this in a tmux session and have. Yeah. I know those terminals. Okay. We'll just leave that process running. Yeah. And we'll pop over here. Something that may not be super obvious: the general approach that we have is that there's one Firecracker process per VM. So there's not a single control process for all of them. It's kind of part of the security architecture, there's a one to one mapping. Oh, nice. And all of the vCPU threads are basically threads within that process. Okay. So does that mean that the API
14:11 Firecracker Architecture: One Process Per VM
14:35 sock that we've actually specified here is kind of arbitrary, and I would just create a new socket for each of the virtual machines. Right? Got it. Yeah. Yeah. Yeah. Got it. Like, with a thousand VMs, you have a thousand of these to talk to. Cool. I mean, if we're feeling brave, we can spin up a thousand VMs on this. Well, sure. But maybe not necessary just now. I think I did throw some sizable hardware at it actually. Yeah. We've got 64 cores and 300, 250 gig of RAM. We can run a few.
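Since each Firecracker process listens on its own Unix domain socket, a client needs a way to speak HTTP over a socket path instead of a TCP port. The following is a minimal sketch using only the Python standard library. The names UnixHTTPConnection, socket_path_for and api_request are hypothetical helpers, and the /tmp/firecracker-<id>.socket naming scheme is an assumption made for the example, not something Firecracker requires.

```python
import http.client
import json
import socket


class UnixHTTPConnection(http.client.HTTPConnection):
    """HTTPConnection that connects to a Unix domain socket instead of TCP."""

    def __init__(self, socket_path, timeout=5.0):
        # The host name is only used for the Host header; no TCP is involved.
        super().__init__("localhost", timeout=timeout)
        self.socket_path = socket_path

    def connect(self):
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        sock.settimeout(self.timeout)
        sock.connect(self.socket_path)
        self.sock = sock


def socket_path_for(vm_id, base_dir="/tmp"):
    """One socket path per microVM (hypothetical naming scheme)."""
    return f"{base_dir}/firecracker-{vm_id}.socket"


def api_request(socket_path, method, path, body=None):
    """Send one JSON request over the socket; return (status, response text)."""
    conn = UnixHTTPConnection(socket_path)
    try:
        payload = None if body is None else json.dumps(body)
        headers = {"Content-Type": "application/json",
                   "Accept": "application/json"}
        conn.request(method, path, body=payload, headers=headers)
        response = conn.getresponse()
        return response.status, response.read().decode()
    finally:
        conn.close()
```

With a Firecracker process started on the matching socket, a call such as api_request(socket_path_for(1), "GET", "/") should return the HTTP status and a JSON description of that one instance. A thousand microVMs would mean a thousand such socket paths, one per process.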
15:07 Alright. So. Actually, it depends on what you run inside the microVM. If you don't run a lot of user space processes inside the microVM, you can launch a fairly large number of microVMs. Because the memory is not allocated before startup, each page fault allocates a small chunk of memory, which is added for that Firecracker process. Alright. Okay. So we have our second shell. Can I just copy and paste this? No. You skipped a step which says that you should download a kernel and root file system. Yep. You're right.
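Because guest memory is only backed by host memory as pages are touched, one way to observe the effect is to watch the resident set size of the Firecracker process on the host. The sketch below reads the VmRSS field from /proc/<pid>/status, which is a standard Linux interface; the function names are made up for illustration, and the PID of the Firecracker process has to be found separately.

```python
def parse_vm_rss_kib(status_text):
    """Extract VmRSS (resident memory, in KiB) from /proc/<pid>/status text.

    Returns None when the field is absent (for example, kernel threads).
    """
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            # The line looks like "VmRSS:      3456 kB".
            return int(line.split()[1])
    return None


def resident_memory_kib(pid):
    """Read the current resident set size of a process on Linux."""
    with open(f"/proc/{pid}/status") as status_file:
        return parse_vm_rss_kib(status_file.read())
```

Comparing this number right after boot and again after running something inside the guest should show the resident size growing toward, but starting well below, the configured guest memory.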
15:47 Hands-on: Downloading Demo VM Images
15:54 Download these. You're running an x86 machine. Right? You should be clicking. Yep. Those two. Okay. FSL. This is our kernel and we also need our rootfs. So I'm assuming this kernel and rootfs that I've got here are really just for kind of playing around purposes rather than something I would actually run. I think these are, yeah, for the quick start guide, and, yeah, they may be used in some of the tests as well. This is one area where we're working now. We need to get a more kind
16:45 of a comprehensive method of hosting test resources because we're starting to get more combinations of test cases. But, yeah, I think right now, these will do. Is there tooling available for people that wanna build their own, you know, kernel and root file systems? Is that difficult for people to do or is it just cumbersome but trivial? For kernels, it's pretty straightforward as long as you take a look at our kernel config file that we host on the repository. For the file system, it's a bit more complicated. We do point at the guide in our
16:57 Building Custom VM Images
17:30 quick start guide somewhere. But it's not a nice root file system that you can, for example, use to store kernel modules. It's a simple file system that you build based upon a container, and then you can use it to start up microVMs. So to answer your question, as I just said, we are working on this, and I think that pretty soon we will be providing a nicer method of building root file systems. I mean, would it be possible to use container images as the root file system and
18:17 inject the kernel? That's exactly what we do, actually. Alright. So, for example, you can spin up an Alpine container, install all the packages that you want, and then simply dump that root file system to a. Alright. Alright. Well, let me copy this. It's gonna be a .sh and a nice friendly shebang. And I called this kernel and I called this rootfs. And then we have an image bucket URL, then it does a detection of, is this trying to download the root file system and kernel? Yeah. It actually downloads them. Yeah. Alright. Okay. So we could just save this.
18:40 Hands-on: Configuring & Starting the VM via API
19:25 Yeah. I don't think we need that. Alright? Yeah. We don't need that. Alright. Okay. Because we downloaded it manually, we can, okay. Got it. Yeah. The second step actually uses them. Okay. So that's it. Alright. We'll go through this in a second. Let me paste this in. I've already got a main. I'm gonna be horrible and call it main two. Okay. So what we have here is we're looking at the architecture again. I need to replace this with kernel. And because we already know the architecture, we can just clean this up. Right? So it's a bit
20:16 easier. Really should have used that. There we go. Okay. So this is curling through the Unix socket where we have our Firecracker daemon process running. It is using a PUT. Okay. So the Firecracker daemon is exposing an HTTP-style API where we can post or put the images into. And then we specify, is this just kernel boot parameters? Right? Yep. Mhmm. Okay. I understand some of this. Can I just run main two? Yep. And we got 204, which is success, no content, I believe. So that worked? Yes. Now you should follow
21:14 the next step and set the root file system. Alright. Fine. I was about to paste them anyway. So we change this to rootfs, and we're just doing the exact same again. We got a 204, that's good. And then this starts the machine, so there's some sort of actions endpoint where we can send commands to it. Alright. I'm not gonna bother saving this one to the script. This is no local file. Yep. Done. Do we have a VM? Yes. But you need to switch back to the other terminal. Oh, look at that. So the reason why you had to switch back
22:02 Viewing the VM Console
22:15 to the other terminal was that because you launched the initial Firecracker process in this terminal here, the standard in and standard out handles are still kept by this terminal instance. Okay. So this is the serial console of that machine, basically. Yeah. I'm assuming there are flags I can pass into that initial Firecracker call to just have it run detached there in the background? If you use the jailer, yes. You can detach it and output the serial console to /dev/null, for example. Okay. I'm assuming I can log in as root. Is that the next step in the
22:57 Logging into the Demo VM
22:58 docs? Should I pay attention? No. Okay. So what's worth mentioning is that, you know, we try to make Firecracker just behave as a regular process so you can do all the regular things. And in this instance, you know, I think the jailer does more fancy things to start it in the background. But in this case, it's just using the shared input and output connected to the terminal. So that's it. Okay. If I type password, will it log me in, or is it. It's root. Root. That was my guess too.
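The three API calls made in this part of the session (set the boot source, attach the root file system, start the instance) can be summarized as data. This is a sketch of the request sequence as I understand it from the getting started guide, with the field names and the default boot arguments recalled from that guide; the Swagger file in the repository is the authority for any given version. The function name boot_requests and the ./kernel and ./rootfs paths are placeholders.

```python
import json


def boot_requests(kernel_path, rootfs_path,
                  boot_args="console=ttyS0 reboot=k panic=1 pci=off"):
    """Return the (method, path, body) calls for a minimal boot, in order."""
    return [
        # 1. Tell Firecracker which guest kernel to boot and with which arguments.
        ("PUT", "/boot-source", {
            "kernel_image_path": kernel_path,
            "boot_args": boot_args,
        }),
        # 2. Attach the root file system image as the root block device.
        ("PUT", "/drives/rootfs", {
            "drive_id": "rootfs",
            "path_on_host": rootfs_path,
            "is_root_device": True,
            "is_read_only": False,
        }),
        # 3. Start the microVM.
        ("PUT", "/actions", {"action_type": "InstanceStart"}),
    ]


# Print the sequence so it can be compared with the curl commands in the guide.
for method, path, body in boot_requests("./kernel", "./rootfs"):
    print(method, path, json.dumps(body))
```

Each request is sent to the API socket of the one Firecracker process being configured, and a 204 response indicates success with no content, as seen in the session.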
23:35 Okay. And this is just Alpine Linux? Yeah. Yeah. Yeah. You don't have a network interface set up here. You need to follow another guide if you want to have a network interface set up. Okay. But I do have a full Alpine Linux virtual machine running on my bare metal. So that's pretty cool. Awesome. So out of the box then, Firecracker doesn't ship with any networking, so I'd have to configure this device to be able to connect to the network. Does Firecracker expose any hooks or extension points for, like, disks and storage or local mounts or anything like that?
23:42 Networking & Device Emulation Discussion
24:23 No. No. So I wanted to call out two things. So first of all, it'll be worth looking at the memory usage of the Firecracker process now and maybe after we do something in this Alpine. And then the other one is, by itself, Firecracker just kind of gives you a number of commands that you can then use to create tap devices and connect those to a network, and that you can use to map block devices and then mount those inside the guest. So basically the whole idea is to have
25:01 a very low level, I guess, in some terms, low level API that just hands over all the, let's say, the decisions to the local orchestrator on the host. Now there are high level libraries. There is a Go SDK, and there's an integration with containerd, firecracker-containerd, to kind of move things up the stack. But at this level, it's just connecting things to the HTTP API, and we'll have to create those resources separately. So, like, if we want a network, we need to create a tap device. If we want a file system, we need to
25:33 create a file and format it as a file system and then map it to the Firecracker process. Okay. Yeah. That makes a lot of sense. Definitely. Alright. Should I go back to our docs? What is this saying here? We have our TTY, and I type this login and password, root and root. When you're done, issuing a reboot command inside the guest will actually shut down Firecracker. Okay. And by default, our Firecracker VM gets one virtualized CPU and 128 meg of RAM. Cool. And so what was it you said there,
26:23 Radu? You wanted us to look at the memory consumption of the process? Oh, I think you've muted yourself. It should be less than what's actually configured due to the mechanism that Gabriel mentioned before. It basically only demand-faults in the needed pages. Okay. So what's our next step here then? Is there more? I think that's just building Firecracker. Right? Okay. So do we want to do anything with this Alpine virtual machine? Should we move on to a different tutorial? What do you think is best? So there's kind of two ways we can
27:09 go. We can go through the network and the block device tutorial to give it a network and a block device and do things with it. Mhmm. Mhmm. I guess we can also kind of skip to Gabriel's demo. He's using the Go SDK, right, to show a snapshot use case. Okay. So there's quite a lot in the documentation page here. So. Yeah. Maybe I can talk through a few other things. So the thing that's helpful, just to kinda explain. That would be great. So as you mentioned with the network,
27:49 we have kind of a tenet for being very minimal and only implementing the very basic things we need. And for example, if you look at most container workloads and most things like functions, they just use the network mostly, and sometimes they use the disk. And so we only implemented those two devices. And right now, there's a very simple back end. So for network, it's just a Linux tap device, and you need to precreate that tap interface and wire it up, you know, as you wish on your host. And then
28:23 you'll have that network interface in the guest. You can have as many of these as you want, and then you can create files and format them as file systems. And, again, you can map those as block devices to a Firecracker process, and they'll show up as drives in the guest that you must then mount. Obviously, in real use cases, all of these are set up once and then, you know, happen every time. Yep. And this is the primitive you use to set up your
29:01 for example, if you wanna use a container ecosystem, this would somehow be equivalent to runc. So it's, like, even under the containerd layer. And, hopefully, one of the things we're working towards is to make this seamless. So, like, if you wanna use it with containers, you'll never be at this level where we're at now. You'll kind of operate from the containerd level. There's also a firectl tool in the firecracker-microvm org, and, yeah, that's kind of the perspective. And so our focus is just on getting the primitive capabilities
29:39 to be really useful. That's kind of why we mentioned memory. The desire is that you have basically memory oversubscription by default, which is what you have with every Linux process. You know, the OS manages the memory for you. Right now, this is half true. We only demand-fault memory pages in, but then we never give the memory back by default. There's a ballooning feature, which, again, the system users can use to deflate the memory of the Firecracker VM. And snapshotting also affects memory utilization a lot, and we'll see that
30:18 in Gabriel's demo. And then the other thing is just to kind of get out of the way and start very fast. Right now, booting a simple OS takes a hundred and a bit milliseconds with that kernel, and then you're in user space. Then depending on what you do in user space, it can take, you know, a second or a hundred milliseconds or ten seconds. Yep. And snapshotting should also resolve this by resuming very fast to the point where you took the snapshot. And those are kind of the two core
30:50 properties that Firecracker is basically built to resolve while keeping everything secure and then integrating up the stack, which is definitely an area we're still working on. Okay. Can I ask a couple. Oh. I can hear myself. I can hear myself. It's weird. Oh, it's gone. Okay. So I'm gonna ask a couple of really naive questions then. So if I were to shut down this Firecracker daemon that is exposing this socket here, everything goes away. There's nothing saved. It's completely ephemeral. It just goes away. I
31:15 Firecracker API (Swagger) and Firectl Tool
31:34 spin up a new one. I get a completely fresh Alpine Linux. Okay. I'm assuming, based on the socket and HTTP API that you're showing to me, that if I just pull up the, no. Maybe one of my weird bash main files there. Yeah. Here, when we talk about the networking capabilities and, you know, creating disks that we can pass in, are those just additional parameters here on this or very similar to this? There's a different API for every resource. There's a file in the docs that's kind of generated from our API server,
32:16 which has the full API. Right. I think it's in Swagger format, so it should be ingestible by clients. Some of the existing clients, I believe, are generated from that. Okay. Yeah. That makes sense. Top. Sorry. Where was it? I might be wrong, but I think it's there. Yeah. It's in. No. You're searching for the Swagger file. You could just use the GitHub feature, which says go to file. There's a button on your right. Yep. Go to file. Yep. And type Swagger. Yep. And that's the one. Alright. So this is just a completely documented
33:05 API for that socket that has been available to us. Yeah. And there's a network resource there, which has all the networking options. Okay. You also mentioned there was a firectl, fire control, fire cuddle, like, all the different variations. Does that make working against that HTTP API a lot simpler? Is that just mapping gets, puts, deletes, etcetera, to make it feel more like kubectl, for example? It's the kind of application that you would use to simply start the Firecracker instance by giving the Firecracker binary a kernel and root file system, and then everything will start
33:53 up for you. So, yes, if you are wondering what it does behind the scenes, it wraps all the API calls that you just did using the terminal. Okay. Awesome. Well, why don't we move over to Gabriel's machine and take a look at a demo that is hopefully a bit more interesting than my Alpine one. But that was really cool. I love just how fast it is. And I like the architecture with the daemon and stuff like that. I think it's just such a cool project. Okay. We can see your VS Code
34:07 Transition to Snapshotting Demo
34:34 now. Yep. So in this demo, I'm going to show you one of my favorite Firecracker features, which is the snapshot feature. So to show the demo, I'm using a 200-line snippet written in Go. Could you just drag that window a little bit bigger and zoom in the font a wee bit, please? Yep. And actually the whole window just so it goes to the correct kinda aspect ratio. I think that'll just make it a bit easier. One second. Yeah. Perfect. That's the one. Okay. Thanks. Alright. So wrapping around all the curl requests that
34:45 Snapshotting Demo Introduction
35:29 David did, I'm using the Firecracker Go SDK. This component is one of the open source projects which you can find in the Firecracker ecosystem. So if you want to take a look at it, you simply go to its GitHub page. And, yeah, this is basically all that I'm using to make the demo simpler. For the microVM that I'm launching, I'm gonna use a few simplifying assumptions, and I'm not going to customize the number of CPUs or the memory size. I'm using only two vCPUs
36:17 for my microVMs, and I'm using four-gigabyte microVMs. And for the kernel parameters, I'm simply using what we put in the quick start guide. And these two files here are taken exactly from the quick start guide, and they're the same that David used. And then I also have the Firecracker binary, which I built myself from the main branch. I'm going to skip over this code, and I'm going to look at the flags that I've added for the demo. So the first flag simply launches a microVM given a UDS socket. That's what it does. You just,
37:09 like, pass that socket and then a microVM gets launched. And then as I was saying, we're going to take a look at the snapshot feature. So the first snapshot-related parameter is to-snapshot, which saves the microVM state to the snapshot file. And then the second will load the microVM from the snapshot file. So yeah. That's the demo. One other thing that I wanted to add: the current version of the Firecracker Go SDK doesn't support the snapshot feature, but I will add it as a PR after this demo. So
37:58 Snapshotting Demo: Initial VM Run
38:01 here we have the Firecracker binary and the precompiled launcher. So that's just, well, that's just come. So this is what I showed earlier. So right now, we're launching a microVM using a socket file called one.sock. Not very creative here. And, yeah, this is the microVM. As you saw, it launched pretty quick. I didn't have any timing features here, but, visually, it didn't look like any major time-wasting operations happened there. So let's do something in this microVM so that we can save its state. One very simple thing that we can do is to put
39:00 Snapshotting Demo: Modifying VM State
39:09 a message in the kmsg file. So let's just say echo state one to /dev/kmsg. And now if we look at the kernel log, we can see the string saved here. So, yeah, right now, we can save the microVM state. So what was the syntax? So to save the microVM state, I first need to tell it what the socket is because all the HTTP requests will be sent to this socket. And now I'm going to tell it to save the snapshot file, and let's just call that snapshot file state one. And so
39:38 Snapshotting Demo: Saving the Snapshot
40:15 for the to-snapshot operation, I've added a time measurement line, which simply measures how long it took to save the snapshot. And for this operation, it took two hundred and eighty milliseconds. It's not much, but it's not little either. We will see that if we do another snapshot, it will take less. And I'm not going to explain why that happens right now. So we see that this time it's 119 milliseconds. So let's go a little bit into what happened when we created this snapshot. So Firecracker has two ways of creating a snapshot.
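For readers following along without the demo launcher, the save step can be sketched directly against Firecracker's REST API on the Unix socket. This is a minimal sketch in Python, not the launcher shown in the video: the helper names and the output file names (`state1.vmstate`, `state1.mem`) are hypothetical, and request fields can differ between Firecracker releases, so check the Swagger definition for the version you run.

```python
import http.client
import json
import socket
import time


class UnixHTTPConnection(http.client.HTTPConnection):
    """HTTP connection that uses a Unix domain socket instead of TCP."""

    def __init__(self, socket_path):
        super().__init__("localhost")
        self.socket_path = socket_path

    def connect(self):
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        sock.connect(self.socket_path)
        self.sock = sock


def pause_request():
    # The microVM is paused before its state is written out.
    return ("PATCH", "/vm", {"state": "Paused"})


def snapshot_create_request(snapshot_path, mem_file_path, snapshot_type="Full"):
    # "Full" writes all guest memory, "Diff" only the pages dirtied so far.
    if snapshot_type not in ("Full", "Diff"):
        raise ValueError("snapshot_type must be 'Full' or 'Diff'")
    body = {
        "snapshot_type": snapshot_type,
        "snapshot_path": snapshot_path,
        "mem_file_path": mem_file_path,
    }
    return ("PUT", "/snapshot/create", body)


def send(socket_path, request):
    """Send one (method, path, body) request and return the HTTP status."""
    method, path, body = request
    conn = UnixHTTPConnection(socket_path)
    try:
        conn.request(
            method,
            path,
            body=json.dumps(body),
            headers={"Content-Type": "application/json"},
        )
        response = conn.getresponse()
        response.read()
        return response.status
    finally:
        conn.close()


def timed_snapshot(socket_path, snapshot_path, mem_file_path, snapshot_type="Full"):
    """Pause the microVM, create a snapshot, and report how long the save took."""
    send(socket_path, pause_request())
    start = time.monotonic()
    status = send(
        socket_path,
        snapshot_create_request(snapshot_path, mem_file_path, snapshot_type),
    )
    elapsed_ms = (time.monotonic() - start) * 1000.0
    return status, elapsed_ms
```

With a running microVM listening on `1.sock`, a call such as `timed_snapshot("1.sock", "state1.vmstate", "state1.mem")` would mirror the timing line added to the demo launcher.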
40:57 Explaining Snapshot Types (Full vs. Dirty Page)
41:06 One way is a full snapshot, and another way is a diff snapshot. So a full snapshot simply saves everything in the microVM. So, for example, if you run a full snapshot on a microVM, all the memory contents will be saved to disk. What happens when we run a diff snapshot? Well, a diff snapshot simply saves the pages that have been dirtied by the microVM, which is why the second snapshot took less time. So what happened after the first snapshot was that we created the file and all the pages were marked as clean.
41:52 And then all subsequent pages that were detected as dirty were saved in state two. And I guess we can look at the sizes of the files. Right? Yes. Exactly. And one side effect of saving less information is that we write less data to disk. So if we take a look at the two memory files, ls simply tells us that, okay, both files are four gigabytes in size, but they're actually not, because these are sparse files. And if we take a look at the files using du, we can see that
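The difference between the two snapshot types can be illustrated with a toy model of dirty-page tracking. This is a sketch for intuition only, assuming a simplified memory of fixed-size pages; the `GuestMemory` class and `apply_diff` helper are hypothetical and are not how Firecracker or KVM implement tracking.

```python
PAGE_SIZE = 4096


class GuestMemory:
    """Toy guest memory that remembers which pages were written."""

    def __init__(self, num_pages):
        self.pages = [bytes(PAGE_SIZE) for _ in range(num_pages)]
        self.dirty = set()  # indices of pages written since the last snapshot

    def write(self, page_index, data):
        if len(data) > PAGE_SIZE:
            raise ValueError("data does not fit in one page")
        self.pages[page_index] = data.ljust(PAGE_SIZE, b"\0")
        self.dirty.add(page_index)

    def full_snapshot(self):
        # A full snapshot copies every page, then marks all pages clean.
        snapshot = dict(enumerate(self.pages))
        self.dirty.clear()
        return snapshot

    def diff_snapshot(self):
        # A diff snapshot copies only the dirty pages, then marks them clean.
        snapshot = {index: self.pages[index] for index in sorted(self.dirty)}
        self.dirty.clear()
        return snapshot


def apply_diff(base, diff):
    """Layer a diff snapshot on top of an earlier snapshot."""
    merged = dict(base)
    merged.update(diff)
    return merged


memory = GuestMemory(num_pages=1024)
memory.write(3, b"state one")
state_one = memory.full_snapshot()   # contains all 1024 pages
memory.write(7, b"state two")
state_two = memory.diff_snapshot()   # contains only page 7
restored = apply_diff(state_one, state_two)
```

In this model the second snapshot is small because only one page was touched after the first snapshot marked everything clean, which matches the explanation given for the shorter save time.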
42:05 Snapshotting Demo: Comparing Snapshot Sizes
42:39 the state one memory file is 171 megabytes in size, and the state two memory file is only 348K, which is orders of magnitude less than state one. So let's restore, you know, a microVM from the first snapshot. So what we're doing right now is launching the second microVM. We create a socket file called two dot sock, and we are going to be using state one, the first snapshot. There's a short simplifying assumption that I took here, which is the fact that we are dumping
43:00 Snapshotting Demo: Restoring the VM
43:43 two files here. One is the memory file, and the second one is the disk content. So we've launched the second microVM, and we've restored it from a snapshot. The measurement says that it took only fifteen milliseconds. This is on my laptop, and it also includes the overhead of the Firecracker SDK. Typically, in production environments, we target about nine milliseconds, and you can also see that in our tests. I can show you the data. So now let's check that this microVM on the right was actually started from the snapshot that was created from
44:27 Snapshotting Demo: Verifying Restored State
44:32 the first microVM. And we can do that by looking at dmesg, and we can see the state one message here. These messages are due to the fact that we are restoring from a snapshot and timekeeping is not in sync with the other microVM. Oh, yeah. This is it. Nice. Thank you. Now I'm trying to stop the microVM again. Yeah. And it stops. I can publish this and send you links later if you guys want to take a look. Awesome. That's very cool. I like seeing how that
45:06 Post-Demo Discussion
45:23 snapshot all comes together and just how quick everything is. I think that just opens up so many possibilities for the applications of Firecracker in so many different domains that I hadn't even thought of before coming into today's session. Hold on. Let me pop this back over. It's such a really cool technology. I'm curious, then: what we've seen today is that the Firecracker daemon exposes an HTTP API, and I can communicate with that HTTP API to do all of this cool stuff. So this is quite low level, and you kind
46:01 Contribution & Higher-Level Integrations
46:01 of touched on this earlier. So now that this technology exists and has already done all these cool things, let's build all of these abstractions on top of it. Can people get involved? Like, you know, people watch this and think, that's really cool, I wanna help. What's the best way for them to do that? Yeah. So there are already a few things that are started. So there's the Go SDK, and right now, at least from what we've seen looking around, an integration with containerd is probably
46:31 a really good avenue to make this available to more people. And both of these things are projects that are already there, and people can contribute to them. I think there are definitely to-dos. They're in the same organization as Firecracker, so that's firecracker-microvm. There are other folks that have done other integrations, like Weaveworks has Weave Ignite. So I think that at this point, there is actually more than one answer. We're really open to everything. We're trying to support all of the integrations. So, yeah, I think there's a
47:00 Q&A
47:08 variety of projects to pick from on which to contribute. We're definitely open to pull requests for Firecracker itself, though we tend to be, like, laser focused on actually not having a lot of features. So we're just kinda working backwards from our use case all the time. And if we don't need it, then we're not gonna build it. But, yeah, at the same time, there's a bunch of activity now in the space of virtual machine managers. So there's a bunch of other projects, if people kind of
47:41 like virtualization, that they can get involved with. Alright. Awesome. Alright. We do have one question in here from Michael, who's asking, what are the ways to pass a desired IP configuration to the Firecracker VM? I'll pass that to Gabriel, I think. Yeah. Sure. So I'm trying to understand the exact question. Yeah. Do you mind if I rephrase it? Michael, I'm gonna just make some assumptions on what I think you're asking. But I can imagine a situation where I wanna run multiple Firecracker virtual machines and probably have
47:47 Audience Q&A
48:24 them communicate with one another. So, like, is it possible for me to create, you know, some sort of bridge network device and then have all of the Firecracker VMs have an IP on a pre-allocated CIDR for that kind of network? Right. So first of all, you don't have a way of forcing the Firecracker microVM, actually the user space inside the microVM, to have a certain IP address. You can only do that through configuration. So stepping back a bit, yes, as you mentioned, you can create tap interfaces
49:09 and connect each Firecracker instance to one tap interface. And from this point on, after the start of Firecracker, you have multiple options. You can, for example, have a DHCP server running on that Linux bridge and have the requests issued at startup time. One other way of doing it is the MMDS feature. The MMDS feature, I don't think I've mentioned it so far. It's like a scaled-down version of IMDS, which is basically a way of putting metadata inside the microVM. And then when you issue HTTP requests to certain IP addresses, you will get back that metadata.
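The MMDS flow described here has a host side, which puts metadata into the store over the API socket, and a guest side, which reads it back over HTTP. The sketch below shows both under stated assumptions: the metadata layout is hypothetical, 169.254.169.254 is the default MMDS address, and the `/mmds/config` body shown follows newer Firecracker releases (older releases enabled MMDS with a per-interface flag instead), so treat the field names as something to verify against the MMDS documentation.

```python
import json
import urllib.request

# Default link-local address on which the guest reaches MMDS.
MMDS_ADDRESS = "169.254.169.254"


def mmds_config_request(interface_ids):
    # Host side: choose which guest network interfaces may reach MMDS.
    return ("PUT", "/mmds/config", {"network_interfaces": list(interface_ids)})


def mmds_put_request(metadata):
    # Host side: store an arbitrary JSON document for the guest to read.
    return ("PUT", "/mmds", metadata)


def guest_metadata_url(path=""):
    # Guest side: build the URL for a key path inside the stored document.
    return "http://%s/%s" % (MMDS_ADDRESS, path.lstrip("/"))


def fetch_metadata(path="", timeout=2.0):
    """Guest side: read metadata as JSON. Only works inside a microVM."""
    request = urllib.request.Request(
        guest_metadata_url(path),
        headers={"Accept": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Hypothetical metadata layout for passing an IP configuration to the guest.
example_metadata = {"network": {"ipv4": "172.16.0.2/24", "gateway": "172.16.0.1"}}
host_requests = [
    mmds_config_request(["eth0"]),
    mmds_put_request(example_metadata),
]
```

The host requests are plain (method, path, body) tuples, so they can be sent over the API socket with any HTTP-over-Unix-socket client; the guest would then call `fetch_metadata("network")` at boot.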
50:06 And you can use this metadata, for example, to create maybe a systemd service which starts at boot time and gets that information from the store. Alright. Yeah. I think that makes sense. You dropped out a little bit there, but I think what you were suggesting is that the first option would be to have a tap device per VM and have it boot against a DHCP server and allocate the IP that way. And the alternative way would be to run a process via systemd that could call a metadata API, passing some identification, and then get back an IP
50:43 address and configure the device that way. Yeah. So on getting the actual IP address, you can be as creative as possible. You have lots of ways to do it, so you're not bound to these two options. Yeah. Yeah. Just have an NTP server and hash it, make up an IP address. That's what I'm gonna do. Exactly. So as an example of something that, let's say, we're using in our tests, but it's not entirely crazy if you wanna kind of shave milliseconds off the start time: you can basically use, like, some segment of the MAC
51:17 address, which you can configure when you configure the tap, sorry, the network device via the socket. You can give it a MAC address. And then you can, by convention between the host and the guest, have some bytes of the MAC address, or some hash of it, be the IP. We only do this in our tests just because it was simple, but, I'm not saying this particular approach, these kinds of approaches will often be very useful if you wanna start super fast. The vision is basically that we get to the point where the fact
51:52 that you're getting VM isolation just doesn't get in the way of your workload at all. So on Gabriel's laptop, it was fifteen milliseconds to start from a snapshot. As you mentioned, it's kind of nine on a server-grade machine. And, like, on ARM, it's even faster. It's two point five milliseconds. Wow. And, you know, there are some crazy optimizations that are nowhere near production, but down the line, we see this number going to one millisecond. And at that point, you basically, this is just a VM. So, again, you
52:27 need to start it, and the things inside it will resume exactly where they left off. But we're actually just walking a few steps back: all of this is in developer preview, so don't use snapshotting in production. It actually works very well as a technology, but we need to address the security issue, which is that the thing that Gabriel did, where he started a machine from a state, you can do that a thousand times, and not only can you do it, but it's desirable to do it, because you can
52:58 start at, like, a pre-warmed JVM application state. But right now, we don't have an easy solution for the security issues that would be introduced by cloning all the random things, just by actually, you know, having RNGs in the same state and so on. Okay. And we're working on that. We have some patches for the systemd team that will allow us to then, further down the stack or up the stack, integrate with OpenSSL and the JVM and so on, for these systems to kind of figure out that they've been restored from a snapshot and
53:34 do whatever it is they need to do to become safe again, but that's not there yet. So until that's there, this is like a cool demo feature, but nothing more. Don't use this in production. Okay. I mean, every time someone tells me not to do something in production, I always get an overwhelming urge to go and do it. But I promise I will try not to. Okay. It is compelling. So you can, like, literally start thousands of VMs in something like, you know, twenty, thirty seconds, and they're all up and you
54:02 can SSH to all of them. And they take, like, no memory, because it only faults in the pages that are being used at that moment by, like, the thing running in them. All the memory that was dirtied during boot and setup and everything is not brought back into physical memory. It is very, very efficient. Nice. Okay. I really wanna experiment with running FTD in Firecracker with the snapshots and spinning that back up. That to me just seems like a really, really cool use case. But we do have some more audience
54:33 questions, so we'll run through these, and then I'll let you get back to your day. So we have a couple more networking things. Nerd has asked, is there an option to hook into IPAM utilities to provide IP addresses without DHCP? I guess that this depends on how you're building your service. I'm not exactly sure how IPAM works. But if you build your service in such a way that you can access it from the microVM, then I suppose that you could hook it up. Alright. We'll come back to that. I don't
55:14 know if there is a good answer to that, to be fair, so we'll move on just now, but thanks for the question there. Michael's back and asking if there are any docs on option two. So do you have any canonical implementations of that, like, systemd metadata IP address thing? Is that something that you do internally, or is there documentation? We don't have any proof of concept that uses it, but I think that we can provide some information on how to do it. If you take a look at the MMDS documentation inside the Firecracker repository,
55:55 it's pretty well explained how to create a request and how to get the metadata from there. And from that point on, whether you're using systemd or OpenRC or any init system that you like, you can adapt it to your environment. Yeah. I mean, I guess I wanna make sure that my understanding of it is not, you know, completely silly, but it's really just a script that runs on the VM. It calls an HTTP endpoint and says, I am this thing, give me an IP address, and it's literally just configuring it as a standard Linux interface.
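The guest-side script described here can be sketched as a small program that turns the metadata it received into ordinary `ip` commands. This is a minimal sketch, assuming a hypothetical metadata shape with an `ipv4` key in CIDR notation and an optional `gateway` key; how the metadata is obtained (MMDS, DHCP, or something else) is left out, and the interface name `eth0` is an assumption.

```python
import ipaddress
import subprocess


def ip_commands(metadata, device="eth0"):
    """Translate metadata into iproute2 commands for one interface."""
    # ip_interface validates the address and prefix, e.g. "172.16.0.2/24".
    interface = ipaddress.ip_interface(metadata["ipv4"])
    commands = [
        ["ip", "addr", "add", str(interface), "dev", device],
        ["ip", "link", "set", device, "up"],
    ]
    gateway = metadata.get("gateway")
    if gateway is not None:
        gateway_address = ipaddress.ip_address(gateway)
        # A default gateway outside the interface's subnet is unreachable.
        if gateway_address not in interface.network:
            raise ValueError("gateway is not inside the interface network")
        commands.append(
            ["ip", "route", "add", "default", "via", str(gateway_address)]
        )
    return commands


def apply(commands):
    """Run the commands in order. Needs root inside the guest."""
    for command in commands:
        subprocess.run(command, check=True)


planned = ip_commands({"ipv4": "172.16.0.2/24", "gateway": "172.16.0.1"})
```

Keeping the command construction separate from `apply` means the same logic can be wired into a systemd unit, an OpenRC script, or any other init system, and can be checked without touching a real interface.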
56:32 Yeah. Okay. Yeah. So, actually, the thing that we mostly do is just try to create this very much not-in-your-way virtualization layer. And after that, you have a Linux system, and you do what you need with it. Yeah. And this is a conscious decision. We are very supportive, and some of those other projects that I mentioned are kind of developed by peer teams of ours. And we'll definitely all together invest in making it easy to actually run, you know, containers or entire Kubernetes pods with microVM isolation. But
57:05 for Firecracker itself, it's just like a box with strings, and then you can attach them to what you need based on use case. We're definitely gonna support any other integrations that are not, you know, from firecracker-containerd and all of that. So I think that's actually a positive. It's kinda more tedious to get started at this level, for sure, but you can do anything you want with it. And I think there are integrations with Nomad and things like that, which are not, you know, the regular Kubernetes path. Okay. We'll just run through two more.
57:40 So we got a question here asking, could we use Firecracker and Kubernetes with Kubernetes networking? Is this something that Weave Ignite does already? Yeah. So I don't have the full answer, but there are two things that Weaveworks has done. There's Weave Ignite, which basically kinda has a Docker-like interface for Firecracker VMs. And then there's Firekube, which I believe just does Kubernetes pods with Firecracker isolation. And then there's also Kata Containers, which runs Kubernetes pods with VM isolation. And I think they have an option for Firecracker or Cloud Hypervisor, if I'm correct. I
58:22 need to check up on the full list, but they have a number of VMMs that you can use, Firecracker included. So those are, I think, the strict Kubernetes integrations that I'm aware of. And then there's also, as I mentioned, Nomad, and I think OpenNebula has something in this direction as well, just to get your workload running with VM isolation. Alright. I just wanna say that, you know, I love this technology, but you only need it if you actually, like, if you have a single trust boundary in your application, then you
58:53 don't need to segregate like this. It's more trouble than it's worth. But if you have, like, a big machine and, you know, you wanna run workloads from ten customers or even, like, teams in your own organization, and you don't want their binaries knowing about each other, then this is basically what we're working towards. We wanna make this easy and very, very secure. Okay. So we also got a question asking, any idea about snapshotting support in Kubernetes? I'm not entirely sure what's being asked there, but do you wanna throw an
59:28 answer at that? Yeah. The only thing that I'm aware of is that there is a project called CRIU, which is about kind of snapshotting the runtime of a container. Okay. And we've briefly synced up with them on the snapshotting thing to make sure that the approach works for both. But other than that, I don't know of anything else. I think in the long run, we definitely want everyone that uses Firecracker to take advantage of this. I'm hoping this will become the standard, because once we figure
1:00:02 out all the security implications and solve them and make them easy, I think this is by far the best way. I think it's realistic to just start entire pods in a few milliseconds. You kind of warm them up once, snapshot them, and then you just have almost instant startup times everywhere. But I think that's a couple of years off at this point. Okay. Thank you for that. Nerd, I do see your other comments on IPAM. Hopefully, Gabriel can get to those after the show in the comments or wherever, or anyone
1:00:31 else that's watching that has that knowledge. We'll finish with one last question then, which is a simple one: is there any managed Firecracker offering? Is that something that AWS offers, or is that just Lambda? Yeah. So there are no, like, managed Firecracker instances or anything like that. You'll be implicitly running in a Firecracker microVM if you run a Lambda function or some types of Fargate containers today. And, yeah, as I mentioned in the beginning, there's a bunch of other integrations with open source tools. You can take a look at
1:01:09 that. Alright. Well, thank you, audience, for all of those questions. We are gonna call it there, though. Radu and Gabriel, thank you so much for joining me today. That snapshotting demo was awesome. Walking through the getting started was also great, just seeing how quick that was to do. This is a really cool technology. Hopefully, people will have their interest piqued and will check it out. And please check out the GitHub repository and contribute where you can. There's lots to be done. Thank you again. Have a wonderful afternoon, and I'll speak to you both soon.
1:01:12 Conclusion & Thank You
1:01:39 Thank you as well. Bye. Thank you.