Overview

About this video

What You'll Learn

  1. Model cross-service waits with durable promises and shared awakeable IDs.
  2. Make CRUD handlers idempotent by treating validation and duplicate writes as terminal errors.
  3. Deploy Restate as a distributed runtime with high-availability failover across nodes.

Jack Kleeman walks through Restate, a single-binary durable execution engine written in Rust. We cover context.run, keyed state, suspending and resuming workflows, context.promise, and live code a TypeScript write handler.

Chapters

  1. 2:33 Introduction to Rawkode Academy and Guest
  2. 2:58 Introducing Restate
  3. 3:24 Guest Introduction and Motivation
  4. 6:15 The Challenges of Distributed Systems
  5. 9:10 Restate's Core Concept: Durable/Suspendable Functions
  6. 12:34 Simplifying Code with Restate's API
  7. 14:09 Restate's Architecture and Position
  8. 15:04 Restate's Rust Core and Language SDKs
  9. 15:50 Operating Restate: Single Binary & Performance
  10. 1:19:13 Preparing for Hands-on Demo
  11. 1:20:04 Restate v1.0 Release and Stability
  12. 1:21:28 Exploring Core Concepts: `context.run` & State
  13. 1:42:53 Durability Demo: Suspending and Resuming
  14. 1:49:58 Real-World Use Case: User Registration Workflow
  15. 1:55:56 Implementing the Workflow with `context.promise`
  16. 2:00:18 Host's Architecture and the Distributed Write Problem
  17. 2:08:56 Live Coding: Building a Restate Write Handler
  18. 2:21:11 Idempotency and Generic Handlers for CRUD
  19. 2:27:53 Restate Roadmap and Future
  20. 2:28:51 Conclusion and Planning Part Two

Transcript

Full transcript

Generated from the English captions. Timestamps jump the player to that moment.

2:33 Introduction to Rawkode Academy and Guest

2:33 Hello, and welcome back to the Rawkode Academy. I am your host, Rawkode. Although, to my wife and kids, David Flanagan, and to my neighbors in this office box, the guy that has been playing Linkin Park far too loud for the last forty-eight hours. So welcome back. Today, we are carrying on with a Rawkode Live series in which we take a look at awesome projects in the cloud native and wider space that help make building distributed systems easier. And today, we are taking a look at Restate, which is found at Restate.dev, and I am joined by one of the maintainers

2:58 Introducing Restate

3:07 Hi. How's it going? Not too bad. Not too bad. How are you? Yeah. Just enjoying the day. It's a nice easy start to the week, especially when I get to sit in front of the camera and play with some awesome technology. But before we get into that, why don't you tell us a little bit about you? Sure. Yeah. So I'm Jack. I'm a senior engineer at Restate. I focus mostly on the cloud platform and the Go SDK and a little bit of the TypeScript SDK, which I think we'll be getting into today. I spent pretty much my entire career in

3:24 Guest Introduction and Motivation

3:38 cloud native, starting off at Monzo where I was doing, like, security for Kubernetes, basically. And that was pretty fun because we had literally 1,500 services at the time, and I think now they're at, like, 3,000. I actually went really viral on Twitter, like, four years ago for tweeting an image of all of our services talking to each other, and all these people were retweeting it being like, this is such a bad design and transient failures are gonna destroy you, all that kind of stuff. I'm like, oh, wait. The retry failure application. And so that's kind of what started to

4:10 get me really passionate about, like, making distributed systems work, removing these problems, abstracting them away. After that, I went to Apple. I worked in SRE. I saw what it's like to, like, really maintain extremely large complicated systems, like highly multi-tenant teams, teams that maybe don't always do the right thing. And that was really fun as well. Yeah. That led me to Restate where, I guess, our mission is to try and make it easier to build complex interrelated systems. I mean, even when you have two processes running on the same machine, you have a system that can fail in

4:46 their communication. Then you have two different machines, it gets even worse, and two different data centers in two different countries. So many companies are struggling with just splitting one process into two, and I guess Restate is trying to make those things a lot better than they were before. So yeah. That's funny. So is the TLDR from that that we should just build monolithic applications and deploy them on a single box? I think if that box is a mainframe, then yes. I think absolutely. I mean, I guess for me, like, the pitch I always had at Monzo was microservices

5:20 weren't solving a tech problem there. They were solving, like, an organizational problem. Like, they made it so much easier to have teams that don't have to, like, have shared design reviews or have, like, yeah, shared, you know, complexity, lots of concern about shared libraries and things like that. Instead, we just had these really simple RPC contracts, and teams could just develop really quickly independently. And that's what got me hooked on this. And, yeah, to make that work, you need this, like, incredible platform team, and they were so good, the team at Monzo.

5:53 And, yeah, that's what got me passionate about it. Like, if you can make the platform good enough, then you can get it to the point that it's almost as technically good as being on a mainframe, like a massive IBM machine. But you get these massive developer experience advantages if you can allow teams to operate independently and so on. So, yeah, that's my mission really, to make this better. Nice. I love that we have a very similar mission. I mean, I think we've both been in this cloud native space for a while. And I always tell people that microservices

6:15 The Challenges of Distributed Systems

6:24 are really simple. That's the whole point of them. Right? You're supposed to be able to fit them in your head and be able to rewrite them rather than change them, but that complexity doesn't disappear. It's been pushed down to the platform and infrastructure layers where they have to build a lot of tooling and automation and remediation. Everything has to happen there. But that's in the name of let's make developers' lives easier. Let's have them focus on what they enjoy doing. And I think, yes, there are a lot of trade-offs with every approach to building software.

6:53 But I think in 2024, when we have things like Restate and other tools, as we'll see today, these harder challenges are getting easier too, which is wonderful, and I'm excited to share that with people. Absolutely. Yeah. So the methodology that seems to exist, like, in the world and what I've always experienced is, like, you have some kind of mishmash of, like, KV stores, queues, RPC frameworks, retries, and the end result can be, like, really reliable and really good. But that's quite a lot of mental load for engineers, and they have to worry about things like idempotency

7:27 and just lots of new concepts. So my hope is that we can start to unify some of these tools together and just bring the code back to being 95% business logic. You can never pretend that the distributed system's not there or, like, the network isn't there. That's a fantasy. But can we get much more of the proportion of the code to be business logic and not instantiate a Kafka client or, you know, is this a retry, or check the token against the DB to see if we failed? These sorts of things occupy a lot of time

8:02 at microservices businesses, and that's what I'd like to sort out. Nice. Now I don't wanna spoil too much for the people that are watching, but I will clarify that I am using Restate right now in the Rawkode Academy infrastructure. And the first thing we deployed turned out to be the easiest kind of code I've ever read in my life, but it's actually quite complicated what's happening underneath the hood. And it's that when people register for the Rawkode Academy website, we also create a user account for them. But we don't activate their account until they've

8:32 verified their email address. And we were able to hook that up with one line of, like, Restate magic that just kinda sits there forever and says, I'm gonna wait until you've proven to me that this account is active. It is astounding to see. And Yep. We're gonna show And this is all running serverless as well. Right? So when we say wait, what do we really mean? I guess we'll get into that, but that's the exciting thing. Awesome. Alright. So I've went on a tangent already. Let's back up a bit for people that are

9:00 watching, and they're still like, wait. What the fuck is a Restate? Like, maybe we can give them the pitch. What does Restate deliver for the average developer that wants to build fault-tolerant distributed systems? Yeah. So I guess I'll start with sort of the primitive that we offer, and then maybe I can go into, like, how it fits inside the infrastructure. So what we offer is suspendable functions. And what I mean by that is code that can save its progress kind of at any point, and resume later. So if you've come from
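
The email-verification wait described a moment ago (create the account, then wait until the address is proven) can be pictured with a toy promise store. This is only a sketch in plain TypeScript, not the Restate SDK: `PromiseStore`, `registerUser`, and `verifyEmail` are hypothetical names, and everything lives in memory, whereas a durable engine would persist the promise and let the process shut down while it waits.

```typescript
// Toy in-memory stand-in for a durable promise store; NOT the Restate SDK.
class PromiseStore {
  private resolvers = new Map<string, (value: string) => void>();
  private promises = new Map<string, Promise<string>>();

  // Get (or create) the promise registered under this ID.
  promise(id: string): Promise<string> {
    let p = this.promises.get(id);
    if (!p) {
      p = new Promise<string>((resolve) => this.resolvers.set(id, resolve));
      this.promises.set(id, p);
    }
    return p;
  }

  // Complete the promise from anywhere that knows the ID.
  resolve(id: string, value: string): void {
    this.promise(id); // make sure it exists even if nobody is waiting yet
    this.resolvers.get(id)!(value);
  }
}

const store = new PromiseStore();
const activated: string[] = [];

// Reads top to bottom: wait for verification, then activate the account.
async function registerUser(email: string): Promise<void> {
  const verifiedBy = await store.promise(`verify:${email}`);
  activated.push(`${email} via ${verifiedBy}`);
}

// Called later, for example when the user clicks the link in the email.
function verifyEmail(email: string): void {
  store.resolve(`verify:${email}`, "email-link");
}
```

The point of the shape is that the waiting code and the code that ends the wait only need to agree on an ID; neither has to know where the other is running.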

9:10 Restate's Core Concept: Durable/Suspendable Functions

9:30 a Go background, this might sound a little bit like a distributed coroutine. If you come from a JavaScript background, maybe you might think of this as being like a distributed async runtime or, like, async event loop. And, yeah, this is a little bit of a new concept, but the idea is that when your program fails, for example, like, the machine dies or crashes, we can resume from that point onwards. But it's a lot more interesting than just failure because there's also this idea of, like, sometimes code, it's a lot more helpful if you could just think of it as waiting for

10:03 something, like, for a long time, like, even for a week. And you can't really write code that way because you're guaranteed to hit some kind of infrastructure failure if you wait for long enough. Or if you're running serverless, you're being billed by the second. So the fact that the code can sort of invisibly or somewhat invisibly save its progress, shut down, and then resume at that point at a later date when the thing it's waiting for is done, that's a really useful primitive. Sounds like that's the one that you were using, which I imagine we'll go into. But,

10:33 so that's what we're trying to offer. The way that it works is by writing down more. So applications right now are not particularly stateful, or we don't see them as being super stateful other than what they save in the database. And saving something in the database is something that engineers have to, like, deliberately do. And kind of the more you write down or the more you save, the better. And when we write to Kafka, we are also writing something down. I mean, it may not feel that way, but Kafka is like a durable

11:00 log and can actually store messages for a really long time. Right? So we are making it much easier to write things down. For example, when you make an RPC between two services, that's gonna be persisted. When you do something like create an idempotency token, that's gonna be persisted, and lots of other things as well. And we try and do this without really changing the way you write code at all. The end result is that the program can be stopped and then resumed and all the previous values filled in and then continue from where it

11:35 was. And that's what allows us to do things like sleep for a week or wait for a user input for a year. I mean, these sorts of suspendable properties are really helpful, but it's also really helpful for failure of any kind. I mean, obviously, some types of failure, like user failures or, like, the infrastructure just isn't running, these things are difficult to deal with, but things like the network failing or the user closing the browser tab or these sorts of things. By writing more down, we can allow complex workflows to resume from the point at which

12:04 they failed. And this makes it a lot easier to reason about idempotency because if you think about, like, 10 steps and you get to the ninth, if we then start from the beginning, making that whole thing idempotent is quite tricky. But with Restate, you kind of only have to worry about that last step. Like, maybe it could be run again if it, like, failed before we managed to save the result, but we know that the previous steps that were saved, they will never run again. And that makes things a lot easier for engineers. So this is sort of... Can I challenge
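
The ten-step reasoning above can be sketched with a toy journal. This is an illustration in plain TypeScript under stated assumptions, not Restate's implementation or SDK: `Journal`, `tenStepWorkflow`, and `runToCompletion` are hypothetical names, and the journal here is an in-memory array where a real engine would persist each result.

```typescript
// Toy model of journaled steps; NOT the Restate SDK.
type Step<T> = () => Promise<T>;

class Journal {
  // Results of completed steps, in order. A real engine would persist these.
  private entries: unknown[] = [];
  private cursor = 0;

  // Call before each (re)execution of the handler to replay from the start.
  rewind(): void {
    this.cursor = 0;
  }

  // Run a step once; on replay, return the saved result instead of re-running it.
  async run<T>(step: Step<T>): Promise<T> {
    const index = this.cursor++;
    if (index < this.entries.length) {
      return this.entries[index] as T;
    }
    const result = await step();
    this.entries.push(result);
    return result;
  }
}

// Counts how often each step's side effect really executed.
const executions: number[] = new Array(10).fill(0);
let failNinthOnce = true;

async function tenStepWorkflow(journal: Journal): Promise<number> {
  journal.rewind();
  let total = 0;
  for (let i = 0; i < 10; i++) {
    total += await journal.run(async () => {
      executions[i] += 1;
      if (i === 8 && failNinthOnce) {
        failNinthOnce = false;
        throw new Error("simulated crash in step 9");
      }
      return i + 1;
    });
  }
  return total;
}

// Retry the whole handler until it completes, roughly as a runtime would.
async function runToCompletion(journal: Journal): Promise<number> {
  for (;;) {
    try {
      return await tenStepWorkflow(journal);
    } catch (_err) {
      // The journal survives the failure; loop and resume from it.
    }
  }
}
```

Running `runToCompletion(new Journal())` executes the ninth step twice (once failing before its result was saved, once succeeding) and every other step exactly once, which is why only the step that was in flight needs to be idempotent.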

12:34 Simplifying Code with Restate's API

12:34 one of the things you said there? Yep. You said that it doesn't change the way that we write code. And I've actually found... no, I'm not saying, I'm not gonna say you're wrong, but I kinda am. Right? I've actually found this made my code a lot simpler because we have this linear flow of what things are supposed to do, and I've not had to reach for, like, state machines and complex modeling of these situations anymore because we are able to map things as a linear progression of events. And I have found that things are

13:03 just a lot easier. I'm not sitting there going, oh, I'm in this... You're right. ...waiting for activation, and I'm not having to worry about idempotency of the first nine steps. You're right. Like, that is a challenge, but that suspendable linear progression does change the way that we write code, but for the better, drastically. So I think what I really mean is you don't have to write the code much differently to if it was just running on your laptop. And, for example, waiting for input, the process was just waiting or sleeping for a month. It was just

13:27 sleeping, you kept it on under your desk or whatever, which is a perfectly fine way to write software. But the moment that you go serverless or really into any, like, Kubernetes environment, all these things become somewhat impossible. So that's the world we want. It's, like, super imperative, like, coding on a Commodore 64. The thing that the code says it's doing is what it's doing. The right way to do it without Restate, as you said, is, like, actors and queues and complex systems, and they do change the way you write code. And there's absolutely nothing wrong with

13:59 them. They are amazing tools, but our hope is that we can distill down their properties with a different API. And maybe that brings me on to the position in the infrastructure because, essentially, we do look a little bit like a message queue. We look a little bit like a Kafka. We're a stateful message queue. It's a streaming system, and we have similar properties to if you split your application up into lots of event handlers that, you know, every single step is its own event handler, and you would just kind of hop between each step with the

14:09 Restate's Architecture and Position

14:30 state that needs to be sent between each step. And that, by the way, is exactly how we built complex distributed systems at Monzo: we would just put everything into handlers. If it needs to be reliable, it's in another handler. And that is really a lot harder. So, yeah, that's what I'm excited about: maybe you can just write, do this thing, then do this thing, then wait for this thing for a week, then sleep, whatever you wanna do. And it would just look like the simplest code, but has different properties. Alright. I'm really disappointed that we're fifteen

15:04 Restate's Rust Core and Language SDKs

15:04 minutes in and no one has said Rust yet. So I just thought I'd get that out of the way. I try not to go that long with it. That's true. But we get criticized for advertising that we're for Rust. And part of the reason for that was that until about this week, we didn't have a Rust SDK. So now we do. We can say that we're There's a Rust SDK. That is as of, like, right now. Yeah. Yeah. Okay. That's just changed the entire session. Okay. Well, maybe we should talk about that

15:31 as well. But we have been criticized in the past because it's like, yeah, you're building with Rust, but you're not building for Rust people. Okay. Well, now we are. But it's true that our first SDKs were Java and TypeScript, now we also have Go, Python, and Rust. So things are expanding fast on that front. I think there's obviously a lot of frustration in the developer community where people are rewriting things in Rust that maybe don't need to be rewritten in Rust. But I think something like Restate should be written in Rust, but it doesn't necessarily mean the end

15:50 Operating Restate: Single Binary & Performance

15:58 consumer is gonna be writing all their stuff in Rust. And I think I think they need to know TypeScript is yeah. Yeah. I don't think they need to mind, but I think the really exciting thing about Restate from, like, an operator perspective is that it's like a single binary. It's really easy to get running. Like, you can run it on your machine. That's not like the dev server. When you type restate-server and press enter, you're running, like, the actual thing, and it runs exactly the same way on your laptop as it would when it's

16:23 running on cloud or when it's running, like, on an AWS box or whatever. And I think that's cool, and Rust gives us a lot of nice properties there about it being very low dependency. And we don't ever wanna have to rely on an external, like, etcd or something like that, partly because, you know, several of us at Restate have had really negative experiences running complex distributed systems that have, like, 10 components and rely on, like, other distributed systems as well and so on. So we want it to be so easy

16:51 to operate, and we also want it to be really, really, really fast. Like, we want complex workflows to have single-digit millisecond overhead from all of the writing things down. And to do that, you need to build a new kind of streaming storage system, and making that distributed and fault tolerant and still really performant, it's not very easy. And, fortunately, we have some smart people working on that. And they are using Rust, and thank goodness for that because I think it's a big help on performance for sure. And I can't write C++ to save

17:23 my life, so I don't know what else we'd be doing. Yeah. Rust is a great language for it, especially with SPTK and IO integration. It's like right into Linux disks. It's just never been faster. There's a lot of really exciting work going in that space. Yep. Now I'll just call it out because you didn't. You know, I'm gonna say nine out of 10 times, the problem with distributed systems is etcd. Like, it's not fit for purpose. We should not be running it. I don't know why we're in this situation with Kubernetes. You're right.

17:50 Even better is when you have your DNS backed by it, then you're really winning. But, yeah, I mean, god, it's such a nightmare to run. And I'm so glad that we don't rely on anything like that. ZooKeeper is worse, I think it's fair to say. At least Kafka finally got off of that one. Yeah. I do like that Kafka went and started using KRaft for that. I think that's why I stopped using Kafka, because of the ZooKeeper stuff. I had too many problems. I started looking out there. Redpanda is a great one. It

18:15 requires no ZooKeeper. We've taken a lot of inspiration from them. They're also obsessive about this zero dependency, simple single binary, this kind of stuff. This is really, really awesome. Except they went C++ and you went Rust. So Yeah. Exactly. Alright. Russell's in the chat saying, so this is the async/await of event-driven programming. Right? Absolutely. I often use this expression, distributed async/await, but it doesn't mean something to everyone. And I think this is something we struggle with in Restate because depending on what language you come from, I should probably be using different words.

18:48 And the SDK is also, of course, different depending on the languages, but, like, the thing we're really passionate about is the runtime and, like, the thing that makes this performant and successful. And we want the SDK to be whatever makes sense to our users, but it hurts us in communication because, yeah, async/await turns off anyone who doesn't like async/await, which is, like, I don't know, half of engineers, maybe. Alright. Well, I am gonna get my screen ready for us to start taking a look at the docs and show people where they

19:19 can get started with us as well. So, yeah, thank you, Russell, for jumping in the chat, and then someone confirmed that. I don't know if Ahmed is maybe on the Restate team? Yes. He's a principal engineer working on the runtime. If we have any really complex questions about the distributed architecture, you know, maybe he could reply in the comments. And do we have anyone here that's gonna tell us how to work the Rust SDK? Because now that's the only thing I want to do. Definitely not me, I've never tried it so far. I have no idea who's watching.

19:52 No one hasn't ever changed it. I thought let's try. We'll do a second one. Don't worry. Alright. I am gonna throw up the screen share, and we can just talk about how people get started with that. I mean, is there anything I've missed that we haven't covered? Like, do you feel that we've given people a good overview of what Restate is and what it delivers? Yeah. I hope so. Righty. So here is the Restate.dev website. It is 1.0 with a cloud offering and seed funding. I mean, it feels to me that Restate

20:27 is about two weeks old. I know it's not because we've been chatting for a while now, but, I mean, you did not waste time getting to that 1.0, I've gotta say. Like, a lot of projects these days, especially in cloud native, I feel like they've been around for ten years and they're still in 0.12.176. Like Yep. Clearly, you set a mission and you're like, we want to deliver this. This is what we're gonna deliver, and we got there. Maybe you could talk about that. Mission.

20:50 And really what 1.0 is about is stability promises, and that's a really scary and stressful thing as an early stage startup. It's like, are we sure that our feature surface now is the one that we want to, like, stand by for I don't know how long. But, yeah, we have to be stable, really, to persuade people to rely on us for their, like, most critical use cases. We're not like a monitoring tool. If it goes down, it would be really annoying, or if it changes the API, it would be annoying, but fundamentally, it's fixable.

21:19 We are asking permission to be one of the most critical components of infrastructure, and that's why stability is so important. K. Awesome. So if we scroll down, you're kind of already summarizing a few of the main use cases. We've got workflows as code, and we can see a nice example here. I'll quickly walk through it, if I can see what that's doing. So you're creating a new Restate service, and there's a handler. And then maybe we can just give people a bit of context on what context.run means. I mean, this is how Restate handles, what, side effects? Is that the right

21:50 way to put it? Yeah. So things that you want to happen ideally only once. And once they've happened, we'll save the result, and we would guarantee that on failure or, you know, anything like that, it won't need to happen again and its result will just be filled in. Nice. And there's also the Java example sitting here front and center, which has already hurt my eyes. But, you know Yeah. That's the void. Yeah. I think that's an important... I think that was really clever. I don't know who at Restate said, let's, you know, like, add Java

22:19 early. But, you know, the enterprise world and the Java world have been doing BPMN and workflows and event-driven systems and actor-driven systems and all this for decades. Like, these are not new paradigms for them. So I think that's a really slick move to just support that world completely. Also, all of our founders, they built Apache Flink, which is like a streaming system, which is all written in Java. So thank god Restate is not written in Java because I personally really, really, really hate Java. But these guys really know how to write Java, they

22:51 really understand that world. So I think it was always a big priority for us. I think Java is... I can't personally see it ever getting back into developer hands. Like, developers are front forward future facing, front facing, whatever you wanna say. It just feels like legacy to me, but then I am wearing a Go t-shirt. I am writing Rust day in, day out. I am playing with shiny new languages. You know, I could sit here and talk about Pony lang for the next six weeks, and nobody would even know what the hell I'm talking

23:17 about. But it's a super cool language, even if slightly academic. And I don't know if Java can make... Java is where the work is getting done for most of the world. And I guess for us, like, the more traditional your software systems are, like, the more excited we are because I think we have a huge amount of value to people that are cloud native, but many of those people have already learned how to solve these problems that I'm discussing. I think they can solve them easier with Restate. But there's another group of people that it's like,

23:45 they are actually struggling to get off of mainframes or they are struggling to get off of monoliths. And for me, like, that's even more exciting because there's so much value to be unlocked there if we can help them do those things. Alright. Awesome. I'm gonna pick one more randomly just to see what code it gives us. I mentioned state machines, so let's pop that open. So this one actually has a state machine example implemented with a switch case. Not the worst way to do it. So there's actually something coming out this week. I don't know if you've used XState, the

24:17 Yeah. I love XState. I use it all the time. So we have made it possible to turn any XState state machine into a Restate service, and it can run serverless. So you can have these, like, virtual state machines running in the cloud invisibly and move between these different states, and we will just store all the results. But the critical primitive here that you can see is context.set. So Restate has a type of service called a virtual object, and those objects have a key. And therefore, in a sense, they have, like,

24:47 an unlimited number of objects. Each one kind of has its own state, and that's durable. That persists beyond the handler. I mean, it can exist forever. So you could use it where you would use Redis. You could even use it where you would use Dynamo or, like, any other database just as a key-value store, essentially. And what's cool is you have, like, a guaranteed lock over that state while your handler is running, and that makes a lot of things a lot easier. But because it's specifically on that key, you're not, like, locking a whole table or anything like

25:19 that, and that makes this quite simple to reason about when you're dealing with, like, distributed state. So for a state machine, it just means you can just, like, write the various fields down. But if you wanna do things really fancy, we now have this XState integration. And there, what we do is we just basically save the whole state of the XState state machine to a JSON blob, and then we rehydrate it whenever the next event comes in. That's awesome. The reason I like XState is because the company behind it has the ability to

25:48 actually visualize the state machines in a really nice way, which is great for documentation. Absolutely. And I think we would also like to have more in this regard because Restate has a lot of information about what applications are doing. We understand these steps, you know, the run steps and also the calls between services. You know, it's actually quite hard to create a call graph between all of your services. You can kinda get it out of Istio and Envoy, whatever. But, like, we have all of this information, also what you're storing and reading from the

26:16 key-value store. So we think we could do a really cool visualization. So you're gonna provide a Monzo Death Star as a service for people to allow them to have their own image of their architecture. Yeah. Maybe the call graph, but also, like, for an individual invocation, what happened? Like, what did it do? Where did it fail? A little bit like Step Functions. This is what people love about Step Functions. They just hate everything else. Yeah. So that's what I would like to have for sure. Alright. Before we start typing code, there's just one other thing that you said that

26:46 I thought was interesting. Right? And you said that there's locking that happens on the key level. So does that mean, being built in Rust, you're hooking on to, like, I'm assuming Axiom or some of the other actor primitives there to give every key a message box? Like, if I say update this thing and then another request comes in that says update this thing, you're not dismissing the second one. It's getting queued and then executed after. Right? Okay. I think the right way to think about it is it's as if these RPCs are going through Kafka in both directions.

27:21 But the critical difference, and why this really wouldn't work very well with Kafka, is that Kafka would have partitions that have many keys. Mhmm. And a single stuck request or failing request or whatever would block the whole partition. And you might only have, I don't know, 64 partitions or something. So in Restate, we don't have head-of-line blocking like that. So you can only block the key, which would be the user ID or whatever. So a user can be in a bad state, but not a whole partition. And if you've ever managed Kafka, you would know that these

27:49 blocked partitions are, like, the bane of your life. Yep. I've been there. So, yeah, that's what makes it kind of possible for us to not have these, like, nasty side effects between different users. So, yeah, it's a pattern I've actually seen, that people have an RPC framework where all requests and responses go through a queue. And that's essentially the primitive of how it works, and it gives you this really nice linearizability over a particular user. Like, for a particular key on a particular service, they will process every request in sequence.
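
The per-key ordering just described can be sketched as one promise chain per key. This is a toy in plain TypeScript, not how Restate is implemented: `KeyedQueue`, `sleep`, and `demo` are hypothetical names chosen for illustration. Requests for the same key run strictly one after another, while a slow request for one key does not hold up a different key.

```typescript
// Toy per-key request queue; NOT Restate's implementation.
class KeyedQueue {
  // Tail of the promise chain for each key.
  private tails = new Map<string, Promise<void>>();

  // Enqueue work for a key; it starts only after earlier work for the same key.
  enqueue<T>(key: string, work: () => Promise<T>): Promise<T> {
    const tail = this.tails.get(key) ?? Promise.resolve();
    const result = tail.then(work);
    // Keep the chain alive for later requests even if this one fails.
    this.tails.set(key, result.then(() => undefined, () => undefined));
    return result;
  }
}

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));
const log: string[] = [];

async function demo(): Promise<void> {
  const queue = new KeyedQueue();
  await Promise.all([
    // A slow request for "alice" delays only later "alice" requests.
    queue.enqueue("alice", async () => { await sleep(50); log.push("alice:1"); }),
    queue.enqueue("alice", async () => { log.push("alice:2"); }),
    // "bob" has its own chain, so it is not stuck behind "alice".
    queue.enqueue("bob", async () => { log.push("bob:1"); }),
  ]);
}
```

After `demo()` resolves, `log` holds `bob:1`, then `alice:1`, then `alice:2`: the stuck key delays only itself, which is the contrast with a partition shared by many keys.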

28:20 And that actually just makes life a lot easier. It doesn't, in my experience, have a significant performance impact. And we also have these shared handlers that get a read-only view of a snapshot of state, and they can run concurrently. And even if you're not doing things with state, this just gives you, like, a lock primitive, basically. It gives you, like, a little bit of a read-write lock primitive, where you can have some handlers that will always run only one at a time and some handlers that can run concurrently, and this is just

28:49 generally, that's all you need, I find. Awesome. Alright. I'll show off the nice diagrams, talk about Rust a little bit more, and then let's get into the docs. Okay. Alright. So our plan today is really to give people a taste of how they get started with Restate. What that local developer experience looks like, how they write their first durable function, and then we'll try and later on explore a few new things. So especially trying to get the concepts and the primitives in there so that people can just understand that lingo and then Mhmm. Hopefully start to build

29:22 something. Sounds good. And we said we'd start with TypeScript. I mean, is there a magic link for me to get a Rust SDK? I'm just asking. I'm not saying we do it. Not in the docs, but, you know, maybe soon. I'll send you something. Right. Okay. Now, from my end, and I should have done this in advance, is I will need to set up a couple of things because I am on NixOS and I can't just download binaries like an idiot. Of course not. I really yeah. The daily experience here, I imagine. No. I mean,

30:01 within my own repository, like, I use devenv, which gives you this wonderful little Nix file. Mhmm. And I use Progen, which generates new services for me with everything I need. So it's just I've got it all down. But I don't like to do too much in advance before these sessions, but then this is important for the audience. This is just me being annoyed. So and I'm gonna enable TypeScript and JavaScript. We actually don't need anything else from Nix. And what I love now why have I got so many spaces? Oh, there's my mono

30:33 repository because I was working on it this morning. Mhmm. And then I hit return, and now I should have node. Nice. But I won't have NPM because I forgot to do NPM. Now, normally, I would use Bun, but I'm not gonna rock the boat that much today. So It does actually work, by the way, for the record. I just think Bun is such a pleasant tool to Yeah. It's so much easier to just do bun run for everything. Yeah. I use it also for demos a lot. Yeah. But I wanna the docs are using npx,

31:10 so I wanna just stick with what the docs are doing. Yeah. Keep it simple. Yeah. So this gives us our first example, which is just gonna create a greeter function, and we can just run these commands like so. There there are a whole bunch of examples and templates. Right? And if people want to see what else is available, they come to this examples repository? Yeah. Exactly. And the templates are here. So Yeah. So, also, you can get these templates from the Restate CLI. So I guess we'll download that a little bit later. But once you have

31:43 that, you can just do restate example, and it can give you, like, a starter TypeScript repo or it can give you a starter Go or really anything. And that's often a good way to get started. Nice. Alright. Let's see where we are. We've got stuff and things. I've got my code. And if I pull up this template here let me oh, I mean, that's the only way to do it. Just change the font size. 24 doesn't feel enough today. Yeah. It's better. It's good. Do you wanna just walk us through what's happening here? Then we'll run it and give

32:18 people an idea of what's happening, what it looks like. Absolutely. Yeah. Okay. So, yeah, let's start at the bottom. So we are creating a new Restate endpoint. What we really mean by that is, like, essentially a process running on a port, but actually behind that, there can be any number of services. This is a little bit like a sort of gRPC server, and all of Restate has these, like, RPC stuff semantics. And we bind to that endpoint a service, which we create in line. So we mentioned before that there are virtual objects. There are also services which are unkeyed

32:52 and don't have state. And then there are also workflows, which are kind of an interesting special case of virtual objects that can be very useful. So we create a new service, and we define it as having the name greeter, and we give it one handler. And that handler is greet, and the greet handler a handler really is just any function that accepts the context, or for a stateful handler it might be an object context, and any input parameter: it can be a string, it can be an object with fields. It doesn't really matter. This is all gonna be encoded by

33:24 default with JSON, but this is also configurable. And then handlers can just return or throw. So it's quite a simple way of defining handlers. It's probably recognizable if you've come from, like, the tRPC world. Yep. And through TypeScript type magic, exactly like tRPC, you can have clients for other services and the types kind of all work, which is really nice. And actually, in many ways, it's a better experience than almost any other language we have an SDK for because TypeScript is so powerful for this Yeah. Which is really cool. So, yeah, that's it. And then at the end, we're listening on

33:56 9080, which is the port that we tend to use. And the way Restate works from an architecture perspective, I recognize I didn't say this before, Restate calls your services. So it's not quite like a queue in the sense that you create workers and, like, pull down messages and process them. You can actually run your pods, like, kind of fairly normally, including behind load balancers, and Restate will just call them over HTTP/2. But you don't call your services directly. It needs to be called through, like, the Restate binary because Restate is doing

34:26 this, like, magical stuff behind the scenes. And in fact, you can have a long lived HTTP request to Restate, and it might make lots of short invocations to your underlying service. It might be making Lambda invocations. You know, there might be thousands of such invocations, but to you, it just looks like a simple, you know, HTTP request. So that's roughly how the protocol works. Yeah. In this case, we're serving on ITT. Thanks. So I'm gonna just copy this because there's one thing I think we'll say, right, is that Restate isn't particularly opinionated in how you structure your code. Like

35:06 like you said, you write your code your own way. Right? So if I call this Scotland, I will need to obviously go and Restate again, is that you can just structure your code however you want. I'm exporting this function. I can then import Scotland, and then we can just layer on these bindings, right, to add of course, many services as as we want. So Yeah. Absolutely. Or you can indeed you can define the Restate dot service in the other file and import that, all that sort of thing. The only thing you need to be cautious of

35:41 is when you are trying to just have a client for a service, you generally wanna make sure you're only importing types and don't import the whole file because otherwise at runtime, you could be importing all of the handlers, which maybe doesn't matter. But if you have, like, a large monorepo, it will start to matter. So this is actually also quite similar to tRPC the way we've approached it. Alright. I'm just gonna make this different so we can call these. Mhmm. Well, I don't know how to get a whiskey emoji. There we go. I had to Google it.

36:19 That's unbelievable. Yeah. The emoji picker wasn't working. Idea. I wonder if I could just have said whiskey emoji, mister AI. Woah. Okay. That's very cool. I don't have any AI systems. So Yeah. Copilot is fantastic. I've been meaning to explore other ones, but because I was writing in VS Code, I just Mhmm. It's For sure. Okay. So let's go back to the documentation. Right? So we did this, and then all we need to do to run this new service is to do an npm run app-dev. Now the app-dev, I'm assuming, is just a

36:59 script in our package. Yep. Yep. It's using ts-node-dev. K. So npm run app-dev. Now is that enough? Are we working? No. Because we need to run Restate. Otherwise, it would be quite a magical library for all the things that I said without running anything else. But, unfortunately, no. That's another thing. Right. Well, I think this is cool for people to understand. With Restate, you need a Restate server, and then your functions, your services, your workflows, those are all deployed by you wherever

37:37 you want. I use the workers. But yeah. Okay. Exactly. Yep. And yeah. So in that sense, we can sometimes look a bit like an API gateway. So or a little bit like a service mesh even. So but I just sometimes people really don't like these these kind of concepts, so it's it's I never know whether to mention them. But something like a durable API gateway or a durable service mesh might be a good way to think about Restate. Alright. So now it's telling us you need a server. Now this is where things are gonna get

38:08 interesting. Is that a static, like, a compiled binary? It is. Is that gonna it should That's fine. Static is okay. Static should be musl. Yeah. This is a real test to see if this works on a Nix machine. Oh my goodness. Yeah. That's great. Yeah. Okay. Cool. Nothing better than we are fine. The minute there's any dynamic linking, I mean, I just say goodbye and we say, we'll meet you in the pub next week or about something. But Yeah. Okay. So we have a Restate server running on 0.0.0.0, binding port 8080. We have our JavaScript

38:50 Express-like server running here on 9080. So now we need to do a little bit of glue magic. Yeah. And as it's telling us here, I need to do this, yeah, npx, but we have to register our deployment with the Restate server. Exactly. Yeah. And what let me maybe expand on what deployment means, and this is a very overloaded term in distributed systems, I suppose. For us, a deployment is referring to, like, that physical location at which services run, but many services might run there. So your deployment might be a Lambda ARN or a URL to a Cloudflare worker or

39:31 a localhost port, but it's completely unrelated to what services might exist. So you have a Restate Cloud setup, it looks like, and you're currently configured. So you need to do Restate config use-env for local, and then it will start talking to your local Restate service instead of your cloud service. We're getting ahead of ourselves. There we go. Right. I was just gonna blow away that older entry, but that's nicer. Cool. Alright. So it's telling us it wants to create a deployment. There are two services that want to be added, Greeter and Scotland.

40:10 It gives us the inputs and output values for these, and we can choose to accept or deny. So we now have two services deployed to Restate server being delivered via whatever means we want. Easy. Job done. Alright. We're just getting to the easy stuff. Right? Yeah. Yeah. Yeah. Yeah. Absolutely. Alright. So I thought, oh, I'm surprised to have curl. That's good. Okay. So we could do curl to the Restate server. And this is just the gRPC. Right? We're just doing yeah. Except no proto. I should have already said this. When we first started Restate, there was

40:46 a lot of proto stuff and we just quickly realized that the developer experience is really bad in most languages. So everything's just JSON, essentially. We don't actually the Restate server does not inspect at all what the bodies are, so it just gets passed to the SDKs. But the SDKs by default will use JSON and put up. I do like that it's HTTP-based and not proto. It does simplify Yeah. Yeah. Was such a nightmare. Yeah. Pavel's in the chat saying that that binary is a statically linked 1.1 binary that's apparently

41:21 hot off the presses. I'm sure I think it's today or last yesterday evening that it came out. Yeah. Sweet. I like bleeding edge. So we got our high, but now, of course, I'm gonna test our Scotland One. So pop back over. What did I call it? I think it's Scotland slash greets. No? Oh, no. Scotland slash Scotland. Scotland. Way. Nice. Nice. I mean, there's not a lot to that, but I think what's really powerful is just what we're getting there, that bedrock of just a system that we can now build. I don't want to say anything, but I'm gonna say

42:00 almost anything on. Right? Question from Russell in chat. If you stop the process, do you need to reregister? No. So Restate is, like, stateful, of course. I guess that's how it works. So not only do you not need to reregister services, but if you stop the process halfway through an invocation and resume Restate, the invocation will complete. So you should be able to halfway through a request kill Restate service, the underlying service, the whole machine, the whole data center, and start everything back up again, and it should finish. Now, obviously, if you had an HTTP request open,

42:38 we're not magical. We can't currently keep the TCP connection open. But when we have multiple Restate instances as part of a distributed architecture, you would be able to kill any of them except for the one you're connected to. But everything will always finish. Can we demo that in a special way? I've got an idea. Right? Yes. Yes. Yes. We can use it. We can use the Restate CLI to actually and I can't remember the command, so please help me out here. How do I list the invocations? It'll be inv ls. And there should be a way to watch

43:10 or you can just do watch -n 1. There we go. Okay. That was easy. I like it, mate. Like, I always quote myself, and I feel so cringey when I do it. Right? But the best developer experience is where I can be successful with intuition rather than informed decisions, and that was intuition. So Yeah. Props. Yeah. So we can call Scotland again, and it's too fast for the invocation to even show that there. Right? So Yep. Let's add a new let's modify Scotland because one of the things we can do is not respond right away.

43:43 Yep. So we could do and, again, I'm gonna mess this up. Sleep. And this is measured in milliseconds. So what is five no. What is one minute in milliseconds? 50,000. 50 thousand? 60 Yeah. I was about to look really stupid there. So and let that that's a bit high. Let's go with two seconds first before I go show in Yeah. What I'm trying to show. So if we do that, how's that? I'd have to redeploy my service. No. Do maybe just needed the NPM dev to rerun? Should've worked. Maybe not on a well, it's not a watch. Maybe

44:27 it's not respawning correctly. Yeah. Okay. But it seemed to think it'd been modified. Should we go back? Oh, oh, wait. JavaScript. You've been writing to Restate. Yeah. So now we have our two second delay. So let's bump this up, and let's say thirty seconds is how long it's gonna take us to stop the Restate server, bring it back. Now we're not gonna get a response to the CLI, but we should see the invocations. Right? So We should see the invocations. Exactly. Yep. Alright. So thirty seconds. No. Yeah. Yeah. Yeah. And why don't we start with

45:10 not stopping Restate? Why don't we just stop the service? Because it can be cool to watch the HTTP request actually finish if you stop the service. Ah, okay. You close that. So we have our invocation here, which just tells us that things are okay. We can restart that service. So you might wanna do once we have the ID, we can also do inv describe, and you'll get a little bit more information about it. Anyway, it's about to run. So Yeah. That And it will continue the sleep. But what's interesting here, it's not gonna continue the sleep

45:41 from the start because that would be really annoying because then if you keep failing, it's just gonna do the sleep, like, with the original wake up time. Mhmm. So you actually shouldn't affect how the curl appears at all, the fact that you stopped the service. And in fact, we could demonstrate that by just printing out. I I I don't know if I'm teaching people how to, like, suck eggs. Right? But So if you you should use context dot console dot log, I think. Yes. Alright. Interesting. So logging is an interesting one because how does Restate work?

46:11 It actually does run the function from the start. It just fills in all the sort of meaningful side effects from what happened last time. And that means that if you just do a console dot log, it will usually log again on resumption. If you use context dot console dot log, it will usually not, but it it just it still can in some situations. Let's see. Okay. So here, we have You can see that's the first invoke. Yeah. We can shut that down. Yep. We can just, you know, do something for ten seconds. I'm just trying to make

46:46 sure that we that comes back just in time for that thirty second. Yeah. Exactly. Alright. It's been long enough, so we can restart that. It's gonna do the magic. K. It's getting resumed. It didn't do the original log again, which is cool. Oh, yeah. I don't even think about that. Of course. Yeah. There we go. 37. Yeah. Exactly. 17 to 30¢. That is that not magical? Like Yeah. That is the essence of it. This is the essence of function. It's ridiculous. Yeah. It's just Yeah. Coding has never been this easy. This is just joy in my face. This is what pure joy

47:21 looks like. Absolutely. And then what's even cooler, right, is so let's say that we deployed this on a Lambda. It wouldn't sleep at all. Like, the process wouldn't wait for a millisecond. It would just immediately stop. And then thirty seconds later, it's a completely new Lambda invocation. What's cool about that is the way that Lambda is built is, like, each invocation is really, really cheap, and you just pay for, like, milliseconds. So you're getting something a lot closer to the Cloudflare worker primitive where you pay for CPU time and you don't pay for IO time, which in practice on Lambda is impossible.
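
The resume behaviour in the demo above (the handler reruns from the top, yet the first log line is not repeated and the sleep keeps its original wake-up time) can be modelled with a small journal. This is a simplified sketch in plain TypeScript, not the Restate SDK or its wire protocol: `ReplayContext`, `JournalEntry`, and `handler` are hypothetical names, the journal is an in-memory array, and `sleep` just returns the remaining wait instead of suspending.

```typescript
// Hypothetical journal entries: one per completed side effect or timer.
type JournalEntry =
  | { kind: "run"; value: unknown }
  | { kind: "sleep"; wakeAt: number };

class ReplayContext {
  private cursor = 0;

  constructor(
    private readonly journal: JournalEntry[],
    private readonly now: () => number,
    private readonly logs: string[],
  ) {}

  // Log only when executing new steps, not while replaying recorded ones.
  log(message: string): void {
    if (this.cursor >= this.journal.length) this.logs.push(message);
  }

  // Run a side effect once; on replay, return the recorded value instead.
  run<T>(action: () => T): T {
    const entry = this.journal[this.cursor];
    if (entry !== undefined) {
      if (entry.kind !== "run") throw new Error("journal mismatch");
      this.cursor++;
      return entry.value as T;
    }
    const value = action();
    this.journal.push({ kind: "run", value });
    this.cursor++;
    return value;
  }

  // Record an absolute wake-up time; on replay, reuse it so the timer
  // does not restart. Returns how many milliseconds are still left.
  sleep(ms: number): number {
    const entry = this.journal[this.cursor];
    let wakeAt: number;
    if (entry !== undefined) {
      if (entry.kind !== "sleep") throw new Error("journal mismatch");
      wakeAt = entry.wakeAt;
    } else {
      wakeAt = this.now() + ms;
      this.journal.push({ kind: "sleep", wakeAt });
    }
    this.cursor++;
    return Math.max(0, wakeAt - this.now());
  }
}

let clock = 0; // fake clock in milliseconds
let sideEffectCalls = 0;
const journal: JournalEntry[] = [];
const logs: string[] = [];

function handler(ctx: ReplayContext): { token: string; remainingMs: number } {
  ctx.log("handler started");
  const token = ctx.run(() => {
    sideEffectCalls++;
    return "token-1";
  });
  return { token, remainingMs: ctx.sleep(30_000) };
}

// First attempt at t = 0 records both steps in the journal.
const first = handler(new ReplayContext(journal, () => clock, logs));
// The service is stopped and comes back ten seconds later.
clock = 10_000;
const second = handler(new ReplayContext(journal, () => clock, logs));
```

After the second attempt, `sideEffectCalls` is still 1, `logs` holds a single entry, and `second.remainingMs` is 20000 rather than 30000, which mirrors what the curl request showed in the demo.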

47:53 But this is something we, like, want to help people get to on Lambda because Lambda is such a great platform. So, I mean, I don't wanna get into the depths of it. Right? But I have always been curious. How are you hydrating this function to a point that it can, like, jump through to that next stage? Like, what So every line is run. We're not doing any magic. We're not saving stacks or stuff like that because that will never work when you change your code. It will just crash the whole thing with a segfault. We still run every line of code,

48:24 but we are sort of injecting the value. So when you do a context dot run, we're gonna be like, oh, we did that before, so we'll just immediately return the value that we saved. When you do a sleep, we are figuring out how much time we actually need to wait for. When you do context.console.log, we check, are we replaying? By which, I mean, we're zipping through the function to the point that we left off. If we're replaying, we just don't emit any logs. But if we're not replaying, then we will. So these are So is that just like

48:54 memoization based on our invocation key and some representation of this line of code? Well, that's exactly right. It's it's memoization of of a journal. A journal is just a list of operations that you completed, but the actual number is very small. I mean, it's basically runs, sleeps, calls to other services, get and set. So writing an SDK is basically implementing a state machine over these operations. But the fundamental thing is you are writing this log. The log must be stored by Restate successfully before you continue, and then our job is to give you back that log as quickly as possible on

49:33 resumption and therefore have a very low overhead. And so there's this log per invocation, but also Restate itself is a massive distributed log, and all of those little logs kinda filter into that and create one big log for your entire system. And it's very pretty, really. But, yeah, that's how it works. Okay. So we obviously don't have a plan for what we're gonna demo, but we started to say we. Right? I've dragged this down to Scottish theme here. So now I'm thinking, can we write and I don't know if this is interesting or too trivial, but say

50:12 a flow where we can use signals to say, a whiskey, drink a whiskey, drink a whiskey till it falls over after so many whiskeys. Like, show a stateful workflow where we are sending events or messages to then change the output of the workflow. Okay. Sounds good. Does that show enough, or should we just go with one of the demos, the examples? I mean, I was also wondering, do you have any because you're running Restate in production. Right? Do you have anything that you actually wanna build for Rawkode? I mean, is that if maybe we could be helpful in some

50:44 way? I do. I'm now just worried that I'm gonna have to give people context and we have Okay. Yeah. Yeah. Sorry, man. I mean okay. Let's what what we can right, is that I'll show you the one that we have deployed, and I can tell you what I'm doing next. So Mhmm. Projects Academy and RPC. So I started off content. This is gonna turn into, like, a you help me session, but whatever. That's great, though. No? So like I said earlier, I implemented the user activation story. Now the way I thought I was gonna do

51:25 this, and it goes against everything I wanted to do. So I have changed tack a little bit. As I thought, I'll just have an RPC system, which is, you know, publicly browsable, rpc.RawkodeAcademy, and you just get this web picture. It's an Astro website that is deployed to Cloudflare functions and pages where I have pages configured for the Restate entry point, which is just me making sure I can pass on requests via the different handlers to Restate. Like, nothing magic there. It's not that important. But we do have this user registered event, which can come in, and I'm using some

52:02 Trigger.dev and some Restate here. So I'll try not to go on it too much. Right? But we have a webhook from WorkOS, which is a closed source thing, which I didn't think I would use until I saw that you get your first million daily, monthly active users for free. And I'll say, I'm never gonna have more than a million, so that's just free. Yep. And I like free stuff. Mhmm. Where we have a Restate client, and then depending on the event that we get, we are using our handlers from Restate to trigger the user

52:33 created, which creates our workflow. And then when we receive the email verification, we then trigger that. That's it. Yep. These are just functions where we get the client. We submit. I'm assuming people find it easy to follow along with this. There's nothing weird or interesting happening here. It's worth noting that these are clients for use outside of Restate. So there's a kind of a concept of, like, you're either in a Restate service or you're just out of the code. And, of course, Restate services have to interact with the rest of the world. So what we do in TypeScript is we

53:05 have these typed clients that give you basically the same API for calling services, except they actually go through the Restate ingress, which is running on 8080. And as such, they don't have exactly the same magical properties of, like, RPCs can't fail, for example. They can still fail, but they do support idempotency keys, so you can generally come up with a very simple RPC retry type mechanism. And then, of course, once your request gets into Restate Land, then it has all these properties. Yeah. I guess worth pointing out is we are using the idempotency key here. So Mhmm. Whenever

53:40 WorkOS sends us a webhook that is individually keyed, and we will never send an email twice because we have this idempotency key in theory and hopefully in practice. That hasn't happened yet, so that's good. Now the next thing is the actual, you know, the registered handler itself. Now what's cool here and but I don't wanna show the code yet because just to make sure people understand what's happening. Anyone can register on Rawkode Academy. But what we don't want to do is send an email to someone who hasn't verified that as their email. That's gonna lead

54:11 to spam or abuse. So, I mean, we could just spend thirty seconds on how we should do this without Restate. Like, if you had to build this yourself, what we should be doing? So I guess you can write into the database once they've been verified. That's easy, at least. So once they click the verification link, that's gonna write a key to database. Okay. No problem. The annoying thing is that I have to, like, scan the database to or, like, nightly or whatever to see who's been verified in the last day or whatever and send messages.

54:44 Or I can alternatively do some kind of event emitter thing. So when I write that key, I also emit an event with, PubSub or something. And then I have a PubSub consumer running in a process somewhere, and that is the thing that actually sends emails in response to those events. So probably need to use a database and a queue and have a few different handlers, like, interrelated with each other. Yeah. And there's there's challenges with that approach. Right? And I'm hoping this is familiar to people. If you go with the batch approach, which is, like, when someone does verify the

55:15 email address, we are gonna write something into a table called verified with a timestamp. Every night, you look for people who verified that day, and you start sending emails. Not a great approach. Who wants an email twenty four hours later? Right? I said, it's not cool. Yep. The real time approach, you've got two ways of doing that. One, you can do CDC on the database. That requires more configuration, set up pub subs, all that stuff. Or you write to Kafka, but then you've got the dual write problem, which is well covered

55:46 in a federated systems and that can you publish it to Kafka without confirm that it's been activated in the database because you can't do a transaction across two different systems? So it's a such a simple problem on the surface, but actually doing it in a resilient way requires effort and due care. And here's the function. So there is a lot going on here because I've been playing with context set and phases and trying to, you know, just explore Restate in a bit more detail. But all I the only line of code that is important here is that when a

56:16 user is registered, we use this durable promise. And I'm just waiting for an event on this user ID that says that that user has verified their email. That's it. That's that entire system that we just described. And then everything after that is just me sending an email, which is handled via context dot run so that it is hopefully only ever gonna run once. Well, is it guaranteed to only ever run once? I'm not sure. So this is impossible because you could get to the last line of the function and the process could die, and

56:50 we just don't have any information to know whether it ran or not. What we say is that after there's been a successful completion that's committed to Restate and we move on to the next step, it will never run again. And this basically just makes idempotency a lot easier to reason about. In this case, I think it is technically possible that you could send the email twice. And I think if Resend accepts an idempotency key, you could just inject one in there, and that would solve that problem. Yeah. I mean, the challenge is I'm running

57:19 this in Cloudflare Workers, which don't give guarantees that my worker won't be killed or whatever. So, yeah, and that could be a problem. I think it should still be possible if you do context dot rand dot UUID, then you get a UUID and that will stay the same. And then you can pass that on to Resend. But, I mean, yeah, also, Resend probably can't guarantee it without some kind of inbound idempotency key. No. Technically, it's just not possible without a state identifier to prevent emails from being sent

57:46 twice. But it's as close to guaranteed as we're gonna get. I'm gonna I'll say that. I'm quite happy to go in there. If want, if you're not Absolutely. Yeah. Yeah. But I just love so, again, that's just how I guarantee that we don't send an email to our users. There's no pub/sub. There's no CDC. There's no writing to external data sources. It's just a promise waiting for an event. And I think that is just Yep. But basically it is the pub/sub approach that we discussed. I mean, Restate is essentially splitting your handler

58:18 into a before event received step, which would be one handler. Otherwise, an event receiver and then what to do when the event is received. And it's got the KV in there, like the KV read and write. It's got the event receive and send, but it's just all in the same handler. And I guess that's the magic really, but it's still, like, the best in class approach, I guess. Yeah. There's no hacks. We're still just doing pub/sub. We haven't invented, like, a new distributed systems approach. We just kind of give

58:48 a new different API, I guess, if that makes sense. Yeah. That makes sense to me. So the bit that I hadn't shown here is we just added a new I don't know if you call it functions, endpoints, whatever, to this, which just resolves that promise for us, and then I'm triggering that by just pulling out the client and calling that as a function. I mean, it is So this named promise thing is, like, a really handy feature of the workflow primitive in particular, and that's kinda what we created workflows for, to have this concept.
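
The shape of that registration flow, one handler waiting on a named promise and a second handler resolving it, can be sketched without any infrastructure. The following is an in-memory model in plain TypeScript, not the Restate SDK: `NamedPromises`, `onUserRegistered`, `onEmailVerified`, and the promise name `email-verified` are all hypothetical, and nothing here is durable or survives a restart.

```typescript
// Hypothetical in-memory stand-in for promises addressed by workflow ID + name.
type Slot = { promise: Promise<unknown>; resolve: (value: unknown) => void };

class NamedPromises {
  private slots = new Map<string, Slot>();

  // Create the slot on first use, whether the waiter or the resolver arrives first.
  private slot(workflowId: string, name: string): Slot {
    const key = `${workflowId}/${name}`;
    let slot = this.slots.get(key);
    if (slot === undefined) {
      let resolve!: (value: unknown) => void;
      const promise = new Promise<unknown>((res) => {
        resolve = res;
      });
      slot = { promise, resolve };
      this.slots.set(key, slot);
    }
    return slot;
  }

  wait<T>(workflowId: string, name: string): Promise<T> {
    return this.slot(workflowId, name).promise as Promise<T>;
  }

  resolve<T>(workflowId: string, name: string, value: T): void {
    this.slot(workflowId, name).resolve(value);
  }
}

const promises = new NamedPromises();
const sentEmails: string[] = [];

// "Workflow" handler: keyed by user ID, parks until the email is verified.
async function onUserRegistered(userId: string, email: string): Promise<void> {
  await promises.wait<boolean>(userId, "email-verified");
  sentEmails.push(email); // stands in for sending the welcome email
}

// Second handler on the same workflow: called when the verification webhook arrives.
function onEmailVerified(userId: string): void {
  promises.resolve(userId, "email-verified", true);
}

async function demo(): Promise<{ before: number; after: string[] }> {
  const pending = onUserRegistered("user-1", "jack@example.com");
  const before = sentEmails.length; // still 0: nothing is sent until verification
  onEmailVerified("user-1");
  await pending;
  return { before, after: [...sentEmails] };
}
```

The design point from the conversation is that the database row, the queue, and the consumer collapse into the single `await`; in Restate the waiting state is persisted by the server rather than held in a `Map`.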

59:20 But in general, in Restate, we have these things called awakables, and they're basically, like, promises that you can share between services, and they can be resolved with any value or rejected with an error. So you could pass around awakable IDs and basically have, like, really complicated async processes between services, and awakables can be also resolved at the ingress. So you can have, like, an inbound webhook that resolves an awakable. But the special case, like, the simplest special case of awakables is in this workflow setup where you're mostly trying to communicate between handlers of the same workflow. And for most

59:54 purposes, that is really, really, like, easy and kind of simple. And I guess our approach is we wanna make it possible to build anything. Like, you can actually build a distributed queue, for example, on top of Restate using awakables. We wanna make it possible to do anything, but we wanna make it easy to just kind of do a workflow that does simple things and not have to worry about, like, creating complicated processes, essentially. Awesome. Alright. So that leads us on to the next step. It's that I've decided this isn't good enough. And not

1:00:25 Restate. Let me clarify. My deployment mechanism here is I'm essentially building this monolithic RPC deployment where if I break it, I'm actually breaking a whole bunch of different handlers. Now we get guarantees and resiliency by going through Restate. Mhmm. But everything else in my system is deployed as individual Cloudflare Workers, and that's the approach that I started to take when I was looking at my technology service. So if we pop back up this is where I'm worried about just oversharing on what's actually happening here. Okay. No worries. But any service that we have

1:01:04 gets boiled down to a data model, a read model, and a write model. The write model is all handled via RPC. So just tell me if I'm glossing over anything too much here. Oh, that makes sense. We have a schema defined in Drizzle, and we're using the Drizzle Valibot plugin. For everyone who's not aware of what that is, Valibot is like Zod. It gives us TypeScript types that we can reuse in other parts of the system. So Okay. While this data model allows me to write to the database, I'm actually not handling

1:01:35 the writes in the service for reasons I can get into if we wanna go into it deeper. Mhmm. But just so we have a schema here, and then the read model is all delivered via Grafbase. Grafbase is the GraphQL Mhmm. Front end kind of thing where I could just say I have this type. This is the query I wanna satisfy, and it does all the jobs. The resolver for this is using the Drizzle client to query the table and done. Now the reason this is more complicated than I'm

1:02:12 getting into is that in some cases in fact, let's just show this off because this is where things get a little weird. Now most people would have shows service, which I do, which has all of my shows on my YouTube channel. Mhmm. I then have a people service, which has all of the people that have been guests. And the reason I separate these things is so that the schemas are small, migrations are nonexistent, and I can replace the service. Like, the cloud Objects, basically. They're a little bit like virtual objects. But then I needed to then say, people

1:02:45 can be a host of a show. So I needed a new service for that. Now, what Grafbase does is it actually allows me to define fragments of these types, and it does the aggregation at the front. So what I'm saying here is that we have a person, which is not resolvable because this is not the people service, but we know that they have an ID and that's how they're keyed. So it's a foreign key, essentially. Exactly. Yeah. We're doing distributed foreign keys across services. Then the show is a service, which is also keyed with an ID.
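[Editor's sketch] The "distributed foreign keys" idea can be modelled without any GraphQL tooling. The TypeScript below is a minimal sketch, assuming three in-memory maps as stand-ins for the shows, people, and show-hosts services; every name and the sample data are hypothetical, and none of this is the Grafbase API.

```typescript
// Hypothetical in-memory stand-ins for three independently owned services.
type Show = { id: string; name: string };
type Person = { id: string; forename: string };
type ShowHost = { showId: string; personId: string };

const showsService = new Map<string, Show>([
  ["rawkode-live", { id: "rawkode-live", name: "Rawkode Live" }],
]);
const peopleService = new Map<string, Person>([
  ["rawkode", { id: "rawkode", forename: "David" }],
]);
// The linking service stores only keys: a foreign key that spans services.
const showHostsService: ShowHost[] = [
  { showId: "rawkode-live", personId: "rawkode" },
];

type ShowWithHosts = Show & { hosts: Person[] };

// Aggregation layer: resolve each key against the service that owns it.
function resolveShow(showId: string): ShowWithHosts | undefined {
  const show = showsService.get(showId);
  if (show === undefined) {
    return undefined;
  }
  const hosts = showHostsService
    .filter((link) => link.showId === showId)
    .map((link) => peopleService.get(link.personId))
    .filter((person): person is Person => person !== undefined);
  return { ...show, hosts };
}

const resolved = resolveShow("rawkode-live");
console.log(resolved?.name, resolved?.hosts.map((host) => host.forename));
```

Nothing in the linking service can enforce that the referenced show or person exists, which is why the write side needs an orchestrator.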

1:03:18 It's also not resolvable, although we don't mark it as not resolvable because we have this new property that we want to add to it. Mhmm. And what we're saying is we want to be able to create a hosts field, which resolves to a person using this resolver. What that means is I get this very cool API, and I'll just log in to GraphQL quickly because I don't have a GraphQL client available right now. Mhmm. See, this is the rabbit hole I was worried about going down. No. No. It's okay. This is pretty cool. I think it's so

1:03:53 interesting to look at, like, real-world Restate usage. Okay. So here's my single API where I can do a query, and I can say, give me all my shows with the hosts, and get me the forename. This is where it's not gonna work. Right? Don't do it to me. Ah. Oh, there you go. Great. But we didn't get the show name. So here we are. Right? We have Rawkode Live, and we have a host with a forename of... Mhmm. Oh, yeah. It's just weird formatting. There we go. That must be good. So

1:04:34 that's actually querying the shows service, the people service, and then the aggregation service, injecting all that thing. So this is a distributed join, essentially. You're joining across two different distributed services. Yeah. Exactly. So... And is this all hitting the same underlying database, or are they... No. No. No. Every service has its own libSQL, Turso DB. So SQLite, which makes the local development experience very nice. Okay. Got you. Now, this presents a problem. And this is why Restate is such a big part of my architecture now. Because if I want to create a new show,

1:05:10 a new person, and a host of a show, that is actually multi-step. Right? I have to create the show, create the person, and then say that that person is then a host of a show. But I can't just insert values into a table anymore. These are different services. I see. Yes. Of course. Yeah. Yeah. So Restate becomes the orchestrator of these writes, so that when I do have a GraphQL endpoint, and say it's a mutation, and we say create show. I'm not gonna actually do this. It's not gonna work. Hosts, and that could be Rawkode, is the idea.
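[Editor's sketch] The multi-step write described here (create the show, create the person, link them) can be illustrated with a toy model of durable execution. This is not the Restate SDK: the `Journal` class, `invokeUntilDone`, and the in-memory sets are all hypothetical, and the journal only imitates the idea of recording completed steps so that a retry does not repeat them.

```typescript
// Toy journal (hypothetical): remembers the result of each completed step,
// so a retried invocation skips work that already succeeded.
class Journal {
  private readonly results = new Map<string, unknown>();

  run<T>(stepName: string, step: () => T): T {
    if (this.results.has(stepName)) {
      return this.results.get(stepName) as T;
    }
    const value = step();
    this.results.set(stepName, value);
    return value;
  }
}

// In-memory stand-ins for three separately owned services.
const shows = new Set<string>();
const people = new Set<string>();
const showHosts = new Set<string>();

let showWrites = 0; // counts how often the show write really executes
let transientFailuresLeft = 1; // the people service fails once, then recovers

function createShowWithHost(journal: Journal, showId: string, personId: string): void {
  journal.run("create-show", () => {
    showWrites += 1;
    shows.add(showId);
  });
  journal.run("create-person", () => {
    if (transientFailuresLeft > 0) {
      transientFailuresLeft -= 1;
      throw new Error("people service unavailable");
    }
    people.add(personId);
  });
  journal.run("link-host", () => {
    showHosts.add(`${showId}:${personId}`);
  });
}

// The orchestrator re-invokes the handler with the same journal until it completes.
function invokeUntilDone(handler: (journal: Journal) => void, maxAttempts: number): number {
  const journal = new Journal();
  for (let attempt = 1; attempt <= maxAttempts; attempt += 1) {
    try {
      handler(journal);
      return attempt;
    } catch {
      // Treated as transient in this sketch: loop and try again.
    }
  }
  throw new Error(`gave up after ${maxAttempts} attempts`);
}

const attemptsUsed = invokeUntilDone(
  (journal) => createShowWithHost(journal, "rawkode-live", "rawkode"),
  5,
);
console.log({ attemptsUsed, showWrites });
```

The second attempt skips the show write because the journal already holds its result, so the three writes complete together without a lock or a cross-service transaction.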

1:05:54 Like, whatever. This has to have a resolver that can create the show, but then has to guarantee that Rawkode exists, and then create it with another service. So it gets quite complicated from a write perspective, but I'm okay taking on that burden because I get the read API that I want for the website, the platform, the apps, all of the automation. Oh, yeah. Great. Very cool. So I know we're derailing a bit now, but let me go back to the technology. I really understand the problem now. I was waiting for it to get to, like, the

1:06:22 Restate bit. I totally understand. This is a dual write. You've disaggregated. Yeah. I mean, Ahmed's in the comments saying, just do the whole thing in Restate, mate. Problem solved. That is actually a dual write. But you still have foreign... I mean, you still have foreign keys. I mean, we still have this concept in Restate. You still have, like, IDs and, you know, keys for virtual objects, and you still have to read across two different services and write across two different services. The difference would be the write sort of can't fail, I guess, in a

1:06:55 they can't transiently fail. So if you just write call one and then call two to do the first write and the second write, you don't have to have a lock or a transaction or anything like that, because Restate would guarantee they would run to the end. But you can still do this even if they're external systems, even if you have to talk to these services that aren't running on Restate. It's just easier if they do run on Restate, of course. Well, we are quickly running out of time, and you've given me lots of food for thought

1:07:21 there. I'm gonna suggest that maybe there's a part two where we can actually dive into how we could put in a virtual object approach and build something out. But I'll finish my train of thought and just say where I am with the technology service. Now, Restate is unopinionated about where my services are deployed. The fact that I was putting them into a single RPC deployment was for convenience at the start, but now I'm like, actually, what I need inside this technology service is a Restate directory, which has a single handler, which I deploy

1:07:51 to Cloudflare Workers and register with the server. And that way, I can share my Drizzle Valibot schema, which allows me to do the input validation before doing the write to the Turso database. And that's gonna be my model moving forward for every single write within the system. So every write is going through Restate. Nothing will go through GraphQL. It's all gonna be RPC. But instead of having a single RPC service, I'm gonna have many, many RPC services that Restate is the orchestrator of. Yep. Yep. And do you think that these writes will still go

1:08:25 Restate and then a non-Restate service and then Turso, or would you be happy for them to be talking to the database directly from Restate? In which case, things, I think, would be a lot easier. I'm gonna have Restate talk to the database, and we can... I mean, we've got twenty minutes? Okay. So you... We could probably do this thing. Convert... yeah. I think we could convert one of these. Because the Drizzle client is a regular object, right? So let's just say we had Restate. I'll be bold now because I do have Bun in this dev env.

1:09:01 And I'm just gonna delete that. Holding one. Why don't you try the... there should be a Bun template. Oh, have you used the Bun template for Restate? No. There should be, if you do restate example, like, the CLI. Oh, there is. Okay. Yeah. Yeah. You should try that. Well, I don't have the CLI. But I'm installing npx. No? Oh, sorry. You're in a different directory, so you have, like, totally different tools now, don't you? Oh, Restate, there it is. Yeah. Exactly. Yeah. That's it. Okay. So we can do bun run restate. Example, I think.

1:09:46 Let's just pull it up. So... No. I don't think it's used there, because you don't have the CLI yet there. Maybe. I don't know where it would be used. We can just run help. Just give it a try. Oh. Because it's in my globals. Oh, no. That looks right. There we go. Yes. Okay. So it should be example. Yeah. I think you just run example, and then it will ask you. You can just do example. It will ask you what you wanna do. Yeah. TypeScript Bun. Hello. Well, it's nice. Alright. Let's just... I'm just gonna extract this up.

1:10:39 Won't pull it that way. Don't mind. Don't know what temp does with the ZIP. You don't need the ZIP. Don't worry. Alright. Okay. So now we have a Restate greeter service in Bun. Yeah. And I'll just make sure that works. So I should be able to do... I don't know what the script is for Bun. It's just dev. Oh, there we go. Dev. Alright. Okay. Unify those. bun install. Yeah. Just got it running on another port. Not that one, I think. That one. Yeah. Okay. Nice. Good. So now we have a service. So now what we wanna be able

1:11:28 to do here is call it create technology. I don't think that's really important, but I don't know. Technologies create. And that's one step closer. This could literally be the first public user of the just-released 1.1. Yeah. I mean, yeah. It's true. You sure you don't wanna do this with Rust? No. I mean, I'm just... No. No. No. Okay. So we already have types here. So let's do import from Drizzle. Drizzle schema, so... no. I wouldn't normally import them like this. I would Bun workspace it. But... And are the Drizzle, like, operations, like, when you do get and

1:12:36 set, are they sort of idempotent? Like, how does Drizzle do... I guess, is that transactional? I mean, I could do it. But would it always be, like, create if not exists, this sort of thing, or update or insert, this sort of thing? I think it's just doing an insert. Okay. But, you know, we don't need to get into the technicalities of it right now. Okay. This gives us a TypeScript type. I wonder if I can just do... Should be able to. Yeah. typeof. Maybe import type. Or... There we go.

1:13:20 And now we just have something where I could say technology... no. We'll get to it in a minute. I'm sure we can work it out. So... Mhmm. See, as a value, let's just... let's not get ahead of myself. Right? Let's just take something. Because what I could do now is I can use my technologies insert schema. We may have to look up the Drizzle docs. There's a validate function. Maybe copy from whatever handler you currently have, right? It's broken. Because, as you could tell, I'm on a branch where I've been rewriting this.

1:14:07 Oh, okay. Yeah. Yeah. Here. Create insert schema. And then... Oh, we can call parse, passing in the schema. Okay. So parse. I'm gonna have to add Drizzle, mhmm, to this. This is where I should have done it as a Bun workspace. That's what I get for being half-assed. I mean, I don't really need this package.json. Right? I just need these dependencies. Yeah. Yeah. I shouldn't need it, but I don't think so. Rather than me duplicating all of this. Now, I know there's cleaner, nicer ways to do this. Let's just take that out.

1:15:11 Scripts is fine. Dependencies. Dev dependencies is probably okay. But... Yeah. But don't need anything. Okay. So we should be able to run bun install. It does appear directory. Mhmm. This now means I can get parse, which is probably warning because it's not used. That's okay. And here, we can use parse like so. So now we know we have a valid technology. And I'll pull those docs back up. And if we don't, I guess, we would usually throw a terminal error, which is, like, telling Restate this is, like, never gonna succeed because the

1:15:54 input is bad. Yeah. The problem we have now is when I generate a typed client for the service, I'm not gonna get the fields that I actually need to be able to fill in. So I need to do this, and I do need to solve that. You need to solve the type-of-schema thing. Yep. Yep. Yeah. I need to work that out. So, in theory, I should now be able to do... if we import the technologies table. Sort my imports for me. I was hoping that we'd move that here. And Rust would have done it for me.

1:16:35 Yeah. Oh, that didn't even work. Right. Now we can just say technologies... well, this is the type. That's interesting. Okay. Let me pull up my graph here. So I've forgotten how to Drizzle now. It's too much pressure. Oh, yeah. That looks right. Yeah. That looks right. Okay. Stop watching me. Alright. Okay. Import schema. That might be the type you were missing for... oh, no. That could be the insert schema. That's the star. So... Oh. Which is gonna have to go up one more directory. Yeah. No. Yeah. Yeah. Okay. Schema. So now this is actually

1:17:40 db.insert, and technologies table too. Okay. No. This is untyped. So... and it wanted an extra argument. I'm just gonna use the docs because I have no idea what I'm doing. Users, values. So you do the table first and then the write. Yeah. Yeah. Okay. Done. Nice. So if that's the last step, you don't even need to use a context.run because, I mean, it's the last step anyway. Nothing afterwards would ever happen. Yeah. Well, that would work. That would kinda work. And then if you need to do, like, multiple of these calls, like, you're calling technologies create and then you're

1:18:37 calling something else dot create. So technologies is the easiest service, and it doesn't have any dependencies. It just needs to set a service. But, you know, if we were doing a more complex example, like the show hosts, yes, we would have to first do a query to make sure the show exists, make a query to make sure the person exists, then do the insert, assuming they're not already inserted into the thing. So there is a lot more logic. This is the most primitive version of the write that we could possibly do. Yeah. What I'm curious about is where I

1:19:07 can get that type to work. It could be a... No. That's an actual table. So... Mhmm. There must be a way to infer. Oh, there is an infer function. Right? So... but I'm doing that with the insert schema. This is voodoo to me. Yeah. The problem with this TypeScript type stuff is that when it's not working, it's, like, impossible to figure out why. And when it works, it's, like, so magic. Yeah. I don't think it's important for the demo. Like, the fact that I have this function, which I can now deploy and

1:19:13 Preparing for Hands-on Demo

1:19:58 register with the thing is important. I would love to be able to run it. I don't think it's gonna work right now. And it's moaning about my libSQL client, which I've not seen before. I'm not sure. Is that just because of this tsconfig? Yeah. Maybe I haven't imported it. Like, because... Well, I moved it in the title. Right? So... And you definitely have it at the root. So we're using this tsconfig. So I might be able to just nuke your Restate one. Oh, yeah. Yeah. You probably didn't need that.

1:20:04 Restate v1.0 Release and Stability

1:20:31 Restart. Oh, yeah. That fixes the... yeah. Yeah. Nice. And I'm curious if that fixes it. No. I can get to that type, but I'm not quite getting the exact type information that I expect here, but I think that's possibly because of this any type. Like, if I break everything, this definitely fails. Ah, okay. Yeah. So, yeah, there's something that I'm gonna have to work out. I'm not gonna be able to do it in nine minutes on this session. Alright. That's fine. That's totally fine. I think that's pretty cool. And it's something I need to get working

1:21:04 today anyway. This was after the session. This is what I was gonna be working on for the rest of the day. And I feel the point here, right, is that, like, you haven't really written, in this case, the service in any different way. And the main benefit we're getting from Restate here is just for the communication between services, that becomes, like, exactly-once guaranteed, essentially. And if you can just make that db.insert step idempotent, I think that's important as well. If you can do that, then the whole thing...

1:21:28 Exploring Core Concepts: `context.run` & State

1:21:31 that's basically the root of the distributed systems problem here. As long as the database write is idempotent, everything else should work with Restate. Yeah. I mean, I can start a transaction, and then it becomes atomic. At least it's either gonna succeed or fail. It's gonna be atomic. And then that... Yeah. Atomic is easy with databases, but the critical thing is that it's like an update if exists. It's like insert or update, is that one. I'm not a SQL expert, but I think that's the critical thing. It needs to not fail if it exists. If the

1:22:03 row exists. I want it to fail. Oh, you do? Okay. But what if, let's say, it gets past the DB write, but then Turso fails or whatever, and then it wrote to the database, but it never returned 200. Then Restate will retry. Right? Then we'll write it again, and then it'll fail permanently. So, yeah, why would it need to fail if the rows are already there? Isn't that 200 okay? Like, we're idempotent and we're okay. Yeah. I suppose we can do an insert or update. Like, that would... Yeah. Would work.
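[Editor's sketch] The lost-response scenario discussed here can be reproduced with an in-memory table. This is a minimal sketch, assuming a hypothetical `TechnologiesTable` class rather than Drizzle or Turso: the first write commits, the caller never sees the reply, and the retry either fails on the duplicate key (plain insert) or succeeds (insert or update).

```typescript
type Technology = { id: string; name: string };

// Hypothetical in-memory table; a stand-in for a real SQL table.
class TechnologiesTable {
  private readonly rows = new Map<string, Technology>();

  // Plain insert: rejects a second write for the same key.
  insert(row: Technology): void {
    if (this.rows.has(row.id)) {
      throw new Error(`duplicate key: ${row.id}`);
    }
    this.rows.set(row.id, row);
  }

  // Insert or update: the same write can be applied any number of times.
  upsert(row: Technology): void {
    this.rows.set(row.id, row);
  }

  get size(): number {
    return this.rows.size;
  }
}

// Simulates a committed write whose response is lost, followed by one retry.
function writeWithLostResponse(write: () => void): "ok" | "failed" {
  write(); // first attempt commits, but the caller never sees the reply
  try {
    write(); // the orchestrator retries the same write
    return "ok";
  } catch {
    return "failed";
  }
}

const restateRow: Technology = { id: "restate", name: "Restate" };

const plainTable = new TechnologiesTable();
const plainOutcome = writeWithLostResponse(() => plainTable.insert(restateRow));

const upsertTable = new TechnologiesTable();
const upsertOutcome = writeWithLostResponse(() => upsertTable.upsert(restateRow));

console.log({ plainOutcome, upsertOutcome });
```

Both tables end up holding exactly one row; the difference is only whether the retry reports success, which is what decides whether the invocation ever completes.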

1:22:35 Generally, I would say that's better. It does support that anyway, on conflict do update, and then we specify. Okay. Great. So that's really all you need for this to be, like, a well-behaved Restate service. So the idea is that, like... Like so. We are kind of saying here that all transient failures should be retried. The other type, the non-transient failure that we shouldn't retry, is the one where isValid is false. In that scenario, I think we should throw a terminal error and then just say, look. The request parameters are wrong.
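[Editor's sketch] The split between retryable and terminal failures can be shown with a small retry loop. Everything here is a hypothetical stand-in written for illustration: the local `TerminalError` class, `parseTechnology`, `runWithRetries`, and the flaky in-memory insert are not the Restate SDK, Valibot, or Drizzle.

```typescript
// Local stand-in for a "never retry this" error type.
class TerminalError extends Error {
  constructor(message: string) {
    super(message);
    Object.setPrototypeOf(this, TerminalError.prototype);
  }
}

type NewTechnology = { name: string };

// Validation failure is permanent: retrying the same input cannot help.
function parseTechnology(input: unknown): NewTechnology {
  if (typeof input === "object" && input !== null && "name" in input) {
    const candidate = (input as { name: unknown }).name;
    if (typeof candidate === "string" && candidate.length > 0) {
      return { name: candidate };
    }
  }
  throw new TerminalError("request parameters are wrong");
}

// Simulated database that is unavailable for the first two writes.
const storedRows: NewTechnology[] = [];
let databaseFailuresLeft = 2;

function flakyInsert(row: NewTechnology): void {
  if (databaseFailuresLeft > 0) {
    databaseFailuresLeft -= 1;
    throw new Error("database unavailable");
  }
  storedRows.push(row);
}

type Outcome = { ok: boolean; attempts: number };

// Retries ordinary errors, but stops immediately on a TerminalError.
function runWithRetries(handler: () => void, maxAttempts: number): Outcome {
  for (let attempt = 1; attempt <= maxAttempts; attempt += 1) {
    try {
      handler();
      return { ok: true, attempts: attempt };
    } catch (error) {
      if (error instanceof TerminalError) {
        return { ok: false, attempts: attempt };
      }
    }
  }
  return { ok: false, attempts: maxAttempts };
}

const badInputOutcome = runWithRetries(() => flakyInsert(parseTechnology({ title: 42 })), 5);
const goodInputOutcome = runWithRetries(() => flakyInsert(parseTechnology({ name: "Restate" })), 5);

console.log({ badInputOutcome, goodInputOutcome });
```

The invalid request gives up after a single attempt without touching the database, while the valid one rides out the two simulated outages.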

1:23:05 Sweet. Alright. I am gonna be sat here for the next three hours at least working on tidying this up, because I'm now in a position where I was like, I could put this in a Restate folder, but, I mean, there's nothing... yeah. This has got us on... No. I don't know what, you know, what other kind of services you need, really, than these kinds of services, because... yeah. I mean, our hope is that we make every handler a bit better. Some handlers we make, like, a million times better, so that every handler is, like, enough improved that

1:23:35 it's kind of worth using. Yeah. I'm very excited to play with this now. This is a lot for my setup. So I will definitely be doing some videos once I get this all working on how this all fits together. But my main objective was data model, read model, write model. And between Grafbase, Restate, and Drizzle, I have those three components. And I am very excited once I, mhmm, deal with the TypeScript thing here, once I deal with that. I agree. And it's a simple thing. I'm gonna go after this session. I'm

1:24:10 gonna pop this back to Big Face mode. Once I go through the Drizzle docs, I'm sure there's a page on getting that type and using it as a TypeScript type, and that's the final straw. So... Exactly. Yeah. I mean, you could actually have something, by the way... there's, like, a concept of higher-order services. So you could actually have a thing that just takes a table and turns it into a Restate service that does the CRUD that you've just shown. So it does validation, returning a terminal error if it doesn't validate,

1:24:39 and then does the update, or, sorry, the insert or update. You can have this thing that just converts any table into a Restate service and not even have to write this slight boilerplate each time. That could be kinda cool. Yeah. I don't fully understand that. But... The point being, like, the service that you've just written, it would look exactly the same if the table was different. Right? It's just: accept an input, check that the input matches the schema for this table that's previously... Yeah. Yeah. Yeah. And then do the update. So there's no reason to write it for

1:25:11 each one. You can just have a single implementation of, like, a function that turns a table into a Restate service, then you could just bind all of those services for each table. You only have to write that handler once. Oh, yeah. I mean, it could be a generic handler that takes a table. Exactly. Yeah. Yeah. Yeah. Okay. Oh, wow. That's pretty slick. Yeah. Yeah. It should work. I mean, for the simple use cases where it's... If it's just CRUD. I mean, if you're doing some kind of other complex stuff, then, yeah, you would still write the handler. But

1:25:38 but, yeah, there's no point writing these CRUD handlers over and over again. You can just... you're kind of converting a database table into a Restate service. So you're handling all of the types of failure, like the schema being invalid, the write failing, and those become Restate service errors, as RPC errors. And then once it's an RPC handler, you have all of Restate's properties, like the RPC will happen once and will always complete and things like that, and they'll go through invocations. So this is a really nice thing, and something

1:26:07 we would really like in the future is maybe even Restate can do SQL operations for you. So you wouldn't even need to write such a wrapper, but you can sort of do context.SQL, and we would manage the transaction and guarantee that it only happens once, and this kind of thing. So nice. Yeah. Watch this space, but we would love to do that. Alright. I have taken up enough of your time for the Rawkode Academy self-help program. So... Well, let's do a part two, because this is actually really interesting. Yeah. I'll get a few steps forward. So

1:26:39 we get to that point where I think it's more interesting for the viewer to see a more complete system rather than what is right now a very early prototype, a proof of concept. But I like that generic handler idea. I can share that in my Bun workspace, and the simple things can just pull that through, and that's gonna be amazing. And then I can get on to the more advanced use cases where we have real workflows with real challenges. And then we can look back and say, okay. Here's where we are. What's the next steps?
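[Editor's sketch] The generic handler idea (one function that turns any table into a validate-then-upsert handler) might look roughly like this. It is a sketch under stated assumptions: `crudHandlerFor`, `MemoryTable`, `hasStringFields`, and the local `TerminalError` are hypothetical names, and a real version would wrap a Drizzle table and register the result as a Restate service.

```typescript
// Local stand-in for a "never retry this" error type.
class TerminalError extends Error {
  constructor(message: string) {
    super(message);
    Object.setPrototypeOf(this, TerminalError.prototype);
  }
}

// Minimal table contract assumed by the generic handler.
interface Table<Row extends { id: string }> {
  upsert(row: Row): void;
}

// A validator returns the typed row, or undefined when the input is invalid.
type Validator<Row> = (input: unknown) => Row | undefined;

// Written once: validate, fail terminally on bad input, otherwise upsert.
function crudHandlerFor<Row extends { id: string }>(
  table: Table<Row>,
  validate: Validator<Row>,
): (input: unknown) => Row {
  return (input) => {
    const row = validate(input);
    if (row === undefined) {
      throw new TerminalError("input does not match the table schema");
    }
    table.upsert(row);
    return row;
  };
}

class MemoryTable<Row extends { id: string }> implements Table<Row> {
  readonly rows = new Map<string, Row>();

  upsert(row: Row): void {
    this.rows.set(row.id, row);
  }
}

// Tiny schema check: every listed key must hold a string.
function hasStringFields<K extends string>(
  input: unknown,
  keys: readonly K[],
): input is Record<K, string> {
  if (typeof input !== "object" || input === null) {
    return false;
  }
  const record = input as Record<string, unknown>;
  return keys.every((key) => typeof record[key] === "string");
}

type Technology = { id: string; name: string };
type Person = { id: string; forename: string };

const technologiesTable = new MemoryTable<Technology>();
const peopleTable = new MemoryTable<Person>();

// The same factory is bound once per table.
const createTechnology = crudHandlerFor(technologiesTable, (input) =>
  hasStringFields(input, ["id", "name"]) ? { id: input.id, name: input.name } : undefined,
);
const createPerson = crudHandlerFor(peopleTable, (input) =>
  hasStringFields(input, ["id", "forename"]) ? { id: input.id, forename: input.forename } : undefined,
);

createTechnology({ id: "restate", name: "Restate" });
createTechnology({ id: "restate", name: "Restate" }); // a retry is harmless
createPerson({ id: "rawkode", forename: "David" });
```

Handlers with real business logic, such as checking that a show and a person exist before linking them, would still be written by hand.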

1:27:08 Yeah. Well, I know it's gonna be worth it. Yeah. Alright. Something I'd like to finish on is just to talk about what's next. Now, Restate is already a 1.0 release. You've got that seed funding. You've mentioned context.SQL, which doesn't exist but could exist. Maybe you could shed a bit of light on what's on the roadmap. What are you working on as a company and a team? Yeah. So I would say 85% of the resources right now at Restate are working on the distributed runtime. So, like, you know, for context, we have come from

1:27:43 the world of, like, building distributed systems, and we have designed Restate from the ground up to be distributed. But right now, we've been running a single binary today. Right? And that binary is a single process. And a lot of the things that we're talking about are way easier in a single process. We haven't taken those easy routes, but we need to prove that now. We are distributing Restate. We're making it work in high-availability setups, so, like, active-passive, but also, like, fully distributed sharded setups, like, ten, twenty, fifty nodes. And, yeah, the design always accounted for that

1:28:13 at the start, but to actually build, like, a distributed system with virtual consensus from scratch in Rust is obviously a huge amount of work. I would say we are now in the last stages of that, but it's kind of taking up all our energy. And this is really exciting, because we want it to be possible that you can kill half of the Restate instances or lose a DC or whatever and have milliseconds of failover time or even no failover time. That's really important as part of our mission. So, yeah, that's what I'm excited about.

1:28:45 I'm really, really hopeful that we'll have something there, like, very soon. Nice. Awesome. Any final words before we depart for today? No. I think we covered everything. Had a really awesome time, so thanks very much. It's been brilliant. Really appreciate you taking the time to, you know, share your story, the Restate story, and even help me with debugging and planning some future Academy features. So, again, thank you so much for your time. We'll definitely reach out and get a part two in the future, and I'm excited to see what you and the team at Restate do.

1:29:17 Awesome. Well, thanks very much. I'll see you next time. Yeah. For everyone watching, thank you very much. Feel free to say something in the comments as we bid you adieu. Have a great day and the rest of the week. Bye.
