About this video
What You'll Learn
- Model database schemas as declarative SchemaHero custom resources.
- Plan migrations, inspect generated SQL, and approve changes manually.
- Use operatorless mode, Flux, and seed data for GitOps deployments.
Mark Campbell from Replicated joins to demo SchemaHero, a Kubernetes operator that turns database schemas into declarative custom resources. We install it via Krew, run it against Postgres and MariaDB, review generated migrations, and wire it into a Flux GitOps workflow.
Jump to a chapter
- 0:00 Introduction
- 0:54 Introduction
- 1:10 Introducing SchemaHero & Database Migration Challenges
- 2:11 Guest Introduction: Mark Campbell (Replicated)
- 3:10 Why Database Migrations are Difficult
- 5:08 Declarative Approach vs. Imperative
- 8:37 Demo Setup (Kubernetes Cluster, Postgres, MariaDB)
- 8:43 Installing SchemaHero
- 9:41 Installing SchemaHero (Kubectl Plugin & Operator)
- 11:44 Installing Kubectl Plugin (Krew)
- 12:05 Running Homebrew
- 12:49 Installing SchemaHero Operator
- 13:58 Operatorless Mode (CLI Discussion)
- 14:05 Use Cases
- 15:26 Generating Schema from Existing DB (CLI Feature)
- 16:05 Tutorial
- 16:12 Defining Database Connection (Kubernetes CR)
- 18:55 Reference
- 20:05 PostgreSQL
- 20:54 Database Must Exist (Limitation Discussion)
- 21:45 Get Databases
- 22:58 Defining the First Table (Kubernetes CR)
- 23:15 Create New Object
- 24:22 Applying the Table Definition
- 24:55 Checking Pending Migrations
- 25:25 Failed to Connect
- 26:30 Migrations
- 27:34 Understanding Planned Migrations & Approval Workflow
- 28:33 Describing the Migration (View Generated SQL)
- 29:28 Approving the First Migration
- 29:45 Beekeeper
- 30:38 Manually Adding Data (for testing)
- 31:15 Modify Table
- 31:52 Modifying the Table Definition (Adding Column, Rename Discussion)
- 33:12 Adding a New Column with Constraints
- 35:03 Applying Modified Table Definition
- 35:45 Migration Failure Due to Data Conflict
- 36:15 Adding Default Value to Column
- 36:55 Demo Continued
- 37:36 Applying Corrected Table Definition
- 38:11 Approving Corrected Migration
- 38:38 Verifying Column Addition (with Data)
- 38:41 Testing Manual Schema Change (Drift Detection)
- 39:32 Why Automatic Drift Detection Isn't Continuous
- 41:55 Marking Table for Active Deletion (`isDeleted`)
- 43:24 Enabling Immediate Deploy
- 44:28 Enabling & Discussing Kubectl SchemaHero Shell
- 46:20 Testing Immediate Deploy with New Column
- 46:43 Verifying Immediate Deploy
- 46:55 Transparency
- 47:40 Workflow
- 48:34 Transition to Operatorless Mode (Deleting Operator)
- 49:30 Operatorless Mode
- 49:35 Using SchemaHero CLI (`plan` & `apply`)
- 50:45 Plan
- 51:45 Troubleshooting
- 52:04 Manually Dropping Table (for CLI test)
- 53:30 Migration
- 53:44 SchemaHero CLI: Plan
- 55:02 SchemaHero CLI: Apply
- 55:05 DDL
- 56:06 Guest Demo Introduction
- 57:13 Code changes and migrations
- 58:16 Guest Demo Setup (GitOps, Flux, MySQL, Sealed Secrets)
- 59:06 New Feature: Seed Data (Alpha 1.3)
- 1:00:11 Demo YAMLs (Database & Table)
- 1:01:53 Demo: Fixing Kustomize Config
- 1:02:16 Pushing Changes & GitOps Workflow
- 1:03:59 Using Kubectl SchemaHero Shell (Demo)
- 1:04:19 Exploring DB via Shell
- 1:04:31 Defining Seed Data (YAML)
- 1:04:53 Pushing Seed Data Changes & Verification
- 1:05:21 Seed Data Opt-In Explanation
- 1:06:54 Describing Seed Data Migration (SQL)
- 1:08:15 Discussion: Generating Fake Data (Feature Suggestion)
- 1:09:42 Demo Wrap-up
- 1:10:02 Q&A: SQLite Support
- 1:11:08 Community Involvement & Conclusion
Full transcript
Generated from the English captions. Timestamps jump the player to that moment.
0:54 Introduction
0:54 Hello, and welcome back to the Rawkode Academy. I'm your host, David Flanagan, also known as Rawkode. This is Rawkode Live. I just like seeing Rawkode as many times as I can in the first ten seconds. And I also hate myself a little bit for it. Now moving on with today's session, we are taking a look at a CNCF sandbox project. This is SchemaHero. SchemaHero is going to make all of our database migration lives a whole lot easier. And I am not gonna show you. Well, I am kind of, but we are joined by a wonderful guest, maintainer of the project,
1:10 Introducing SchemaHero & Database Migration Challenges
1:26 Mark Campbell, who is gonna keep us up to date on all things SchemaHero. How's it going, Mark? How are you? It's great. Thanks for having me. Excited to be here. Yeah. Yeah. It's awesome. I've been looking at this project for a while, and I'm I'm really keen to kinda play with it and just see how it can simplify my life a little bit. I mean, I I think I've had the worst luck with databases throughout my entire career. So I'm really looking for, like, the secret weapon to making my life a little bit easier. Is is that is that what
1:54 we're gonna get today? Hopefully. That was like our goal. I think, you know, I don't know. Maybe maybe you've had the second worst luck with databases. We had such bad luck, we created a whole project around it to make it easier. Yeah. Okay. Before we get back into SchemaHero and and start showing people what this thing can do, can you tell us a little bit about yourself, please? Yeah. Sure. So, yeah, I'm Mark Campbell. I'm cofounder and CTO of Replicated. At Replicated, we're helping software vendors ship enterprise versions behind the firewall installable products, versions of
2:11 Guest Introduction: Mark Campbell (Replicated)
2:27 their product to their enterprise customers. We focus on Kubernetes, so you take a Kubernetes app, package it up as a Kubernetes app, a Helm chart, whatever, and Replicated can help you deliver that to air-gapped environments and complex environments. And as part of that, we ran into a lot of different problems that you don't run into when you deliver first party software when you're trying to either deliver third party software or consume third party software. Database migrations were part of it. We created SchemaHero and, you know, put it in the CNCF sandbox and, like, excited to dig into
2:59 it a little bit more today. Yeah. It's a bold name. Right? SchemaHero. Like, I'm expecting some really big things from this. Yeah. Let's it'll deliver. Could maybe we could talk about a little bit of the the history of the project. You know, you said you've I said I have the worst luck. You said, no. You probably have the worst luck as you build a project. Like, why are database migrations so difficult to get right? Yeah. So, you know, we started off really focused on the schema part of database migrations. You know, we we're using Goose
3:10 Why Database Migrations are Difficult
3:29 to run them. If you're familiar, it's like, you know, writing Go code, and so we would write SQL migrations often using Go. And it generally worked pretty well, but it was, like, imperative where we would say, I want you to go you know, you'd you'd write a SQL migration that said go add a column, drop a column, map, create a table. And the challenge that we really had was, like, kinda came around to you have to be, like, really, really confident in the current state of that database schema before you can apply changes to it,
4:00 which generally sounds, you know, like an okay thing. Right? Like, have my database. I can go look at the schema, the state of it, and then decide, okay. I wanna, like here's the commands that I need to do to alter it to what I want. The problem is, like, in a you know, when you're running a lot of migrations, sequencing becomes difficult. You might have three migrations pending, and you each of them work against the current state of the database. But when you apply one, like, the second and the third one are no longer valid, and it may
4:28 actually have, like, adverse side effects in kind of, like, like, result in, like, a like, dropping some data a a column or doing something that you didn't expect. And then for us, you know, again, since we're delivering software into third party, you know, enterprise customer controlled environments, we have, like, very little ability to guarantee the current state of the database. And so all we wanted to say is, you know, here's the state that I want the database to be at, kinda like Kubernetes. Right? It's very declarative. SchemaHero is, and we can just say, like, I
don't care what the state of the database is right now. Here's the state that I want the database at. Like, do whatever is necessary to move it from a to b. Yeah. I think that's one of the things that got me really excited about it when I was looking at the documentation. Like, you know, I'll talk my database history is starting off twenty five years ago, hand rolled SQL that was deployed whenever we shipped the application with probably many more manual changes, which I think it's just the way things were back then. At least I don't think
5:08 Declarative Approach vs. Imperative
5:27 I was doing anything too bad. But then came along ORMs and they give us this ability to kind of model our code, our database as code, then it would generate the migrations for us. And then those would run on deploy. And that worked really well, but then you run into problems with rollbacks, whereas I I think rollbacks are still always a problem. Right? Like, probably just shouldn't rollback. I'll get your thoughts on that if you want to share. But it's it's nice to see that go back to what I was trying to make a point of. I was at the SchemaHero documentation,
5:56 there were no SQL statements. It was, like you said, a declarative, here's what my database has to look like. Here are the fields that I want. Here are the types that they have, and I don't need to worry about anything else. Like, that's a pretty cool feature. And Yeah. Like, I mean, if you're used to, you know, Rails or Django, you know, like you mentioned ORMs and, like like, that pattern was really, really good. Right? You don't have like, it came with challenges around rollbacks and a few different things, but but these challenges are gonna exist. It's a complex subject. You
6:26 know? The the challenge really, though, was Django or Rails, like, the migrations are pretty tightly coupled into the ecosystem and the language and the framework that you're actually using to, like, to build stuff. And not all languages, not all ecosystems have that. Go has GORM. It's generally more around, you know, not writing SQL statements at runtime, but, like, that functionality just should exist, and it should be decoupled from like, we thought, that functionality should exist and be decoupled from the actual, like, language and framework that you're using and actually push down
6:58 to the platform layer to be responsible for it, kinda like what Kubernetes has done. Yeah. I I I kinda wanna highlight one distinction to make sure I have my own mental models that kind of makes sense for it. So for everyone who's not familiar how the ORM migration stuff worked, when I generated the SQL, every migration got an ID, that was usually stored in the database to see that migration had run. So typically, that migration would never run again. And if you make manual changes, any subsequent migrations that have to run wouldn't be aware of those changes and that
7:26 would break and then you just get paged and then you're in a whole lot of hurt for a few days while you work out why there was a manual change in the database. What I got there from looking at the documentation and and even just talking to you now, it sounds like SchemaHero doesn't have any stored state. It's it's looking at the database as it runs to work out if there have been manual changes, would it then fix that for me? Is that Yeah. Like, so, you know, like, we build that in as, like, drift detection, right, as the
7:57 feature is what we're calling it. And then exactly, like it doesn't if you've deployed and said, here's the state that I want this table to be at, and then somebody's gone and, you know, manually changed the column type. And then the next time you try to deploy that table, it's gonna detect that and it's gonna say, that table, like like, my job isn't to add this column to the table or my job isn't to add this foreign key. My job is to make the actual table in, you know, Postgres or Cassandra or Cockroach, whatever that is, match
8:25 the desired state that's defined in this YAML file. And so it it ends up doing, like, drift detection and bringing it back to the desired state as needed. Awesome. Well, it sounds exciting. I think we should just dive right in. What what do you think? Let's do it. Let's do it. All right. Let's start off with what I've done in advance. And the answer is standard, almost nothing. So we do have a Kubernetes cluster. This is running on Civo as K3s running Kubernetes 1.20. I have no SchemaHero installed, but I did go ahead and deploy Postgres
8:43 Installing SchemaHero
9:02 and MariaDB because I'm assuming we'll be working with both of those today. These are empty, they have no configuration, they have no data, nothing like that. So we're starting from almost zero and I have the SchemaHero website here and my expert. So I feel that we have everything that we need to be successful now. No pressure. No pressure. It's funny. It's like when I when people join me on this show, the thing they're not worried about is, like, their own confidence or their knowledge. It's always the, oh, I really hope our docs are up to date.
9:33 Oh, yeah. I hope they are. Documentation is always just one of those really challenging things. So yeah. Alright. Let's click on get started. So I'm assuming the first thing we need to do is just get this thing installed. Yeah. So I see we have two choices. Yeah. Go for it. I was just gonna tell you, there's there's two choices here. This is like the the documentation on how to install SchemaHero. On the top nav bar, there's also a tutorial. Like, whatever way you want to go with, like, we like, it's under Learn SchemaHero, one tab to the left.
9:41 Installing SchemaHero (Kubectl Plugin & Operator)
10:03 Whichever path you want to take, one just like this path still leaves you with SchemaHero installed. It's a little bit more like, you know, walk you through every single thing. Well, yeah, I probably need all the help I can get. So let's use the the start of this tutorial because it says new to SchemaHero and I I certainly am. Alright. So we're getting some introduction. It's even given us a little basic database that we can design, so reservations, schedules, and airports. And I'm sure someone has spent a lot of time putting this all together, but I'm
10:36 gonna skip onto the install step of the Yeah. Let's do it. Alright. So it says so do we need something local and something on the server? I see that this is installing a a plugin. Do you wanna kinda just break that down for us? Yeah. So SchemaHero is a it's packaged the way I think that we're going to go through it today, it's packaged as a Kubernetes operator. It doesn't actually have to run in Kubernetes. It's just a CLI command, and you can run it completely outside of Kubernetes as long as it can talk to the
11:06 database. But, like, we're gonna go through it, and we think the primary and kind of the best use case is really, like, running it as a Kubernetes operator. So we have a Helm chart. We have other, like, you know, community supported ways to install it. The preferred way for us is to grab our kubectl plug in using Krew that we publish called SchemaHero, if you have Krew installed. If not, we can get that too. The benefit is that it will actually manage installing the operator for you and also kinda give you kubectl schemahero CLI commands so that you can interact exactly
11:38 with, you know, like, the SchemaHero extension inside the cluster. I normally have Krew installed, but I did switch to my M1 Mac recently. So we'll need to quickly grab this. Looks fun. Yeah. Am I supposed to trust this? Seven seven lines of of shell scripting here. Let's do it. I'm looking for the safe option. Yeah. Alright. Let's do that. Although Homebrew can be notoriously slow, but, hopefully, it's not too bad. I do love that they store all the binaries in GitHub Container Registry now as OCI artifacts. It's really clever. Yeah. There we go.
12:05 Running Homebrew
12:32 Great. Alright. So installed. So that just installed the the client side portion, obviously. That hasn't, like, talked to the cluster at all. Yeah. It's now telling me to run the kubectl schemahero install. So I'm assuming this is what's gonna go ahead and install the operator to our cluster? Exactly. I just I think you when you install Krew, you have to, like, add some path to the shell in order to do it. I've never Am I making this more difficult for myself? Is that what you're saying? I mean, when you go so, like, if
12:49 Installing SchemaHero Operator
13:18 you maybe? Yeah. Exactly. Like, this this binary exists somewhere now. Right? Like, I don't know where it is on your path. Neither do I. That's that's great. Okay. Let's see. Do we have a .krew directory? We do. And we got a bin. So let's there we go. So we are gonna do export PATH with .krew/bin. And then we're gonna run that again. That looks better. That's pretty fast. Does it get a sort of namespace? It does. Okay. It does. Yeah. So I wanna elaborate on something that you said there then. So let's close that. You
13:58 Operatorless Mode (CLI Discussion)
14:03 said that SchemaHero doesn't have to run in cluster. So with the use case in fact, you said I could also just run kubectl schemahero. It's got a whole bunch of things on it? Exactly. Yeah. Like, you can you know, we'll dive into it here. And so, like, in in kinda getting into how SchemaHero works, but, like, you can see there's a couple commands there, plan and apply. Those are probably the big important ones. And so you can say, here's a manifest, and I want you to plan it against this database and give it a URI and
14:05 Use Cases
14:35 run it. And we actually run it that way in Kubernetes at times when we're in controlled environments that, like, we can't deploy an operator because we can't get cluster RBAC level permissions, and so we can still run it. It's, like, it's less automated at the end of the day. You still get the benefits of SchemaHero, but, like, it doesn't, you know like, some of the ongoing like, the benefits of an operator is you have that reconcile loop that's running continuously in the cluster, and you lose that. Yeah. I'm assuming for people that aren't doing local development against Kubernetes, this
15:05 is the best way as well for them. Like, if they're just doing Docker Compose to run their their services, but they still wanna manage and use SchemaHero for the local dev environment, they could just do the SchemaHero plan, apply logic themselves. Yeah. For sure. Yeah. Exactly. And another use of it, it's not built into the operator, is we have a a generate command, which you can use, like, against an existing database. And you can just say, you know, like, one of the things that we've done in SchemaHero is we've said, oh, here's a
15:26 Generating Schema from Existing DB (CLI Feature)
15:36 declarative structure. There are Kubernetes manifests, and this defines the table. Well, if you have an existing database, that can be a little bit tricky to migrate over if you have, you know, hundreds of tables and it's complex. And so SchemaHero generate will actually, like, connect to the database, describe the schema, and then spit it all out for you so you have a starting point, you can actually migrate to SchemaHero. Oh, nice. We'll definitely have to try that then before we finish. Is that is that part of our tutorial? It's not, but, like, well, let's try it.
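The generate idea described here (connect to a database, describe its schema, and emit a declarative starting point) can be sketched in a few lines. This is an illustration using Python's built-in sqlite3 module, not SchemaHero's implementation; the function name `describe_table` and the shape of the returned dictionary are assumptions made for this example.

```python
import sqlite3


def describe_table(conn, table):
    """Read a table's columns from SQLite and return a declarative description."""
    # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk) per column.
    # The table name is interpolated directly, so only pass trusted names.
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    columns = []
    for _cid, name, sqltype, notnull, _default, _pk in rows:
        column = {"name": name, "type": sqltype}
        if notnull:
            column["constraints"] = {"notNull": True}
        columns.append(column)
    # pk is 0 for non-key columns, otherwise the 1-based position in the key.
    key_rows = sorted((r for r in rows if r[5] > 0), key=lambda r: r[5])
    return {
        "name": table,
        "primaryKey": [r[1] for r in key_rows],
        "columns": columns,
    }


# Usage: describe a throwaway in-memory table (hypothetical airport table).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE airport (code char(4) PRIMARY KEY, name varchar(100) NOT NULL)")
print(describe_table(conn, "airport"))
```

Running something like this once per table gives a file per table that can then be edited by hand, which is the migration path the speakers describe for databases with many existing tables.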
16:04 Alright. Do wanna do that first or do you wanna go stick to this tutorial first and and Let's go through this tutorial here. Alright. Okay. Wise wise decision. So we've done we have screen available locally. We have that available inside our cluster, and now we need to connect our database. So what databases does SchemaHero support? So we support Postgres, MySQL, Cassandra, CockroachDB, and SQLite. And I think we're working on adding support for Yugabyte right now. The goal is, like, you know, common databases, and we wanna continue to expand that. Cool. Awesome. So I don't know. If you already have
16:12 Defining Database Connection (Kubernetes CR)
16:51 Postgres running, David, like, we might wanna skip some of these parts here and Yeah. So I actually have a port forward to both Postgres and MariaDB running in another terminal on my machine. So No. Great. Oh, this is the app I'm using. I didn't realize that this was in the documentation, this Beekeeper app. So that's obviously quite a a good choice. I picked it for completely silly reasons. I thought it looked nice, and it's the only reason when I search for Postgres clients, I picked Beekeeper. But I'm glad that that this has validated something for
17:23 me now. So I'm gonna have That's actually, that's why we picked it and put it in the docs too because it just looks better than most of the other ones. Yeah. I I actually I was playing, like, those VS Code plugins now where you can just, like, open empty text files and write SQL and then say execute and stuff, but I felt that was a bit a bit hardcore for this stream. I like these visual things. Okay. Sorry. Digressing there. I'm gonna copy our manifest here, which is going to create a database on our database.
17:54 I guess we should just have a code session too. Right? Yeah. We're gonna write some YAML. Love me some YAML. I've lost my mouse. There we go. Alright. So database1.yaml. So pretty standard Kubernetes looking manifest here. It's okay. So the URI is coming from a secret of a key. I mean, can I hard-code that? Yeah. You can. If we look at the the docs, we can see exactly how. So let's let's head over to the the documentation tab instead of the tutorial tab. I was hoping for a duplicate action there, but I'll just do that. There we go.
18:47 Docs. Then connecting data. Actually, that's another example. There you go. The reference for all fields. There we go. So Alright. Value. Okay. Yeah. Obviously, in production, like, don't mind. It's just hard to Yeah. It's just because I I have no idea about the config maps or secrets available on this cluster. This was a Civo marketplace install of the database. I took no ownership of it whatsoever. So I'm just going to provide the service name, which is going to be postgresql.default.svc.cluster.local. And I'll assume we're gonna just put the port on
18:55 Reference
19:36 that. I think we need I think it needs to be like a there's like a postgresql://. Yeah. Exactly. And then you can throw, like, if you don't need a username and a password, do you need the username and password to connect? How is it deployed? Yeah. We we will need a username and password. So I just oh, yeah. I'll just put them in the URI then. So Sorry. Sorry. It's not a publicly available Postgres No. One hopes. Yeah. Cluster IP. Safe. What could go wrong? Yeah. Perfect. Okay. So I need to describe my postgres config map, which has all of
20:05 PostgreSQL
20:18 my stuff. I wanted get DB password. User. This is terrible as secure. Some random Don't don't judge me too much. That that may be enough. I don't think we need to put the DB name in or anything because we're creating a database. So I think what we have here should be enough for it to work. The database already exists in it. So one thing SchemaHero today doesn't do is create the database for you. It assumes that you've provisioned a database. We have a feature request on on that, but, like, you have to yeah. Alright. Okay. So
20:54 Database Must Exist (Limitation Discussion)
21:08 I think the database just goes on the end. Yeah. That is part of this blurb. Postgres DB. Alright. So it's impressive that you can write a Postgres connection URI off the top of your head. That's Well, it's not worked yet. It's not worked yet. Right? Fair enough. Alright. Are we feeling confident? Let's do it. Or Oh, we gotta change the name and the namespace up there. You probably wanna name it something not AirlineDB. I don't know. But you can. Whatever you want. Rawkode Live. Why not? Alright. Let's try again. Okay. So I'm assuming I can just run
21:45 Get Databases
21:52 get databases. Yeah. Exactly. I'm assuming if I run describe, we make some events or We should, yeah. And no events there. But if we look at like if you want to do like a get pods, what this actually did is created one controller pod, like the Rawkode Live controller zero. Nice. And that's the one that's going be managing this database. And I know one of the things that's always tricky is debugging some of these systems. And so this is, like, where to go to watch logs. You know, it'll spit out, like, the the schema and, like, any kind of
22:30 challenges or problems and and and errors it runs into here. Okay. I mean, I'm assuming there's no error message, so the connection probably worked? I think so. Yeah. This is this looks good. This is a positive result. At least a better error tutorial. I'm assuming we wanna start creating some tables next. So That's good. So to do a validate, we try and get databases. Yeah. Cool. Alright. Let's create our first table. Again, SQL flashbacks there. Yeah. I haven't written any SQL in so long. I've gotta say. Good. You shouldn't have to. It gets gone. Alright. So we could just modify oh, no.
23:15 Create New Object
23:18 We're creating a new object. So we've got a table there. Got it. Let's just keep it in the same file. Keep it simple for now. So k. We'll call I can call this table, I'm assuming. Oh, this is the manifest name. Right? So Yeah. Let's call that project. I'll just use my own namespace. It's going to the database that I just created, which I called Rawkode Live. And the name like, if I omit this, does it use the manifest name or no. It doesn't. That's a good idea. Yeah. Like, we we have them as separate fields
23:58 just so you know. Like, we we like underscores in table names, and Kubernetes doesn't like underscores in manifest names. And so we ended up, like, having some weirdness, and that's why they're two fields. But, like, you know, I like the idea. Okay. So we're gonna have a primary key. I'll just keep the airport schema just now. I mean, we can modify it and play with it and see what happens. But for right now, I think this is enough to test that we're we're on something that may be working. Great. You happy with that? That looks good.
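The reason for two separate name fields can be made concrete. Kubernetes object names are lowercase DNS-style names, which rules out underscores, while SQL table names commonly use them. A minimal sketch, assuming a hypothetical helper `manifest_name_for` and a deliberately conservative check (the DNS-1123 label form, which is stricter than many resource kinds require):

```python
import re

# Lowercase alphanumerics and '-', starting and ending with an alphanumeric.
DNS_1123_LABEL = re.compile(r"[a-z0-9]([-a-z0-9]*[a-z0-9])?")


def manifest_name_for(table_name):
    """Derive a Kubernetes-safe manifest name from a SQL table name."""
    candidate = table_name.lower().replace("_", "-")
    if len(candidate) > 63 or not DNS_1123_LABEL.fullmatch(candidate):
        raise ValueError(f"cannot derive a valid manifest name from {table_name!r}")
    return candidate


# Usage: the table keeps its underscore, the manifest gets a hyphen.
print(manifest_name_for("flight_schedules"))
```

The table name itself stays untouched in the spec, so the database still sees the underscore form.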
24:22 Applying the Table Definition
24:25 Yeah. And we can connect already to beekeeper and see the the schema and stuff like that. Right? Yeah. We should be able to. So I'm gonna apply that. Get databases and tables. Looks good. I don't know what I'm looking for, but I think it looks good. I'm gonna Looks good. Describe. We've got the schema. So one thing you can do here is actually use the kubectl plugin that we have there, and you can do kubectl SchemaHero get tables or get migrations. Well, I think we're gonna dig into, like, the migrations part of it here. So
24:55 Checking Pending Migrations
25:06 Okay. So pending is zero, that sounds good? Yeah. Well, it didn't there's no migration, which is a little bit weird. I would have expected there to be a migration that it, like, needed to deploy this table. It may either still be running or some other problem. Like Oh, now we gotta fail to connect. So my confidence with my Postgres URI. Looks like failed to split host port. Oh, there's like a slash missing, I think, it looks like. I look at username, password, and then a URI just shows right up there. Oh, yeah. Silly, silly, silly.
25:25 Failed to Connect
25:49 It should be an at. Right? Yeah. Exactly. There you go. An at sign's better. Alright. Let's yeah. Let's just keep those logs coming, actually. We may we may have to delete that pod and let the let the controller I don't know if that pod's gonna automatically reconcile. That may be a bug. So if we run get migrations right. Okay. Let's delete the pod. Let's Yeah. This is what the reconcile loop is supposed to do though for you. Right? Exactly. We'll let that go away. We'll let it come back and that'll be fixed. Alright. Let's get our logs again.
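The connection failure in this part of the demo came from a malformed connection URI: the separator between the credentials and the host should have been an @. A quick local check before applying the manifest can catch that kind of slip. The sketch below uses only Python's urllib.parse; `check_postgres_uri` and the example user, host, and database names are hypothetical.

```python
from urllib.parse import urlsplit


def check_postgres_uri(uri):
    """Return a list of problems found in a Postgres connection URI (empty if none)."""
    problems = []
    parts = urlsplit(uri)
    if parts.scheme not in ("postgres", "postgresql"):
        problems.append(f"unexpected scheme {parts.scheme!r}")
    if parts.username is None or parts.password is None:
        problems.append("missing user:password@ before the host")
    try:
        parts.port  # raises ValueError when the port part is not numeric
    except ValueError:
        problems.append("port is not a number (is an '@' missing?)")
    if not parts.hostname:
        problems.append("missing host")
    if not parts.path.strip("/"):
        problems.append("missing database name")
    return problems


# Usage: a well-formed URI, then one with '/' where the '@' should be.
print(check_postgres_uri("postgresql://app:secret@postgresql.default.svc.cluster.local:5432/appdb"))
print(check_postgres_uri("postgresql://app:secret/postgresql.default.svc.cluster.local:5432/appdb"))
```

Without the @, the parser treats the credentials as host and port, which is roughly the situation the controller's host/port error pointed at.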
26:30 Migrations
26:47 No errors is good. No errors is good. So with tables migrations planned. Oh, we have a migration. Hey. Yeah. So yeah. If you actually run this with kubectl SchemaHero get migrations, we we spit the output a little bit differently. So we'll kind of like, they're both valid. You see the same names. Like, I think that we need to change the way that we're doing this a little bit in the project. No. I can't do that. Okay. No. I don't think we did that. So Okay. So we've got one pending migration. We've got a migration that
27:29 has been planned for fifty seconds. So what's happening here? Can you fill me in? Yeah. So you do it it fifty seconds ago, it saw that there was, you know, a table YAML that was deployed, and it realized that there's something like, it doesn't match the actual running schema in your database, which I think is your current schema has nothing in it. And this is a plan to create a table. So it generated the necessary DDL, like the SQL statements that were going to be either create table or alter table or whatever. And it stores them in this migration CRD.
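The planning step described here compares the desired table definition with what the database currently contains and produces the DDL needed to close the gap. The following is a simplified sketch of that idea only; the function `plan_migration`, its inputs, and the exact SQL it emits are assumptions for illustration and do not reproduce SchemaHero's planner.

```python
def plan_migration(table, desired, current):
    """Return DDL statements that move `current` columns to `desired` columns.

    `desired` and `current` map column name -> SQL type; `current` is None
    when the table does not exist yet.
    """
    if current is None:
        cols = ", ".join(f"{name} {sqltype}" for name, sqltype in desired.items())
        return [f"CREATE TABLE {table} ({cols})"]
    statements = []
    for name, sqltype in desired.items():
        if name not in current:
            statements.append(f"ALTER TABLE {table} ADD COLUMN {name} {sqltype}")
        elif current[name] != sqltype:
            statements.append(f"ALTER TABLE {table} ALTER COLUMN {name} TYPE {sqltype}")
    for name in current:
        if name not in desired:
            statements.append(f"ALTER TABLE {table} DROP COLUMN {name}")
    return statements


# Usage: an empty database yields a CREATE TABLE; a later change yields ALTERs.
desired = {"code": "char(4)", "name": "varchar(100)"}
print(plan_migration("projects", desired, None))
print(plan_migration("projects", {**desired, "runways": "int"}, desired))
```

Because the plan is computed from the live state rather than from a history of applied migration IDs, the same desired definition produces different statements depending on what is already in the database.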
27:34 Understanding Planned Migrations & Approval Workflow
28:03 And then, like, by default, SchemaHero expects that there's some workflow here where you can actually decide, I want to look at this so you could kubectl schemahero describe migration and pass that ID in. And it gives you the ability to, like, check it and say, yeah, that's actually the SQL that I want. You can turn that off and let it auto deploy. You can see the generated DDL statement there is create table projects. Alright. So it's it's already given me the command. So I can do approve migration and drop the ID in. And that's gonna execute
28:33 Describing the Migration (View Generated SQL)
28:41 the migration and look at the table. Okay. What what is the recalculate there? Does that just tell to go build that again? Yeah. So we in that migration, we we store that create table statement. That way, we're, like we calculated it one time, and now we store it. So recalculate will basically erase it's supposed to erase that and tell it, like, go go look at the database again and see if that's still the same command or what you need to do. Okay. So if I was modifying it manually at that point in time, I might
29:12 wanna ask it to go back and recalculate. Exactly. What happens if I reject it? Nothing other than it puts a timestamp that says you rejected it at this time. Like, that's all it does right now. Alright. Okay. Let's not reject it. Let's approve and run. No. We wanted to to table. Right? Yeah. Either get migrations or get tables. So there's zero pending now. Okay. Executed, approved, no rejection. Is this where we open beekeeper and Yeah. Let's do it. Let's make let's make sure the table actually like like, SchemaHero did what it's supposed to do here.
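The approval workflow discussed here has three actions: approve (which runs the stored DDL), reject (which only records a timestamp), and recalculate (which discards the stored DDL and plans again). A minimal model of that lifecycle, as a sketch: the `Migration` class and its field names are hypothetical, and how the operator stores these values internally is not shown in the session.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable, Optional


@dataclass
class Migration:
    ddl: str
    approved_at: Optional[datetime] = None
    rejected_at: Optional[datetime] = None
    executed_at: Optional[datetime] = None

    def approve(self, execute: Callable[[str], None]) -> None:
        """Record approval, run the stored DDL, then record execution."""
        self.approved_at = datetime.now(timezone.utc)
        execute(self.ddl)
        self.executed_at = datetime.now(timezone.utc)

    def reject(self) -> None:
        """Only records a timestamp; nothing is run."""
        self.rejected_at = datetime.now(timezone.utc)

    def recalculate(self, plan: Callable[[], str]) -> None:
        """Discard the stored DDL and plan again against the live database."""
        self.ddl = plan()


# Usage: collect executed statements in a list instead of a real database.
executed = []
migration = Migration(ddl="CREATE TABLE projects (code char(4))")
migration.approve(executed.append)
print(executed, migration.executed_at is not None)
```

Storing the statement at plan time is what makes recalculate useful: if the database changes between planning and approval, the stored DDL may no longer be the right one.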
29:45 Beekeeper
29:59 Alright. We got our projects. Cool. No data in that button. Yeah. How to how to add data? I've not gone that far with Beekeeper yet. View data. Oh, views view yeah. Or view structure maybe too. You can actually yeah. I I don't know how to I should start playing with the the oh, there we go. Alright. Code. I'm in Glasgow. There we go. We have our first airport. Cool. Great. I just wanted to put in, like, one row so that we can maybe migrate the data and we'll actually see that that we don't lose that data. I think
30:38 Manually Adding Data (for testing)
30:46 that would that's important. I think you I'm not a Beekeeper expert here either, but I'm I I I suspect you gotta click that apply button at the bottom there. Yeah. That should be a bright color. Come on, Beekeeper. Yeah. There we go. And select star from that will complete too. Sweet. Okay. There we go. So we're confident that SchemaHero did this job. That's pretty cool. Back to this tutorial, are we I want I think let's like, the next step in the tutorial was just, like, go modify the table. So, like, if whatever let's let's let's go modify stuff.
31:15 Modify Table
31:26 Oh, yeah. So let's just wait. Okay. So Yeah. Everything happens, and our database CRD and our table most of our table's CRD but we have this database for the custom resource. So let's because this is declarative, I can just do whatever I want. Right? Within reason. Yeah. With Yeah. Alright. So I've decided this is going multilingual. We'll do English name. Now what I'm curious about this specific scenario is there's nothing really to tie this to being formerly named. So does SchemaHero look at the previous schema and understand that this is a rename because the type is
31:52 Modifying the Table Definition (Adding Column, Rename Discussion)
32:07 the same? Does it think it's a new column and drop the old column? Or do I have to add anything to tie this to what was the column formerly known as name? Like Yeah. Yeah. You jump to the hard declarative schema migration problems here. It's unfortunately the second right now where it's going to see this as the table no longer has a column named name, and now it has a column named English name. So I think that SchemaHero is gonna calculate this as drop column and add column,
32:38 you're gonna lose that one piece of data. There's a issue in the repo to say, like, you know, prior names are for like, exactly where you went with that last option, like like, be able to list for for former names and then be able to rename columns. But today, it's not in the in the product yet. Okay. So there it doesn't rename. So we would just always leave the column names as is really and work with with new columns. So Yeah. We we can add columns. We can modify column types, constraints, keys, you know, things like that for sure, but, like or
33:06 indexes, but not not renaming columns yet. Okay. Cool. Let's add one more column then. Got a preference? Let's go. What what what what else does airports have? Like, number of runways or something like that. Yeah. There we go. Yeah. Runways up. That'll do. Type. And is that number n is there any abstractions? Am I working directly with the Postgres types here? Just Postgres types here. So, like, int, you know, big int or yeah. Whatever, like, whatever Postgres types are. Okay. And can I set, like, a default value or anything like this? Yeah. You can. We may have to look
33:12 Adding a New Column with Constraints
33:56 at the docs off the top of my head on that one, but, like, you can set a default value. Yeah. We've got the reference open here. Right? So if we click on table okay. So we could do foreign keys. We've got our indices, columns. We can set a not null. I don't even know if Postgres does have default values, to be fair. I was just thinking. No. We do that in we use that in MySQL quite a bit. I'm curious if we we question. Okay. So we got let's just do a not null.
34:44 Alright. So that's constraints not null on the column. That we got it here. This will be a good one because it's gonna show you that it like, you already have a piece of data, you already have one row in there and, you know, adding a new column with a not null and not a default, like, Postgres is not gonna, you know, it's not gonna be happy with that alter table statement. Right? Yeah. Good point. I hadn't thought of that, but that's cool. So let's break it. Let's break it. Let's do an apply. And then if I run SchemaHero, get migrations.
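The failure being set up here is easy to reproduce outside the demo. The following is a minimal sketch using Python's built-in SQLite module as a stand-in for Postgres, which is an assumption made purely so the example runs anywhere. The table and column names loosely follow the demo, and the row values are illustrative. One difference worth noting: SQLite refuses a NOT NULL column without a default outright, while Postgres only refuses when the table already contains rows, which is the situation in the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE projects (code TEXT PRIMARY KEY, name TEXT)")
# One existing row, like the one added by hand in the demo.
conn.execute("INSERT INTO projects (code, name) VALUES ('GLA', 'Glasgow')")

add_without_default_failed = False
failure_message = None
try:
    # The existing row would have no value for the new column.
    conn.execute("ALTER TABLE projects ADD COLUMN runways INTEGER NOT NULL")
except sqlite3.OperationalError as err:
    add_without_default_failed = True
    failure_message = str(err)

# Supplying a default gives the existing row a value, so the change goes through.
conn.execute(
    "ALTER TABLE projects ADD COLUMN runways INTEGER NOT NULL DEFAULT 0"
)
rows = conn.execute("SELECT code, name, runways FROM projects").fetchall()

print(add_without_default_failed, failure_message)
print(rows)
```

The existing row survives and picks up the default value, which is why a default is the usual way out of this situation.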
35:03 Applying Modified Table Definition
35:19 We've got a new planned one and I can describe the ID. So this was to do alter table projects, add column runways, integer, not null. Kinda what I expect. So do we approve this? Let's approve it. I'm assuming that's never gonna be executed because of the constraint. Is that what's happening? Right. It never successfully executes. And so I think if you look at the logs of that controller pod again, you'll see the actual error here. Yeah. Okay. Well, it's got no values because this doesn't exist. So I think we do so I was gonna say, think we I
36:15 Adding Default Value to Column
36:15 think we do support the default field here as a string. So, like, if you wanna take this one step farther, let's actually go and add default at the same YAML level as type. So the reference doesn't have it, but my autocomplete, which is querying against the cluster, does pick it up. Interesting. I think you have to I think it's a string type, and then we'll cast it to, like, ints. Yeah. What else we got? Attributes. What's that? Oh, okay. We thought it well, that would be silly, but, yeah, we could
36:54 cool. Let's try that again. So we approved that migration, and now we're gonna generate a new migration. Is that going like, what's the workflow here? Do I have to now reject the old one in order for that to work? When it's the same table, what SchemaHero is going to do is it it's going to it doesn't allow you to have these, like, stair stepped migrations that are in flight. So it's going to say, like, you have a migration for this table. Now when you deploy this new YAML, it's gonna calculate a new migration for this table, so
36:55 Demo Continued
37:25 it'll replace the one that's sitting there failing right now. Will it take away the approved state and then it's reapproved? It will. Yeah. It'll be just back to, like, two migrations, one's unapproved. Alright. That sounds good to me. There we go. Planned, not approved. Oh, no. That's the third one. It may be a recent change that we keep it there, so you for audit purposes. But yeah. Alright. So alter table projects add column runways, not null, default zero. As you can see that this one calculated that against the current state of the database,
37:36 Applying Corrected Table Definition
38:01 not against the current state plus the pending migration here. Right? Yeah. It wasn't altering the column. It was adding it's still adding a new column. Okay. Let's approve this one then and then get executed and approved. Sweet. So if we go to beekeeper and we run this query, yeah, we get runway zero. Cool. That worked. Hey. It's good when things work. Right? Yes. Let's refresh. Okay. I'm gonna just keep trying to break stuff. Are are you comfortable with that? Let's let's do it. Alright. I'm gonna modify the schema myself. So if I can work out how to
38:41 Testing Manual Schema Change (Drift Detection)
38:49 do that. In fact, I'm I'm gonna cheat. Where was the alter statement? So I don't know how to add columns with beekeeper, but we do have this query window, so I'm just gonna use it. So let's add a column called never, like so. Now in my head, what I expect to happen here is for SchemaHero to detect this and per generate a new migration to remove it. Is that what you think is gonna happen? Very, very, very close. That's what should happen. The only thing you have to do today is you like, SchemaHero has that table as in sync, we don't
39:32 Why Automatic Drift Detection Isn't Continuous
39:39 continuously reconcile and put a load on the database querying it. Like, there's a a thing that we we wanna figure out how to, like, do that drift detection in large databases without, like, overloading the database, hitting the schema up constantly. So today, there's you could either deploy any other change to the table, or if you delete, like, delete table dash all then just redeploy that table, exactly what you said will happen. It just there's no trigger in the reconcile loop to tell it to go recalculate that right now. Okay. So our best way forward would be just
40:14 to modify it, assuming all airports have at least one runway. Sure. It's not an airport otherwise. Or yeah. Apply. Okay. So, yeah, I wanna make sure I understand. So SchemaHero does the reconciliation, but if it believes your database is in sync, it doesn't wanna add the burden of extra load querying on that. So Yeah. One thing that we do, and we can dig into this a little bit later, is we use GitOps inside our production environment that's constantly syncing YAML into the cluster. So when we run into this problem, we'll just delete a table YAML. SchemaHero, by the way, just
41:00 like it won't it's designed to be relatively safe. If you delete this table YAML, it's not going to go drop the table from the database. It's going to leave it there and think that's now an unmanaged table. There's an isDeleted field that you can set in the YAML, which will actually tell SchemaHero, I want you to actively go delete this table. So one trick that we do to kind of poke SchemaHero when we need to is just delete the table YAML, let Flux redeploy it, and then it recalculates it. Okay. Cool. Let's approve this one
41:38 and refresh here impatiently. Let's do it. Don't have to be very impatient. Okay. Perfect. Yeah. It's working. It's doing everything that I kinda want it to do. I'm pretty happy with that. Now I'm gonna do the thing that you said, which is is it here? Yeah. And that's just a boolean. So as this manifest is deleted, it will delete the table? Exactly. Yeah. So you can probably get rid I mean, you can obviously, you can leave the rest of the data there too, but it's not gonna be that important. Alright. Okay. I understand you. So when I
41:55 Marking Table for Active Deletion (`isDeleted`)
42:24 apply this, it will delete the table. It's not an indicator that if the manifest disappears to delete. Okay. That makes sense. Yeah. Exactly. Which we erred on a, like, really, like, you know, side of caution that, like, you know, you need to say very declaratively, I want this table deleted. Not like, the absence of a YAML means we should delete this table. That felt a little too risky for what we wanted to do with an early project. Yeah. Working with databases, I guess you've always got to err on the
42:53 side of caution. Right? The last thing you want is people complaining that things disappeared. Let's try. I'm going to have to approve the migration, right? Yeah. So there is an immediate deploy flag on the database that you can set. And then when you have that set, SchemaHero doesn't wait for you to approve them. It just automatically moves them into the approved phase. Okay. Let's approve this one first, and then let's try that. Right. So I've lost the number. There we go. Beekeeper. Alright. Definitely gone. Okay. I've now realized that that was a terrible,
43:24 Enabling Immediate Deploy
43:46 terrible mistake. Okay. So where do I set the "we don't need approval" thing? Is that on the database or on the table? It's on the database. I believe it's right under your autocomplete will help you. I believe it's right underneath Postgres. Should be immediate deploy. No. Maybe up under connection. Or we may need to look at docs. Oh, database. Right? So let's pop over here. Oh, yeah. Immediate deploy is on the spec. Okay. Yeah. What's the enable shell command? So if, you know, you have a database SchemaHero can talk to your database, obviously. Interestingly,
44:28 Enabling & Discussing Kubectl SchemaHero Shell
44:37 to your point, you have this port forward setup in order to connect through to the cluster IP service. The shell command basically will allow you to kubectl SchemaHero shell database name, and that'll create a it'll create a pod running PostgresQL and exec you into that to connect to the local database so that you can actually get into a database that you don't have a direct route to. Oh, nice. I'm gonna try that too. Alright. Alright. On spec. I don't know why we're not getting the autocomplete. I don't know. I think we were we never went high enough
45:15 on it. I'm sure I did. I'm gonna rewatch this later and be like I'm determined to prove VS Code wrong. Okay. I'm sorry VS Code. It was me. Okay. So we're this is gonna get us our shell command under immediate deploy so we can start adding columns, and we should just see that instance. So I'm gonna reapply. Do I need to restart the controller for that change? It's a great question. I'm wondering that myself right now. So let's see if it deletes if it does it by itself. It should do it by itself. The right
45:54 the right behavior of SchemaHero would be to do it by itself. Yeah. There we go. Oh, it did. Great. Yes. We have the thing. Okay. So When you look at migrations, you'll still see, by the way, that, like, it still goes through the same phase. You know? It's just that it auto approves it for you because we we you still have an audit log of exactly when it was approved and and and executed. Alright. Well, let's add funny joke. I don't know. Too much pressure, Varcher. Okay. So play beekeeper. Funny joke. Okay. That's pretty cool to see that instantaneous
46:43 Verifying Immediate Deploy
46:47 kind of reconciliation when I when I make the changes here. Yeah. I guess that's that's confidence mode. Right? Exactly. We you know you know, completely transparently, we we have that enabled for pre prod environments, for QA test environments, and dev environments, but we actually still don't do that for our our own production environment. We still have a thing where we're like, some you know, a human should not write the SQL statement, but, like, look at it and decide it's okay to run. Is this something where there's, you know or do you see tooling or if that's eventually
46:55 Transparency
47:21 coming where when the migration gets created, you know, it's posted to Slack, if someone can just approve it. I don't know if I'd want Slack hiccups quite there, but at least a notification to say this pending migration, this is what it looks like. And is that tooling that you're using yourselves, or is it very much just kubectl all the time? Yeah. Right now, we're still doing kubectl all the time. Like that, like, finding that right workflow. So it's like, is it chat ops? Is it, like, in Slack? Is it, like, you know, how do you do it?
47:40 Workflow
47:51 It's it's a it's a great question, actually. Like, we're like, I don't think, like, running kubectl approved migration is the right path, honestly. Like, it's like it gives you the workflow and the check that you need in place, but, like, they're honestly, like like, it's like you know, working with large enterprises, they have, like, change management systems and, like, you should have, like, an audited event of, like, who approved that thing into large migrations be you know, because, like, it can have negative effects, and you just wanna know who's who's looking you wanna make sure some human is looking at them really
48:20 at the end of the day. Okay. Let's do one more thing, and then we'll move over to your demo. Does that sound good? Sounds great. I'm gonna do a delete on this because I wanna do it without the operator. Alright. So oh, it's in a namespace. Right? So SchemaHero. Yeah. I'm assuming this is the deployment. No. No. It was an STS. Right? So it was named that way. STS SchemaHero. So we now have no more operator or controller running for that. And you said Right. So I wanna think about the
48:34 Transition to Operatorless Mode (Deleting Operator)
49:02 local dev thing. When I meet teams and people, they're not always deploying to Kubernetes. They are for production, but when it comes to dev, they're still using local services or Docker Compose. It's one of tooling that. So I'm curious if I have my application. I've got this op directory with the SchemaHero. I've got the the exact same YAML here that I've shipped to prod. What's my process to get that working with just the SchemaHero command? Yeah. So there's, like, one page in the docs that show it. Like, we'll we'll let's go through it and we'll figure it out here. I don't like,
49:35 Using SchemaHero CLI (`plan` & `apply`)
49:41 the CLI command syntax exactly off the top of my head. Under documentation, under advanced, there's a page called operatorless mode is what we call it right now. And this starts to show you how you can like, this shows it running as a job in Kubernetes, which is probably, you know, like, looking at this now thinking that's probably not the right syntax to give you, like, the operatorless mode in because we designed this really being like, I wanna run this in Kubernetes, but I don't have cluster admin permissions. But you can actually see that it's running,
50:14 you know, the job. You know, we may have to, like, reverse engineer this a little bit to say, like, okay. It's running SchemaHero plan, and then it's passing in a couple of args to it. You know? Yeah. I'm sure we could work with that. So Yeah. K. Email. Let's try that plan first. This will take the YAML as yeah, spec file and spec type. Right? So let's try spec file. And you don't have to deploy a database type here. Like because what you can do is just pass in driver driver and URI right on the CLI. So you can pass
50:45 Plan
50:58 in driver equals Postgres and then URI equals that connection string. And then just the plan to the YAML, and it should plan it for you. Oh, so it's not parsing the custom resources on the YAML to pull out the database information and the table. It's Yeah. It parses the custom resource in the table, but you basically deconstruct the database part and give it as CLI flags at this point. Alright. Add URI. Let's just copy that. Spec file. Yeah. So a couple things I think are causing the problem. One is, like, your machine,
51:45 Troubleshooting
51:52 don't think can access like, local host five four three two is gonna be a problem here. Oh, no. Do you still have the port forward open so it actually could probably do that? Yep. They're still running because that's what Beekeeper's using right now. Let's drop the table, like, because it might just be that there's no differences detected or something like that. Oh, yeah. I had to delete the alright. Okay. Drop table. Done. There's like a out pram. I maybe I should've written docs, shouldn't I? Yeah. That's just Alright. So alright. It takes environment variables. Okay.
52:04 Manually Dropping Table (for CLI test)
52:47 So this seems to be potentially a directory, and there's a schema here which will create so I was trying I was gonna create migration. It was okay. So I think we're close. Yeah. What am I missing? So I think there's maybe looking for a directory. So I'm gonna do spec. I'll equals dot maybe out migrations. No. Not quite. I still got my database file. Yeah. Okay. We're close. Oh, you know, let's let's like, I'm kind of I'm kinda guessing a little bit here where we're like but let's can we comment out or do like, let's make
53:30 Migration
53:40 that database one YAML only have table specs and not a database spec in it. Alright. Okay. Yeah. Okay. So migration. That seems promising. Yeah. Alright. So it's trying to unmarshal something and failing. So that's I think It did work. Oh, because dot contained other files. Yeah. Yeah. I think I'm confusing it. So let's do r equals migration. You should also I think spec file can take a directory or a file. Like, you can I think that we can support both there? There's just a dot at the end of it. Oh. Alright. Oh, that's a fail.
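For a local development script, the plan step worked out here, and the apply step the demo moves on to next, could be wrapped along the following lines. This is a hedged sketch rather than documented usage: the flag names (driver, URI, spec file, out, DDL) come from the conversation, but their exact spelling on the command line is an assumption, the connection string and paths are placeholders, and the two helper functions are hypothetical. The sketch only builds the argument lists, so nothing is executed against a database.

```python
import shlex
from typing import List


def plan_args(driver: str, uri: str, spec_file: str, out: str) -> List[str]:
    """Arguments for a plan run; flag spellings are assumed, not verified."""
    return [
        "schemahero", "plan",
        "--driver", driver,
        "--uri", uri,
        "--spec-file", spec_file,  # a table YAML file, or a directory of them
        "--out", out,              # where the generated statements are written
    ]


def apply_args(driver: str, uri: str, ddl: str) -> List[str]:
    """Arguments for an apply run against previously planned statements."""
    return [
        "schemahero", "apply",
        "--driver", driver,
        "--uri", uri,
        "--ddl", ddl,
    ]


# Placeholder connection string for a port-forwarded local database.
uri = "postgres://user:password@localhost:5432/exampledb"

plan = plan_args("postgres", uri, "./tables", "./migrations")
apply_step = apply_args("postgres", uri, "./migrations")

# shlex.join quotes each argument so the output can be pasted into a shell.
print(shlex.join(plan))
print(shlex.join(apply_step))
```

Passing the list to `subprocess.run` instead of printing it would run the real command, once the flag names have been checked against the operatorless mode page in the docs.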
53:44 SchemaHero CLI: Plan
54:48 Oh, but we did get the migration. Okay. So based on that, I'm assuming I could plan all the way back here. Apply. Yeah. And this is gonna take --ddl is, I guess, the parameter name here. DDL equals migration. Oh, hey. There. We did it. Well, and you never wrote SQL. Okay. That's cool. So, yeah, we can without Kubernetes local development environments, we can point it to a file or directory containing our table definitions. So we'd probably keep our databases separately. But, yeah, my local repository for my application code is never
55:05 DDL
55:43 gonna contain that database CRD. I mean, that's always gonna come from the platform and ops team. And so the table one's just living there. Actually, it probably makes better sense. And then we can do the plan and apply. We get the migrations, and it worked. Cool. I'm happy with that. Great. We're gonna move to your demo now. We have a question from Rio in the chat, which I'll throw over to you. So Rio asks, is there a way to set the name of the migration beforehand? This could enable CI systems to block deployments
56:06 Guest Demo Introduction
56:18 of specific software until SchemaHero shows a migration has been executed. It's a it's a good question. So today, like, those those IDs that were, like, the name of the migrations, they're actually deterministically calculated by the generated SQL statement that is being created. And one of the benefits of that is if you have to recalculate them or whatever, the ID won't change. You end up with a history. Every time I deploy the same migration from the same state, I get this ID. So today, like, no. There's that's unfortunately, that's not possible. And it's an interesting thing to to to
56:56 think about, but today, we we they're all they're all calculated deterministically. I guess you could have it in a container, though, that looked up all migrations and made sure that we're all up to date before deploying software. I I mean, if that was part of your workflow. But, yeah, maybe something Yeah. Yeah. It's actually we have a monorepo that has the database migrations and our code in it. And so when we deploy a change that involves code changes and a schema change, it becomes a little bit tricky. And you have to be careful if we don't have immediate deploy to true
57:13 Code changes and migrations
57:32 because if that gets deployed all the way, you might have code that expects a column to exist and you haven't approved that migration yet. We, like, have decided to handle this by saying, you know, don't write code that is hard dependent on that column to be there. Like, you can handle that by writing code a little bit differently. That's not like, we don't want to tell you to change the way you write code. So that's that's a overall problem if you're handling, you know, monorepo that handles code and migrations, though. Like, there's a it's a challenge. Yeah. Definitely. Alright.
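The deterministic IDs mentioned in the answer above can be pictured as a fingerprint of the generated SQL. The sketch below shows the general idea only. The conversation says the ID is calculated deterministically from the generated statement; the choice of SHA-256, the seven-character truncation, and the function name are all assumptions made for illustration.

```python
import hashlib


def migration_id(generated_sql: str, length: int = 7) -> str:
    """Derive a short, stable identifier from the generated SQL text.

    Hypothetical helper: the hash function and the length are assumptions.
    """
    digest = hashlib.sha256(generated_sql.encode("utf-8")).hexdigest()
    return digest[:length]


first = migration_id("alter table projects add column runways integer")
again = migration_id("alter table projects add column runways integer")
other = migration_id("alter table projects add column terminals integer")

print(first, again, other)
```

Because the ID depends only on the statement, planning the same change from the same starting state always yields the same ID. It also explains why the name cannot be chosen in advance: it is not known until the SQL has been generated.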
58:03 Do you wanna share your screen, and we'll we'll take a look at what you got, Ashul? Sure. I'll I'll share here. Alright. So great. So what I have is alright. So I have a Kubernetes cluster just running on DigitalOcean and a managed separate out of the cluster MySQL server right here. And then MySQL server is only available inside the cluster. It's set up similar to the cluster IP, obviously, but it's like a VPC firewall rule that's making it so it's not publicly accessible because that's how people run production systems, hopefully. And also in case I accidentally show a
58:16 Guest Demo Setup (GitOps, Flux, MySQL, Sealed Secrets)
58:43 password here on the screen, I'm not super worried. And, like, I did a little bit more prep on this one, and I have Flux running. So I have Flux running in a GitOps deployed pipeline. So I can just write code, commit it, GitHub actions runs, and then it gets automatically deployed. And the thing that I wanted to show was a couple of things here. We've been playing around a lot lately with a new version of SchemaHero. Oops. Sorry about that. And so this is like the alpha version of 13, which is currently in progress. We hope to
59:06 New Feature: Seed Data (Alpha 1.3)
59:19 ship this in mid February. And there's a really cool feature in it, and my goal was to show that. So I'm I'm gonna basically go through the same same demo. We'll go pretty quickly. We're gonna install the 13 operator to this cluster, and that's going to spin up the operator. Here I have a semi, potentially typical repo that has a CI pipeline, more matches like how you actually would run stuff in production. This pipeline is lit I mean, in order to make the CI process really quick for this, I don't have code in it. It's
59:57 literally just Kustomize running a bunch of, like, manifests. And I have Sealed Secrets installed so that I can actually deploy secrets through a GitOps pipeline. Here, I have a database YAML. And the database YAML, I have the same commands that we just looked at, immediate deploy and enable shell command, and this deploy seed data set to true, which is one of the cool new features of 13 that I wanted to show off. In my production overlay for Kustomize and sorry. I'm throwing a lot here, like, around GitOps and Flux and Kustomize and everything. Like, I think
1:00:11 Demo YAMLs (Database & Table)
1:00:32 that, you know, just trying to, like, demo a more, like, you know, realistic end production environment. So I have a sealed secret that contains my connection string for MySQL. Yep. The Sealed Secrets Helm repository. And so what I'm gonna do is deploy all of this. I have a schemas, and the schema has the same, you know, basic table. I'm actually gonna comment that part out for now. And so this is the same basic table here. It's called airport this time, but it has the code and the airport name. And the kustomization here only has the airport
1:01:10 YAML. So really quickly I need to make the window smaller so I can see the terminal prompt. So if I look at, like, kustomize build, you know, kustomize overlays production, and we look at see what that actually is creating, you know, we just have a sealed secret here. We have our database object, which we're familiar with. We just went through it. And then I don't have a table because I probably forgot to save a file somewhere or something like that. Let's try that one more time. I have a table in here. So base database kustomization.
1:01:53 Demo: Fixing Kustomize Config
1:01:56 Oh, yeah. Here. I need to include the schemas. Sorry. Alright. So here. Now I have this airline DB table or the airport table being deployed. So we'll just push that to GitHub. This is the repo that I pushed it to. And a GitHub action is gonna run, which is going to basically go, you know, run unit tests, run all the tests that normally run in CI. For this one, it's really quick. It takes about thirty to forty seconds to run all the way through CI. And then we'll render out that kustomization YAML and put it into a separate
1:02:16 Pushing Changes & GitOps Workflow
1:02:43 repo that Flux is watching. And wait for all that works, so I can, like, you know, kubectl get tables, and I don't have anything. There's nothing deployed. So we'll have Flux do, like, a reconciliation while this continues to run. It's red, but I don't think that it's actually an error. This is just what happens. This should run pretty quickly, kustomize build, and then upload all the rendered out YAML. Where does it upload the rendered output? Another repository? Yeah. So we it's right now, it's going to this other repository that I just created called GitOps demo,
1:03:28 and it's gonna deploy it all right now. You see it here. It just deployed this, and so 10 ago. So Flux is gonna watch this and pick that up. So we'll try to push Flux along and not have it sit idle. If I look at tables, it's there. I can run the same commands that we were just looking at, SchemaHero, get migrations, and this is approved. It's there. So this is because we have immediate deploy set to true. The thing that I wanna show here is a couple of things. So kubectl schemahero get databases.
1:03:59 Using Kubectl SchemaHero Shell (Demo)
1:04:05 Our database is named AirlineDB. So I can run kubectl schemahero shell that and that's gonna run a MySQL pod on that cluster and give me a connection and do it. That's cool. Now yeah. So I can show tables. I have the airport table. Select star from airport. And sorry. I don't have Beekeeper set up with port forward here, so this is a little bit harder. The table's empty. So coming back here to the part that I wanted to show, which is the part that we're excited about, is we added the seed data where I added a couple airport codes. Right?
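Seed rows like these get regenerated and re-applied on every deploy, so the insert has to be safe to run repeatedly. A minimal sketch of that idea follows, using Python's built-in SQLite module and SQLite's upsert syntax as a stand-in; the demo itself runs against MySQL, so the exact SQL here is an assumption and not what SchemaHero generates. The airport names are illustrative, and the upsert syntax needs SQLite 3.24 or newer.

```python
import sqlite3

# Two seed rows keyed by airport code, as in the demo.
SEED_ROWS = [
    ("JFK", "John F. Kennedy International"),
    ("LAX", "Los Angeles International"),
]

# Insert the row, or update it in place if the primary key already exists.
UPSERT = """
INSERT INTO airport (code, name) VALUES (?, ?)
ON CONFLICT(code) DO UPDATE SET name = excluded.name
"""


def apply_seed(conn: sqlite3.Connection) -> None:
    conn.executemany(UPSERT, SEED_ROWS)
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE airport (code TEXT PRIMARY KEY, name TEXT NOT NULL)")

# Applying twice must not duplicate anything.
apply_seed(conn)
apply_seed(conn)
count_after_two_runs = conn.execute("SELECT COUNT(*) FROM airport").fetchone()[0]

# A row deleted by hand comes back on the next apply.
conn.execute("DELETE FROM airport WHERE code = 'LAX'")
apply_seed(conn)
codes = [row[0] for row in conn.execute("SELECT code FROM airport ORDER BY code")]

print(count_after_two_runs, codes)
```

The primary key is what makes this work: without a key in the seed data there is nothing to detect the conflict on, and each apply would add another copy of the row.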
1:04:31 Defining Seed Data (YAML)
1:04:40 Like, I have two two rows in the database, JFK and LAX. I'm gonna go ahead and add that. We'll push that up to the repo. We wait another minute while while this action runs. And what it should do is the end result will be that the the manifest with the seed data will end up over here in our GitOps repo. One thing to note while the CI runs is we set this, like, that you have to opt into it for seed data because one of the things we ran into, we'd like, we we deployed it. We were using it
1:05:21 Seed Data Opt-In Explanation
1:05:28 in our dev environments to do things like, you know, deploy features and entitlements or feature toggles entitlements and a little bit of, like, you know, fixture data that you need to run tests. And it was, like, super useful. And then right when we started to think, okay. Let's merge this in and push it out to production, we were like, wait. We don't want all this data in production either. So we ended up having to, like, make that something that you can opt into and out of at the database level for now. So this should finish this
1:05:57 CI here in a second. So why have you opted for this kind of two step GitOps deployment where you push to another repository rather than having Flux just apply the YAML directly? Yeah. It's a great question. It match like, I think it's let's call it muscle memory of the way that we've been using Flux. We adopted Flux with Flux one point o when it didn't have support for Kustomize built into it. And so we ended up having to do these, like, kind of CI processes to do it. And like this like, was easy for me to set up
1:06:31 in order to do it. Like, no other reason than that. Good. That's a good enough reason. Sorry. So the seed data is now here. And if we come back here and we select star from airport, there's two rows of data in the database now. So it's not just schema. We're starting to get into the actual data. But can you describe that migration? Is that just gonna be an insert statement? It well, almost. Almost. Let me get migrations. It is insert statements, but I think it's important to point out here that here. Let me
1:06:54 Describing Seed Data Migration (SQL)
1:07:12 I've been talking, so I don't know how you do it. Describe migration. So it is just insert statements. But for MySQL, we do the on duplicate key update. And for Postgres, it'll say, like, insert ignore statements because, like, we know they're gonna be regenerated all the time. And the idea is, you know, primary key in this table is the code column. So as long as the one of the primary keys is included in the seed data, which, like, likely is going to be anyway or else it's not gonna be a very valid piece of data,
1:07:46 this works. That allows me to delete that row. And then the next time it's, like, drift detection, it'll put that seed data back, but it also makes sure that it's not duplicated each time, if that makes sense. Yeah. Definitely. I think that's a really cool feature for, again, that development workflow as well. I've been able to do the plan and apply and get data that is, you know, crafted alongside the the schema itself. Makes a lot of sense. Yeah. Yeah. I think that's a really cool feature. I'm gonna stop sharing here. But, like, yeah, the
1:08:15 Discussion: Generating Fake Data (Feature Suggestion)
1:08:18 no. But, like, the that's exactly what we use it for. You know? Our our dev environments, we ended up with you know, like most folks, we have, you know, a table full of features. And, like, you know, when we wanna roll them out, like, everybody has to run migrations to insert those features into their database for feature toggles or, like, there's, you know, generic fixture data that we use for, you know, different types of tests. And to make sure that the tests are portable between environments, we can actually just throw the the data into the seed data and not put it into
1:08:47 production. Yeah. I've got a feature request for you that you can bring for 14 now. Is that instead, you know, when I have the seed data, being able to use the Faker library and say, like, this is the first name. This is an address. This is the second name. And Oh, that's good. Generate 10 or 20 or a hundred or a thousand of them. That would be a really cool feature as well. That's cool. So seed data would be less deterministic but more just like, let's just fill everybody's environment up with it and, like, we're gonna find
1:09:11 some problems and Yeah. You know, if you wanna do some property based testing or you wanna do some load based testing and really fill and hammer things a little bit, it would be cool to be able to have seed data that is deterministic and then populate it with maybe a whole bunch of nondeterministic stuff. I like it. I like it. We may have to go add that in. I like that idea. Yeah. I don't know if you're familiar with the Faker project, but it's just this really cool library for generating fake data with, like, types and
1:09:37 understands what it is. It's pretty cool. Yeah. No. It's great. Love it. Alright. Well, that was awesome. Thank you so much for kind of working with us today and showing us that. That demo was cool to see. The other thing that was great: that shell command was great. That would have saved me a whole bunch of setup with Beekeeper stuff, but well. You know? Yeah. This is why we do these things. To show these features to people so they see how they work. We've got one question that popped up right at the end there that'll pop up for
1:10:02 Q&A: SQLite Support
1:10:03 real. So everything until now has been connection based. How does it work with SQLite? So it basically works the same way. Like, you can write a SQLite URI that just points to a file name, and then that'll point to a SQLite file. It's identical otherwise. David pointed out earlier, you know, like, he asked about the type, whether that's like an abstraction or just raw Postgres types. And you'll notice, like, in the database or in the table schema, there was a Postgres key. And in the one that I did, there was a MySQL
1:10:37 key. So, like, the schema can change. We don't try to, like, normalize and make sure the schema is identical between them all. We try to make them as similar as possible. But, like, SQLite works, you just pass in a, you know, a SQLite connection string. Yeah. SQLite is something that I was interested in. In fact, I actually planned at first to do a bit of a demo there, but I got carried away with the Postgres stuff. So I think I'm definitely gonna be playing with SQLite sidecars with SchemaHero and having some fun there. So I'll be
1:11:03 sure to get some tutorials or something onto the channel. But yeah. Awesome. Alright. Any last words, Mark, before I let you go back to your day? No. I think, you know, it's a sandbox project. Like, if I can, like, you know, we've been in the sandbox for a year. What we're really looking for is just, you know, folks who, you know, find use cases and, like, hit us up on the Kubernetes Slack. There's a SchemaHero channel. Like, things where it's working or not working, you know, we have a goal of
1:11:08 Community Involvement & Conclusion
1:11:31 trying to get out of the sandbox and move it to an incubating project sometime this year. In order to do that, we need to have demonstrated, you know, use cases, and we know it's out there. People are using it. You know, anything that we can do to help unblock or, you know, feature requests or anything like this, we'd love to talk. Awesome. Alright, everyone. Get involved if you like what you've seen. I'm sure Mark will be happy to answer questions on GitHub issues, Twitter, anywhere else that you're participating. Is there a
1:11:58 Slack for this? Yeah. It's in the Kubernetes Slack. Yep. Alright. Okay. So check out the channel on the Kubernetes Slack. Alright. Well, thank you again, Mark. It was really good fun. I enjoyed that. I'm looking forward to playing with SchemaHero a little bit more. So thank you for your time today, and I'll speak to you again soon. Thank you.