Overview

About this video

What You'll Learn

  1. Explain how VTGate routes queries using shard metadata from topology services
  2. Deploy a sharded MySQL cluster on Kubernetes with the Vitess operator
  3. Connect WordPress to Vitess then scale and reshard the cluster live

Vitess maintainers Deepthi Sigireddi and Alkin Tezuysal from PlanetScale walk through the architecture (VTGate, VTTablet, topology), then deploy a sharded MySQL cluster on Kubernetes with the operator and connect WordPress.

Chapters

  1. 0:00 Holding screen
  2. 0:45 Introductions
  3. 1:03 Introduction
  4. 1:37 Guest Introductions
  5. 3:53 What is Vitess? (Project Overview)
  6. 4:00 What is Vitess?
  7. 5:44 Vitess Architecture Basics
  8. 11:02 Control Plane & Operational Features
  9. 12:43 Architecture Summary
  10. 13:37 Supported Databases & Sharding
  11. 15:14 Discussing Application Compatibility & Tools
  12. 18:00 Installing Vitess
  13. 18:19 Kubernetes Quick Start (Demo Setup)
  14. 25:00 Creating MySQL Cluster
  15. 25:11 Deploying the Cluster & Initial Look
  16. 31:31 Connecting WordPress & Troubleshooting
  17. 39:00 Deploying WordPress
  18. 45:38 Successful WordPress Connection
  19. 46:03 Application Compatibility Discussion Continued
  20. 48:38 Exploring the Vitess UI
  21. 49:00 Vitess UI
  22. 53:20 Scaling Our MySQL Cluster
  23. 53:28 Scaling Replicas & Testing Failover
  24. 1:01:32 Failover Behavior Discussion
  25. 1:04:41 Vitess-Specific SQL & Features
  26. 1:06:08 Discussion: Multi-Keyspace / Logical Databases
  27. 1:09:50 Chatting about Sharding, Backup/Restore, & Misc.
  28. 1:10:00 Resharding (Advanced Feature Discussion)
  29. 1:13:55 Discussion: Query Consolidation & Hot Rows
  30. 1:15:32 Discussion: Backup, Recovery & Online DDL
  31. 1:21:20 Community Resources
  32. 1:22:27 Conclusion
Transcript

Full transcript

Generated from the English captions.

1:03 Introduction

1:03 Hello, and welcome to today's episode of Rawkode Live. I am your host, Rawkode. Today, we're gonna be taking a look at Vitess, a graduated Cloud Native Computing Foundation project that aims to help with clustering and scaling MySQL. To join me today, I have Deepthi and Alkin. They are maintainers from the Vitess project and employees of PlanetScale Data. Hi there. How are you both? Hello. Yeah. Alright. I think what we should do is start with introductions. I'd love to get to know you both a little bit. So who would like to go first?

1:37 Guest Introductions

1:43 I'll let Deepthi go first, please. Okay. Yeah. Please tell us a little bit about yourself, and then we'll move on. My name is Deepthi Sigireddi. I'm a software engineer at PlanetScale, and I have been with the company for over two years, getting close to three years now. PlanetScale itself was founded in early 2018, so it's just about three years old. I started working on Vitess at PlanetScale in late 2018, and I became a maintainer in 2019. We have a maintainer team with 16 members, and I've been a maintainer since. And in my maintainer role, I

2:25 communicate with the community, answer questions on Slack, and attend conferences, mostly KubeCons, but also other conferences where I give talks about Vitess. I am currently the tech lead, so I also manage the roadmap for Vitess, and I'm involved in high-level technical and architectural decisions on the project. Alkin? Alright. Thanks. My name is Alkin, and I come from the enterprise world. I've been an open source database evangelist for almost the last ten years, a decade or so. And I've been working on MySQL-specific projects and areas for, like,

3:11 services companies. I joined PlanetScale about six months ago, and I'd been engaged in conferences and webinars before. So what I am doing at PlanetScale is being part of the maintainer team and developer relations, and also wearing multiple hats over here with customer success, solution engineering, and so on. I love sailing, as some people know. And that's about it. Nice. Well, it sounds like you both keep yourselves extremely busy. Yeah. Awesome. Well, why don't we just take a few moments? I know we've got some slides that are gonna explain

3:53 What is Vitess? (Project Overview)

3:59 what Vitess is, why people need it, et cetera. So why don't you get your screen shared? We'll move that over. Let me share my screen real quick. Alright. Chrome tab. Hope you can see it. Yep. It is just coming. There we go. Perfect. Alright. So okay. We're done with the introduction. So PlanetScale, as we talked about, was founded in February. The significance of PlanetScale is that the co-creators of Vitess are actually the founders of PlanetScale. So there is a mutual relation between the Vitess project and the founders. So,

4:00 What is Vitess?

4:42 basically, our creators are the co-owners, technically speaking. So, a clustering system: we don't call it a clustering system, technically. It's actually a framework, because it comes with bells and whistles. It's a CNCF graduated project. If you have ever looked into CNCF, there are multiple projects, and this is one of the earliest database projects. The list of projects is long, with very successful applications over there. And it's open source, under the Apache 2.0 license. That's important. And we have contributors from the community around the world, so it's not actually US-centric at all. We also decided to

5:24 highlight this. It's written in Golang, and contributors should be aware that Go is the language that's used. Go comes with a lot of updates over the existing breed of languages, so we are actually a Go shop. So today, we wanna talk through the agenda: the architecture, use cases, how we actually install it, and possibly show you a demo. So let's get into the architecture basics. Let's actually talk about the glossary. We have some terms, which are database terms but adapted to Vitess. Vitess is not a database. I need to

5:44 Vitess Architecture Basics

6:10 highlight that. Vitess is a framework that drives the database. So the MySQL in the back end is still good old MySQL. Well, not just good: MySQL is great. It's a great database in the back end. So we don't actually make any changes to the database. The database still remains in one piece, but we actually drive the database with the addition of Vitess. So we call a database a keyspace, which is what it is when it's sharded. So it actually had to be called something other than a database. When it's sharded, it has a

6:45 keyspace ID. It has a primary vindex, and a vindex is what actually knows where the shards are. There's VTGate, the proxy server that accepts the connections, and VTTablet is the back-end server which pairs with the MySQL database. And there's a topology that knows what's going on within the cluster. So let's talk about a common replicated environment. We have a database and a sidecar that actually runs the MySQL daemon process, and we call this a VTTablet. So each VTTablet is paired with the

7:24 same host that mysqld runs on. And you will have multiple VTTablets. So the VTTablet that is paired with it has the responsibility of running the database, but it is also driven by VTGate. VTGate is actually our proxy, which the application connects to. So when you actually make a connection to the Vitess cluster, you actually go through VTGate. VTGate is the query-serving component of Vitess, and that actually points to the VTTablets. So in a large deployment, you

8:10 would have multiple VTGates. You don't actually have one. Technically speaking, one VTGate cannot possibly be responsible for all the queries. We're talking about hundreds of shards and, in some cases, thousands of servers, or nodes, behind it. VTGate would also have to scale, and this is how you would actually scale. And then, well, your application connects to the VTGate. At that point, you need to know where your data is. So in this example, we have a commerce database which is sharded, and it has two shards.
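
As a rough illustration of the two-shard commerce example, the sketch below shows one way a proxy could pick a shard from a sharding column. It is only a sketch under stated assumptions: the shard names `-80` and `80-` follow the Vitess key-range naming convention, but the hash function, the `SHARDS` table, and the helper names `keyspace_id` and `shard_for` are hypothetical and are not taken from the talk or from Vitess itself.

```python
import hashlib

# Hypothetical two-shard layout. Each shard owns a range of the first
# byte of the keyspace ID: "-80" covers 0x00-0x7f, "80-" covers 0x80-0xff.
SHARDS = [("-80", 0x00, 0x80), ("80-", 0x80, 0x100)]


def keyspace_id(customer_id: int) -> bytes:
    """Map a sharding-column value to an 8-byte keyspace ID.

    Assumption: MD5 is used here purely as a stand-in hash. The real
    vindex functions in Vitess are different.
    """
    return hashlib.md5(str(customer_id).encode()).digest()[:8]


def shard_for(customer_id: int) -> str:
    """Return the name of the shard whose key range contains the ID."""
    first_byte = keyspace_id(customer_id)[0]
    for name, low, high in SHARDS:
        if low <= first_byte < high:
            return name
    raise ValueError("no shard covers this keyspace ID")


if __name__ == "__main__":
    for cid in (1, 2, 3):
        print(cid, "->", shard_for(cid))
```

The point of the range-based lookup is that the application never sees it: the same customer ID always maps to the same shard, and splitting a range later only changes the lookup table, not the client.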

8:50 And how do we know where that query needs to point to? VTGate knows where your data actually sits. So if you are sharded by customer ID, and n customers sit on one shard, then VTGate would know to send the query to that shard. You can also have an unsharded cluster within Vitess. This is also something we need to clarify. You do not have to shard when you're using Vitess. Vitess also supports unsharded, just like the good old, you know, database applications, but you would also have the benefits of

9:29 the proxy server, the topology server, and the additional products that are actually bundled in the Vitess framework. So we actually send the query for your customer ID to that shard. And in this example, we know where the shards are. The next thing is that VTGate actually works with the topology server. So in this example, we have supported topology servers that know your cluster. You need to actually store all this information about where things sit and keep it up to date. This allows the Vitess framework to know the state of the

10:19 cluster at all times. So it actually keeps updating if things change or if things move. And in this example, we also say etcd or ZooKeeper are supported topology managers. vtctld is a control daemon. That's another component of Vitess, and it runs the ad hoc operations and a bunch of other things. Since it's a control plane, you need to actually have a way to change things around and apply changes. So vtctld is the daemon that actually controls all that. So, as we said, keeping the state,

11:02 Control Plane & Operational Features

11:08 Vitess knows the schemas, shards, clusters, server roles, and all that stuff as state in the topo. So when we look at the control plane, we talked about the VTGates, the proxy server. We also have built-in backup and recovery, and it's pretty much automated. When a node fails, it takes a backup and then restores it and then connects the dots, and then replication resumes where it left off. We have integrated failover, which uses a very popular, well-known open source utility called Orchestrator, as VTOrc within the

11:47 cluster. The sharding schemes are optional. You can actually customize how you want to shard. You don't actually have to shard by customer ID as in that example. There are other ways to do that. There are advanced replication options with VReplication and VStream. These are components of Vitess with which you actually control how you want to replicate from and to. This is one step beyond the built-in replication that MySQL provides. Of course, there's an online DDL option, so when you have to actually do a schema migration or schema changes or, you know, in

12:24 short, ALTER TABLE, you would actually still drive GitHub's online schema transmogrifier (gh-ost) or Percona Toolkit's online schema change utility. This is also driven by Vitess itself, or you can do a direct apply. In summary, the Vitess architecture looks like this. We have the application server that actually connects to your preferred load balancer, which actually points to VTGate. You're gonna say, okay, I have a proxy. Why do I need a load balancer? Because you might have multiple VTGates and scale out. So that's why. And then you have the vtctld daemon that controls the

12:43 Architecture Summary

13:06 architecture, and the topo server knows where things are. And then you have the tablets and the clusters, driven by VTTablet, with your primary, your replicas, and your read-only replicas, and then you can actually scale out based on your application. This is an example. This is not just one standard application. You may actually have to customize it based on your workload, as we call it. Supported back-end databases are MySQL 5.7 and version 8, and we also have MariaDB support. It does not work on

13:37 Supported Databases & Sharding

13:47 PostgreSQL, and Postgres is not on the roadmap at this time, and there is not much development on that, actually. So we can say Vitess is pretty much a MySQL-centric framework at this point. Like in that example, you can have sharding done on the entire application or on part of that application. So we know of a mix and match in the community, where it's not the entire application. We cannot say, okay, this entire shop is on Vitess, but, you know, where it needed to scale or was a fit for Vitess, it is migrated

14:28 into that. And then Vitess allows sharding and resharding. When you shard the old-school way, you're pretty much stuck. But with Vitess, if you shard, you can reshard. You can change your methods. And then with VReplication and the tools provided, you can migrate to the new method. So the idea is to minimize backup and recovery. Hopefully, I'm not out of time, and we have more resources on these things. We'll share these slides also. Perfect. Thank you. Okay. Alright. There were

15:14 Discussing Application Compatibility & Tools

15:15 many more components there than I was expecting. So I just wanted to make sure I understood enough of that for where we're going. So Vitess augments MySQL and MariaDB. Right? It's not replacing them. We're still using those standard databases in our infrastructure. That has not changed. But what it does do is provide a proxy that's gonna handle all the requests that we wish to send to our database, and that provides some sort of hashing key mechanism to shard that across multiple MySQL or MariaDB instances, and I don't even need to worry about it. I think

15:51 that alone is particularly cool. But then you kept coming with all this other stuff, like online backups and the restore processes. Just all these little things that are really cumbersome when it comes to working with MySQL specifically. I think we've all worked with it at some point over the years. Trying to scale it beyond a certain point is always very, very challenging. I wish Vitess had existed in 2010, just from what you've shown me there in those slides. So I'm really excited to play with it today and see some of this in action.

16:22 So I'm curious about third-party tooling. I'll just throw a few questions out now before we go into the hands-on bit. If I wanna use, like, a MySQL browser on my local machine and it goes through Vitess, would that cause any sort of problems, or would that just work? It depends on the tool that you're using. I'm trying to remember what it is that we tested with. There are popular MySQL inspection tools with GUIs that we have been able to get working with Vitess. Yeah. So I'm assuming that the Vitess proxy just

17:09 speaks the protocol and kind of intercepts and tries to understand the query. I mean, does it even do query planning at that point, if I do a SELECT across multiple shards, so that it knows to direct it to the appropriate ones? Is that something that's part of the... Right. So that's all part of the proxy. It does speak the MySQL protocol, parses the query, and uses the parsed query to figure out how to route it. And for certain complex queries, you actually have to do a scatter-gather: send it to multiple shards and process the results. For instance, if you're

17:44 using some aggregate function in the query, or an ORDER BY, then that might require additional processing in memory in the proxy layer to send back the proper results to the client. Okay. That's pretty cool. So even if I'm only running against one MySQL, I'm still seeing value in using something like Vitess even early on in my project. Just bringing it in, having it there, working quite happily, I still get all of that extra functionality even if I don't particularly need to drop into the sharding stuff quite yet. Okay.
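
The scatter-gather step described in this section can be sketched in a few lines. This is a minimal illustration, not how VTGate is implemented: it assumes each shard has already returned its rows sorted by the ORDER BY key, and the function names `scatter_gather_order_by` and `scatter_gather_count` are hypothetical.

```python
import heapq


def scatter_gather_order_by(shard_rows, key):
    """Merge per-shard result sets into one globally ordered result.

    Assumption: every list in shard_rows is already sorted by `key`,
    as it would be if each shard executed the same ORDER BY locally.
    """
    return list(heapq.merge(*shard_rows, key=key))


def scatter_gather_count(shard_counts):
    """Combine per-shard COUNT(*) values into the overall count."""
    return sum(shard_counts)


if __name__ == "__main__":
    shard_a = [(1, "ann"), (4, "dee")]
    shard_b = [(2, "bob"), (3, "cat")]
    print(scatter_gather_order_by([shard_a, shard_b], key=lambda row: row[0]))
    print(scatter_gather_count([3, 5]))
```

A merge of pre-sorted streams keeps the in-memory work in the proxy layer proportional to the number of shards rather than requiring a full re-sort of all rows.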

18:19 Kubernetes Quick Start (Demo Setup)

18:19 Then I think we should just play with it and see how easy it is to get working. So let's jump over to my screen share. I have here the Vitess documentation, which we'll use as our guide today. I have Docker for Mac running with nothing, I hope. Yeah. Nothing running at the moment. So I see that there's a Kubernetes quick start. There's the local install. Is it fair to say that there's nothing that binds me to Kubernetes to use Vitess? I could use it in more traditional bare-metal-like environments just the same without Kubernetes,

18:57 or is some of the functionality limited to the Kubernetes runtime? Kubernetes is not required. There are large installs that are running outside of Kubernetes and large installs that are running within Kubernetes. Mhmm. What you do get with Kubernetes is being able to use Kubernetes primitives to manage your cluster versus writing your own tooling. Everyone who is doing it on their own kind of had to write some tooling to schedule backups and things like that. Whereas in Kubernetes, you can do that much more easily, especially because there is an operator for Vitess,

19:45 which is also open source. Wow. So, thank you for that insight. My take on Kubernetes is that Kubernetes is cloud native. So when you do that in Kubernetes, you're already in the cloud, which also comes with a lot of tools and utilities, including load balancer, storage, your, you know, your uploads, your gateway, your security, your keys. Let's say you're using Google and GKE and launch a cluster in Kubernetes: you already get everything at your, you know, customization.

20:26 You can actually set things up and utilize those. If you're on local, then you need to figure out, okay, what's my storage? How do I, you know, get a load balancer? Where do I run, you know, another node? Let's say you need to run an application. Are you gonna run it on Kubernetes, or are you gonna actually launch a, you know, virtual machine? You still need a cloud. In that case, if you're in Kubernetes, you get all that. The other thing is, like Deepthi said, the operator. If you have an operator,

21:01 which means many things are automated. So within the operator, there's logic that runs: okay, if this node dies, bring up another node, attach it to your source or primary, and then connect and continue as if nothing happened, because nodes can die in Kubernetes. Right? That's normal. But locally, if a node dies, then you need to figure out, okay, what happened to that? Right? And there is not much automation in that sense. So those are the differences. And the pros and cons of the learning

21:40 curve versus the old school and how you did things in the past. Right? Excellent. Thank you. Yeah. You're right. I'm a big fan of operators in general. I think they are taking, you know, all that experience from people like yourselves that are running these things, or have done in the past, and trying to encapsulate that into a binary that I run on my Kubernetes cluster to make my life easier. I am all for that. I will take all the operators that exist out there. Glad to hear there is a Vitess one. So I guess we're just gonna click on

22:11 the Kubernetes quick start for today and get this deployed to my local cluster. Yeah. Do you want to go with the operator, or are we just gonna go with the manifests? Is there a preference there? We should do the operator. Yes. Mhmm. Okay. So I'll just clone down this directory. So this is actually the main Vitess directory. Is this just the monorepo then? This is the Vitess monorepo. And within this, we have an example of how to run it with the operator. The operator

22:55 lives in its own repo. Alright. Okay. Gotcha. And I noticed there that also... oh, what did I get wrong? Oh, I'm not in the right directory. Example. Alright. Okay. It's always my problem when things go wrong. It's always me failing to type basic things. Yeah. I noticed here that the Helm chart is deprecated. Is that in favor of another chart, or just something that, you know, the Vitess community has decided, that Helm isn't the right way to work with it? The Helm chart predates the operator, and it was useful before an operator was available. Once

23:36 PlanetScale started working on this operator that they intended to open source, the community decided that they didn't want to support the Helm chart anymore. So at that point, we deprecated it. Yeah. I think Helm is great for starting things up, but operators are much better for ongoing management of Kubernetes clusters. Yeah. I think if you don't have to maintain a Helm chart, that's a fantastic thing. I maintain some popular charts, and they've actually become a burden now. Trying to satisfy every use case in the world from every user of that Helm chart is

24:14 no easy feat. Okay. So I am going to deploy... what was it called? operator.yaml? Yep. There we go. So the operator is on Docker Hub, so it's getting downloaded from Docker. Yeah. I noticed you didn't need to install any operator source. Yeah. And it looks like it's just installed some custom resource definitions. We've got some service accounts and probably role bindings. Oh, I do see actually a priority class for scheduling. That's interesting. I don't often see that when I install operators. Is that just to ensure that the Vitess operator isn't evicted? I guess it becomes quite an

25:00 Creating MySQL Cluster

25:00 important part of my... you know, if I'm running stateful workloads on Kubernetes, I probably don't want the Vitess operator to disappear. I think... Oh, got it. I understand why that's there. Okay. Let's move on. So we wanna bring up our initial cluster. I'm just gonna pop this one open and we'll take a look at what's actually going on here. Okay. So we are requesting a Vitess cluster, and then we have the ability to override any of the images. Is that something that a regular end user would have to do, or is that really

25:11 Deploying the Cluster & Initial Look

25:44 just there as a convenience for the developers of the project? So what we do with these versions is that on release branches, when we do a Vitess release, we do a compatible example. So that tag will be a release tag versus being latest. And if people are choosing a specific Vitess release to go live with, then they will pin these versions to those releases. Yeah. Okay. And I see some vocabulary that I'm not familiar with in this context. So could we maybe break down what a cell is here? Right. So in Vitess terminology,

26:26 a cell is any failure domain. So it can be a group of servers, a data center, an availability zone, a region. Typically, it's an availability zone when you're deploying in the cloud. So here, we are basically saying we've got one cell, and we call it zone one because it represents an AZ. Right. Okay. Gotcha. And then we have how many replicas we want. We have some resource constraints and then the authentication model. So should I apply this now before we go through it? Will this take a little bit of time to get up and running, or is it

26:59 pretty quick? It should be pretty quick. Are you on minikube right now? It's just Docker for Mac. Okay. Yeah. It should go fast. We'll let it go. Sure. Okay. So then we have something called a Vitess dashboard, which seems to reference the cells that we specified. We've got some extra flags, replicas. So when we use the Vitess operator to deploy our MySQL clusters with the proxies and stuff, do we also get, like, a UI that we can navigate and explore? Yes. This UI gives you a view of

27:38 the Vitess cluster. You can also see the schema, but you don't actually have the ability to navigate down to, like, a table level with this UI. And it's very primitive, very old, and we have a couple of volunteers who are replacing that, and hopefully, that'll be available in the near future. Alright. Okay. Coming soon: VTAdmin. Alright. Now we have something called a keyspace. That seems to tie into the slides that you presented at the start there, Alkin. Do you wanna just quickly run us through what we're defining here? The keyspace is

28:20 the database, basically. So we are launching a commerce database with, you know, an initial DB .sql file that creates the tables. So it's gonna create three tables in this example, and it's gonna build a database within that zone one cluster. And then it will build one primary with, I believe, two replicas around it. Or it's one... yeah. It's two replicas. Yes. Okay. You know, if I'm coming at this with a new project, like, completely greenfield, I haven't got anything yet. I wanna go with MariaDB, and I think, well, I'll start off early.

29:05 I'm gonna bring in Vitess to help prepare for that eventual scale. How set in stone is this configuration of shards and partitions? Is this something that I can tweak as I grow as a company, or do I really have to have an idea early on of how to set this up? Did you wanna answer that? Yeah. Sure. So it's not set in stone at all. You can start with unsharded, and then you can update your configuration and roll out the changes. And the operator can provision the new pods for you. And

29:48 there are commands that you can issue to copy over the data. Let's say you started unsharded. You have one shard and you decide, my data has grown large enough. I have a couple of terabytes. I wanna break it up into two shards. Yep. Then all the commands are available for you to initiate a copy from the unsharded keyspace to the sharded keyspace and, once the copy is complete, to be able to cut over, to switch over the traffic from the old to the new. And that's a core part of Vitess. Most of the large Vitess installs

30:27 would not be able to work without that ability. So I just wanna make sure I understood this correctly. So I've got my application. I'm running unsharded. The data is getting too big. It's too crazy. I know I wanna horizontally scale this in some fashion. Does my system go offline during this migration? Can I no longer accept writes, or does it do everything actively while that's happening in the background? What's the burden on me as an operator at that point? All of the actual data migration happens in the background. And even the VTGate endpoint that

31:06 your application is connecting to won't change. But when you initiate the cutover, we'll have to stop accepting writes for a few seconds, mhmm, so that we can cut over to the new primary and then start accepting writes again. So there may be a few seconds when any writes sent by the application to the Vitess cluster may block and then start getting through again. Okay. You know what? The example that you're actually running, the initial cluster example, mimics something similar. So there's a story behind this. Right? So there

31:31 Connecting WordPress & Troubleshooting

31:43 is a story: you have an e-commerce application that started growing. So you are already accepting reads and writes on that cluster. Okay? So this is what you're building. Once you build this, you have the old version. And then what we do is we keep adding on and then migrate that data within the Vitess realm. But if you already had a MySQL that's running and then you decided to go with Vitess, first you need to come to Vitess the unsharded way. So you come from outside and migrate into that. And there's a

32:25 way to do that also, to migrate that, like, with replication. You set up replication, and you cut over, and then you fail over to Vitess after that. So once you're in Vitess, then you do one more migration from unsharded to sharded. That's the story. Okay. I mean, I... So in fact... Sorry, Andrew. Go ahead. In fact, in this very example, we'll go through that resharding process. Oh, nice. Okay. So we'll be able to see that. Well, I mean, if it's only a few seconds, I'm a very happy person. So let

32:56 me just... I think all the things we're talking about sound great. I can't imagine ever being in a situation where I've got MySQL or MariaDB backing my application and I would say I don't want this. Is it fair to say that anybody who's using MySQL or MariaDB should just use Vitess by default? Is that the mission? That is the mission. Alright. That is the ultimate goal. There we go. Perfect. Alright. So that was just some secret stuff. We could see we're doing a little bit of import of data. So in theory,

33:28 we've given that a little bit of time now. I can run a get on the Vitess cluster. We have our example cluster. I always like to run a describe and see what I get back too. So let's do that. And I'm not sure if those are old or new messages, but we got some potentially error stuff there. And let's see if we have any pods. Okay. Yeah. So I guess as it spins up a new Vitess cluster, it's likely that a few things are gonna fail until it becomes healthy. Right. But we do have

34:08 etcd. We have the VTTablets. We have vtctld and then the VTGate. So this is my new cluster. Right? These are all the components. We kind of got a rough idea of what those all are now. Mhmm. That was quite easy. That was it. Yep. So etcd itself is redundant over there, you can see. And the tablets are also redundant, and you have the other components. Those are stateless, so they're basically single, and the operator is its own driver. Why does Vitess use etcd? etcd is not the only option. You can

34:56 also use ZooKeeper, but I'm guessing your question is more of, why do you even need a distributed data store? Right? Yeah. Why not MySQL or MariaDB? So we would like to get to using the Vitess cluster itself as the data store for the discovery. Yeah. Okay. Historically, when Vitess was being built at YouTube, they already had a, quote, unquote, lock server available from Google. So it was easy for them to say, we don't have to worry about building this part ourselves. We'll just use what is available. But it does definitely make sense to

35:42 store the same data in MySQL or MariaDB. Yeah. I think that would be quite a cool story, you know, bootstrapping Vitess with Vitess. Anyway... My answer would be nontechnical. It's more a community-adopted best practice kind of thing. People know about it. It works, and it has no, like, hiccups. And I think this is what drives people to say, okay, etcd, fine. It works, and let's keep it. You know? So this is more nontechnical in that sense. But there's, like, the Helm

36:20 charts and maybe Consul: there are some technical difficulties the community adopters had, so they didn't actually pursue using them. Okay. Well, yeah, this is all really useful. I guess I was about to deviate from the documentation. I'm trying to get better at not doing that. So let's stick with it. I was just gonna deploy WordPress and go, let's connect the dots, but I'll behave myself. So we can get pods, which we've done. We've verified everything is running, everything is happy. The documentation even shows here that, you know, you will see a couple of restarts. I think

37:05 that's really good. And then what can we do? So there's a client I can use. Is that what this is telling me now? Yeah. One of the things in that step is setting up a port forward. Since you're in the operator and Kubernetes, you need to port forward to access the Vitess client and MySQL, which is actually behind the VTGate. Right? So you would set up a port forward with that script and set aliases for vtctlclient and mysql in your

37:52 shell. Yeah. So it wants me to do that because it wants to create this kind of schema. So how about we deviate then? Like, we don't actually need to deploy the schema, do we? Could we just deploy WordPress and point it at the cluster and walk through that, or is there something unique about the schema stuff that we should go over? We don't need to deploy the schema, but we do need to do the port forward because that host and port is what you will then give to WordPress. Unless you're deploying WordPress in cluster, in which

38:25 case, you don't need to do the port forward. Yes. I will deploy it in cluster. Okay. Yeah. Then you can just connect to VTGate with the port number. I'm assuming we have a service address. Right? Yeah. We do. So would I connect WordPress to the vttablet or the VTGate? I'm sorry. To the VTGate. VTGate. Okay. So I do have, well, I did have something in my buffer. Let me grab that again. The official Kubernetes examples repository does ship a WordPress deployment. So let's just

39:00 Deploying WordPress

39:06 download that. I need a workload folder, and we should be able to see this here. Let's try again. That's better. So we have our workload folder here and we have a WordPress YAML file. I don't think I need to... Oh yeah, we'll need to set the host and stuff like that. So if we're going with the Kubernetes service address, then we want... That's interesting with the service. So I can see that we're actually getting, like, a generated name happening here. Why is it not just example-vtgate? Is it, I guess, I'm potentially gonna have

39:59 more than one VTGate at some point? Okay. Yeah. So one thing I'm not sure about here, and maybe Alkin knows, is there's an example-vtgate, and there's an example-zone1-vtgate. There are two of them. Alkin, does it matter which one we use? Let's just try one and see how it goes. Yeah. I don't think it matters. Yeah. You can just use... Okay. Okay. So I'm gonna drop this in. That's gonna be the default effect. We work with that one too. Okay. And we configure auth through

40:36 the secret, which was in our initial cluster. Can I just grab the username and password from... Yeah? Okay. So it's just user with no password. Is that right? Is that what that means? Yeah. That's what that means. I'm just wondering if, with an empty string, the operator is gonna in some way randomly generate a password for me, but we'll try. No. No. Oh, no. That's actually the one to see. Oh, no. I know better than you, Kubernetes. There we go. Password blank. And we'll just set this to... now we have to play the game: is it

41:21 a DB user or DB username, so we'll find out. And then the rest of that should just work as is. I don't think there's anything. Yeah. Let's see. So in theory, I can now do a deploy of WordPress. We'll see if that comes up happy. Hopefully it pulls that image pretty quick. This is normally where I would hit Control-C and describe it and then it would say it's healthy and I was just being impatient, but now I'm gonna describe it. Pod has unbound immediate... Okay. Why does that need persistent storage? Storage? And why does it not work?

42:36 Storage classes. Okay. We've only got hostpath from Docker for Mac. Today, I learned. So let's do hostpath. I'm not even sure if I can just make this up, we're gonna find out. If that doesn't work, I'll remove the PVC and we'll just stick with ephemeral. I don't really think it's gonna matter for what we're doing today. So we'll see what happens. There, it's happy. Okay. So I should now be able to port forward to WordPress. And it should have enough information that the wizard we're gonna get is probably gonna fill itself in.

43:24 I'm saying hopefully. Connection refused. I spoke too soon. Okay. So it's failing to write to the database now. And our VTGate... So it looks like it did reach our VTGate, but the VTGate is saying that creating a database is not allowed. Right. So what port are you connecting to VTGate on? No. No. No. That's not the issue. So I did not realize that WordPress issues a CREATE DATABASE. In Vitess, a database is a keyspace, which is backed by all of the infrastructure. So if you do CREATE DATABASE, then what you are expecting is that you bring up

44:12 all of these processes. So I think we have to do the create database manually and then connect it up. Well, we did actually. Right? We did generate a keyspace. So if I've understood this correctly, the database name is commerce? Yes. Okay. So I am going to assume very naively that I can just do name WORDPRESS_DB_NAME, value commerce? Assuming I know how to spell commerce. Yeah. Good. Okay. And we don't need that. Yeah. Now it's not gonna have any problem. Will creating the tables with the schema cause any problems or should that just work out?

45:01 No. CREATE TABLE should work. Okay. So let's try reapplying our workload, and we should see a restart. Yep. We'll just keep an eye on that for a moment. And I guess if it's past the create database, then the port forward should hopefully give me that nice kind of wizard screen. There's a lot of hopefullys on this stream. Well, that's better. There. Okay. So Vitess, Rawkode. I wonder if it's gonna let me away with that. Yeah. There we go. And my email address. Definitely don't index this, Google. Okay. So that seems... Nice. So trivial. But at the same time, I wanna just emphasize

46:03 Application Compatibility Discussion Continued

46:05 something here. Like, WordPress does not know it's speaking to Vitess. It thinks it's just speaking to MariaDB or MySQL. And I never had to change anything. Is that standard then when I adopt Vitess? Like, you know, if my application speaks MySQL or MariaDB, I should just expect it to work with very few changes. Is that correct? It depends on how esoteric the MySQL features are that you are using. So certain things, like, say, SHOW BINARY LOGS will not make sense to send to a proxy. And some complex cross-shard queries still don't work.

46:45 Sometimes because they are harder to plan, and we haven't done it yet. And sometimes because they would be so inefficient that maybe we should not support them. Maybe the application should be rewritten in a more efficient way. Okay. But most things should work. Anything that doesn't work is somewhat obscure. And if there is a common thing that doesn't work, then we are fixing it. Or it's a new feature. Like I mentioned, MySQL itself excels in every version, every release. Right? So they keep adding

47:28 enhancements and new features. And those are not always caught up with in the framework. Yeah. We are definitely not caught up with MySQL 8.0. There are features there that Vitess doesn't understand right now. Yeah. I mean, that's just running MySQL right now. If I wanted to run MariaDB, would that have just been a flag in my cluster.yaml? Yes. We will need a different build. Right. Okay. A different image. Yeah. I mean, my personal preference is MariaDB, but that's just mine. I try to avoid anything that's got Oracle branded and stamped on

48:09 it. But other than that, people are welcome to use what they want. Okay. That is really cool. I'm really happy with that. So I think we're kind of covered on what's the same: you know, we've hooked up an application that doesn't require any changes to work other than that little caveat that you have to create your keyspaces as part of the operator and the cluster definition. I think that makes perfect sense. Should we dive into what's different under the hood? Can we take a look at the tools and the schemas

48:38 Exploring the Vitess UI

48:42 and how we interact with this cluster? Is there documentation for that? And I'm just making things up. Is there something you wanna point me to here? So what do you have in mind? Like, what would you like to see? So yeah. I'm just curious. Like, how do I, you know, take a look at... you said there was a UI. Should we take a look at that first? Yeah. We can do that. Okay. So I've gotta run get pods. This is where everything is. I mean, I don't see something there that says UI. Is that something

49:00 Vitess UI

49:17 we have to deploy separately, or is it just a port on one of these other components? It's a port on that vtctld component. This one, okay. So if I describe this, Clarence is well described first. Let's take a look at the ports. So 15000 or 15999? 15000 is the HTTP port. Okay. 15999 is the gRPC port. Is all communication in Vitess over gRPC? Pretty much. Yeah. There is a JDBC driver which you can use to talk to the Vitess cluster. So from outside, we can speak gRPC, MySQL protocol, and Java.

50:11 But within Vitess, it's all gRPC as of now. I did that. One less zero. Yeah. Okay. So let's go to app2. Just like a new and improved /app. Seems to keep redirecting me. Okay. That's fine then. Yeah. So this is the sort of cluster view. You can see that there is a keyspace called commerce, and it has one shard, and you can click through to that. Click into that. There's a master, which we will shortly rename to primary, and a replica. So status... oh, I can't click it. Looks like it's

51:13 clickable. Yeah. That that that link doesn't work within Kubernetes. Alright. Okay. So I can run health checks. I can delete the tablet. That sounds scary. I hope there's a confirmation on that. And I'm kinda There is there is confirmation. Okay. Alright. So if I We can look at Sorry. Please go ahead. Sorry. I was gonna say we can look at the schema. So schema is viewable from this Though it says there is no schema. That's interesting. That is interesting. Yeah. I'm pretty sure we have a schema. I'm I'm assuming WordPress has deployed something into this. Right. Right.

52:03 Is there a command line tool? Probably, there is a command line tool that you can use to do GetSchema. But my guess is that because the schema was applied by WordPress, it did not make it into the topology server. Whereas if you use the Vitess command ApplySchema, it stores it in the topology and propagates it to MySQL. Okay. That would be my guess. So this is our topology manager here. Mhmm. I'm just gonna keep clicking on stuff. Which is what I did the first time I saw this. Yeah. So we can kinda see some stuff

52:49 about the partitions and the replicas and all that. Okay. Right. Yeah. So you said there was an app v two or at least an app two. Is that something that's currently being worked on by the team? Is that just an improvement on what we see here? Like, what what what was the app two? Well, app two is this, but there is a new UI that is being worked on, which will probably have its own starting page. Okay. Alright. Let's look at a couple more things then, and we'll we'll leave the UI alone for now. I think I've clicked everything there

53:20 Scaling Our MySQL Cluster

53:25 is there now. So I'm curious, now that I have my Vitess cluster here, if I want to change the properties of this, I'm assuming I just work like with my standard Kubernetes tooling and just change the spec. Right. Now where was So If I just change the replicas to five, is that okay? Sure. Yeah. What else can I break? Let's see. So there's two sets of replicas then. We have Yes. Can you scroll up a little, and let's see what that first replicas is? This where you are right now, I know what it is. It's the MySQL

53:28 Scaling Replicas & Testing Failover

54:17 level replicas. But higher up, that's the number of VTGates. So gateway is VTGate. So replicas five will give you five VTGate pods. Okay. So that's not replicas of my data. That's just, like... okay. We don't actually need five of them then. That's not the interesting one. So I guess we wanna modify the keyspace here, and we'd wanna modify the replicas of the keyspace to be... does the number matter? Do I have to have a certain quorum? Does it have to be odd or does it not... No. It doesn't matter. It doesn't matter.

54:53 At least two, but beyond that, it doesn't matter. Okay. So one of the things I'd like to do then is, why don't we scale this up to four? Is there a way for us... like, will one of them be considered the leader then, the primary? So there is already a primary. The operator should just add two more and make them point to the existing primary and not switch to a new primary. Then can I kill the primary and things will just handle it nicely? Yes. Yeah. You can kill the primary, and one of the others

55:30 should become a primary. Let's see. Alright. Yeah. Let's just see. What's the worst that could happen? Actually, well, I know what will happen, but we'll see it happen, and then we'll talk about it. Okay. So we can see that the Vitess tablets are now spinning up, they're going through the init container phase, probably already finished. Too many clicks. Oh, we got one error. I wonder if that's just transient. Yeah, there we go. Okay. Although we did get two restarts on these ones. Wonder if that's going up. No. Okay. It seems to have stabilized.

56:07 So now we have four replicas. How would I work out which one is the leader? Would that be coming back to the UI or is there another way of handling that? You can use the command line utility with the port forward if you had the alias. You can connect to VTGate and say show vitess_tablets and it'll give you which ones they are and what they are doing. Okay. Now do I do that with the standard MySQL CLI? Or... Yeah. MySQL CLI, but you need to do a port forward and I don't think I have it, but we

56:44 can get it. It's in the... The other way to do it is to connect to the vtctld. Well, we only forwarded the HTTP port. But if we forwarded the gRPC port, then we would be able to do ListAllTablets, and we'll get a list of all the tablets. A couple of options. But, you know, if you already are, like, a MySQL CLI user... for me, remembering the flags is harder because it's new terms. Right? But just connect to the database and say show databases and stuff like that, so maybe you can explore a

57:25 little bit. Yeah. Of course. I'm confused. VTGate. Did we not have two VTGates out there? No. No. We just had one. That was in the service. When we looked at the service, there were two services which... Okay. Yeah. Right. Do you know the port number that I wanna port forward? Yeah. It's 15306. 15306. Okay. We have our port forward running. We have access to MySQL... You need to set an alias to your... Well, let's go get a client. Oh, okay. You need

58:16 a client. Right? I'll just export this. I think it just got a little confused. Yeah. There we go. Okay. So now we have a client. Can I just do mysql, localhost, -p? Localhost, 15306. Yes. -u user. Yeah. Oh, I think it's capital H, isn't it? No. I think it's capital P. Capital P? For the host? Lowercase p is, like, looking for a password. For the port number. Port number. Capital P. Yeah. Yeah. That's correct. Yeah. It's not trying to use the host. I'm wondering if that -h is correct. Oh,

59:07 let's try 127.0.0.1 instead of localhost. Yeah. Oh, that was closer. Lost connection. It might be the port forward. Oh, we got refused there. I'm gonna describe that pod just in case. Describe pod. And it was the gate. Right? Yeah. And the port is, oh, 3306 then. 3306. Yes. You need to do 15306 to 3306; I was looking at that. You know, you should be glad when it's always those little silly errors. Those are the ones I don't mind. Yeah. Now you can take out the port

1:00:04 specification because it's the default port. I'm in. There you go. Show vitess_tablets. Underscore. Okay. Yeah. Alright. So we can see we've got our zone1. We have our commerce database. This is unsharded. We've got three replicas and one primary. Awesome, very, very cool. Now I'm assuming these aliases maybe match up with those IDs on the... Let's check. Yeah, I don't like getting this many splits, but I'll do... Yeah. Three three. Yeah. Okay. So these are pod names. So there's still a random thing on the end, but... Right. We want to kill two four six.

1:00:57 So... Right. The rep. Delete pod. So what should happen here when I do this? There's not gonna be an application that's consistently hitting it. So I'm doing it. We're probably not gonna be able to, like, visually see failover, but I'm assuming when I delete this pod, if I run our show tablets again, will we see some sort of status or degradation here or will it just immediately switch to offline, and will something else take over? Well, that tablet may not even show up in show vitess_tablets depending on when we

1:01:32 Failover Behavior Discussion

1:01:40 run it, depending on what state it is in at that point. And what the operator does, and this is different from how most people run Vitess in their live installs, is that it's actually so quick to restart the tablet and reattach it to its volume and reconnect it to MySQL that it doesn't do a failover. It just restarts it. So the pod will simply restart. So we'll see whether I'm right about that. Guess we'll just wait for this to at least finish, 'cause I guess until kubectl tells me the pod's deleted, it's

1:02:28 probably not actually gone yet. I wonder if we've already got a new one spinning up. I've got too many tabs. We just use a new window. Yeah. There we go. So we got one initializing. We got three replicas now. So we've lost our leader. We've lost the master. Yeah. The primary. So when that new pod comes online, will it just take over? It will come back as primary. Okay. And what if it's away for an hour? You know? An hour? Uh-huh. Yeah. What what if I what if it goes away for three days? What if I run it on

1:03:09 bare metal and the hard drive dies and I shut down the machine there? So what we haven't shown in this example is that you can integrate Vitess Orchestrator, which will actually do a failover. Don't know if that's stuck. It looks okay here. Well, it'll just take a bit of time before it shows up on the... oh, no. There we go. Back. Yeah. And so what happens with write requests that come in during that time when there's no leader available? So VTGate, I'm not sure if we have set this up in this example, but VTGate can buffer

1:03:54 the request. And let's say, on the client side, you have some sort of a timeout, fifteen seconds, say, right? VTGate will buffer the request for a specified duration. By default, it's, like, twenty seconds, but you can set it up to thirty seconds. And VTGate will buffer the request. If the client times out, the client can retry. If the client doesn't time out, once the primary is back up, VTGate will send the requests again, and they will succeed. Okay. There will just be a delay. Something that you expect to take one second might come back after twenty seconds or twenty

1:04:38 five seconds. Okay. Is there anything else that's interesting about the way that Vitess... I think Google was thinking I was talking to it. Is there anything interesting that we could show from, you know, now that we're in this kind of MySQL prompt thing, are there any other Vitess specialized objects that we can show, describe the schemas, anything that would be useful to the people viewing at home? I think we can also do show vitess_keyspaces. And would that be describable? I can't even remember if this is real MySQL syntax. I'm kind of just making that

1:04:41 Vitess-Specific SQL & Features

1:05:26 up. No. No. It's... oh, nope. Not yet. No. You can say use commerce and say show tables. Yeah. In fact, you can do show tables even without... Yeah. Doing use commerce. Yeah. One of the... But we didn't actually select the default database. That's why when we connected, it didn't actually have... yeah. So these are the tables that the initial setup created in the commerce database. Right. WordPress itself... Yeah. That's WordPress metadata. Yeah. Yeah. Okay. So one of the things Vitess can do is you can actually have multiple keyspaces and have a VTGate that

1:06:08 Discussion: Multi-Keyspace / Logical Databases

1:06:12 spans those and have the illusion of a single database as long as the table names don't overlap. So sometimes what people will do is instead of sharding let's say instead of having many large tables or a few large tables, they have many, many, many small tables, and the overall database size is big. Then you can split those into multiple MySQL databases, give each of them a different name, and at the Vitess level, you'll get the illusion that they're all still in a single database as long as the table names don't collide. Ah, okay. That's actually super cool. I mean,

1:06:55 there are, like, few blog posts about about this, like, million tables in one database. Like, million database. The million tables. Like, the the customer create a table per customer. So every customer gets one table. Table per user. Table per user. Table I've I've seen that also. Table for user is also another practice that just to skip try to scale out. You know? I think But, eventually, they they have a problem managing that many objects. I think that's a really interesting use case. You know, I I'm I'm in Europe where GDPR is a really big thing, and companies

1:07:32 are looking for ways to segment and isolate certain countries' data or certain, you know, studies data or even, you know, maybe per user to a certain degree. They really wanna just be able to make it easy to wipe that away after the GDPR timeframes, etcetera. So I could see a lot of really interesting use cases for adopting that VTGate-across-multiple-keyspaces thing. That would be very interesting to me definitely. That's even lower level than the zone and the cells. So you actually have the zone. Let's say in the GDPR case, you can actually zone out

1:08:07 all the countries. Right? And then you can actually set set them on a cell per data center, like, availability zone, and then drill down even further by region, by, you know, state, county, or or that. And then you can all do that within this design. So the extremely interesting use case. Yeah. So I just wanna make sure I understood that correctly. Like, I'm designing an application where users can register and I want to store all of the, you know, all of the people that are in The US and a US zone, I wanna store all the European people, maybe I'm

1:08:42 like London zone and then I wanna be able to break them down by user type and say I break them down into a keyspace each. So you're saying that I could run a VTGate in each geographic location which would handle the requests for that specific continent? Yep. Yes. Okay. Cool. So Slack actually does this. They have the data locality. When you create a Slack workspace you can choose which region you want your data to reside in, and they use Vitess, and they are doing it. And are these VTGates aware of

1:09:16 one another if I make a request from a let's say I'm traveling, right, and all my data is hosted in the Scottish data center, assuming we have any. And I'm in, you know, Amsterdam or California for a business trip, and I log in to the VT gate there, does that know to move me over there, or would it fail? Like, what would happen? That's a good question. You can set up the VT gates to be able to access data in other zones, but there'll be a latency. Yeah. Okay. Let's I I don't know how long this will

1:09:50 Chatting about Sharding, Backup/Restore, & Misc.

1:09:50 take, but I'll suggest one more thing. If we think it's gonna take too long, let's just not do it. We can go back to the documentation and maybe actually stick to what we were gonna do for a while. But when we created this keyspace, it was unsharded. Right? So if I run that command again, the show keyspaces. No. Show tablets. Yeah. This dash here means unsharded. Is that right? Right. Can we shard this? Yes. It's documented. We can look at the documentation to see how we should do it. Okay. That's fine.
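
The dash in that shard column is a key range with no bound on either side, which is how the single shard of an unsharded keyspace is named here; after a split, shards carry hexadecimal ranges such as -80 and 80-. A minimal Python sketch of how such names can be read, assuming keyspace IDs are compared as raw bytes with an inclusive start and an exclusive end. The helper names parse_shard and shard_contains are made up for illustration and are not part of any Vitess API.

```python
from typing import Tuple


def parse_shard(name: str) -> Tuple[bytes, bytes]:
    """Split a shard name such as "-80" or "80-" into (start, end) bounds.

    An empty bound means "unbounded" on that side, so "-" covers everything.
    """
    start_hex, end_hex = name.split("-")
    return bytes.fromhex(start_hex), bytes.fromhex(end_hex)


def shard_contains(name: str, keyspace_id: bytes) -> bool:
    """Return True if keyspace_id falls inside the shard's key range."""
    start, end = parse_shard(name)
    # Start is inclusive; an empty end means there is no upper limit.
    return start <= keyspace_id and (end == b"" or keyspace_id < end)


if __name__ == "__main__":
    for shard in ("-", "-80", "80-"):
        print(shard, shard_contains(shard, bytes.fromhex("7fff")))
```

Under these assumptions, every keyspace ID belongs to the shard named "-", while a two-way split sends each ID to exactly one of -80 or 80-.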

1:10:00 Resharding (Advanced Feature Discussion)

1:10:27 Let's see. Can I just search? We should go back to that guide that we were in, and there's a link from there to go to the next steps. So get started with this operator. Next steps. So is there MoveTables? Let's click on MoveTables. We don't need to do MoveTables because we are not trying to split the existing keyspace into multiple keyspaces. We just want to shard. So we can go all the way to the end, to resharding. Yeah. That's what we want to do. So resharding is in the advanced track.
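
Resharding spreads the rows of one keyspace over several shards according to a column chosen per table. As a rough illustration only, the Python sketch below hashes a numeric ID into a 64-bit value and assigns it to one of N equal ranges. SHA-256 stands in for the hash and is an assumption, not the function Vitess uses; shard_for and distribution are hypothetical names.

```python
import hashlib
from collections import Counter
from typing import Iterable


def shard_for(row_id: int, shard_count: int) -> int:
    """Map a numeric row ID to a shard index in [0, shard_count)."""
    digest = hashlib.sha256(str(row_id).encode("ascii")).digest()
    # Treat the first 8 bytes of the digest as a 64-bit hashed ID.
    hashed_id = int.from_bytes(digest[:8], "big")
    # Shard i owns the i-th equal slice of the 64-bit space.
    return (hashed_id * shard_count) >> 64


def distribution(row_ids: Iterable[int], shard_count: int) -> Counter:
    """Count how many of the given IDs land on each shard."""
    return Counter(shard_for(row_id, shard_count) for row_id in row_ids)


if __name__ == "__main__":
    for shard, rows in sorted(distribution(range(10_000), 4).items()):
        print(f"shard {shard}: {rows} rows")
```

With a few thousand sequential IDs the counts per shard come out close to even, which is the property you want so that no shard ends up overloaded while another sits nearly empty.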

1:11:15 Okay. So the part that takes time is to define those vindexes on all of the tables. And with WordPress, there are quite a few tables. So for each of those tables, you ideally need to inspect the schema, pick a column, pick a vindex type. Hash is the most common, but there are multiple vindex types. So it'd probably be very beneficial if I even knew what the WordPress schema looked like. It's kinda... Yeah. So in the WordPress example I looked at earlier, you have a blog site. Right? So you would keep one unsharded

1:12:03 keyspace for your metadata. But for the blog, you would actually have a sharded keyspace by the, you know, blog ID or comments, users, that we would need to scale. You don't need to scale your metadata because it only, like, keeps track of users, how many they liked, how many they commented, and stuff like that. But what you need to scale in that example is the number of comments from the users. Right? You actually have to split them up so that they don't actually stack into one area. So that's why

1:12:41 there is some assembly required before you decide to shard, on what type of sharding you want to do on those. And as Deepthi commented, hash is very common because that way you can say, okay, I want n number of shards and shard them by hash, which is kind of an easy way of doing it. And it usually gives you a pretty good distribution if you're hashing a numeric field, because that's important when you're sharding. You don't want some shards to be overloaded and some shards to be very sparse. But Alkin makes a great point. I

1:13:19 skipped over MoveTables thinking that, okay, if you want to reshard, then you'll reshard all tables. But no, not necessarily. Some tables are always gonna be small, so you will move them out into a keyspace that you don't intend to shard, and then you'll take your main keyspace and shard that. Okay. That makes sense. Well, I'm not confident in my ability to work through the WordPress schema and try and identify how we should shard that. So if we were to take a look at one more thing, is there anything that

1:13:48 either of you would like me to click on here and lead me towards? I think the one thing I want to talk about is hot row protection. Okay. That's good. Vitess does query consolidation. So let's say that many clients are looking for the same thing. I don't know. Top voted comment or something. Everybody wants to click on the top post. Right? Vitess can detect that there are many queries which are identical, including the bind variables. And there's no need to send each and every one of those queries to the underlying

1:13:55 Discussion: Query Consolidation & Hot Rows

1:14:31 MySQL. If one of those queries is in flight, then you can hold everything else, wait for the result from the first query, and send that same result back to your thousand or 10,000 callers. So this is one of those features that makes Vitess useful regardless of sharding. Even if you are never gonna shard, you're still going to get some protection from thundering herds and those kinds of problems. Yeah. I think what I've seen in the past when people are trying to do something similar to this is that they're usually bringing on another

1:15:06 database tool like Redis, and they're trying to, like, analyze the query and remove any of the variants and then cache them in there, and it's very complicated. But is that something that I get by adopting Vitess? Is it something I have to enable? Do I have to tweak it? Like, is that running already in our setup just now? Query consolidation is on by default. You would have to turn it off if you didn't want it for some reason. Okay. What I would like to highlight is, we talked about backup recovery. If you were, like, an

1:15:32 Discussion: Backup, Recovery & Online DDL

1:15:42 old school DBA, in order to build a replica if your node died, as a DBA you would actually have a very hard time trying to figure out the coordinates of your replica, your backup. You would actually have to restore it and find them, set those, you know, CHANGE MASTER TO commands and all that stuff. Right? You need to pinpoint. And if you actually made a mistake, you know, fat fingered it, you would maybe screw up the entire cluster, roll forward to the wrong position, things like that. The other thing that is in Vitess right now that

1:16:16 we are very proud of is the integration of online DDL, which is also another operational, not headache. Nightmare. Nightmare. As an operational DBA myself for the last, you know, 25, it's a pain. Right? And all you want is to make a schema change, which is very common. And with the online DDL options, you can utilize one of those utilities that are very common without having to impose a metadata lock and actually stop the application while doing that. So you can actually test it out. It's built

1:16:59 in right now. It's experimental, and we are actually enhancing it as the maintainers. When we say we, there's a large group of engineers working on it. And you can set the DDL strategy to gh-ost or pt-osc, and then do ALTER TABLE on one of those tables, and see how that goes. And that would not actually, you know, cause any lock on the database while the WordPress application is running, and it would actually apply those changes and it will be managed. And you'll know the progress of it and things

1:17:34 like that. So it's a new development in the Vitess world. Nice. Lots of really interesting and cool features there. And there was something that popped into my head just as you both were kind of chatting away there. If I were just running MySQL or MariaDB, you know, I've fat fingered enough databases in my career. I've deleted a lot of tables by mistake, or even whole databases, and, you know, you come to rely on the binary log and point in time backups. Is that something that I still have

1:18:05 access to when I go through Vitess? You absolutely have. You can connect to the back end databases. You can muck around, download the binlogs, troubleshoot, and audit your database. The database, like I said, remains a database. It's the back end database. Still the good old MySQL. You can access the ports and all the binaries and the logs, and the binary logs are still there. We just don't do that anymore. We let Vitess handle all that stuff. Alright. Well... Vitess itself, sorry, just one more thing. We actually have a

1:18:48 point-in-time recovery feature as well, which is documented on the website. Okay. Let's pull up the website again. Let's take a look at that. It's probably an advanced feature somewhere. Oh, we should just search for it. Yeah. Maybe. I wonder if I can just... yeah. Point-in-time recovery. I like it as well. Even the docs have decent search functionality, and I don't have to look too hard. So it's just taking advantage of the fact that, you know, all of the requests are going through the VTGates and then

1:19:28 you're doing, like, a multi rate to some metadata store to understand what happened on each individual DB? It relies on the bin logs. Alright. Okay. So it does require a bin log server so that we can replay the bin logs. We have periodic backups, so we will find the backup that is closest to the desired time and then replay the bin logs starting from that position until you get to the time you want. Is it a bad thing, you know, if I'm running Vitess and I skip over it and go directly to my

1:20:06 MySQL DB or my MariaDB, does that cause problems? Like, should everything go through Vitess? Or... Queries don't have to go through Vitess. You can write or update data on the back end. That'll be fine. But configuration changes may conflict with how Vitess wants to run things. So it's not gonna be a good idea to do config changes on the back end. Yeah. That makes sense. You know, I think I'm already sold anyway. I think the amount of features and operational help that Vitess is bringing to people that are working with these

1:20:42 two databases, it just seems like a no-brainer. You probably do want to adopt this technology and use it. And you've already mentioned a few very large company names, YouTube and there was one other just a few moments ago, the fact that they're using... Slack. Slack. Exactly. Right. You know, they're using Vitess technology as well. I think that should give people a lot of confidence and hopefully gives them that little pique of interest to go and check it out for themselves. Is there anything else either of you would like to finish with before I let you

1:21:12 go and enjoy the rest of your day? Just a quick note on the Vitess project. We have a Slack channel, Vitess. If there are questions on Vitess, that is the most active place to ask. Even if it's a newbie question, the general channel is great for that. The code is open source. GitHub is a great resource for research and development. And as you've already gone through the docs, we already have those. And if there are any questions for either of us or the rest of

1:21:20 Community Resources

1:21:56 the maintainers, again, the Slack channel or Twitter is a good communication channel. Alright. Perfect. We are both on Twitter, and there's a Vitess.io Twitter account as well. Yeah. Well, all of the Twitter links are in the show notes. I will make sure that I drop a link to your Slack channel in there as well. So people, feel free to jump into the description, join, and have fun with Vitess. And I'm sure that you'd both be happy to help if anyone does have any challenges or problems. Alright. Well, I'll let you both go. Thank

1:22:27 Conclusion

1:22:29 you for joining me today. Have a great... Thank you very much. Meet you both soon. Mhmm. Thank you for having us, David. Thank you. Bye. Yep. Thank you.
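
Editor's sketch: the point-in-time recovery flow described around 1:19:28 (pick a periodic backup near the desired time, then replay the bin logs from that backup's position up to the target time) can be illustrated in a few lines of Python. This is a toy model, not Vitess code. The `Backup` and `BinlogEvent` types, their fields, and `plan_recovery` are hypothetical names invented for illustration, and the sketch assumes "closest" means the latest backup taken at or before the target, since bin logs can only be replayed forward.

```python
from bisect import bisect_right
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Backup:
    """Hypothetical record of a periodic backup."""
    taken_at: datetime
    binlog_position: int  # assumed: bin log position captured with the backup


@dataclass(frozen=True)
class BinlogEvent:
    """Hypothetical record of one replayable bin log event."""
    position: int
    committed_at: datetime
    statement: str


def plan_recovery(backups, events, target):
    """Return (base_backup, events_to_replay) for restoring to `target`.

    Assumption: the base is the latest backup taken at or before the
    target time, and replay covers events after that backup's position
    that were committed at or before the target.
    """
    ordered = sorted(backups, key=lambda b: b.taken_at)
    # Index of the first backup taken strictly after the target time.
    idx = bisect_right([b.taken_at for b in ordered], target)
    if idx == 0:
        raise ValueError("no backup at or before the target time")
    base = ordered[idx - 1]
    replay = [
        e
        for e in sorted(events, key=lambda e: e.position)
        if e.position > base.binlog_position and e.committed_at <= target
    ]
    return base, replay
```

Calling `plan_recovery(backups, events, target)` returns the backup to restore and the ordered events to apply on top of it. A real recovery would read positions from the backup metadata and stream events from a bin log server rather than from in-memory lists.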
