Overview

About this video

What You'll Learn

  1. Load-test a Laravel app with Siege to expose CPU and latency pressure.
  2. Use Linkerd, Grafana, and Prometheus to inspect request-level service metrics.
  3. Configure custom Prometheus metrics for HPA scaling beyond raw CPU.

Continuing the Laravel-on-Kubernetes series, David is joined by Alex Bowers and Ciaran McNulty to load-test with Siege, wire up a CPU-based HPA via metrics-server, then layer Linkerd sidecars, Grafana dashboards, and a Prometheus ServiceMonitor for custom application metrics.

Chapters

  1. 0:00 Holding screen
  2. 1:00 Introductions
  3. 1:16 Introduction
  4. 3:50 What did we do last time?
  5. 6:30 Building and Deploying Initial Application
  6. 12:40 Discussion on Auto Scaling Goals
  7. 14:18 Preparing for Load Testing & Installing Siege
  8. 16:30 Installing siege
  9. 18:30 Initial Load Test & Observing Performance
  10. 19:50 Creating the Horizontal Pod AutoScaler (HPA)
  11. 19:55 Introducing Kubernetes Horizontal Pod Autoscaler (HPA)
  12. 21:11 Creating CPU-Based HPA
  13. 22:30 Attempting CPU Scaling & Debugging Metrics
  14. 24:40 Deploying Metric Server
  15. 24:50 Deploying & Debugging Metrics Server
  16. 30:10 Fixing Metrics Server Issue
  17. 31:17 Successful CPU-Based Auto Scaling Demo
  18. 32:00 Triggering an AutoScale Event with siege
  19. 38:17 Discussing Limitations of Resource Scaling
  20. 41:30 Introducing Service Mesh (Linkerd) for Metrics
  21. 43:30 Installing & Setting up Linkerd
  22. 46:00 Adding Linkerd Sidecar for Request Metric Collection
  23. 50:00 Exploring Linkerd UI and Grafana Metrics
  24. 55:30 Exploring Raw Prometheus Metrics
  25. 1:01:30 Transition to Custom Application Metrics
  26. 1:02:00 Attempting to add Prometheus Middleware to Laravel
  27. 1:03:46 Attempting Laravel Prometheus Package Integration
  28. 1:27:00 Dependency & Compatibility Issues with Packages
  29. 1:35:35 Adding Manual Prometheus Metrics Endpoint
  30. 1:39:55 Building and Deploying with Manual Endpoint
  31. 1:41:05 Confirming Manual Metrics Endpoint Works
  32. 1:41:30 Attempting Prometheus Scraping via ServiceMonitor
  33. 1:45:25 Debugging ServiceMonitor Configuration
  34. 1:49:58 Conclusion & Wrap-up
Transcript

Full transcript

Generated from the English captions. Timestamps jump the player to that moment.

1:16 Introduction

1:16 Hello and welcome to today's episode of Rawkode Live. I'm your host, Rawkode. Today we are continuing our series on running Laravel on Kubernetes, trying to make sure that we tackle all the different edge cases and real scenarios that people need to solve to be able to scale properly. Today specifically, we're gonna be taking a look at how we can get metrics out of our Laravel application and use them for some auto scaling. Plus, one of my guests, Alex, has got some random ideas to do that with queue workers too. So it's gonna be a little bit

1:46 of fun and a little bit of, well, what the hell are we doing, but we'll work it out. So let's introduce my guests for today. I am joined by Alex Bowers and Ciaran McNulty. Hello. Hi. Thank you for joining me again today. As always, we'll just quickly do introductions, and feel free to use as little or as much time as you want, starting with Alex. Yeah. I'm just a Laravel developer. I've known David for a couple of years now, but we see each other once a year and I keep asking about Kubernetes

2:21 and eventually we might start doing some of these videos and here we are. Ciaran? You hear me? Yeah. Yeah. Yeah. Good. We're all good. Hi. I'm Ciaran. I'm a consultant. My background's in development, but I do a lot of coaching around agile and testing. And the actual deployment bit is nice to be involved in because it's a little bit outside my comfort zone. Yeah. I think today... What else? Yeah, today is definitely an episode of fun discovery as we try and tackle the scaling aspects of running, I guess, any application on Kubernetes. This is obviously specific to PHP

3:03 and Laravel today, but the same concepts that we're gonna cover will work across everything. And already we can see the first message in the Discord channel. Discord have a bug in their stream overlay which means all the names of people are "object Object". So that's awesome. Was that you, Alex, or was that someone else? Somebody else, somebody called John, I believe. I'll keep the chat open. Hey. It would be good if I could see your name, but I'll reach out to Discord and see if we can get that fixed. Regardless, we're testing out the Discord live chat thingy on

3:39 our stream. We'll see how we get on, I may move it at some point, and of course if you're watching on YouTube, please feel free to drop any questions that you have into the comment section and we will tackle them as we go. Now, why don't we try and recap where we got to last time, and maybe it's best to do that through the power of code, because I think we got quite far, didn't we, Alex? Yeah, a good amount was done. At the end of it, we had migrations

3:50 What did we do last time?

4:12 working. We had queues set up and running through. Those were the two main things. There were a couple of other things that we managed to cover as well, but I can't remember what they are now. But, yeah, we made some good progress on a production-level application last time. Well, let's see what's still working, because who knows? Let's see, where did I put everything? All of the features that we built last time are in the routes folder. So we can look at that to see what we covered.

4:49 And so where did we put the manifests? Oh, it was inside of resources/ops. Oh, yeah. Hey, I'm looking for my ops directory, but there we go. So we've got our Dockerfiles. We've got a ConfigMap. We have a deployment. We actually have our cron job, which was to run... What was this for? So Laravel has a job scheduler built in. Every minute the scheduler gets called, and from there you can dispatch different jobs in PHP for whatever you need to do. Alright. So you can

5:24 see the example for that in app/Console/Kernel, I think it is. Does Laravel have a way of, instead, scheduling the jobs directly from cron? Yes. So every single command that you create in Laravel, you can just hit using, like, php artisan and the command name. Right. Right. And you could do that directly from the cron. But if you want to, for example, add conditional logic, so you do a database query to see... Yeah. Do you need to care about running this? Yeah. You do that in the console kernel.

6:06 Yeah. Yeah. And it also has some nice syntax where you can use words like "every evening" rather than trying to figure out what random stars and commas and numbers mean. Yeah. Okay. Cool. I think I'm all caught up, which is useful. I also see that I have a Makefile which has two commands for building these containers. So why don't we get them built? Let's apply our Kubernetes manifest directory and see if anything blows up. Hopefully we just get our application working and then we can start getting some metrics

6:30 Building and Deploying Initial Application

6:37 out of it. It was left in a working state, so I'd be upset if it breaks in the two weeks... Yeah. Build NGINX and build FPM. A smarter person would have done this prior, but still. I'm sure it won't take too long though. Although, whenever I see an npm command, I do get slightly worried. It's npm ci, so it shouldn't be doing as much as a normal install would, because it uses the lock file to get specific versions. So you shouldn't have to do any dependency resolution, I don't think. Alright. Well, we'll get these... Well, I'm saying

7:21 that, and we are thirty seconds in, so, you know, take what you want from that, I guess. Well, my Mac has an airplane mode where it's trying to take off. So we'll see. I can hear the fans over the headphones, which is never a good sign. That's the sign that you're using Node, isn't it? Well, I wondered, with your shiny M1 Mac, Ciaran, do your fans ever go? It is much better. I've had it... oh god, when did I buy it? I've had it a couple of months and I've heard the fans once. Wow. They sell one without fans now as

7:58 well, so. Yeah. The Air is... That's brave when you're doing any Node stuff. That's brave. Alright. Here we go. It looks like it's now finishing at least one of our images. I left some CAD software running overnight, and that's when the fans ran. That's the only time. Hopefully, they bring out another one. I'll buy the second version, but I'm not buying the first. Yeah. They'll bring out an amazing one at the end of the year and I'll be jealous. Yeah. Well, I'm getting a new one. Oh god. Next month, in March. Sometime in March or

8:46 April. Alright. So without doubt, it'll be about May or June when the next one comes out, just as mine's been delivered. So, yeah. We'll see. If you need to use it to do Docker stuff day to day, it's still not quite there. We don't use Docker in the company at the moment. We're moving towards that. So hopefully, by the time we get that new environment set up and stuff, it's a bit more stable. Ah, compiling extensions. I may go get a coffee. Do you clear out your Docker cache quite often or something, David? Yeah. Every thirteen seconds, probably.

9:31 I'm always... well, because I always go into my Docker settings and just click reset everything and just kill it. I do that far too often. I do it on my phone too. Like when I was on Android I would hit clear all when I was in the task manager. Now I'm on iOS there's no clear all button, but I just keep swiping apps away anyway, and people keep telling me that's actually bad, but it's just, I guess, habit more than anything. Yeah. It's meant to use more power to start the app back up again, isn't it? I know, it's just something I've always done.

10:00 Not using it, close it. Not using it, close it. And I do it with Docker too. I'm always killing containers and images. I think I'm the opposite. I usually at some point notice my machine's running really slowly and has no disk. And do Docker years. I'm rocking about 20 gig of disk space left. I'm slowly just, like, working my way through various things, clearing stuff out. I don't even think I use 20 gig. It's painful. I can't have any icons on my desktop either. I can't have any downloads in the downloads folder. I just clean everything up really

10:40 quickly. It's probably better that way than the alternative, which is that my downloads folder is just everything I've ever downloaded, useful or not. Alright. There we go. And I'm sure everyone's really disappointed because they were loving that conversation where I was talking about files on my desktop. But at least I've got my command history, so that's useful. So we're gonna do a kubectl apply on everything. I'm assuming we built it in a way that that should be alright. Looks good. Wow, this is gonna work. Alright. And the containers are... It's not like

11:28 you had no thing. Yeah. That... and it's gonna fix itself. I think that's just MySQL and Redis not being happy yet. I'm hoping. Is this local Kubernetes? It is local Kubernetes. Yeah. Alright. Let's take a look at one of these. Well, the last one says initializing. Let's try it. I'll look one more time. There we go. Perfect. Nice. Okay. So now we've got, I'm assuming, a service. Let's port forward svc... Oh, we didn't create a Laravel service. Okay. So we'll just port forward to a pod. 8080 to port 80. We should just have our application.

12:22 Maybe I'll hide that. Yep. That was exactly what I built. Yep. Alright. "This should be red." Not gonna make us a million dollars by any means, but functional for today. So what we wanna do now is... well, you know, when we talk about auto scaling, normally we talk about scaling up, but I wanna equally emphasize that we wanna be able to scale down when required too. So right now we are running five replicas of our Laravel application and I'm the only user. So maybe this is slightly overkill. So what we wanna do

12:40 Discussion on Auto Scaling Goals

12:59 is scale this up and down accordingly. That make sense? Mhmm. Mhmm. Okay. How do we do it? Okay. Yeah. Good first question, Alex. Okay. So why don't we start with a naive approach? Right. No. No. Let's go back to that question. Like, what do we think we want from auto scaling? When do we scale up? What's the best way to scale up? When there aren't enough resources at the current scale level for what we're trying to do. Yeah. I like the usage of resources there. Definitely something that we can use

13:38 to determine whether we scale up. Yes. Or when performance isn't good enough at the current scale level? Yes. So, you know, we're getting into the conversation of how the application is performing for our end users. So when I say naive auto scaling, resources is normally the first approach. People say, well, you know, if I'm running out of CPU or memory then we probably want to be able to scale accordingly, because we think the performance will be degraded in some way. But I like to work with end users.

14:10 So we're gonna try and tackle both, I think. Maybe. Okay, let's scale it down. Let's see what happens. Does anybody know any HTTP load testing tools? I probably should have asked that. I use one called Siege. Siege. Is that the Apache one? Maybe. No, that's just Apache Bench. The one I use is... let me find it. Otherwise, I'm just gonna go to Google and search for Rust. This is what I use, it's by something called JoeDog, Siege. Okay. I'll post it in the Discord. Why is my alt-tab not working to my

14:18 Preparing for Load Testing & Installing Siege

14:54 terminal? It goes to VS Code? I'll restart that in a minute. Alright. Let's reapply this over the top and that should scale down. Good. So now we're just gonna run one. I guess I'm not even sure, this is where my Laravel knowledge will probably let me down with the first kind of scaling scenario, but, you know, if I throw Siege at our application, which is running on a single pod inside of a local Kubernetes cluster, and hit it with 10,000 requests, am I likely to see CPU and memory usage increase? Probably not massively. It depends on how many

15:32 concurrent requests, I guess, and what the vCPU allocation is. But obviously, our home page there is literally just returning the string hello world. Maybe we change the code to make it, I don't know, calculate an MD5 ten thousand times. Just something so it actually does some work. It might be worth doing. Yeah. Do we want it to create database load as well? Maybe. I don't believe there's... there's no models or anything set up on this. Oh, actually, no. There was. There were models. Yay. Yeah. Because we created some blog

16:03 posts, didn't we? Yeah. Yeah. We can look at the database load. If you open up the... I can't. The blog post forward. I can't. If you open up the routes file, the routes/web.php file, that's got all of the endpoints that I created in there. Yeah. There we go. We do have stuff in the database. Alright. You sent me this JoeDog/siege. Is there a brew package for it? Yeah. Believe so. Yeah. Okay. Cool. Let's see if we can run that and let's just model... You know, we're just gonna, on a very simple basis,

16:30 Installing siege

16:44 look at the CPU and memory utilization and then we'll see if we can cause a very naive scaling situation. Again, more stuff I should have done up front, but yeah. If you have it loading stuff from the database, PHP will perform badly because it will block waiting for the responses, which ties up processes. So that will kill the sort of concurrency. It's alright. It's better than it used to be. Yeah. Can you remember what we put this on? Was it PHP 7.4 or 8.0? I don't know. Actually, I can check the Dockerfile, can't

17:29 I? Yeah. Yeah. resources/ops, Docker, FPM, 7.4. Yeah. Quite a lot of improvements were made in 8.0. But hopefully 7.4 means we see better examples for scaling, because it'll be worse performance, I guess. Saw it the other day: one of the features that's landing in 8.1 has a 20% real-world speed improvement for some of them. I think that's sort of linked to caching of, like, inheritance of files and classes or something, I think. But I don't think it's like your page will be 20% better. I think it's that what was a very small portion

18:14 of your, like, load time is 20% better than it was on that small section. Yeah. So real world... Pages are the one we're at. Pages are the one we're looking at that's just doing CPU. Yeah. Yeah. Alright. Our application... oh, we got a comment from a guy who mentioned hey for load testing. I've actually heard people mention that before. I'll check it out another time, but for now I'll run Siege just because I've got it installed. We're gonna hit localhost on 8080. And if I just do this, I'm assuming it's gonna magically just start firing requests at

18:30 Initial Load Test & Observing Performance

18:46 it. I mean, is that it working? I think so. If you cancel it, it tells you then how many requests it did, I think. I'm not really sure. I normally pass through -c and -t. Well, our availability... -c is how many concurrent users. Well, it looks like we potentially got failed requests even with just the basic settings there. So... Nice. Yeah. I think it was working. Concurrency 13. And our availability dropped to 44% that time. Wow. So it's actually... that's okay. Our response time is eleven seconds. Is that... that can't be... No. That might be

19:32 milliseconds. Can't be eleven seconds, surely. I would hope not. Right? It's doing nothing. Because that homepage is just the page which says, like, "this should be red" or whatever. Okay. Well, yeah. Okay. Let's try and scale this then. So we can pull up the documentation for... so Kubernetes has a primitive called an HPA. This is the Horizontal Pod Autoscaler, which out of the box doesn't do an awful lot, but we can do CPU-based scaling as, you know, the kubelet that's running containers does have some form of metrics coming out of it

19:55 Introducing Kubernetes Horizontal Pod Autoscaler (HPA)

20:24 and it knows what's being used. Hopefully we've got an example of YAML here that we can kind of steal. Computer said no. There we go. Example. Yes since last hit that. Computer said no. Yeah. Well, they actually use PHP with a square root function to try and trigger CPU-intensive computation. Well. There's our deployment, there's our service, let's get down to the good stuff. Where is it? They're using the autoscale command to create one. Alright. Let's go with that. Deployment, Laravel example. Why did we give it such a verbose name? Or didn't we? I don't remember.
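As an aside on how the Horizontal Pod Autoscaler picks a replica count: the Kubernetes documentation describes a proportional rule, desired = ceil(current × currentMetric / targetMetric), skipped when the ratio is within a tolerance of 1.0 and clamped to the min and max replica bounds. The sketch below restates that rule in Python. The function name is hypothetical, the bounds (1 and 10) and the 50% target mirror the autoscaler created in this episode, and the 0.1 tolerance is the documented default. It ignores refinements such as stabilization windows and pods with missing metrics.

```python
import math


def desired_replicas(current_replicas, current_utilization, target_utilization,
                     min_replicas=1, max_replicas=10, tolerance=0.1):
    """Approximate the HPA proportional scaling rule (illustrative only)."""
    ratio = current_utilization / target_utilization
    # Close enough to the target: leave the replica count alone.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    desired = math.ceil(current_replicas * ratio)
    # Clamp to the configured replica bounds.
    return max(min_replicas, min(max_replicas, desired))


if __name__ == "__main__":
    print(desired_replicas(1, 120, 50))  # one pod at 120% of its CPU request
    print(desired_replicas(4, 5, 50))    # load gone, four pods nearly idle
```

Under these assumptions, one pod running at 120% of its request asks for three replicas, and four nearly idle pods shrink back to the minimum of one.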

21:11 Creating CPU-Based HPA

21:25 Yeah. Project. Project. Thank you. And you can see now we have a Horizontal Pod Autoscaler. So what we can do is get our HPA, which has got the same name. Take a look at it in YAML. And this is our spec. So it has max replicas of 10, min replicas of one, of course we always want one of these, and the naive implementation is that as the CPU utilization goes above the 50% target, we're gonna want more of these. So in theory we have one pod now. I wonder if we just run Siege again and watch that

22:20 and we'll see an auto scaling event kick off. Are we feeling confident with that? No, me neither. Okay. Wrong split. Let's go this way. I'll just run the pod watch here and I'll run Siege again here. Anyone got any jokes just now? I have, but not appropriate for stream, so. It's always the worst question to ask somebody. You got a joke? Too much pressure. I've got a question about the config. Do each of these pods, the Laravel pods, have an NGINX and an FPM in them? No. It's just PHP-FPM in there. And then the NGINX one

22:30 Attempting CPU Scaling & Debugging Metrics

23:13 loads it across from the FPM image. Look. It's scaled. That's nice. Okay. So I think I know where Ciaran's question was going. So I'm gonna try and magically guess what you were thinking there as if we do it. Maybe you can't see it from the screen. Alright. Okay. Do it the long way. Oh, no. That didn't scale, did it? It said it was creating a new container though. Oh, that's our job. Our cron job. That's the cron job one. Yeah. Okay. Alright. So let's work out why it doesn't scale yet. So Ciaran, I think your question was

24:01 based on the HPA configuration that we've seen up here, which containers is it using to determine if the CPU went too high. Is that correct? Yeah. Yeah. Cool. It's a good question. So let's see if we can get some metrics from this, and this may not even work, but let's see if we do top node and top pod. Yeah. No metrics API. Okay. So that's probably why our autoscaler isn't kicking in either: we need to actually provide a way to get metrics out of this. Kubernetes hasn't shipped with a default metrics implementation

24:33 since, I'll throw one out there, 1.9, something like that. Maybe it was later than that, but everything's been extracted. So let's get the metrics server running on our cluster first. Please, if I don't explain anything, just prod me. So the metrics API, that's what the horizontal autoscaler is looking for? I mean, I actually don't remember if it uses the metrics server or not. I believe it probably will. It may do something cruder when it's not available, but I'm gonna just... I was just wondering why that didn't, like, error when you tried to spin it up if it didn't

24:50 Deploying & Debugging Metrics Server

25:13 have the server available. Well, also, we don't even have any resource constraints on our pods, do we? So 50% utilization of what, the full system? Yeah, that could take a while to get to, yeah. So there's loads of things here where I'm obviously just winging it. So let's go back to our deployment. So we don't have any resource constraints, do we? We do. Okay. Yeah, and they're not particularly great, right? So 50 meg of memory and half a core, which I'm assuming we're probably hammering pretty well. So I'm gonna assume it's just metrics server,

25:50 which we've already deployed to our cluster now. That should mean it's just not ready yet, so we'll give it a little bit of time. Let's see. Metrics server. So the metrics server is gonna run in the cluster. It's gonna start scraping for stuff and this should be exposed soon. He says, hopefully. Hopefully. The service isn't available yet, so if we actually do a kube-system describe service, we should see the endpoints be added to it once it gets past the liveness probes that consider it healthy, which could be anywhere from ten seconds to a minute probably, so

26:34 metrics... my service is called, yeah, metrics server. Alright. So we need to wait for an IP address to show up here. Lots of waiting in Kubernetes, right? At least it's all stuff it sorts itself out with. Yeah. I'm far too impatient and just keep running commands against my cluster. One of the things I could do is check out the probes on that metrics server pod. Yeah. Unhealthy at the moment. How are the probes configured? Ten second intervals, waiting for one success. Really should be healthy by now. I wonder if this is where I find out that the metrics server doesn't work with

27:35 Docker for Mac. Let's try our service. Yeah. Where's my endpoints? Come on. Let's get the logs from it. I don't really wanna debug this too much, but you can see we've got four restarts on this. So I may switch to minikube if that's gonna save us some pain. But let's see what we're dealing with. That's fun. So it's failing because the certificate coming from the API server doesn't contain the IP address. Alright. Let's switch. 1.10. Let's not do that. Alright. We're spinning up minikube. That means I will have to rebuild the images, unfortunately.

29:00 But that should get us metrics server running. How come the images have to be rebuilt? Aren't they stored on your local machine? They are, but minikube runs inside of a small virtual machine which won't have access to my host. Okay. I mean, I can... There's some way of packaging them up for export, isn't there? I could save them to a tarball and import them into minikube. Would that save me time? I'm not really sure. I was kinda hoping Docker for Mac would just work, but I don't really wanna debug why a

29:39 metrics server can't speak to it. I guess while we wait for that we can just quickly search "Docker for Mac metrics server". Someone's already written about this. So they're just adding the insecure flag. Oh, wait. "Do not enable this." That's just when it'll be accessed externally. Alright. Let's do it. Let's modify the metrics server to accept that flag. I'm pretty sure minikube just changed my context. Yep. Docker Desktop. Get deploy, edit deploy metrics server, args, next. Done. You feeling confident? So the next time you do a kubectl apply,

30:10 Fixing Metrics Server Issue

30:53 would that not override that config that's there? Probably, I'm just not going to apply the metrics server again. Right. Okay. But if you were to apply the metrics server again, it would, like, nuke whatever you changed. Oh, yeah. Definitely. It would totally do that. But I do think we have an endpoint now. So I'm glad we googled that instead of rebuilding on minikube. Thank you code of Dan. Okay. So which means we have a metrics server working, which means we do have access to top pods, and now we can actually see how much CPU and memory is being consumed by our

31:17 Successful CPU-Based Auto Scaling Demo

31:27 Laravel application. So we're actually only using one millicore and 18 meg of memory. So... oh, Siege isn't running anymore. And my port forward's gone. Too many moving parts. Alright. Port forward to Laravel. Laravel example. 8080. Alright. That stays there. Run Siege. Run top. So we should just hopefully see this climb as requests come in. So there's gonna be a little bit of latency in the metrics server, probably around... I can't remember what the default scrape interval is gonna be, but let's assume thirty seconds. Hopefully we see the CPU.

32:00 Triggering an AutoScale Event with siege

32:20 I mean, it doesn't seem to have got to 250m previously or we would have seen an auto scaling event. So we may just tweak the resources on it and force it to scale earlier. Is it worth making it so it hits the one which hits the database as well, perhaps? Yeah. So if you send your requests to /posts instead of just to the home page, maybe. Is there a watch on this? No. That would have been too handy. Can't you just wrap it in a watch -n 1?

32:51 I could. I could. But then what would I type? Oh, look. Yeah. Nice. So that should auto scale now. Yes. Yes. Because it's made it over 50% of the allocated resources. I hope so. Although I'm not seeing another one spin up there. Yeah. So the HPA is gonna wait for a few different data points there. So let's double check. These are limits. Yeah. Okay. So we should see an auto scale. I'm pretty confident we'll see an auto scale. I hope. We'll let it do its thing and then I'll maybe change it to the database end

33:38 points to see if we can push it a little bit quicker. And then we'll actually do something a bit more fun anyway. Like, we wanna start instrumenting this. Oh, come on. Give me another pod. Let's double check our HPA. Why is that not coming up? Are those old? Yeah. I think so. It says unable to get metrics for resource CPU. The age is only three minutes forty one, though. Yeah. I don't think we've been running metrics... Yeah. Four minutes ago. It's quite a while. So I think it is working. We're just not seeing a scale yet.

34:44 Okay. Let's try and force it before we move on to the next thing. Let's put this down to 100, which we know we'll cross pretty quickly. But we are hitting the 500. Alright. I'm gonna blame the HPA. Question, Ciaran, or just sighing at my... Yeah. I was trying to just ask the stupid question. When you set a CPU limit, that doesn't actually limit the amount of CPU that that pod can use. Right? Because we're talking about it going past the limit. So is the limit just metadata?
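On the limit question: as far as the Kubernetes documentation describes it, the HPA's CPU utilization is a percentage of each container's CPU request, not its limit, and when only a limit is set the request defaults to the same value. The limit itself is enforced by throttling. A minimal sketch of that arithmetic, using the 500m figure from this episode. The helper names are hypothetical, and only the plain and millicore ("m") CPU quantity forms are parsed.

```python
def parse_cpu_millicores(quantity):
    """Convert a CPU quantity such as '500m' or '0.5' to millicores."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return round(float(quantity) * 1000)


def effective_request(request=None, limit=None):
    """Request used for utilization; falls back to the limit when unset."""
    if request is not None:
        return parse_cpu_millicores(request)
    if limit is not None:
        return parse_cpu_millicores(limit)
    raise ValueError("no CPU request or limit: utilization is undefined")


def utilization_percent(usage_millicores, request_millicores):
    """CPU usage as a percentage of the request (illustrative only)."""
    return 100 * usage_millicores / request_millicores


if __name__ == "__main__":
    request = effective_request(limit="500m")  # only a limit is set
    print(utilization_percent(300, request))   # usage above half the request
    print(utilization_percent(500, request))   # pod throttled at its limit
```

Under these assumptions a pod capped at 500m can still report up to 100% utilization, so a 50% target is reachable without the pod ever exceeding its limit.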

35:32 No. We've actually got an enforced limit of 500m. The CPU utilization may not be measured against the limit, actually, it may be against the full host, and we are blocking it before it ever gets there. So I guess in theory, why don't we just do 1%? Well, I guess my question is, if it's got a limit of 500 and we're saying auto scale, how would it ever get past 500 to trigger auto scaling? Yeah. Good point. Trigger some person. We're saying never use more than 500 and then scale up if you go over 500. So instead of using limits then, do we

36:08 need to use the requests flag? To be honest, I'm making this up. I've never used resource-based auto scaling because, well, it's not ideal. So let's see if we can get it working before we move on though. So let's change this to just requests, which means we want to make sure we allocate half a core and 50 meg of memory. How many cores have I given to Docker for Mac? Let's check that first and then we'll just give access to all of them. Come on, Docker. Well, it popped open over there. Let's drag... My resources for Docker for Mac are eight

36:48 cores. So I'm gonna give this the full thing. The request of fail. I'll limit it at that. So let's reapply our PHP application. I'm in the wrong directory. Did you change it so that 1% was when it scaled as well? Yeah. I just wanna force it to scale. That's all. So I'm gonna run Siege again. We're gonna get pods. Oh, looks like there was some stuff, because there's stuff terminating now. Alright. Now it's scaling up. Okay. So now we have triggered an artificial scale event. Now the way that we interact and deal with this in Kubernetes is just sort of this

37:40 describe command. So if we describe our Horizontal Pod Autoscaler, we should see threshold crossed, scale event initiated, and so forth. So yeah, here we go. So our new size was set to four. The reason was that the CPU resource utilization went above the target. It's actually said that that's happened twice over the last minute. So there we go. That was easy, once we got it over a few little humps there. Let's kill Siege. That should all scale back down because the CPU will no longer be being used. Alright. Not ideal, right? Like, CPU-based scaling. I mean, I always think

38:17 Discussing Limitations of Resource Scaling

38:22 the CPU is indicative of, you know, we can saturate it. So it's an important thing to kind of monitor and keep an eye on, but if we talk about scaling being a proactive rather than reactive thing, CPU seems like something that is worst case. Oh no, like, we don't have enough CPU. We need more. And of course if we're running out of CPU, we've got other challenges. So what we actually want to scale on is, well, if we know the average response time for our application, then if it gets below a certain

38:56 service level agreement or objective, then we want to scale up as well to try and bring that back under the value that we're happy with. Does that make sense? Okay. Yeah. Yeah. Totally. And it's quite possible to have loads of CPU left, but you've run out of network sockets or you've run out of processes or... Yeah. That's actually a fantastic point. Thanks. CPU saturation is one symptom, but it's not necessarily what is... like, yeah. If we look at the response time of a request, that can actually be

39:32 the multiple symptoms could be driving that and then that's better to monitor on. Yeah. I was just thinking about a system I worked on that had large binary downloads and basically the network card on the server, well, it's not a real network card. The network capacity of the server could be saturated and it's not really doing anything. It's just serving a big file. Yep. So what are you driving towards? You can measure something better. Yeah. I mean, I always think scaling and monitoring is best done by our users but we don't want them to report their

40:10 problem. We want it to be automated. So for me, I wanna know: if I expect my PHP application to respond in eleven milliseconds, or whatever Siege was telling us, then we want to monitor that going above what we expect and scale on it. Okay. Okay. We got a question from a Reza. So, run out of network sockets, how is Kubernetes gonna handle that? So there's two aspects to autoscaling on Kubernetes. The one we are focusing on today is the scaling of the workload, which means we're gonna monitor for metrics on the workload and use them to

40:51 scale those pods up. If you want to monitor the platform to scale the platform up, then you have to look at traditional monitoring approaches like the CPU, the network, the disk on your Linux system, and then horizontally scale the nodes in your Kubernetes cluster to make sure you have the capacity to scale the pods that are running on the cluster. Two very, very different things, and maybe that's a good idea for another episode, but definitely not something we're gonna be covering today. But I do love talking about monitoring in general, so. Thanks for the question.
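
As a rough illustration of the workload scaling discussed in this segment, here is a minimal Python sketch of the replica calculation documented for the Horizontal Pod Autoscaler: desired replicas = ceil(current replicas x current metric / target metric), with no change while the ratio stays within a tolerance of the target. The function name, the 0.1 tolerance default as written here, and every number in the examples are assumptions for illustration; they are not values read from the cluster in the video.

```python
import math


def desired_replicas(current_replicas: int, current_value: float,
                     target_value: float, tolerance: float = 0.1) -> int:
    """Replica count from the documented HPA rule.

    desired = ceil(current_replicas * current_value / target_value)
    """
    ratio = current_value / target_value
    # Close enough to the target: leave the replica count alone.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    return math.ceil(current_replicas * ratio)


# CPU example (made-up numbers): 2 pods averaging 90% against a 50% target.
print(desired_replicas(2, 90, 50))

# Latency example (made-up numbers): 2 pods with a 2000 ms p95
# against a hypothetical 500 ms objective.
print(desired_replicas(2, 2000, 500))
```

The same rule applies whether the metric is CPU utilization or a request latency; only the source of the current value changes, which is why a response-time metric can stand in for CPU once it is exposed to the autoscaler.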

41:27 Okay. Let me see if I can find any sensible conscious human thought today. There are two ways for us to get some metrics out of our Laravel application. One that involves us writing code and modifying our application and one that doesn't. Ciaran, you got a preference? Alex, you got a preference? If you can change no code and get the same result, that's preferable. It's also the dubious one, but I'm gonna give it my best shot. So because inevitably, when you get code required for monitoring, new sections of code might not monitor it appropriately. Whereas if it's automated

41:30 Introducing Service Mesh (Linkerd) for Metrics

42:11 or abstracted away, then it should be just handled, and you're not relying on somebody doing that work. Yeah. Yeah. A lot of developers aren't thinking about those concerns, are they, when they're writing it? No. And you don't really want them to think about that either. There should be a level of abstraction between why it works in production efficiently and why you're coding it in a certain way, I guess. Okay. Let's try and do both. Right? I think we've got enough time to do this and I think they're both valuable.

42:44 So when we run in Kubernetes we actually have access to a pattern called a sidecar. A sidecar means that we run an extra container in the pod that shares the same networking namespace, PID namespace and a few other bits and pieces. One of the really useful use cases for that is generally called service mesh. Even though I don't want to use service mesh today, I can still use that proxy effect to capture all of the requests that come in and out of our application. In theory, what we should be able to

43:12 do is deploy Linkerd to this Kubernetes cluster, have it inject the proxy into our Laravel application and then have it expose response time metrics on each request that comes out of our container without writing a single line of code. I think that would be pretty neat. So we'll give it a go. Okay. Pretty sure we don't need to use the Linkerd CLI but we may as well. Why not? I'm pretty sure I can just apply a manifest rather than do this. I just don't know if it's gonna be documented here. Yeah. So we'll just move it to the

43:30 Installing & Setting up Linkerd

43:53 CLI. Hopefully it's nice and quick. I really need to start taking notes of stuff to install before the streams. Like, you know, all of a sudden we're watching brew run. How's brew on the M1, Ciaran? Is it better? Is it fixed? Does it work? How do you mean? I think when the M1s first came out brew didn't work. Oh yeah yeah. Well, day one, I was able to get Homebrew working. But at the time, everything had to build from source. Okay. And a lot of things... I believe that they've released version three of Homebrew now, which

44:29 I think has resolved all of those problems. Or like a lot of them anyway. Yeah. Pretty much. So you could get Homebrew two working really quickly and almost... I got my laptop a few weeks after they were available and a few of the packages had bottles already. They call them a bottle. Right? The precompiled thing. A few of them had already got it, but most of them were building from source and it's just slowly been filling out. And the ones that didn't build from source got patched pretty quick. So with Homebrew three,

45:00 you don't have to... I had a couple of manual steps to install it, but you don't need that anymore. It just works. And anything that doesn't work, you can run as an Intel package anyway. Yeah. It's got that translation there, doesn't it? Yeah. Which works really well. You're not really aware of it. You can just run an Intel binary and it transpiles it somewhere and stores the transpiled version. Some magic somewhere in... I mean, I don't want to... Transpiles the binary to an ARM binary and stashes that away somewhere for future use.

45:35 Yeah. I don't wanna segue or deviate too much from what we're talking about, but I read a really impressive article where they were saying that the Rosetta translation and then running of the binary was actually faster than some Intel chips on the M1, which to me just seems ridiculous. That doesn't surprise me. Yeah. Yeah. Let's go back... Because the chip's designed for this use, there's some instructions and a whole memory access mode that's only there to make Rosetta work. But if you're building an ARM chip from

46:00 Adding Linkerd Sidecar for Request Metric Collection

46:10 scratch, you wouldn't have. Okay. Cool. Alright. I've taken the resources off of this just so that we can have better flexibility as we deploy this. I'm gonna reapply this. What we should see is really nothing yet other than those resources being changed, and we can confirm that with a get pods, and you can see our application still has two containers, which is NGINX and FPM. What we want to leverage, with Linkerd now running in our cluster... I'm hoping that it's healthy. Maybe I should just check that anyway. Yeah. Nice. Is that we can now use something called

46:51 automatic sidecar injection to add the third container to our pod, which gives us all the proxy support. I've never done this before, but I'm confident. So yeah, automatic proxy injection. I mean, I've used Linkerd before. I haven't done it quite in this scenario, so I'm sure it'll be fine. And all we need to do is add an annotation to our manifest. In fact, here's one here and we will change it to enabled, obviously. Metadata, annotations, enabled. This should give us that third container. Well, that wasn't very nice, was it? The documentation says something about rollout restart or something.
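
For readers following along, a small Python sketch of the annotation step described above. Linkerd's proxy injector looks for the `linkerd.io/inject: enabled` annotation on a namespace or on a workload's pod template (`spec.template.metadata.annotations`), and it acts when pods are created, which is why existing pods need a restart. The manifest below is a hypothetical, trimmed-down stand-in for the Deployment in the video, and the helper name is made up; whether the annotation's placement explains the failure seen on stream is not something the recording confirms.

```python
import copy

INJECT_ANNOTATION = "linkerd.io/inject"


def enable_linkerd_injection(deployment: dict) -> dict:
    """Return a copy of a Deployment with the inject annotation on its pod template."""
    patched = copy.deepcopy(deployment)
    template_meta = patched["spec"]["template"].setdefault("metadata", {})
    # The annotation belongs on the pod template, not on the Deployment's own metadata.
    template_meta.setdefault("annotations", {})[INJECT_ANNOTATION] = "enabled"
    return patched


# Hypothetical manifest: names and labels are placeholders.
deployment = {
    "apiVersion": "apps/v1",
    "kind": "Deployment",
    "metadata": {"name": "laravel"},
    "spec": {
        "template": {
            "metadata": {"labels": {"app": "laravel"}},
            "spec": {"containers": [{"name": "nginx"}, {"name": "fpm"}]},
        }
    },
}

patched = enable_linkerd_injection(deployment)
print(patched["spec"]["template"]["metadata"]["annotations"])
```

Because the pod template changes, applying the patched manifest causes a new rollout, so the replacement pods pass through the injector on creation.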

47:51 Yeah. The annotations, I don't think, will trigger the... so the way that this works is there's a Linkerd admission controller in the cluster, which monitors for the deployment being created. I think if I just did this. No. So is this basically the Kubernetes version of turn it off and on again, delete the pod and recreate it? I deleted the deployment. Yeah. Alright. Maybe I need to enable Linkerd... Would you do that in production often? Would I? Yeah. Of course I would. Alright. So why does it not work? I can always do it the manual way

48:38 if I really need to, but I wanted to use the nice shiny way. Yeah. So let's just do it the manual way. But this does it at a manifest level rather than doing it through the admission controller. That's what I get for trying to do something that I hadn't done before. Resources, ops, deployment, Kubernetes deployment. And linkerd inject, we'll add that container and then we'll redeploy it. And now we should have one. So it sucks that I had to do it manually, but as long as we get it working, I'm not too fussed. So now we

49:13 have a third container. And we don't need to describe it, but you know, this is just a proxy, and what we really should see here, if we kill our port forward and start that again, is that our application still functions as normal. I love it that my voice goes high pitched when I say that, as if it's a question. Yeah, that's normal. Yeah, as normal. However, every request is going through Linkerd, which means if we pull and expose the metrics from Linkerd we should be able to see a little bit of information about how

49:53 long those requests take. And I'm sure there is a command I can steal to do that too. And metrics. Yeah. There we go. So see. Getting data from the proxies, port forward. Oh, there is the UI as well actually. I wonder if we could just browse to the UI. We all like UIs, don't we? What's the namespace? Nope. Has it got its own? There we go. Port forward. Linkerd web. I really should have looked to see what port it runs on. It was like four two nine one, I think. We're about to find out. No.

50:00 Exploring Linkerd UI and Grafana Metrics

50:53 I was close. If you go to the documentation, there was definitely a four involved. Four one nine one. Okay. No. That's not the UI port though. Oh, is it? Okay. Right now. Yeah. So show me the UI, please. Please. Might just have to Google that: Linkerd UI. In fact, might as well show people the Kubernetes way of doing this, right? You don't always need to go to the docs. So we can see that we have a pod here. We can just describe it and see what ports are available. So we've got ports. 4191 may actually be the port. Either that or 4143.

51:41 So 4191. On that documentation, there's something here which is exposing the dashboard, which is port 8084. Eight zero eight four. But I'm not really sure what it's looking at. It's just called... We'll get there. Okay. This I'm not familiar with but I do. I am. This is cool. Let's see. Ritz. Nope. Is this just gonna be service meshy stuff? Deployments. Here's our Laravel application. I can see our inbound successes. Oh, things are working well. I guess we can break that by running Siege. Wait. Which UI are we looking at here?

52:36 Is this the Kubernetes UI or is this the Linkerd one? This is Linkerd's UI. Linkerd. Oh, it's just because it shows all that stuff like the cron jobs and daemon sets and deployments and stuff. Okay. Yeah. Because Linkerd can inject its proxy into any of these resources. You can kind of break it down and take a look at it. So let's go to our route metrics. We can see some stuff here. Okay. Is it good? Does it drill down any more? No. Not really. What's this Grafana link? Does it really deploy Grafana too? That's not... like, it's just made your day.

53:21 That is awesome. Yeah. I should be more familiar with Linkerd but I'm very impressed with that deploy setting all of this up. That is awesome. Alright. Let's run Siege. Let's break this, right? And then we'll hook up the autoscaler to it based on those metrics. So we just wanna port forward to our Laravel application again. I don't think I have that running anywhere. Pick a different port. 7000. I don't have it running. Okay. It doesn't really matter though. I've got per-directory history as well. So I need to go into the right directory

54:07 to get my command, now on 7000. Let's turn on refresh: last five minutes, refresh every five seconds. Let's see if we can see some charts go crazy here. Can't believe it. So by just running Linkerd in our cluster and adding the proxy container, we've got the Linkerd UI showing which services are communicating with each other. We've got Grafana provisioned with pre-canned dashboards to show us request information and we can already see the latency here spiking. That's really cool. That wasn't supposed to turn into a plug for Linkerd but I feel like I'm

54:48 just gonna have to keep saying nice things about it now. So we can see all these metrics going a little wild now. So it looks like latency is actually hitting like ten to twenty seconds. Was that not two seconds? Oh, is it? God, I can't... the resolution is not great, is it? Yeah, that's the one downside to Grafana: zooming is not great. Just because it's so visual, of course. But yeah, that's two seconds. Okay. Which isn't great but it's okay. It's better than twenty. It's a page... Well, yeah. Yeah. A page which returns two words, like

55:28 hello world. That's it. Okay. So one of the things we should be able to do is, if we go to Explore, this is gonna give us raw access to Prometheus and we should be able to query the metrics that Linkerd is creating for us for our service. I kinda wish I had access to the Prometheus UI. I've not, but we'll do it from a dashboard, right? Because we know the metric that we want to monitor on, which is... we'll use latency for today. So let's click edit on this and see what that query looks like and then we

55:30 Exploring Raw Prometheus Metrics

56:01 can use that as a basis for our autoscaling. So we can see here, the histogram 95th percentile is doing a sum across a rate interval, looking at response_latency_ms_bucket, and it's doing the rate over thirty seconds. That kind of helps. So we know that we have this available. So I'm just gonna copy that metric name and then we're gonna go back to Explore and just drop this in. Go. If there is a Prometheus install, is there not going to be a Prometheus UI also installed? Don't get smart. I hope so. Maybe.
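
To make the percentile query just described a little more concrete, here is a minimal Python sketch of how a quantile can be estimated from cumulative histogram buckets, in the spirit of Prometheus's `histogram_quantile`: find the first bucket whose cumulative count reaches the target rank, then interpolate linearly inside it. The bucket boundaries and counts are invented for illustration, and the real function works on per-second rates rather than raw counts, so treat this as an approximation of the idea rather than Prometheus's implementation.

```python
import math


def histogram_quantile(q: float, buckets: list[tuple[float, float]]) -> float:
    """Estimate quantile q from cumulative (upper_bound, count) buckets.

    Each count includes every observation at or below its upper bound,
    and the final bucket is expected to be +Inf.
    """
    if not 0 < q <= 1:
        raise ValueError("q must be in (0, 1]")
    ordered = sorted(buckets)
    total = ordered[-1][1]  # the +Inf bucket counts everything
    if total == 0:
        return math.nan
    rank = q * total
    prev_bound, prev_count = 0.0, 0.0
    for bound, count in ordered:
        if count >= rank:
            if math.isinf(bound):
                # Rank falls past the last finite bucket: report that bound.
                return prev_bound
            # Linear interpolation inside the bucket that contains the rank.
            span = bound - prev_bound
            return prev_bound + span * (rank - prev_count) / (count - prev_count)
        prev_bound, prev_count = bound, count
    return prev_bound


# Made-up latency buckets in milliseconds: (le, cumulative count).
latency_buckets = [(100.0, 50.0), (500.0, 90.0), (1000.0, 98.0), (math.inf, 100.0)]
print(histogram_quantile(0.95, latency_buckets))
```

With these invented numbers the 95th request falls in the 500 to 1000 ms bucket, so the estimate lands between those two bounds. The precision of the result depends entirely on how the bucket boundaries were chosen.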

56:58 Why is that not running? Oh, there we go. So now we should be able to see... woah. Sorry. It's just been a bit slow. Yeah. There we go. There we go. Alright. Cool. We got what we need. We can scale on that. I'm pretty confident. And you were saying, because we have... let's shut down Siege just now. You're probably right. I think we do have a link. We do have a Prometheus. So let's try and get pods. Prometheus. So let's try that then. Linkerd port forward. Oh, what's the Prometheus port?

57:49 9090. Promise that's the last question I'll ask myself in that tone. 9090. Yeah. I copied that thing, but we should see... I know we got this fancy thing, no, not fancy, but you know, we got a list of all the metrics that we can actually execute against. The response time is the one we're curious about, so yeah, response latency millisecond bucket, and we can use label selectors to filter this, but let's just get them all first and then we can open this and we want... so we just want our Laravel application, see what we have.

58:32 Namespace default. Maybe I don't remember how to do it. Is it equals? Yeah. But we got nothing. Execute. What namespaces have we got here? Linkerd, Linkerd. So it's just doing the proxy ones. Do we actually have anything like... yeah. There we go. So let's copy this. So this is our application. Oh, it's a long way down. There we go. Cool. There we go. This is just the stats that hit our application. Where's the buckets? Histograms normally have a size. Oh, le 1, le 2, le 3. Okay, so this seems to be

59:45 the number of requests that took one second, two seconds, three seconds, four seconds, five seconds, ten, twenty, thirty, forty. Those must be milliseconds. And we can see that nothing is taking above... or no, maybe that's something else. Plus infinity. Okay. This is supposed to be me explaining how histogram buckets work, right? So this less-than is 50,000. I'm assuming that's milliseconds, which means that we are looking at a 50 second response time, and then everything else is in the plus infinity bucket. We seem to actually have a really even distribution of the number... I don't think this is actually the number of requests

1:00:34 and so we can always tweak that, but what's important is that we do have the ability to look at these metrics, so yeah, here's our actual application. Not sure why I'm not getting any data now though. You've turned off your Siege. I know, but I haven't started. There's been no requests in the last... Oh, no requests for the last minute. Yeah. Okay. That's the refresh you just changed, not the time. I swear I've done this before. It's definitely been Damon the last thirty minutes. But you're in the wrong namespace. Alright. Okay. Doesn't matter. We have metrics. We can scale. Now let's

1:01:28 go through the manual instrumentation of the application using a Laravel package and then bring it back around to what we actually wanna do and then join all the dots. Okay. So I think what I wanted to show here, and I got really sidetracked by the shiny cool features, was that we can install Linkerd into a cluster, automatically inject a proxy and get a whole bunch of HTTP and request-based metrics out of it that we can use for our HPAs. Now what we wanna do as well, okay, that's nice, but what if we wanna get custom metrics out of our application

1:02:00 Attempting to add Prometheus Middleware to Laravel

1:02:03 and the ways that we can do that. Okay. So the standard way to do this is just to expose... I mean, we're still port forwarding, aren't we? Yeah. But it's on 7000. Yeah. Normally what we do is just add a metrics endpoint. So right now if we go to /metrics on our Laravel application, we're not getting anything back whatsoever and we wanna change that. Now typically you don't need to write the instrumentation for this yourself. There's generally middleware adapters for, I think, every language and framework I've worked with in the past. Someone has already written this and you just

1:02:46 have to bring in the package. So this is where I'm gonna lean on both of you. My PHP is incredibly rusty. Good. But I did upfront Google Laravel Prometheus metrics. There were a couple of packages, you can see I've clicked on them. Are you familiar with either of these developers and do we have trust in them, or am I just picking the first one? I don't know either of them, so. Which one has nice stars? This one did. I think the other ones are like 11. See, I got Sophie's Choice. It's like this one hasn't been updated

1:03:21 since March 2020, which is understandable given 2020, and then there's... It means it's stable. Means it's stable. This one was updated more recently but had fewer stars. So like that... Well, we go with the recent one then. Stars don't actually mean it's quality, so. No, true. Alright, so it looks like... I mean, I thought it was gonna only offer me the Git approach but we do have a PHP way of doing this. I don't even know if I have Composer. If you put it into the composer.json file and rebuild the image

1:03:46 Attempting Laravel Prometheus Package Integration

1:04:09 it will put it in there. Yeah, good call. I like that. Okay. So we could just drop it in here. I guess it would go here. We don't need this star. Can I just do latest? No, just put a star if you want latest. Okay. Now I don't want to rebuild this image too many times, and we're not, you know, developing locally. I don't have PHP or anything like that, so what I'm gonna do is just instrument it, hope that it works and then we'll rebuild it there. Okay. So this then wants us to connect

1:04:53 this in to bootstrap/app. Sure, I mean, just keep me right here on the Laravel bits. I'm just gonna trust the documentation on this package for the time being, so. I just drop it anywhere? I mean, this is an unusual way of doing it, but yeah. That would normally not go in the bootstrap file. It would normally go in the config file. There's config/app, which is where it would go. Sorry. bootstrap/app. It would go in there. That says bootstrap. Oh, wait. Oh, sorry. You are there. Okay. There's normally a massive array of them. One

1:05:39 minute. I don't see that. No. Sorry. If you go to config/app, and then scroll down a bit, there's a section called aliases. You'd normally put it in here. That's like the standard way of doing it in Laravel. But I assume there's a reason they haven't done that, because what they've done is more effort. Right. I'm just gonna trust them. Just go with what they've got. I don't know why they've done it, but they must have done it for a reason. Okay. And then we have to register. I mean, I don't even know if that's

1:06:26 gonna just be lumped together here. If there's any ordering constraints, like, do I put it to the bottom or... We're gonna think that. Yeah. Okay. Yeah. Guess. So this package has a default configuration, which is just the environment variables. So we just need to do a little bit of tweaking, or does this actually want us to do something else to customize .env? Do we have a .env that goes into the container? Yep. In fact, I can just put these in the deployment like this, yeah. That's one of the things we did last

1:07:04 time, was we set it up with a config map, didn't we? Did we? Did we? We did. Nice. Cool. Oh, yeah. And there's those secret things. But that's all shut down now so it's okay. So Prometheus namespace, that's gonna need tweaks. We've not really deployed our own Prometheus yet but we can do that for sure. Or we'll just piggyback on the Linkerd one. I'm assuming we want to enable route-based metrics. Yes. Although, I mean, if these are the defaults I probably don't need to provide them, right? Depends, obviously it depends whether Dave

1:07:43 might. You mean you don't understand how this random package that you've never seen before, that I've just presented to you, works? Yeah, basically. Right, okay. Prometheus middleware now. Don't know what you are. Don't know what you are. Don't know what you are. Prometheus Redis. So this is, yeah, this is where things get a little weird in PHP land. I never really thought about it upfront and I really should have, but Prometheus has counters that usually are stored in memory, and then PHP with its CGI share-nothing kind of approach probably wants us to have a Redis to

1:08:23 be able to cache those counters so they don't get reset to zero every time. So we'll probably just need to quickly throw a Redis into this as well. Is the idea that it keeps that state somewhere in the application and then periodically exports it to Prometheus? Yes, so the way that Prometheus works is it's a pull-based system that's gonna come to our application on the metrics endpoint every ten or thirty seconds. We're gonna have all these counters that we need to process across multiple requests. Redis is gonna have to make that work.
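
The share-nothing problem and the pull model described here can be sketched in a few lines of Python. Each simulated request starts with no memory of earlier ones, so the counters live in a shared store (a plain in-process class standing in for Redis), and a scrape renders whatever has accumulated in the Prometheus text exposition format. The class, function, and metric names are all hypothetical and are not taken from the Laravel package used in the video.

```python
from collections import defaultdict


class SharedStore:
    """Stand-in for Redis: state that outlives any single request."""

    def __init__(self) -> None:
        self._counters: dict = defaultdict(float)

    def incr(self, key: tuple, amount: float = 1.0) -> None:
        self._counters[key] += amount

    def items(self) -> list:
        return sorted(self._counters.items())


def handle_request(store: SharedStore, route: str, status: int) -> None:
    # A share-nothing worker keeps nothing in memory between requests,
    # so the only durable effect is the increment in the shared store.
    labels = (("route", route), ("status", str(status)))
    store.incr(("http_requests_total", labels))


def render_metrics(store: SharedStore) -> str:
    """Render the counters in the Prometheus text exposition format."""
    lines = [
        "# HELP http_requests_total Requests handled, by route and status.",
        "# TYPE http_requests_total counter",
    ]
    for (name, labels), value in store.items():
        label_text = ",".join(f'{key}="{val}"' for key, val in labels)
        lines.append(f"{name}{{{label_text}}} {value:g}")
    return "\n".join(lines) + "\n"


store = SharedStore()
handle_request(store, "/", 200)
handle_request(store, "/", 200)
handle_request(store, "/missing", 404)

# What a scrape of the metrics endpoint would return at this moment.
print(render_metrics(store))
```

Prometheus only ever reads the current totals on each scrape and derives rates itself, which is why the counters must keep climbing across requests instead of restarting at zero.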

1:09:01 Okay. So we're gonna piggyback on Linkerd. Right? So let's do Linkerd. Doesn't seem to allow me to specify the name or where that Prometheus is. But that's not important because Prometheus is gonna come to us. Oh, I'm so confused. Because we can't configure the target. I'm just gonna deploy Prometheus. That's gonna be easier. Yeah. It's pretty much getting started: get the docs, deploy the manifest, and most people would probably go down the route of configuring the operator. Maybe that's easier. Kubernetes. Prometheus operator. Let's just deploy that. Quick start bundle

1:10:01 bundle. There we go. Okay. Apply our bundle. Very trusting. I have no idea what's in this file. But the Prometheus operator will give us a CRD approach to request new Prometheus servers and then configure them using something called a service monitor, which is just another CRD. In fairness, I do think this is probably the quickest way for us to get this working. We applied it to the default namespace, so we should see our operator running here. Now we need the YAML to ask for one Prometheus. Let's jump back to the docs. Guess we'll just get it from the Git.

1:11:13 Nope. Get me a Prometheus. There's a service monitor. That's good. And there's a Prometheus. Okay. We'll add it to our ops Kubernetes. Now we have this dependency that we're not gonna automate in any way, but I'm just gonna add the CRD, which says give me one Prometheus. We'll call it Prometheus. Service monitor selector. That just means we need to remember to add team frontend to our service monitor. I think that should be all right. Let's find out. So we'll get to the right directory. There's a lot of just let's find out, isn't there? I'm sorry.

1:12:11 And the operator should detect the CRD and should spin us up one Prometheus, assuming I haven't got anything wrong very quickly there. I would have expected to see my Prometheus already. Did I get anything wrong? I don't think I need to add my service monitor first but I will, just in case. Laravel. We've added the label that it expects. Now we want to point it to our application. So if we take a look at our deployment, we have this label and we can just match that straight up, and it's using a named port to work

1:13:16 out how to fetch those metrics. So we just need to make sure that our port is named, which it's not, but we could just do that. Oh no, because I don't wanna reapply this because of the sidecar. So we'll just do port 80 like so. Yes? Okay. Alright. I was gonna say, you've got the service monitor's name as Laravel. Does that need to match the Laravel example project or is that its own thing? On line four. I know that that's okay. So the Prometheus is selecting a service monitor based on the label team frontend, which I've kept the

1:14:08 same here. So... Okay. That just means that our Prometheus should detect that service monitor, which will add it as a scrape target to our Prometheus, which, disappointingly, hasn't magically shown up yet, and pull our metrics. So why do I not have a Prometheus? Let's grab the logs quickly and see if there's anything obvious. Okay. What have I forgotten? Operator, deploy and manage Prometheus server. Yep. That would be nice. Let's see. One of them's got operated, not operator. Does that matter? So that service tells me that it's tried to create our Prometheus. Maybe just failed.
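
The chain of label matches being assembled here can be modelled with a short Python sketch. In the Prometheus Operator, a Prometheus resource picks up ServiceMonitors whose labels satisfy its `serviceMonitorSelector`, each ServiceMonitor's `selector` is matched against Service labels, and each endpoint's `port` refers to a named Service port. The manifests below are hypothetical, trimmed-down dictionaries (the names `laravel` and `web` are placeholders), and only the `matchLabels` form of a selector is modelled.

```python
def matches(selector: dict, labels: dict) -> bool:
    """True when every matchLabels pair is present in labels.

    An empty selector matches everything.
    """
    wanted = selector.get("matchLabels", {})
    return all(labels.get(key) == value for key, value in wanted.items())


def scrape_ports(prometheus: dict, service_monitor: dict, service: dict) -> list[int]:
    """Port numbers this Prometheus would scrape on the Service, if any."""
    monitor_labels = service_monitor["metadata"].get("labels", {})
    if not matches(prometheus["spec"]["serviceMonitorSelector"], monitor_labels):
        return []  # Prometheus never picks this ServiceMonitor up
    service_labels = service["metadata"].get("labels", {})
    if not matches(service_monitor["spec"]["selector"], service_labels):
        return []  # the ServiceMonitor does not select this Service
    wanted_names = {endpoint["port"] for endpoint in service_monitor["spec"]["endpoints"]}
    # Endpoints are resolved by port name, so an unnamed port is never found.
    return [port["port"] for port in service["spec"]["ports"]
            if port.get("name") in wanted_names]


# Hypothetical manifests, reduced to the fields that take part in matching.
prometheus = {"spec": {"serviceMonitorSelector": {"matchLabels": {"team": "frontend"}}}}
service_monitor = {
    "metadata": {"name": "laravel", "labels": {"team": "frontend"}},
    "spec": {"selector": {"matchLabels": {"app": "laravel"}},
             "endpoints": [{"port": "web"}]},
}
service = {
    "metadata": {"name": "laravel", "labels": {"app": "laravel"}},
    "spec": {"ports": [{"name": "web", "port": 80}]},
}

print(scrape_ports(prometheus, service_monitor, service))
```

A break at any link in this chain leaves the target list empty without an obvious error, so checking each selector in turn is a reasonable first step when a target fails to appear.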

1:15:36 K. Get service monitors. Laravel. Let's describe it. It's alright. We just don't have Prometheus. I'm just gonna deploy my own Prometheus. And then I'm just clicking on random links looking for YAML. Pretty much sums up my job as well to be fair. Kind deployment. Nope. Not there. Next. Deployment for me, I there's nothing particularly special about it. Nope. I want the YAML. Kubernetes. Nope. I'm not enjoying computers today. How old is this? 2018. That's probably not the wisest thing to go through on my cluster is it? What have I missed? What have I been silly with either of you?

1:17:08 I have no idea. So we installed the operator and it's happy. It's definitely happy. I found something which has the word deployment in it and the word Prometheus as well. You want more YAML to throw at a wall that might... I don't know. Hold on. Let's do get Prometheus. So we had our CRD and we requested a Prometheus. We could see that we have no version and no replicas. Right? That's bad. So we're gonna describe our Prometheus and see if there's any events on it. I'm really regretting calling it Prometheus. So I think it's failing because the service account

1:17:56 doesn't exist. So let's use the default service account and see if that keeps it happy. So now we can just redeploy our prometheus.yaml once I learn how to computer. And then we run get prometheus. There we go. So it was... Okay. Service account. I'm not a failure. Okay. So I don't know why the operator wasn't printing anything out to say, hey, the service account doesn't exist. You're not gonna get your Prometheus. Regardless, this is gonna configure us a Prometheus in our namespace with a service monitor that monitors our Laravel application. That monitoring is gonna fail,

1:18:46 we'll actually be able to see that through the Prometheus UI. So let's jump into that very quickly before we deploy our PHP changes. What you should see here is in targets. I was kind of hoping the service monitor would have picked that up. Maybe I broke that too. Let's see. This is just not showing them. Okay. I think it's trying. Okay. But we don't have anything there for it to work. So let's make the change that we need to make for our application to have that endpoint. So let's go back to those environment variables

1:19:47 on our deployment or config map. Apologies. Oh yeah. We know we need a Redis. Right. Redis Kubernetes. Just give me a YAML, a Redis pod. Am I feeling that bold? Yeah. What's the worst that could happen? That's a really stupid idea. I'm just gonna do it. So that should give me one Redis. I feel... I don't know. It's one available; we probably wanna make that available as a service. Yeah. That is a really stupid idea. I don't even know why I entertained that. Okay. Next. Helm. I should just do the proper installation there. So... Do we need to worry about it

1:20:56 saying it's deprecated? No. It's okay. Okay. Seems like such an overhead, doesn't it, to deploy Redis just for this? It is. Again, it's only in PHP because of the share-nothing thing. It is a bit of a pain in the ass but we're just gonna go with it. And the deprecation warning here is because, I know you can't really see the URL and I can't zoom in on the URL bar, but this is helm/charts on GitHub, which is deprecated, but it's actually pointing us to the Bitnami one, which is not deprecated, so. Okay. It's okay.

1:21:33 And we'll call our Helm... I will call our Redis redis. So are we gonna have one Redis? Are we gonna have one Redis per pod? No. We're just gonna run one Redis in our cluster and then our PHP application is gonna use that for cache and persistence of Prometheus counters. What this should mean is we have a Redis service here which I can now point our PHP application to use. So Redis master or headless. In fact, it probably wants the headless service, so let's do that. So now we need to come back to our config map, and our Redis is available on Redis.

1:22:16 Let's point it to the master. I think that's what I was gonna expect. Our Prometheus now lives in our namespace. Let's call that default. This may work. Confident. I definitely lack any confidence today. So now that we've modified this config map, we can apply that first and nothing should actually happen yet because those keys won't be, aren't gonna be consumed by our PHP application. However, what we want to do now is rebuild those images with that code change that we made. Did we finish the code change? Were we confident there? I think did you make your way through

1:23:03 the entire file? So what's this suggesting? Do I need this load component thing? It says to customize the configuration you either need to override the environment variables, which I've done, or you can copy the included prometheus.php to this location, edit it and then use it. Yeah, we don't need that. Yeah, just copy that and see what happens. No. You want the middle one as well. Okay. So now we have to add this also to our bootstrap/app.php and again I have no idea whether ordering is important, so.

1:23:49 Just put it below, it came below in the documentation. That's kind of the same logic I'm using right now. To observe Guzzle. We don't care about Guzzle, we're not doing API calls. Alright. We do have database providers so let's get those metrics too. I mean, if this application is doing all the things that it suggests it's doing, that would be really nice. So I wonder if we could just abuse the disk rather than Redis. In fact it does mention APC here and there's a memory adapter. So maybe there was a

1:24:22 way to do this without Redis. Yeah. Oh. Alright, and then that's the... Cool. You could probably use OPcache as well. Yeah. I think that would make a lot of sense too. Alright. Let's rebuild our images and we're just gonna see what happens here. Alright. Fingers crossed. I was gonna say hopefully the npm one won't need to do anything because there's been nothing that's changed in our assets. But I think something's changed. We're copying everything. Right. So we've invalidated the cache, haven't we? What a crappy Dockerfile. Who wrote that?

1:25:03 I don't know. Yeah. Actually, Ciaran and I did an episode on really nailing down how to have an optimized cache for Docker and PHP applications. I'm not sure why we didn't copy that here. If I didn't write this one, that was you, Alex, wasn't it? I wrote this one. It's a copy of the stream that me, you and Ciaran did ages ago. Well, we did. Last October. There was a copy of that. Oh, yeah. So the problem is I think you've done the multiple copies but we've not got the steps. Oh no, wait. Oh, that's dev. So

1:25:43 oh yeah. You have to run the composer install here. After the... That's all. Yeah. No big deal. It won't take too long, and then what we'll do is I won't bother reinjecting the Linkerd proxy, and we've seen it not working, so let's just leave it out. We'll apply our manifests over the top and then we will confirm that the metrics endpoint on our Laravel application works. Once we confirm that, that's pretty much it. Once those metrics are available, we configure the HPA the same way that we did at the start,

1:26:18 instead of using CPU, we're gonna point it to custom metrics and it should just work, TM. Shame the little Discord chat thing is not working. I'll try it again. Oh well. We broke it. Oh no. So what's that complaining about there? Complaining about Illuminate Foundation Application withFacades. Yeah. Okay. So that's before, when I was saying this is a really weird way of doing it. That'll be why it's weird. Because it's not a valid way of doing it, it looks like. Yeah. Is it worth maybe bringing this into a separate stream at some point, like,

1:27:00 Dependency & Compatibility Issues with Packages

1:27:20 using application level metrics? Because I don't think that that package works. What? All of those methods, like, withFacades and stuff, they don't seem to be... like, that's not how I would ever register things. Okay. I'm gonna take it out. I'm gonna feel brave here. So hold on. I've had that as... Alright. Let's use this one. Okay. We tried your one. Yeah. Cool. Yeah. This is the normal way of doing things. They seem to be the same. So let's trust it. And if it doesn't work, it doesn't work. We'll move on. No big deal. Mhmm. Okay.

1:28:11 So I need to modify the composer.json. Yep. So, and then in our app.php we add a provider. So this is config/app, not the bootstrap app which you were in before. Okay. So I can just put it anywhere here? Put it at the bottom, but yeah. No. That's the wrong one. Sorry. There's two arrays, go to the top of it. The first array, and put that one in. Yeah. Here. And then you got random. Yeah. Yeah. Cool. And then we add this one to the aliases. Actually, this seems very similar. It's very similar but that's the correct way

1:28:59 of doing it. I don't know. I assume this is maybe a fork where they've modernized it or something, maybe. Hopefully that... You've got a rogue closing bracket there as well, by the way. So you might get rid of that. Why did they keep copying them? Or did I paste the wrong thing? I think that one there was when the first one you copied had you highlighted two lines. I think that's what it was. Alright. Are these the same? Prometheus namespace, yeah. I think they are. Let's check the config map. The Redis is different. They're

1:29:31 just using this. Which is standard Laravel. So Prometheus namespace, metrics route enabled and metrics route path. Yes. Yes. It does want this. I'm just gonna throw that in very quickly too, even though I killed it in the last example. And then Redis host, Redis port. Redis host and Redis port. Yep. You need to put the prefix one back in, I think, and the storage adapter. Oh, the storage adapter. So that would make it not use Redis, would it not? If he was... Well, they seem to use both. So I'm gonna trust that this

1:30:18 person knows what they're doing. Because I have no idea what I'm doing. So Redis master. And why am I getting red squiggles everywhere? It's specified twice. Thirty three and thirty six and thirty four. Okay. Now it's happy. We have a comment that I don't understand. Hopefully one of you two do, but Jake Harris says 2021 and no auto discovery dance game. I don't know. Yes. So in, like, Laravel 5.5 or 5.6, like about four years ago, service providers became auto discovered. So you don't have to do that stuff where you add it to the aliases, just by having a composer file. Having it in your

1:31:04 composer file would make it work. I'm guessing, but he's just on about the fact that it's not been, you know, updated to have that stuff done. Okay. So FPM failed again. And that looks like it didn't pull in the composer... pull in the package. Do I need to delete the lock file? No. Is it doing a composer install or composer update? I don't know. In the Dockerfile. Install, prefer dist, no lens, no dev. So delete the lock file. Yeah. Yeah. I don't know how it handles copying the file that doesn't exist.

1:32:00 Alright. Fingers crossed. It doesn't. I just thought it might not. Oh, Chris is back. Alright. Okay. Yeah. There is a copy. Yeah. We'll get there. Yeah. I know we're over time for today. So if either of you need to drop, feel free. I'm gonna go whenever. Alright. Alright. I'm gonna... We have hit a lot. I do actually have to go because I've got a call in five minutes. Yeah. No worries, mate. Thanks for tuning in and joining us. Yeah. I added a lot. Thanks for having me. No worries. See you soon. Okay. So what happened here then?

1:32:51 So this is saying that it basically can't resolve our dependencies because we're using a newer version of Guzzle than what they... basically, this is a package that isn't up to date with dependencies and stuff. Is that maybe why someone forked it? Well, yeah. But then the one that's forked is worse. I'm just gonna try and configure the fork the way the other one is configured, because it's so similar that I can't see why it wouldn't work. Right? Okay. Yeah. And if it doesn't work, you know what? We'll come back to it another

1:33:34 day. But I'm gonna give this my best shot. If anyone watching has added Prometheus to their application, feel free to throw some advice our way. Alright. So we wanna add something to config app aliases, which we did previously. Wrong code. To me, this just looks like it's screaming out for somebody to write a new and better package. Yeah. It's kind of the vibe I'm getting. That's the wrong one. Could be a fun project to do, I guess. But, yeah, most of this was modified two years ago. Bits of it were changed five

1:34:21 months ago, but... Okay. So I've added the alias. We need to register this, which seems to be done in its own way, but we copied, you know, we saw that up here. And I think I already modified composer.json but I'll double check. Yes. Just one second. What version of Laravel are we using? Can you check the composer.json file? Right. Yeah. This won't work. The package hasn't been updated. It only seems to support version seven, and there's an open issue on there for supporting seven. So does that mean there is

1:35:13 no working Prometheus Laravel package? It certainly looks that way from a cursory look at Google. Yeah. Laravel. I find that hard to believe. Would there be an alternative instead of Prometheus, like, that is done by somebody, maybe? So I mean I can definitely walk through it. Why don't we add a manual metrics endpoint for now? I can show people. I mean, it's so easy, right? These packages are supposed to make it automated, in that, you know, the middleware will hook into the request pipeline, see the request coming in and take a timestamp, see

1:35:35 Adding Manual Prometheus Metrics Endpoint

1:35:59 the request going out, take a timestamp, hold them in memory and then drop them into a page, right? But Prometheus metrics are not difficult, so let's remove this, let's go back into our config, let's remove all that stuff as well. Really disappointed there wasn't just a make-it-go package. One thing we could... Yeah. It just looked like there's an opening in the market of packages where it'd be a nice thing to fill. We could instrument through OpenTelemetry, which is another really, really good approach for doing distributed instrumentation in your application. It's not

1:36:36 what I'm trying to show today so I won't do that. We'll add a custom metrics endpoint right now and then we'll wrap up for today, and I think we'll come back and we'll revisit. You know, maybe even you and I can sit down and write a Laravel middleware to do this. Like I said, it is not difficult. Yeah, no, I'm down for that, because writing Laravel packages is quite fun. Okay. Let's save all of this. I'm gonna close all my tabs. How do I add a new route to my Laravel application? Okay. So go to routes/web.php.
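
A rough sketch of what such a middleware does, written in Python rather than PHP purely to keep it short and runnable: take a timestamp on the way in and on the way out, file the duration into a bucket, and print the totals as Prometheus text. The class and function names, the metric name, and the 1, 2 and 3 second bucket bounds are all illustrative assumptions, not anything taken from the packages tried in the video.

```python
import time
from bisect import bisect_left


def _labels(labels):
    """Format a label dict as {k="v",...}; empty string when there are no labels."""
    if not labels:
        return ""
    return "{" + ",".join(f'{k}="{v}"' for k, v in labels.items()) + "}"


class RequestHistogram:
    """In-memory histogram of request durations (illustrative only)."""

    def __init__(self, buckets):
        self.buckets = sorted(buckets)
        # One slot per finite bucket plus a final slot for +Inf.
        self.counts = [0] * (len(self.buckets) + 1)
        self.total = 0.0

    def observe(self, seconds):
        # bisect_left finds the first bucket whose upper bound is >= seconds.
        self.counts[bisect_left(self.buckets, seconds)] += 1
        self.total += seconds

    def render(self, name, labels):
        """Render cumulative buckets in the Prometheus text exposition format."""
        lines = [f"# TYPE {name} histogram"]
        running = 0
        bounds = [repr(b) for b in self.buckets] + ["+Inf"]
        for le, count in zip(bounds, self.counts):
            running += count  # Prometheus buckets are cumulative
            bucket_labels = _labels({**labels, "le": le})
            lines.append(f"{name}_bucket{bucket_labels} {running}")
        lines.append(f"{name}_sum{_labels(labels)} {self.total}")
        lines.append(f"{name}_count{_labels(labels)} {running}")
        return "\n".join(lines) + "\n"


def timed(histogram, handler):
    """Wrap a handler the way a middleware would: timestamp in, timestamp out."""
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return handler(*args, **kwargs)
        finally:
            histogram.observe(time.perf_counter() - start)
    return wrapper
```

Rendering with a name such as `http_request_duration_seconds` and the label `app="laravel"` gives one `_bucket` line per bound plus `_sum` and `_count` lines. Under PHP-FPM each request starts with fresh memory, which is why the packages tried earlier ask for Redis or APC as storage; the in-memory counts here only keep the sketch small.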

1:37:11 Okay. Next. This is kind of a caveman approach, but I'm okay with the caveman approach because we are over time and I wanna try and get something working. So how to... You want it to be a GET request, then get the path, and then... Is there no shorthand function syntax in PHP? There is. I just don't use it because it's PHP 8 and some of our stuff is still using PHP 7-point-something. Okay. Can I just use print here? So, like, what do you want to do? Just get, like, a string on the page? I just

1:37:48 wanna return a plain text string. Okay. Return, double quote, string in there, and double quotes. That's it. Does PHP have string literals, multi-line strings? Yeah. So it's... Oh, Christ. How do you do that? Three left chevrons. Really? I have to do a heredoc? I believe that's how you do it. Yeah. Well, you can do double quotes and put it on multiple lines if you want to. That's... Sorry. Yeah. Yeah. Yeah. Okay. So let's try and get that response time here. These are just Prometheus metrics, there's nothing fancy here, and then I can do one,

1:38:29 I can add metadata to this, I could say app is Laravel. And this was gonna be a histogram. I would add my buckets, so we'll say less than one second, two second responses, three second responses. We'll assume that, you know, 20 requests, 28 requests have come in and we responded in under one second. We've had four, or two, requests that took two seconds and one request that took three seconds. That's it. That is a metrics endpoint. Obviously we'd want real data. Yeah. Why is my end of heredoc not working? So you need to put a semicolon at

1:39:10 the end of it. I tried that. And also a semicolon after your Monday. Got it. Thank you. So obviously we'd actually want the middleware to track how many requests are fitting into each bucket. We can also add any arbitrary metric here that we want, and we could do HTTP version, app Laravel, PHP 7.4. And another thing we could do is users, name equals David. And we had two Davids. Right? Completely random. But this is all our metrics endpoint is: something that fits this form, and then we can deploy it. So I'm gonna hopefully

1:39:55 Building and Deploying with Manual Endpoint

1:39:58 build this image, deploy it, hit the metrics endpoint. Prometheus, hopefully, if I've not messed up enough, will scrape that based on the service monitor that we created, and we could use that for an HPA. That's kind of where I want us to be, hopefully finished in the next five minutes. Confidence. Okay. And maybe I'll get some lunch today. It's quarter to four and I've not eaten anything yet. It's been a busy day. So Jake Harris, I think he's picking up the gauntlet and is gonna attempt to magic us a Prometheus middleware for Laravel. That would be amazing, and Alex and I would both be

1:40:37 happy to help, please feel free to link us to any repository or any efforts. Okay. Now that we have a new image we just wanna redeploy all of our manifests. Hopefully I've not broken anything in there during this kind of hectic component, and then we're gonna do a get pods. Let's see, four seconds, that's good. So we have got a deployed application. Gonna port forward to Laravel example. Let's see if 8080 is available. Yep. We should have our application and we should have... We don't get new lines, or is that just my browser? If you view source. Okay. Cool. Yeah. Usually

1:41:05 Confirming Manual Metrics Endpoint Works

1:41:25 it must be... my Chrome has a Prometheus extension that formats metrics, but oh well. Okay. I mean, I wish I could say there's more magic than this, but that's all the metrics endpoint is. Obviously you want the numbers to be real, you want the dimensions, the metadata, to be vast and to understand your application. You can expose as much context as you want here that you want to be able to query, slice and dice with from a monitoring perspective. You'd also want it to be behind some form of authentication, I guess, wouldn't you? Some

1:41:30 Attempting Prometheus Scraping via ServiceMonitor

1:41:57 sort of, like, you wouldn't... I mean, you may only want to respond on a local internal address, like, you know, the Kubernetes pod or service CIDRs, sure. 10-dot-something or whatever. You might not want to, yeah, you probably don't want it to be exposed to your public end-user customers. Yes. But I wouldn't necessarily put it behind auth, I would probably just... Firewall or something. Okay, something like that. That's a whole... I mean, there's so many different approaches to that, but let's cover that later. Okay. Let's see if our service monitor works. So

1:42:33 let's see. Describe service monitor Laravel. Hopefully... let's get our Prometheus up and running again. Did I still have that port forwarding? I can't even remember what my port forwards are anymore. It was like 9191, I think, for that one. That's Linkerd. Okay. Let's close them. Let's port forward. I know. You need to get on screen or tmux. I can't stand tmux. Okay. How come? It just breaks scrolling. Scrolling becomes really cumbersome on it. Oh yes. It's like Ctrl-B, left square bracket and yeah.

1:43:23 I wish my service monitor was working. So what is the service monitor trying to do? Okay. So Prometheus is a pull-based system, which means that Prometheus has to know which pods within our cluster it should fetch metrics from. It does that through the Prometheus operator monitoring for our service monitors here, and those service monitors tell it how to find things that expose Prometheus metrics, and then it updates the targets and then it should just work. But where have you told it to go to slash metrics? Okay. Let's double check my configuration. So we added something to our resources

1:44:03 here in Prometheus. We told it to find the Laravel example application, use port 80, and that's gonna go to the pods in this application on port 80 and fetch slash metrics. Oh, does it do slash metrics by default? Yes. Okay. Right. So the label team frontend exists, so it should be going to this Prometheus. So we can double check that we have this label on our deployment. I'm pretty sure we do. Yeah. I think you copied that one. Yep. And our pod gets it too, which is perfect. And the port is 80. The port is 80. Yep. Oh, is it getting that

1:44:50 from... it doesn't matter if it goes via NGINX to FPM, does it? That doesn't matter. No, because we hit that in the browser so it should be okay. Yeah. So let's port forward to Prometheus again. Let's see. Service discovery. That seems to be picking up some Laravel configuration. Oh, that's the job. Why is it going to the job? I wonder if it has the same labels on it. Oh, it probably does. Alright, let's try it. It was a generic label and not specific for web or anything. Okay, let's add Promise. I don't know why but I'm gonna add

1:45:25 Debugging ServiceMonitor Configuration

1:45:49 it and see if that tweaks our configuration. Let's modify this here. So instead of going here we'll do this and we'll do the same. Well, in fact that's okay. Yeah. That should hopefully work. Who knows at this point, right? So let's apply everything. Oh. Did not allow that. Yeah. It's okay. Let's just delete or deploy. Okay. We'll give that a second. Hopefully just fat. Yeah. It's almost healthy already. So we have a comment. No. I know. Says OpenTelemetry for PHP is very new. So waiting on gRPC and the PHP package to help with direct

1:46:34 integration with Honeycomb. Yes. Honeycomb is a great front end for OpenTelemetry and distributed traces and structured events. Really cool. Just a really cool product. Definitely worth checking out in this space, and OpenTelemetry is new all around the board. They are starting to stabilize on their APIs, they have caught up with a lot of the feature gaps that they had during the migration from OpenCensus and OpenTracing. Yeah, it's worth checking out, and I think instrumenting your application with OpenTelemetry now is definitely the safest way to move forward. Alright. Let's see what happens. Let's go back

1:47:10 to our targets. Prometheus, you're really annoying me today. I wonder if we can just do this old school style. Prometheus annotation. There used to be a way where you could just add an annotation to your pod and tell Prometheus to go find you. It may still work, although the service monitors are the replacement for that, so we'd be better to fix it. But I'm also worried about time here. We're way over where we wanna be. So let's see if we can get any more information from that service monitor. Laravel. Okay. No is working on an OpenTelemetry package.

1:48:15 GitHub slash Sean hood slash. You know what? I actually came across that and opened it earlier. Your package. I have it open on my other browser. I was gonna show this off today but I didn't really want to go into the OpenTelemetry stuff. But we will definitely do more on OpenTelemetry over the coming weeks, so yeah, cool that you're working on that package. Okay, service monitor. Promo labels, yes, port 80 should be fine. What we can do is take a look at the logs, which I know failed us miserably earlier, on the operator.
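
As a debugging aid, here is a simplified model, in Python, of the three matching steps the Prometheus operator performs before a target shows up. It is a sketch under stated assumptions: only plain label equality is modelled (no namespace selectors or match expressions), the real targets are the pod endpoints behind the Service rather than the Service itself, and the label values and the port name `web` are hypothetical. The transcript does not establish which step, if any, was the actual cause of the missing target.

```python
def matches(selector, labels):
    """True when every key/value in the selector is present in labels."""
    return all(labels.get(k) == v for k, v in selector.items())


def discover_targets(prometheus, service_monitors, services):
    """Return (service name, port number, path) tuples that would be scraped."""
    targets = []
    for monitor in service_monitors:
        # Step 1: the Prometheus resource selects ServiceMonitors by the
        # labels on the ServiceMonitor object itself.
        if not matches(prometheus["serviceMonitorSelector"], monitor["labels"]):
            continue
        for service in services:
            # Step 2: the ServiceMonitor selects Service objects by label,
            # not Deployments or Pods.
            if not matches(monitor["selector"], service["labels"]):
                continue
            for endpoint in monitor["endpoints"]:
                # Step 3: the endpoint's port refers to a Service port by NAME.
                port = service["ports"].get(endpoint["port"])
                if port is not None:
                    path = endpoint.get("path", "/metrics")  # default scrape path
                    targets.append((service["name"], port, path))
    return targets


# Hypothetical objects loosely modelled on the ones described in the video.
prometheus = {"serviceMonitorSelector": {"team": "frontend"}}
monitor = {
    "labels": {"team": "frontend"},
    "selector": {"app": "laravel-example"},
    "endpoints": [{"port": "web"}],
}
service = {
    "name": "laravel-example",
    "labels": {"app": "laravel-example"},
    "ports": {"web": 80},  # port name -> port number
}
```

With these objects `discover_targets(prometheus, [monitor], [service])` finds one target on port 80 at `/metrics`. Changing the endpoint to `{"port": "80"}`, or leaving the label off the Service while it is present on the Deployment, yields no targets at all in this model.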

1:48:57 Yeah, that logging is crap. I'm gonna restart it just for the sake of it. Let's see if it reconfigures the Prometheus thing. I mean, it should be dynamic. Restart everything. Right? So... Yeah. It went last time. Okay. So I'll go check our targets. So where it says zero out of 11 active targets, does that mean that it's not finding how to access those 11? Yeah. The service monitor is not picking up our application whatsoever. Which is frustrating. Anyway, I think we'll just need to leave it there. Not as successful as I was hoping. I think there is a lot. Let me

1:49:58 Conclusion & Wrap-up

1:50:06 drop off our empty slot there. You know, we're bigger. Right. That should have been a lot easier. I'm not happy with the challenges that we came across. You know, Jake, you said you're gonna give that a go. I think maybe this is just Laravel. I think Symfony possibly has better support with Prometheus exporters. I know when I did a little bit of Googling just to see if these existed in the PHP ecosystem, there weren't a lot of options, but this is gonna save you from instrumenting your code yourself. You know, you don't really want

1:50:43 to add your own slash metrics endpoint and be writing those out. You really want the middleware to do it all for you. OpenTelemetry is a good way of doing it. We did have success with Linkerd and automated proxy metrics, which was sweet, and so I think, based on what we've seen today, the best way forward, until Jake writes his new million dollar library and starts selling it to people, would be to go down the automated proxy approach, because the most valuable metric you can scale on is how long does it take to respond

1:51:14 to 95% of my customers, and am I happy with that number? And when the answer is no, scale up. So Alex, how are you feeling with our auto scaling now? So the only successful auto scaling we had today was the one that you started off by saying that you didn't like doing, which was the one based on requests and limits. Yes. The horizontal autoscaler in Kubernetes. Yes. So there's definitely more that we can do on this. Because you mentioned a few times about doing custom metrics and things to scale based on

1:51:58 responses from Prometheus or whatever. Uh-huh. So let me just show how that would have worked. You know, if we had a Laravel package that was exposing Prometheus metrics, and our service monitor was being picked up, and that was available and I could query those metrics in Prometheus, what I would have done next is deploy the Prometheus adapter to our Kubernetes cluster. The Prometheus adapter, or in fact any adapter for the metrics server, is what exposes new custom metrics that can be used in the HPA rules. In fact, I have it already in the

1:52:34 demo directory, I think, and I want that. Nice to smart down. Okay. Hopefully there's some YAML down here, but this would have allowed me to use custom metrics to do the HPA. Let me see if I can zoom in on this. Instead of using the CPU utilization on the HPA object, we would have been able to actually use rules which identify a series, which would have been our HTTP response time, and we would have told it to, you know, perform a query against that and try and get what that percentile is, very much like the query that

1:53:04 we copied from the Grafana dashboard that Linkerd provided to us. You know, we want to understand percentiles. You know, we can't give everybody a thirty or forty millisecond response time, but we wanna try and ensure we get as many people into that bucket as possible. So hopefully once we find a Laravel package that works and we can retry this again, we'll be able to get to the stage where we can deploy the Prometheus adapter, configure a new HPA and take it from there. All I can hope for the people that have watched us throughout this is that, you

1:53:36 know, hopefully the components and primitives required to do horizontal scaling on Kubernetes are now more familiar to you. You know the right words to Google, and you know which package you need to exist in order to get those metrics out of your application, and even using Linkerd as well wherever possible. So while we were unsuccessful, I hope there was enough there for people to get interested. Yeah. So in terms of using Linkerd, would you use that alongside or instead of something like collectd or statsd, or are those not really relevant at all

1:54:08 when you're dealing with Kubernetes compared to a standard VM? Yeah. I'm unlikely to use collectd or statsd. Those are, you know, platform or host based monitoring. They're not really application level stuff, although you can use them in that regard. Prometheus being a CNCF graduated project, a lot of people are just running it on Kubernetes because of the kind of way the Prometheus operator does work when you get the service monitor right. The service monitor thing, I'm sure, is my error, and I'm sure when I rewatch this to type up the, you know, the timeline for

1:54:38 the description, I'll see the stupid mistake that I've made, but obviously right now I'm not seeing that. No, I don't see it either. So yeah. It might be worth, when we work out what the mistake is, putting a snippet in the description or something, maybe. Well, yeah, all of the resources that we have created, even the broken ones for now, will be pushed to your repository, which is in the show notes. It's also github.com/go for Alex. Alex Bowers slash Laravel example project probably. So it's in the show notes. Alex Bowers will be the one that's on there with the

1:55:12 most recent push on it. So... Yeah. So all this stuff will be there. Hopefully the Laravel situation improves and we'll do a second part of this. Thank you all for watching, and thank you Alex, and Ciaran has left, but thank you for joining me and having some fun. Thank you. Alright. Have a great day. I'll speak to you soon. Bye. Yep. And you. Bye.
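
To make the wrap-up idea concrete (scale on how long it takes to respond to 95% of requests), here is a small Python sketch of the two calculations involved. The first estimates a percentile from cumulative histogram buckets by linear interpolation, similar in spirit to PromQL's `histogram_quantile`; the second is the replica formula from the Kubernetes HPA documentation. The function names and all sample numbers are made up for illustration, and the real HPA controller also applies a tolerance, min/max replica bounds and stabilization that are left out here.

```python
import math


def quantile_from_buckets(q, buckets):
    """Estimate quantile q from cumulative (upper_bound, count) buckets.

    Interpolates linearly inside the bucket that contains the target rank.
    """
    if not 0 < q <= 1:
        raise ValueError("q must be in (0, 1]")
    buckets = sorted(buckets)
    total = buckets[-1][1]
    if total == 0:
        return float("nan")
    rank = q * total
    lower_bound, lower_count = 0.0, 0
    for upper_bound, count in buckets:
        if count >= rank:
            if math.isinf(upper_bound):
                # Rank falls in the +Inf bucket: report the last finite bound.
                return lower_bound
            in_bucket = count - lower_count
            fraction = (rank - lower_count) / in_bucket
            return lower_bound + (upper_bound - lower_bound) * fraction
        lower_bound, lower_count = upper_bound, count
    return lower_bound


def desired_replicas(current_replicas, current_value, target_value):
    """ceil(currentReplicas * currentMetricValue / desiredMetricValue)."""
    return math.ceil(current_replicas * current_value / target_value)


# Made-up cumulative counts: 20 requests within 1s, 22 within 2s, 23 within 3s.
sample = [(1.0, 20), (2.0, 22), (3.0, 23), (float("inf"), 23)]
p95 = quantile_from_buckets(0.95, sample)
replicas = desired_replicas(current_replicas=3, current_value=p95, target_value=1.0)
```

With these sample buckets the 95th percentile lands inside the 1s to 2s bucket, so a one second target pushes the replica count above the current three. In a cluster, the percentile query would live in the Prometheus adapter rules and the target in the HPA object.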
