About this video
What You'll Learn
- Discover how CoreDNS uses compile-time plugins to add DNS features such as caching, TLS, and logging.
- Configure Corefile blocks for multiple zones, UDP plus DNS over HTTPS, and plugin ordering behavior.
- Explore debugging and hardening in CoreDNS, including metrics, dnstap traces, and DNSSEC automation in production.
CoreDNS creator Miek Gieben walks through the plugin-driven Go DNS server: Corefile syntax, plugin ordering at compile time, DNS-over-HTTPS, multi-zone servers, debugging with dnstap, writing a plugin, and how coredns.io is hosted with automated DNSSEC signing.
Jump to a chapter
- 0:00 Holding screen
- 1:20 Introductions
- 2:18 Guest Introduction and CoreDNS Overview
- 3:20 Why a New DNS Server? (CoreDNS Origins)
- 3:22 Why do we need new DNS software?
- 5:06 Evolution of the DNS Protocol
- 5:10 Is DNS still evolving?
- 7:07 The Origins of CoreDNS (Inspired by Caddy)
- 8:40 Introduction to CoreDNS Concepts (Slides Begin)
- 8:45 What is CoreDNS?
- 9:31 CoreDNS Plugin System
- 10:59 Plugin Ordering Explained
- 11:57 Corefile Configuration Examples
- 16:00 Demo: Supporting multiple protocols
- 16:05 Debugging CoreDNS
- 17:35 Demo: Basic Queries (DNS and DoH)
- 20:50 UDP or DNS over HTTPS?
- 22:38 Multiple Servers in Corefile
- 22:40 Demo: Multiple Zones
- 23:40 Demo: Logging with Multiple Servers
- 26:00 Debugging DNS
- 26:07 Debugging Plugins (pprof, trace, dnstap)
- 28:47 Introduction to Dnstap
- 29:53 Demo: Using Dnstap for Debugging
- 30:00 Demo: dnstap
- 37:45 Building a plugin
- 37:50 Understanding CoreDNS Plugins (Code Example)
- 42:00 CoreDNS's DNS configuration
- 42:02 Real World Use Case: CoreDNS.io Hosting
- 44:00 Automated DNSSEC and Zone Transfer Explained (Using Logs)
- 47:00 Future roadmap
- 48:25 Questions and Discussion
- 48:30 Questions
- 58:49 Conclusion and Wrap-up
Full transcript
Generated from the English captions. Timestamps jump the player to that moment.
1:20 Introductions
1:20 Hello and welcome to today's episode of Rawkode Live. I am your host, Rawkode, and today we are gonna be taking a look at a CNCF project, CoreDNS. Before we move on to that, I just wanna take a moment to say thank you to my employer. They provide me the time to invest and help produce these episodes so that we can take a look at the modern cloud native landscape, and purchase learning materials for us all to learn together. So thank you, Equinix Metal. There is a code, Rawkode Live. You can use that when signing up or
1:51 even if you're already a customer, you can still use this coupon. It will get you 50 US dollars in credits, which you can use in a couple of hours with some beefy machines or spread out over a longer term with some smaller machines. Either way, the choice is yours. If you want to have a conversation afterwards, there is a Discord server. You can come and join and ask questions or just ask for new show suggestions. I am open to anything, and if you're not watching live, this is the best way to interact. Now, we're gonna take a look at CoreDNS
2:18 Guest Introduction and CoreDNS Overview
2:19 and for that I have the founder, Miek Gieben. How are you, Miek? Hello. I'm fine. Thanks. Thanks for asking me. No. This is awesome. I'm really looking forward to learning more about CoreDNS, and who better to do it than you. So this is gonna be fantastic. Do you wanna just take a moment to introduce yourself and tell us a little bit about you and the project? Yeah. We've been in London for a couple of years. Moved back recently to The Netherlands, where I live now. And I have been doing DNS for a long time.
2:55 Involved in DNSSEC in February. At some point, I got involved in SkyDNS, which was the thing that got used in Kubernetes. Via some ideas, this led to me actually writing a new DNS server, which is CoreDNS, which we're gonna talk about tonight. Okay. Great. So let me start with a couple of questions that I kind of prepared in advance then. So when I think of DNS, you know, I've been doing operational work for nearly twenty years now, and BIND was just always that one thing that I had installed on my server. Why did we need to change the status
3:22 Why do we need new DNS software?
3:37 quo there? Why did we have to bring in something new? Yeah. Note that there is BIND, there is NSD, there is PowerDNS, there's a couple of DNS servers and recursors. Mostly, you don't need a new DNS server off the bat. The interesting thing here is that I wanted a new DNS server, and I was capable enough to actually write one. And it's one of the few that's not written in a C language. So off the bat, there are a couple of things that are better. It just crashes. There are no buffer overflows and that kind
4:11 of stuff. So that's a big difference from the current crop of DNS servers. Being written in Go, you have GC, so there might be more overhead, so speed might be an issue. But, yeah, in general, I would recommend against writing a DNS server. But to just break the mold and have something new in the ecosystem, I think CoreDNS is a very good idea. Also, if you look at DNS, the protocol is fairly complex nowadays, and I don't know off the top of my head any new DNS servers that have grown like CoreDNS did in the past year.
4:47 Like, BIND is from the eighties. NSD is almost twenty years old now. PowerDNS has also been around for a long time. I don't know exactly how long, but there hasn't been any new DNS software in the last decade, I think, except for CoreDNS. So just something kinda popped in my head there as you were talking. Like, I guess, you know, DNS is as old as the Internet. Right? At least what we think of as the Internet. Is that protocol still evolving? Is it still changing, or is it relatively static? So the core of the protocol has been
5:10 Is DNS still evolving?
5:22 static since nineteen eighty four ish, but a lot of stuff has happened. I mean, it's all plain text. It's easy to spoof. So there was this effort, DNSSEC, that actually made it more secure. That made the protocol way more complex, and not everybody is actually using it. So it's questionable if that was a good idea. And because it runs on UDP, and having to do your own networking with UDP is pretty hard, you see new things popping up, which actually are in the presentation. The protocol called
5:55 DoH is DNS over HTTPS, which is interesting. And QUIC is a new protocol being finalized, I think, by the IETF. So DoQ is another thing that's coming along, which means we're gonna do DNS over QUIC. So there's a lot of stuff happening in that area as well. And corner cases are still being found, and they need an RFC to detail them and how that actually properly works. So even though it's an old protocol, a lot of new things are happening in that space. The core protocol has been static, as I said, for decades now.
6:30 That's pretty cool. Okay. So, again, I mean, DNSSEC, you've mentioned that twice now, and it's one of those things that I know I'm supposed to enable on my domains, and then I just never actually do. So I'm sure I should. I probably will try it one day, but I'll skip it for now. It's complex. Although I wrote a thing for that in CoreDNS because we got annoyed with it as well. And BIND is just getting better, and even PowerDNS has good tooling for that as well. But, no, if you do it
6:59 on your server and nobody validates those signatures you're adding, it doesn't add any value. Ah, okay. It sounds like you're kinda scratching a lot of your own itches or pain points with CoreDNS. It's like you're getting frustrated with things and you're going, right, I can do this better. Is that a fair guess or assumption? Yeah. Well, that's how it started, because I wanted a few metrics from BIND, and BIND doesn't do this out of the box. So you had an intermediate binary which compiled some stuff from the XML that BIND output at the time. So
7:07 The Origins of CoreDNS (Inspired by Caddy)
7:29 that led to me thinking I have this Go DNS library, which is pretty okay. How hard can it be to actually write a DNS server? And at the time, I was looking at a new web server for the whole Let's Encrypt stuff. And Caddy, a Go server written by Matt Holt, did all that stuff, and that looked pretty amazing. And that got me thinking, like, maybe I can use the same model that Caddy is using and just run DNS. And that's actually what I did: I forked Caddy and basically replaced HTTP with DNS in the source code.
8:02 After two weeks of failure, it compiled, and it was resembling a DNS server. So that was pretty much my own itch, and it did, of course, do metrics. So that was my own itch that I scratched, and that went into CoreDNS. Amazing. And that's great that you had the Caddy project there as something you could just kinda use as, I guess, the server boilerplate and then work in all the DNS logic and stuff like that. Yep. Yep. Yeah. The whole web API here is copied from Caddy. Okay. Very cool. So I believe we're gonna start today's session
8:40 Introduction to CoreDNS Concepts (Slides Begin)
8:40 with a little bit of an introduction and some slides. So I'll get your... Yep. There we go. Your slides are now live. Thank you. Yeah. So as we just discussed, CoreDNS puts very much emphasis on being a flexible server. The plugin idea that we briefly touched upon makes it easy to add stuff in CoreDNS, so you don't need to be a full expert to add stuff working in your server. And as also discussed, DNS looks like a very simple protocol, but there are many corner cases which make it really, really tricky to deal
8:45 What is CoreDNS?
9:21 with. And also people expect speed, so it also needs to be blazingly fast. Otherwise, you get latencies in all your applications. So let's talk about the plugins. Without plugins, CoreDNS basically doesn't do anything, and a plugin is similar to a web server's middleware. The original word in CoreDNS was even middleware, which at the time used middleware and plugins. We settled on plugins at some point. We have quite a bit, I think 30 ish, and they are all configured at compile time. And we have a file called plugin.cfg, which tells you what plugins
9:31 CoreDNS Plugin System
10:06 are available in CoreDNS. Here you see a couple of them. Metadata, that's a plugin that adds a little bit of metadata, like a MAC address. You can add that if you want. Cancel is to cancel a context on the query. TLS is to set up TLS, which is useful for DNS over HTTPS, for instance, or gRPC. Reload will reload the CoreDNS process if it sees changes in its config file. So there's a whole range of what plugins can do, but it's all implemented as a plugin. NSID stands for name server ID. It's a
10:44 little tag you could add to a DNS request that tells you, when you hit an anycast server, which server actually responded to your request. So that could be useful for debugging. So there's a whole range of stuff that we have in CoreDNS. So one of the things that people struggle with is in which order the plugins are processed when CoreDNS sees a query. And the order is set at compile time, and the order is what the order is in plugin.cfg. So it doesn't matter how your config looks. The order is already predetermined
10:59 Plugin Ordering Explained
11:22 by compiling this file into CoreDNS, so to speak. So plugin.cfg specifies which plugins you have and in which order they will be executed. And the Corefile, which is the config file for CoreDNS, specifies the plugin usage for your specific server. It would be nice if this was all dynamic. Go doesn't do this yet. Maybe in some future rewrite, we will revisit this whole idea and it becomes more natural. For now, this is how we do things. So to look specifically, if I have a config file, a Corefile, with the following plugins,
11:57 Corefile Configuration Examples
12:05 cache, whoami, log, then if I get a query from the outside to CoreDNS, I need to look into plugin.cfg and see where these plugins are actually configured. So cache is on line 44, whoami is on line 64, and log is on line 37. Meaning log will see the query first, then cache will see the query, and then whoami will see the query, and then you can reply. So the ordering here in the Corefile doesn't matter for how this query is being processed. So can I ask a question about that
12:44 if you're happy to take them at all? Yeah. I guess that's a good thing, though. Right? I'm assuming that the plugin developers and yourself and the other contributors, you understand the order and semantics that make sense for each of these plugins, rather than leaving it in the hands of the person writing their own Corefile, where they could have cache at the very end and then it's never actually being used at all. So I guess this is a good feature and I guess something that people just need to be
13:09 aware of. Yes or no. So we'll get bug reports by people saying, like, my plugin is not executed, and then it's, like, where do you put the plugin in your plugin.cfg, and the answer is, like, I put it last, which is, yeah, then it won't get seen probably. But sometimes you have two plugins that need to be on position five, for instance. And then you need to make a decision which one goes first, where if you could just revert the order in the Corefile, you wouldn't have that problem.
13:40 Okay. So yeah. It is what it is. Could be better. I quite like this because it's really deterministic, and you can just see what's happening. On those plugins, we have a whole range of them. We are discussing, do we need to make this an official type of plugin? But there are so many, and it's so liberal what you can do in the plugin, it's pretty hard to do this properly. But we have plugins that generate some response: whoami; forward, which goes to another server to fetch a response and return it;
14:16 the Kubernetes plugin, which gets its data from the Kubernetes API. Other things are queries that are inspected. Log is such a thing. It doesn't really inspect it, but it looks at the query and then prints what it sees. Cache, obviously, will cache the query, so it will see the query and put the response in its cache if the response isn't there yet. You can inspect the query and change it: rewrite, template, bufsize all do this. NSID is one of those as well. You have plugins that affect the whole process. Health is used in Kubernetes to report a
14:56 health endpoint. Ready, readiness, another thing used in Kubernetes. Reload, we saw this. All you can do if you reload a config file, sorry, you can actually do the a a binary on this and metrics that actually exports Prometheus metrics. It's one of the reasons that I actually wanted CoreDNS. And for Corefile helpers, you can import another file or a snippet to make your Corefile readable. Each of these has a readme. They're all published on coredns.io. I made a W package where our package is all as manual pages as well. So
15:42 that's a good set of documentation. It mostly tells you how to use a plugin. It becomes interesting if there's interaction between different plugins and how you debug stuff, which is something we will see later in this presentation, on plugins that might help you there. Okay. Let's delve into a config and see how this works. So we have two config files. This one is plain DNS, and what does it say? I am authoritative for the root zone, which is basically the entirety of DNS. And I will run on this port number,
16:05 Debugging CoreDNS
16:31 and this server will have a whoami plugin and a log plugin. Meaning, it will respond to it, and it will log this. The new protocol, though, DNS over HTTPS, is a similar thing, except you just prefix the domain with https colon slash slash. I'm not sure what QUIC will do, but it's probably also gonna be HTTPS. I'm not sure how we're gonna make that different in a Corefile. It needs to run on a different port in this case, so we do this one. Also the whoami plugin, the log plugin, and I need TLS configured, because DoH doesn't work if you
17:15 don't have TLS. This is interesting, especially if you look at, for instance, Let's Encrypt. It would be cool if we could automate this in a similar way, but being a DNS server has some interesting things that we need to look at. Let's see how this works. Let's shift to my terminal. So I have a couple of things here. I found a Python utility called Homer, which actually speaks DNS over HTTPS, because my go-to utility dig doesn't do it yet. So I have a Corefile here, which is basically the thing
17:35 Demo: Basic Queries (DNS and DoH)
18:09 I've just shown you. It's one Corefile. So this Corefile will start two internal servers, one DoH server and one normal one. And there's not so much to it. I can just run CoreDNS with my Corefile. Port numbers are already specified in the Corefile, so this should just start. It does: we have the server starting for the root zone and the DoH server also for the root zone on this port. So dig, like, localhost, any record, whoami. This will give me a reply. And you see
18:58 whoami replied to me. I got this data back, which is a bit nonsensical. It tells who I am, which is my IP address and the port number I was asking the question on, and the protocol. It's a bit silly, but it's a nice tool for debugging things and demos like this. Nice. You also see it's logged, which is what the log plugin does. So I prepared a little shell script. This goes to the other port, fourteen forty three, and it asks the same question over HTTPS. It's a bit silly. The actual ID is always zero, which is
19:48 not how it should be. The buffer size, because we are asking over TCP, defaults to this value; that's all details. But you see here that we actually query CoreDNS on this URL because it's HTTPS, and we got a similar answer. You see here, it's TCP because of HTTPS. Different ports, but we're still coming from localhost. So this works. You can just use CoreDNS as a server that does DoH. QUIC will be added once that's fully specified. One thing we're not doing yet is we can't forward queries to a DNS over HTTPS server. We are missing a plugin that does
20:33 that. Cool. Do you see DNS over HTTPS... is that the new standard now? Would you say always default to that if possible? I mean, are there any considerations that people should be making, like, you know, start with the UDP approach and then bring it and phase it in? Like, what are your thoughts on that? So with DoH, browser support is now native. Meaning, the whole DNS ecosystem was always: you need to upgrade every box on Earth to actually have your code being deployed in the resolvers of all the machines. And now Google just pushes out a new
20:50 UDP or DNS over HTTPS?
21:17 version of Chrome, and everybody is using an upgraded version of DNS over HTTPS. And there's this whole thing, what I said, like, DNS was plain text. DNSSEC didn't change that. It just added signatures. So with DoH, for the first time, you actually can't inspect DNS traffic, which is interesting in certain parts of the world or maybe all of them, but that's a whole different discussion. So that's one of the main drivers for DNS over HTTPS. I don't wanna divert too much, but I'll ask one more question. Do the metrics that CoreDNS exposes give me
21:58 a breakdown of the protocol that was handled? Will it tell me if it was DNS over HTTPS or whether it was plain DNS? No. I think we only do UDP and TCP, which might be an interesting thing to add. Okay. Cool. I'll play around with that. Yeah. We can definitely see how it comes in. So it might actually be a nice thing to have. Running both a UDP service and an HTTPS service is kind of strange, because then you don't have "nobody can see my queries", because you're still doing the plain old UDP.
22:33 But a fair point. Oh, scrolling the wrong window. On those multiple servers: so this is a Corefile that defines two servers, basically, two internal CoreDNS servers. One serves the example.org domain on port 53, and the other one serves the example.net domain on 53. So these are the same listener, but internally, this means there are two servers in CoreDNS responding to queries for these two domains. And you can see the difference, which I will show. One of the servers will log because it has the log plugin, and the other one will stay
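For reference, the two-server Corefile from the DNS and DoH demo earlier in the talk would look roughly like the sketch below. The plugin names and the `https://` prefix follow what was described; port 1053 for the plain DNS server and the certificate file names are assumptions, since the talk only names port 1443 for the DoH server.

```
# Plain DNS for the root zone. Port 1053 is an assumed value.
.:1053 {
    whoami
    log
}

# DNS over HTTPS for the root zone. The tls plugin is required here.
# cert.pem and key.pem are hypothetical file names.
https://.:1443 {
    whoami
    log
    tls cert.pem key.pem
}
```

Each block becomes its own internal server, which is why one Corefile can answer on both protocols at once.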
22:40 Demo: Multiple Zones
23:22 silent and won't do anything. Note that if you enable the log plugin, if you have a busy server and you are logging a lot, that will take away some performance. I'm now going back to my terminal. I'll stop this. So this is the Corefile from the slides. Same here. I populated all these directories with the CoreDNS binary, to make it nice and easy. When we start it, it prints out the servers that are running. So those are those two. We can do the same thing as just now. I have a dig,
23:40 Demo: Logging with Multiple Servers
24:17 for example.net. So this shouldn't show anything on the CoreDNS side, which it didn't, because it didn't have the log plugin. But if I ask one for org, it does give a log line, because it has a log plugin that says it must do so. What you see here is pretty basic. This is the remote address, so the one asking. I can't actually remember what this was, but this is what the question was. Protocol, size of the incoming request. Do I want DNSSEC or not? And what my UDP buffer is. And then this stuff is the actual answer.
25:05 So we didn't see an error. The answer was a 12 bytes, and it took this amount of time to actually generate it. And some of this stuff is also in the metrics if you have metrics enabled. K. So is it a standard practice, like, looking at that config, I mean, and now knowing that it came from the Caddy kind of background. So those are essentially like virtual hosts and then you're just gonna using that some Yep. For that the DNS response. Would you run them on different ports, or would that just always be the same port?
25:49 No. You can also run them on different ports. Then you actually get multiple listeners in the CoreDNS process to pick those up. But yes. Okay. Cool. Yeah. I can change the port, and then I need to change the ports and then they come on as well. There's no difference. Alright. Just curious. So one of the big things in DNS is that it's super hard to debug, and I will not claim that these following slides will make you a debug expert, because it's still hard. A couple of reasons why this is hard: mostly because the volume of DNS queries is probably
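A sketch of the two-zone Corefile used in this demo, under the assumption that both servers answer with the whoami plugin as in the earlier demo (the talk does not show which answering plugin is used here). Only the example.org block has log, which is why only queries for example.org produce a log line.

```
# Server for example.org on port 53: answers and logs each query.
example.org:53 {
    whoami
    log
}

# Server for example.net on the same port: answers but stays silent.
example.net:53 {
    whoami
}
```

Both blocks share one listener on port 53; CoreDNS picks the internal server by matching the query name against the zone.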
26:07 Debugging Plugins (pprof, trace, dnstap)
26:23 large. So if one in a thousand queries fails, how do you see this one query failing, even though somebody actually had a problem with the application because that one query timed out or whatever? So that's a thing. And, as I said, logging at scale is hard to do. So it's really hard to just find, like, one in a million queries that timed out for some reason. And because CoreDNS is lower in the stack, anything that fails to properly work here has repercussions higher up, which means things don't work or have a humongous amount of
27:05 latency. So a couple of plugins. We have the debug plugin. That's a fairly new thing. You can log debug from within a plugin. And if you enable the debug plugin, that actually shows up in your output, on standard output. It also stops CoreDNS recovering from panics. Panics are a Go thing. If CoreDNS has an out of bounds read, it will crash. By default, it will then just restart. It will not actually crash. It will intercept the panic and will just continue running. Furthermore, there's pprof. So there is, for Go profiling. You could just enable a plugin, connect
27:50 to the remote port, and get a pprof of the binary. Prometheus, I already talked about adding metrics, which you always should, and trace does actual OpenTracing of a request within CoreDNS. Those are all useful things to have; especially tracing can be handy to just see what's happening. Although it's not too interesting, because usually CoreDNS gets a query, forwards it upstream, and then gives back an answer, which it may or may not get. Another thing that we have is dnstap, which is a gRPC protocol which forwards all generated queries to a dnstap
28:32 endpoint and will allow you to see what queries are received by CoreDNS and what queries CoreDNS itself also does. And as always, it's just enabling a plugin. So this is a snippet of a Corefile. We run a server for the root domain, so we capture all incoming queries. We will forward them to Google. We will log the queries, but we'll also dnstap the queries to the socket. And on the other side, on that socket, there needs to be running a dnstap binary, which will show you the queries being received and the queries being forwarded to Google.
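A Corefile along the lines just described might look like the sketch below. The upstream address 8.8.8.8 (Google Public DNS) and the socket path /tmp/dnstap.sock are assumptions; the talk only says queries are forwarded "to Google" and tapped "to the socket".

```
# Root zone: catch every incoming query.
. {
    # Forward everything upstream. 8.8.8.8 is an assumed address.
    forward . 8.8.8.8
    log
    # Send dnstap data to a Unix socket. The path is an assumed value;
    # "full" includes the wire-format DNS message in each record.
    dnstap /tmp/dnstap.sock full
}
```

Something has to listen on that socket before CoreDNS starts tapping. One option is the `dnstap` command-line tool from the golang-dnstap project, for example `dnstap -u /tmp/dnstap.sock -y` for YAML output; treat the exact flags as an assumption and check the tool's help output.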
28:47 Introduction to Dnstap
29:15 So this can be really, really powerful in actually seeing and capturing what is happening with this process. Note that with a lot of stuff, especially in Kubernetes, we've seen issues where iptables are filling up. So if the query doesn't reach CoreDNS, this will be useless. Then you will need to do a tcpdump on your node to actually see where the traffic is going. And, usually, it's dropped. We've seen it quite a lot of times in Kubernetes, for instance. Okay. I prepared a demo for this as well, so I will switch to my terminal.
30:00 Demo: dnstap
30:01 Quit. The Corefile, oh, I've added debug for good measure here, which is fine. So this is the same Corefile I've just shown. Let me start my dnstap binary. So this opens the Unix socket on /tmp/dnstap.sock. It will, I think, barf out YAML. So we should see some YAML if I start CoreDNS. We all love YAML. Yeah. Yeah. It also does JSON, so maybe I should have done that. Let's query this and ask for the MX records of example, oh, dot net. I'm seeing a bunch of things.
31:02 Of course, this is the answer that I got from CoreDNS. Don't look at how this looks, but this is a valid answer that basically says we don't have any MXs. We have a log line because we enabled that as well, and we also have our dnstap data. And this is interesting because what you didn't see in the log line is that we are also forwarding this to the server, and we are asking the same question, so that's good. The MX records for example.net. And then it responded. So we got a forward response, and you see it here,
31:49 which is right. Now we see the original query coming in, so it's a bit of ordering. It makes sense if you know what's happened, but this is the query going into CoreDNS, which is indeed asking for this. And this is the actual response that we sent back, which is the forward message we received. So these are four data points, two queries sent and two queries received, which you don't get from the logs, and you also don't see from dig, obviously, because that's just the client. K. And note that the protocol itself is
32:20 not UDP based. It's gRPC based. So it's not as efficient. So it might block or it might build up if you have a large number of queries hitting CoreDNS. Although I don't know what the specific number would be. You can also do dnstap over TCP, so there might be some slowness there. By default, if we can't dnstap a message, we will drop it, because we can't wait for dnstap to be ready for us. But this is a very useful debugging tool that I don't think a lot of people
33:01 know about. Yeah. I mean, obviously, there's this meme. Right? Every time something goes wrong, it's always DNS. And I think the visibility here that I can see dnstap giving people would probably help them debug those things so much quicker than the long drawn out way that, I can say from experience, I've spent looking at DNS problems. So I wish I had known this existed a long time ago. Yeah. But you have to have the bandwidth to actually get all this data. You have to store it maybe for a couple of days.
33:32 In this format, it's pretty voluminous, so it might be a lot of data that you need to catch. But, yes, it's totally doable. But it has that output format. Right? You chose YAML, but I could have chosen JSON. So I'm assuming I could, you know, hook in a little bit of tooling to pull out the fields I need and store them in, you know, a TSDB or some other DB and then Yep. Yep. You know, delete them after seventy two hours. That's not there when I have a problem and I wanna fix it. So
33:55 I Yep. Yeah. Exactly. And you can actually look back time stamps for there. So it's a it's yep. We can't do this in our logging for instance because that just wouldn't scale. So this is a good good in between. Oh, yeah. Definitely. See the the forwarding? So right now, this CoreDNS is saying, have I so I'm a is that say forward the dot means anything and then if we can't answer it, to a to a. If I have another domain in there, like, can you help me understand how that works? So let me phrase that question in a
34:28 way that makes sense to you. If I add another domain, like example.net, to this Corefile and it doesn't have a forward, that dot is a wildcard. Is that my understanding? Oh, yeah. So basically, what you can do, let's say you have this, and you can say, so now you are authoritative for example.org, but anything on the org, wanna forward. The rest you wanna handle locally in this CoreDNS query. Okay. And maybe that's just something that you're gonna cover later, and I'm just jumping the gun a little bit. But can I also have any number of
35:13 entries for the if I want each sub domain on my domain to be its own resolver, is that possible too just by adding extra sub domains? Does that make sense? Yeah. So you want, like, you what do you mean? Like, reload If I wanted david.example.org to have its own name server, that's just a case of doing that. Right? Or Yes. Although doing this locally in the core file just makes CoreDNS know about it. If you wanna do the proper delegation, you need somebody else actually pointing your well, you need to buy domain and go through all those things. But, yes,
35:46 once you've done that, you can just add the name here. Okay. I'm just gonna keep these questions coming until you tell me to shut up now. But does the forward use the same protocol that I query with? So if I query over DNS over HTTPS, does it forward in the same protocol, or does it forward over UDP by default? By default, this is UDP. There are options. I think force TCP, which will force a TCP connection to the upstream. The one thing that we did implement, because it was fairly easy, is this
36:19 syntax, which should actually work nowadays, which means use DNS over TCP. Oh, no. Use DNS over TLS, which is a different protocol than DNS over HTTPS. Basically, this is a raw TCP connection and you do a TLS handshake on it. So this is supported. We have a gRPC plugin, which is similar to forward, but uses gRPC in a protocol we invented to talk to an upstream. What we don't have is an HTTPS forwarder that actually talks DoH to an upstream; that doesn't exist. Okay. So we can do DoH on the, like, the entrance side into CoreDNS,
37:09 but we can't speak it to an upstream yet. Okay. Cool. Okay. Let me switch back now. Yeah. What I said: if this doesn't give you the data you want, you definitely need tcpdump. That's also useful for bug reporting on the CoreDNS repo. If you have a really strange case, I will probably ask for a tcpdump or account debug. Now a slight detour from operational to some Go code. I will show you an actual plugin, which is fairly simple. As said, we are using the Go language. So a plugin is a thing that implements the plugin interface,
37:50 Understanding CoreDNS Plugins (Code Example)
38:06 which means in our case that you need to implement the ServeDNS method, which is the meat of the plugin, and the Name method. And we have an example plugin, which is in the coredns/example repo, which prints the word "example" on every query it sees, so it doesn't do a lot. But the code is basically here. We have our structure that we defined, Example in this case. We have a ServeDNS method that gets a context, a DNS response writer, and, if you've ever programmed with the net/http package, this looks similar, except
38:47 it's DNS instead of HTTP, and the request, the incoming request. The thing does something, and it returns a status code and an error. And in this case, we make a new response writer, which we will see in the next slide, and we call into our plugin chain to further handle the request. Or if nothing is to be done, we are done and we will return. So the new response printer thing is here, and the meat of this is this bit. Once we are done, we will call WriteMsg, which will write the message back to
39:36 the client asking for a response. We have a response, which is either forwarded or whatever. We don't know at this point. We just have a response that we need to write back. So this is printing out "example", which is what this plugin does, and then it just calls the underlying response writer, calls its WriteMsg method, and that will write the response back to the client. And these can be wrapped, so this can be calling into caching, which can call into logging, and it just unwraps the whole thing until the client
40:14 actually sees the bytes being returned. And that's it. And in this case, you don't need to know anything about DNS. You just write a couple of lines of Go, and you have a plugin. It's not always that easy. That's pretty simple, actually. To be honest, I figured the plugins would have a little bit... I guess there's more to it than that, but the base skeleton actually is not that daunting at all. Yeah. Yeah. And we're actively trying to make the core of CoreDNS do most of the DNS stuff. So you
40:49 can actually focus on just writing Go code, and you just hook it in, and it works. Sadly, there are many corner cases in the DNS which you need to think about. But, yeah, in essence, we wanna open up DNS for people to just use this. So one of the things we're seeing is that the crypto people, there's some naming scheme they have on some Bitcoin-like coin, and they wrote a plugin in CoreDNS that works on the blockchain and does some naming. Yeah. I was going through the plugin
41:24 list earlier, and I noticed the Ethereum DNS server plugin, which is kind of sitting there, and I found that quite interesting. It's not just the Kubernetes community and the cloud native community, but the broader technology community is now coming to CoreDNS and adding their own plugins to it, which I thought was really cool. Yep. Yep. No, I think so too. I have no idea what the thing actually does, but you can just, you know, prototype and make the thing easily work in CoreDNS.
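The plugin skeleton described in this chapter (a type with ServeDNS and Name methods that wraps the response writer and calls the next plugin) can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code from the coredns/example repo: Msg, ResponseWriter and Handler are simplified stand-ins for the real types in github.com/miekg/dns and the CoreDNS plugin package, and static, client and run are hypothetical helpers added only to make the sketch runnable.

```go
package main

import (
	"bytes"
	"context"
	"errors"
	"fmt"
	"io"
)

// Msg and ResponseWriter are simplified stand-ins for the DNS message and
// response writer types; they are assumptions, not the real library types.
type Msg struct{ Question, Answer string }

type ResponseWriter interface{ WriteMsg(m *Msg) error }

// Handler has the two methods named in the talk: ServeDNS and Name.
type Handler interface {
	ServeDNS(ctx context.Context, w ResponseWriter, r *Msg) (int, error)
	Name() string
}

// Example prints "example" for every response that passes through it.
type Example struct {
	Next Handler
	Out  io.Writer
}

func (e Example) Name() string { return "example" }

// ServeDNS wraps the writer, then hands the request to the next plugin.
func (e Example) ServeDNS(ctx context.Context, w ResponseWriter, r *Msg) (int, error) {
	pw := &responsePrinter{ResponseWriter: w, out: e.Out}
	if e.Next == nil {
		return 2, errors.New("no next plugin") // 2 is the SERVFAIL rcode
	}
	return e.Next.ServeDNS(ctx, pw, r)
}

// responsePrinter intercepts WriteMsg on the way back to the client.
type responsePrinter struct {
	ResponseWriter
	out io.Writer
}

func (p *responsePrinter) WriteMsg(m *Msg) error {
	fmt.Fprintln(p.out, "example")
	return p.ResponseWriter.WriteMsg(m) // pass the response further down
}

// static is a hypothetical last plugin that answers every query.
type static struct{}

func (static) Name() string { return "static" }

func (static) ServeDNS(_ context.Context, w ResponseWriter, r *Msg) (int, error) {
	return 0, w.WriteMsg(&Msg{Question: r.Question, Answer: "192.0.2.1"})
}

// client records the message that finally reaches the client.
type client struct{ got *Msg }

func (c *client) WriteMsg(m *Msg) error { c.got = m; return nil }

// run sends one query through the chain Example -> static.
func run(question string) (printed string, rcode int, answer string) {
	var out bytes.Buffer
	c := &client{}
	chain := Example{Next: static{}, Out: &out}
	rcode, err := chain.ServeDNS(context.Background(), c, &Msg{Question: question})
	if err != nil || c.got == nil {
		return out.String(), rcode, ""
	}
	return out.String(), rcode, c.got.Answer
}

func main() {
	printed, rcode, answer := run("example.org. A")
	fmt.Printf("printed=%q rcode=%d answer=%s\n", printed, rcode, answer)
}
```

The wrapping is the point of the sketch: each plugin can put its own writer around the one it received, so a response passes back through every wrapper before the client sees it.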
41:52 Much harder than doing it in BIND, for instance, or NSD. I don't know about PowerDNS. So that's it. So yeah. We, of course, use CoreDNS ourselves to host coredns.io, which is one of the things we wanna move to another fly, but they don't do IPv6, and hosting our own DNS there is more interesting. So for now, we just host this on a Linux machine. And the Corefile that we have for coredns.io looks like this. So we have a snippet, as it's called, because I'm using this to serve more
42:02 Real World Use Case: CoreDNS.io Hosting
42:33 domains. So this defines that we have prometheus, errors, debug, and any. The any plugin is another small plugin that replies to ANY queries with an HINFO record. ANY replies can be really, really big, so they are used for amplification attacks. With the any plugin, you just sidestep the whole problem. And then for coredns.io, it's a domain we serve, so we spell it out in the Corefile. And we have a couple of things here. We use the sign plugin to DNSSEC-sign coredns.io, because I've done DNSSEC, so I thought it should be using DNSSEC.
43:21 We have some keys on the system that just sign this stuff. Then we use the file plugin to pick up this signed zone, which is created by the sign plugin, and serve the contents. Then we have the transfer plugin that says: if the zone has changed, ping those other servers, because we need them to pick up the new contents that we have. And then we import base, which means we also have the plugins that we have on the left side here. These things make DNSSEC zones hassle-free. It will just re-sign
44:00 Automated DNSSEC and Zone Transfer Explained (Using Logs)
44:08 and will do the right thing. And you can see this from logging. So the logs from coredns.io, I grabbed this, like, a couple of days ago. The sign plugin is telling me that it's just re-signed coredns.io because the signatures are too old. That goes a bit deeper into what DNSSEC is, but they have a lifetime, so you have to re-sign those things every so often. Otherwise, your zone will go bad. So, thankfully, this worked. It will say some more bits, like when it's gonna sign again.
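The idea that signatures have a lifetime and must be refreshed before they run out can be sketched as a simple time comparison. This is only an illustration of the principle, not the sign plugin's actual logic: the function names and the 32-day lifetime and 14-day margin are assumptions picked for the example.

```go
package main

import (
	"fmt"
	"time"
)

// Illustrative values only; a real signer chooses its own lifetime and margin.
const (
	lifetime = 32 * 24 * time.Hour // how long a signature stays valid
	margin   = 14 * 24 * time.Hour // re-sign once less than this is left
)

// needsResign reports whether a signature expiring at expiration should be
// refreshed at time now, so the zone never serves expired signatures.
func needsResign(now, expiration time.Time, margin time.Duration) bool {
	return !now.Before(expiration.Add(-margin))
}

// dueAfterDays checks a zone signed on a fixed date, the given number of
// days later. It exists only to make the sketch easy to try out.
func dueAfterDays(days int) bool {
	signed := time.Date(2020, time.January, 1, 0, 0, 0, 0, time.UTC)
	return needsResign(signed.AddDate(0, 0, days), signed.Add(lifetime), margin)
}

func main() {
	for _, d := range []int{10, 18, 20} {
		fmt.Printf("day %d: re-sign due = %v\n", d, dueAfterDays(d))
	}
}
```

Re-signing well before expiry, rather than at expiry, leaves time for secondaries and caches to pick up the fresh signatures.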
44:45 A super small zone, so the signing took no time at all. But then you see that the file plugin sees that the coredns.io zone has been re-signed, or changed in that file, and it picks up the new file and serves the new data. When file does this, it will then ping the transfer plugin, which will send notifies, as they are called, to the other servers, like: you guys, I have a new zone. You need to pick it up, and you need to get the new data from me. So here you see all the plugins, like,
45:21 in the actual plugin chain, just ticking those boxes one by one and having the whole thing automated without me doing anything, which is why I wanted the sign plugin, because before this I had a cron job, which was a bit crummy. So I'm curious about that transfer plugin, where it's notifying another IP address. Like, is CoreDNS horizontally scalable? Is that my misunderstanding? This is, yeah, that's an old feature of DNS servers. But, yeah, the proper term now would be horizontally scalable. Yes. So is that just two Corefiles
46:05 that are the same being run by the binary but on different machines? Is that all I do? Yep. Yep. Well, preferably, somebody else runs a different binary in a different org. So these are a couple of old friends of mine who actually run a different DNS server, but it serves the same data. Okay. Cool. I'm contemplating making a thing that uses BitTorrent to do this, but it turned out to be somewhat complex. That's all. I mean, as long as we control both ends, so both ends are CoreDNS, we are super free
46:41 in what we can do. But the notifies and the transfer are the standard DNS protocol. Alright. Okay. Cool. Which is a different protocol than the normal query and answer protocol. Cool. This is my last slide. So what are we gonna do in CoreDNS? Well, the usual bug fixing and keeping up to date with Kubernetes, which is moving fast, adding dual-stack support, which is super nice, and some other API stuff that's been done there. Furthermore, a couple of new plugins that are core to DNS. There is one to minimize
47:00 Future roadmap
47:24 responses, which saves bandwidth and helps against amplification attacks. DNS cookies is a thing to actually make sure you're talking to the right server, which you might implement as a plugin. One thing that I've been pushing back on is new storage plugins because, a, they are quite complex, and they vendor the entire world. So we have a lot of stuff that we need to pull down to compile CoreDNS, mostly because we're using Cloud DNS, Route 53, Azure, a whole bunch of these, and all those plugins look similar. So we need some kind of generic plugin
48:05 where you can hook in all these to make that scale again, I think. So for now, I've had enough of adding identical code to go to a slightly better one. So I'm putting my foot down unless there's a good reason to actually include something super new and novel. Okay. So that's it from my side. Alright. So if anyone watching has any questions, feel free to drop them in the comment section now. I have a few that I prepared that I'll run over just now then. And I think my first question actually very
48:30 Questions
48:39 luckily ended on a similar note to what you're talking about there. When I was looking at the website earlier and I realized there's plugins and external plugins, it kinda reminded me of a problem we had on the Telegraf project. Telegraf is a collection agent for metrics. It started off as a plugin-based thing, but all the plugins were compiled in. The Telegraf project today has over 220 plugins. The project size, the binary is huge, the vendoring, wow. And I think that's a very Go problem.
49:09 Right? Like, a Go problem: there's not a good plugin mechanism that works with Go. And I'm curious about external plugins and how that works in CoreDNS. Is it gRPC, HTTP, something else? Maybe you could help me understand that a little bit. So what is your exact question? Like, how do you get them in, or how do we deal with the, like, move from external to internal? So is the external plugin just not hosted in the CoreDNS repository, but it's still compiled in, or are you doing some magic at runtime to pull in external plugins as
49:38 No. No. So if you wanna have an external plugin, you need to change plugin.cfg to actually add the lines to point to the external plugin, then do go generate and then go build. Ah, okay. So that's the outstanding problem then for Go projects: how do we have external plugins that could be dynamically linked or something like that? Yeah. There's this plugin interface, but I haven't looked into that. So if we're gonna go that route, that would be a massive rewrite, which means we can also look
50:09 in the plugin ordering because we lose the plugin.cfg file probably. But, yeah, I'm happy enough with CoreDNS as is that I don't consider this personally a big thing that I need to fix in CoreDNS. Okay. Cool. So I'm curious also about... I kind of misinterpreted the transfer plugin for something to do with scaling a DNS server. But I'm also curious about the bad actor side of that. Like, attacking DNS servers can be a popular vector, I guess, for people that want
50:43 to bring down or take over DNS zones and stuff like that. Does CoreDNS have plugins or anything that allows me to try and protect myself? There is an external plugin called response rate limiting, which means that if we see a lot of queries from the same IP, we will do some blocking. That's not a default plugin. Usually... there's several ways. Like, if you wanna interact with the registration and the registrar stuff, that's not something CoreDNS itself can help with. I mean, it just serves whatever is in the Corefile. So if somebody gets
51:21 into that trajectory, then so be it. The transfer itself, there are some very limited things you can do to secure it, which also shows the age of the DNS protocol. There's a symmetric key you can configure, IP whitelisting, that kind of stuff, which is all, in my book, pretty old-school things to actually make sure you receive the correct data. The best way to make a DNS server unresponsive is to just DDoS it and fill the pipe, and there's not much you can do at that point. Okay. I'm assuming your personal domains are all hosted by CoreDNS.
52:07 Is that a correct assumption? Yeah. Yeah. Yep. The snippet that I showed you in the Corefile, that's why it's there: all the domains are there as well. Of course. Alright. I'll give you two proper questions though instead of that. So, you know, obviously, I think a lot of people are familiar with CoreDNS because of the adoption of Kubernetes and CoreDNS into that project. Like, with the whole advent of cloud native architectures and microservices becoming a new norm for teams and organizations, does that allow or afford DNS to play a newer role in those architectures or take
52:42 on new responsibilities? So, yeah, there was a massive influx of popularity once we got into the CNCF. Also, a humongous amount of bugs that we fixed because people were now actually using it. So one of the things we tried to do with gRPC is we can probably move away from UDP if everything is CoreDNS, which means we can use new protocols and can do interesting things. One of the things we had with gRPC was that you could watch names for changes. In the DNS right now, you have to wait until the TTL expires and that kind
53:21 of stuff. But with gRPC, we could just kinda call back, like: this name has changed, you need to update it, which would mean that the updates in Kubernetes, for instance, would be instantaneous instead of just waiting for the TTL to time out. So that would be interesting, but then this all falls flat because you also need to change the clients to actually use your newfangled protocol. Meaning, at the time, local DNS, node-local... what is it? Node DNS local, whatever the hell they have right now. Every node in the Kubernetes cluster
53:57 runs CoreDNS. So we can all potentially actually do this. We could probably switch to gRPC and have some other protocol than DNS doing the naming in those clusters. Alright. Cool. That actually kinda leads on. We had a viewer question, and the question was, are there any best practices for using DNS in Kubernetes? And I'd also like to kind of just throw something else in there: you mentioned that you previously worked on SkyDNS, which I think was the original kube-dns. Yep. Yep. Why was that deprecated in favor of CoreDNS?
54:37 Like, what were some of the differences, changes? What lessons did we learn from that kind of older approach? So to start with that, kube-dns was the SkyDNS binary combined with dnsmasq, which was another DNS binary, written in C. And there was a moment, a couple of years back, where they had eight CVEs in dnsmasq. So that was a thing they needed to upgrade. Also, the interaction between the two, because you have two binaries, gets more complex. So CoreDNS basically unified it all, gave Prometheus metrics, all that kind of stuff. So it's easier
55:18 to deploy one CoreDNS than it is to have SkyDNS and dnsmasq. So that was a bit of the impetus for moving to CoreDNS. Doing DNS in Kubernetes clusters is tricky for a couple of reasons. There's a thing called the search path and the ndots option you can have, which means if you have a domain like www.google.com, the entire search path will be searched before it will actually try www.google.com. Meaning, every query that is done in the Kubernetes cluster leads to a 5x amplification of the number of queries we see in CoreDNS.
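The amplification from the search path can be made concrete with a small sketch of how a stub resolver expands a name. This follows the usual resolv.conf rule (names with fewer dots than ndots are tried with each search domain first, then as-is) and is not code from CoreDNS or any resolver library. The search list and ndots value below are assumptions modelled on a typical pod in the default namespace, and the candidates function is a hypothetical helper.

```go
package main

import (
	"fmt"
	"strings"
)

// candidates lists the fully qualified names a stub resolver would try, in
// order, for the given name, search list, and ndots setting.
func candidates(name string, search []string, ndots int) []string {
	// A trailing dot marks the name as absolute: no search list is applied.
	if strings.HasSuffix(name, ".") {
		return []string{name}
	}
	asIs := name + "."
	var expanded []string
	for _, domain := range search {
		expanded = append(expanded, name+"."+domain+".")
	}
	// Enough dots: try the name as-is first, then fall back to the search list.
	if strings.Count(name, ".") >= ndots {
		return append([]string{asIs}, expanded...)
	}
	// Too few dots: walk the whole search list before trying the name as-is.
	return append(expanded, asIs)
}

func main() {
	// Assumed pod settings: three cluster search domains and ndots:5.
	search := []string{"default.svc.cluster.local", "svc.cluster.local", "cluster.local"}
	for _, q := range candidates("www.google.com", search, 5) {
		fmt.Println(q)
	}
}
```

With these assumed settings, www.google.com has only two dots, so every search domain is tried and fails before the real name is queried. A node that contributes its own search domains lengthens the list further, and the name as written with a trailing dot skips the expansion entirely.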
56:01 So that alone is horrendous and should be fixed, although it can't be fixed anymore. So the number of queries that we see is way more than needs to be done, which means you need to scale CoreDNS aggressively, or your DNS and processes aggressively, to actually be able to respond to all those queries. We have a couple of hacks that sidestep this. There is autopath, which shortcuts your search path because we know what you're looking for, so we will tell you what the actual answer is. But that's quite a
56:36 hack. So if you wanna know if things are performing well, use metrics. The health plugin has an internal loop that tells you how busy the CoreDNS binary is. So you can use that for latency. If that goes through the roof, then something is clearly wrong. Latency on DNS requests is also good to track. Sysdig had an article yesterday on exactly that. So you need to be on top of things to see what's happening. And if you look at the metrics, you see that non-existing domains is 9% of all queries.
57:09 Yeah. When you said that search path thing, I got instant flashbacks. Like, when kube-dns was the default in our clusters, we pretty much did a denial-of-service attack on our entire infrastructure with the amount of requests that were going through. And the configuration, it was the ndots problem. Right? I think by default ndots is set to, like, five or something like that. Yeah. Causing 5x the number of lookups that we actually want to do. I mean, if you have a high number of lookups, the whole thing falls down. Yeah. Yeah.
57:38 The default on your Linux box is one for ndots, to give a sense of how far along they were. Another thing that you should probably do in your Kubernetes cluster: normally you have two DNS servers that are useful for clients. In Kubernetes clusters, there's one service IP and all clients go to this. So if that's overloaded, a client will just re-query. So once you have a cascading failure, there's no way we can come out of this. So what you should have is two sets of CoreDNS serving two different
58:14 service IPs and then updating all your nodes to have those two IPs in the resolv.conf. And then at least a client can move to the other one, which is maybe less loaded than the previous one. Yeah. That's a great tip, actually. That's nice. I wish I was involved in these decisions when they were made in Kubernetes, but yeah, that was a tough time. You can, you know, file a KEP. We can get that changed. We can make it better. I mean, I think it's the search path that will never change. Alright. I don't think we have any more
58:49 Conclusion and Wrap-up
58:52 questions. So with that, I'll just say, you know, thank you for joining me today. That was really insightful, getting your experience and your understanding of the project and the demos, and just getting a good look into how CoreDNS functions and how the plugins work. All of that was great. So thank you for taking the time to join me today. Cool. Thanks for having me. Alright. Well, you have a great evening. I know it's getting late where you are, so I won't keep you much longer. But again, a pleasure, and I'll speak to
59:18 you soon. Cool. Great. Thanks. Bye bye.