About this video
What You'll Learn
- Trace an application update failure in cluster 17 by isolating a container runtime error during pod startup.
- Find and stop a suspicious node-level debugger pod while validating other deployment and networking signals.
- Resolve cluster 18 by detecting CoreDNS NXDOMAIN behavior and removing a broken mutating webhook.
Marcos Nils joins to debug two Kubernetes clusters. Cluster 17, broken by Sascha Grunert, has a rogue node debugger pod and a containerd 'honk' error that appears to point at BPF. Cluster 18, broken by Billie Cleek, has a CoreDNS NXDOMAIN rule and a malicious mutating webhook.
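As a rough illustration of the cluster 18 DNS check, the sketch below scans Corefile text for lines that force an NXDOMAIN response code. The sample Corefile, the `example.internal` zone, and the function name `find_nxdomain_rules` are invented for illustration. The actual rule found in the episode is not reproduced here, and the sketch assumes it used the CoreDNS `template` plugin's `rcode` directive.

```python
import re

# Hypothetical Corefile, invented for illustration; not taken from the episode.
SAMPLE_COREFILE = """\
.:53 {
    errors
    health
    template ANY ANY example.internal {
        rcode NXDOMAIN
    }
    kubernetes cluster.local in-addr.arpa ip6.arpa
    forward . /etc/resolv.conf
    cache 30
}
"""


def find_nxdomain_rules(corefile: str) -> list[tuple[int, str]]:
    """Return (line_number, text) for each line that sets an NXDOMAIN rcode."""
    pattern = re.compile(r"^\s*rcode\s+NXDOMAIN\b", re.IGNORECASE)
    return [
        (number, line.strip())
        for number, line in enumerate(corefile.splitlines(), start=1)
        if pattern.search(line)
    ]


if __name__ == "__main__":
    # Print every suspicious line so it can be reviewed by hand.
    for number, text in find_nxdomain_rules(SAMPLE_COREFILE):
        print(f"line {number}: {text}")
```

In a real cluster the Corefile text would usually come from the CoreDNS ConfigMap. A match only flags a line for review, since an NXDOMAIN rule can also be intentional.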
Jump to a chapter
- 0:00 Viewer Comments
- 1:23 Introductions
- 1:24 Introduction & Show Overview
- 2:56 Introducing Co-Host Marcos
- 3:47 Starting Cluster 17 Troubleshooting
- 3:50 Kluster 17 - Broken by Sascha Grunert
- 4:43 Initial Cluster 17 Checks
- 7:37 Cluster 17 Application Upgrade Failure (v2)
- 8:33 Debugging OCI Runtime Error ('Honk')
- 10:03 Investigating Cluster 17 Configurations
- 19:08 Investigating Node-Specific Issues
- 21:58 Discovering Rogue 'Node Debugger' Pod
- 24:04 Examining Rogue Pod Manifest
- 25:33 Debugging Inside Rogue Pod
- 26:40 Finding Suspicious Host File
- 27:50 Analyzing Rogue Killing Script
- 28:29 Stopping Rogue Service & Pods
- 32:16 Retesting Application Upgrade (Cluster 17)
- 33:15 Cluster 17 App Works, 'Honk' Remains Mystery
- 46:40 Switching to Cluster 18
- 46:55 Kluster 18 - Broken by Billie Cleek
- 47:16 Cluster 18 Initial Diagnosis (API Server Down)
- 49:08 Debugging Control Plane Components (Cluster 18)
- 54:39 Cluster 18 Application Networking Issue
- 55:48 Debugging Networking from App Pod
- 56:47 Identifying DNS Issue (Cluster 18)
- 57:35 Checking CoreDNS Configuration
- 58:30 Found: CoreDNS NXDOMAIN Rule
- 58:47 Fixing CoreDNS
- 1:00:59 Discovering Mutating Webhook Issue
- 1:05:40 Deleting Problematic Mutating Webhook
- 1:06:03 Verifying Cluster 18 Control Plane Health
- 1:08:17 Cluster 18 App Connectivity & Upgrade Test
- 1:08:40 Cluster 18 Resolved
- 1:10:00 Kluster 17 Revisited
- 1:10:13 Revisiting Cluster 17 ('Honk' Mystery)
- 1:12:46 Searching for 'Honk' Binary/Config
- 1:15:58 Debugging Containerd on Worker Node
- 1:21:45 Inspecting Containerd Configuration
- 1:24:22 Cluster 17 Conclusion (BPF Suspected)
- 1:27:18 Wrap-up & Thank You