How Tech - Systems Programming

Profiling User-Space Applications with perf and DWARF Debug Info

Feb 19, 2026

You’re staring at a latency spike at 2 AM. perf report shows 99% of samples labeled “[unknown]”. Your production binaries are stripped, so there are no symbols to resolve addresses against, and they were built with -fomit-frame-pointer for that mythical 2% performance gain, so the call chains are truncated or wrong. Now you’re blind, guessing where the CPU time went. This is why understanding perf’s sampling mechanisms and DWARF debug information isn’t academic: it’s the difference between finding your bottleneck in 10 minutes and guessing for 3 hours.

How perf Actually Samples Your Code

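When you run perf record -g, your CPU’s Performance Monitoring Unit (PMU) counts events such as cycles, instructions, or cache misses. By default perf samples in frequency mode: it targets roughly 4000 samples per second, and the kernel adjusts the event period to hold that rate (pass -c to sample every N events instead). When the counter overflows, the PMU raises an interrupt. The kernel’s perf subsystem captures the instruction pointer and registers and, crucially, walks the stack to build a call chain. By default that walk follows frame pointers, which is exactly what -fomit-frame-pointer takes away. The sample then lands in a lockless ring buffer that perf record reads from user space.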
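
To make the stack walk concrete, here is a toy model in Python of frame-pointer unwinding. It is a sketch, not the kernel's implementation: the memory contents, addresses, symbol table, and the names walk_frame_pointers and symbolize are all invented for illustration, and the frame layout assumes the usual x86-64 convention where each frame stores the caller's frame pointer followed by the return address.

```python
from bisect import bisect_right

WORD = 8  # bytes per stack slot on a 64-bit target


def walk_frame_pointers(memory, ip, fp, max_depth=64):
    """Return the call chain (innermost first) by following saved frame pointers.

    Assumed layout: memory[fp] holds the caller's frame pointer and
    memory[fp + WORD] holds the return address into the caller.
    """
    chain = [ip]
    while fp != 0 and len(chain) < max_depth:  # cap depth against corrupt chains
        saved_fp = memory.get(fp)
        return_addr = memory.get(fp + WORD)
        if saved_fp is None or return_addr is None:
            break  # unreadable stack memory: stop rather than guess
        chain.append(return_addr)
        if saved_fp != 0 and saved_fp <= fp:
            break  # the stack grows down, so a caller's frame must sit higher
        fp = saved_fp
    return chain


# Hypothetical symbol table: (start address, name), sorted by address.
SYMBOLS = [(0x400F00, "_start"), (0x401000, "main"),
           (0x401100, "parse"), (0x401200, "hot_loop")]


def symbolize(addr):
    """Map an address to the last symbol starting at or below it."""
    i = bisect_right([start for start, _ in SYMBOLS], addr) - 1
    return SYMBOLS[i][1] if i >= 0 else "[unknown]"


# Fake stack for _start -> main -> parse -> hot_loop.
MEMORY = {
    0x7000: 0x7020, 0x7008: 0x401150,  # hot_loop's frame: returns into parse
    0x7020: 0x7040, 0x7028: 0x401040,  # parse's frame: returns into main
    0x7040: 0x0,    0x7048: 0x400F20,  # main's frame: returns into _start
}

good = walk_frame_pointers(MEMORY, ip=0x401234, fp=0x7000)
print([symbolize(a) for a in good])

# With the frame pointer omitted, the register holds an arbitrary value
# and the walk ends after the sampled instruction pointer.
broken = walk_frame_pointers(MEMORY, ip=0x401234, fp=0x2A)
print([symbolize(a) for a in broken])
```

The second call shows why omitting frame pointers hurts: the register that should point at a frame holds unrelated data, so the walk has nothing to follow and the chain stops at the sampled instruction. DWARF-based unwinding avoids that dependency by using debug information to locate each caller's frame.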

© 2026 Sumedh S