QuestDB switches to clock_gettime for per-thread CPU time, cutting procfs I/O with a 40-line fix and validating gains via a 55-line JMH benchmark.
Tech News Team

QuestDB Linux clock_gettime 40-Line Fix Improves Per-Thread CPU Time
QuestDB replaced a heavy per-thread CPU time read from the proc filesystem with clock_gettime, a faster, kernel-backed time source. The 40-line tweak closes a large Linux performance gap and is documented in a commit noted as 858d2e434dd8372584; the diffstat shows +96 insertions and -54 deletions. Since 55 of those added lines are a new JMH benchmark, the production code itself ends up slightly smaller (41 lines added against 54 removed), and the benchmark demonstrates real gains rather than theory.
The old path lived in os_linux.cpp, in a function that opened a per-thread stat file, read it, and parsed the data to compute user_thread_cpu_time. The code used /proc/self/task/%d/stat to pull CPU timing data, then closed the file. That approach incurs file I/O and parsing overhead on every call, which adds up in a busy, multi-threaded database engine where per-thread accounting happens often.
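As an illustration only, not QuestDB's actual code, a proc-based reader along these lines might look like the sketch below; the helper name user_thread_cpu_time_proc is hypothetical, and the field handling follows the stat format documented in proc(5):

```cpp
#include <cstdio>
#include <unistd.h>
#include <sys/syscall.h>

// Hypothetical helper, not QuestDB's function: returns the calling thread's
// user CPU time in nanoseconds, or -1 on failure. Per proc(5), utime and
// stime are fields 14 and 15 of /proc/self/task/<tid>/stat.
long long user_thread_cpu_time_proc() {
    char path[64];
    snprintf(path, sizeof(path), "/proc/self/task/%ld/stat",
             (long) syscall(SYS_gettid));
    FILE* f = fopen(path, "r");
    if (f == nullptr) {
        return -1;
    }
    // comm (field 2) may contain spaces, so skip past its closing ')' first.
    int c;
    while ((c = fgetc(f)) != EOF && c != ')') {
    }
    // Field 3 is the state char; fields 4-13 are skipped; 14/15 are utime/stime.
    char state;
    unsigned long long utime = 0, stime = 0;
    int n = fscanf(f, " %c %*d %*d %*d %*d %*d %*u %*u %*u %*u %*u %llu %llu",
                   &state, &utime, &stime);
    fclose(f);
    if (n != 3) {
        return -1;
    }
    (void) stime; // parsed for completeness; only user time is returned here
    // utime is in clock ticks; convert to nanoseconds.
    long ticks = sysconf(_SC_CLK_TCK);
    return (long long) (utime * 1000000000ULL / (unsigned long long) ticks);
}
```

Every call pays for a snprintf, an fopen, a read, fscanf parsing, and an fclose: several system calls plus string work, just to recover two counters.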
Replacing the proc-based reads with clock_gettime is the core of the improvement. On Linux, clock_gettime with CLOCK_THREAD_CPUTIME_ID returns the calling thread's CPU time directly from the kernel, bypassing the per-thread stat file entirely and eliminating the file I/O and string parsing on each read. The 40-line fix closes the performance gap by removing an expensive, platform-specific bottleneck.
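The replacement path is dramatically simpler. A minimal sketch, with a helper name of our choosing (note that CLOCK_THREAD_CPUTIME_ID reports the thread's combined user and system CPU time):

```cpp
#include <time.h>

// Hypothetical helper: returns the calling thread's consumed CPU time in
// nanoseconds via a single kernel-backed clock read, or -1 on failure.
long long thread_cpu_time_ns() {
    struct timespec ts;
    if (clock_gettime(CLOCK_THREAD_CPUTIME_ID, &ts) != 0) {
        return -1;
    }
    return (long long) ts.tv_sec * 1000000000LL + ts.tv_nsec;
}
```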
The effects show up beyond a single hot path. QuestDB's blog notes that the changeset includes a 55-line JMH benchmark, a disciplined way to validate a micro-optimization with measurements rather than intuition. Setting the benchmark aside, the production code ends up leaner while covering the same functionality, a reminder that small, targeted changes can unlock substantial throughput gains.
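The changeset's actual benchmark is written in JMH; purely as an illustration of the comparison it makes, a rough C++ harness timing the two sketches above against each other could look like this (it assumes both helper functions from the earlier snippets are in scope):

```cpp
#include <chrono>
#include <cstdio>

int main() {
    constexpr int N = 100000;
    long long sink = 0; // accumulate results so the calls can't be optimized out
    using clk = std::chrono::steady_clock;

    auto t0 = clk::now();
    for (int i = 0; i < N; i++) sink += thread_cpu_time_ns();        // new path
    auto t1 = clk::now();
    for (int i = 0; i < N; i++) sink += user_thread_cpu_time_proc(); // old path
    auto t2 = clk::now();

    auto per_op = [&](clk::duration d) {
        return std::chrono::duration_cast<std::chrono::nanoseconds>(d).count() / N;
    };
    std::printf("clock_gettime:   %lld ns/op\n", (long long) per_op(t1 - t0));
    std::printf("/proc stat read: %lld ns/op\n", (long long) per_op(t2 - t1));
    std::fprintf(stderr, "sink=%lld\n", sink);
    return 0;
}
```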
For engineers watching Linux performance, this is a useful case study. It shows how leaning on /proc for per-thread timing can hurt latency and throughput in high-concurrency systems, and it demonstrates a practical migration pattern: when a reliable kernel-provided time source exists, favor clock_gettime over filesystem probes. The approach sticks to standard Linux APIs, matches how many performance-sensitive projects measure time, and backs up the change with benchmarks that confirm the gains.