The Intricacies of Rust’s Zero-Cost Abstractions
Rust, renowned for its safety and performance promises, carries the torch for zero-cost abstractions. This tantalizing promise suggests a utopia where developers craft high-level, elegant code devoid of runtime drawbacks. But what unfurls when these layers of abstraction appear to deter the very optimizations they champion? Join me on this journey, step by step.
From Humble Beginnings: The No-Frills Code
Picture this: you’re knee-deep in a project, crafting uncomplicated Rust code. Nothing fancy, just the essentials:
use std::fmt::Write; // needed so that write! can target a String

let mut output = String::new();
for i in 0..100000000 {
    write!(output, "hello {} {}!", "world", i).unwrap();
}
A Twist in the Tale: Enter the Wrapper
As the days go by, your project evolves, presenting you with unique requirements. The default Display implementation just won’t cut it. Tailored needs beckon. And so, you architect a workaround: a wrapper that, at least for the time being, simply delegates to the inner value’s Display implementation:
use std::fmt::{Display, Write};

struct Wrapper<T>(T);

impl<T: Display> Display for Wrapper<T> {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        // Delegate straight to the inner value's Display implementation.
        self.0.fmt(f)
    }
}

// And now the main logic with our wrapper
let mut output = String::new();
for i in 0..100000000 {
    write!(output, "hello {} {}!", Wrapper("world"), i).unwrap();
}
The Unfolding Drama
When I ran some rudimentary benchmarks on my machine, an eye-opening result surfaced: the loop using the wrapper took roughly 50% more time than its unwrapped counterpart.
Decoding the Enigma
Leveraging direct, idiomatic patterns grants the Rust compiler free rein to unleash its optimization prowess. However, introduce an abstraction layer, and it seems the compiler’s enthusiasm wanes a touch. This innocuous wrapper, it appears, threw an unsuspected spanner in the works.
The Epilogue
Programming stands as a delicate dance between devising solutions and deeply understanding our toolkit. As showcased, the nuances of Rust’s zero-cost abstractions aren’t without their peculiarities. So, tread with curiosity, and may your abstractions always be both efficient and insightful!
EDITS
What actually caused this
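After I shared this blog post on Reddit, it was brought to my attention that the specific PR that causes this is this one. It makes the compiler inline string literals, integer literals and nested format_args!() calls into the enclosing format_args!() at compile time, so the literal "world" is folded into the format string itself. Wrapper("world") is not a literal, so it stays a runtime argument that goes through Display on every iteration, which explains why string literals are considerably faster than their wrapped counterparts.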
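To make the effect of that inlining concrete, here is a minimal sketch, not the compiler's actual output. The function names literal_arg, hand_inlined and wrapped_arg are made up for illustration. The first two should be equivalent once the literal has been folded into the format string, while the third keeps the wrapper as a genuine runtime argument:

```rust
use std::fmt::{self, Display, Write};

// Same shape as the wrapper from the post: it only forwards to the inner value.
struct Wrapper<T>(T);

impl<T: Display> Display for Wrapper<T> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        self.0.fmt(f)
    }
}

// Literal argument: a candidate for being inlined into the format string.
// (Function name is hypothetical, for illustration only.)
fn literal_arg(i: u32) -> String {
    let mut out = String::new();
    write!(out, "hello {} {}!", "world", i).unwrap();
    out
}

// Hand-written version of what the call above can be reduced to:
// only `i` is left as a runtime argument.
fn hand_inlined(i: u32) -> String {
    let mut out = String::new();
    write!(out, "hello world {}!", i).unwrap();
    out
}

// Wrapped argument: not a literal, so it is formatted via Display at runtime.
fn wrapped_arg(i: u32) -> String {
    let mut out = String::new();
    write!(out, "hello {} {}!", Wrapper("world"), i).unwrap();
    out
}
```

All three functions return the same string for the same input; the difference is only in how much formatting work is left for runtime.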
Some benchmark numbers
I did not initially post benchmark results, as I had only run them on my machine and it did not feel proper. However, I’ve been asked to post them, so here they are:
no wrapper millis: 2720, no wrapper length: 2088888890
with wrapper millis: 4205, with wrapper length: 2088888890
The benchmark code is available here
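For readers who want to try something similar without following the link, a rough sketch of such a measurement could look like the following. This is not the linked benchmark code: the helpers time_loop and run_benchmark are hypothetical names, the iteration count is a parameter so that smaller runs are possible, and timing simply uses std::time::Instant.

```rust
use std::fmt::{self, Display, Write};
use std::time::{Duration, Instant};

// Same shape as the wrapper from the post.
struct Wrapper<T>(T);

impl<T: Display> Display for Wrapper<T> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        self.0.fmt(f)
    }
}

// Hypothetical helper: calls `body` once per index and returns the elapsed
// time together with the final length of the output string.
fn time_loop(iterations: u32, mut body: impl FnMut(&mut String, u32)) -> (Duration, usize) {
    let mut output = String::new();
    let start = Instant::now();
    for i in 0..iterations {
        body(&mut output, i);
    }
    (start.elapsed(), output.len())
}

// Hypothetical helper: measures the plain loop first, then the wrapped loop.
fn run_benchmark(iterations: u32) -> ((Duration, usize), (Duration, usize)) {
    let plain = time_loop(iterations, |out, i| {
        write!(out, "hello {} {}!", "world", i).unwrap();
    });
    let wrapped = time_loop(iterations, |out, i| {
        write!(out, "hello {} {}!", Wrapper("world"), i).unwrap();
    });
    (plain, wrapped)
}
```

Calling run_benchmark(100000000) in a release build and printing each Duration with as_millis() alongside the length would give output in the same shape as the numbers above. Comparing the two lengths is a cheap sanity check that both loops did the same work; the absolute timings will of course differ from machine to machine.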