The Intricacies of Rust's Zero-Cost Abstractions

Diving into Rust's revered zero-cost abstractions, we uncover a curious interplay between elegant code and optimization quirks. Through simple examples and intuitive explanations, this post unravels how even a minor layer of abstraction might sway the compiler's optimizations. Join us on this explorative journey, where we decode the nuances of Rust and the delicate dance of performance.

The Intricacies of Rust’s Zero-Cost Abstractions

Rust, renowned for its safety and performance promises, carries the torch for zero-cost abstractions. This tantalizing promise suggests a utopia where developers craft high-level, elegant code devoid of runtime drawbacks. But what unfurls when these layers of abstraction appear to deter the very optimizations they champion? Join me on this journey, step by step.

From Humble Beginnings: The No-Frills Code

Picture this: you’re knee-deep in a project, crafting uncomplicated Rust code. Nothing fancy, just the essentials:

let mut output = String::new();
for i in 0..100000000 {
    write!(output, "hello {} {}!", "world", i).unwrap();
}

A Twist in the Tale: Enter the Wrapper

As the days go by, your project evolves, presenting you with unique requirements. The default Display trait just won’t cut it. Tailored needs beckon. And so, you architect a workaround: a wrapper that elegantly delegates to the Display trait, at least for the time being:

struct Wrapper<T>(T);

impl<T: Display> Display for Wrapper<T> {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        self.0.fmt(f)
    }
}

// And now the main logic with our wrapper
let mut output = String::new();
for i in 0..100000000 {
    write!(output, "hello {} {}!", Wrapper("world"), i).unwrap();
}

The Unfolding Drama

Upon running some rudimentary benchmarks on my machine, an eye-opening revelation surfaced: the code encapsulated by the wrapper consumed roughly 50% more time than its unwrapped counterpart.

Decoding the Enigma

Leveraging direct, idiomatic patterns grants the Rust compiler free rein to unleash its optimization prowess. However, introduce an abstraction layer, and it seems the compiler’s enthusiasm wanes a touch. This innocuous wrapper, it appears, threw an unsuspected spanner in the works.

The Epilogue

Programming stands as a delicate dance between devising solutions and deeply understanding our toolkit. As showcased, the nuances of Rust’s zero-cost abstractions aren’t without their peculiarities. So, tread with curiosity, and may your abstractions always be both efficient and insightful!

EDITS

What actually caused this

After posting on reddit this blog post, it has been brought to my attention that the specific PR that causes this is this one. This results in the compiler inlining string literals, integer literals and nested format_args!() into format_args!(), which explains why string literals are considerably faster than their wrapped counterparts.

Some benchmark numbers

I did not initially post benchmark results, as I’ve only run them on my machine and it did not feel proper. However, I’ve been asked to post them, so here they are:

no wrapper millis: 2720, no wrapper length: 2088888890
with wrapper millis: 4205, with wrapper length: 2088888890

The benchmark code is available here