I ran a benchmark comparing a plain recursive function against a recursive lambda stored in a std::function<> (the capture method).
With full optimizations enabled on clang version 4.1, the lambda version ran significantly slower:

    #include <iostream>
    #include <functional>
    #include <chrono>

    uint64_t sum1(int n) {
      return (n <= 1) ? 1 : n + sum1(n - 1);
    }

    std::function<uint64_t(int)> sum2 = [&] (int n) {
      return (n <= 1) ? 1 : n + sum2(n - 1);
    };

    auto const ITERATIONS = 10000;
    auto const DEPTH = 100000;

    template <class Func, class Input>
    void benchmark(Func&& func, Input&& input) {
      auto t1 = std::chrono::high_resolution_clock::now();
      for (auto i = 0; i != ITERATIONS; ++i) {
        func(input);
      }
      auto t2 = std::chrono::high_resolution_clock::now();
      auto duration = std::chrono::duration_cast<std::chrono::milliseconds>(t2-t1).count();
      std::cout << "Duration: " << duration << std::endl;
    }

    int main() {
      benchmark(sum1, DEPTH);
      benchmark(sum2, DEPTH);
    }

This produces the following results (durations in milliseconds):

    Duration: 0 // regular function
    Duration: 4027 // lambda function

(Note: I also confirmed these results with a version that read its inputs from cin, to rule out compile-time evaluation.)
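
Since the benchmark loop discards the return value, an optimizer is free to drop calls it can prove have no side effects. Here is a minimal sketch of a variant that keeps the result observable. The name `benchmark_checked` and the `volatile` sink are my own illustration, not the code that produced the numbers above, and this does not stop a compiler from simplifying the recursion itself:

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <functional>
#include <iostream>

std::uint64_t sum1(int n) {
  return (n <= 1) ? 1 : n + sum1(n - 1);
}

// Hypothetical helper: calls func(input) `iterations` times, prints the
// elapsed time, and returns the accumulated result so the calls matter.
template <class Func>
std::uint64_t benchmark_checked(Func&& func, int input, int iterations) {
  assert(iterations > 0);
  volatile std::uint64_t sink = 0;  // volatile accesses must not be elided
  auto t1 = std::chrono::steady_clock::now();
  for (int i = 0; i != iterations; ++i) {
    sink = sink + func(input);
  }
  auto t2 = std::chrono::steady_clock::now();
  auto ms = std::chrono::duration_cast<std::chrono::milliseconds>(t2 - t1).count();
  std::cout << "Duration: " << ms << " ms\n";
  return sink;
}
```

Calling `benchmark_checked(sum1, depth, iterations)` with a `depth` read from cin at run time combines this with the check described in the note above.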
Clang also produces a compiler warning:

    main.cc:10:29: warning: variable 'sum2' is uninitialized when used within its own initialization [-Wuninitialized]

This is expected and safe, because the lambda body only uses sum2 when it is called, by which point sum2 has been initialized, but it should be noted.
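
If the warning is a nuisance, one option is to declare the std::function first and assign the lambda afterwards, so the name never appears inside its own initializer. This is a sketch with a hypothetical function name (`sum_declared_first`), and I have not benchmarked it against the version above:

```cpp
#include <cstdint>
#include <functional>

// Hypothetical example: the std::function is default-constructed (empty)
// before the lambda that refers to it is created and assigned.
std::uint64_t sum_declared_first(int n) {
  std::function<std::uint64_t(int)> sum;
  sum = [&sum](int k) -> std::uint64_t {
    return (k <= 1) ? 1 : k + sum(k - 1);
  };
  return sum(n);
}
```

Either way the lambda holds a reference to `sum`, so the std::function must not be copied out of the scope where `sum` lives.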
It's great to have a solution in our toolbelts, but I think the language will need a better way to handle this case if performance is to be comparable to current methods.
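
For comparison, a C++14 generic lambda can recurse without std::function by receiving itself as an argument. This is not the capture method benchmarked above, just a commonly used alternative; the names `sum3` and `impl` are mine, and whether it closes the performance gap would have to be measured:

```cpp
#include <cstdint>

// Hypothetical example: the lambda takes itself as `self`, so there is no
// type-erased std::function call and no reference captured to a variable.
inline std::uint64_t sum3(int n) {
  auto impl = [](auto&& self, int k) -> std::uint64_t {
    return (k <= 1) ? 1 : k + self(self, k - 1);
  };
  return impl(impl, n);
}
```

The explicit `-> std::uint64_t` return type is needed because the lambda calls itself before its return type could otherwise be deduced.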
Note:
As a commenter pointed out, it seems the latest version of VC++ has found a way to optimize this to the point of equal performance. Maybe we don't need a better way to handle this after all (except for syntactic sugar).
Also, as some other SO posts have outlined in recent weeks, the overhead of std::function<> itself may be the cause of the slowdown compared with calling the function directly, at least when the lambda's captures are too large to fit into the library-optimized space that std::function reserves for small functors (I guess somewhat like the various short string optimizations?).
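
To illustrate the small-functor point: if a lambda captures a lot of state by value, wrapping it in std::ref hands std::function only a reference-sized object, which implementations are expected to store without a heap allocation. This is a sketch with made-up names (`sum_table_via_ref`, `big`), and the actual buffer size is implementation-specific:

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <functional>

// Hypothetical example: `big` captures 64 values by value, which is likely
// too large for the small-functor buffer of most std::function implementations.
std::uint64_t sum_table_via_ref(int n) {
  std::array<std::uint64_t, 64> table{};
  for (std::size_t i = 0; i < table.size(); ++i) table[i] = i;

  auto big = [table](int i) -> std::uint64_t {
    return table[static_cast<std::size_t>(i) % table.size()];
  };

  // std::ref stores only a reference to `big`; `big` must outlive `f`.
  std::function<std::uint64_t(int)> f = std::ref(big);

  std::uint64_t total = 0;
  for (int i = 0; i < n; ++i) total += f(i);
  return total;
}
```

This only avoids the copy and possible allocation of the captured state; each call through `f` is still an indirect call.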