How to provide the compiler with branch prediction information.
If you’re a Linux kernel developer or you’re working on a Linux based system, probably you’ve been through code with calls to likely and unlikely functions:
These two functions are nothing but that MACRO both defined in terms of a __builtin_expect() function, which takes two integral values as arguments, respectively the value to be tested and the expected result:
Is quite rare to use these functions, but if you’re curious to know what they are for, they are for improving performance. As first thing, let’s see what the gcc documentation says about:
— Built-in Function: long __builtin_expect (long exp, long c) You may use __builtin_expect to provide the compiler with branch prediction information. In general, you should prefer to use actual profile feedback for this (-fprofile-arcs), as programmers are notoriously bad at predicting how their programs actually perform. However, there are applications in which this data is hard to collect.
The return value is the value of exp, which should be an integral expression. The semantics of the built-in are that it is expected that exp == c. For example:
if (__builtin_expect (x, 0)) foo ();
indicates that we do not expect to call foo, since we expect x to be zero. Since you are limited to integral expressions for exp, you should use constructions such as
if (__builtin_expect (ptr != NULL, 1)) foo (*ptr);
when testing pointer or floating-point values.
In a nutshell, __builtin_expect() is nothing but that an hint used by compiler to correctly optimize the branch and exploit the usage of the processor pipeline. Use likely(x) means that x is expected to be true, while unlikely(x) that x is expected to be false. The code is arranged so that the likeliest branch is executed without performing any jmp instruction, so none collateral effecta as flush the pipeline are present. This kind of speedup can have a significant impact for critical code sections. It’s also pretty easy to guess where the names likely and unlikely come from: both are referring to likelihood in Statistics.
However, as wrote into gcc documentation, the use of both these functions is not easy at all because it can lead to counterintuitive results. As software developer, everyone should prefer to use profiling feedback for optimization. I wrote a short program test_bexp (a short name for test branch_expected()) just to test the performance and verify what is the better. The program just check if the first generated number corresponds to max number 100.
We can compile four different versions of the program for results comparison (below values averaged over 20 iterations):
How you can see from results, the optimised version beat all the other ones. It worth to note that the profiling feedback is used to made several kind of optimizations, i.e. loops, not only for branch predictions.
It is interesting the comparison of the assembly code generated, especially for the likely and unlikely versions (there are few comments to help in identifying the important parts):
I would encourage all of you in using profiling optimizations that can significantly improve performance. For a software, the knowledge of both memory and execution patterns are extremely useful for code optimization and for exploiting the hardware at the maximum level. However, not always these patterns are available, i.e. for the kernel you cannot predict the user actions. In these situations, the use of __builtin_expect function can be considered for a branch optimization, but keep in mind that the final result can be counterintuitive, so use it really carefully.