Original post is here: eklausmeier.goip.de
What effect can the optimizer have for gcc?
On Intel/AMD I ran my intpoly program (with -n0) once with and once without the optimizer. It showed a speed-up of about 3.5.
- no optimizer: 7.84s
- -O3: 2.25s
gcc for Intel/AMD is version 4.8.2.
On Power8 I again ran intpoly (with -n0). Here the speed-up factor is more than 8 (eight).
- no optimizer: 28.58s
- -O3: 3.31s
gcc for Power8 is also version 4.8.2.
The effect is less pronounced for floating point: there the optimizer gave only a factor of 3 on Power8 and a factor of 2 on Intel/AMD. So the effect of the optimizer depends both on whether the workload is integer or floating point and on the CPU architecture.
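The integer speed-up factors quoted above follow directly from the measured run times. As a quick check, here is a minimal Python sketch that recomputes them; the timings are copied from the lists above, while the dictionary layout and the helper name speedup are only illustrative choices, not part of intpoly.

```python
# Timings in seconds, copied from the intpoly -n0 measurements above.
timings = {
    "Intel/AMD": {"no optimizer": 7.84, "-O3": 2.25},
    "Power8": {"no optimizer": 28.58, "-O3": 3.31},
}


def speedup(unoptimized, optimized):
    """Return how many times faster the optimized run is (hypothetical helper)."""
    return unoptimized / optimized


for cpu, t in timings.items():
    factor = speedup(t["no optimizer"], t["-O3"])
    print(f"{cpu}: {factor:.2f}x")
```

This prints roughly 3.48x for Intel/AMD and 8.63x for Power8, matching the factors of about 3.5 and more than 8 stated above.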
For my Power8 tests I used the free test drive on RunAbove, which I learned about from the Phoronix article RunAbove: A POWER8 Compute Cloud With Offerings Up To 176 Threads.
Interestingly enough, intpoly on Power8 showed the same effect regarding multiple cores as described in CPU Usage Time Is Dependant on Load.
Update 19-Jun-2016: RunAbove no longer offers Power8 servers; their offer is now closed.