A friend of mine, a math professor, always has interesting computational problems for testing source code optimization methods. The current problem is a single large C++ file. Shown below are its compile times on various machines and compilers, for the cases where the compiler succeeded at all. The optimized build, once it finally compiled, ran in 5 to 10 minutes, approximately an order of magnitude faster than the unoptimized version.
On a Core i7-920 machine with gcc 4.4.2:
g++ -fopenmp -o build build.cpp: 53 sec
g++ -fopenmp -O -o build build.cpp: 3 min 40 sec
g++ -fopenmp -O2 -o build build.cpp: crashed quickly
g++ -fopenmp -O -g -o build build.cpp: crashed after about 2 hours
On a dual Xeon E5520 machine with gcc 4.1.2:
g++ -fopenmp -o build build.cpp: 1 min 3 sec
g++ -fopenmp -O -o build build.cpp: 3 hr 7 min
Regardless of optimization level or debug options, the Intel C++ compiler v. 11.1 20091130 always crashed.
The program builds a fairly large matrix using explicitly stated polynomials for each matrix element. The source code is generated by another program. While each equation is unique, many common sub-expressions exist. Instead of complicating the generating program, we rely on the compiler to pull out the common sub-expressions.
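To make the idea concrete, here is a minimal sketch of what such generated code might look like, and what pulling out a common sub-expression amounts to. The polynomials, the function names (element_00_naive, element_01_naive, row_hoisted) and the Row struct are invented for illustration and are not taken from the actual build.cpp. The naive functions spell out x*x + y*y in full every time, the way a simple generator would. The hoisted version computes it once, which is roughly the transformation an optimizing compiler is expected to perform on its own.

```cpp
// Hypothetical matrix elements, written the way a code generator might
// emit them: each element spells out its full polynomial, so the
// sub-expression (x*x + y*y) is repeated many times.
double element_00_naive(double x, double y) {
    return 3.0 * (x * x + y * y) * (x * x + y * y)
         + 2.0 * (x * x + y * y)
         + 1.0;
}

double element_01_naive(double x, double y) {
    return 5.0 * (x * x + y * y) * x - (x * x + y * y);
}

// The same two elements with the shared sub-expression hoisted by hand
// into a temporary that is computed once and reused.
struct Row {
    double e00;
    double e01;
};

Row row_hoisted(double x, double y) {
    const double r2 = x * x + y * y;  // the common sub-expression
    Row row;
    row.e00 = 3.0 * r2 * r2 + 2.0 * r2 + 1.0;
    row.e01 = 5.0 * r2 * x - r2;
    return row;
}
```

With only two elements the saving is trivial. With a large matrix of long polynomials, the compiler has to discover every such repetition across one enormous translation unit, which may be part of why the optimized builds take so long or fail.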