Custom compiler flags

The default compiler options are designed for a high level of optimization with the gcc compiler. For other compilers, you might get better results with some customization. To set the compiler flags, either set the environment variable CXXFLAGS or add CXXFLAGS=... when invoking configure.

The default is equivalent to

export CXXFLAGS="-O2 -march=native -flto"
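You can also pass the flags directly on the configure command line instead of exporting them; the flags shown here are just the defaults, for illustration:

```shell
# set custom compiler flags for this configure run only,
# without changing the shell environment
../mptoolkit/configure CXXFLAGS="-O2 -march=native -flto"
```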

Specifying a different architecture

By default, the tools will be compiled for the same architecture as the machine that the compiler is running on. This means that if you compile the toolkit on a modern machine that has some recent CPU extensions (for example Advanced Vector eXtensions), and then attempt to run the tools on an older machine that doesn't have these extensions, the tools will fail with an "Illegal Instruction" error. To work around this, specify the target architecture by hand with the -march=xxxx compiler flag (or whatever is appropriate for your compiler); for GCC, the possible architectures are listed in the GCC manual under the -march option. A reasonably safe architecture would be core2, e.g.

export CXXFLAGS="-O2 -march=core2 -mtune=native"

Installing to a different location

make install will copy the executable files to the PREFIX/bin directory specified in the configure script. By default, the PREFIX directory is $HOME. This is what you want if you are installing a personal copy of the toolkit and you only have write access to your home directory.

If you want to change the default, for example to install the executable files into /usr/local/bin, use

../mptoolkit/configure --prefix=/usr/local
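A complete personal installation, assuming the build directory sits next to the source tree (the paths here are illustrative), then looks like:

```shell
# configure, build, and install into a private prefix
../mptoolkit/configure --prefix=$HOME/mptoolkit
make
make install
# make the installed binaries visible to the shell
export PATH=$HOME/mptoolkit/bin:$PATH
```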


Debugging options

By default, the toolkit is compiled with no debug information, so stack traces etc. will not be useful. To enable debugging, use the configure option --enable-debug. Note that this also disables optimizations and enables a lot of debugging checks in the toolkit; it will run much slower, and produce a lot of extra output too.

You can also use --enable-debug=info to get just basic debugging. This is equivalent to adding -g to the compiler flags.

You can also use --enable-debug=profile to set compiler options appropriate for profiling the toolkit with gprof.
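In summary, the debug-related configure variants described above are:

```shell
../mptoolkit/configure --enable-debug          # no optimization, many debug checks, very slow
../mptoolkit/configure --enable-debug=info     # normal optimized build, plus -g debug symbols
../mptoolkit/configure --enable-debug=profile  # compiler options for profiling with gprof
```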

Optimized BLAS libraries

The configure script attempts to auto-detect the BLAS and LAPACK libraries, but it will often fail to autodetect an optimized BLAS library, especially if it is installed in a non-standard location. The difference in speed between the reference BLAS library and an optimized BLAS library such as MKL is typically around a factor 4 or more (much more if you also use multi-threading).

To use a specific BLAS library, use the option --with-blas=... when invoking configure. The ... will typically be something like -L/path/to/optimized/blas -lname_of_library.
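For example, to link against an OpenBLAS build installed under a non-standard prefix (the path and library name here are illustrative):

```shell
../mptoolkit/configure --with-blas="-L$HOME/opt/openblas/lib -lopenblas"
```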

To use MKL, use the option --with-blas="...", where the information in ... is taken from the Intel link line advisor.

For example, a typical configure command for single-threaded MKL (which, strangely, still requires libpthread) is

../mptoolkit/configure --with-blas="-Wl,--no-as-needed -L/opt/intel/composerxe/mkl/lib/intel64 \
-lmkl_gf_lp64 -lmkl_core -lmkl_sequential -lpthread -lm"

note 1: I used -lmkl_gf_lp64 here, not -lmkl_intel_lp64. This is very important, as the two versions of MKL use different conventions for returning complex values from functions. The toolkit will automatically detect which convention to use, but if ARPACK is compiled with gfortran then you must use the gf version of the library, or ARPACK will not work! The gfortran version will report

checking convention for returning complex values from Fortran functions... return_in_register

whereas using the intel version of MKL gives

checking convention for returning complex values from Fortran functions... pass_as_first_argument

If you get problems such as a segmentation fault or a program hanging inside zdotc (or some similar BLAS function), then it is most likely the wrong MKL interface library.

note 2: verify in the output of configure that it really is using the BLAS library that you specified -- if the configure script is unable to run a program using the specified BLAS library it will keep searching, and possibly use the wrong BLAS library!

note 3: Versions of MKL prior to around 2019.5 have serious bugs in the SVD and eigensolver LAPACK functions. Symptoms are the toolkit hanging at the start of iDMRG or at the end (while orthogonalizing the MPS), or extreme inaccuracies with TEBD time evolution. A possible workaround is to use a different LAPACK library with MKL BLAS. To do this, specify --with-lapack=LIBRARY when configuring.
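As a sketch of this workaround, combining MKL BLAS with a separately installed LAPACK might look like the following (the LAPACK library name is an assumption; substitute whatever is installed on your system):

```shell
# MKL provides BLAS; a different library provides the LAPACK
# SVD/eigensolver routines, avoiding the pre-2019.5 MKL bugs
../mptoolkit/configure \
  --with-blas="-Wl,--no-as-needed -L/opt/intel/composerxe/mkl/lib/intel64 \
-lmkl_gf_lp64 -lmkl_core -lmkl_sequential -lpthread -lm" \
  --with-lapack="-llapack"
```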

Which BLAS library to choose?

  • Most Linux machines will come with either the 'reference' versions of BLAS/LAPACK or the openblas libraries. Avoid the 'reference' versions at all costs - they will work, but they will be much slower than an optimized library. Openblas has quite good single-thread performance, but its multi-thread performance is generally very bad - for the toolkit, it is often slower to run with >1 thread than with a single thread! So we recommend setting the environment variable OPENBLAS_NUM_THREADS=1 when using openblas.
  • On recent Intel CPUs, the AVX-512 instruction set makes a big improvement to floating-point performance, typically a factor of 2 or more compared with older CPUs. MKL and Openblas both take advantage of this instruction set. AMD doesn't implement the AVX-512 instructions, although in most other respects the new Ryzen architecture is faster than the current Intel offerings (as of late 2017).
  • For Intel machines, use either openblas or MKL. Depending on the workload, MKL can give quite good multi-thread performance.
  • For AMD machines, we recommend the BLIS library (replaces BLAS) and FLAME (replaces LAPACK) as higher-performance libraries (about 60% faster than openblas in some benchmark tests). These libraries need to be installed from the GitHub source; as far as I know there are no pre-built packages available for Linux distributions.
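To apply the openblas recommendation above, set the environment variable before launching the tools, e.g. in your shell profile or job script:

```shell
# force openblas to use a single thread; multi-threaded openblas is
# often slower than single-threaded for the toolkit's workloads
export OPENBLAS_NUM_THREADS=1
```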
Page last modified on September 15, 2023, at 06:58 AM