MpDmrgInitResume

The mp-dmrg-init program is used for larger-scale DMRG calculations. It supports batch operation and is interruptible and restartable. mp-dmrg-init does not do any calculations itself; it merely initializes the data files for use by mp-dmrg-resume.

Matrix Product Toolkit version HEAD-0.7.3.0 (subversion tree rev 143:146M) (DEBUG)
Compiled on Apr 11 2006 at 14:23:49
usage: mp-dmrg-init [options]
Allowed options:
  --help                    show this help message
  -H [ --Hamiltonian ] arg  operator to use for the Hamiltonian (wavefunction attribute "Hamiltonian")
  -w [ --wavefunction ] arg initial wavefunction (required)
  -c [ --config ] arg       configuration file (required)
  --orthogonal arg          force the wavefunction to be orthogonal to this state
  -o [ --out ] arg          initial part of filename to use for output files (required)

The required options are --wavefunction, --config and --out. You must also specify the Hamiltonian, either with the --Hamiltonian option or by setting the Hamiltonian attribute of the initial wavefunction (see mp-attr).

Example: mp-dmrg-init -H hubbard-20site-lattice:H -c dmrg.conf -w initial.psi -o groundstate

The lattice parameter used by older versions of this program no longer exists; the lattice is now part of the expression defining the Hamiltonian. Eventually, a rather general operator expression will be allowed here, but for the time being the only allowed syntax is "lattice : operator". Typically, operator will be H.

There are some example configuration files in the conf/ directory. In particular, conf/dmrg-example.conf documents all of the possible options.

You can force the obtained wavefunction to be orthogonal to another wavefunction using the --orthogonal option.

Example: mp-dmrg-init -H hubbard-20-1-1.lattice:H -c dmrg.conf -w initial.psi --orthogonal groundstate.psi -o excitedstate

Assuming groundstate.psi is a previously obtained groundstate wavefunction, the DMRG will find the first excited state. Compared with the traditional DMRG method of targeting multiple states in the density matrix, this scheme has the advantage that only a single state is targeted, so the full m-dimensional basis is devoted to the excited state. The scaling of the energy to the large m limit can be performed independently for each excited state, which should give an improved estimate for the energy differences.

mp-dmrg-resume

Only one parameter is required for mp-dmrg-resume: the filename prefix that was given as the -o option to mp-dmrg-init. The mp-dmrg-resume program does the actual calculation. For a batch queue, the calculation is typically initialized first (with mp-dmrg-init), and then a job script that runs mp-dmrg-resume is submitted to the queue.
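For a batch queue, the job script itself can be very small. The following is a minimal sketch, not one of the scripts shipped in scripts/: write_job_script is a hypothetical helper that writes a job script for a given output prefix. Queue directives and the submission command are site-specific and are deliberately left out.

```shell
# Sketch only: write_job_script is a hypothetical helper, not part of the toolkit.
# usage: write_job_script PREFIX SCRIPTFILE
#   PREFIX     the value that was passed to mp-dmrg-init with -o
#   SCRIPTFILE where to write the job script
write_job_script() {
    prefix=$1
    script=$2
    cat > "$script" <<EOF
#!/bin/sh
# Start or continue the calculation that was initialized with -o $prefix
mp-dmrg-resume "$prefix"
# Pass the exit status on, so the caller can see why the run stopped.
exit \$?
EOF
    chmod +x "$script"
}
```

For example, after running mp-dmrg-init with -o groundstate, calling write_job_script groundstate job.sh produces a script that can be handed to the queue's submit command.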

The program handles the signals SIGUSR1, SIGUSR2, SIGTERM and SIGINT. Any of these signals causes the program to checkpoint as soon as possible (the time lag is typically only a few seconds). After the program has checkpointed, it can be resumed simply by running mp-dmrg-resume again. Each signal gives a different return code, which the job script can use to determine what caused the program to checkpoint. The return codes are defined in common/proccontrol.h.

SIGTERM is the default signal sent by the kill command. The queuing system typically also sends this signal if the job exceeds its CPU time. However, the queuing system typically follows it with a SIGKILL shortly afterwards, which might not leave enough time to checkpoint properly - don't depend on it!

SIGINT is raised by pressing control-c on the keyboard. Pressing control-c a second time will terminate the program immediately (and corrupt the checkpoint file).

SIGUSR1 and SIGUSR2 are normally not used but are available for whatever the user wants. They can be used by external scripts to control the behavior of the job in some way.

There are some example scripts for job control in the scripts/ directory. wrapper.lxtccl2 is a sample script that calls mp-dmrg-resume, handles restarting the job, and copies the checkpoint files from the NFS directories to the local node and back. The script handles SIGUSR1 by making a backup of the checkpoint files and resuming the calculation (useful if you are paranoid that the computer is about to crash and the program has been running for a long time). On SIGUSR2 it stops the current run and resubmits the job to the batch queue with an increased memory limit. This isn't really relevant for the lxtccl2 cluster though, as the memory limits are not enforced there. Note that wrapper.lxtccl2 will not work without some customization.
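The dispatch on the return code can be sketched as follows. This is not the contents of wrapper.lxtccl2; the function names and the RC_SIGUSR1 and RC_SIGUSR2 variables are hypothetical, and the numeric values must be taken from common/proccontrol.h. The sketch also assumes that a return code of 0 means the calculation finished normally.

```shell
# Sketch only: names are hypothetical, and RC_SIGUSR1 / RC_SIGUSR2 must be set
# by the user to the values defined in common/proccontrol.h.

# Map an exit status to the action a wrapper might take.
classify_status() {
    : "${RC_SIGUSR1:?set RC_SIGUSR1 from common/proccontrol.h}"
    : "${RC_SIGUSR2:?set RC_SIGUSR2 from common/proccontrol.h}"
    case "$1" in
        0)              echo "finished" ;;           # assumed: normal completion
        "$RC_SIGUSR1")  echo "backup-and-resume" ;;  # back up checkpoint, run again
        "$RC_SIGUSR2")  echo "resubmit" ;;           # resubmit with new limits
        *)              echo "stopped" ;;            # SIGTERM, SIGINT or an error
    esac
}

# Run a command (normally: mp-dmrg-resume PREFIX) and print the action.
run_and_classify() {
    status=0
    "$@" || status=$?
    classify_status "$status"
}
```

A wrapper would then branch on the printed action, for example by looping on run_and_classify mp-dmrg-resume groundstate until it reports anything other than backup-and-resume.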

The MaxCPUTime and MaxWallTime options in the configuration file can also be used to force a checkpoint after the specified amount of CPU time or elapsed time respectively.

Tips

Tip 1: For quick calculations, MpDmrg can be used instead. That program has a simpler interface but is more limited: there is no way to modify the number of states per sweep, there is no checkpointing, and only a few of the configuration file options of mp-dmrg-init can be set.

Tip 2: If you want to modify something in the configuration file quickly, you can use environment variables. For example, if dmrg.conf contains

...
NumStates = ${NUMSTATES}
...

Then, using bash (this probably works in most other shells too), use
NUMSTATES="20 40 60 80" mp-dmrg-init -c dmrg.conf .....

and the value of the NUMSTATES environment variable will be substituted into the configuration file.
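The same trick makes it easy to initialize several calculations from one configuration file. The following is a minimal sketch: init_with_states and the MP_DMRG_INIT override are hypothetical names, dmrg.conf is assumed to contain the NumStates = ${NUMSTATES} line shown above, and the remaining options (such as -w and -H) still have to be supplied by the caller.

```shell
# Sketch only: init_with_states and MP_DMRG_INIT are hypothetical names.
# Command to run; override it for a dry run, e.g. MP_DMRG_INIT=echo
MP_DMRG_INIT=${MP_DMRG_INIT:-mp-dmrg-init}

# usage: init_with_states PREFIX "STATE LIST" [further mp-dmrg-init options...]
init_with_states() {
    prefix=$1
    states=$2
    shift 2
    # NUMSTATES is set only in the environment of this one command.
    NUMSTATES="$states" "$MP_DMRG_INIT" -c dmrg.conf -o "$prefix" "$@"
}
```

For example, init_with_states run-a "20 40 60 80" -w initial.psi followed by init_with_states run-b "50 100" -w initial.psi would set up two calculations that differ only in their NumStates schedule.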

Tip 3: The NumStates line is the only part of the configuration file that cannot be modified in between checkpoint runs. For example, if a calculation seems to be taking an unreasonable amount of time to converge, you can interrupt it, modify the filename.conf file (the file with the filename prefix from the -o option, not the original file that was supplied with -c!) to adjust the convergence criteria, then resume the calculation. The environment variable trick can also be used in this case. For example, with MixFactor = ${MIXFACTOR} in the configuration file,

MIXFACTOR=0.005 mp-dmrg-resume mycalculation

will continue a stopped calculation with the new MixFactor.

Page last modified on December 16, 2010, at 05:55 AM