
Running CRYSTAL in Parallel

Pcrystal, MPPcrystal, MPI, OpenMP, GPUs

6 Topics 33 Posts
  • Discrepancy between MPP and Pcrystal

    0 Votes
    4 Posts
    167 Views

    Hi Rams,

    The need for relatively high values of FMIXING is not unusual, so what you are observing is expected. That said, you are correct that if FMIXING is set extremely high (close to 100%), there is a risk that the SCF procedure may converge before reaching the "true" ground state. This is not unique to your system; it is a general trade-off with strong damping.

    To improve robustness while still guiding the calculation toward the correct solution, you can try:

    Increasing TOLDEE: this tightens the SCF energy convergence criterion and lets the SCF keep refining toward the minimum even with high FMIXING. Keep in mind, however, that this will typically require many more SCF cycles.

    Combining mixing with other stabilization strategies, such as level shifting (LEVSHIFT) or electronic smearing (SMEAR), if applicable to your system: these often reduce the need for such extreme damping. A sketch of the relevant input keywords is given below.
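    For illustration, here is a minimal sketch of how these keywords might look in the SCF part of a CRYSTAL input deck (third input block); the numbers are placeholders to adapt to your system, not recommendations:

    TOLDEE
    9
    FMIXING
    70
    LEVSHIFT
    5 1
    MAXCYCLE
    200
    END

    If I recall the manual correctly, LEVSHIFT takes the shift in units of 0.1 hartree plus a lock flag, while for metallic systems SMEAR (followed by the smearing width in hartree on the next line) is usually the more appropriate choice than level shifting.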
  • OpenMP problem

    0 Votes
    9 Posts
    384 Views

    Hi Fabio,
    A few clarifications:

    i) I'm aware that setting KMP_DETERMINISTIC_REDUCTION=true can help, but in practice, it doesn't guarantee reproducibility on its own. Even building crystalOMP with stricter floating-point flags like -fp-model precise (or equivalent) doesn't always lead to fully consistent results, at least not in my experience with CRYSTAL.

    ii) Yes, MPI reductions are typically deterministic, as most implementations use pairwise summation or other stable schemes. OpenMP, on the other hand, can still introduce variability due to threading and compiler optimizations.

    iii) If you're seeing discrepancies around 1e-5, it's plausible that small numerical differences from OpenMP reductions are enough to drive the SCF toward slightly different solutions, especially in extremely delicate cases like metallic systems treated at the HF level.

    At this point, I’ve investigated what I could from our end. If maximum reproducibility is essential, I strongly recommend sticking with MPI-only parallelism (a minimal launch sketch is given below).

    Have a great day,
    Giacomo
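    For what it is worth, a minimal sketch of the two launch modes discussed above; binary names, the MPI launcher, and input-file conventions vary between installations, so the file names and core counts below are placeholders:

    # MPI-only run with Pcrystal: one process per core, no OpenMP threads
    # (assuming the common convention that Pcrystal reads a file named INPUT
    # from the working directory)
    export OMP_NUM_THREADS=1
    cp mysystem.d12 INPUT
    mpirun -np 32 Pcrystal > mysystem.out 2>&1

    # Threaded run with crystalOMP (assuming it reads standard input like the
    # serial binary); these settings reduce, but do not eliminate, run-to-run
    # variation from threaded reductions
    export OMP_NUM_THREADS=8
    export KMP_DETERMINISTIC_REDUCTION=true   # Intel OpenMP runtime only
    ./crystalOMP < mysystem.d12 > mysystem.out 2>&1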

  • No space left on device

    0 Votes
    4 Posts
    183 Views

    job314 said in No space left on device:

    those are huge HPC nodes… they can’t possibly be out of disk…

    On our cluster, although the total disk space is huge, each user has a limited disk quota (see the sketch below for how to check). Maybe it is the same on your cluster?

    job314 said in No space left on device:

    will it affect my convergence or calculation speed?

    Possibly, but I don't think by much.
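    If you want to verify whether a per-user quota, rather than the physical disk, is what you are hitting, something along these lines usually shows it (the exact quota command depends on the filesystem, and /scratch below is just a placeholder path):

    # free space on the scratch filesystem the job writes to
    df -h "$TMPDIR"
    # how much the job has already written there
    du -sh "$TMPDIR"
    # per-user quota and current usage (on Lustre sites: lfs quota -u $USER /scratch)
    quota -s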

  • PCrystal job stuck when run across several nodes

    0 Votes
    7 Posts
    270 Views

    OK, here we go. It is just stuck, always at the same position in the output:

    (ceres20-compute-46:0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95)
    (ceres24-compute-18:96,97,98,99,100,101,102,103,104,105,106,107,108,109,110,111,112,113,114,115,116,117,118,119,120,121,122,123,124,125,126,127,128,129,130,131,132,133,134,135,136,137,138,139,140,141,142,143,144,145,146,147,148,149,150,151,152,153,154,155,156,157,158,159,160,161,162,163,164,165,166,167,168,169,170,171,172,173,174,175,176,177,178,179,180,181,182,183,184,185,186,187,188,189,190,191)

    export TMPDIR=/local/bgfs/jonas.baltrusaitis/15383115
    export TMOUT=5400
    export SINGULARITY_TMPDIR=/local/bgfs/jonas.baltrusaitis/15383115

    MAX NUMBER OF SCF CYCLES 200 CONVERGENCE ON DELTAP 10**-20
    WEIGHT OF F(I) IN F(I+1) 30% CONVERGENCE ON ENERGY 10**-10
    SHRINK. FACT.(MONKH.) 6 6 6 NUMBER OF K POINTS IN THE IBZ 64
    SHRINKING FACTOR(GILAT NET) 6 NUMBER OF K POINTS(GILAT NET) 64

    *** K POINTS COORDINATES (OBLIQUE COORDINATES IN UNITS OF IS = 6)
    1-R( 0 0 0) 2-C( 1 0 0) 3-C( 2 0 0) 4-R( 3 0 0)
    5-C( 0 1 0) 6-C( 1 1 0) 7-C( 2 1 0) 8-C( 3 1 0)
    9-C( 0 2 0) 10-C( 1 2 0) 11-C( 2 2 0) 12-C( 3 2 0)
    13-R( 0 3 0) 14-C( 1 3 0) 15-C( 2 3 0) 16-R( 3 3 0)
    17-C( 0 0 1) 18-C( 1 0 1) 19-C( 2 0 1) 20-C( 3 0 1)
    21-C( 0 1 1) 22-C( 1 1 1) 23-C( 2 1 1) 24-C( 3 1 1)
    25-C( 0 2 1) 26-C( 1 2 1) 27-C( 2 2 1) 28-C( 3 2 1)
    29-C( 0 3 1) 30-C( 1 3 1) 31-C( 2 3 1) 32-C( 3 3 1)
    33-C( 0 0 2) 34-C( 1 0 2) 35-C( 2 0 2) 36-C( 3 0 2)
    37-C( 0 1 2) 38-C( 1 1 2) 39-C( 2 1 2) 40-C( 3 1 2)
    41-C( 0 2 2) 42-C( 1 2 2) 43-C( 2 2 2) 44-C( 3 2 2)
    45-C( 0 3 2) 46-C( 1 3 2) 47-C( 2 3 2) 48-C( 3 3 2)
    49-R( 0 0 3) 50-C( 1 0 3) 51-C( 2 0 3) 52-R( 3 0 3)
    53-C( 0 1 3) 54-C( 1 1 3) 55-C( 2 1 3) 56-C( 3 1 3)
    57-C( 0 2 3) 58-C( 1 2 3) 59-C( 2 2 3) 60-C( 3 2 3)
    61-R( 0 3 3) 62-C( 1 3 3) 63-C( 2 3 3) 64-R( 3 3 3)

    DIRECT LATTICE VECTORS COMPON. (A.U.) RECIP. LATTICE VECTORS COMPON. (A.U.)
    X Y Z X Y Z
    13.1430453 0.0000000 0.0000000 0.4780616 0.0000000 0.0000000
    0.0000000 11.6066979 0.0000000 0.0000000 0.5413413 0.0000000
    0.0000000 0.0000000 21.1989478 0.0000000 0.0000000 0.2963914

    DISK SPACE FOR EIGENVECTORS (FTN 10) 53868000 REALS

    SYMMETRY ADAPTION OF THE BLOCH FUNCTIONS ENABLED
    TTTTTTTTTTTTTTTTTTTTTTTTTTTTTT gordsh1 TELAPSE 186.18 TCPU 45.44

  • Setting the output file name

    0 Votes
    3 Posts
    170 Views

    Dear Giacomo, it worked fine. Thank you very much!

  • Help installing CRYSTAL17 for parallel execution

    0 Votes
    6 Posts
    393 Views

    Thank you very much for your kind response.