
Giacomo Ambrogio
@GiacomoAmbrogio
Developer

Posts: 41 | Topics: 2 | Groups: 1 | Followers: 8 | Following: 3

Posts


  • Unexpectedly cannot get the geometry after optimization: KEYWORD EXTPRT NOT ALLOWED
    GiacomoAmbrogio

    Hi esmuigors,

    When you run a geometry optimization (within the OPTGEOM block), CRYSTAL will always produce a fort.34 file (that contains the optimized structure and can be used in a subsequent calculation with the EXTERNAL keyword).
    If you don’t see it after the run, a common cause is that it wasn’t copied back by the script used to manage scratch directories for the execution, such as the runCRY script provided with the code.
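    For reference, here is a rough sketch of a follow-up input that reads the optimized structure from fort.34 via EXTERNAL, laid out like the BASISSET-style example further down this page; the basis set, functional and SHRINK values below are placeholders only, to be replaced with your own settings:

    Optimized structure from fort.34
    EXTERNAL
    BASISSET
    POB-TZVP
    DFT
    PBE0
    ENDDFT
    SHRINK
    0 8
    8 8 8
    END

    Just remember that the fort.34 produced by the optimization must be present in the scratch directory of the new run.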

    As for the error that you get, the issue is that EXTPRT should be placed outside the OPTGEOM block, as in this input.d12 example.

    If you prefer, you can also extract the optimized geometry manually from the .out file. However, note that you must take both the final cell parameters (and angles, if present) and the final fractional atomic coordinates from the “FINAL OPTIMIZED GEOMETRY” section (those in the primitive cell): using only updated cell parameters without the matching coordinates will lead to inconsistencies and errors like the ones you saw.

    Let me know if you manage to get the optimized structure.


  • SCF fails spinlock with POB-DZVP-REV2
    GiacomoAmbrogio

    Hi jquertin,
    As Aleks correctly pointed out, it’s better not to use DIIS (which is active by default) when using SPINLOCK.
    A good way to combine the two is like this:

    SPINLOCK  
    1  
    -2  
    THREDIIS  
    0.005
    

    This way, DIIS is only activated when SPINLOCK is turned off.

    However, I tried running your input, and it’s likely that this is actually a bug. Unfortunately, I don’t have a better solution at the moment other than changing the functional.


  • CRYSTAL23 Installation Linux one node
    GiacomoAmbrogio

    Hi Emmanuel,

    For an updated guide on the installation of CRYSTAL23, please follow the instructions provided here.
    Specifically, to install the executable, refer to Section 1. If you need to recompile the code from object files, see Section 3.

    To test the parallel executable, you can refer either to Section 2, or to this tutorial.

    Let me know if you need further help.


  • OpenMP problem
    GiacomoAmbrogio

    Hi Fabio,
    A few clarifications:

    i) I'm aware that setting KMP_DETERMINISTIC_REDUCTION=true can help, but in practice, it doesn't guarantee reproducibility on its own. Even building crystalOMP with stricter floating-point flags like -fp-model precise (or equivalent) doesn't always lead to fully consistent results, at least not in my experience with CRYSTAL.

    ii) Yes, MPI reductions are typically deterministic, as most implementations use pairwise summation or other stable schemes. OpenMP, on the other hand, can still introduce variability due to threading and compiler optimizations.

    iii) If you're seeing discrepancies around 1e-5, it's plausible that small numerical differences from OpenMP reductions are enough to drive the SCF, especially in extremely delicate cases like metallic systems treated at the HF level, toward slightly different solutions.

    At this point, I’ve investigated what I could from our end. If maximum reproducibility is essential, I strongly recommend sticking with MPI-only parallelism.
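    As a minimal launch sketch (the process count is just a placeholder, and output handling should follow whatever your submission script already does):

    export OMP_NUM_THREADS=1     # rule out extra threading from the math libraries
    mpirun -np 16 Pcrystal       # pure-MPI executable; 16 processes is just an example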

    Have a great day,
    Giacomo


  • OpenMP problem
    GiacomoAmbrogio

    Hi Fabio,

    After an in-depth investigation, I can confirm that the behavior you're observing is definitely not due to uninitialized variables. The most likely explanation is the use of non-deterministic summation in reduction operations, which is a known behavior when running floating-point operations with OpenMP parallelization.

    In OpenMP, when large arrays or matrices are reduced (e.g., summing elements of the density or Fock matrices), the order of operations can change depending on how the threads are scheduled at runtime. Since floating-point addition is not associative, this can lead to slightly different results depending on the number of threads, their distribution, or the memory layout at runtime, even if the input is exactly the same.

    To confirm this, I ran your script using the public OpenMP build, with both 16 threads and a single thread. Results are reported below.

    • In the parallel run, there are slight differences in the final energy, but the magnitude of the difference is negligible, well within the SCF convergence tolerance.
    • In the single-thread run (i.e., serial execution with OpenMP), the results are fully reproducible, with no differences at all.

    As expected, increasing the number of threads makes these small numerical differences more likely, although they typically remain insignificant in practice. For this reason, we recommend limiting the number of OpenMP threads to 4 and relying primarily on MPI for larger-scale parallelism, which generally offers a better balance between performance and numerical stability.

    Additionally, it's good practice in our workflow to run each job in a clean, isolated SCRATCH folder, to avoid any unintended interference from leftover files or cached data. Your current script doesn’t do this, so we recommend updating it to clean or define a separate working directory for each run.
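    Something along these lines would do (a minimal shell sketch; paths and names are placeholders to adapt to your own script):

    # create a fresh, empty scratch directory for this run
    SCRATCH="$HOME/scratch/run_$(date +%s)"
    mkdir -p "$SCRATCH"
    cp input.d12 "$SCRATCH"/
    cd "$SCRATCH"
    # ... launch crystalOMP / Pcrystal here, as in your current script ...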

     Energies at each SCF Cycle     -     Copper bulk
     -----------------------------------------------------------------------------
                             OpenMP 16 th
     -----------------------------------------------------------------------------
                     run 1                 run 2                 diff
     CYC   0  -1.959786174741E+02   -1.959786174741E+02    0.0
     CYC   1  -1.959830545289E+02   -1.959830545289E+02    0.0
     CYC   2  -1.959842257372E+02   -1.959842257372E+02    0.0
     CYC   3  -1.959843727892E+02   -1.959843727892E+02    0.0
     CYC   4  -1.959844091796E+02   -1.959844091796E+02    0.0
     CYC   5  -1.959844090374E+02   -1.959844090374E+02    0.0
     CYC   6  -1.959844090603E+02   -1.959844077969E+02   -1.2634000086109154e-06
     CYC   7  -1.959844090705E+02   -1.959844090714E+02    9.000018508231733e-10
     CYC   8                        -1.959844090772E+02    6.700020094285719e-09
     CYC   9                        -1.959844090706E+02    1.000159954855917e-10
    
     -----------------------------------------------------------------------------
                             OpenMP 1 th
     -----------------------------------------------------------------------------
                     run 1                 run 2                 diff
     CYC   0  -1.959786174741E+02   -1.959786174741E+02    0.0
     CYC   1  -1.959830545289E+02   -1.959830545289E+02    0.0
     CYC   2  -1.959842257372E+02   -1.959842257372E+02    0.0
     CYC   3  -1.959843727892E+02   -1.959843727892E+02    0.0
     CYC   4  -1.959844091796E+02   -1.959844091796E+02    0.0
     CYC   5  -1.959844090374E+02   -1.959844090374E+02    0.0
     CYC   6  -1.959844090603E+02   -1.959844090603E+02    0.0
     CYC   7  -1.959844090705E+02   -1.959844090705E+02    0.0
    
    
    

  • OpenMP problem
    GiacomoAmbrogio

    Hi Fabio,
    Regarding your initial question, I ran the public OpenMP version of the code (using 4 threads) multiple times with your input, and I consistently obtain the same results across runs. The SCF procedure converges smoothly in 6 cycles, as shown below:

    $ grep DETOT input_4th.out
     CYC   0 ETOT(AU) -1.959786174741E+02 DETOT -1.96E+02 tst  0.00E+00 PX  1.00E+00
     CYC   1 ETOT(AU) -1.959830545289E+02 DETOT -4.44E-03 tst  0.00E+00 PX  1.00E+00
     CYC   2 ETOT(AU) -1.959842257372E+02 DETOT -1.17E-03 tst  5.19E-04 PX  2.32E-02
     CYC   3 ETOT(AU) -1.959843829641E+02 DETOT -1.57E-04 tst  6.44E-04 PX  5.14E-02
     CYC   4 ETOT(AU) -1.959844090345E+02 DETOT -2.61E-05 tst  4.59E-06 PX  4.09E-03
     CYC   5 ETOT(AU) -1.959844090249E+02 DETOT  9.60E-09 tst  1.92E-07 PX  1.01E-03
     CYC   6 ETOT(AU) -1.959844090704E+02 DETOT -4.56E-08 tst  8.72E-09 PX  1.01E-03
    $ grep DETOT input_4th_2.out
     CYC   0 ETOT(AU) -1.959786174741E+02 DETOT -1.96E+02 tst  0.00E+00 PX  1.00E+00
     CYC   1 ETOT(AU) -1.959830545289E+02 DETOT -4.44E-03 tst  0.00E+00 PX  1.00E+00
     CYC   2 ETOT(AU) -1.959842257372E+02 DETOT -1.17E-03 tst  5.19E-04 PX  2.32E-02
     CYC   3 ETOT(AU) -1.959843829641E+02 DETOT -1.57E-04 tst  6.44E-04 PX  5.14E-02
     CYC   4 ETOT(AU) -1.959844090345E+02 DETOT -2.61E-05 tst  4.59E-06 PX  4.09E-03
     CYC   5 ETOT(AU) -1.959844090249E+02 DETOT  9.60E-09 tst  1.92E-07 PX  1.01E-03
     CYC   6 ETOT(AU) -1.959844090704E+02 DETOT -4.56E-08 tst  8.72E-09 PX  1.01E-03
    $ grep DETOT input_4th_3.out
     CYC   0 ETOT(AU) -1.959786174741E+02 DETOT -1.96E+02 tst  0.00E+00 PX  1.00E+00
     CYC   1 ETOT(AU) -1.959830545289E+02 DETOT -4.44E-03 tst  0.00E+00 PX  1.00E+00
     CYC   2 ETOT(AU) -1.959842257372E+02 DETOT -1.17E-03 tst  5.19E-04 PX  2.32E-02
     CYC   3 ETOT(AU) -1.959843829641E+02 DETOT -1.57E-04 tst  6.44E-04 PX  5.14E-02
     CYC   4 ETOT(AU) -1.959844090345E+02 DETOT -2.61E-05 tst  4.59E-06 PX  4.09E-03
     CYC   5 ETOT(AU) -1.959844090249E+02 DETOT  9.60E-09 tst  1.92E-07 PX  1.01E-03
     CYC   6 ETOT(AU) -1.959844090704E+02 DETOT -4.56E-08 tst  8.72E-09 PX  1.01E-03
    $ grep DETOT input_4th_4.out
     CYC   0 ETOT(AU) -1.959786174741E+02 DETOT -1.96E+02 tst  0.00E+00 PX  1.00E+00
     CYC   1 ETOT(AU) -1.959830545289E+02 DETOT -4.44E-03 tst  0.00E+00 PX  1.00E+00
     CYC   2 ETOT(AU) -1.959842257372E+02 DETOT -1.17E-03 tst  5.19E-04 PX  2.32E-02
     CYC   3 ETOT(AU) -1.959843829641E+02 DETOT -1.57E-04 tst  6.44E-04 PX  5.14E-02
     CYC   4 ETOT(AU) -1.959844090345E+02 DETOT -2.61E-05 tst  4.59E-06 PX  4.09E-03
     CYC   5 ETOT(AU) -1.959844090249E+02 DETOT  9.60E-09 tst  1.92E-07 PX  1.01E-03
     CYC   6 ETOT(AU) -1.959844090704E+02 DETOT -4.56E-08 tst  8.72E-09 PX  1.01E-03
    

    I suspect that the behavior you're observing might be related to differences in the OpenMP and/or MKL library versions used in your environment. In my experience, such differences can influence numerical results to some extent (especially in delicate systems like metals treated with HF, where convergence can be sensitive to small perturbations or numerical noise).
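    If you want to check quickly which runtime libraries your executable actually picks up, a small sketch (assuming the OpenMP build is called crystalOMP):

    ldd crystalOMP | grep -iE 'mkl|iomp|gomp'   # lists the MKL / OpenMP runtime libraries linked at run time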


  • OpenMP problem
    GiacomoAmbrogio

    Hi Fabio,
    Could you please also send both outputs (from the OMP and non-OMP calculations)?
    This is a strange behaviour. In the meantime, I will run some tests.


  • Conversion of fort.25 file to .cube (Vesta file)
    GiacomoAmbrogio

    Hi Chhatra,
    This is because your system is no longer 3D after you apply SLABCUT. In order to run an ECH3 properties calculation on a slab, you should add a RANGE for the non-periodic direction:

    ECH3
    100
    RANGE
    -2
    +2
    END
    

    The two numbers after RANGE are the minimum and maximum distances from the origin of the cell along the non-periodic direction, in atomic units.
    The same spacing as in the periodic directions is used to generate the points.

    You can find more references on this at page 322 of the User Manual.

    Hope this helps!


  • Negative value in the SCF of an anion vacancy
    GiacomoAmbrogio

    Hi Mo,

    I believe the correct way to remove an atom while keeping the basis set is to use the GHOSTS keyword (see page 82 of the User manual) instead of ATOMSUBS, without modifying the original basis set.
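    The format is simply the keyword followed by the number of atoms and their labels; as a minimal sketch (the label 36 below is purely a placeholder for the atom you remove):

    GHOSTS
    1
    36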

    Could you try that and check if the energy makes sense afterward?


  • problems with the calculation of hexagonal structures
    GiacomoAmbrogio

    Hi heimurinn,

    The error you're seeing is not actually related to the symmetry adaptation of the Bloch functions. What’s happening is that the error is triggered by another processor before the one responsible for writing the output reaches the actual failure point.

    If you run the same calculation on one single process (serial mode), you should see the real error, which (for 81_from1.d12, I didn’t try 81_to1403845.d12, but I suspect it will be the same) is:
    ERROR **** GROTA1 **** ERROR IN SYMMETRY EQUIVALENCE - CHECK INPUT COORDINATES
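    For example, a one-process launch along the lines used elsewhere on this page:

    mpirun -np 1 Pcrystal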

    This issue is likely related to symmetry. One thing you can try is to follow the structure standardization procedure described in this thread.

    If you have any questions or need further help, feel free to ask!


  • Dispersion scheme D4 documentation
    GiacomoAmbrogio

    Hi Georg,

    As far as I know, the D4 dispersion scheme is not yet implemented in the public version of CRYSTAL. It’s likely planned for a future release, but not currently available.

    Where did you come across information that D4 has been implemented?


  • PCrystal job stuck when run between several nodes
    GiacomoAmbrogio

    Hi Job,
    It seems there might be a bit of confusion regarding how to report MPI bindings; we may update the tutorial to make it clearer.

    To display binding information correctly, you’ll need to include the appropriate flag in your usual mpirun command. For example:

    • If you're using OpenMPI:
    mpirun --report-bindings -np 192 Pcrystal
    
    • If you're using Intel MPI:
    mpirun -print-rank-map -np 192 Pcrystal
    

    To check which MPI implementation you're using, you can inspect the loaded modules in your environment, or simply run:

    mpirun --version
    

    If you're using a different MPI implementation, feel free to let me know. I'd be happy to help you find the right way to print the bindings.


  • PCrystal job stuck when run between several nodes
    GiacomoAmbrogio

    Hi job314,

    A good reference is the how to run tutorial.

    In your calculation, you're using 30 k-points. Depending on system symmetry, this generates a number of symmetry-adapted Bloch functions that are then distributed across processes. This is the main parallelization bottleneck in Pcrystal: ideally, you want to use a number of MPI processes equal to or lower than the number of SABFs.

    However, the behavior you describe seems unusual, even in cases where the number of MPI processes exceeds the number of SABFs. It might be an inter-node communication issue.

    Have you tried running other MPI programs across multiple nodes to see if the problem persists?

    Also, check how MPI processes are bound to physical cores. You can control this by adding a printing flag in your mpirun command (this is also explained in the tutorial above).

    Let me know if you manage to get the bindings.


  • Phonon density of states plotting
    GiacomoAmbrogio

    Hi Danny,

    From a first look at your input file, I noticed that you are performing a combined PDOS and BANDS calculation. This has not been fully tested as we prefer to run two separate jobs, one for PDOS and one for BANDS. Once the harmonic frequencies are computed, they can both be run through restarts and are both very fast.

    Indeed, the PDOS calculation produces a single set of data if run without the BANDS keyword.

    I therefore suggest splitting your calculation into two separate ones: one for PDOS and one for BANDS. You can use the RESTART keyword in the FREQCALC block to avoid recomputing the Hessian matrix (see page 219 of the User Manual).
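    As a rough sketch, the FREQCALC block of the restarted PDOS job would look like this (the bracketed line stands for your existing PDOS settings, kept unchanged; the same idea applies to the BANDS job):

    FREQCALC
    RESTART
    [your PDOS settings, without the BANDS part]
    END

    In both cases, the restart information produced by the first frequency run must be available in the scratch directory of the new job.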

    I leave here a link to a tutorial webpage that we have recently updated about computing phonon-related properties with CRYSTAL.

    Let me know if this helps!


  • How can I find the total number of irreducible representations corresponding to the k-points?
    GiacomoAmbrogio

    Hi Yachao_su,

    The relation you mention between MPI processes and efficiency is correct, but a bit of clarification might be necessary: limitations on efficiency apply only to the linear algebra steps and the density matrix construction (specifically, the FDIK and PDIG steps) during the SCF procedure.

    In the FDIK section we are essentially diagonalizing a set of Fock matrices (of size n atomic orbitals x n atomic orbitals), one for each k point. When using symmetry-adapted Bloch functions, these matrices become block-diagonal, and each block corresponds to an irreducible representation (irrep). These blocks can then be diagonalized independently (by different MPI processes).

    Diagonalization scales cubically with matrix size. However, for small systems, this time is usually negligible compared to other SCF steps. So in such cases, there's no need to worry too much about how many MPI processes you're using. 😊

    The other most time-consuming part is the computation of two-electron integrals (SHELLXN step), which scales almost linearly with system size and is not affected by the MPI-per-irrep limitation.

    The limitation becomes relevant when dealing with large systems, where the irreps (i.e., the block sizes) are large. In that case, the time spent on FDIK and PDIG becomes significant. Since Pcrystal cannot assign more than one MPI process per irrep, any additional processes will stay idle during diagonalization, but they are still used in other parts of the calculation.

    (Figure: wall-clock time of the main SCF steps as a function of system size; FDIK, i.e. diagonalization, shown in blue.)

    Here, I've reported an example. You can see the blue line (FDIK, i.e., diagonalization) is negligible on the left (small systems) but becomes dominant on the right (larger systems). So the MPI limitation becomes noticeable only in that region.

    If you're on the right side of the plot, where the time spent in FDIK exceeds that of other steps like SHELLXN or NUMDFT (DFT integration), you should limit your number of MPI processes using the rule:

    $$ \text{max MPI processes} = n_{\text{spin}} \times \sum_{\mathbf{k}} n_{\text{irreps},\,\mathbf{k}} $$

    That is, the total number of processes should not exceed the sum of all irreps across all k points, multiplied by the number of spins (2 if you're doing a spin-polarized calculation, 1 for closed shell).

    By default, the number of irreps is not printed in the output file. However, you can activate some advanced printing options to display this information. To do so, insert the following into the third block of your input file:

    KSYMMPRT
    SETPRINT
    1
    47 nk
    

    Here, nk is the maximum number of k points for which you want to print symmetry information. Likely this will be the number of k points in your calculation.

    You can find some references for this option in the CRYSTAL User Manual, pages 117-118.

    The output will look like this:

     +++ SYMMETRY ADAPTION OF THE BLOCH FUNCTIONS +++
    
     SYMMETRY INFORMATION:
     K-LITTLE GROUP: CLASS TABLE, CHARACTER TABLE.
     IRREP-(DIMENSION, NO. IRREDUCIBLE SETS)
     (P, D, RP, RD, STAND FOR PAIRING, DOUBLING, REAL PAIRING AND REAL DOUBLING
     OF THE IRREPS (SEE MANUAL))
    
     K[   1] (  0  0  0)                                                    <------ 1st k point
    
     CLASS | GROUP OPERATORS (COMPLEX PHASE IN DEGREES)
     ------------------------------------------------------------------------
       2   |   2(   0.0);
       3   |   3(   0.0);   4(   0.0);
       4   |   5(   0.0);   6(   0.0);
       5   |   7(   0.0);   8(   0.0);   9(   0.0);
       6   |  10(   0.0);  12(   0.0);  11(   0.0);
    
     IRREP/CLA     1     2     3     4     5     6
     ---------------------------------------------
      MULTIP |     1     1     2     2     3     3
     ---------------------------------------------
          1  |  1.00  1.00  1.00  1.00  1.00  1.00
          2  |  1.00 -1.00  1.00 -1.00 -1.00  1.00
          3  |  2.00  2.00 -1.00 -1.00  0.00  0.00
          4  |  1.00  1.00  1.00  1.00 -1.00 -1.00
          5  |  2.00 -2.00 -1.00  1.00  0.00  0.00
          6  |  1.00 -1.00  1.00 -1.00  1.00 -1.00
    
       1-(1,  27);   2-(1,  25);   3-(2,  37);   4-(1,  23);   5-(2,  37);  <------ This is the information about
       6-(1,  25);                                                          <------ the irreps
    
     K[   2] (  1  0  0)                                                    <------ 2nd k point.
    
     CLASS | GROUP OPERATORS (COMPLEX PHASE IN DEGREES)
     ------------------------------------------------------------------------
       2   |  11(   0.0);
    
     IRREP/CLA     1     2
     ---------------------
      MULTIP |     1     1
     ---------------------
          1  |  1.00  1.00
          2  |  1.00 -1.00
    
       1-(1, 126);   2-(1, 122);                                             <------ irreps of 2nd k point
    ....
    

    For example, here the first k point is adapted into 6 different irreps, while the second into only 2.
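    Plugging these numbers into the rule above, and pretending for a moment that these were the only two k points of a closed-shell calculation, the limit would be (6 + 2) × 1 = 8 MPI processes; in practice, you sum the irrep counts over all nk points printed.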

    Please note: this feature should be run using one single MPI process.


  • Is it necessary to add the keyword NOSYMADA when using a user-defined anisotropic k-point grid?
    GiacomoAmbrogio

    The default DETOT value for convergence in SCF is set to 1.0E-6 (at least in single point calculations, you can see all other settings at page 130 in the CRYSTAL user manual). This means that as soon as DETOT drops below 1.0E-6, the code considers the system to have reached convergence and stops. For technical reasons, however, the code performs one additional SCF cycle (actually, half-cycle) after this point.

    This explains the behavior you're seeing. If you look at the 8th cycle, the DETOT is 8.46E-07, which triggers the SCF to stop. But clearly, the system hasn't fully converged. I'm not entirely sure why this happens, but it's likely due to parameters that are too loose... such as the thresholds for integral evaluation, DETOT, SHRINK, etc., combined with the DIIS convergence accelerator.

    To tighten the SCF convergence threshold, you can use the TOLDEE keyword in the third input section. For example, you can make the DETOT threshold 1.0E-8 like this:

    [...]
    SHRINK
    0 5
    1 5 5
    TOLINTEG
    7 7 7 7 14
    TOLDEE
    8
    END
    

    Another thing you can try is restarting the calculation using the previously obtained wavefunction as the initial guess. This will allow the code to continue the SCF after the ninth cycle and possibly move toward the correct solution. You can do this as follows:

    CZ_DBF
    EXTERNAL
    BASISSET
    SOLDEF2MSVP
    DFT
    HSESOL3C
    ENDDFT
    GUESSP
    SHRINK
    0 5
    1 5 5
    TOLINTEG
    7 7 7 7 14
    END
    

    The important thing is to copy the wavefunction obtained from the first calculation (fort.9) into the scratch folder of the new one, renaming it to fort.20.
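    For example (the scratch paths below are placeholders for your own directories):

    cp /path/to/first_run_scratch/fort.9 /path/to/second_run_scratch/fort.20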

    As a general tip, it's always a good idea to check the convergence of the calculation using: grep DETOT output_file
    This can be very helpful for spotting issues and monitoring how the code is progressing during the calculation.


  • Is it necessary to add the keyword NOSYMADA when using a user-defined anisotropic k-point grid?
    GiacomoAmbrogio

    We ran some tests regarding the inclusion of the keyword NOSYMADA in relation to anisotropic shrinking factors, and I can confirm that we did not find any significant difference. Therefore, anisotropic shrinking is fully compatible with symmetry-adapted Bloch functions.

    Taking a closer look at the data and input/output files you submitted, the convergence with respect to the shrinking factors appears to be very similar. Comparisons of calculations using the same SHRINK value consistently show differences well within the default tolerance (10^-6 Ha/cell), except for one value (highlighted in red in the screenshot below). In this case, the energy seems to be significantly off the trend. Could you please double-check that specific calculation?

    (Screenshot: energy vs. SHRINK comparison; the outlier value is highlighted in red.)


  • Basis Sets for Spin-Orbit
    GiacomoAmbrogio

    There shouldn't be any problem with using POB basis sets alongside a basis set with an ECP for SOC.
    If you run into any issues, send the output file and I’ll take a closer look.


  • Basis Sets for Spin-Orbit
    GiacomoAmbrogio

    Hi Ehsangowdini,
    Could you please share an example of your input file?


  • Is it necessary to add the keyword NOSYMADA when using a user-defined anisotropic k-point grid?
    GiacomoAmbrogio

    Can you please share the input file?
