
Giacomo Ambrogio (@GiacomoAmbrogio), Developer
Posts: 30 · Topics: 2 · Groups: 1 · Followers: 6 · Following: 2

Posts


  • PCrystal job stuck when run between several nodes

Hi job314,
It seems there might be a bit of confusion about how to print MPI bindings; we may update the tutorial to make this clearer.

    To display binding information correctly, you’ll need to include the appropriate flag in your usual mpirun command. For example:

    • If you're using OpenMPI:
    mpirun --report-bindings -np 192 Pcrystal
    
    • If you're using Intel MPI:
    mpirun -print-rank-map -np 192 Pcrystal
    

    To check which MPI implementation you're using, you can inspect the loaded modules in your environment, or simply run:

    mpirun --version
    

    If you're using a different MPI implementation, feel free to let me know. I'd be happy to help you find the right way to print the bindings.


  • PCrystal job stuck when run between several nodes

    Hi job314,

A good reference is the "how to run" tutorial.

In your calculation, you're using 30 k-points. Depending on the system symmetry, this generates a number of symmetry-adapted Bloch functions (SABFs) that are then distributed across processes. This is the main parallelization bottleneck in Pcrystal: ideally, you want to use a number of MPI processes equal to or lower than the number of SABFs.

    However, the behavior you describe seems unusual, even in cases where the number of MPI processes exceeds the number of SABFs. It might be an inter-node communication issue.

    Have you tried running other MPI programs across multiple nodes to see if the problem persists?
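As a quick sanity check, you can launch a trivial command across two nodes and confirm that every rank starts and returns. This is a minimal sketch for OpenMPI; the hostnames are placeholders for two of your nodes:

mpirun -np 2 --host node01,node02 hostname

If this hangs or fails, the problem is in the MPI/network setup rather than in Pcrystal.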

Also, check how MPI processes are bound to physical cores. You can inspect this by adding a printing flag to your mpirun command (this is also explained in the tutorial above).

    Let me know if you manage to get the bindings.


  • Phonon density of states plotting

    Hi Danny,

From a first look at your input file, I noticed that you are performing a combined PDOS and BANDS calculation. This combination has not been fully tested, as we prefer to run two separate jobs, one for PDOS and one for BANDS. Once the harmonic frequencies are computed, both can be run as restarts and both are very fast.

    Indeed, the PDOS calculation produces a single set of data if run without the BANDS keyword.

I therefore suggest splitting your calculation into two separate ones: one for PDOS and one for BANDS. You can use the RESTART keyword in the FREQCALC block to avoid recomputing the Hessian matrix (see page 219 of the User Manual).
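As a rough sketch, the frequency block of the restarted PDOS job could look like this (assuming the restart data file from the frequency run, FREQINFO.DAT, is available in the scratch directory; the exact PDOS records are described in the manual):

[...]
FREQCALC
RESTART
PDOS
[PDOS records, see the User Manual]
END
[...]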

I'll leave here a link to a recently updated tutorial webpage about computing phonon-related properties with CRYSTAL.

    Let me know if this helps!


  • How can I find the total number of irreducible representations corresponding to the k-points?

    Hi Yachao_su,

The relation you mention between MPI processes and efficiency is correct, but a bit of clarification might be necessary: the limitations on efficiency apply only to the linear algebra steps and the density matrix construction (specifically, the FDIK and PDIG steps) during the SCF procedure.

In the FDIK step we are essentially diagonalizing a set of Fock matrices (of size n × n, where n is the number of atomic orbitals), one for each k point. When using symmetry-adapted Bloch functions, these matrices become block-diagonal, and each block corresponds to an irreducible representation (irrep). These blocks can then be diagonalized independently (by different MPI processes).

    Diagonalization scales cubically with matrix size. However, for small systems, this time is usually negligible compared to other SCF steps. So in such cases, there's no need to worry too much about how many MPI processes you're using. 😊
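To put that cubic scaling in perspective: doubling the matrix size makes a single diagonalization roughly eight times more expensive,

$$ t_{\text{diag}}(2n) \approx 2^3 \, t_{\text{diag}}(n) = 8 \, t_{\text{diag}}(n) $$

which is why this step eventually dominates for large systems.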

    The other most time-consuming part is the computation of two-electron integrals (SHELLXN step), which scales almost linearly with system size and is not affected by the MPI-per-irrep limitation.

The limitation becomes relevant when dealing with large systems, where the irreps (i.e., the block sizes) are large. In that case, the time spent on FDIK and PDIG becomes significant. Since Pcrystal cannot assign more than one MPI process per irrep, any additional processes will stay idle during diagonalization, although they are still used in other parts of the calculation.

[Figure: timing of the SCF steps (FDIK, SHELLXN, NUMDFT) as a function of system size]

    Here, I've reported an example. You can see the blue line (FDIK, i.e., diagonalization) is negligible on the left (small systems) but becomes dominant on the right (larger systems). So the MPI limitation becomes noticeable only in that region.

    If you're on the right side of the plot, where the time spent in FDIK exceeds that of other steps like SHELLXN or NUMDFT (DFT integration), you should limit your number of MPI processes using the rule:

$$ \text{max MPI processes} = \sum_{\mathbf{k}} n_{\text{irreps},\mathbf{k}} \times n_{\text{spin}} $$

That is, the total number of processes should not exceed the sum of all irreps across all k points, multiplied by the number of spins (2 if you're doing a spin-polarized calculation, 1 for closed-shell).

    By default, the number of irreps is not printed in the output file. However, you can activate some advanced printing options to display this information. To do so, insert the following into the third block of your input file:

    KSYMMPRT
    SETPRINT
    1
    47 nk
    

Here, nk is the maximum number of k points for which you want to print symmetry information. Most likely, this will be the number of k points in your calculation.

You can find a description of this option in the CRYSTAL User Manual, on pages 117-118.

    The output will look like this:

     +++ SYMMETRY ADAPTION OF THE BLOCH FUNCTIONS +++
    
     SYMMETRY INFORMATION:
     K-LITTLE GROUP: CLASS TABLE, CHARACTER TABLE.
     IRREP-(DIMENSION, NO. IRREDUCIBLE SETS)
     (P, D, RP, RD, STAND FOR PAIRING, DOUBLING, REAL PAIRING AND REAL DOUBLING
     OF THE IRREPS (SEE MANUAL))
    
     K[   1] (  0  0  0)                                                    <------ 1st k point
    
     CLASS | GROUP OPERATORS (COMPLEX PHASE IN DEGREES)
     ------------------------------------------------------------------------
       2   |   2(   0.0);
       3   |   3(   0.0);   4(   0.0);
       4   |   5(   0.0);   6(   0.0);
       5   |   7(   0.0);   8(   0.0);   9(   0.0);
       6   |  10(   0.0);  12(   0.0);  11(   0.0);
    
     IRREP/CLA     1     2     3     4     5     6
     ---------------------------------------------
      MULTIP |     1     1     2     2     3     3
     ---------------------------------------------
          1  |  1.00  1.00  1.00  1.00  1.00  1.00
          2  |  1.00 -1.00  1.00 -1.00 -1.00  1.00
          3  |  2.00  2.00 -1.00 -1.00  0.00  0.00
          4  |  1.00  1.00  1.00  1.00 -1.00 -1.00
          5  |  2.00 -2.00 -1.00  1.00  0.00  0.00
          6  |  1.00 -1.00  1.00 -1.00  1.00 -1.00
    
       1-(1,  27);   2-(1,  25);   3-(2,  37);   4-(1,  23);   5-(2,  37);  <------ This is the information about
       6-(1,  25);                                                          <------ the irreps
    
     K[   2] (  1  0  0)                                                    <------ 2nd k point.
    
     CLASS | GROUP OPERATORS (COMPLEX PHASE IN DEGREES)
     ------------------------------------------------------------------------
       2   |  11(   0.0);
    
     IRREP/CLA     1     2
     ---------------------
      MULTIP |     1     1
     ---------------------
          1  |  1.00  1.00
          2  |  1.00 -1.00
    
       1-(1, 126);   2-(1, 122);                                             <------ irreps of 2nd k point
    ....
    

For example, here the first k point is adapted into 6 different irreps, while the second into 2.
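Plugging these numbers into the rule above, a closed-shell calculation that samples only these two k points would saturate at

$$ \text{max MPI processes} = (6 + 2) \times 1 = 8 $$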

Please note: this feature should be run using a single MPI process.


  • Is it necessary to add the keyword NOSYMADA when using a user-defined anisotropic k-point grid?

The default DETOT convergence threshold for the SCF is 1.0E-6 (at least for single point calculations; you can see all the other defaults on page 130 of the CRYSTAL User Manual). This means that as soon as DETOT drops below 1.0E-6, the code considers the system converged and stops. For technical reasons, however, the code performs one additional SCF cycle (actually, a half-cycle) after this point.

This explains the behavior you're seeing. At the 8th cycle, DETOT is 8.46E-07, which triggers the SCF to stop. But clearly, the system hasn't fully converged. I'm not entirely sure why this happens, but it's likely due to overly loose parameters (such as the thresholds for integral evaluation, DETOT, and SHRINK) combined with the DIIS convergence accelerator.

To tighten the SCF convergence threshold, you can use the TOLDEE keyword in the third input section. For example, you can set the DETOT threshold to 1.0E-8 like this:

    [...]
    SHRINK
    0 5
    1 5 5
    TOLINTEG
    7 7 7 7 14
    TOLDEE
    8
    END
    

    Another thing you can try is restarting the calculation using the previously obtained wavefunction as the initial guess. This will allow the code to continue the SCF after the ninth cycle and possibly move toward the correct solution. You can do this as follows:

    CZ_DBF
    EXTERNAL
    BASISSET
    SOLDEF2MSVP
    DFT
    HSESOL3C
    ENDDFT
    GUESSP
    SHRINK
    0 5
    1 5 5
    TOLINTEG
    7 7 7 7 14
    END
    

The important thing is to copy the wavefunction file produced by the first calculation (fort.9) into the scratch folder of the new one, renaming it fort.20.
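For example (the scratch paths here are just placeholders):

cp /scratch/job1/fort.9 /scratch/job2/fort.20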

As a general tip, it's always a good idea to check the convergence of the calculation using:

grep DETOT output_file

This can be very helpful for spotting issues and monitoring how the code is progressing during the calculation.


  • Is it necessary to add the keyword NOSYMADA when using a user-defined anisotropic k-point grid?

We ran some tests on the inclusion of the NOSYMADA keyword in combination with anisotropic shrinking factors, and I can confirm that we did not find any significant difference. Therefore, anisotropic shrinking is fully compatible with symmetry-adapted Bloch functions.

Taking a closer look at the data and input/output files you submitted, the convergence with respect to the shrinking factors appears to be very similar. Comparisons of calculations using the same SHRINK value consistently show differences well within the default tolerance (10^-6 Ha/cell), except for one value (highlighted in red in the screenshot), where the energy seems to be significantly off the trend. Could you please double-check that specific calculation?

[Screenshot: energy differences vs. SHRINK value, with one outlier highlighted in red]


  • Basis Sets for Spin-Orbit

There shouldn't be any problem in using POB basis sets alongside a basis set with an ECP for SOC.
    If you run into any issues, send the output file and I’ll take a closer look.


  • Basis Sets for Spin-Orbit

    Hi Ehsangowdini,
    Could you please share an example of your input file?


  • Is it necessary to add the keyword NOSYMADA when using a user-defined anisotropic k-point grid?

    Can you please share the input file?


  • Is it necessary to add the keyword NOSYMADA when using a user-defined anisotropic k-point grid?

    Hi Yachao_su,

From my experience (at least with CRYSTAL23), using anisotropic shrinking factors does not automatically enforce the NOSYMADA keyword. Thus, symmetry adaptation is still used unless you explicitly disable it.

    So, I believe the statement in the manual might be outdated.


  • Error in Projected DOS Atom Indices

    Hi Gryffindor,
In the properties input you submitted, the first value of the DOSS keyword is 4, but you are requesting 5 projections. Could you try running the same calculation using 5?

    DOSS
    5 400 500 900 1 12 0
    

  • Out of Memory Error during CPKS Calculation for Large System (400+ atoms)

    Hi Gryffindor,

    The only way to improve memory management without compromising the calculation parameters, as Alessandro correctly pointed out, would be to run with a lower number of MPI processes.

If you don’t want to "waste" CPU cores in the process, the OpenMP version of the code should be compatible with CPHF/CPKS. You just need to export OMP_NUM_THREADS according to the number of MPI processes used, ensuring that:

    CPU cores = MPI processes × OMP_NUM_THREADS

By doing so, you should be able to fully exploit all resources while managing memory more effectively. Some references can be found in the tutorial and in the CRYSTAL23 paper.
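For instance, on a 48-core node you could run 12 MPI processes with 4 OpenMP threads each. This is only a sketch: the name of the hybrid executable (here PcrystalOMP) is a placeholder that depends on your build.

export OMP_NUM_THREADS=4
mpirun -np 12 PcrystalOMP   # 12 MPI processes x 4 threads = 48 cores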

    Hope this helps!


  • R2SCAN not implemented with UHF or incompatible with EOS?

    Hi job314,

    I tried to run your input (without the RESTART keyword), and it seems to work fine.
Can you double-check the file used for the restart? Alternatively, can you try running the code without RESTART?

    Anyway, the proper way to perform a spin-polarized calculation in DFT is to use the SPIN keyword in the DFT block instead of UHF (but technically, both should work):

    [...]
    DFT
    SPIN
    R2SCAN
    XLGRID
    ENDdft
    [...]
    

  • Input MOF geometry problem

    Hi ywang,

    We managed to resolve the problem with your input file. The issue was that the system was not standardized according to the space group.

It should be possible to tell the program that the structure is not standard through a specific keyword. However, an alternative approach is to manually modify the CIF file. This can be done using VESTA; please refer to the image below.

[Image: VESTA settings used to standardize the structure]

The new input geometry looks like this:

    Fe_complex
    CRYSTAL
    0 0 0
    225
    26.7506
    6
    8  0.05353    0.18260    0.24189
    26 0.50000    0.21143    0.21143
    6  0.20297    0.20297    0.07018
    6  0.17805    0.17805    0.11438
    6  0.13562    0.13562    0.19929
    1  0.12115    0.12115    0.22832
    ENDgeom
    

Attachment: Fe.d12

    This has also reduced the number of symmetry-irreducible atoms, so the calculation should be a bit faster. 😉


  • Input MOF geometry problem

    Hi ywang,
    I managed to print the error for your input, which is:

    ERROR **** GROTA1 **** ERROR IN SYMMETRY EQUIVALENCE - CHECK INPUT COORDINATES
    
    

    Unfortunately, it is not very descriptive.

    It seems there may be an issue with the initial geometry. Could you share more details about your system? Do you have a .cif file from which you generated the CRYSTAL input?


  • Input MOF geometry problem

    Hi ywang,
If you are running the parallel version of CRYSTAL, could you try running it on a single processor?
Sometimes, when running in parallel, an error can terminate the job before the output is printed, especially during the input-reading phase.


  • GPU-Accelerated Linear Algebra for Large-Scale DFT with CRYSTAL

    Dear CRYSTAL community,

    We’re excited to share our recent work on accelerating linear algebra operations in the CRYSTAL code using GPUs. Our implementation boosts the performance of self-consistent field (SCF) calculations by offloading key matrix operations like multiplication, diagonalization, inversion, and Cholesky decomposition to GPUs.

    In the manuscript, we first analyze the performance and limitations of the standard parallel version of the code (Pcrystal) and then we evaluate the scalability of the new GPU-accelerated approach with 1 to 8 GPUs, observing remarkable scaling. To highlight these improvements, we present benchmark results on different systems, such as the example below.

[Figure: benchmark of GPU-accelerated SCF performance]

We expected significant speedups for large systems, where only a few k points are sampled but each requires substantial computational effort. To ensure a fair comparison, we ran calculations using the massively parallel version of CRYSTAL (MPPcrystal) on a large MOF structure with over 30000 basis functions. Surprisingly, a single GPU on one node performed comparably to 512–1024 CPU cores running across 4–8 nodes.

    To find out more, read the full paper here.

    We aim to make this GPU-accelerated version of CRYSTAL available in the upcoming release, allowing all users to benefit from its enhanced performance for large-scale simulations. We look forward to reading your thoughts and discussing potential applications or further improvements.

    A big thanks to Lorenzo Donà, Chiara Ribaldone, and Filippo Spiga for their contributions to the development of this code!


  • Confusion while creating 2D surface slab from 3D bulk using SLABCUT keyword

    Hi R.Zosiamliana,

    When using SLABCUT from a 3D system, or directly a SLAB input, CRYSTAL treats the resulting system as a lower-dimensional 2D structure. Specifically, the system retains periodicity only along two directions (conventionally in the x-y plane), while periodicity is inherently absent along the orthogonal direction (z).

Just to preserve the same output format as for a 3D calculation, an arbitrary value of 500 Å is printed in the output for the c lattice parameter, even though it has no physical meaning for such calculations. We understand this may be confusing, and we are considering removing it in future versions.

    To summarise, in CRYSTAL when running a 2D (or 1D, or 0D) calculation, there is no need to repeat the structure along the non-periodic direction and thus there is no need to define a vacuum.
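For reference, a minimal SLABCUT block could look like this (the Miller indices, the label of the surface layer, and the number of layers below are placeholders; adapt them to your system):

SLABCUT
1 1 1
1 3

No vacuum layer or artificial c parameter needs to be added on top of this.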


  • corrupted size vs. prev_size while consolidating

    Hi job314,
I moved the topic to the bug reports category.

    Could you kindly provide the input and output files for an example of these errors?

I assume you are using the parallel version of CRYSTAL23. Could you also specify the environment in which you are running the code? In particular, which MPI version you are using and whether you are linking libraries such as Intel MKL; if so, please include their versions as well.


  • cam-B3LYP with pobTZVP SCF convergence

    One thing worth trying before changing the basis set is increasing the TOLINTEG keyword values. This can be particularly important when dealing with hybrid functionals, especially CAM-B3LYP, as it has a high fraction of EXX at long range. Suggested values to start with could be 8 8 8 15 30. You can further increase these values if needed to improve SCF convergence, but make sure that the fifth threshold is at least double the fourth one.
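In the third input block, this would read:

TOLINTEG
8 8 8 15 30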

    If you want, you could also share the output files so I can take a closer look.
