Out of Memory Error during CPKS Calculation for Large System (400+ atoms)

  • Gryffindor
    #1

    Hi,

    I'm trying to run a CPKS calculation on a system with 400+ atoms, but the job keeps getting aborted with out-of-memory (OOM) errors. I'm already requesting 200 GB per job, which is the maximum allowed on our cluster, and the calculation still runs out of memory.

    Is there anything I can do to reduce the memory load during the CPKS step (e.g., approximations or splitting into smaller jobs)?
    I tried using the LOWMEM keyword, but it did not help.
    INPUT.dat

    OUTPUT.dat

    • aerba (Developer)
      #2

      Hi,

      The CPKS implementation in PCRYSTAL is parallelized with a replicated-data strategy, which means that each process holds its own copy of the main arrays. Assume that each process (usually one per CPU core) allocates X GB of memory: if you run the job on n CPU cores of the same node, the total memory required on that node is n × X GB.

      So, one way to reduce the memory requirement of a replicated-data calculation is to lower the number of CPU cores used per node.

      As an example, if you have a node with 128 CPU cores, try running the calculation on just 64 or 32 cores, making sure no other processes run on the node at the same time. This effectively increases the memory available to each process.
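
      As a rough sketch (assuming a Slurm scheduler, a 128-core node, and that the parallel binary is launched as Pcrystal; adapt the directives, names, and launch line to your own installation), under-populating the node could look like this:

        #!/bin/bash
        #SBATCH --nodes=1
        #SBATCH --ntasks-per-node=32   # run only 32 MPI processes on the 128-core node
        #SBATCH --exclusive            # keep the rest of the node free for this job
        #SBATCH --mem=0                # request all of the node's memory
        #SBATCH --time=24:00:00

        # Each process still allocates its X GB of replicated arrays,
        # but only 32 copies now share the node's memory instead of 128.
        mpirun -np 32 Pcrystal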

      Hope this helps.

      Alessandro Erba
      Professor of Physical Chemistry
      Department of Chemistry, University of Torino
      [email protected]

      • GiacomoAmbrogio (Developer)
        #3

        Hi Gryffindor,

        As Alessandro correctly pointed out, the only way to reduce the memory footprint without compromising the calculation parameters is to run with fewer MPI processes.

        If you don’t want to "waste" CPU cores in the process, the OpenMP version of the code should be compatible with CPHF/CPKS. You just need to export OMP_NUM_THREADS according to the number of MPI processes used, ensuring that:

        CPU cores = MPI processes × OMP_NUM_THREADS

        By doing so, you should be able to fully exploit all the resources while making better use of memory. Some references can be found in the tutorial and in the CRYSTAL23 paper.
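
        As a concrete sketch (assuming Slurm and a 128-core node, as in Alessandro's example; the executable name Pcrystal and the launch line are placeholders, so check which binary of your build was compiled with OpenMP support):

          #!/bin/bash
          #SBATCH --nodes=1
          #SBATCH --ntasks-per-node=32   # 32 MPI processes = 32 replicated copies of the arrays
          #SBATCH --cpus-per-task=4      # 4 OpenMP threads per MPI process
          #SBATCH --exclusive

          # 32 MPI processes x 4 threads = 128 cores: every core is busy,
          # but the replicated-data memory cost is that of 32 processes only.
          export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK}

          srun --cpus-per-task=${SLURM_CPUS_PER_TASK} Pcrystal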

        Hope this helps!

        Giacomo Ambrogio, PhD Student
        Department of Chemistry - University of Torino
        V. Giuria 5, 10125 Torino (Italy)

        • Gryffindor
          #4

          Hi Alessandro and Giacomo,

          Thank you for the clear and helpful explanations!

          I tried running the calculation with 32 cores, and it indeed helped with memory management. I’ll continue experimenting to optimize performance. Really appreciate your guidance and the references!
