Hi,
The CPKS implementation in PCRYSTAL is parallelized according to a replicated-data strategy, which means that each process has a copy of the main arrays. Assume that each process (usually coinciding with each CPU core) allocates X GB of memory. If you run the job over n CPU cores in the same node, the total amount of required memory on the node will be nX.
So, one way to reduce the memory requirement of a replicated-data calculation is to use fewer CPU cores per node.
As an example, if you have a node with 128 CPU cores, try running the calculation on just 64 or 32 cores, making sure no other processes are running on the node at the same time. The total memory required on the node then drops from 128X to 64X or 32X, which effectively increases the memory available to each process.
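To make the arithmetic concrete, here is a minimal Python sketch of the nX estimate. The function names and the figures in the usage lines (256 GB of node memory, 4 GB per process) are hypothetical and only for illustration; take the real per-process figure X from your own job and the node memory from your cluster documentation.

```python
def total_node_memory_gb(per_process_gb: float, processes_per_node: int) -> float:
    """Memory needed on one node for a replicated-data run: n * X."""
    return per_process_gb * processes_per_node


def max_processes_per_node(node_memory_gb: float,
                           per_process_gb: float,
                           cores_per_node: int) -> int:
    """Largest number of processes that fits in the node memory,
    capped at the number of physical cores."""
    if per_process_gb <= 0:
        raise ValueError("per_process_gb must be positive")
    fits_in_memory = int(node_memory_gb // per_process_gb)
    return min(cores_per_node, fits_in_memory)


# Hypothetical numbers: 128-core node, 256 GB of RAM, X = 4 GB per process.
needed_full_node = total_node_memory_gb(4.0, 128)   # memory for 128 processes
usable_processes = max_processes_per_node(256.0, 4.0, 128)
print(f"Full node needs {needed_full_node} GB; "
      f"at most {usable_processes} processes fit in 256 GB.")
```

With these made-up numbers a full 128-process run would need twice the memory the node has, while 64 processes fit. In practice, leave some headroom for the operating system rather than filling the node exactly.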
Hope this helps.