Kokkos unable to allocate enough memory
Hello,
I'm having an issue where I cannot run simulations at high resolution because I get a memory allocation error:
terminate called after throwing an instance of 'std::runtime_error'
what(): Kokkos failed to allocate memory for label "grid_temp_3D_7". Allocation using MemorySpace named "Cuda" failed with the following error: Allocation of size 1.465 G failed, likely due to insufficient memory. (The allocation mechanism was cudaMalloc(). The Cuda allocation returned the error code ""cudaErrorMemoryAllocation".)
Each GPU on the cluster in question has 40 GB of memory available, so an "Allocation of size 1.465 G failed" error should not be due to an actual memory limit.
I've asked the helpdesk of my cluster and they've replied: "Kokkos is unable to allocate enough memory. How much memory do you allocate on GPU? If you are unsure, you can track Kokkos memory allocations with Kokkos Tools https://github.com/kokkos/kokkos-tools/tree/master/profiling".
I'm not sure whether I need to write this into the fargOCA code somehow, or whether it's enough to run the tracker after the simulation has finished (probably the former). Or maybe you already have something that takes care of this?
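From a quick look at the Kokkos Tools README, it seems the tools may be loaded at run time through the KOKKOS_TOOLS_LIBS environment variable rather than compiled into the application. If that is right, something like the sketch below might be all that is needed. I have not tried it. The install path and the executable name are placeholders, and I'm assuming the memory-events tool builds as kp_memory_events.so.

```shell
# Placeholder path: wherever the Kokkos Tools libraries were built or installed.
TOOLS_DIR=/path/to/kokkos-tools/lib

# Ask Kokkos to load the memory-events tool when the application starts.
# (Older Kokkos versions used KOKKOS_PROFILE_LIBRARY for the same purpose.)
export KOKKOS_TOOLS_LIBS="${TOOLS_DIR}/kp_memory_events.so"

# Placeholder command: run the simulation as usual; the tool is expected to
# write its allocation records when the run finishes.
# ./my_simulation <usual arguments>
```

If the allocation records show many "grid_temp_3D_*" arrays of about 1.465 G each, that would explain how the total gets close to 40 GB even though the single failing allocation is small.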