Stop the run cleanly when the time limit is about to be reached on a cluster
Hello, I wonder if it would be possible for the code to keep track of the total integration wall-clock time, so that the run can be stopped preemptively when the Slurm time limit is about to be reached. This would prevent the process from being killed by Slurm, for example in the middle of writing large diskN.h5 files or other outputs. It has happened to me that a run was terminated at the time limit, which leads to duplicated entries when the job is restarted and the same step is repeated. This makes the data extremely hard to post-process. One could use the maximum time spent on an integration step, or the average over the last X steps, as a proxy for the cost of the next step (see the sketch below). But better approaches probably exist...
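
To illustrate the idea, here is a minimal sketch in Python. Everything in it is hypothetical: `step()`, `write_checkpoint()`, and the `SLURM_WALL_LIMIT_SECONDS` variable are placeholders standing in for the code's actual integration loop, checkpoint writer, and wall-limit configuration. It uses the "maximum recent step time" heuristic from above, scaled by a safety factor, to decide when to stop.

```python
import os
import time
from collections import deque

def step():
    """Placeholder for one integration step of the actual code."""
    time.sleep(0.01)

def write_checkpoint():
    """Placeholder for a clean flush of diskN.h5 files and other outputs."""
    print("checkpoint written, exiting cleanly before the Slurm limit")

# Wall-clock budget for the whole job; how it is obtained is an assumption.
wall_limit = float(os.environ.get("SLURM_WALL_LIMIT_SECONDS", 60))
safety_factor = 1.5          # headroom: assume the next step may cost 1.5x the recent max
recent = deque(maxlen=50)    # wall-clock durations of the last 50 steps

start = time.monotonic()
while True:
    t0 = time.monotonic()
    step()
    recent.append(time.monotonic() - t0)

    elapsed = time.monotonic() - start
    # Proxy for the cost of one more step plus a checkpoint: the worst recent
    # step time, scaled by the safety factor.
    if wall_limit - elapsed < safety_factor * max(recent):
        write_checkpoint()
        break
```

A complementary, scheduler-side option: Slurm can deliver a warning signal a fixed time before the limit (e.g. `#SBATCH --signal=B:USR1@600` sends SIGUSR1 to the batch script 600 seconds before timeout), which the code could catch to trigger the same clean checkpoint-and-exit path without having to estimate step times itself.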