Dear NGSolve Developers,
I am having issues with the direct solver on problems with ndof ~ 2-10 million. My problem involves an inhomogeneous Dirichlet BC, and I use the technique given in Sec. 1.3 of the documentation, namely:
Code:
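from ngsolve import *

# fes and ubar are defined earlier in the script
u, v = fes.TnT()
a = BilinearForm(fes, symmetric=True)
a += grad(u)*grad(v)*dx
f = LinearForm(fes)
gfu = GridFunction(fes)
# write the Dirichlet data into the boundary dofs of gfu
gfu.Set(ubar, definedon=BND)
#
with TaskManager():
    a.Assemble()
    f.Assemble()
    # residual of the boundary extension: res = f - A * gfu
    res = gfu.vec.CreateVector()
    res.data = f.vec - a.mat * gfu.vec
    # solve on the free dofs only and add the correction to gfu
    gfu.vec.data += a.mat.Inverse(freedofs=fes.FreeDofs(), inverse="sparsecholesky") * res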
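To make the residual-correction step above concrete, here is a tiny 1D analogue that needs no NGSolve at all. It is only a sketch: the helper names `solve_tridiagonal` and `dirichlet_lifting_1d` are made up for this illustration and are not part of NGSolve, and the model problem (-u'' = 0 on (0, 1) with linear elements on a uniform grid) is chosen purely because its exact solution is a straight line.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm; lower[0] and upper[-1] are ignored."""
    n = len(diag)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom if i < n - 1 else 0.0
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def dirichlet_lifting_1d(n, left, right):
    """Solve -u'' = 0 on (0, 1) with u(0) = left, u(1) = right on n >= 2 cells."""
    h = 1.0 / n
    # gfu carries the boundary values and zeros elsewhere,
    # playing the role of gfu after gfu.Set(ubar, ...) on the boundary
    gfu = [0.0] * (n + 1)
    gfu[0], gfu[n] = left, right
    f = [0.0] * (n + 1)  # zero right-hand side
    # res = f - A * gfu, needed only on the free (interior) dofs
    res = []
    for i in range(1, n):
        a_gfu = (-gfu[i - 1] + 2.0 * gfu[i] - gfu[i + 1]) / h
        res.append(f[i] - a_gfu)
    # stiffness matrix restricted to the free dofs: tridiag(-1, 2, -1) / h
    m = n - 1
    correction = solve_tridiagonal([-1.0 / h] * m, [2.0 / h] * m, [-1.0 / h] * m, res)
    # gfu += A_free^{-1} * res
    for i in range(1, n):
        gfu[i] += correction[i - 1]
    return gfu


print(dirichlet_lifting_1d(4, 1.0, 3.0))
```

The boundary entries of `gfu` are never touched by the solve, which is the point of restricting the inverse to the free dofs.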
The Slurm script I use to launch NGSolve:
Code:
#!/bin/bash
#SBATCH --job-name=ngs
#SBATCH -N 4
#SBATCH --ntasks 96
#SBATCH --ntasks-per-node=24
#SBATCH --ntasks-per-core=1
#SBATCH --mem=24gb
#Load ngsolve_mpi module
module load apps/ngsolve_mpi
mpirun ngspy script.py
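For reference, the memory budget implied by these directives can be worked out as below. This is a rough sketch that assumes `--mem` is the per-node limit (as described in the Slurm documentation) and that the tasks on a node share it evenly; the function name `memory_per_task_gb` is made up for this illustration.

```python
def memory_per_task_gb(mem_per_node_gb, tasks_per_node):
    """Even share of the per-node memory limit for each MPI task."""
    return mem_per_node_gb / tasks_per_node


# Values copied from the #SBATCH lines above
nodes = 4
tasks_per_node = 24
mem_per_node_gb = 24  # assumption: --mem=24gb applies per node

print("total tasks:", nodes * tasks_per_node)
print("GB per task:", memory_per_task_gb(mem_per_node_gb, tasks_per_node))
```

Under these assumptions, each of the 96 tasks has about 1 GB available.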
However, the Slurm job returns the following message for one of the nodes:
Code:
-------------------------------------------------------
Primary job terminated normally, but 1 process returned
a non-zero exit code.. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
A system call failed during shared memory initialization that should
not have. It is likely that your MPI job will now either abort or
experience performance degradation.
Local host: node16
System call: unlink(2) /tmp/openmpi-sessions-1608000011@node16_0/19664/1/2/vader_segment.node16.2
Error: No such file or directory (errno 2)
--------------------------------------------------------------------------
The exact same code runs fine when the system is smaller. The errors appear only when I use a refined mesh for the same geometry. I have tried refining both outside NGSolve (refining in Gmsh and then reading the refined mesh into NGSolve) and inside (reading a coarse mesh and then refining it with NGSolve's refine), but both end in errors. Could you comment on the likely cause of the problem? Thank you in advance for your help.