MPI related questions
5 years 7 months ago #1526
by Guosheng Fu
MPI related questions was created by Guosheng Fu
Hello,
I am now starting to play with MPI in ngsolve and have a couple of questions:
(1) For my installation, I asked ngsolve to download and build metis, mumps, and hypre for me. While it succeeded in building all the packages, I cannot get mumps working. In the final linking phase, my libngla.so has some undefined references...
../linalg/libngla.so: undefined reference to `blacs_gridinfo_'
../linalg/libngla.so: undefined reference to `pzpotrf_'
So, in my working installation, I have to turn off mumps. This is always the most painful part due to my limited C++ experience... Do you see an immediate fix for this? I can provide more details about my build if you want...
(2) In ngsolve.org/docu/latest/how_to/howto_parallel.html , it says the taskmanager allows for hybrid parallelization. I am assuming this is MPI+OpenMP, so how do I do this hybrid parallelization? I tried to run
mpirun -n 4 ngspy mpi_poisson.py
with SetNumThreads( in the code, but it didn't work...
(3) Again in ngsolve.org/docu/latest/how_to/howto_parallel.html , it says MPI does not support periodic boundaries yet :<
But in the git repository, there is a quite recent commit on mpi+periodic... have you been working on this issue recently?
Best always,
Guosheng
5 years 7 months ago - 5 years 7 months ago #1527
by lkogler
Replied by lkogler on topic MPI related questions
1) I think the problem here is the loading of the BLACS/ScaLAPACK libraries.
You can:
- Use "ngspy" instead of python3
- Set these libraries in LD_PRELOAD (have a look at the "ngspy" file in the NGSolve bin directory)
- In your python scripts, before importing ngsolve:
Code:
from ctypes import CDLL, RTLD_GLOBAL
for lib in THE_LIBRARIES_YOU_NEED:
    CDLL(lib, RTLD_GLOBAL)
2) MPI & C++11 threads. It should work like this:
Code:
ngsglobals.numthreads=X
Assembling/applying of BLFs will be hybrid parallel, but most of the solvers/preconditioners will still be MPI-only.
3) It should work. Keep in mind that your mesh must contain surface and BBND elements (this is only an issue with manually generated meshes). Please contact me if you run into any problems with this.
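For example, putting the third option together with the thread setting, a minimal sketch (the library paths are placeholders for whatever BLACS/ScaLAPACK shared objects your build actually links against, e.g. the MKL ones):
Code:
# minimal sketch: preload BLACS/ScaLAPACK before ngsolve is imported,
# then set the thread count per MPI rank for hybrid assembly.
# The paths below are placeholders for your installation.
from ctypes import CDLL, RTLD_GLOBAL

for lib in ["/path/to/mkl/libmkl_blacs_intelmpi_lp64.so",
            "/path/to/mkl/libmkl_scalapack_lp64.so"]:
    CDLL(lib, RTLD_GLOBAL)

from ngsolve import *
ngsglobals.numthreads = 2   # threads per MPI rank
With the libraries preloaded this way, a plain "mpirun -n 4 python3 mpi_poisson.py" should also work, without going through ngspy.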
Last edit: 5 years 7 months ago by lkogler.
5 years 7 months ago #1528
by Guosheng Fu
Replied by Guosheng Fu on topic MPI related questions
I still can't get mumps working...
Here are the details of my build; maybe you can help me find the bug:
(1) I have a local gcc-8.1, python3, and mpich installed
(2) I am using the Intel MKL library for Lapack/Blas
do-configure.txt contains my cmake details,
c.txt is the output of running "./do-configure",
m.txt is the output of running "make VERBOSE=1", which produces the error message at the final linking stage (lines 4744-4756)
Thanks!
Attachments:
5 years 7 months ago #1529
by Guosheng Fu
Replied by Guosheng Fu on topic MPI related questions
Wait, this is exactly the same error I encountered two years ago (when I was installing on another machine):
ngsolve.org/forum/ngspy-forum/11-install...root-access?start=24
It was the MKL library issue, and I got it fixed by using a static library. -DMKL_STATIC=ON lol
But then I ran into an issue with MUMPS. In the demo code
Code:
mpi_poisson.py
I refined the mesh twice with
Code:
ngmesh.Refine()
to make the problem bigger; then the mumps solver failed to factorize the matrix and exited with a segmentation fault..... hypre and masterinverse are working fine...
Is it a bug, or is there still something wrong with my installation?
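To be explicit, the change is just two Refine() calls on the netgen mesh before solving; a self-contained sketch of that kind of call (with a stand-in unit-square geometry, not the demo's own mesh):
Code:
# sketch only: refine a netgen mesh twice to make the problem bigger;
# the demo script does the same on its own mesh ("ngmesh")
from netgen.geom2d import unit_square

ngmesh = unit_square.GenerateMesh(maxh=0.2)
for _ in range(2):
    ngmesh.Refine()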
5 years 7 months ago - 5 years 7 months ago #1533
by lkogler
Replied by lkogler on topic MPI related questions
Did it crash or did it terminate with an error message? I am looking into it.
Last edit: 5 years 7 months ago by lkogler.
5 years 7 months ago #1534
by Guosheng Fu
Replied by Guosheng Fu on topic MPI related questions
For the mpi_poisson.py file (with two mesh refinements), I run
Code:
mpirun -n X ngspy mpi_poisson.py
The code works fine if I take X to be 1 or 2, converging in 1 iteration.
But it generates the following seg fault if I take X to be 3:
Code:
Update Direct Solver PreconditionerMumps Parallel inverse, symmetric = 0
analysis ... factor ... /afs/crc.nd.edu/user/g/gfu/NG/ngsolve-install-mpi/bin/ngspy: line 2: 32379 Segmentation fault LD_PRELOAD=$LD_PRELOAD:/afs/crc.nd.edu/user/g/gfu/NG/mpich-inst/lib/libmpi.so:/opt/crc/i/intel/19.0/mkl/lib/intel64/libmkl_core.so:/opt/crc/i/intel/19.0/mkl/lib/intel64/libmkl_gnu_thread.so:/opt/crc/i/intel/19.0/mkl/lib/intel64/libmkl_intel_lp64.so:/opt/crc/i/intel/19.0/mkl/lib/intel64/libmkl_blacs_intelmpi_lp64.so:/usr/lib64/libgomp.so.1 /afs/crc.nd.edu/user/g/gfu/NG/PY3-mpi/bin/python3 $*
And it generates the following message if I take X to be 4:
Code:
Update Direct Solver PreconditionerMumps Parallel inverse, symmetric = 0
analysis ... factor ... 2 :INTERNAL Error: recvd root arrowhead
2 :not belonging to me. IARR,JARR= -41598 13
2 :IROW_GRID,JCOL_GRID= 0 0
2 :MYROW, MYCOL= 0 2
2 :IPOSROOT,JPOSROOT= 10982 0
application called MPI_Abort(MPI_COMM_WORLD, -99) - process 3
But mumps is working fine with a smaller system when I only do one mesh refinement...
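For reference, the masterinverse run just swaps the inverse type in the solver selection; a sketch of that variant ("a" and "fes" are assumed names standing in for the demo's bilinear form and FE space, and the exact call in the demo may differ):
Code:
# sketch: choose masterinverse instead of mumps for the parallel direct solve
# ("a" and "fes" are assumed names, not necessarily those of the demo script)
inv = a.mat.Inverse(fes.FreeDofs(), inverse="masterinverse")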