We have moved the forum to https://forum.ngsolve.org . This is an archived version of the topics until 05/05/23. All the topics were moved to the new forum and conversations can be continued there. This forum is just kept as legacy to not invalidate old links. If you want to continue a conversation just look for the topic in the new forum.

Open MPI question

4 years 11 months ago #1555 by ddrake
Open MPI question was created by ddrake
Hi,

I have installed libopenmpi2 2.1.1-8 and libopenmpi-dev 2.1.1-8 on Ubuntu 18.04. I then built NGSolve from source with USE_MPI=ON.

The py_tutorial examples all work for me, in that I can run the MPI examples like this:
Code:
mpirun -np 5 ngspy mpi_poisson.py
I can also run the non-MPI examples like this:
Code:
netgen poisson.py
But I can no longer run the non-MPI examples this way:
Code:
python3 poisson.py
If I try, I get an error like this:

importing NGSolve-6.2.1902-107-g5aa0a3e4
[dow-HP-Notebook:18549] mca_base_component_repository_open: unable to open mca_patcher_overwrite: /usr/lib/x86_64-linux-gnu/openmpi/lib/openmpi/mca_patcher_overwrite.so: undefined symbol: mca_patcher_base_patch_t_class (ignored)
[dow-HP-Notebook:18549] mca_base_component_repository_open: unable to open mca_shmem_mmap: /usr/lib/x86_64-linux-gnu/openmpi/lib/openmpi/mca_shmem_mmap.so: undefined symbol: opal_show_help (ignored)
[dow-HP-Notebook:18549] mca_base_component_repository_open: unable to open mca_shmem_posix: /usr/lib/x86_64-linux-gnu/openmpi/lib/openmpi/mca_shmem_posix.so: undefined symbol: opal_shmem_base_framework (ignored)
[dow-HP-Notebook:18549] mca_base_component_repository_open: unable to open mca_shmem_sysv: /usr/lib/x86_64-linux-gnu/openmpi/lib/openmpi/mca_shmem_sysv.so: undefined symbol: opal_show_help (ignored)


It looks like opal_init failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during opal_init; some of which are due to configuration or
environment problems. This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

opal_shmem_base_select failed
--> Returned value -1 instead of OPAL_SUCCESS


It looks like orte_init failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems. This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

opal_init failed
--> Returned value Error (-1) instead of ORTE_SUCCESS


It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems. This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):

ompi_mpi_init: ompi_rte_init failed
--> Returned "Error" (-1) instead of "Success" (0)

*** An error occurred in MPI_Init
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
*** and potentially your MPI job)
[dow-HP-Notebook:18549] Local abort before MPI_INIT completed completed successfully, but am not able to aggregate error messages, and not able to guarantee that all other processes were killed!

This is a problem when I try to run code that uses a custom shared library built on the NGSolve libraries, but which is not designed for Open MPI and doesn't need the Netgen GUI.

Is there a way to build my custom library without the MPI dependencies, so that it can still be used by code run directly from python3? Or is it recommended to keep two builds of NGSolve, one parallel and one serial?

Thanks!
Dow
4 years 11 months ago #1556 by Guosheng Fu
Replied by Guosheng Fu on topic Open MPI question
It might be a library-loading issue.
You can either:
Use "ngspy" instead of python3, or
Preload these libraries via LD_PRELOAD (have a look at the "ngspy" file in the NGSolve bin directory)
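As a rough illustration of the second suggestion, here is a minimal sketch that makes the MPI library's symbols globally visible from inside Python before NGSolve is imported. Note that this swaps in a different mechanism than the one named above: it loads the library with ctypes and RTLD_GLOBAL rather than setting a preload variable in the shell. The helper name preload_global and the library name "mpi" are assumptions for illustration, and whether this resolves the error for a particular build is untested.

```python
import ctypes
import ctypes.util


def preload_global(name="mpi"):
    """Try to load a shared library with RTLD_GLOBAL.

    Returns the name resolved by find_library, or None if the library
    could not be found or loaded. (Hypothetical helper, not part of NGSolve.)
    """
    resolved = ctypes.util.find_library(name)  # None if nothing matches
    if resolved is None:
        return None
    try:
        # RTLD_GLOBAL exposes the library's symbols to libraries loaded later,
        # such as the Open MPI plugin modules named in the error output.
        ctypes.CDLL(resolved, mode=ctypes.RTLD_GLOBAL)
    except OSError:
        return None
    return resolved


# Intended use, at the very top of a script such as poisson.py:
#   preload_global("mpi")
#   from ngsolve import *
```

The preload has to happen before the first NGSolve import; once the MPI library has been loaded with local visibility, loading it again may not change how its symbols are resolved.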
4 years 11 months ago #1558 by lkogler
Replied by lkogler on topic Open MPI question
Yes, that seems to be the issue. I am working on correcting the link order.
4 years 11 months ago #1563 by lkogler
Replied by lkogler on topic Open MPI question
Also, this does not seem to happen with newer versions of Open MPI.
It does not happen with versions 3.1.2 or 4.0.
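To check which side of that line an installation falls on, a small sketch like the following could parse the version string. It assumes the text comes from running mpirun --version, which prints a line of the form "mpirun (Open MPI) 2.1.1". The function names are hypothetical, and the 3.1.2 threshold is only the oldest version reported as working in this thread, not a confirmed boundary.

```python
import re


def parse_openmpi_version(text):
    """Extract (major, minor, patch) from text like 'mpirun (Open MPI) 2.1.1'.

    Returns None if no Open MPI version string is found.
    """
    match = re.search(r"Open MPI\)?\s+v?(\d+)\.(\d+)(?:\.(\d+))?", text)
    if match is None:
        return None
    major, minor, patch = match.groups()
    return int(major), int(minor), int(patch or 0)


def older_than_reported_good(version, reported_good=(3, 1, 2)):
    """True if version is older than the oldest version reported as working here."""
    return version < reported_good


# Intended use (assumes mpirun is on the PATH):
#   import subprocess
#   out = subprocess.run(["mpirun", "--version"],
#                        capture_output=True, text=True).stdout
#   version = parse_openmpi_version(out)
```

A result of True only means the installation predates the versions tested in this thread; versions between 2.1.1 and 3.1.2 were not reported on.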