cluster installation issue (again)
6 years 3 months ago #715
by lkogler
Replied by lkogler on topic cluster installation issue (again)
Hi Guosheng,
Sorry to hear you are running into issues again. Let's see if we can resolve this.
Could you send me a gdb backtrace for a simple .pde file?
For example, execute "d1_square.pde" from the pde_tutorials with
Code:
mpirun -np 5 bash .wrap_mpirun gdb -batch -ex "run" -ex bt --args ngs d1_square.pde
".wrap_mpirun" should be something like this (with OpenMPI), and just pipes the output to seperate files
for all mpi-ranks:
Code:
#!/bin/sh
# Run the wrapped command, sending stdout and stderr to a per-rank file
exec "$@" >"out_p$OMPI_COMM_WORLD_RANK" 2>&1
Also, please send me the output of:
Code:
which ngspy | xargs cat
Code:
which ngs | xargs ldd
Also, could you try to import netgen in python and see if that crashes too?
Finally, your CMakeCache.txt, the cmake command you used, and the cmake output would be useful.
In principle, the tests are supposed to work with MPI! In this case, of course, they all fail because something is wrong with the NGSolve libraries.
While this is probably not the issue you are running into in this case, the tests might still fail on the cluster because you might not be allowed to run any MPI computations on the login node. If that is the issue, you have to go through the batch system (e.g. by switching to an interactive session and then running the tests as usual).
Best,
Lukas
6 years 3 months ago #716
by Guosheng Fu
Replied by Guosheng Fu on topic cluster installation issue (again)
Lukas,
Here is my cmake file:
Code:
cmake \
-DUSE_UMFPACK=OFF \
-DCMAKE_PREFIX_PATH=/users/gfu1/data/ngsolve-install-plain \
-DCMAKE_BUILD_TYPE=Release \
-DINSTALL_DIR=/users/gfu1/data/ngsolve-install-plain \
-DUSE_GUI=OFF \
-DUSE_MPI=OFF \
-DUSE_MUMPS=OFF \
-DUSE_HYPRE=OFF \
-DUSE_MKL=ON \
-DMKL_ROOT=/gpfs/runtime/opt/intel/2017.0/mkl \
-DZLIB_INCLUDE_DIR=/gpfs/runtime/opt/zlib/1.2.8/ \
-DZLIB_LIBRARY=/gpfs/runtime/opt/zlib/1.2.8/lib/libz.so \
-DMKL_SDL=OFF \
-DCMAKE_CXX_COMPILER=/gpfs/runtime/opt/gcc/5.2.0/bin/g++ \
-DCMAKE_C_COMPILER=/gpfs/runtime/opt/gcc/5.2.0/bin/gcc \
../ngsolve-src
I turned off MPI.
Attached is the CMakeCache.txt from the build directory.
I am doing everything on a computing node via an interactive session.
which ngs | xargs ldd gives the following:
Code:
linux-vdso.so.1 => (0x00007fff643ff000)
/usr/local/lib/libslurm.so (0x00007f7bf293f000)
libsolve.so => /users/gfu1/data/ngsolve-install-plain/lib/libsolve.so (0x00007f7bf2644000)
libngcomp.so => /users/gfu1/data/ngsolve-install-plain/lib/libngcomp.so (0x00007f7bf1adc000)
libngfem.so => /users/gfu1/data/ngsolve-install-plain/lib/libngfem.so (0x00007f7bf0523000)
libngla.so => /users/gfu1/data/ngsolve-install-plain/lib/libngla.so (0x00007f7befdc6000)
libngbla.so => /users/gfu1/data/ngsolve-install-plain/lib/libngbla.so (0x00007f7befadb000)
libngstd.so => /users/gfu1/data/ngsolve-install-plain/lib/libngstd.so (0x00007f7bef76c000)
libnglib.so => /users/gfu1/data/ngsolve-install-plain/lib/libnglib.so (0x00007f7bef562000)
libinterface.so => /users/gfu1/data/ngsolve-install-plain/lib/libinterface.so (0x00007f7bef308000)
libstl.so => /users/gfu1/data/ngsolve-install-plain/lib/libstl.so (0x00007f7bef07b000)
libgeom2d.so => /users/gfu1/data/ngsolve-install-plain/lib/libgeom2d.so (0x00007f7beee39000)
libcsg.so => /users/gfu1/data/ngsolve-install-plain/lib/libcsg.so (0x00007f7beeb33000)
libmesh.so => /users/gfu1/data/ngsolve-install-plain/lib/libmesh.so (0x00007f7bee623000)
libz.so.1 => /gpfs/runtime/opt/zlib/1.2.8/lib/libz.so.1 (0x00007f7bee40d000)
libvisual.so => /users/gfu1/data/ngsolve-install-plain/lib/libvisual.so (0x00007f7bee20c000)
libpython3.6m.so.1.0 => /gpfs/runtime/opt/python/3.6.1/lib/libpython3.6m.so.1.0 (0x00007f7bedd04000)
/gpfs/runtime/opt/intel/2017.0/mkl/lib/intel64/libmkl_intel_lp64.so (0x00007f7bed1e6000)
/gpfs/runtime/opt/intel/2017.0/mkl/lib/intel64/libmkl_gnu_thread.so (0x00007f7bec01a000)
/gpfs/runtime/opt/intel/2017.0/mkl/lib/intel64/libmkl_core.so (0x00007f7bea52a000)
libdl.so.2 => /lib64/libdl.so.2 (0x00007f7bea31e000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f7bea101000)
libstdc++.so.6 => /gpfs/runtime/opt/gcc/5.2.0/lib64/libstdc++.so.6 (0x00007f7be9d73000)
libm.so.6 => /lib64/libm.so.6 (0x00007f7be9aef000)
libgomp.so.1 => /gpfs/runtime/opt/gcc/5.2.0/lib64/libgomp.so.1 (0x00007f7be98ce000)
libgcc_s.so.1 => /gpfs/runtime/opt/gcc/5.2.0/lib64/libgcc_s.so.1 (0x00007f7be96b7000)
libc.so.6 => /lib64/libc.so.6 (0x00007f7be9323000)
/lib64/ld-linux-x86-64.so.2 (0x00007f7bf2cf6000)
libutil.so.1 => /lib64/libutil.so.1 (0x00007f7be911f000)
librt.so.1 => /lib64/librt.so.1 (0x00007f7be8f16000)
I do not have ngspy in my $NETGENDIR directory; only
ngs, ngscxx, ngsld
are available.
In my laptop version of ngsolve, I also do not have ngspy.
Best,
Guosheng
6 years 3 months ago #717
by Guosheng Fu
Replied by Guosheng Fu on topic cluster installation issue (again)
I have now done a complete rebuild of NGSolve.
The segmentation fault is gone, surprise!
But now I have an MKL issue (this bug is much friendlier :>):
Code:
Intel MKL FATAL ERROR: Cannot load libmkl_avx.so or libmkl_def.so.
In my application, I need a hybrid Poisson solver, so I use the static condensation approach in the implementation and apply a sparse Cholesky factorization to the resulting hybrid matrix.
It is this line that causes the code to crash:
Code:
inva = av.mat.Inverse(fes.FreeDofs(coupling=True), inverse="sparsecholesky")
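For readers unfamiliar with the technique mentioned above, here is a minimal NumPy sketch (illustrative only, not NGSolve code; matrix, partition sizes, and variable names are made up) of what static condensation followed by a Cholesky solve does algebraically: interior unknowns are eliminated via a Schur complement, the condensed system is factorized, and the interior unknowns are recovered afterwards.

```python
import numpy as np

rng = np.random.default_rng(0)

# Build a small symmetric positive definite system A x = b.
n, n_int = 6, 3                      # total DOFs, interior DOFs
M = rng.standard_normal((n, n))
A = M @ M.T + n * np.eye(n)          # SPD by construction
b = rng.standard_normal(n)

# Partition into interior (i) and coupling (c) blocks.
Aii, Aic = A[:n_int, :n_int], A[:n_int, n_int:]
Aci, Acc = A[n_int:, :n_int], A[n_int:, n_int:]
bi, bc = b[:n_int], b[n_int:]

# Static condensation: Schur complement acting on coupling DOFs only.
S = Acc - Aci @ np.linalg.solve(Aii, Aic)
g = bc - Aci @ np.linalg.solve(Aii, bi)

# Solve the condensed system via Cholesky (S is SPD when A is SPD).
L = np.linalg.cholesky(S)            # S = L L^T
xc = np.linalg.solve(L.T, np.linalg.solve(L, g))

# Recover the interior unknowns by back-substitution.
xi = np.linalg.solve(Aii, bi - Aic @ xc)
x = np.concatenate([xi, xc])

# The condensed solve reproduces the direct solve.
assert np.allclose(x, np.linalg.solve(A, b))
```

In NGSolve the condensed matrix and the elimination are handled internally; the `Inverse(..., inverse="sparsecholesky")` call above plays the role of the Cholesky step in this sketch.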
I recall that I had another build with MKL turned off, which caused the seg fault before, but now I am not completely sure...
Best,
Guosheng
6 years 3 months ago #718
by lkogler
Replied by lkogler on topic cluster installation issue (again)
If you have not enabled MPI, that makes things less complicated. You don't need ngspy in that case
(that only exists because we had some issues with linking MKL libraries and MPI on certain systems).
You can simply run
Code:
gdb -ex run --args python3 poisson.py
or something similar.
6 years 3 months ago #719
by lkogler
Replied by lkogler on topic cluster installation issue (again)
Try turning MKL_SDL on:
-DMKL_SDL=ON
6 years 3 months ago #720
by Guosheng Fu
Replied by Guosheng Fu on topic cluster installation issue (again)
Ha, with
-DMKL_SDL=ON
the installation works!
This should save a lot of my computing time, thank you guys! (Hopefully everything works...)