Building NGSolve with HYPRE support

1 year 8 months ago #2347 by JanWesterdiep
Hey Lukas,

I don't *think* this Ubuntu machine has AVX512, and I *think* this machine is running gcc 7.4.0.

So you *are* able to run the attached file? That's super strange, haha.

I'm assuming you meant to type `precond_test.py` here; in any case, I managed to set a catchpoint on the `bad_cast`. The stack trace is:
Setup Hypre preconditioner

Thread 1 "python3" hit Catchpoint 2 (exception thrown), 0x00007fffee3a4ced in __cxa_throw () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
(gdb) bt
#0  0x00007fffee3a4ced in __cxa_throw () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#1  0x00007fffee3a3a52 in __cxa_bad_cast () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#2  0x00007fffe4b07e46 in ngcomp::HyprePreconditioner::Setup (this=this@entry=0x1cdce60, matrix=...) at /data/ngsolve/ngsolve-src/comp/hypre_precond.cpp:67
#3  0x00007fffe4b08b75 in ngcomp::HyprePreconditioner::FinalizeLevel (this=0x1cdce60, mat=0x1cddb08) at /data/ngsolve/ngsolve-src/comp/hypre_precond.cpp:56
#4  0x00007fffe482f70f in ngcomp::S_BilinearForm<double>::DoAssemble (this=0x1cdb5a0, clh=...) at /data/ngsolve/ngsolve-src/comp/bilinearform.cpp:2386
#5  0x00007fffe47df412 in ngcomp::BilinearForm::Assemble (this=0x1cdb5a0, lh=...) at /data/ngsolve/ngsolve-src/comp/bilinearform.cpp:688
#6  0x00007fffe47e06c4 in ngcomp::BilinearForm::ReAssemble (this=<optimized out>, lh=..., reallocate=<optimized out>, reallocate@entry=false) at /data/ngsolve/ngsolve-src/comp/bilinearform.cpp:840
#7  0x00007fffe4cdb4f7 in <lambda(BF&, bool)>::operator() (reallocate=false, self=..., __closure=<optimized out>) at /data/ngsolve/ngsolve-src/comp/python_comp.cpp:2090
#8  pybind11::detail::argument_loader<ngcomp::BilinearForm&, bool>::call_impl<void, ExportNgcomp(pybind11::module&)::<lambda(BF&, bool)>&, 0, 1, pybind11::gil_scoped_release> (f=..., this=0x7fffffffd5f0) at /data/ngsolve/ngsolve-install/include/pybind11/cast.h:1962
#9  pybind11::detail::argument_loader<ngcomp::BilinearForm&, bool>::call<void, pybind11::gil_scoped_release, ExportNgcomp(pybind11::module&)::<lambda(BF&, bool)>&> (f=..., this=0x7fffffffd5f0) at /data/ngsolve/ngsolve-install/include/pybind11/cast.h:1944
#10 pybind11::cpp_function::<lambda(pybind11::detail::function_call&)>::operator() (__closure=0x0, call=...) at /data/ngsolve/ngsolve-install/include/pybind11/pybind11.h:159
#11 pybind11::cpp_function::<lambda(pybind11::detail::function_call&)>::_FUN(pybind11::detail::function_call &) () at /data/ngsolve/ngsolve-install/include/pybind11/pybind11.h:137
#12 0x00007fffe4a87d47 in pybind11::cpp_function::dispatcher (self=<optimized out>, args_in=(<ngsolve.comp.BilinearForm at remote 0x7fffec328960>,), kwargs_in=0x0) at /data/ngsolve/ngsolve-install/include/pybind11/pybind11.h:624
#13 0x00000000005674fc in _PyCFunction_FastCallDict () at ../Objects/methodobject.c:231
#14 0x000000000050abb3 in call_function.lto_priv () at ../Python/ceval.c:4875
#15 0x000000000050c5b9 in _PyEval_EvalFrameDefault () at ../Python/ceval.c:3335
#16 0x0000000000508245 in PyEval_EvalFrameEx (throwflag=0,
    f=Frame 0x149e628, for file precond_test.py, line 42, in SolveProblem (h=<float at remote 0x7ffff7f703a8>, p=1, levels=5, condense=False, precond='hypre', mesh=<ngsolve.comp.Mesh at remote 0x7fffdad99bf8>, fes=<ngsolve.comp.H1 at remote 0x7fffe6222f10>, u=<ngsolve.comp.ProxyFunction at remote 0x7fffda21c678>, v=<ngsolve.comp.ProxyFunction at remote 0x7fffdadb7990>, a=<ngsolve.comp.BilinearForm at remote 0x7fffec328960>, f=<ngsolve.comp.LinearForm at remote 0x7ffff57dd848>, gfu=<ngsolve.comp.GridFunction at remote 0x7fffda21c7d8>, c=<ngsolve.comp.Preconditioner at remote 0x7ffff57dd9d0>, steps=[], l=0)) at ../Python/ceval.c:754
#17 _PyEval_EvalCodeWithName.lto_priv.1836 () at ../Python/ceval.c:4166
#18 0x000000000050a080 in fast_function.lto_priv () at ../Python/ceval.c:4992
#19 0x000000000050aa7d in call_function.lto_priv () at ../Python/ceval.c:4872
#20 0x000000000050d390 in _PyEval_EvalFrameDefault () at ../Python/ceval.c:3351
#21 0x0000000000508245 in PyEval_EvalFrameEx (throwflag=0, f=Frame 0xaeeb08, for file precond_test.py, line 64, in <module> ()) at ../Python/ceval.c:754
#22 _PyEval_EvalCodeWithName.lto_priv.1836 () at ../Python/ceval.c:4166
#23 0x000000000050b403 in PyEval_EvalCodeEx (closure=0x0, kwdefs=0x0, defcount=0, defs=0x0, kwcount=0, kws=0x0, argcount=0, args=0x0, locals=<optimized out>, globals=<optimized out>, _co=<optimized out>) at ../Python/ceval.c:4187
#24 PyEval_EvalCode (co=<optimized out>, globals=<optimized out>, locals=<optimized out>) at ../Python/ceval.c:731
#25 0x0000000000635222 in run_mod () at ../Python/pythonrun.c:1025
#26 0x00000000006352d7 in PyRun_FileExFlags () at ../Python/pythonrun.c:978
#27 0x0000000000638a8f in PyRun_SimpleFileExFlags () at ../Python/pythonrun.c:419
#28 0x0000000000638c65 in PyRun_AnyFileExFlags () at ../Python/pythonrun.c:81
#29 0x0000000000639631 in run_file (p_cf=0x7fffffffe0dc, filename=<optimized out>, fp=<optimized out>) at ../Modules/main.c:340
#30 Py_Main () at ../Modules/main.c:810
#31 0x00000000004b0f40 in main (argc=2, argv=0x7fffffffe2d8) at ../Programs/python.c:69

so it looks like the `BaseMatrix &matrix` is not actually a `ParallelMatrix`?
Attachments:


1 year 8 months ago - 1 year 8 months ago #2348 by lkogler
Ah, my bad, I thought your last error was still from navierstokes.py!

In this case, the problem seems to be that the hypre preconditioner ONLY works with MPI (or, more precisely, the NGSolve-hypre interface is written such that it does not work in the non-MPI case). It does not check for this and exit gracefully; it just fails on the invalid cast.

Also, refining distributed meshes only works uniformly, and only before spaces are defined on the mesh.

As an additional remark, hypre only really works well for order-1 discretizations. That is typical for AMG solvers. An option would be to combine this with an additive block-Jacobi for the high-order part.

Best, Lukas
Attachments:


1 year 8 months ago #2354 by JanWesterdiep
Heya,

OK, the fact that distributed meshes can only be refined uniformly does hamper my intended application. However, maybe with only 1 process, the mesh is not really distributed? In any case, I'd like to try getting HYPRE to work, even if I can't do adaptive refinement.

I found out how to run NGSolve with MPI (`mpiexec -np N ngspy precond_test.py`), but whatever `N` I choose, I get an error like the one below. Am I not running MPI correctly? Do you have any idea?

Best,
Jan
$ mpiexec -np 1 ngspy precond_test.py
 Generate Mesh from spline geometry
 Boundary mesh done, np = 8
 CalcLocalH: 8 Points 0 Elements 0 Surface Elements
 Meshing domain 1 / 1
 load internal triangle rules
 Surface meshing done
 Edgeswapping, topological
 Smoothing
 Split improve
 Combine improve
 Smoothing
 Edgeswapping, metric
 Smoothing
 Split improve
 Combine improve
 Smoothing
 Edgeswapping, metric
 Smoothing
 Split improve
 Combine improve
 Smoothing
 Update mesh topology
 Update clusters
assemble VOL element 6/6
assemble VOL element 6/6
Setup Hypre preconditioner
Traceback (most recent call last):
  File "precond_test.py", line 65, in <module>
    print(SolveProblem(levels=5, precond="hypre"))
  File "precond_test.py", line 45, in SolveProblem
    a.Assemble()
netgen.libngpy._meshing.NgException: std::bad_cast
 in Assemble BilinearForm 'biform_from_py'

-------------------------------------------------------
Primary job  terminated normally, but 1 process returned
a non-zero exit code.. Per user-direction, the job has been aborted.
-------------------------------------------------------
--------------------------------------------------------------------------
mpiexec detected that one or more processes exited with non-zero status, thus causing
the job to be terminated. The first process to do so was:

  Process name: [[29848,1],0]
  Exit code:    1
--------------------------------------------------------------------------


1 year 8 months ago #2355 by lkogler
The hypre interface only works with -np 2 or more. It is just programmed that way.
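
Until the interface checks this itself, a guard on the Python side can turn the `bad_cast` into a readable message. This is only a sketch: `pick_preconditioner` and its `fallback` argument are hypothetical names, not part of NGSolve, and the rank count is assumed to be passed in by the caller (for example, read from the MPI communicator).

```python
def pick_preconditioner(requested, n_ranks, fallback="local"):
    """Return a preconditioner name that is safe to use for this run.

    requested: name asked for by the user, e.g. "hypre".
    n_ranks:   number of MPI ranks in the job (assumed to be supplied
               by the caller).
    fallback:  hypothetical default used when hypre cannot run.
    """
    # Per this thread, the native hypre interface needs at least 2 ranks.
    if requested == "hypre" and n_ranks < 2:
        print(f"hypre needs 'mpiexec -np 2' or more (got {n_ranks} rank(s)); "
              f"using '{fallback}' instead")
        return fallback
    return requested
```

The returned name could then be handed to whatever builds the preconditioner in the script, such as the `precond` argument of `SolveProblem` in `precond_test.py`.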

When you run with -np 2 or more, you need to generate the mesh on the master and then distribute it (see the file attached to my last comment).
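
The control flow for that looks roughly like the sketch below. It only shows the rank branching; `obtain_mesh` and the two callables are hypothetical placeholders for the actual meshing and send/receive calls, which are in the attached file.

```python
def obtain_mesh(rank, generate_and_distribute, receive):
    """Return this rank's part of the mesh.

    rank: this process's MPI rank (0 is the master).
    generate_and_distribute: hypothetical callable, run on the master only;
        it meshes the geometry, sends the pieces to the other ranks and
        returns the master's mesh object.
    receive: hypothetical callable, run on every other rank; it returns
        the piece sent by the master.
    """
    if rank == 0:
        # Only the master generates the mesh, so the meshing runs once.
        return generate_and_distribute()
    # All other ranks wait for their piece instead of meshing themselves.
    return receive()
```

Every rank calls the same function, and spaces are defined on the mesh only after this step.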

Best,
Lukas


1 year 8 months ago #2356 by lkogler
As a workaround, you could try to use hypre through the PETSc interface; that should also work with only one rank. The overhead should not be substantially worse.

Best,
Lukas


1 year 8 months ago #2359 by JanWesterdiep
Hey Lukas, ah, I didn't catch your earlier attached file. Thank you very much; I now have something that works more or less. I will play around with the PETSc interface.

For now, all my questions have been answered!


© 2019 Netgen/NGSolve