Complex Finite Element Spaces in Parallel

5 years 1 week ago #2114 by dfoiles
I've been experimenting with the mpi_cmagnet.py tutorial, and I've noticed a problem. When I change the finite element space to complex, I get the following error:
Code:
collect data
[node6:16841] *** An error occurred in MPI_Send
[node6:16841] *** reported by process [3593732097,7]
[node6:16841] *** on communicator MPI_COMM_WORLD
[node6:16841] *** MPI_ERR_TYPE: invalid datatype
[node6:16841] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
[node6:16841] ***    and potentially your MPI job)
[node7:12248] 5 more processes have sent help message help-mpi-errors.txt / mpi_errors_are_fatal
[node7:12248] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages

This leads me to believe that NGSolve is having difficulty handling complex finite element spaces in parallel. Is this a bug or a known issue? I have attached my copy of mpi_cmagnet.py.

Thank you for your time

File Attachment:

File Name: mpi_cmagnet.py
File Size: 2 KB
5 years 1 week ago #2127 by joachim
We were using MPI_DOUBLE_COMPLEX as the MPI type for a long time, and it seemed to work. The correct type is MPI_CXX_DOUBLE_COMPLEX.

Change that in ngsolve/basiclinalg/bla.hpp, line 71.

There is a comment about missing MPI_SUM for the C-complex type, but I hope this comment is outdated.

Joachim
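
For readers following the datatype discussion: an MPI datatype tells the library how to interpret the raw bytes of a send buffer, which is why an unaccepted type name fails in MPI_Send with MPI_ERR_TYPE. The sketch below uses only the Python standard library (no MPI, no NGSolve) to show the memory layout a double-precision complex datatype has to describe: two consecutive 8-byte doubles, real part first. The helper names `pack_complex` and `unpack_complex` are hypothetical and exist only for this illustration; this is not how NGSolve communicates data.

```python
import struct

# Layout of one double-precision complex value: real part, then imaginary part.
# "=" selects native byte order with standard sizes and no padding.
COMPLEX_FORMAT = "=dd"
COMPLEX_SIZE = struct.calcsize(COMPLEX_FORMAT)  # 16 bytes


def pack_complex(values):
    """Serialize complex numbers into one contiguous byte buffer."""
    return b"".join(struct.pack(COMPLEX_FORMAT, z.real, z.imag) for z in values)


def unpack_complex(buffer):
    """Rebuild complex numbers from a buffer written by pack_complex."""
    if len(buffer) % COMPLEX_SIZE != 0:
        raise ValueError("buffer length is not a multiple of 16 bytes")
    return [complex(re, im) for re, im in struct.iter_unpack(COMPLEX_FORMAT, buffer)]


if __name__ == "__main__":
    data = [1.0 + 2.0j, 3.0 - 4.0j]
    raw = pack_complex(data)
    # Each value occupies COMPLEX_SIZE bytes; the round trip recovers the input.
    print(len(raw), unpack_complex(raw) == data)
```

A receiver that is told the wrong element type, or an element type the MPI build does not provide, cannot interpret such a buffer, even though the bytes themselves are unchanged.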
5 years 1 week ago #2128 by dfoiles
I did what you suggested and ran the same mpi_cmagnet program that I posted before. I no longer get an error, but the system solver now diverges:
Code:
assemble VOL element 0/0
0 0.0240793
1 0.0395717
2 0.133622
3 0.361622
. . .
124 8.50443e+71
125 3.40177e+72
126 1.36071e+73
127 5.44284e+73
128 2.17713e+74
129 8.70854e+74
130 nan

When I run this with only 1 processor everything works fine, so there still must be an MPI issue.

Thanks
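
When comparing a single-process run against a parallel one, it can help to check the residual history automatically instead of reading it by eye. The following is a minimal sketch, assuming the solver log has been captured as text with one `<iteration> <residual>` pair per line, as in the output above. The function names `parse_residuals` and `has_diverged` and the `growth_limit` threshold are hypothetical choices for this illustration, not part of NGSolve.

```python
import math


def parse_residuals(log_text):
    """Extract (iteration, residual) pairs from lines of the form '<int> <float>'."""
    history = []
    for line in log_text.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue  # skip other lines such as 'assemble VOL element 0/0'
        try:
            history.append((int(parts[0]), float(parts[1])))
        except ValueError:
            continue  # two tokens, but not an iteration line
    return history


def has_diverged(history, growth_limit=1e10):
    """Return True if a residual is non-finite or exceeds growth_limit times the first one."""
    if not history:
        return False
    first = history[0][1]
    for _, residual in history:
        if not math.isfinite(residual):
            return True
        if residual > growth_limit * first:
            return True
    return False


if __name__ == "__main__":
    log = "assemble VOL element 0/0\n0 0.0240793\n1 0.0395717\n129 8.70854e+74\n130 nan\n"
    print(has_diverged(parse_residuals(log)))
```

Running the same check on the logs of a 1-process and an N-process job gives a quick yes/no answer on whether only the parallel run diverges.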
5 years 1 week ago #2129 by joachim
This is fixed and available on GitHub.
Joachim
5 years 1 week ago #2130 by dfoiles
Everything is working now. Thanks for the help.