Hi Tim,
we don't use sum-factorization for matrix assembly, but it can be used for matrix-free operator application in NGSolve. See tutorial 2-11 on matrix-free operator application, and 3-3 for matrix-free time-stepping methods.
We have timers for the major functions in NGSolve, which you can access via
Code:
from ngsolve import *
# run this after assembling / solving, so the timers have something to report
for t in Timers():
    print(t)
Look out for "Matrix assembling", "SymbolicBFI::CalcElementMatrix",
"static condensation", "SparseMatrixSymmetric::AddElementMatrix", "CGSolver".
They are meant for experts: some measure wall-clock time, others single-core CPU time.
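If the full timer list is too noisy, a small filter helps. This is only a sketch: it assumes each entry returned by Timers() is a dict with the keys "name", "time" and "counts", and report_timers is a hypothetical helper name, not part of NGSolve.

```python
# Timer names mentioned above.
NAMES = [
    "Matrix assembling",
    "SymbolicBFI::CalcElementMatrix",
    "static condensation",
    "SparseMatrixSymmetric::AddElementMatrix",
    "CGSolver",
]

def report_timers(timers, names=NAMES):
    """Return (name, time, counts) for the requested timers, slowest first.

    Assumption: every entry of `timers` is a dict with the keys
    "name", "time" and "counts".
    """
    wanted = set(names)
    rows = [(t["name"], t["time"], t["counts"])
            for t in timers if t["name"] in wanted]
    # Sort by accumulated time, largest first.
    return sorted(rows, key=lambda row: row[1], reverse=True)
```

In an NGSolve session this would be called as report_timers(Timers()) after assembling and solving. Keep in mind that the times of different timers are not always directly comparable, for the reason given above.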
You are welcome to present and discuss your findings.
For matrix-free methods the main difficulty is the preconditioner. We have some experimental stuff floating around, but it is not sufficiently robust for release. Contributions are very welcome!
The complexity of element-matrix assembly in NGSolve is the same as that of static condensation, O(p^(3d)), where p is the polynomial order and d the space dimension.
As far as I remember from some comparison with sum-factorization, the break-even point was at a polynomial order between 4 and 6 (on tetrahedral meshes). Of course, it depends heavily on the linear algebra kernels you use, and these are pretty well optimized in NGSolve.
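For readers new to the term, here is a minimal pure-Python sketch of the idea behind sum-factorization in the simplest setting, a 2D tensor-product operator A ⊗ B applied to a vector. It is not NGSolve code and not what is done on tetrahedra; the function names are made up for illustration. The naive version costs on the order of (n*m)^2 operations, the factorized one n*m*(n+m), which for n = m = p+1 is the p^4 versus p^3 difference (p^(2d) versus p^(d+1) in d dimensions).

```python
def kron_apply_naive(A, B, x):
    """Apply (A kron B) to x entry by entry: about (n*m)^2 operations."""
    n, m = len(A), len(B)
    y = [0.0] * (n * m)
    for i in range(n):
        for k in range(m):
            s = 0.0
            for j in range(n):
                for l in range(m):
                    # entry ((i,k),(j,l)) of the Kronecker product is A[i][j]*B[k][l]
                    s += A[i][j] * B[k][l] * x[j * m + l]
            y[i * m + k] = s
    return y

def kron_apply_sumfact(A, B, x):
    """Same product, one direction at a time: about n*m*(n+m) operations."""
    n, m = len(A), len(B)
    # Step 1: contract the second index with B,
    #         T[j][k] = sum_l B[k][l] * x[j*m + l]
    T = [[sum(B[k][l] * x[j * m + l] for l in range(m)) for k in range(m)]
         for j in range(n)]
    # Step 2: contract the first index with A,
    #         y[i*m + k] = sum_j A[i][j] * T[j][k]
    return [sum(A[i][j] * T[j][k] for j in range(n))
            for i in range(n) for k in range(m)]
```

Both functions return the same vector; only the operation count differs. Which one wins at a given order depends on constants and on how well the dense kernels are optimized, which is why a break-even point shows up at moderate orders rather than at p = 1.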
Best,
Joachim