Oh, yes. I can create a python function that takes in a table, and a basematrix to directly create the smoother. This approach further speed up the process by about a factor of 2. Thanks again!
For the timing of constructing a vertex patch block in 3D, python looping is about 200 times slower than the C++ looping (the AFW block in edge space) in my machine, which I found quite interesting.