I want to migrate a piece of code that involves a number of vector and matrix calculations to C or C++, with the objective of speeding it up as much as possible.
My question is: are linear algebra calculations written with plain "for" loops in C as fast as using LAPACK/BLAS, or is there some gain from using those libraries?
Put another way: can I write C code that does linear algebra with simple for loops and runs as fast as code that uses LAPACK/BLAS?
I've been on Google for less than two minutes and have no experience with the LAPACK library (and by extension no experience with BLAS), but from what I can tell, it's specifically designed to be faster at linear algebra computations.
What OS are you doing these operations on? I might have a way for you to analyse the process and see if the operations are hanging at any particular point.
I also have no first-hand experience, but my teachers claim it is faster than naively computing inner products of rows and columns. Supposedly these libraries use algorithms with better asymptotic complexity (and consequently better scaling), and, most importantly, they behave better in hierarchical memory (i.e. they are cache-friendly). Whether they use platform-specific optimizations, I don't know, but that should be documented on their website.
Regards
EDIT: There are other issues to consider. Floating-point arithmetic on computers is non-associative, even for addition and multiplication. Consequently, you have to think long and hard about how to order your computations to produce numerically stable results. I would expect libraries like these to take care of such highly non-trivial aspects. On the other hand, they may well be slower if you are only multiplying 3x3 or 4x4 matrices or something similarly small.