Have you actually looked at LgMatrices? There is no such runtime penalty from calling a procedure such as value(M, i, j); you simply write, e.g.:
val := A^[i]^[k] * B^[k]^[j];
But that will lead to unnecessary cache misses because of the multiple levels of indirection: each ^ is a pointer dereference, so A^[i]^[k] has to follow two pointers before it reaches the element.
If you have such an expression deeply nested in a loop, these cache misses will add up and significantly impact performance.
It is better to allocate all metadata (such as the length of a vector) and the payload data together in a single memory block. Keeping them adjacent in memory is what gives you good data locality.
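
To make the single-block idea concrete, here is a minimal sketch in C (chosen only because it makes the memory layout explicit; it is not how LgMatrices is implemented). The names Matrix, matrix_new and matrix_mul are hypothetical. The dimensions and the elements live in one allocation, so reading an element costs one pointer dereference plus index arithmetic instead of a chain of dereferences.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical matrix type: the metadata (rows, cols) and the payload
   (the elements) share one heap block via a C99 flexible array member. */
typedef struct {
    size_t rows;
    size_t cols;
    double data[];              /* rows * cols elements, row-major */
} Matrix;

/* Allocate header and elements with a single malloc; elements start at 0. */
Matrix *matrix_new(size_t rows, size_t cols)
{
    Matrix *m = malloc(sizeof *m + rows * cols * sizeof(double));
    if (m == NULL)
        return NULL;
    m->rows = rows;
    m->cols = cols;
    for (size_t i = 0; i < rows * cols; i++)
        m->data[i] = 0.0;
    return m;
}

/* C = A * B. Every element access is base address plus computed offset;
   no per-row pointer has to be loaded first. */
Matrix *matrix_mul(const Matrix *a, const Matrix *b)
{
    assert(a->cols == b->rows);
    Matrix *c = matrix_new(a->rows, b->cols);
    if (c == NULL)
        return NULL;
    for (size_t i = 0; i < a->rows; i++)
        for (size_t k = 0; k < a->cols; k++) {
            double aik = a->data[i * a->cols + k];
            /* Inner loop walks one row of B and one row of C contiguously. */
            for (size_t j = 0; j < b->cols; j++)
                c->data[i * c->cols + j] += aik * b->data[k * b->cols + j];
        }
    return c;
}
```

A caller would create matrices with matrix_new, fill data directly, and release each one with a single free. The i-k-j loop order is a separate design choice that keeps the innermost accesses sequential in memory; the locality argument above applies with any loop order.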