octave-maintainers
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

speed of octave symbol table code


From: John W. Eaton
Subject: speed of octave symbol table code
Date: Mon, 22 Oct 2007 20:17:01 -0400

On 23-Oct-2007, David Bateman wrote:

| After having looked at the tic/toc issue for possible speed ups, its
| clear that there are speed issues in the symbol table code somewhere. A
| basic call to an an m-file or oct-file on my system costs 10ms
| regardless of how much work it does. So trying to look for simple
| solutions to reduce this and assuming it was the symbol table lookup
| code at fault I created a test function
| 
| function dummy ()
| endfunction
| 
| and ran
| 
| for i=1:5e5, dummy; endfor
| 
| on a version of Octave compiled for use with gprof.. Running gprof on
| the a flat profile gave the following as the top consuming functions
| 
|   %   cumulative   self              self     total
|  time   seconds   seconds    calls   s/call   s/call  name
|  16.73      4.34     4.34   500026     0.00     0.00 symbol_table::clear()
|  15.32      8.31     3.97   505802     0.00     0.00
| unwind_protect::run_frame(
| std::basic_string<char, std::char_traits<char>, std::allocator<char> >
| const&)
|   3.92      9.32     1.02  1508898     0.00     0.00
| symbol_table::lookup(std::
| basic_string<char, std::char_traits<char>, std::allocator<char> >
| const&, bool,
| bool)
|   3.57     10.25     0.93   500026     0.00     0.00
| octave_user_function::do_m
| ulti_index_op(int, octave_value_list const&)
|   3.38     11.12     0.88  7022416     0.00     0.00
| unwind_protect::add(void (
| *)(void*), void*)
| 
| So clearing the local symbol table in the dummy function even though its
| empty is taking the most time and the second most consuming function is
| the unwind_protect::run_frame in the
| octave_user_function::do_multi_index_op function. I'm not sure how to
| address the issue with the unwind_protect code. However, the issue with
| the symbol_table::clear function is fairly obvious... The symbol_table
| has a hash table with a default of 128 elements and each of these 128
| elements of the hash has to be cleared, even if there are no symbols
| stored in the hash. It seems to me that that is a trade off in the size
| of the hash between the speed of symbol lookup and symbol table clear.
| If the symbol table is smaller than the hash its likely to be the clear
| that is slower, and if the number of symbols is significantly larger
| than the hash, then its the lookup time that will dominant.
| 
| What is the likely average number of symbols in the local symbol tables
| of the functions? I suspect that 128 is a high estimate. Note that the
| global symbol tables are initialized separately with 2048 hash entries
| and who cares how long they take to clear as its when you are exiting
| Octave that it will be slowed up. Changing the default table_size to 32
| in symtab.h then gives
| 
|   %   cumulative   self              self     total
|  time   seconds   seconds    calls   s/call   s/call  name
|  13.00      3.34     3.34   505798     0.00     0.00
| unwind_protect::run_frame(
| std::basic_string<char, std::char_traits<char>, std::allocator<char> >
| const&)
|   4.51      4.50     1.16   500026     0.00     0.00  symbol_table::clear()
|   4.49      5.66     1.16  1508890     0.00     0.00
| symbol_table::lookup(std::
| basic_string<char, std::char_traits<char>, std::allocator<char> >
| const&, bool,
| bool)
|   4.07      6.70     1.05  6528822     0.00     0.00
| Array<std::basic_string<ch
| ar, std::char_traits<char>, std::allocator<char> > >::~Array()
|   3.89      7.70     1.00  7022404     0.00     0.00
| unwind_protect::add(void (
| *)(void*), void*)
|   3.81      8.68     0.98   500026     0.00     0.00
| octave_user_function::do_m
| ulti_index_op(int, octave_value_list const&)
| 
| Or about a 4 times speed up in symbol_table::clear.. Its hard to know
| what the best choice for symbol_table::table_size is and probably more
| tests are needed to determine the best size, However, a gut feeling
| makes me think 32 is better than 128 for most cases..

We can change this for 3.0 if you like.

Unless there is something really obvious we can do to speed it up, I
wouldn't worry too much about optimizing this code because the symbol
table in the object-branch is much different and currently uses a 
std::map object to map std::string variable names to symbol_record
objects that contain the information about variables (functions are
handled in separate tables).  With the new code, clearing variables
from a function is done by iterating over the elements of the std::map
object, so it doesn't take much work if the table is empty, but is
linear in the number of variables as the current implementation checks
each variable to know whether it is persistent, global, etc.

jwe


reply via email to

[Prev in Thread] Current Thread [Next in Thread]