lmi
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[lmi] ~50% speedup for Input::read()


From: Vaclav Slavik
Subject: [lmi] ~50% speedup for Input::read()
Date: Fri, 19 Dec 2008 18:49:05 +0100

Hi,

I was looking at Callgrind profiling results of test_input to better
understand where does the slowness of the code using
xml_lmi::child_elements() come from. While doing so, I noticed that 50%
of Input::read()'s time is spent in std::vector<>::erase(), here:

        if(member_names.end() != current_member)
            {
            operator[](node_tag) = xml_lmi::get_content(*child);
            // TODO ?? Perhaps a std::list would perform better.
            member_names.erase(current_member);
            }

In other words, the performance hit of child_elements() approach would
be even worse, if it weren't dominated by erase() slowness.

So I did as the TODO comment suggests and tried to use std::list instead
(the [simple] patch is below and should be applied, I think). The time
spent in read() went from 28.5% to 14.7%, with the time spent in erase()
dropping from 57.5% to 1.24%. Total running time of test_input's "Read"
test went down to 63% (on Linux, with gcc 4.2; MinGW build with gcc 3.4
didn't impress that much at 83% [*]).

What initially surprised me was that using std:set<> yielded marginally
_worse_ results than std::list. This is because test's input data file
has XML nodes in the same order as they are in member_names() and so
find() in std::list always returns first element of the list (if you
fill the list with back_inserter). If this order is guaranteed or
typical in LMI's data files, then std::list is better than std::set.
This also explains the slowness of std::vector code: this kind of input
-- and always deleting the first element -- is the worst case for it.

Regards,
Vaclav

[*] I suspect this is because gcc3's default std allocator is much worse
    than gcc4's. I didn't verify this yet, but I noticed the same
    pattern when trying to use pool allocator for xmlwrapp's pimpls:
    Boost.Pool performs significantly better than mingw 3.4's operator
    new and noticeably worse than gcc4.2's.



Index: input_xml_io.cpp
===================================================================
RCS file: /sources/lmi/lmi/input_xml_io.cpp,v
retrieving revision 1.10
diff -u -u -r1.10 input_xml_io.cpp
--- input_xml_io.cpp    17 Oct 2008 22:55:34 -0000      1.10
+++ input_xml_io.cpp    19 Dec 2008 16:59:38 -0000
@@ -36,6 +36,7 @@
 #include "xml_lmi.hpp"
 
 #include <algorithm> // std::find()
+#include <list>
 
 // Entities that were present in older versions and then removed
 // are recognized and ignored. If they're resurrected in a later
@@ -127,10 +128,14 @@
 
     std::map<std::string, std::string> detritus_map;
 
-    std::vector<std::string> member_names
-        (Input::member_names()
+    std::list<std::string> member_names;
+    std::copy
+        (Input::member_names().begin()
+        ,Input::member_names().end()
+        ,std::back_inserter(member_names)
         );
-    std::vector<std::string>::iterator current_member;
+    std::list<std::string>::iterator current_member;
+
 // XMLWRAPP !! The unit test demonstrates that the suppressed code is
 // twenty-five percent slower. What would be really desirable is an
 // (efficient) element-iterator class.
@@ -159,7 +165,6 @@
         if(member_names.end() != current_member)
             {
             operator[](node_tag) = xml_lmi::get_content(*child);
-            // TODO ?? Perhaps a std::list would perform better.
             member_names.erase(current_member);
             }
         else if






reply via email to

[Prev in Thread] Current Thread [Next in Thread]