[Top][All Lists]
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[lmi] ~50% speedup for Input::read()
From: |
Vaclav Slavik |
Subject: |
[lmi] ~50% speedup for Input::read() |
Date: |
Fri, 19 Dec 2008 18:49:05 +0100 |
Hi,
I was looking at Callgrind profiling results of test_input to better
understand where does the slowness of the code using
xml_lmi::child_elements() come from. While doing so, I noticed that 50%
of Input::read()'s time is spent in std::vector<>::erase(), here:
if(member_names.end() != current_member)
{
operator[](node_tag) = xml_lmi::get_content(*child);
// TODO ?? Perhaps a std::list would perform better.
member_names.erase(current_member);
}
In other words, the performance hit of child_elements() approach would
be even worse, if it weren't dominated by erase() slowness.
So I did as the TODO comment suggests and tried to use std::list instead
(the [simple] patch is below and should be applied, I think). The time
spent in read() went from 28.5% to 14.7%, with the time spent in erase()
dropping from 57.5% to 1.24%. Total running time of test_input's "Read"
test went down to 63% (on Linux, with gcc 4.2; MinGW build with gcc 3.4
didn't impress that much at 83% [*]).
What initially surprised me was that using std:set<> yielded marginally
_worse_ results than std::list. This is because test's input data file
has XML nodes in the same order as they are in member_names() and so
find() in std::list always returns first element of the list (if you
fill the list with back_inserter). If this order is guaranteed or
typical in LMI's data files, then std::list is better than std::set.
This also explains the slowness of std::vector code: this kind of input
-- and always deleting the first element -- is the worst case for it.
Regards,
Vaclav
[*] I suspect this is because gcc3's default std allocator is much worse
than gcc4's. I didn't verify this yet, but I noticed the same
pattern when trying to use pool allocator for xmlwrapp's pimpls:
Boost.Pool performs significantly better than mingw 3.4's operator
new and noticeably worse than gcc4.2's.
Index: input_xml_io.cpp
===================================================================
RCS file: /sources/lmi/lmi/input_xml_io.cpp,v
retrieving revision 1.10
diff -u -u -r1.10 input_xml_io.cpp
--- input_xml_io.cpp 17 Oct 2008 22:55:34 -0000 1.10
+++ input_xml_io.cpp 19 Dec 2008 16:59:38 -0000
@@ -36,6 +36,7 @@
#include "xml_lmi.hpp"
#include <algorithm> // std::find()
+#include <list>
// Entities that were present in older versions and then removed
// are recognized and ignored. If they're resurrected in a later
@@ -127,10 +128,14 @@
std::map<std::string, std::string> detritus_map;
- std::vector<std::string> member_names
- (Input::member_names()
+ std::list<std::string> member_names;
+ std::copy
+ (Input::member_names().begin()
+ ,Input::member_names().end()
+ ,std::back_inserter(member_names)
);
- std::vector<std::string>::iterator current_member;
+ std::list<std::string>::iterator current_member;
+
// XMLWRAPP !! The unit test demonstrates that the suppressed code is
// twenty-five percent slower. What would be really desirable is an
// (efficient) element-iterator class.
@@ -159,7 +165,6 @@
if(member_names.end() != current_member)
{
operator[](node_tag) = xml_lmi::get_content(*child);
- // TODO ?? Perhaps a std::list would perform better.
member_names.erase(current_member);
}
else if
- [lmi] ~50% speedup for Input::read(),
Vaclav Slavik <=