Re: [lmi] [PATCH] Micro-optimization for ledger

lmi

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [lmi] [PATCH] Micro-optimization for ledger_format() taking vector

From:	Vadim Zeitlin
Subject:	Re: [lmi] [PATCH] Micro-optimization for ledger_format() taking vector
Date:	Fri, 30 Jun 2017 00:42:48 +0200

On Thu, 29 Jun 2017 22:25:58 +0000 Greg Chicares <address@hidden> wrote:

GC> On 2017-06-29 16:16, Vadim Zeitlin wrote:
GC> > 
GC> >  While looking at this code, I couldn't help noticing that ledger_format()
GC> > makes a copy of all vectors passed to it, which seems completely
GC> > unnecessary -- even if these vectors are not exactly huge (limited to ~100
GC> > elements, I think), there is a number of them and just avoid a ~100
GC> > allocations is, I believe, worth it, especially when it can be avoided by
GC> > simply passing the vector by const reference instead of by value, so I 
went
GC> > ahead and made this change and also another small one inside the function
GC> > itself to avoid more memory (re)allocations there, please see
GC> > 
GC> >   https://github.com/vadz/lmi/pull/58/files
GC> 
GC> I had never previously thought of doing this:
GC> 
GC> $wget 
https://github.com/vadz/lmi/commit/24de308661afb879e7319918840648bddb96071b.patch
GC> $wget 
https://github.com/vadz/lmi/commit/ab079b23a41dfc7082e0b643d868f3ddb5438ef5.patch
GC> 
GC> but it works, and prevents mozilla from mangling patches.

 Oops, sorry, I thought I had already provided the instructions for
applying GitHub pull requests, but if I didn't, here they're:

 When you open a PR URL, you can see something like this:

        SOMEONE wants to merge N commits into master from BRANCH

where, in this case (https://github.com/vadz/lmi/pull/58), SOMEONE is
"vadz" (this risks to be often the case...), N is 2 but isn't important
anyhow and BRANCH is "micro-opt_ledger_format". With this information you
can do

        $ git fetch https://github.com/${SOMEONE}/lmi.git ${BRANCH}

(where SOMEONE and BRANCH should be replaced with their values, of course)
which will make this branch available under the symbolic name FETCH_HEAD in
your repository. From here, you can handle it as any other branch, e.g.
look at the diff between master and it:

        $ git diff FETCH_HEAD

or at the commits in this branch but not in master:

        $ git log ..FETCH_HEAD

or, of course, merge it:

        $ git merge FETCH_HEAD

or, if you don't wish to do true merges but rather linearize the history:

        $ git checkout -b tmp FETCH_HEAD
        $ git rebase master
        $ git checkout master
        $ git merge --ff-only tmp
        $ git branch -d tmp

(please let me know if you'd like more details about any of these commands,
I think you're knowledgeable enough about Git to understand what they do by
now, so I don't want to tire you with unnecessary explanations but, again,
would be happy to provide them if needed).

 In the simplest case when you apply a PR when there are no changes in
master since it was made yet, you can do just

        $ git merge --ff-only FETCH_HEAD

which will either fast forward master to include the commits in the PR or
complain about not being able to do so, in which case you need to use
rebase as above or, if you prefer, cherry-pick the commit(s) in FETCH_HEAD
on master.

 In any case, using Git itself is much more convenient and time-efficient
than downloading patches manually using wget and, of course, doing this
doesn't require having a GitHub account (for public repositories as this
one, anyhow).


GC> Is there any reason not to s/push_back/emplace_back/ here?

 I don't think emplace_back() is advantageous here. With the current code,
ledger_format() creates a temporary string, from which a new string is
efficiently constructed by the vector via std::string move ctor. With
emplace_back() a new string would still be constructed using move ctor too,
so we don't seem to gain anything.

GC> More generally, looking at even the much-improved code (pushed a
GC> moment ago):
GC> 
GC> std::vector<std::string> ledger_format
GC>     (std::vector<double> const&        dv
GC>     ,std::pair<int,oenum_format_style> f
GC>     )
GC> {
GC>     std::vector<std::string> sv;
GC>     sv.reserve(dv.size());
GC>     for(auto const& i : dv)
GC>         {
GC>         sv.push_back(ledger_format(i, f));
GC>         }
GC>     return sv;
GC> }
GC> 
GC> makes me pine for APL, where we'd just write something like
GC>   (i f)ledger_format¨dv
GC> to apply the scalar version of ledger_format() to each ("¨")
GC> element of vector 'dv', returning an array of strings.

 You'd love Perl 6 hyperoperators then:

        https://docs.perl6.org/language/operators#Hyper_Operators

 But, to be fair, you can also use std::transform in C++: with lambdas,
it's actually not a bad idea to do it. I still don't find it that much
better than a simple range loop above though.

 Regards,
VZ

[Prev in Thread]

Current Thread

[Next in Thread]

[lmi] [PATCH] Micro-optimization for ledger_format() taking vector, Vadim Zeitlin, 2017/06/29
- Re: [lmi] [PATCH] Micro-optimization for ledger_format() taking vector, Greg Chicares, 2017/06/29
  - Re: [lmi] [PATCH] Micro-optimization for ledger_format() taking vector, Vadim Zeitlin <=

Prev by Date: Re: [lmi] [PATCH] Micro-optimization for ledger_format() taking vector
Previous by thread: Re: [lmi] [PATCH] Micro-optimization for ledger_format() taking vector
Index(es):
- Date
- Thread