[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: Re-working yyitems
Re: Re-working yyitems
Thu, 9 May 2019 11:33:25 +0900
On Thu, May 9, 2019 at 12:42 AM Akim Demaille <address@hidden> wrote:
> > If we want to avoid re-writing a vector-like container, the grow
> > should be a simple copy. However, the elements have internal pointers to
> > each other. The possibilities are thus:
> > - Keep the current implementation: fast, but ugly and complex
> Something that you have not described, is how much effort you think
> it would take. I suppose this first approach is basically asking for
> no development effort, which is also a good thing.
Yes, this first approach is "doing nothing".
> In particular, I am worried about ruining the performances, so I would
> feel much better if we could keep the "current implementation" as a base
> line for benches. Then we would have something to compare alternatives
> against, which might rule them out for being too slow.
> Parsers must be fast. There's a huge number of parser generators, and
> we should not shoot ourselves in the ungulate's hoof; it must run as
> fast as possible.
> Yes, that's why I was hesitating this much. One possibility (but that's
maybe a bit over the top) is to implement our own memory management module
for the yyitems vector of unique_ptr: that would solve the sparse memory
problem, but it seems to be more complex than needed.
> - Switch to a vector<unique_ptr<item>>: This way the pointers are stable.
> > We're counting on malloc to handle the arena for us, and
> > growing/pushing/popping/deleting is painless. However, we lose access to
> > the index of an element (we can't work it out from a pointer), which is
> > only used in debug. If the index itself is not important and can be
> > replaced by a UID, it would be easy enough to add one to the item. This
> > the approach I propose.
> I have still not looked at your work, but maybe we should start reviewing?
> And focus on them for Bison 3.5? Then we explore these tracks? Or are
> you saying that variants cannot be tailored to work yet until you are done
> with these changes?
My work is (apart maybe from style) always ready for review on the glr_cc
branch. I plan to do one big style commit at the end fixing all the
inconsistencies if that's okay.
> Note that so that we stick to C++98... I am not strongly against requiring
> C++11 for glr.cc, but I would feel more comfortable if users of glr.cc
> could give their opinion.
I guess it wouldn't be very hard to roll out our own specialized version of
unique_ptr that is C++98-compatible.
> > - no access to the element's index
> > - Is it needed?
> > - Do they need to be dense? Or can we assign a UID to each element?
> > Need to
> > store the UID, though.
> Don't worry too much about the additional space costs in debugging mode.
> Just don't emit the corresponding piece of code with parse.trace is not
> Reading your summary, it does look much simpler and probably not so costly.
> But if IIUC, we are adding an indirection for all the yyitems. I
> suppose that it's because the union is split in two different structs
> of different sizes? I guess that's where we might lose quite some
> efficiency: many allocations instead of a single large one.
The reason for the indirection is to keep all the internal pointers valid
when growing the vector. Yes, the main difference in performance is in many
allocations instead of one big one. I expect this will be partly handled by
malloc, but it depends on the specific malloc implementation, and the
workload around. As I said before, it should be possible to implement an
arena-style allocator just for these unique pointers, if that proves to be
an issue. I feel that it would help separate the concerns (parsing
separated from memory management), but I'm not sure it would simplify the
code (added complexity of the arena implementation).