Really, the sticking point is that pan constructs its entire tree in
memory. There has long been talk of doing a database backend using sqlite
or mysql or the like, such that pan only works with a few pages worth of
data in memory at once, the rest is in the db, but I think Charles wasn't
a db expert and hesitated to go there. Now of course Charles has moved
on and it's Heinrich that has been doing most of the new features and
heavy coding of late. He too has mentioned that he plans a db backend at
some point, but I've no clue whether he has a branch he's hacking on to
that end yet or not...
What I'm saying is... switching to standard data structures and the like
may have value, particularly if it scales memory even better than the
current code, but be absolutely sure you keep that scaling in mind,
anticipating and testing new code against sufficiently busy groups, on
servers with enough retention to give you groups of at least 200 million
headers to test against.
But any real serious data structure rewrite work should almost certainly
target a database backend solution of some sort as in reality, that's
about the only way to address the whole scalability thing once and for
all. While I have no idea where Heinrich might be on the database backend
work, I'd guess he at /least/ has some rough ideas about implementation,
so it'd probably be a good idea to discuss that with him (and there are a
couple of others who might be interested) before diving in head first.
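For what it's worth, the core of the db-backend idea, keeping only a page
or two of headers in memory while the rest stays on disk, can be sketched
in a few lines. This is just an illustrative toy using Python's stdlib
sqlite3, not anything from pan's actual code; the HeaderStore name, the
schema, and the page_size parameter are all made up for the example:

```python
import sqlite3

class HeaderStore:
    """Toy sketch: headers live in SQLite; only one page is materialized."""

    def __init__(self, path=":memory:", page_size=100):
        self.db = sqlite3.connect(path)
        self.page_size = page_size
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS headers ("
            " number INTEGER PRIMARY KEY,"  # article number within the group
            " subject TEXT,"
            " author TEXT)"
        )

    def add(self, number, subject, author):
        self.db.execute(
            "INSERT OR REPLACE INTO headers VALUES (?, ?, ?)",
            (number, subject, author),
        )

    def page(self, n):
        # Fetch only page n; everything else stays in the database.
        cur = self.db.execute(
            "SELECT number, subject, author FROM headers"
            " ORDER BY number LIMIT ? OFFSET ?",
            (self.page_size, n * self.page_size),
        )
        return cur.fetchall()

store = HeaderStore(page_size=2)
for i in range(5):
    store.add(i, "subject %d" % i, "poster@example.com")
print(store.page(1))  # at most two rows in memory at once
```

The real work, of course, is in threading the headers and making the UI
stream from the store instead of walking an in-memory tree, which is
exactly the kind of thing worth agreeing on with Heinrich first.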