Re: Preprocessing with Basser Lout

From: Jeff Kingston
Subject: Re: Preprocessing with Basser Lout
Date: Fri, 24 May 2002 09:31:21 +1000

On Thu, 23 May 2002 15:27:08 +0200, Ludovic Courtès wrote:
  > And BTW, is there somewhere we can find documentation about Lout's
  > index files (the .li) and also the .ld files, what it's used for
  > and so on? It looks like it's not `normal' Lout code (eg. with
  > some weird things like backslashes and @@...), is it?
  > Thanks,
  > Ludovic.

I don't think these formats are documented anywhere, so here goes.
The database system is essentially a set of Lout objects, which
all happen to be invocations of symbols which have a @Tag
parameter.  These objects are indexed by the symbol name and
the tag.  For example, if I write

    @Section
       @Tag { intro }
       @Title { Introduction }
    @Begin
    ...
    @End @Section

in my document, then Lout will copy this symbol into the database,
indexed by "@Section" and "intro".  The actual objects are stored
in a .ld file corresponding to the file they came from originally;
or you can define your own .ld file as done in $(LOUTLIB)/data.
The .ld format is essentially a set of objects, each enclosed in
braces.  Unfortunately it's also necessary to store information
about the environment of these objects, e.g. for sections you
have to store information about which chapter they are in.  This
is because @Section is defined inside @Chapter, so the parameters
etc. of @Chapter must be available when the @Section symbol is
evaluated.  Special symbols in .ld files are
used to introduce these environments and to save space when they
are shared, and that's the main reason why objects in .ld files look
funny.  There are also a couple of other odd things: @Vis symbols to
overcome scoping problems, mainly due to macro expansions which can't
be reversed; parameter names replaced by \a, \b, \c etc. to save space.

The .li file is an index for one or more .ld files.  It starts
with a header line such as

    00 Basser Lout Version 3.25 (December 2001) database index file

After that comes a list of the symbols for which this database
contains objects:

    00symbol 17 @BasicSetup @DocumentSetup @ReportSetup @Section

This line says that this index contains entries for @Section
symbols (@Section meaning the symbol which is in scope after you
open the @BasicSetup, @DocumentSetup, and @ReportSetup symbols).
To save space, this symbol is referred to as 17 hereafter in the
index file.  A slightly different version is

    00target 20 @BasicSetup @DocumentSetup @ContentsPlace

which is to do with galley flushing when the galleys to be
flushed arrive too late in the run to match up with their
intended targets (in this case, entries in tables of contents).
These late-arriving galleys are stored indexed by their target's
name, not their own name, so that on the next run when we encounter
that target we can retrieve all the galleys that wanted to go there.
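
To make the abbreviation scheme concrete, here is a minimal Python sketch
(not taken from the Lout sources) that reads the two kinds of declaration
line shown above and maps each abbreviation to its full symbol path.  The
function names are hypothetical, and it assumes the fields of these lines
are separated by whitespace.

```python
def parse_declaration(line):
    """Split a '00symbol' or '00target' line into (kind, abbrev, path)."""
    fields = line.split()
    if len(fields) < 3 or fields[0] not in ("00symbol", "00target"):
        raise ValueError("not a declaration line: %r" % line)
    kind = fields[0][2:]      # "symbol" or "target"
    abbrev = fields[1]        # kept as a string, e.g. "17"
    path = fields[2:]         # enclosing symbols, innermost last
    return kind, abbrev, path


def build_symbol_table(lines):
    """Map each abbreviation to (kind, full symbol path)."""
    table = {}
    for line in lines:
        kind, abbrev, path = parse_declaration(line)
        table[abbrev] = (kind, path)
    return table


# The two example lines quoted in this message.
table = build_symbol_table([
    "00symbol 17 @BasicSetup @DocumentSetup @ReportSetup @Section",
    "00target 20 @BasicSetup @DocumentSetup @ContentsPlace",
])
print(table["17"])
print(table["20"])
```

The abbreviation is kept as a string because the entry lines only ever
use it as part of a lookup key.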

Each subsequent line holds the following tab-separated fields:

    <symbol>&<tag> <seqnum> <extraseqnum> <filepos> <linenum> <filename>

whose meaning is:

    <symbol>      the symbol being indexed, e.g. 17 (i.e. @Section)
    <tag>         the tag, e.g. "intro" (often auto-generated though)
    <seqnum>      a sequence number, or the sort key of sorted galleys
    <extraseqnum> a global sequence number ensuring stable sorting
    <filepos>     the position in the .ld file where the object is stored
    <linenum>     the line number in the .ld file where the object is stored
    <filename>    the name of the .ld file, or "." if same as .li file
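
As an illustration of this layout, the following Python sketch splits one
entry line into its fields.  It is only a reading of the description
above, not the actual Lout code: the names IndexEntry and parse_entry are
hypothetical, the sample line is made up, and it assumes <filepos> and
<linenum> are written as decimal integers.  The two sequence fields are
left as strings, since <seqnum> may be a sort key rather than a number.

```python
from collections import namedtuple

IndexEntry = namedtuple(
    "IndexEntry",
    "symbol tag seqnum extraseqnum filepos linenum filename",
)


def parse_entry(line):
    """Parse one tab-separated entry line of a .li file."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 6:
        raise ValueError("expected 6 tab-separated fields: %r" % line)
    key, seqnum, extraseqnum, filepos, linenum, filename = fields
    # The key is <symbol>&<tag>; the symbol is the numeric abbreviation,
    # so the first "&" is the separator.
    symbol, _, tag = key.partition("&")
    # Assumption: filepos and linenum are decimal integers.
    return IndexEntry(symbol, tag, seqnum, extraseqnum,
                      int(filepos), int(linenum), filename)


# A made-up entry line, for illustration only.
entry = parse_entry("17&intro\t0001\t0002\t1234\t56\t.\n")
print(entry)
```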

After generating this .li file on one run, Lout sorts it at the end
of the run, and then uses binary search on it on subsequent runs.
Actually memory is so cheap these days that you might as well read
it all in, and an optimization of this kind was put in a few years ago.
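
For readers who want to see the lookup spelled out, here is a small
Python sketch of the read-it-all-in approach: load the entry lines, keep
them sorted by their <symbol>&<tag> key, and binary-search for a key.
The function names and sample lines are hypothetical, and the ordering
used here is simply Python's string ordering, which may not match the
order Lout itself sorts by.

```python
import bisect


def load_index(lines):
    """Return (keys, entries), both sorted by the '<symbol>&<tag>' key."""
    entries = sorted(line.rstrip("\n").split("\t") for line in lines)
    keys = [fields[0] for fields in entries]
    return keys, entries


def lookup(keys, entries, symbol, tag):
    """Binary-search for every entry whose key is '<symbol>&<tag>'."""
    key = symbol + "&" + tag
    lo = bisect.bisect_left(keys, key)
    hi = bisect.bisect_right(keys, key)
    return entries[lo:hi]


# Made-up entry lines, deliberately out of order.
sample = [
    "17&summary\t0003\t0007\t900\t40\t.",
    "17&intro\t0001\t0002\t1234\t56\t.",
    "20&contents\t0002\t0005\t10\t2\t.",
]
keys, entries = load_index(sample)
found = lookup(keys, entries, "17", "intro")
missing = lookup(keys, entries, "17", "nosuchtag")
print(found, missing)
```

Returning a slice rather than a single entry allows for several entries
sharing one key, as with late-arriving galleys stored under their
target's name.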

