Re: lynx-dev Re: [PATCH 2.8.5-dev14] *Really* large tables

From: Leonid Pauzner
Subject: Re: lynx-dev Re: [PATCH 2.8.5-dev14] *Really* large tables
Date: Mon, 12 May 2003 23:51:58 +0400 (MSD)

There was a discussion on memory allocation in lynx tables.
The present state is messy, to say the least.

#define CELLS_GROWBY 16
#define ROWS_GROWBY 16
#else  /* This is very silly, and leads to *larger* memory consumption... */
#define CELLS_GROWBY 2
#define ROWS_GROWBY 2

#define  REUSE_ROWS_AS_CELLS_POOLS 0    /* Turns out to be not beneficial */

/* Experiments show that 2 is better than 1.5 is better than 1.25 (all by
   a small margin only)??? */

IMHO, the cell growby strategy should be as follows:
the first row (maybe the first and second): cells_growby 16, initial allocation 0;
other rows: cells_growby 1, initial allocation equal to the previous row's length.

Keeping in mind that the number of cells in a row is not large and is limited
by the screen width, the number of reallocs will be very small. And we will not
allocate memory for 16 cells in a row if we need only a couple...
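
A minimal sketch of the strategy above, not the actual lynx code. The names
cell_t, row_t, row_init, row_add_cell and row_free are hypothetical stand-ins
for the real table structures; only the growby values (16 for the first row,
1 for the others) come from the proposal.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-ins for the table structures. */
typedef struct {
    int pos;                /* placeholder payload */
    int len;
} cell_t;

typedef struct {
    cell_t *cells;
    size_t used;            /* cells in use */
    size_t allocated;       /* cells allocated */
    size_t growby;          /* increment applied when the row is full */
} row_t;

#define FIRST_ROW_GROWBY 16
#define OTHER_ROW_GROWBY 1

/* Prepare a row.  prev == NULL marks the first row: start with nothing
 * allocated and grow by 16.  Later rows pre-allocate as many cells as the
 * previous row used and grow by 1.  Returns 0 on success, -1 on failure. */
int row_init(row_t *row, const row_t *prev)
{
    row->cells = NULL;
    row->used = 0;
    row->allocated = 0;
    if (prev == NULL) {
        row->growby = FIRST_ROW_GROWBY;
        return 0;
    }
    row->growby = OTHER_ROW_GROWBY;
    if (prev->used > 0) {
        row->cells = malloc(prev->used * sizeof(cell_t));
        if (row->cells == NULL)
            return -1;
        row->allocated = prev->used;
    }
    return 0;
}

/* Append one cell, growing the row by its growby only when it is full. */
int row_add_cell(row_t *row, cell_t c)
{
    if (row->used == row->allocated) {
        size_t n = row->allocated + row->growby;
        cell_t *p = realloc(row->cells, n * sizeof(cell_t));
        if (p == NULL)
            return -1;
        row->cells = p;
        row->allocated = n;
    }
    row->cells[row->used++] = c;
    return 0;
}

void row_free(row_t *row)
{
    free(row->cells);
    row->cells = NULL;
    row->used = row->allocated = 0;
}
```

With this sketch, a table whose rows all have the same number of cells needs
one realloc for the first row and exactly one malloc for every later row, and
no later row holds unused cells.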

10-Apr-2003 13:13 Ilya Zakharevich wrote:
>    (a) text storage: 40M;

>    (b) temporary table layout info: could be 38M, is 53M;

>    (c) text storage redone due to whitespace inclusion: 13M.

> There are a lot of possibilities to improve the constants (now that the
> algorithm is actually linear):

>    (a1) Even with styles enabled, the text includes ^A and ^B (etc)
>         characters.  [Do not know how they get there, in style-less
>         build they denote boundaries of bold/underlined text.]  [They
>         are sometimes visible during a partial display stage.]
>         Removing them may improve (slightly) the memory footprint
>         taken by lines-as-strings.

>    (a2) The current algorithm for storing styles info is extremely
>       wasteful.  Each line contains info about the whole stack of
>       styles which leads to the style of this line.  E.g., in the
>       example above each line should be marked at the beginning as
>       start-html, start-body, start-table, and at the end as
>       end-table, end-body, end-html.  Since all 3 of these styles are
>       not switched off until the end of the line, only the inner one
>       matters.

>       Moreover, almost every line of an HTML file will have start-html
>       and start-body tags.  Isn't it better to just assume them,
>       and (e.g.) just explicitly disable (an assumed) start-body
>       style for the display of the <head> matter?

>       I think (a1)+(a2) can lead to a saving of circa 20M (maybe more).

>     (b) Instead of an array of row-info elements for table information
>       storage (rowinfo table[500000]) one could store an array of
>       pointers (rowinfo *table[500000]), and allocate the rowinfo's
>       themselves by chunks.  This way we need to realloc()ate only a
>       2M array (instead of an 18M one).  This gives a pessimization of
>       2M, and an optimization of whatever reallocations we may save.
>       It is reasonable to expect a saving of circa 10M on this stage.

>     (c) I have no idea how to avoid this 13M waste...  Any thoughts?
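
A minimal sketch of the scheme in (b), under stated assumptions: the rowinfo
fields, the table_t wrapper, the function names, ROWS_PER_CHUNK and PTRS_GROWBY
are all hypothetical and are not taken from the lynx sources. Only the idea
(realloc the pointer array, allocate the rowinfo's themselves in chunks that
never move) comes from the text above.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the per-row layout record. */
typedef struct {
    int ncells;
    int offset;
} rowinfo;

#define ROWS_PER_CHUNK 256      /* assumed chunk size */
#define PTRS_GROWBY    1024     /* assumed growth step of the pointer array */

typedef struct {
    rowinfo **rows;             /* one pointer per row; the only array realloc()ed */
    size_t nrows;               /* rows handed out so far */
    size_t nptrs;               /* pointer slots allocated */
} table_t;

/* Return a zeroed rowinfo for the next row, or NULL on allocation failure.
 * Existing rowinfo's never move: only the small pointer array is realloc()ed. */
rowinfo *table_add_row(table_t *t)
{
    rowinfo *r;

    if (t->nrows == t->nptrs) {
        size_t n = t->nptrs + PTRS_GROWBY;
        rowinfo **p = realloc(t->rows, n * sizeof *p);
        if (p == NULL)
            return NULL;
        t->rows = p;
        t->nptrs = n;
    }
    if (t->nrows % ROWS_PER_CHUNK == 0) {
        /* Previous chunk is full (or this is the first row): start a new one. */
        r = calloc(ROWS_PER_CHUNK, sizeof *r);
        if (r == NULL)
            return NULL;
    } else {
        /* Next free slot in the current chunk. */
        r = t->rows[t->nrows - 1] + 1;
    }
    t->rows[t->nrows++] = r;
    return r;
}

/* Every ROWS_PER_CHUNK-th pointer is the base address of a chunk. */
void table_free(table_t *t)
{
    size_t i;

    for (i = 0; i < t->nrows; i += ROWS_PER_CHUNK)
        free(t->rows[i]);
    free(t->rows);
    t->rows = NULL;
    t->nrows = t->nptrs = 0;
}
```

A table_t must start out zeroed (for example `table_t t = { NULL, 0, 0 };`).
The cost is one extra pointer per row, which is the 2M pessimization mentioned
in (b); in exchange, growing the table copies only pointers.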

> Summary: currently we take 107M to render a simple 500K-line table.
> One should expect that it should be possible to shave circa 40M out of
> this.

> Yours,
> Ilya

> ; To UNSUBSCRIBE: Send "unsubscribe lynx-dev" to address@hidden
