Re: lynx-dev Re: [PATCH 2.8.5-dev14] *Really* large tables
From: Leonid Pauzner
Subject: Re: lynx-dev Re: [PATCH 2.8.5-dev14] *Really* large tables
Date: Mon, 12 May 2003 23:51:58 +0400 (MSD)
There was a discussion on memory allocation in lynx tables.
The present state is messy, to say the least.
#ifdef SAVE_TIME_NOT_SPACE
#define CELLS_GROWBY 16
#define ROWS_GROWBY 16
#define ROWS_GROWBY_DIVISOR 2
#else /* This is very silly, and leads to *larger* memory consumption... */
#define CELLS_GROWBY 2
#define ROWS_GROWBY 2
#define ROWS_GROWBY_DIVISOR 10
#endif
#define REUSE_ROWS_AS_CELLS_POOLS 0 /* Turns out to be not beneficial */
/* Experiments show that 2 is better than 1.5 is better than 1.25 (all by
a small margin only)??? */
#define CELLS_GROWBY_FACTOR 2
IMHO, the cell growby strategy should be as follows:
the first row (maybe the first and second) - cells_growby 16, init with 0;
other rows - cells_growby 1, init with the previous row's length.
Bearing in mind that the number of cells in a row is not large and is limited by
the screen width, the number of reallocs will be very small. And we will not
allocate memory for 16 cells in a row if we need only a couple...
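
A minimal sketch of that strategy, to make the proposal concrete. This is not lynx code: row_t, row_init, row_add_cell and the two constants are hypothetical names, an int stands in for the real per-cell data, and "init with" is read here as "initial number of allocated cells". Only the first row is treated specially.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical constants, not taken from the lynx sources. */
#define FIRST_ROW_GROWBY  16
#define OTHER_ROWS_GROWBY 1

typedef struct {
    int    *cells;      /* stand-in for the real per-cell info */
    size_t  used;       /* cells filled so far */
    size_t  allocated;  /* cells allocated so far */
    size_t  growby;     /* cells added on each realloc */
} row_t;

/* Set up a row. prev_len is the length of the previous row,
 * or 0 for the first row of the table. Returns 0 on success. */
static int row_init(row_t *row, size_t prev_len)
{
    assert(row != NULL);
    row->cells = NULL;
    row->used = 0;
    row->allocated = 0;
    row->growby = (prev_len == 0) ? FIRST_ROW_GROWBY : OTHER_ROWS_GROWBY;
    if (prev_len > 0) {
        /* Later rows start with exactly the previous row's length. */
        row->cells = malloc(prev_len * sizeof *row->cells);
        if (row->cells == NULL)
            return -1;
        row->allocated = prev_len;
    }
    return 0;
}

/* Append one cell, growing the row by row->growby when it is full.
 * *reallocs counts how often the row had to grow. */
static int row_add_cell(row_t *row, int value, size_t *reallocs)
{
    if (row->used == row->allocated) {
        size_t n = row->allocated + row->growby;
        int *p = realloc(row->cells, n * sizeof *p);
        if (p == NULL)
            return -1;
        row->cells = p;
        row->allocated = n;
        ++*reallocs;
    }
    row->cells[row->used++] = value;
    return 0;
}
```

With this layout a table whose rows all have the same number of cells costs one realloc for the first row (as long as it has at most 16 cells) and none for the rows after it; a row that is one cell longer than its predecessor costs exactly one.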
10-Apr-2003 13:13 Ilya Zakharevich wrote:
> (a) text storage: 40M;
> (b) temporary table layout info: could be 38M, is 53M;
> (c) text storage redone due to whitespace inclusion: 13M.
> There are many possibilities for improving the constants (now that the
> algorithm is actually linear):
> (a1) Even with styles enabled, the text includes ^A and ^B (etc)
> characters. [I do not know how they get there; in a style-less
> build they denote the boundaries of bold/underlined text.] [They
> are sometimes visible during a partial display stage.]
> Removing them may (slightly) improve the memory footprint
> taken by lines-as-strings.
> (a2) The current algorithm for storing styles info is extremely
> wasteful. Each line contains info about the whole stack of
> styles that leads to the style of this line. E.g., in the
> example above each line has to be marked at the beginning as
> start-html, start-body, start-table, and at the end as
> end-table, end-body, end-html. Since none of these three styles
> is switched off until the end of the line, only the innermost one
> matters.
> Moreover, almost every line of an HTML file will have start-html
> and start-body tags. Would it not be better to just assume them,
> and (e.g.) just explicitly disable (an assumed) start-body
> style for the display of the <head> matter?
> I think (a1)+(a2) can save circa 20M (maybe more).
> (b) Instead of an array of row-info elements for table information
> storage (rowinfo table[500000]) one could store an array of
> pointers (rowinfo *table[500000]), and allocate the rowinfo's
> themselves in chunks. This way we need to realloc()ate only a
> 2M array (instead of an 18M one). This gives a pessimization of
> 2M, and an optimization of whatever reallocations we may save.
> It is reasonable to expect a saving of circa 10M at this stage.
> (c) I have no idea how to avoid this 13M waste... Any thoughts?
> Summary: currently we take 107M to render a simple 500K-line table.
> It should be possible to shave circa 40M off
> this.
> Yours,
> Ilya
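
Point (b) above could be sketched roughly as follows. This is an illustration only, not the lynx implementation: table_t, table_add_row, table_free, ROWS_PER_CHUNK and the two fields of rowinfo are hypothetical. The quoted 2M and 18M figures would be consistent with 4-byte pointers and a rowinfo of about 36 bytes for 500000 rows, but that is an inference from the numbers, not something stated in the message.

```c
#include <assert.h>
#include <stdlib.h>

#define ROWS_PER_CHUNK 1024   /* hypothetical chunk size */

/* Stand-in for the per-row layout info; the fields are made up. */
typedef struct {
    int ncells;
    int offset;
} rowinfo;

typedef struct {
    rowinfo **rows;       /* the only array that is ever realloc()ed */
    size_t    nrows;      /* rows handed out so far */
    size_t    rows_alloc; /* capacity of the pointer array */
    rowinfo  *chunk;      /* chunk currently being filled */
} table_t;

/* Return a pointer to a fresh, zeroed rowinfo, or NULL on failure.
 * Pointers returned earlier stay valid, because the rowinfo structs
 * themselves are never moved. */
static rowinfo *table_add_row(table_t *t)
{
    if (t->nrows == t->rows_alloc) {
        /* Grow the (small) pointer array only. */
        size_t n = t->rows_alloc ? t->rows_alloc * 2 : 16;
        rowinfo **p = realloc(t->rows, n * sizeof *p);
        if (p == NULL)
            return NULL;
        t->rows = p;
        t->rows_alloc = n;
    }
    if (t->nrows % ROWS_PER_CHUNK == 0) {
        /* Previous chunk is full (or this is the first row). */
        rowinfo *c = calloc(ROWS_PER_CHUNK, sizeof *c);
        if (c == NULL)
            return NULL;
        t->chunk = c;
    }
    t->rows[t->nrows] = &t->chunk[t->nrows % ROWS_PER_CHUNK];
    return t->rows[t->nrows++];
}

/* Every ROWS_PER_CHUNK-th pointer is the start of a chunk. */
static void table_free(table_t *t)
{
    size_t i;
    for (i = 0; i < t->nrows; i += ROWS_PER_CHUNK)
        free(t->rows[i]);
    free(t->rows);
    t->rows = NULL;
    t->chunk = NULL;
    t->nrows = t->rows_alloc = 0;
}
```

The cost is one extra pointer per row; the gain is that growing the table copies only pointers, and no large block of rowinfo structs has to be duplicated during a realloc().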