Re: [gnulib-tool-py] List of modules: save in local file
From: Bruno Haible
Subject: Re: [gnulib-tool-py] List of modules: save in local file
Date: Tue, 08 May 2012 01:48:24 +0200
User-agent: KMail/4.7.4 (Linux/3.1.10-1.9-desktop; KDE/4.7.4; x86_64; ; )
Hi Dmitriy,
> I'm now working on section where you get all the available modules from
> gnulib-tool. What do you think about this strategy?
> 1. If the GNULibImport executes the first time, get all the modules in
> usual way (like in func_all_modules() function).
> 2. Save the list of modules using json Python module. This will help user
> to save some time in the future. We can even just save it line-by-line.
> 3. In the future get the list of the modules with json module. If it was
> saved line-by-line, just split lines of file.
> 4. If user needs to refresh list of modules, he can delete json file or
> call update method from GNULibImport class.
You mean, you want to cache the output of 'gnulib-tool --list' in some
form on the file system?
It would be an interesting idea, if it were needed.
But first a general warning about caches: In step 4 you propose that the
user decide manually whether the cache is up-to-date. This would be a big mistake.
Caches are there to make computer operations faster *at no additional cost*
for the user. Caches that require human intervention every now and then
are a cure that is (most often) worse than the original disease. Most
often such human intervention is required because the cache implementation
is buggy: The implementor forgot about some situations in which the cache
needs to be invalidated. But outright *never* invalidating the cache is
a no-no.
In this case, the cache needs to be considered invalid if the maximum
of the timestamps of the modules/ directory and of each of its subdirectories
is at least as new as the cache file. Thus, a possible implementation
would be to store that maximum timestamp in the cache file, and when
you read the cache file you ignore it if the timestamps of the modules/**/
directory hierarchy have changed (indicating that a file has been added
or removed or renamed).
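A minimal sketch of that invalidation rule in Python, in case it helps. The
names max_dir_mtime, list_modules and load_modules and the JSON layout are
made up for illustration; list_modules is only a stand-in for the real scan
and does not reproduce what func_all_modules() does. The sketch also assumes
the cache file lives outside modules/, since writing it inside would itself
change a directory timestamp.

```python
import json
import os


def max_dir_mtime(root):
    """Newest mtime (ns) among root and all of its subdirectories."""
    newest = os.stat(root).st_mtime_ns
    for dirpath, dirnames, _filenames in os.walk(root):
        for name in dirnames:
            path = os.path.join(dirpath, name)
            newest = max(newest, os.stat(path).st_mtime_ns)
    return newest


def list_modules(root):
    """Hypothetical stand-in for the real scan: every file below root."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            found.append(os.path.relpath(os.path.join(dirpath, name), root))
    return sorted(found)


def load_modules(root, cache_path):
    """Return the module list, using the cache only when it is provably fresh."""
    stamp = max_dir_mtime(root)
    try:
        with open(cache_path, encoding="utf-8") as f:
            cached = json.load(f)
        cache_mtime = os.stat(cache_path).st_mtime_ns
        # Valid only if the stored stamp is unchanged AND strictly older
        # than the cache file; "at least as new" counts as invalid.
        if cached["stamp"] == stamp and stamp < cache_mtime:
            return cached["modules"]
    except (OSError, ValueError, KeyError, TypeError):
        pass  # missing, unreadable or malformed cache: fall through and rescan

    modules = list_modules(root)
    with open(cache_path, "w", encoding="utf-8") as f:
        json.dump({"stamp": stamp, "modules": modules}, f)
    return modules
```

The strict comparison matters on file systems with coarse timestamps: if a
file is added in the same clock tick in which the cache was written, the
directory timestamp does not visibly change, and only the "at least as new"
rule forces the rescan.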
But the question is: is it needed? I ran "time ./gnulib-tool --list" once:
0.3 sec. Once again: 0.2 sec. How often is this command run? Rarely.
I think that not only is 0.3 sec acceptable, but even 3 sec would be acceptable.
Caches always have a drawback: They are not entirely invisible. When
people do "diff -r", the cache files may show up in the output. They may
need to be filtered away in some operations... Bottom line: If a cache is
not needed, don't implement it. Keep things simple if you can.
Bruno
PS: But when I add --local-dir variants that work across the network,
the tradeoff will be different. Network operations are rarely finished
in less than 0.3 seconds.