Re: [Gnu-arch-users] Preventing matches in regular expressions
From: Aaron Bentley
Subject: Re: [Gnu-arch-users] Preventing matches in regular expressions
Date: Wed, 11 Aug 2004 09:13:59 -0400
User-agent: Mozilla Thunderbird 0.5 (X11/20040309)
Tom Lord wrote:
So then this command:
> $fai categorize --local --unrecognized fox.zip
means something like:
"Let the rules I've specified so far be called X.
The new rules are:
If the file is fox.zip it's unrecognized,
otherwise, things work just like under rules X."
Is that really the idea?
Yes. Worse than that, I'd like to have specific-name rules have a
higher precedence than file-class rules.
I don't know of an easy way to do that without modifying tla.
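To make the intended semantics concrete, here is a minimal Python sketch of a specific-name rule that outranks the existing file-class rules. It illustrates the precedence only; it is not how tla or fai implement it. The names `BASE_RULES`, `base_rules`, `with_override` and the sample patterns are hypothetical.

```python
import re

# Hypothetical file-class rules ("rules X"): (category, compiled regex).
BASE_RULES = [
    ("junk", re.compile(r"^.*\.o$")),
    ("precious", re.compile(r"^.*\.zip$")),
]

def base_rules(filename):
    """Return the category of the first file-class rule that matches, else None."""
    for category, regex in BASE_RULES:
        if regex.search(filename):
            return category
    return None

def with_override(name, category, fallback):
    """Build a classifier in which one exact file name outranks the fallback rules."""
    def classify(filename):
        if filename == name:
            return category
        return fallback(filename)
    return classify

# Assumed equivalent of "categorize --unrecognized fox.zip" layered over rules X.
classify = with_override("fox.zip", "unrecognized", base_rules)
print(classify("fox.zip"))  # the specific-name rule is consulted first
print(classify("dog.zip"))  # everything else falls through to rules X
```

The exact-name test is a plain string comparison, so it costs no extra regex call per file.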
The problem here is that the per-file activities of the `inventory'
loop are performance-critical. It's already bad that we call
`regexec' several times. One really doesn't want to make that loop
any more costly.
I'm in full agreement there. If I had to choose, I'd choose more speed
over more flexibility.
In rx, you can write:
R1[[:cut 1:]]|R2[[:cut 2:]]|R3[[:cut 3:]]| ...
and a single call to regexec will tell you which regexp matched.
(Rx `regexec' returns an extra value -- the "state label" which
can be set by using the `cut' operator.)
That bears a striking resemblance to the inventory optimization work I
did with the cut operator. This might work nicely also:
R1[[:cut 3:]]|R2[[:cut 1:]]|R3[[:cut 3:]]
As far as I can tell, reusing cut numbers is legal, and we could then
assign 1 to junk, 2 to backup, 3 to precious, etc.
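The same idea can be approximated outside rx. Python's `re` module has no `cut` operator, so the sketch below substitutes named groups: each rule gets its own group, and `match.lastgroup` plays the role of the state label. The rule list and the helper names `compile_rules` and `categorize` are hypothetical, and the category numbers follow the assignment suggested above.

```python
import re

# Hypothetical category numbers: 1 = junk, 2 = backup, 3 = precious.
RULES = [
    (3, r"^\.listing$"),
    (1, r"^.*\.o$"),
    (3, r"^.*\.zip$"),  # label 3 is reused, like a reused cut number
]

def compile_rules(rules):
    """Join all patterns into one regex and record the label of each group."""
    parts = []
    labels = {}
    for index, (label, pattern) in enumerate(rules):
        group = "r%d" % index
        parts.append("(?P<%s>%s)" % (group, pattern))
        labels[group] = label
    return re.compile("|".join(parts)), labels

COMBINED, LABELS = compile_rules(RULES)

def categorize(filename):
    """Run one regex search per file; the matching group name gives the label."""
    match = COMBINED.search(filename)
    if match is None:
        return None
    return LABELS[match.lastgroup]

print(categorize("fox.zip"))
print(categorize("main.o"))
```

Python tries alternatives left to right, so earlier rules win when several could match; whether rx resolves overlapping alternatives the same way is not something this sketch shows.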
So, for your fai feature, you could:
Propose a new alternative to =tagging-method files (or new
syntax to go inside of them), allowing for directives
in which the ordering of directives is significant.
Okay, I'll consider that. The current semantics are already very close
to what I want, because .arch-inventory tends to contain specific-file
regexes, while =tagging-method tends to contain file-class regexes.
Since I'm trying to automate things, I've been trying to look at every
case, which is why I was looking for a way to handle cases when the
regexes appear in the same files.
I think file-class in .arch-inventory happens often enough to worry
about, but I'm not sure whether it justifies rewriting the inventory
code. I don't think there's such a thing as file-specific code in
=tagging-method.
That is, in =tagging-method, a rule like
precious ^\.listing$
defines a class of files that all have the same name. Whereas the same
rule in .arch-inventory would refer to a specific file.
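A small Python sketch of that distinction, assuming (as the paragraph implies) that a =tagging-method rule is applied in every directory of the tree while an .arch-inventory rule is applied only in the directory that holds the file. The function names and sample paths are hypothetical.

```python
import posixpath
import re

RULE = re.compile(r"^\.listing$")

def tree_wide_matches(paths):
    """Apply the rule to the base name of every path, in any directory."""
    return [p for p in paths if RULE.search(posixpath.basename(p))]

def directory_matches(paths, directory):
    """Apply the rule only to entries that live directly in `directory`."""
    return [
        p for p in paths
        if posixpath.dirname(p) == directory
        and RULE.search(posixpath.basename(p))
    ]

paths = ["./.listing", "./src/.listing", "./src/main.c"]
print(tree_wide_matches(paths))           # a class of same-named files
print(directory_matches(paths, "./src"))  # one specific file
```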
Modify the inventory code to use the rx optimization.
As a side effect, you'll probably make `tla inventory' faster.
Well, first we need to fix the cut operator handling.
Aaron
--
Aaron Bentley
Director of Technology
Panometrics, Inc.