gnu-arch-users

Re: [Gnu-arch-users] Preventing matches in regular expressions


From: Aaron Bentley
Subject: Re: [Gnu-arch-users] Preventing matches in regular expressions
Date: Wed, 11 Aug 2004 09:13:59 -0400
User-agent: Mozilla Thunderbird 0.5 (X11/20040309)

Tom Lord wrote:

So then this command:

    > $fai categorize --local --unrecognized fox.zip

means something like:

    "Let the rules I've specified so far be called X.
     The new rules are: If the file is fox.zip it's unrecognized,
     otherwise, things work just like under rules X."

Is that really the idea?

Yes. Worse than that, I'd like to have specific-name rules have a higher precedence than file-class rules.

I don't know of an easy way to do that without modifying tla.
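To make the intended semantics concrete, here is a minimal sketch in Python (not tla or fai code). It assumes rules are an ordered list that is tried top to bottom, so a specific-name rule placed in front shadows the file-class rules behind it. The names `BASE_RULES`, `with_override`, and `categorize`, and all of the patterns, are hypothetical and only illustrate the "rules X plus one exception" idea.

```python
import re

# Hypothetical "rules X": (category, regex) pairs, tried in order.
# The patterns are made up for illustration, not tla's defaults.
BASE_RULES = [
    ("precious", r"^\.listing$"),
    ("junk", r"^,.*$"),
    ("backup", r"^.*~$"),
    ("source", r"^[A-Za-z0-9_].*\.zip$"),
]

def with_override(rules, filename, category):
    """Return new rules: `filename` gets `category`, otherwise `rules` apply."""
    exact = "^" + re.escape(filename) + "$"
    # The specific-name rule goes first, so it takes precedence.
    return [(category, exact)] + list(rules)

def categorize(rules, name):
    """Return the category of the first rule whose regex matches `name`."""
    for category, pattern in rules:
        if re.search(pattern, name):
            return category
    return "unrecognized"

rules = with_override(BASE_RULES, "fox.zip", "unrecognized")
print(categorize(BASE_RULES, "fox.zip"))  # matched by the file-class rule
print(categorize(rules, "fox.zip"))       # matched by the specific-name rule
```

Note that this sketch runs one regex match per rule per file, which is exactly the per-file cost discussed next.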

The problem here is that the per-file activities of the `inventory'
loop are performance-critical.   It's already bad that we call
`regexec' several times.   One really doesn't want to make that loop
any more costly.

I'm in full agreement there. If I had to choose, I'd choose more speed over more flexibility.

In rx, you can write:

    R1[[:cut 1:]]|R2[[:cut 2:]]|R3[[:cut 3:]]| ...

and a single call to regexec will tell you which regexp matched.
(Rx `regexec' returns an extra value -- the "state label" which
can be set by using the `cut' operator.)
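Python's `re` module has no `cut` operator, but a rough analogue of the idea can be sketched with named groups: join the regexes into one alternation, match once, and ask which alternative matched. In the sketch below the group names stand in for Rx state labels, `m.lastgroup` stands in for the extra value returned by Rx `regexec`, and the patterns are invented for illustration. This is an assumption-laden illustration of the technique, not how Rx or tla implement it.

```python
import re

# One combined pattern; each named group plays the role of R[[:cut n:]].
# The patterns themselves are made up for this example.
COMBINED = re.compile(
    r"(?P<junk>^,.*$)"
    r"|(?P<backup>^.*~$)"
    r"|(?P<precious>^\.listing$)"
)

def classify(name):
    """Return the name of the alternative that matched, or None."""
    m = COMBINED.search(name)
    # lastgroup is the name of the group that matched in this single call.
    return m.lastgroup if m else None

for name in [",tmp", "notes.txt~", ".listing", "main.c"]:
    print(name, classify(name))
```

One difference worth keeping in mind: Python tries alternatives left to right and takes the first that matches, so the order of the alternatives acts as the precedence order.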

That bears a striking resemblance to the inventory optimization work I did with the cut operator. This might work nicely also:

    R1[[:cut 3:]]|R2[[:cut 1:]]|R3[[:cut 3:]]

As far as I can tell, reusing cut numbers is legal, and we could then assign 1 to junk, 2 to backup, 3 to precious, etc.
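The label-reuse idea can be sketched the same way in Python, again as an analogue and not as Rx code. Python group names must be unique, so this sketch gives each regex its own generated group name and keeps a side table from group name to category number. Two regexes can then share label 3, mirroring the pattern above. `RULES`, `compile_rules`, `label_for`, and the patterns are hypothetical; the numbering 1 = junk, 2 = backup, 3 = precious follows the suggestion above, and 0 is used here (as an assumption) for "no rule matched".

```python
import re

JUNK, BACKUP, PRECIOUS = 1, 2, 3

# (regex, label) pairs; two different regexes share the PRECIOUS label.
# The patterns are made up for illustration.
RULES = [
    (r"^\.listing$", PRECIOUS),
    (r"^,.*$", JUNK),
    (r"^\+.*$", PRECIOUS),
]

def compile_rules(rules):
    """Build one alternation plus a table mapping group name -> label."""
    parts = []
    labels = {}
    for i, (pattern, label) in enumerate(rules):
        name = f"r{i}"  # unique group name per rule
        parts.append(f"(?P<{name}>{pattern})")
        labels[name] = label
    return re.compile("|".join(parts)), labels

def label_for(compiled, labels, filename):
    """Return the category number for filename, or 0 if nothing matched."""
    m = compiled.search(filename)
    return labels[m.lastgroup] if m else 0

compiled, labels = compile_rules(RULES)
for name in [".listing", ",x", "+notes", "main.c"]:
    print(name, label_for(compiled, labels, name))
```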

So, for your fai feature, you could:

Propose a new alternative to =tagging-method files (or new syntax to go inside of them), allowing for directives in which the ordering of directives is significant.

Okay, I'll consider that. The current semantics are already very close to what I want, because .arch-inventory tends to contain specific-file regexes, while =tagging-method tends to contain file-class regexes.

Since I'm trying to automate things, I've been trying to look at every case, which is why I was looking for a way to handle cases when the regexes appear in the same files.

I think file-class regexes in .arch-inventory happen often enough to worry about, but I'm not sure whether that justifies rewriting the inventory code. I don't think there's such a thing as a file-specific regex in =tagging-method.

That is, in =tagging-method, a rule like

precious ^\.listing$

defines a class of files that all have the same name. Whereas the same rule in .arch-inventory would refer to a specific file.

Modify the inventory code to use the rx optimization. As a side effect, you'll probably make `tla inventory' faster.

Well, first we need to fix the cut operator handling.

Aaron

--
Aaron Bentley
Director of Technology
Panometrics, Inc.



