Re: [Bug-wget] Regular expression matching
From: Ángel González
Subject: Re: [Bug-wget] Regular expression matching
Date: Wed, 04 Apr 2012 21:02:38 +0200
User-agent: Thunderbird

On 04/04/12 20:16, Gijs van Tulder wrote:
> 1. You can match complete urls, instead of just the directory prefix
> or the file name suffix (which you can do with --accept and
> --include-directories).
> 2. You can use regular expressions to do the matching, which is
> sometimes easier than using a list of wildcard patterns.
>
> Now this isn't a new idea (there are long discussions in the archive,
> see [1]). But somehow the previous attempts didn't make it, so I
> thought I'd send my own version. It's a small patch, I've been using
> it for a while and found it really useful.
>
> I've made two versions of the patch: one uses PCRE, the other uses the
> gnulib regex library, which is probably easier to integrate.
>
> Regards,
>
> Gijs

I really like PCRE, but I think the default should be POSIX regex (what
you called the "gnulib regex library"), as in other command-line
tools such as sed or grep. There could be a --perl-regexp switch to
select PCRE instead (which could take advantage of the POSIX interface of PCRE).
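
To make the POSIX option concrete, here is a minimal sketch of matching a complete URL with the standard <regex.h> interface. It is not taken from either patch; the function name url_matches_regex and the example pattern are hypothetical and only illustrate the idea.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper, not part of wget or of the patch.
   Returns 1 if the URL matches the POSIX extended regex, 0 if it
   does not, and -1 if the pattern fails to compile.
   The match may occur anywhere in the URL unless the pattern is
   anchored with ^ and $. */
int url_matches_regex(const char *pattern, const char *url)
{
    regex_t re;

    /* REG_EXTENDED selects ERE syntax (as in grep -E);
       REG_NOSUB tells the library we do not need submatch offsets. */
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;

    int rc = regexec(&re, url, 0, NULL, 0);
    regfree(&re);

    return rc == 0 ? 1 : 0;
}
```

With an anchored pattern such as "^https?://[^/]+/docs/.*\\.html$", a call on "http://example.org/docs/a/index.html" would be accepted while "http://example.org/blog/index.html" would not, which a directory prefix plus a file name suffix cannot express in a single rule.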

How do --{accept,reject}regex and --{accept,reject} interact when both
are given?