help-gnu-emacs

Re: Problem with regexp nested groups


From: Marc Tfardy
Subject: Re: Problem with regexp nested groups
Date: Sun, 11 May 2008 00:31:42 +0200
User-agent: Thunderbird 2.0.0.14 (Windows/20080421)

Phil Carmody wrote:
> Marc Tfardy <address@hidden> writes:
>> Hello,
>>
>> I have a problem with a regexp and I hope someone can help me.
>>
>> Assume we have the following text in a buffer.
>>
>> -- DATA ----------------------------------------------------------------
>> <DATA="some/file/sample1.mp3">
>> <DATA="some/file/sample2.mp3">
>> blablalba
>> <DATA="some/file/sample3.wav">
>> blabla
>> OBJ('some/file/sample4.mp3')
>> OBJ('some/file/sample5.au')
>> ------------------------------------------------------------------------
>>
>> My goal is to extract some text from the buffer, namely only the
>> portion of text between `DATA="' and `"' or between `OBJ('' and
>> `''. In both cases the text must end with `.mp3', and the left and
>> right delimiters should be ignored.
>> ...
>> "\\(DATA=\"\\(.*?\\.mp3\\)\"\\|OBJ('\\(.*?\\.mp3\\)')\\)" nil t)
>
> How about grouping /DATA="/ or /OBJ('/ as match 1, then capturing
> the filename as 2, and ending with /.mp3"/ or /.mp3'/ as match 3?

That's an interesting idea, and it really works; it is definitely better
than multiple (match-string-no-properties x) calls. I've slightly
modified your suggestion and put the extension together with the filename:

"\\(DATA=\"\\|OBJ('\\)\\(.*?\\.mp3\\)\\(\"\\|')\\)"

So I only need to refer to (match-string-no-properties 2).
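
For illustration, here is a minimal sketch of how the three-group regexp could be used to collect the filenames. The function name `my-mp3-names-in-string' is hypothetical and not part of the thread. The sketch writes the dot as `\\.', because inside a Lisp string `\.' reads as a plain `.' and would match any character.

```elisp
;; Sketch only: `my-mp3-names-in-string' is a hypothetical helper name.
(defun my-mp3-names-in-string (text)
  "Return the .mp3 paths that TEXT holds in DATA=\"...\" or OBJ('...') form."
  (let ((re "\\(DATA=\"\\|OBJ('\\)\\(.*?\\.mp3\\)\\(\"\\|')\\)")
        (names '()))
    (with-temp-buffer
      (insert text)
      (goto-char (point-min))
      ;; Group 1 is the opening delimiter, group 2 the path including
      ;; the extension, group 3 the closing delimiter.
      (while (re-search-forward re nil t)
        (push (match-string-no-properties 2) names)))
    (nreverse names)))
```

Whichever alternative matched, the path is always in group 2, so one `match-string-no-properties' call is enough.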


> That would match malformed lines such as DATA="foo.mp3'
> but if that's a problem you can capture the quotes too and fix them in
> post-production.
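
One way to read "fix them in post-production" is to compare the captured opening and closing delimiters after each match and skip the pairs that do not belong together. The following is only a sketch of that reading; both function names are hypothetical, and it assumes the three-group regexp with the delimiters in groups 1 and 3.

```elisp
;; Sketch only: both helper names below are hypothetical.
(defun my-delimiters-consistent-p (open close)
  "Return non-nil if OPEN and CLOSE belong to the same syntax."
  (or (and (string= open "DATA=\"") (string= close "\""))
      (and (string= open "OBJ('") (string= close "')"))))

(defun my-well-formed-mp3-names (text)
  "Return the .mp3 paths in TEXT whose delimiters pair up correctly."
  (let ((re "\\(DATA=\"\\|OBJ('\\)\\(.*?\\.mp3\\)\\(\"\\|')\\)")
        (names '()))
    (with-temp-buffer
      (insert text)
      (goto-char (point-min))
      (while (re-search-forward re nil t)
        ;; Keep the path only when groups 1 and 3 agree.
        (when (my-delimiters-consistent-p (match-string 1) (match-string 3))
          (push (match-string-no-properties 2) names))))
    (nreverse names)))
```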

I can live with this disadvantage. The real problem occurs when the
part of the text to extract is not at position 2 (in the case of
another, more complicated regexp). In that case I have the same
kind of problem as in my first posting.
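
A possible way around the varying group position, not proposed anywhere in this thread, is an explicitly numbered group, written `\(?NUM: ... \)'. It lets several alternatives store their match under the same group number. The sketch below assumes an Emacs version that supports this construct; the thread does not say whether the release in use here does. The function name is hypothetical.

```elisp
;; Sketch only: `my-mp3-names-numbered' is a hypothetical helper name.
;; Assumes an Emacs whose regexps support \(?NUM: ... \).
(defun my-mp3-names-numbered (text)
  "Return the .mp3 paths in TEXT, reading group 1 in both alternatives."
  (let ((re "DATA=\"\\(?1:.*?\\.mp3\\)\"\\|OBJ('\\(?1:.*?\\.mp3\\)')")
        (names '()))
    (with-temp-buffer
      (insert text)
      (goto-char (point-min))
      ;; Both alternatives store the path as group 1, so the caller
      ;; need not know which alternative matched.
      (while (re-search-forward re nil t)
        (push (match-string-no-properties 1) names)))
    (nreverse names)))
```

With this form each alternative keeps its own closing delimiter, so a mixed pair such as DATA="foo.mp3') is not matched in the first place.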

regards

Marc



