bug-sed

bug#24615: [PATCH] sed: handle the patterns which consist of ^ or $ manu


From: Assaf Gordon
Subject: bug#24615: [PATCH] sed: handle the patterns which consist of ^ or $ manually
Date: Wed, 5 Oct 2016 00:46:44 -0400

Hello Norihiro,

Thank you for this improvement.

> On Oct 4, 2016, at 11:18, Norihiro Tanaka <address@hidden> wrote:
> 
> The patterns which consist of only ^ or $ often appear in substitution.
> For example, If we change a CSV file into double quoted, will do as
> following.

A few issues:

1. 
In the patch, I'd recommend using the global/extern variable 'buffer_delimiter'
instead of a hard-coded '\n', so that "sed -z" (NUL-terminated lines) is
handled seamlessly.
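
As a minimal illustration of the behavior the patch would need to preserve
(assuming GNU sed, where -z / --null-data is available), a lone ^ should
anchor at the start of every NUL-terminated record:

```shell
# Assumes GNU sed; -z (--null-data) is a GNU extension.
# Each NUL-terminated record is processed as one "line", so ^ matches
# at the start of every record. tr turns the NULs into newlines so the
# result is easy to read and to capture in a shell variable.
out=$(printf 'a\0b\0c\0' | sed -z 's/^/X/' | tr '\0' '\n')
printf '%s\n' "$out"
```

If a hand-written fast path for ^ only looked for '\n', this is the kind of
input where it could diverge from the regex-based path.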


2.
While trying your patch, I think I uncovered a sed bug (not in your code):
It seems 's///m' does not work with "-z".

Compare the correct behavior, where the anchors match before/after every
newline (in a pattern space containing multiple newlines):

  $ printf "a\nb\nc\n" | sed 'N;N;s/^/X/mg;s/$/Y/mg'
  XaY
  XbY
  XcY

versus the failure to treat NUL as a line terminator:

  $ printf "a\0b\0c\0" | sed -z 'N;N;s/^/X/mg;s/$/Y/mg' | od -An -a
   X   a nul   b nul   c   Y nul

Again, this is not a bug in your code; it was in sed before
(even before the new DFA regex engine).
I haven't yet pinpointed where it originates.


3.
From cursory testing, I suspect the following command causes an infinite loop
with your patch:

  printf "a\nb\nc\n" | ./sed/sed 'N;N;s/^/X/mg;s/$/Y/mg'


As the patch has a few nested conditionals in a critical code path,
I think some tests would be beneficial to ensure full coverage.
I'll try to write them up in the coming days.
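
A rough sketch of what such a check might look like, reusing the multi-line
example from item 2. The SED variable and the 5-second limit are illustrative
assumptions, not part of the existing sed test suite, and 'timeout' is assumed
to come from GNU coreutils:

```shell
# Sketch of a regression check; SED and the 5-second limit are
# illustrative assumptions. Point SED at the binary under test,
# e.g. SED=./sed/sed.
SED=${SED:-sed}

# Expected result: anchors match around every newline in the pattern space.
expected=$(printf 'XaY\nXbY\nXcY')

# timeout turns a possible infinite loop into a failure (exit status 124)
# instead of hanging the whole test run.
actual=$(printf 'a\nb\nc\n' | timeout 5 "$SED" 'N;N;s/^/X/mg;s/$/Y/mg')
status=$?

if [ "$status" -eq 0 ] && [ "$actual" = "$expected" ]; then
  result=PASS
else
  result="FAIL (exit status $status)"
fi
echo "$result"
```

The same shape could be repeated for the "-z" variants once the NUL handling
is sorted out.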

regards,
 - assaf






