coreutils

Re: dd SIGUSR1 race


From: Pádraig Brady
Subject: Re: dd SIGUSR1 race
Date: Fri, 26 Sep 2014 17:05:33 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130110 Thunderbird/17.0.2

On 09/26/2014 04:39 PM, Federico Simoncelli wrote:
> ----- Original Message -----
>> From: "Pádraig Brady" <address@hidden>
>> To: "Bernhard Voelker" <address@hidden>
>> Cc: "Federico Simoncelli" <address@hidden>, "Coreutils" <address@hidden>
>> Sent: Friday, September 26, 2014 5:05:30 PM
>> Subject: Re: dd SIGUSR1 race
>>
>> We'd definitely leave the existing SIGUSR1 handling in place
>> but we could augment it to allow avoiding races.
>>
>> Option1:
>>   Unblock SIGUSR1 explicitly, allowing the parent to mask it and avoid races.
>>
>> Option2:
>>   Set the handler even if SIGUSR1 was set to ignored.
>>
>> Note the default behavior for SIGINFO on BSD is to discard it,
>> so that doesn't have the "kill dd by mistake" issue, but it does seem to
>> have the possibility of losing SIGINFO, so that for robustness
>> signals would have to be resent after a timeout.
>> Now that doesn't seem too onerous since you'd have to be dealing
>> with timeouts anyway, so one could take the same approach with existing
>> GNU dd implementations, i.e. just set SIGUSR1 to ignore in the parent
>> before the fork and then reset.
>> Another reason to favor Option2 is that it can be easily controlled
>> from the shell using: trap '' USR1
>> Note we only print stats on receiving the SIGUSR1 so I don't see
>> much issue in requisitioning it (even though not strictly POSIX compliant).
>> Also we can even suppress that stats output if problematic with status=none.
> 
> Does it mean that the controlling process cannot reliably use SIGUSR1?
> 
> Currently vdsm[1] has a special meaning for SIGUSR1 and it makes wide
> use of "dd" for several reasons.
> 
> Will we have a short moment before/after forking for dd where SIGUSR1
> is masked and the signal is not handled, or is there some special
> trick we could use?
> (I thought of double-forking... but again, we shouldn't be forced to
> use cumbersome strategies to have progress report).

The technique from vdsm would probably be to:
  sigprocmask(SIG_BLOCK, ...);   //block/queue SIGUSR1
  signal(SIGUSR1, SIG_IGN);      //Ignore for child
  child();
  signal(SIGUSR1, handler);      //reset handler
  sigprocmask(SIG_SETMASK, ...); //unblock SIGUSR1

I agree that's a little awkward, though the use case is also a bit unusual.

> Using signals is a big advantage for sysadmins that once they've
> launched the process they can monitor it at any time in case it seems
> to take too long.
> 
> Anyway for controlling processes dealing with signals is harder (they
> are asynchronous, you don't know if they're lost or not resulting in
> missing stats or too many stats, the initial sleep that you added in
> the script uses .5 which is an arbitrary value, etc.)

That's optional. You need to send signals at the desired rate,
and consume results at the rate they're provided. Consuming possibly
multiple/missing results pushed from the command is much the same issue
in either case.

> Many other commands are providing an explicit flag: wget, curl,
> qemu-img, etc.

Yes, I see qemu-img supports both.
If it's not too invasive, we could accept a status=progress patch to output
stats approximately every second (noting the caveat that output can be
delayed behind large reads/writes).

thanks,
Pádraig.


