Re: [PATCH] Add wipename option to shred


From: Pádraig Brady
Subject: Re: [PATCH] Add wipename option to shred
Date: Thu, 13 Jun 2013 16:35:24 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130110 Thunderbird/17.0.2

On 06/13/2013 12:51 AM, Joseph D. Wagner wrote:
> On 06/11/2013 4:36 pm, Pádraig Brady wrote:
> 
>> On 06/11/2013 07:20 AM, Joseph D. Wagner wrote:
>>
>>> Currently, when --remove (-u) is specified, shred overwrites the file
>>> name once for each character, so a file name of 0123456789 would be
>>> overwritten 10 times. While this may be the most secure, it is also
>>> the most time consuming, as each of the 10 renames has its own fsync.
>>> Also, renaming may not be as effective on some journaled file systems.
>>>
>>> This patch adds the option --wipename (-w), which accepts the values:
>>>   * perchar - overwrite the file name once per character; same as now.
>>>   * once    - overwrite the file name once in total.
>>>   * none    - skip overwriting the file name entirely; just unlink.
>>>
>>> If --remove is specified but not --wipename, perchar is assumed,
>>> preserving current behavior. Specifying --wipename implies --remove.
>>> In theory, this should provide improved performance for those who
>>> choose it, especially when deleting many small files. I am currently
>>> testing performance on my system, but I wanted to get the ball rolling
>>> by soliciting your comments and your receptiveness to accepting this
>>> patch. Thoughts?
>>
>> Thanks for the patch.
>> While on the face of it, the extra control seems beneficial,
>> I'm not convinced. The main reason is that this gives
>> some extra credence to per file shreds, which TBH are
>> not guaranteed due to journalling etc.
>>
>> I see performance as an important consideration when
>> shredding large amounts of data like a disk device.
>> However, single-file performance is less of a concern.
>> The normal use case for shred would be for single files,
>> or at the device level. Shredding many files is not the
>> normal use case to worry about, IMHO. If one were worried
>> about securely shredding many files, it's probably best
>> to have those files on a separate file system, and shred
>> that at a lower level.
>>
>> In any case, if you really were OK with just unlinking files
>> after shredding the data, that can be done as a separate step:
>> find | xargs shred
>> find | xargs rm
>>
>> So I'm 60:40 against adding this option.
>>
>> thanks,
>> Pádraig.
>
> I thought about running two separate operations, as you suggested.
> However, my problem with that would be the loss of an atomic
> transaction.  What if something happens midway through the shred?  I
> would not know which files were shredded, and I would have to start
> over.  Worse, if running from a script, it might execute the unlinking
> without having completed the shred.  While I could create all sorts of
> sophisticated code to check these things, it would be a lot easier if I
> could simply rely on the mechanisms already built into shred.

Well you'd use the standard simple idiom of:

  shred file &&
  rm file

Granted, that would mean no unlinking is done
if there is any I/O error during the shred.
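
For the many-files case the same per-file check can be scripted,
something along these lines (an untested sketch; /some/dir is just a
placeholder, and the -print0 / read -d '' pairing assumes GNU find
and bash):

  # Shred each regular file, and only unlink it if the shred succeeded,
  # so an I/O error leaves that file in place for another pass.
  find /some/dir -type f -print0 |
  while IFS= read -r -d '' f; do
    shred -- "$f" && rm -- "$f"
  done

That keeps the shred-then-remove decision per file rather than per batch.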

> I can understand your concern about a tool being misused.  If adding a
> warning to the usage output would help alleviate your concerns, I would
> be happy to draft one and add it to my patch.  However, I do not believe
> people should be denied a tool due to its potential misuse.  Would you
> deny people the use of an iron due to its risk of misuse or injury?  My
> personal philosophy is to give them the tool with instructions and
> warnings.  If the user disregards this information, it is not my
> problem.  In my case, I am using shred to purge information from file
> systems that cannot be taken offline.  Given the specific file system,
> its configuration, and modest sensitivity of the information, the
> decision was made that this is an acceptable risk.  I believe I should
> be able to assume those risks, without being denied optimizations
> because they are not considered best practices for the majority of use
> cases.
> 
> As for the performance improvement itself, the results are significant
> and measurable.  I wrote a script that creates 100,000 files and then
> measures the performance of shredding those files using the different
> wipename options in my patch.  Exact results and the exact script are
> below.
> 
> I am hoping these hard numbers and my kind, persuasive arguments will
> convince you to change your mind and accept my patch.

Thanks for the clear and detailed arguments.
They're certainly persuasive.

> ## perchar ##
> real    678m33.468s
> user    0m9.450s
> sys    3m20.001s
> 
> ## once ##
> real    151m54.655s
> user    0m3.336s
> sys    0m32.357s
> 
> ## none ##
> real    107m34.307s
> user    0m2.637s
> sys    0m21.825s

Whoa, so this creates 23s of CPU work
but waits for 1 hour 47 mins on the sync!
What file system and backing device are you using here
as a matter of interest?

> 
> perchar: 11 hours 18 minutes 33.468 seconds
> once: 2 hours 31 minutes 54.655 seconds
>  * a 346% improvement over perchar
> none: 1 hour 47 minutes 34.307 seconds
>  * a 530% improvement over perchar
>  * a 41% improvement over once

cheers,
Pádraig.



