bug-coreutils
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

bug#19681: [PATCH] sync: use syncfs(2) if any argument is specified


From: Pádraig Brady
Subject: bug#19681: [PATCH] sync: use syncfs(2) if any argument is specified
Date: Mon, 26 Jan 2015 23:19:34 +0000
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.4.0

On 26/01/15 21:27, Giuseppe Scrivano wrote:
> Pádraig Brady <address@hidden> writes:
> 
>> On 26/01/15 08:36, Giuseppe Scrivano wrote:
>>> Pádraig Brady <address@hidden> writes:
>>>
>>>> On 25/01/15 18:05, Bernhard Voelker wrote:
>>>>> On 01/25/2015 06:41 PM, Pádraig Brady wrote:
>>>>>> So we have: fdatasync < fsync < syncfs < sync
>>>>>> referring to:: file data, file data + metadata, file system, all file 
>>>>>> systems
>>>>>
>>>>>> [...]
>>>>>
>>>>>> I'd be incline to go with the _what_ interface above.
>>>>>
>>>>> Either way, I think it's important to document sync is falling back
>>>>> to the bigger hammer if the smaller failed.
>>>>> ... or shouldn't do sync this?
>>>>
>>>> It should fall back where possible.
>>>>
>>>> Now there is a difference between the file and file system(s) interfaces
>>>> in that the former can return EIO error for example, while the latter
>>>> are specified to always return success. You wouldn't fall back to
>>>> a syncfs() if an fsync() gave an EIO for example.  Also gnulib
>>>> guarantees that fsync() and fdatasync() are available, so I wouldn't
>>>> fallback from file -> file system interfaces, nor between file interfaces.
>>>
>>> one risk here is when multiple arguments are specified and the fsync
>>> will return EIO more than once, we will fallback to syncfs multiple
>>> times.  Couldn't in this case a single sync be a better choice?
>>
>> I was saying we shouldn't fall back from fsync() to syncfs().
>> Just process each argument. Diagnose any errors and EXIT_FAILURE
>> if there was any error?
> 
> sorry for misunderstanding that.
> 
> I've worked out a new version that includes these suggestions, also
> since now the user can explicitly ask for the sync mechanism to use, I
> agree with you and we should raise an error if something goes wrong.
> 
> Since sync is completely different now, I took the freedom to add myself
> to the AUTHORS, feel free to drop this part if you disagree.
> 
> Regards,
> Giuseppe
> 
> 
> 
>>From 0dbc5ce9c78bc97ec5a678803270767ad9980618 Mon Sep 17 00:00:00 2001
> From: Giuseppe Scrivano <address@hidden>
> Date: Sun, 25 Jan 2015 01:33:45 +0100
> Subject: [PATCH] sync: add support for fsync(2), fdatasync(2) and syncfs(2)
> 
> * AUTHORS: Add myself to sync's authors.
> * NEWS: Mention the new feature.
> * m4/jm-macros.m4 (coreutils_MACROS): Check for syncfs.
> * doc/coreutils.texi (sync invocation): Document the new feature.
> * src/sync.c: Include "quote.h".
> (AUTHORS): Include myself.
> (MODE_FILE, MODE_FILE_DATA, MODE_FILE_SYSTEM): New enum values.
> (long_options): Define.
> (usage): Describe that arguments are now accepted.
> (main): Add arguments parsing and add support for fsync(2),
> fdatasync(2) and syncfs(2).
> ---
>  AUTHORS            |   2 +-
>  NEWS               |   3 ++
>  doc/coreutils.texi |  20 ++++++++-
>  m4/jm-macros.m4    |   1 +
>  src/sync.c         | 116 
> +++++++++++++++++++++++++++++++++++++++++++++++++----
>  5 files changed, 131 insertions(+), 11 deletions(-)
> 
> diff --git a/AUTHORS b/AUTHORS
> index 0296830..64c11d7 100644
> --- a/AUTHORS
> +++ b/AUTHORS
> @@ -83,7 +83,7 @@ stat: Michael Meskes
>  stdbuf: Pádraig Brady
>  stty: David MacKenzie
>  sum: Kayvan Aghaiepour, David MacKenzie
> -sync: Jim Meyering
> +sync: Jim Meyering, Giuseppe Scrivano
>  tac: Jay Lepreau, David MacKenzie
>  tail: Paul Rubin, David MacKenzie, Ian Lance Taylor, Jim Meyering
>  tee: Mike Parker, Richard M. Stallman, David MacKenzie
> diff --git a/NEWS b/NEWS
> index e0a2893..3d4190b 100644
> --- a/NEWS
> +++ b/NEWS
> @@ -48,6 +48,9 @@ GNU coreutils NEWS                                    -*- 
> outline -*-
>    split accepts a new --separator option to select a record separator 
> character
>    other than the default newline character.
>  
> +  sync no longer ignores arguments and it uses fsync(2), fdatasync(2)
> +  and syncfs(2) synchronization in addition to sync(2).

sync no longer ignores arguments, and syncs each specified file, or with the
--file-system option, the file systems associated with each specified file.

>  ** Changes in behavior
>  
>    df no longer suppresses separate exports of the same remote device, as
> diff --git a/doc/coreutils.texi b/doc/coreutils.texi
> index 5a3c31a..c99b8ed 100644
> --- a/doc/coreutils.texi
> +++ b/doc/coreutils.texi
> @@ -12053,8 +12053,24 @@ crashes, data may be lost or the file system 
> corrupted as a
>  result.  The @command{sync} command ensures everything in memory
>  is written to disk.
>  
> -Any arguments are ignored, except for a lone @option{--help} or
> address@hidden (@pxref{Common options}).
> +If any argument is specified then only the specified paths will be

s/paths/files/.  paths is a bit ambiguous, while files implies dirs too.

> +synchronized.  It uses internally the syscall fsync(2) on each of them.
> +
> +If at least one path is specified, it is possible to change the
> +synchronization policy with the following options.  Also see
> address@hidden options}.
> +
> address@hidden @samp
> address@hidden --data
> address@hidden --data
> +Do not synchronize the file metadata unless it is required to maintain
> +data integrity.  It uses the syscall fdatasync(2).
> +
> address@hidden --file-system
> address@hidden --file-system
> +Synchronize all the I/O waiting for the file system that contains the path.

s/file system/file systems/
s/path/file/

> +It uses the syscall syncfs(2).
> address@hidden table
>  
>  @exitstatus
>  
> diff --git a/m4/jm-macros.m4 b/m4/jm-macros.m4
> index 58fdd97..79f124b 100644
> --- a/m4/jm-macros.m4
> +++ b/m4/jm-macros.m4
> @@ -89,6 +89,7 @@ AC_DEFUN([coreutils_MACROS],
>      sethostname
>      siginterrupt
>      sync
> +    syncfs
>      sysctl
>      sysinfo
>      tcgetpgrp
> diff --git a/src/sync.c b/src/sync.c
> index e9f4d7e..4a15d75 100644
> --- a/src/sync.c
> +++ b/src/sync.c
> @@ -23,12 +23,30 @@
>  
>  #include "system.h"
>  #include "error.h"
> -#include "long-options.h"
> +#include "quote.h"
>  
>  /* The official name of this program (e.g., no 'g' prefix).  */
>  #define PROGRAM_NAME "sync"
>  
> -#define AUTHORS proper_name ("Jim Meyering")
> +#define AUTHORS                                 \
> +  proper_name ("Jim Meyering"),                 \
> +  proper_name ("Giuseppe Scrivano")
> +
> +enum
> +{
> +  MODE_FILE,
> +  MODE_FILE_DATA,
> +  MODE_FILE_SYSTEM
> +};
> +
> +static struct option const long_options[] =
> +{
> +  {"data", no_argument, NULL, MODE_FILE_DATA},
> +  {"file-system", no_argument, NULL, MODE_FILE_SYSTEM},
> +  {GETOPT_HELP_OPTION_DECL},
> +  {GETOPT_VERSION_OPTION_DECL},
> +  {NULL, 0, NULL, 0}
> +};
>  
>  void
>  usage (int status)
> @@ -37,11 +55,21 @@ usage (int status)
>      emit_try_help ();
>    else
>      {
> -      printf (_("Usage: %s [OPTION]\n"), program_name);
> +      printf (_("Usage: %s [OPTION] [PATH]...\n"), program_name);
>        fputs (_("\
>  Force changed blocks to disk, update the super block.\n\
>  \n\
> +If one or more file paths are specified, sync only them,\n\
> +use --data and --file-system to change the default behavior\n\
> +\n\
> +"), stdout);
> +
> +      fputs (_("\
> +  --file-system              sync the file systems that contain the path\n\
> +  --data                     flush the metadata only if needed later for\n\

s/flush/sync/ as flush has ambiguous connotations with discard rather than 
drain.

> +                             data retrieval to be correctly handled\n\
>  "), stdout);
> +
>        fputs (HELP_OPTION_DESCRIPTION, stdout);
>        fputs (VERSION_OPTION_DESCRIPTION, stdout);
>        emit_ancillary_info (PROGRAM_NAME);
> @@ -52,6 +80,11 @@ Force changed blocks to disk, update the super block.\n\
>  int
>  main (int argc, char **argv)
>  {
> +  bool args_specified;
> +  int c;
> +  int mode = MODE_FILE;
> +  int (*sync_func)(int) = NULL;

This assumes these are never function like macros.
Probably OK, but it would be better to avoid that assumption
about external functions.

> +
>    initialize_main (&argc, &argv);
>    set_program_name (argv[0]);
>    setlocale (LC_ALL, "");
> @@ -60,12 +93,79 @@ main (int argc, char **argv)
>  
>    atexit (close_stdout);
>  
> -  parse_long_options (argc, argv, PROGRAM_NAME, PACKAGE, Version,
> -                      usage, AUTHORS, (char const *) NULL);
> -  if (getopt_long (argc, argv, "", NULL, NULL) != -1)
> -    usage (EXIT_FAILURE);
> +  while ((c = getopt_long (argc, argv, "", long_options, NULL))
> +         != -1)
> +    {
> +      switch (c)
> +        {
> +        case MODE_FILE_DATA:
> +          mode = MODE_FILE_DATA;
> +          break;
> +
> +        case MODE_FILE_SYSTEM:
> +          mode = MODE_FILE_SYSTEM;
> +          break;
> +
> +        case_GETOPT_HELP_CHAR;
> +
> +        case_GETOPT_VERSION_CHAR (PROGRAM_NAME, AUTHORS);
> +
> +        default:
> +          usage (EXIT_FAILURE);
> +        }
> +    }
> +
> +  args_specified = optind < argc;
> +
> +  if (! args_specified)
> +    goto sync;
> +
> +  if (!args_specified && mode != MODE_FILE)
> +    error (EXIT_FAILURE, errno, _("mode specified without any argument"));

That's a big aggressive. I'd allow --file-system with no args,
as then --file-system means to sync the specified or all file systems.

Requiring arguments with --data probably make sense.

Another validation worth applying is to disallow --data with --file-system
as they're invalid to specify together.

> +
> +  switch (mode)
> +    {
> +    case MODE_FILE_DATA:
> +      sync_func = fdatasync;
> +      break;
> +
> +    case MODE_FILE:
> +      sync_func = fsync;
> +      break;
> +#if HAVE_SYNCFS
> +      /* On systems where syncfs is not available, fallback to sync.  */
> +    case MODE_FILE_SYSTEM:
> +      sync_func = syncfs;
> +      break;
> +#endif
> +    default:
> +      goto sync;
> +    }
> +

I'd probably put the following loop body in a sync_arg() function returning a 
bool.

> +  while (optind < argc)
> +    {
> +      int fd = open (argv[optind], O_RDONLY);
> +      if (fd < 0)
> +        {
> +          error (EXIT_FAILURE, errno, _("cannot open %s for reading"),
> +                 quote (argv[optind]));
> +        }
> +
> +      if (sync_func (fd) < 0)
> +        error (EXIT_FAILURE, errno, _("syncing error"));
> +
> +      if (close (fd) < 0)
> +        {
> +          error (EXIT_FAILURE, errno, _("failed to close %s"),
> +                 quote (argv[optind]));
> +        }
> +
> +      optind++;
> +    }

It's probably better to register the failure and try subsequent arguments,
as then it's ambiguous as to what is sync'd if you only get the warning
about the first failure.

> +  return EXIT_SUCCESS;
>  
> -  if (optind < argc)

There aren't any deep nestings here, so things could probably be reworked
easily to avoid the goto.

> +sync:
> +  if (args_specified)
>      error (0, 0, _("ignoring all arguments"));
>  
>    sync ();

It's probably worth adding references to syncfs(2), fsync(2), and fdatasync(2)
to man/sync.x

thanks!
Pádraig.






reply via email to

[Prev in Thread] Current Thread [Next in Thread]