qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCHv5] block: add native support for NFS


From: Peter Lieven
Subject: Re: [Qemu-devel] [PATCHv5] block: add native support for NFS
Date: Thu, 09 Jan 2014 17:08:56 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.2.0

Am 09.01.2014 15:13, schrieb Kevin Wolf:
> Am 26.12.2013 um 13:48 hat Peter Lieven geschrieben:
>> This patch adds native support for accessing images on NFS shares without
>> the requirement to actually mount the entire NFS share on the host.
>>
>> NFS Images can simply be specified by an url of the form:
>> nfs://<host>/<export>/<filename>
>>
>> For example:
>> qemu-img create -f qcow2 nfs://10.0.0.1/qemu-images/test.qcow2
>>
>> You need LibNFS from Ronnie Sahlberg available at:
>>    git://github.com/sahlberg/libnfs.git
>> for this to work.
>>
>> During configure it is automatically probed for libnfs and support
>> is enabled on-the-fly. You can forbid or enforce libnfs support
>> with --disable-libnfs or --enable-libnfs respectively.
>>
>> Due to NFS restrictions you might need to execute your binaries
>> as root, allow them to open priviledged ports (<1024) or specify
>> insecure option on the NFS server.
>>
>> For additional information on ROOT vs. non-ROOT operation and URL
>> format + parameters see:
>>    https://raw.github.com/sahlberg/libnfs/master/README
>>
>> LibNFS currently support NFS version 3 only.
>>
>> Signed-off-by: Peter Lieven <address@hidden>
>> ---
>> v4->v5:
>>  - disussed with Ronnie and decided to move URL + Paramter parsing to LibNFS.
>>    This allows for URL parameter processing directly in LibNFS without 
>> altering
>>    the qemu NFS block driver. This bumps the version requirement for LibNFS
>>    to 1.9.0 though.
> Considering that we'll likely want to add new options in the future, I'm
> not sure if this is a good idea. This means that struct nfs_url will
> change, and if qemu isn't updated, it might not even notice that some
> option was requested in a new field that it doesn't know and provide,
> so it will silently ignore it. Or if we have a qemu built against an
> older libnfs, this must be marked as an incompatible ABI change, so it
> can't run at all with the newer libnfs version.

Maybe we are talking about differnt things here. The paramteres/options
we were talking about are options to libnfs not to qemu. This could be
e.g. the uid or the protocol version. This is nothing qemu has to care about.
The nfs_url struct is likely not to change and even if it would change
I see no problem as long as we do only extend the struct and do not change
to position of the server, export and file.
>
>>  - added a pointer to the LibNFS readme where additional information about
>>    ROOT privilidge requirements can be found as this raised a few concerns.
>>  - removed a trailing dot in an error statement [Fam].
>>
>> v3->v4:
>>  - finally added full implementation of bdrv_get_allocated_file_size [Stefan]
>>  - removed trailing \n from error statements [Stefan]
>>
>> v2->v3:
>>  - rebased the stefanha/block
>>  - use pkg_config to check for libnfs (ignoring cflags which are broken in 
>> 1.8.0) [Stefan]
>>  - fixed NFSClient declaration [Stefan]
>>  - renamed Task variables to task [Stefan]
>>  - renamed NFSTask to NFSRPC [Ronnie]
>>  - do not update bs->total_sectors in nfs_co_writev [Stefan]
>>  - return -ENOMEM on all async call failures [Stefan,Ronnie]
>>  - fully implement ftruncate
>>  - use util/uri.c for URL parsing [Stefan]
>>  - reworked nfs_file_open_common to nfs_client_open which works on NFSClient 
>> [Stefan]
>>  - added a comment ot the connect message that libnfs support NFSv3 only at 
>> the moment.
>>  - DID NOT add full implementation of bdrv_get_allocated_file_size because
>>    we are not in a coroutine context and I cannot do an async call here.
>>    I could do a sync call if there would be a guarantee that no requests
>>    are in flight. [Stefan]
>>
>> v1->v2:
>>  - fixed block/Makefile.objs [Ronnie]
>>  - do not always register a read handler [Ronnie]
>>  - add support for reading beyond EOF [Fam]
>>  - fixed struct and paramter naming [Fam]
>>  - fixed overlong lines and whitespace errors [Fam]
>>  - return return status from libnfs whereever possible [Fam]
>>  - added comment why we set allocated_file_size to -ENOTSUP after write [Fam]
>>  - avoid segfault when parsing filname [Fam]
>>  - remove unused close_bh from NFSClient [Fam]
>>  - avoid dividing and mutliplying total_size by BDRV_SECTOR_SIZE in 
>> nfs_file_create [Fam]
>>
>>  MAINTAINERS         |    5 +
>>  block/Makefile.objs |    1 +
>>  block/nfs.c         |  405 
>> +++++++++++++++++++++++++++++++++++++++++++++++++++
>>  configure           |   26 ++++
> qapi-schema.json is missing, so you can't add NFS block devices using
> blockdev-add.
>
>>  4 files changed, 437 insertions(+)
>>  create mode 100644 block/nfs.c
>>
>> diff --git a/MAINTAINERS b/MAINTAINERS
>> index a5ab8f8..09996ab 100644
>> --- a/MAINTAINERS
>> +++ b/MAINTAINERS
>> @@ -935,6 +935,11 @@ M: Peter Lieven <address@hidden>
>>  S: Supported
>>  F: block/iscsi.c
>>  
>> +NFS
>> +M: Peter Lieven <address@hidden>
>> +S: Maintained
>> +F: block/nfs.c
>> +
>>  SSH
>>  M: Richard W.M. Jones <address@hidden>
>>  S: Supported
>> diff --git a/block/Makefile.objs b/block/Makefile.objs
>> index 4e8c91e..e254a21 100644
>> --- a/block/Makefile.objs
>> +++ b/block/Makefile.objs
>> @@ -12,6 +12,7 @@ block-obj-$(CONFIG_LINUX_AIO) += linux-aio.o
>>  ifeq ($(CONFIG_POSIX),y)
>>  block-obj-y += nbd.o nbd-client.o sheepdog.o
>>  block-obj-$(CONFIG_LIBISCSI) += iscsi.o
>> +block-obj-$(CONFIG_LIBNFS) += nfs.o
>>  block-obj-$(CONFIG_CURL) += curl.o
>>  block-obj-$(CONFIG_RBD) += rbd.o
>>  block-obj-$(CONFIG_GLUSTERFS) += gluster.o
>> diff --git a/block/nfs.c b/block/nfs.c
>> new file mode 100644
>> index 0000000..4023b71
>> --- /dev/null
>> +++ b/block/nfs.c
>> @@ -0,0 +1,405 @@
>> +/*
>> + * QEMU Block driver for native access to files on NFS shares
>> + *
>> + * Copyright (c) 2013 Peter Lieven <address@hidden>
>> + *
>> + * Permission is hereby granted, free of charge, to any person obtaining a 
>> copy
>> + * of this software and associated documentation files (the "Software"), to 
>> deal
>> + * in the Software without restriction, including without limitation the 
>> rights
>> + * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
>> + * copies of the Software, and to permit persons to whom the Software is
>> + * furnished to do so, subject to the following conditions:
>> + *
>> + * The above copyright notice and this permission notice shall be included 
>> in
>> + * all copies or substantial portions of the Software.
>> + *
>> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS 
>> OR
>> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
>> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
>> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR 
>> OTHER
>> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING 
>> FROM,
>> + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
>> + * THE SOFTWARE.
>> + */
>> +
>> +#include "config-host.h"
>> +
>> +#include <poll.h>
>> +#include "qemu-common.h"
>> +#include "qemu/config-file.h"
>> +#include "qemu/error-report.h"
>> +#include "block/block_int.h"
>> +#include "trace.h"
>> +#include "qemu/iov.h"
>> +#include "sysemu/sysemu.h"
>> +
>> +#include <nfsc/libnfs-zdr.h>
>> +#include <nfsc/libnfs.h>
>> +#include <nfsc/libnfs-raw.h>
>> +#include <nfsc/libnfs-raw-mount.h>
>> +
>> +typedef struct NFSClient {
>> +    struct nfs_context *context;
>> +    struct nfsfh *fh;
>> +    int events;
>> +    bool has_zero_init;
>> +} NFSClient;
>> +
>> +typedef struct NFSRPC {
>> +    int status;
>> +    int complete;
>> +    QEMUIOVector *iov;
>> +    struct stat *st;
>> +    Coroutine *co;
>> +    QEMUBH *bh;
>> +} NFSRPC;
>> +
>> +static void nfs_process_read(void *arg);
>> +static void nfs_process_write(void *arg);
>> +
>> +static void nfs_set_events(NFSClient *client)
>> +{
>> +    int ev = nfs_which_events(client->context);
>> +    if (ev != client->events) {
>> +        qemu_aio_set_fd_handler(nfs_get_fd(client->context),
>> +                      (ev & POLLIN) ? nfs_process_read : NULL,
>> +                      (ev & POLLOUT) ? nfs_process_write : NULL,
>> +                      client);
>> +
>> +    }
>> +    client->events = ev;
>> +}
>> +
>> +static void nfs_process_read(void *arg)
>> +{
>> +    NFSClient *client = arg;
>> +    nfs_service(client->context, POLLIN);
>> +    nfs_set_events(client);
>> +}
>> +
>> +static void nfs_process_write(void *arg)
>> +{
>> +    NFSClient *client = arg;
>> +    nfs_service(client->context, POLLOUT);
>> +    nfs_set_events(client);
>> +}
>> +
>> +static void nfs_co_init_task(NFSClient *client, NFSRPC *task)
>> +{
>> +    *task = (NFSRPC) {
>> +        .co         = qemu_coroutine_self(),
>> +    };
>> +}
>> +
>> +static void nfs_co_generic_bh_cb(void *opaque)
>> +{
>> +    NFSRPC *task = opaque;
>> +    qemu_bh_delete(task->bh);
>> +    qemu_coroutine_enter(task->co, NULL);
>> +}
>> +
>> +static void
>> +nfs_co_generic_cb(int status, struct nfs_context *nfs, void *data,
>> +                  void *private_data)
>> +{
>> +    NFSRPC *task = private_data;
>> +    task->complete = 1;
>> +    task->status = status;
>> +    if (task->status > 0 && task->iov) {
>> +        if (task->status <= task->iov->size) {
>> +            qemu_iovec_from_buf(task->iov, 0, data, task->status);
>> +        } else {
>> +            task->status = -EIO;
> Short reads don't happen in practice with libnfs?
Not hat I now of. Ronnie?
>
>> +        }
>> +    }
>> +    if (task->status == 0 && task->st) {
>> +        memcpy(task->st, data, sizeof(struct stat));
>> +    }
>> +    if (task->co) {
>> +        task->bh = qemu_bh_new(nfs_co_generic_bh_cb, task);
>> +        qemu_bh_schedule(task->bh);
>> +    }
>> +}
>> +
>> +static int coroutine_fn nfs_co_readv(BlockDriverState *bs,
>> +                                     int64_t sector_num, int nb_sectors,
>> +                                     QEMUIOVector *iov)
>> +{
>> +    NFSClient *client = bs->opaque;
>> +    NFSRPC task;
>> +
>> +    nfs_co_init_task(client, &task);
>> +    task.iov = iov;
>> +
>> +    if (nfs_pread_async(client->context, client->fh,
>> +                        sector_num * BDRV_SECTOR_SIZE,
>> +                        nb_sectors * BDRV_SECTOR_SIZE,
>> +                        nfs_co_generic_cb, &task) != 0) {
>> +        return -ENOMEM;
>> +    }
>> +
>> +    while (!task.complete) {
>> +        nfs_set_events(client);
>> +        qemu_coroutine_yield();
>> +    }
>> +
>> +    if (task.status < 0) {
>> +        return task.status;
>> +    }
>> +
>> +    return 0;
>> +}
>> +
>> +static int coroutine_fn nfs_co_writev(BlockDriverState *bs,
>> +                                        int64_t sector_num, int nb_sectors,
>> +                                        QEMUIOVector *iov)
>> +{
>> +    NFSClient *client = bs->opaque;
>> +    NFSRPC task;
>> +    char *buf = NULL;
>> +
>> +    nfs_co_init_task(client, &task);
>> +
>> +    buf = g_malloc(nb_sectors * BDRV_SECTOR_SIZE);
>> +    qemu_iovec_to_buf(iov, 0, buf, nb_sectors * BDRV_SECTOR_SIZE);
>> +
>> +    if (nfs_pwrite_async(client->context, client->fh,
>> +                         sector_num * BDRV_SECTOR_SIZE,
>> +                         nb_sectors * BDRV_SECTOR_SIZE,
>> +                         buf, nfs_co_generic_cb, &task) != 0) {
>> +        g_free(buf);
>> +        return -ENOMEM;
>> +    }
>> +
>> +    while (!task.complete) {
>> +        nfs_set_events(client);
>> +        qemu_coroutine_yield();
>> +    }
>> +
>> +    g_free(buf);
>> +
>> +    if (task.status != nb_sectors * BDRV_SECTOR_SIZE) {
>> +        return task.status < 0 ? task.status : -EIO;
> Does this duplicate the logic of nfs_co_generic_cb?
The logic inside nfs_co_generic_cb is for the read case.
>
>> +    }
>> +
>> +    return 0;
>> +}
>> +
>> +static int coroutine_fn nfs_co_flush(BlockDriverState *bs)
>> +{
>> +    NFSClient *client = bs->opaque;
>> +    NFSRPC task;
>> +
>> +    nfs_co_init_task(client, &task);
>> +
>> +    if (nfs_fsync_async(client->context, client->fh, nfs_co_generic_cb,
>> +                        &task) != 0) {
>> +        return -ENOMEM;
>> +    }
>> +
>> +    while (!task.complete) {
>> +        nfs_set_events(client);
>> +        qemu_coroutine_yield();
>> +    }
>> +
>> +    return task.status;
>> +}
>> +
>> +static QemuOptsList runtime_opts = {
>> +    .name = "nfs",
>> +    .head = QTAILQ_HEAD_INITIALIZER(runtime_opts.head),
>> +    .desc = {
>> +        {
>> +            .name = "filename",
>> +            .type = QEMU_OPT_STRING,
>> +            .help = "URL to the NFS file",
>> +        },
>> +        { /* end of list */ }
>> +    },
>> +};
> I think this is the point where I disagree. First of all, if it's a URL,
> call it "url" instead of "filename". But more importantly, a URL encodes
> options in a string instead of structured options that can be set
> separately.
Thats exactly what I meant above: The options are encoded in the
string, e.g.

nfs://server/export/file?uid=1000&gid=1000

Thats what libnfs expects. But all the options are not important for qemu.

"filename" was copied from block/iscsi.c. The semantic is exactly
the same. It accepts an URL and all the paramter parsing is
done inside libiscsi with iscsi_parse_url_full.

Peter



reply via email to

[Prev in Thread] Current Thread [Next in Thread]