[Top][All Lists]
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-devel] [PATCH v2] Add support for fd: protocol
From: |
Blue Swirl |
Subject: |
Re: [Qemu-devel] [PATCH v2] Add support for fd: protocol |
Date: |
Sat, 18 Jun 2011 23:50:28 +0300 |
On Thu, Jun 16, 2011 at 5:48 PM, Corey Bryant <address@hidden> wrote:
>
>
> On 06/15/2011 03:12 PM, Blue Swirl wrote:
>>
>> On Tue, Jun 14, 2011 at 4:31 PM, Corey Bryant<address@hidden> wrote:
>>>
>>> > sVirt provides SELinux MAC isolation for Qemu guest processes and
>>> > their
>>> > corresponding resources (image files). sVirt provides this support
>>> > by labeling guests and resources with security labels that are stored
>>> > in file system extended attributes. Some file systems, such as NFS, do
>>> > not support the extended attribute security namespace, which is needed
>>> > for image file isolation when using the sVirt SELinux security driver
>>> > in libvirt.
>>> >
>>> > The proposed solution entails a combination of Qemu, libvirt, and
>>> > SELinux patches that work together to isolate multiple guests' images
>>> > when they're stored in the same NFS mount. This results in an
>>> > environment where sVirt isolation and NFS image file isolation can
>>> > both
>>> > be provided.
>>> >
>>> > This patch contains the Qemu code to support this solution. I would
>>> > like to solicit input from the libvirt community prior to starting
>>> > the libvirt patch.
>>> >
>>> > Currently, Qemu opens an image file in addition to performing the
>>> > necessary read and write operations. The proposed solution will move
>>> > the open out of Qemu and into libvirt. Once libvirt opens an image
>>> > file for the guest, it will pass the file descriptor to Qemu via a
>>> > new fd: protocol.
>>> >
>>> > If the image file resides in an NFS mount, the following SELinux
>>> > policy
>>> > changes will provide image isolation:
>>> >
>>> > - A new SELinux boolean is created (e.g. virt_read_write_nfs) to
>>> > allow Qemu (svirt_t) to only have SELinux read and write
>>> > permissions on nfs_t files
>>> >
>>> > - Qemu (svirt_t) also gets SELinux use permissions on libvirt
>>> > (virtd_t) file descriptors
>>> >
>>> > Following is a sample invocation of Qemu using the fd: protocol on
>>> > the command line:
>>> >
>>> > qemu -drive file=fd:4,format=qcow2
>>> >
>>> > The fd: protocol is also supported by the drive_add monitor command.
>>> > This requires that the specified file descriptor is passed to the
>>> > monitor alongside a prior getfd monitor command.
>>> >
>>> > There are some additional features provided by certain image types
>>> > where Qemu reopens the image file. All of these scenarios will be
>>> > unsupported for the fd: protocol, at least for this patch:
>>> >
>>> > - The -snapshot command line option
>>> > - The savevm monitor command
>>> > - The snapshot_blkdev monitor command
>>> > - Starting Qemu with a backing file
>>
>> There's also native CDROM device. Did you consider adding an explicit
>> reopen method to block layer?
>
> Thanks. Yes it looks like I overlooked CDROM reopens.
>
> I'm not sure that I'm clear on the purpose of the reopen function. Would the
> goal be to funnel all block layer reopens through a single function,
> enabling potential future support where a privileged layer of Qemu, or
> libvirt, performs the open?
Eventually yes, but I think it would help also now by moving the
checks to a single place. It's a bit orthogonal to this patch though.
>>> > The thought is that this support can be added in the future, but is
>>> > not required for the initial fd: support.
>>> >
>>> > This patch was tested with the following formats: raw, cow, qcow,
>>> > qcow2, qed, and vmdk, using the fd: protocol from the command line
>>> > and the monitor. Tests were also run to verify existing file name
>>> > support and qemu-img were not regressed. Non-valid file descriptors,
>>> > fd: without format, snapshot and backing files were also tested.
>>> >
>>> > Signed-off-by: Corey Bryant<address@hidden>
>>> > ---
>>> > block.c | 16 ++++++++++
>>> > block.h | 1 +
>>> > block/cow.c | 5 +++
>>> > block/qcow.c | 5 +++
>>> > block/qcow2.c | 5 +++
>>> > block/qed.c | 4 ++
>>> > block/raw-posix.c | 81
>>> > +++++++++++++++++++++++++++++++++++++++++++++++------
>>> > block/vmdk.c | 5 +++
>>> > block_int.h | 1 +
>>> > blockdev.c | 10 ++++++
>>> > monitor.c | 5 +++
>>> > monitor.h | 1 +
>>> > qemu-options.hx | 8 +++--
>>> > qemu-tool.c | 5 +++
>>> > 14 files changed, 140 insertions(+), 12 deletions(-)
>>> >
>>> > diff --git a/block.c b/block.c
>>> > index 24a25d5..500db84 100644
>>> > --- a/block.c
>>> > +++ b/block.c
>>> > @@ -536,6 +536,10 @@ int bdrv_open(BlockDriverState *bs, const char
>>> > *filename, int flags,
>>> > char tmp_filename[PATH_MAX];
>>> > char backing_filename[PATH_MAX];
>>> >
>>> > + if (bdrv_is_fd_protocol(bs)) {
>>> > + return -ENOTSUP;
>>> > + }
>>> > +
>>> > /* if snapshot, we create a temporary backing file and open
>>> > it
>>> > instead of opening 'filename' directly */
>>> >
>>> > @@ -585,6 +589,10 @@ int bdrv_open(BlockDriverState *bs, const char
>>> > *filename, int flags,
>>> >
>>> > /* Find the right image format driver */
>>> > if (!drv) {
>>> > + /* format must be specified for fd: protocol */
>>> > + if (bdrv_is_fd_protocol(bs)) {
>>> > + return -ENOTSUP;
>>> > + }
>>> > ret = find_image_format(filename,&drv);
>>> > }
>>> >
>>> > @@ -1460,6 +1468,11 @@ int bdrv_enable_write_cache(BlockDriverState
>>> > *bs)
>>> > return bs->enable_write_cache;
>>> > }
>>> >
>>> > +int bdrv_is_fd_protocol(BlockDriverState *bs)
>>> > +{
>>> > + return bs->fd_protocol;
>>> > +}
>>> > +
>>> > /* XXX: no longer used */
>>> > void bdrv_set_change_cb(BlockDriverState *bs,
>>> > void (*change_cb)(void *opaque, int reason),
>>> > @@ -1964,6 +1977,9 @@ int bdrv_snapshot_create(BlockDriverState *bs,
>>> > BlockDriver *drv = bs->drv;
>>> > if (!drv)
>>> > return -ENOMEDIUM;
>>> > + if (bdrv_is_fd_protocol(bs)) {
>>> > + return -ENOTSUP;
>>> > + }
>>> > if (drv->bdrv_snapshot_create)
>>> > return drv->bdrv_snapshot_create(bs, sn_info);
>>> > if (bs->file)
>>> > diff --git a/block.h b/block.h
>>> > index da7d39c..dd46d52 100644
>>> > --- a/block.h
>>> > +++ b/block.h
>>> > @@ -182,6 +182,7 @@ int bdrv_is_removable(BlockDriverState *bs);
>>> > int bdrv_is_read_only(BlockDriverState *bs);
>>> > int bdrv_is_sg(BlockDriverState *bs);
>>> > int bdrv_enable_write_cache(BlockDriverState *bs);
>>> > +int bdrv_is_fd_protocol(BlockDriverState *bs);
>>> > int bdrv_is_inserted(BlockDriverState *bs);
>>> > int bdrv_media_changed(BlockDriverState *bs);
>>> > int bdrv_is_locked(BlockDriverState *bs);
>>> > diff --git a/block/cow.c b/block/cow.c
>>> > index 4cf543c..e17f8e7 100644
>>> > --- a/block/cow.c
>>> > +++ b/block/cow.c
>>> > @@ -82,6 +82,11 @@ static int cow_open(BlockDriverState *bs, int
>>> > flags)
>>> > pstrcpy(bs->backing_file, sizeof(bs->backing_file),
>>> > cow_header.backing_file);
>>> >
>>> > + if (bs->backing_file[0] != '\0'&& bdrv_is_fd_protocol(bs)) {
>>> > + /* backing file currently not supported by fd: protocol */
>>> > + goto fail;
>>> > + }
>>> > +
>>> > bitmap_size = ((bs->total_sectors + 7)>> 3) +
>>> > sizeof(cow_header);
>>> > s->cow_sectors_offset = (bitmap_size + 511)& ~511;
>>> > return 0;
>>> > diff --git a/block/qcow.c b/block/qcow.c
>>> > index a26c886..1e46bdb 100644
>>> > --- a/block/qcow.c
>>> > +++ b/block/qcow.c
>>> > @@ -157,6 +157,11 @@ static int qcow_open(BlockDriverState *bs, int
>>> > flags)
>>> > if (bdrv_pread(bs->file, header.backing_file_offset,
>>> > bs->backing_file, len) != len)
>>> > goto fail;
>>> > bs->backing_file[len] = '\0';
>>> > +
>>> > + if (bs->backing_file[0] != '\0'&& bdrv_is_fd_protocol(bs)) {
>>> > + /* backing file currently not supported by fd: protocol
>>> > */
>>> > + goto fail;
>>> > + }
>>> > }
>>> > return 0;
>>> >
>>> > diff --git a/block/qcow2.c b/block/qcow2.c
>>> > index 8451ded..8b9c160 100644
>>> > --- a/block/qcow2.c
>>> > +++ b/block/qcow2.c
>>> > @@ -270,6 +270,11 @@ static int qcow2_open(BlockDriverState *bs, int
>>> > flags)
>>> > goto fail;
>>> > }
>>> > bs->backing_file[len] = '\0';
>>> > +
>>> > + if (bs->backing_file[0] != '\0'&& bdrv_is_fd_protocol(bs)) {
>>> > + ret = -ENOTSUP;
>>> > + goto fail;
>>> > + }
>>> > }
>>> > if (qcow2_read_snapshots(bs)< 0) {
>>> > ret = -EINVAL;
>>> > diff --git a/block/qed.c b/block/qed.c
>>> > index 3970379..5028897 100644
>>> > --- a/block/qed.c
>>> > +++ b/block/qed.c
>>> > @@ -446,6 +446,10 @@ static int bdrv_qed_open(BlockDriverState *bs,
>>> > int flags)
>>> > return ret;
>>> > }
>>> >
>>> > + if (bs->backing_file[0] != '\0'&& bdrv_is_fd_protocol(bs)) {
>>> > + return -ENOTSUP;
>>> > + }
>>> > +
>>> > if (s->header.features& QED_F_BACKING_FORMAT_NO_PROBE) {
>>> > pstrcpy(bs->backing_format, sizeof(bs->backing_format),
>>> > "raw");
>>> > }
>>> > diff --git a/block/raw-posix.c b/block/raw-posix.c
>>> > index 4cd7d7a..c72de3d 100644
>>> > --- a/block/raw-posix.c
>>> > +++ b/block/raw-posix.c
>>> > @@ -28,6 +28,7 @@
>>> > #include "block_int.h"
>>> > #include "module.h"
>>> > #include "block/raw-posix-aio.h"
>>> > +#include "monitor.h"
>>> >
>>> > #ifdef CONFIG_COCOA
>>> > #include<paths.h>
>>> > @@ -183,7 +184,8 @@ static int raw_open_common(BlockDriverState *bs,
>>> > const char *filename,
>>> > int bdrv_flags, int open_flags)
>>> > {
>>> > BDRVRawState *s = bs->opaque;
>>> > - int fd, ret;
>>> > + int fd = -1;
>>> > + int ret;
>>> >
>>> > ret = raw_normalize_devicepath(&filename);
>>> > if (ret != 0) {
>>> > @@ -205,15 +207,17 @@ static int raw_open_common(BlockDriverState *bs,
>>> > const char *filename,
>>> > if (!(bdrv_flags& BDRV_O_CACHE_WB))
>>> > s->open_flags |= O_DSYNC;
>>> >
>>> > - s->fd = -1;
>>> > - fd = qemu_open(filename, s->open_flags, 0644);
>>> > - if (fd< 0) {
>>> > - ret = -errno;
>>> > - if (ret == -EROFS)
>>> > - ret = -EACCES;
>>> > - return ret;
>>> > + if (s->fd == -1) {
>>> > + fd = qemu_open(filename, s->open_flags, 0644);
>>> > + if (fd< 0) {
>>> > + ret = -errno;
>>> > + if (ret == -EROFS) {
>>> > + ret = -EACCES;
>>> > + }
>>> > + return ret;
>>> > + }
>>> > + s->fd = fd;
>>> > }
>>> > - s->fd = fd;
>>> > s->aligned_buf = NULL;
>>> >
>>> > if ((bdrv_flags& BDRV_O_NOCACHE)) {
>>> > @@ -270,6 +274,7 @@ static int raw_open(BlockDriverState *bs, const
>>> > char *filename, int flags)
>>> > {
>>> > BDRVRawState *s = bs->opaque;
>>> >
>>> > + s->fd = -1;
>>> > s->type = FTYPE_FILE;
>>> > return raw_open_common(bs, filename, flags, 0);
>>> > }
>>> > @@ -890,6 +895,60 @@ static BlockDriver bdrv_file = {
>>> > .create_options = raw_create_options,
>>> > };
>>> >
>>> > +static int raw_open_fd(BlockDriverState *bs, const char *filename,
>>> > int flags)
>>> > +{
>>> > + BDRVRawState *s = bs->opaque;
>>> > + const char *fd_str;
>>> > + int fd;
>>> > +
>>> > + /* extract the file descriptor - fail if it's not fd: */
>>> > + if (!strstart(filename, "fd:",&fd_str)) {
>>> > + return -EINVAL;
>>> > + }
>>> > +
>>> > + if (!qemu_isdigit(fd_str[0])) {
>>> > + /* get fd from monitor */
>>> > + fd = qemu_get_fd(fd_str);
>>> > + if (fd == -1) {
>>> > + return -EBADF;
>>> > + }
>>> > + } else {
>>> > + char *endptr = NULL;
>>> > +
>>> > + fd = strtol(fd_str,&endptr, 10);
>>> > + if (*endptr || (fd == 0&& fd_str == endptr)) {
>>> > + return -EBADF;
>>> > + }
>>> > + }
>>> > +
>>> > + s->fd = fd;
>>> > + s->type = FTYPE_FILE;
>>> > +
>>> > + return raw_open_common(bs, filename, flags, 0);
>>> > +}
>>> > +
>>> > +static BlockDriver bdrv_file_fd = {
>>> > + .format_name = "file",
>>> > + .protocol_name = "fd",
>>> > + .instance_size = sizeof(BDRVRawState),
>>> > + .bdrv_probe = NULL, /* no probe for protocols */
>>> > + .bdrv_file_open = raw_open_fd,
>>> > + .bdrv_read = raw_read,
>>> > + .bdrv_write = raw_write,
>>> > + .bdrv_close = raw_close,
>>> > + .bdrv_flush = raw_flush,
>>> > + .bdrv_discard = raw_discard,
>>> > +
>>> > + .bdrv_aio_readv = raw_aio_readv,
>>> > + .bdrv_aio_writev = raw_aio_writev,
>>> > + .bdrv_aio_flush = raw_aio_flush,
>>> > +
>>> > + .bdrv_truncate = raw_truncate,
>>> > + .bdrv_getlength = raw_getlength,
>>> > +
>>> > + .create_options = raw_create_options,
>>> > +};
>>> > +
>>> > /***********************************************/
>>> > /* host device */
>>> >
>>> > @@ -998,6 +1057,7 @@ static int hdev_open(BlockDriverState *bs, const
>>> > char *filename, int flags)
>>> > }
>>> > #endif
>>> >
>>> > + s->fd = -1;
>>> > s->type = FTYPE_FILE;
>>> > #if defined(__linux__)
>>> > {
>>> > @@ -1168,6 +1228,7 @@ static int floppy_open(BlockDriverState *bs,
>>> > const char *filename, int flags)
>>> > BDRVRawState *s = bs->opaque;
>>> > int ret;
>>> >
>>> > + s->fd = -1;
>>> > s->type = FTYPE_FD;
>>> >
>>> > /* open will not fail even if no floppy is inserted, so add
>>> > O_NONBLOCK */
>>> > @@ -1280,6 +1341,7 @@ static int cdrom_open(BlockDriverState *bs,
>>> > const char *filename, int flags)
>>> > {
>>> > BDRVRawState *s = bs->opaque;
>>> >
>>> > + s->fd = -1;
>>> > s->type = FTYPE_CD;
>>> >
>>> > /* open will not fail even if no CD is inserted, so add
>>> > O_NONBLOCK */
>>> > @@ -1503,6 +1565,7 @@ static void bdrv_file_init(void)
>>> > * Register all the drivers. Note that order is important, the
>>> > driver
>>> > * registered last will get probed first.
>>> > */
>>> > + bdrv_register(&bdrv_file_fd);
>>> > bdrv_register(&bdrv_file);
>>> > bdrv_register(&bdrv_host_device);
>>> > #ifdef __linux__
>>> > diff --git a/block/vmdk.c b/block/vmdk.c
>>> > index 922b23d..2ea808e 100644
>>> > --- a/block/vmdk.c
>>> > +++ b/block/vmdk.c
>>> > @@ -353,6 +353,11 @@ static int vmdk_parent_open(BlockDriverState *bs)
>>> > return -1;
>>> >
>>> > pstrcpy(bs->backing_file, end_name - p_name + 1, p_name);
>>> > +
>>> > + if (bs->backing_file[0] != '\0'&& bdrv_is_fd_protocol(bs)) {
>>> > + /* backing file currently not supported by fd: protocol
>>> > */
>>> > + return -1;
>>> > + }
>>> > }
>>> >
>>> > return 0;
>>> > diff --git a/block_int.h b/block_int.h
>>> > index fa91337..a305ee2 100644
>>> > --- a/block_int.h
>>> > +++ b/block_int.h
>>> > @@ -152,6 +152,7 @@ struct BlockDriverState {
>>> > int encrypted; /* if true, the media is encrypted */
>>> > int valid_key; /* if true, a valid encryption key has been set */
>>> > int sg; /* if true, the device is a /dev/sg* */
>>> > + int fd_protocol; /* if true, the fd: protocol was specified */
>>
>> bool?
>>
> I was following suit here, but I agree that bool would be better. Or
> better, these could all be reduced to bit flags. My thought is that I'll
> stick with following suit here though.
OK, better convert all at once with a separate patch.
>>> > /* event callback when inserting/removing */
>>> > void (*change_cb)(void *opaque, int reason);
>>> > void *change_opaque;
>>> > diff --git a/blockdev.c b/blockdev.c
>>> > index e81e0ab..a536c20 100644
>>> > --- a/blockdev.c
>>> > +++ b/blockdev.c
>>> > @@ -546,6 +546,10 @@ DriveInfo *drive_init(QemuOpts *opts, int
>>> > default_to_scsi)
>>> >
>>> > bdrv_flags |= ro ? 0 : BDRV_O_RDWR;
>>> >
>>> > + if (strncmp(file, "fd:", 3) == 0) {
>>> > + dinfo->bdrv->fd_protocol = 1;
>>> > + }
>>> > +
>>> > ret = bdrv_open(dinfo->bdrv, file, bdrv_flags, drv);
>>> > if (ret< 0) {
>>> > error_report("could not open disk image %s: %s",
>>> > @@ -606,6 +610,12 @@ int do_snapshot_blkdev(Monitor *mon, const QDict
>>> > *qdict, QObject **ret_data)
>>> > goto out;
>>> > }
>>> >
>>> > + if (bdrv_is_fd_protocol(bs)) {
>>> > + qerror_report(QERR_UNSUPPORTED);
>>> > + ret = -1;
>>> > + goto out;
>>> > + }
>>> > +
>>> > pstrcpy(old_filename, sizeof(old_filename), bs->filename);
>>> >
>>> > old_drv = bs->drv;
>>> > diff --git a/monitor.c b/monitor.c
>>> > index 59a3e76..ea60be2 100644
>>> > --- a/monitor.c
>>> > +++ b/monitor.c
>>> > @@ -2832,6 +2832,11 @@ int monitor_get_fd(Monitor *mon, const char
>>> > *fdname)
>>> > return -1;
>>> > }
>>> >
>>> > +int qemu_get_fd(const char *fdname)
>>> > +{
>>> > + return cur_mon ? monitor_get_fd(cur_mon, fdname) : -1;
>>> > +}
>>> > +
>>> > static const mon_cmd_t mon_cmds[] = {
>>> > #include "hmp-commands.h"
>>> > { NULL, NULL, },
>>> > diff --git a/monitor.h b/monitor.h
>>> > index 4f2d328..de5b987 100644
>>> > --- a/monitor.h
>>> > +++ b/monitor.h
>>> > @@ -51,6 +51,7 @@ int monitor_read_bdrv_key_start(Monitor *mon,
>>> > BlockDriverState *bs,
>>> > void *opaque);
>>> >
>>> > int monitor_get_fd(Monitor *mon, const char *fdname);
>>> > +int qemu_get_fd(const char *fdname);
>>> >
>>> > void monitor_vprintf(Monitor *mon, const char *fmt, va_list ap)
>>> > GCC_FMT_ATTR(2, 0);
>>> > diff --git a/qemu-options.hx b/qemu-options.hx
>>> > index 1d5ad8b..f9b66f4 100644
>>> > --- a/qemu-options.hx
>>> > +++ b/qemu-options.hx
>>> > @@ -116,7 +116,7 @@ using @file{/dev/cdrom} as filename
>>> > (@pxref{host_drives}).
>>> > ETEXI
>>> >
>>> > DEF("drive", HAS_ARG, QEMU_OPTION_drive,
>>> > - "-drive
>>> > [file=file][,if=type][,bus=n][,unit=m][,media=d][,index=i]\n"
>>> > + "-drive
>>> > [file=[fd:]file][,if=type][,bus=n][,unit=m][,media=d][,index=i]\n"
>>> > " [,cyls=c,heads=h,secs=s[,trans=t]][,snapshot=on|off]\n"
>>> > " [,cache=writethrough|writeback|none|unsafe][,format=f]\n"
>>> > " [,serial=s][,addr=A][,id=name][,aio=threads|native]\n"
>>> > @@ -129,10 +129,12 @@ STEXI
>>> > Define a new drive. Valid options are:
>>> >
>>> > address@hidden @option
>>> > address@hidden address@hidden
>>> > address@hidden file=[fd:address@hidden
>>> > This option defines which disk image (@pxref{disk_images}) to use
>>> > with
>>> > this drive. If the filename contains comma, you must double it
>>> > -(for instance, "file=my,,file" to use file "my,file").
>>> > +(for instance, "file=my,,file" to use file "my,file").
>>> > @option{fd:address@hidden
>>> > +specifies the file descriptor of an already open disk image.
>>> > address@hidden@var{format} is required by @option{fd:address@hidden
>>> > address@hidden address@hidden
>>> > This option defines on which type on interface the drive is
>>> > connected.
>>> > Available types are: ide, scsi, sd, mtd, floppy, pflash, virtio.
>>> > diff --git a/qemu-tool.c b/qemu-tool.c
>>> > index 41e5c41..8fe6b8c 100644
>>> > --- a/qemu-tool.c
>>> > +++ b/qemu-tool.c
>>> > @@ -96,3 +96,8 @@ int64_t qemu_get_clock_ns(QEMUClock *clock)
>>> > {
>>> > return 0;
>>> > }
>>> > +
>>> > +int qemu_get_fd(const char *fdname)
>>> > +{
>>> > + return -1;
>>> > +}
>>> > --
>>> > 1.7.1
>>> >
>>> >
>>> >
>
>
> Regards,
> Corey Bryant
>
Re: [Qemu-devel] [PATCH v2] Add support for fd: protocol, Blue Swirl, 2011/06/15
Re: [Qemu-devel] [PATCH v2] Add support for fd: protocol, Avi Kivity, 2011/06/20