Re: [Qemu-devel] Design of the blobstore [API of the NVRAM]
From: Stefan Hajnoczi
Subject: Re: [Qemu-devel] Design of the blobstore [API of the NVRAM]
Date: Fri, 16 Sep 2011 11:35:17 +0100
User-agent: Mutt/1.5.21 (2010-09-15)
On Thu, Sep 15, 2011 at 08:34:55AM -0400, Stefan Berger wrote:
> On 09/15/2011 07:17 AM, Stefan Hajnoczi wrote:
> >On Wed, Sep 14, 2011 at 6:05 PM, Stefan Berger
> ><address@hidden> wrote:
> >> One property of the blobstore is that it has a certain required size for
> >> accommodating all blobs of the devices that want to store their blobs in
> >> it. The assumption is that the size of these blobs is known a priori to
> >> the writer of the device code, so all devices can register their space
> >> requirements with the blobstore during device initialization. Gathering
> >> all the registered blobs' sizes, plus knowing the overhead of the data
> >> layout on disk, lets QEMU calculate the total required (minimum) size
> >> that the image must have to accommodate all blobs in a particular
> >> blobstore.
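As a rough illustration of the size calculation described in the quoted paragraph, a minimal sketch follows; the structure, the per-blob overhead constant, and the function name are hypothetical and not part of the actual patches:

    /* Hedged sketch of the minimum-size calculation; all names and the
     * per-blob overhead value are made up for illustration only. */
    #define BLOB_LAYOUT_OVERHEAD 64        /* assumed per-blob on-disk metadata */

    typedef struct RegisteredBlob {
        unsigned int maxsize;              /* size registered by the device */
    } RegisteredBlob;

    static unsigned int nvram_required_size(const RegisteredBlob *blobs,
                                            unsigned int nblobs)
    {
        unsigned int total = 0, i;

        for (i = 0; i < nblobs; i++) {
            total += BLOB_LAYOUT_OVERHEAD + blobs[i].maxsize;
        }
        return total;   /* minimum image size needed to hold all blobs */
    }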
> >Libraries like tdb or gdbm come to mind. We should be careful not to
> >reinvent cpio/tar or FAT :).
> Sure. As long as these dbs allow us to override open(), close(),
> read(), write(), and seek() with bdrv ops, we could recycle any of
> these. Maybe we can build something smaller than those...
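To illustrate the point about overriding a db library's file operations, here is a minimal sketch of how a pread-style callback could be backed by QEMU's block layer; the callback typedef is hypothetical, and bdrv_pread()'s offset/buffer/length signature is assumed from the block API of that time:

    /* Hedged sketch only: mapping a (hypothetical) db-library read callback
     * onto bdrv_pread(), which is assumed to return the number of bytes
     * read or a negative error code. */
    #include "block.h"

    typedef int (*db_pread_fn)(void *opaque, int64_t offset, void *buf, int len);

    static int nvram_db_pread(void *opaque, int64_t offset, void *buf, int len)
    {
        BlockDriverState *bs = opaque;   /* the NVRAM's backing drive */

        return bdrv_pread(bs, offset, buf, len);
    }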
> >What about live migration? If each VM has a LUN assigned on a SAN
> >then these qcow2 files add a new requirement for a shared file system.
> >
> Well, one can still block-migrate these. The user has to know, of
> course, whether shared storage is set up or not and pass the
> appropriate flags to libvirt for migration. I know it works (modulo
> some problems when using encrypted qcow2) since I've been testing
> with it.
>
> >Perhaps it makes sense to include the blobstore in the VM state data
> >instead? If you take that approach then the blobstore will get
> >snapshotted *into* the existing qcow2 images. Then you don't need a
> >shared file system for migration to work.
> >
> It could be an option. However, if the user has a raw image for the
> VM, we still need the NVRAM emulation for the TPM, for example. So we
> need to store the persistent data somewhere, but raw is not prepared
> for that. Even if snapshotting doesn't work at all, we need to be
> able to persist devices' data.
>
>
> >Can you share your design for the actual QEMU API that the TPM code
> >will use to manipulate the blobstore? Is it designed to work in the
> >event loop while QEMU is running, or is it for rare I/O on
> >startup/shutdown?
> >
> Everything is kind of changing now. But here's what I have right now:
>
> tb->s.tpm_ltpms->nvram = nvram_setup(tpm_ltpms->drive_id, &errcode);
> if (!tb->s.tpm_ltpms->nvram) {
>     fprintf(stderr, "Could not find nvram.\n");
>     return errcode;
> }
>
> nvram_register_blob(tb->s.tpm_ltpms->nvram,
>                     NVRAM_ENTRY_PERMSTATE,
>                     tpmlib_get_prop(TPMPROP_TPM_MAX_NV_SPACE));
> nvram_register_blob(tb->s.tpm_ltpms->nvram,
>                     NVRAM_ENTRY_SAVESTATE,
>                     tpmlib_get_prop(TPMPROP_TPM_MAX_SAVESTATE_SPACE));
> nvram_register_blob(tb->s.tpm_ltpms->nvram,
>                     NVRAM_ENTRY_VOLASTATE,
>                     tpmlib_get_prop(TPMPROP_TPM_MAX_VOLATILESTATE_SPACE));
>
> rc = nvram_start(tpm_ltpms->nvram, fail_on_encrypted_drive);
>
> The above first sets up the NVRAM using the drive's id, i.e., the
> -tpmdev ...,nvram=my-bs parameter. This establishes the NVRAM.
> Subsequently the blobs to be written into the NVRAM are registered.
> nvram_start then reconciles the registered NVRAM blobs with those
> found on disk; if everything fits together, the result is 'rc = 0'
> and the NVRAM is ready to go. Other devices can then do the same,
> either with the same NVRAM or with another one. (It is called NVRAM
> now, after renaming from blobstore.)
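A minimal sketch of what the reconciliation in nvram_start() might boil down to; the VNVRAM fields used here (disk_size, encrypted) and the error codes are hypothetical, not taken from the actual patches:

    /* Hedged sketch of nvram_start()'s reconciliation step; field names
     * and error codes are invented for illustration. */
    static int nvram_start_sketch(VNVRAM *nvram, bool fail_on_encrypted_drive)
    {
        /* sum of all registered blob sizes plus layout overhead */
        unsigned int required = nvram_get_totalsize(nvram);

        if (fail_on_encrypted_drive && nvram->encrypted) {
            return -EPERM;    /* refuse to start on an encrypted drive */
        }
        if (nvram->disk_size < required) {
            return -ENOSPC;   /* image too small for all registered blobs */
        }
        return 0;             /* registrations and on-disk blobs agree */
    }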
>
> Reading from NVRAM in case of the TPM is a rare event. It happens in
> the context of QEMU's main thread:
>
> if (nvram_read_data(tpm_ltpms->nvram,
>                     NVRAM_ENTRY_PERMSTATE,
>                     &tpm_ltpms->permanent_state.buffer,
>                     &tpm_ltpms->permanent_state.size,
>                     0, NULL, NULL) ||
>     nvram_read_data(tpm_ltpms->nvram,
>                     NVRAM_ENTRY_SAVESTATE,
>                     &tpm_ltpms->save_state.buffer,
>                     &tpm_ltpms->save_state.size,
>                     0, NULL, NULL)) {
>     tpm_ltpms->had_fatal_error = true;
>     return;
> }
>
> Above reads the data of 2 blobs synchronously. This happens during startup.
>
>
> Writes depend on what the user does with the TPM. The user can
> trigger lots of updates to persistent state by performing certain
> operations, e.g., persisting keys inside the TPM.
>
> rc = nvram_write_data(tpm_ltpms->nvram,
>                       what, tsb->buffer, tsb->size,
>                       VNVRAM_ASYNC_F | VNVRAM_WAIT_COMPLETION_F,
>                       NULL, NULL);
>
> The above writes a TPM blob into the NVRAM. It is invoked by the TPM
> thread and notifies the QEMU main thread to write the blob into
> NVRAM. At the moment I do this synchronously, not using the last two
> parameters for a completion callback but rather the two flags: the
> first notifies the main thread, the second waits for the completion
> of the request (using a condition variable internally).
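As an illustration of how VNVRAM_WAIT_COMPLETION_F could work internally, here is a minimal sketch using QEMU's threading primitives; only QemuMutex/QemuCond and their lock/wait/signal calls are real QEMU API, the request structure and function names are hypothetical:

    /* Hedged sketch of the internal wait-for-completion path; the
     * VNVRAMRequest structure and function names are made up. */
    #include "qemu-thread.h"

    typedef struct VNVRAMRequest {
        QemuMutex lock;
        QemuCond  cond;
        bool      done;
        int       rc;
    } VNVRAMRequest;

    /* TPM thread: called after queuing the request and kicking the main
     * loop (the VNVRAM_ASYNC_F part); blocks until the write finished. */
    static int vnvram_wait_completion(VNVRAMRequest *req)
    {
        qemu_mutex_lock(&req->lock);
        while (!req->done) {
            qemu_cond_wait(&req->cond, &req->lock);
        }
        qemu_mutex_unlock(&req->lock);
        return req->rc;
    }

    /* Main thread: called once the bdrv write has completed. */
    static void vnvram_complete_request(VNVRAMRequest *req, int rc)
    {
        qemu_mutex_lock(&req->lock);
        req->rc = rc;
        req->done = true;
        qemu_cond_signal(&req->cond);
        qemu_mutex_unlock(&req->lock);
    }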
>
> Here are the protos:
>
> VNVRAM *nvram_setup(const char *drive_id, int *errcode);
>
> int nvram_start(VNVRAM *, bool fail_on_encrypted_drive);
>
> int nvram_register_blob(VNVRAM *bs, enum NVRAMEntryType type,
>                         unsigned int maxsize);
>
> unsigned int nvram_get_totalsize(VNVRAM *bs);
> unsigned int nvram_get_totalsize_kb(VNVRAM *bs);
>
> typedef void NVRAMRWFinishCB(void *opaque, int errcode, bool is_write,
>                              unsigned char **data, unsigned int len);
>
> int nvram_write_data(VNVRAM *bs, enum NVRAMEntryType type,
>                      const unsigned char *data, unsigned int len,
>                      int flags, NVRAMRWFinishCB cb, void *opaque);
>
>
> As said, things are changing right now, so this is to give an impression...
Thanks, these details are interesting. I interpreted the blobstore as a
key-value store, but these examples show it as a stream: no IDs or
offsets are given; the reads are just performed in order and move
through the NVRAM. If it stays this simple then bdrv_*() is indeed a
natural way to do this, although my migration point remains, since this
feature adds a new requirement for shared storage when it would be
pretty easy to put this stuff in the VM state data stream (IIUC the TPM
NVRAM is relatively small?).
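To make that alternative concrete, here is a rough sketch of what carrying a TPM blob in the VM state data stream could look like with vmstate; the device state structure, field names, and the fixed maximum size are hypothetical:

    /* Hedged sketch only: a fixed-size TPM blob carried in the migration /
     * savevm stream via vmstate.  Names and sizes are invented. */
    #define TPM_PERMSTATE_MAX 2048            /* assumed maximum blob size */

    typedef struct TPMNVState {
        uint8_t permstate[TPM_PERMSTATE_MAX]; /* permanent state blob */
    } TPMNVState;

    static const VMStateDescription vmstate_tpm_nvram = {
        .name = "tpm-nvram",
        .version_id = 1,
        .minimum_version_id = 1,
        .fields = (VMStateField[]) {
            VMSTATE_BUFFER(permstate, TPMNVState),
            VMSTATE_END_OF_LIST()
        }
    };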
Stefan