[Top][All Lists]
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-devel] [patch 4/5][v2] Aggregate same type clusters.
From: |
Laurent Vivier |
Subject: |
Re: [Qemu-devel] [patch 4/5][v2] Aggregate same type clusters. |
Date: |
Mon, 11 Aug 2008 14:39:05 +0200 |
Hi Kevin,
BTW, I'm currently rewriting this patch...
Le lundi 11 août 2008 à 14:10 +0200, Kevin Wolf a écrit :
> Laurent Vivier schrieb:
> > Modify get_cluster_offset(), alloc_cluster_offset() and free_used_clusters()
> > to specify how many clusters we want.
> >
> > Signed-off-by: Laurent Vivier <address@hidden>
> > ---
> > block-qcow2.c | 212
> > ++++++++++++++++++++++++++++++++++++++++++----------------
> > 1 file changed, 154 insertions(+), 58 deletions(-)
> >
> > Index: qemu/block-qcow2.c
> > ===================================================================
> > --- qemu.orig/block-qcow2.c 2008-07-29 15:22:26.000000000 +0200
> > +++ qemu/block-qcow2.c 2008-07-29 15:22:28.000000000 +0200
> > @@ -575,32 +575,76 @@ static int l2_allocate(BlockDriverState
> > return 1;
> > }
> >
> > -static uint64_t get_cluster_offset(BlockDriverState *bs, uint64_t offset)
> > +static uint64_t get_cluster_offset(BlockDriverState *bs,
> > + uint64_t offset, int *num)
>
> I think you start to know what kind of comments I'll provide. So yes,
> here's another one of them: While it's intuitive what value I should
> pass for num, it's cleary not what the function will return in it. Or
> even what the function is doing at all.
>
> This is how I understand it: The returned num is the number of
> contiguous clusters that can be read with a single read operation, i.e.
> they are all sparse, come from a backing file or are physically
> contiguous in the image file.
Yes
> Add a comment which says this and I'll be happy.
OK (I'll cut&paste the lines above ;-) )
> > {
> > BDRVQcowState *s = bs->opaque;
> > int l1_index, l2_index, ret;
> > - uint64_t l2_offset, *l2_table, cluster_offset;
> > + uint64_t l2_offset, *l2_table, cluster_offset, next;
> > + int l1_bits;
> > + int index_in_cluster, nb_available, nb_needed;
> >
> > - l1_index = offset >> (s->l2_bits + s->cluster_bits);
> > + index_in_cluster = (offset >> 9) & (s->cluster_sectors - 1);
> > + nb_needed = *num + index_in_cluster;
> > +
> > + l1_bits = s->l2_bits + s->cluster_bits;
> > +
> > + nb_available = (1 << l1_bits) - (offset & ((1 << l1_bits) - 1));
> > + nb_available = (nb_available >> 9) + index_in_cluster;
>
> This could use a comment that nb_available is the remaining sectors in
> the L2 table (is it?) and that it is used in the following two
> conditions (the goto makes this non-obvious - at first, I thought that
> this value wouldn't be used at all)
nb_available = (1 << l1_bits) - (offset & ((1 << l1_bits) - 1));
is the number of bytes given by one l1 entry
nb_available = (nb_available >> 9) + index_in_cluster;
is the number of sectors from the first sector we want to the last
sector of the same l1 entry.
I do that because I don't want to manage the case where we run across a
l1 cache entry boundary.
I guess you want comments ?
> > +
> > + cluster_offset = 0;
> > +
> > + l1_index = offset >> l1_bits;
> > if (l1_index >= s->l1_size)
> > - return 0;
> > + goto out;
> >
> > if (!s->l1_table[l1_index])
> > - return 0;
> > + goto out;
> >
> > ret = l2_load(bs, l1_index, &l2_table, &l2_offset);
> > if (ret == 0)
> > - return 0;
> > + goto out;
>
> ret == 0 means that loading the L2 table failed. This is a real error,
> right? Isn't return 0 the right thing to do then?
Yes... :-P
> >
> > l2_index = (offset >> s->cluster_bits) & (s->l2_size - 1);
> > cluster_offset = be64_to_cpu(l2_table[l2_index]);
> > + nb_available = s->cluster_sectors;
> > + l2_index++;
> > +
> > + if (!cluster_offset) {
> > +
> > + /* how many empty clusters ? */
> > +
> > + while (nb_available < nb_needed && !l2_table[l2_index]) {
> > + l2_index++;
> > + nb_available += s->cluster_sectors;
> > + }
> > +
> > + } else {
> >
> > - return cluster_offset & ~QCOW_OFLAG_COPIED;
> > + /* how many allocated clusters ? */
> > +
> > + cluster_offset &= ~QCOW_OFLAG_COPIED;
> > + while (nb_available < nb_needed) {
> > + next = be64_to_cpu(l2_table[l2_index]) & ~QCOW_OFLAG_COPIED;
> > + if (next != cluster_offset + (nb_available << 9))
> > + break;
> > + l2_index++;
> > + nb_available += s->cluster_sectors;
> > + }
> > + }
> > +
> > +out:
> > + if (nb_available > nb_needed)
> > + nb_available = nb_needed;
> > +
> > + *num = nb_available - index_in_cluster;
> > +
> > + return cluster_offset;
> > }
> >
> > static uint64_t free_used_clusters(BlockDriverState *bs, uint64_t offset,
> > uint64_t **l2_table, uint64_t *l2_offset,
> > - int *l2_index)
> > + int *l2_index, int *nb_clusters)
>
> You would save some ifs if you didn't allow nb_cluster to be NULL.
> Passing a local variable containing 1 should do the very same thing and
> seems to be less error prone. Otherwise, put a note here which says what
> passing NULL means.
>
To follow your previous comments (patch 3), free_used_clusters() has
been removed from this patch...
> > {
> > BDRVQcowState *s = bs->opaque;
> > int l1_index, ret;
> > @@ -629,21 +673,63 @@ static uint64_t free_used_clusters(Block
> > *l2_index = (offset >> s->cluster_bits) & (s->l2_size - 1);
> > cluster_offset = be64_to_cpu((*l2_table)[*l2_index]);
> >
> > - if (cluster_offset & QCOW_OFLAG_COPIED)
> > + if (nb_clusters && *nb_clusters > s->l2_size - (*l2_index))
> > + *nb_clusters = s->l2_size - (*l2_index);
> > +
> > + if (!cluster_offset) {
> > + if (nb_clusters) {
> > + int i = 1;
> > + while (i < *nb_clusters && (*l2_table)[(*l2_index) + i] == 0) {
> > + i++;
> > + }
> > + *nb_clusters = i;
> > + }
> > + return 0;
> > + }
> > +
> > + if (cluster_offset & QCOW_OFLAG_COPIED) {
> > + if (nb_clusters) {
> > + int i = 1;
> > + uint64_t current;
> > + while (i < *nb_clusters) {
> > + current = be64_to_cpu((*l2_table)[(*l2_index) + i]);
> > + if (cluster_offset + (i << s->cluster_bits) != current)
> > + break;
> > + i++;
> > + }
> > + *nb_clusters = i;
> > + }
> > return cluster_offset;
> > + }
> >
> > - if (cluster_offset) {
> > - /* free the cluster */
> > - if (cluster_offset & QCOW_OFLAG_COMPRESSED) {
> > - int nb_csectors;
> > - nb_csectors = ((cluster_offset >> s->csize_shift) &
> > - s->csize_mask) + 1;
> > - free_clusters(bs, (cluster_offset & s->cluster_offset_mask) &
> > ~511,
> > - nb_csectors * 512);
> > - } else {
> > - free_clusters(bs, cluster_offset, s->cluster_size);
> > + /* free the cluster */
> > +
> > + if (cluster_offset & QCOW_OFLAG_COMPRESSED) {
> > + int nb_csectors;
> > + nb_csectors = ((cluster_offset >> s->csize_shift) & s->csize_mask)
> > + 1;
> > + free_clusters(bs, (cluster_offset & s->cluster_offset_mask) & ~511,
> > + nb_csectors * 512);
> > + if (nb_clusters)
> > + *nb_clusters = 1;
> > + return 0;
> > + }
> > +
> > + if (nb_clusters) {
> > + int i = 1;
> > + uint64_t current;
> > + while (i < *nb_clusters) {
> > + current = be64_to_cpu((*l2_table)[(*l2_index) + i]);
> > + if (cluster_offset + (i << s->cluster_bits) != current)
> > + break;
> > + i++;
> > }
> > + *nb_clusters = i;
> > + free_clusters(bs, cluster_offset, i << s->cluster_bits);
> > + return 0;
> > }
> > +
> > + free_clusters(bs, cluster_offset, s->cluster_size);
> > +
> > return 0;
> > }
> >
> > @@ -657,7 +743,8 @@ static uint64_t alloc_compressed_cluster
> > int nb_csectors;
> >
> > cluster_offset = free_used_clusters(bs, offset,
> > - &l2_table, &l2_offset, &l2_index);
> > + &l2_table, &l2_offset, &l2_index,
> > + NULL);
> > if (cluster_offset & QCOW_OFLAG_COPIED)
> > return cluster_offset & ~QCOW_OFLAG_COPIED;
> >
> > @@ -683,63 +770,80 @@ static uint64_t alloc_compressed_cluster
> >
> > static uint64_t alloc_cluster_offset(BlockDriverState *bs,
> > uint64_t offset,
> > - int n_start, int n_end)
> > + int n_start, int n_end,
> > + int *num)
>
> The interface between get_cluster_offset and alloc_cluster_offset is
> inconsistent. In the former function, the value passed in num is used to
> determine the number of clusters to get. In the latter, num is an output
> parameter whose value isn't used. This is confusing.
Yes, I know, I have to think about this.
> > {
> > BDRVQcowState *s = bs->opaque;
> > int l2_index, ret;
> > uint64_t l2_offset, *l2_table, cluster_offset;
> > + int nb_available, nb_clusters, i;
> > + uint64_t start_sect;
> >
> > + nb_clusters = ((n_end << 9) + s->cluster_size - 1) >>
> > + s->cluster_bits;
> >
> > cluster_offset = free_used_clusters(bs, offset,
> > - &l2_table, &l2_offset, &l2_index);
> > - if (cluster_offset & QCOW_OFLAG_COPIED)
> > - return cluster_offset & ~QCOW_OFLAG_COPIED;
> > + &l2_table, &l2_offset, &l2_index,
> > + &nb_clusters);
> > + nb_available = nb_clusters << (s->cluster_bits - 9);
> > + if (nb_available > n_end)
> > + nb_available = n_end;
> > +
> > + if (cluster_offset & QCOW_OFLAG_COPIED) {
> > + cluster_offset &= ~QCOW_OFLAG_COPIED;
> > + goto out;
> > + }
> >
> > - /* allocate a new cluster */
> > + /* allocate new clusters */
> >
> > - cluster_offset = alloc_clusters(bs, s->cluster_size);
> > + cluster_offset = alloc_clusters(bs, nb_clusters * s->cluster_size);
> >
> > /* we must initialize the cluster content which won't be
> > written */
> >
> > - if ((n_end - n_start) < s->cluster_sectors) {
> > - uint64_t start_sect;
> > -
> > - start_sect = (offset & ~(s->cluster_size - 1)) >> 9;
> > + start_sect = (offset & ~(s->cluster_size - 1)) >> 9;
> > + if (n_start) {
> > ret = copy_sectors(bs, start_sect, cluster_offset, 0, n_start);
> > if (ret < 0)
> > return 0;
> > - ret = copy_sectors(bs, start_sect,
> > - cluster_offset, n_end, s->cluster_sectors);
> > + }
> > +
> > + if (nb_available & (s->cluster_sectors - 1)) {
> > + uint64_t end = nb_available & ~(uint64_t)(s->cluster_sectors - 1);
> > + ret = copy_sectors(bs, start_sect + end,
> > + cluster_offset + (end << 9),
> > + nb_available - end,
> > + s->cluster_sectors);
> > if (ret < 0)
> > return 0;
> > }
> >
> > /* update L2 table */
> >
> > - l2_table[l2_index] = cpu_to_be64(cluster_offset | QCOW_OFLAG_COPIED);
> > + for (i = 0; i < nb_clusters; i++)
> > + l2_table[l2_index + i] = cpu_to_be64((cluster_offset +
> > + (i << s->cluster_bits)) |
> > + QCOW_OFLAG_COPIED);
> > +
> > if (bdrv_pwrite(s->hd,
> > l2_offset + l2_index * sizeof(uint64_t),
> > l2_table + l2_index,
> > - sizeof(uint64_t)) != sizeof(uint64_t))
> > + nb_clusters * sizeof(uint64_t)) !=
> > + nb_clusters * sizeof(uint64_t))
> > return 0;
> >
> > +out:
> > + *num = nb_available - n_start;
> > return cluster_offset;
> > }
> >
> > static int qcow_is_allocated(BlockDriverState *bs, int64_t sector_num,
> > int nb_sectors, int *pnum)
> > {
> > - BDRVQcowState *s = bs->opaque;
> > - int index_in_cluster, n;
> > uint64_t cluster_offset;
> >
> > - cluster_offset = get_cluster_offset(bs, sector_num << 9);
> > - index_in_cluster = sector_num & (s->cluster_sectors - 1);
> > - n = s->cluster_sectors - index_in_cluster;
> > - if (n > nb_sectors)
> > - n = nb_sectors;
> > - *pnum = n;
> > + cluster_offset = get_cluster_offset(bs, sector_num << 9, pnum);
> > +
> > return (cluster_offset != 0);
> > }
> >
> > @@ -816,11 +920,9 @@ static int qcow_read(BlockDriverState *b
> > uint64_t cluster_offset;
> >
> > while (nb_sectors > 0) {
> > - cluster_offset = get_cluster_offset(bs, sector_num << 9);
> > + n = nb_sectors;
> > + cluster_offset = get_cluster_offset(bs, sector_num << 9, &n);
> > index_in_cluster = sector_num & (s->cluster_sectors - 1);
> > - n = s->cluster_sectors - index_in_cluster;
> > - if (n > nb_sectors)
> > - n = nb_sectors;
> > if (!cluster_offset) {
> > if (bs->backing_hd) {
> > /* read from the base image */
> > @@ -862,12 +964,10 @@ static int qcow_write(BlockDriverState *
> >
> > while (nb_sectors > 0) {
> > index_in_cluster = sector_num & (s->cluster_sectors - 1);
> > - n = s->cluster_sectors - index_in_cluster;
> > - if (n > nb_sectors)
> > - n = nb_sectors;
> > cluster_offset = alloc_cluster_offset(bs, sector_num << 9,
> > index_in_cluster,
> > - index_in_cluster + n);
> > + index_in_cluster +
> > nb_sectors,
> > + &n);
> > if (!cluster_offset)
> > return -1;
> > if (s->crypt_method) {
> > @@ -940,11 +1040,9 @@ static void qcow_aio_read_cb(void *opaqu
> > }
> >
> > /* prepare next AIO request */
> > - acb->cluster_offset = get_cluster_offset(bs, acb->sector_num << 9);
> > + acb->n = acb->nb_sectors;
> > + acb->cluster_offset = get_cluster_offset(bs, acb->sector_num << 9,
> > &acb->n);
> > index_in_cluster = acb->sector_num & (s->cluster_sectors - 1);
> > - acb->n = s->cluster_sectors - index_in_cluster;
> > - if (acb->n > acb->nb_sectors)
> > - acb->n = acb->nb_sectors;
> >
> > if (!acb->cluster_offset) {
> > if (bs->backing_hd) {
> > @@ -1046,12 +1144,10 @@ static void qcow_aio_write_cb(void *opaq
> > }
> >
> > index_in_cluster = acb->sector_num & (s->cluster_sectors - 1);
> > - acb->n = s->cluster_sectors - index_in_cluster;
> > - if (acb->n > acb->nb_sectors)
> > - acb->n = acb->nb_sectors;
> > cluster_offset = alloc_cluster_offset(bs, acb->sector_num << 9,
> > index_in_cluster,
> > - index_in_cluster + acb->n);
> > + index_in_cluster +
> > acb->nb_sectors,
> > + &acb->n);
> > if (!cluster_offset || (cluster_offset & 511) != 0) {
> > ret = -EIO;
> > goto fail;
>
> In the writing functions, you can't just assign a big n, because
> s->cluster_data will be too small when processing encrypted data. As you
> said you fixed a segfault, I think you know this one already.
yes, this is the cause of the segfault.
Regards,
Laurent
--
----------------- address@hidden ------------------
"La perfection est atteinte non quand il ne reste rien à
ajouter mais quand il ne reste rien à enlever." Saint Exupéry