From: Igor Lvovsky
Subject: [Qemu-devel] [PATCH] Fix a race condition and non-leaf images growing in VMDK chains.
Date: Sun, 13 May 2007 04:13:20 -0700

Hi,

This patch fixes two issues:

1. A race condition during write operations on snapshots.
        We now write the grain of data first and update the L2 metadata
        afterwards, so the snapshot stays consistent if the VM is destroyed
        in the middle of a write.
2. Non-leaf images growing during writes.
        Assume we have a snapshot chain (Base->Snap1->Snap2->...->Leaf) and
        we run a VM on the last image of this chain (the leaf image).
        Non-leaf images in the chain were growing (most noticeably when the
        VM performs aggressive writes), which is incorrect behavior
        according to the VMDK spec.
        For every write to an offset that is not yet known to the active
        image, the active image must query its ancestors for that offset
        and, if it exists in any of them, perform a
        read-from-ancestor/modify/write-to-active of the whole grain
        containing that offset.
        The problem occurred during read-from-ancestor/modify/write-to-active
        when the ancestor was two or more generations above the active (leaf)
        image (not its direct parent): the ancestor's direct child was
        modified instead of the active image.

        Fixed by always writing to the active (leaf) image.
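
The read-from-ancestor/modify/write-to-active rule from item 2 can be sketched with a small in-memory model. This is only an illustration under stated assumptions, not the code in block-vmdk.diff: the Image struct, find_grain_owner(), write_active() and the tiny GRAIN_SIZE are hypothetical names and values, and a per-grain "allocated" flag stands in for the real L2 table.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define GRAIN_SIZE 8   /* bytes per grain; tiny on purpose (hypothetical) */
#define NUM_GRAINS 4

/* Hypothetical in-memory image. allocated[] plays the role of the L2 table. */
typedef struct Image {
    struct Image *parent;                  /* NULL for the base image */
    int allocated[NUM_GRAINS];
    uint8_t data[NUM_GRAINS][GRAIN_SIZE];
} Image;

/* Walk from img toward the base and return the nearest image that owns
 * the grain, or NULL if no image in the chain has it. */
static const Image *find_grain_owner(const Image *img, int grain)
{
    for (; img != NULL; img = img->parent) {
        if (img->allocated[grain])
            return img;
    }
    return NULL;
}

/* Write len bytes at byte offset off inside a grain. Only the active
 * (leaf) image is ever modified; ancestors are read-only. */
static int write_active(Image *active, int grain, int off,
                        const uint8_t *buf, int len)
{
    assert(active != NULL && buf != NULL);
    if (grain < 0 || grain >= NUM_GRAINS ||
        off < 0 || len < 0 || off + len > GRAIN_SIZE)
        return -1;

    if (!active->allocated[grain]) {
        /* Read-from-ancestor: the owner may be any number of
         * generations above the leaf. */
        const Image *owner = find_grain_owner(active->parent, grain);

        if (owner != NULL)
            memcpy(active->data[grain], owner->data[grain], GRAIN_SIZE);
        else
            memset(active->data[grain], 0, GRAIN_SIZE);
    }

    /* Modify, then write-to-active. The grain is marked allocated only
     * after its data is in place. */
    memcpy(active->data[grain] + off, buf, (size_t)len);
    active->allocated[grain] = 1;
    return 0;
}
```

With a chain Base->Snap1->Leaf in which only Base owns grain 0, a partial write through the leaf copies the whole grain into the leaf and leaves Base and Snap1 untouched, so the non-leaf images cannot grow.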

                Regards, 
                        Igor Lvovsky
         
        



-----Original Message-----
From: address@hidden [mailto:address@hidden On Behalf Of Fabrice Bellard
Sent: Tuesday, January 16, 2007 9:36 PM
To: address@hidden
Subject: Re: [Qemu-devel] Race condition in VMDK (QCOW*) formats.

Well, it was never said that the QCOW* code was safe if you interrupted 
QEMU at some point.

But I agree that it could be safer to write the sector first and update 
the links afterwards. It would also be interesting to analyze the QCOW2 
snapshot handling (what if QEMU is stopped during the creation of a snapshot?).
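
A minimal sketch of the "write the sector first and update the links after" ordering, using plain POSIX calls. Everything here is an assumption made for illustration and not QEMU's code: the file layout (a 512-byte L2 table at offset 0, grains appended after it, entries stored as sector numbers in host byte order) and the names write_all(), create_image() and write_grain_then_link() are hypothetical. The fsync() between the two steps is also an addition; without it the kernel is free to flush the two writes in either order.

```c
#define _XOPEN_SOURCE 700   /* pwrite() and fsync() under strict -std= modes */
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define GRAIN_BYTES 512
#define L2_ENTRIES  128
#define L2_BYTES    (L2_ENTRIES * sizeof(uint32_t))   /* 512 bytes */

/* Write exactly len bytes at offset off; returns 0 on success, -1 on error. */
static int write_all(int fd, const void *buf, size_t len, off_t off)
{
    const uint8_t *p = buf;

    while (len > 0) {
        ssize_t n = pwrite(fd, p, len, off);

        if (n <= 0)
            return -1;
        p += n;
        len -= (size_t)n;
        off += n;
    }
    return 0;
}

/* Create a scratch image that holds only a zeroed L2 table
 * (entry 0 means "grain not allocated"). Returns an fd or -1. */
static int create_image(const char *path)
{
    uint8_t table[L2_BYTES];
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);

    if (fd < 0)
        return -1;
    memset(table, 0, sizeof(table));
    if (write_all(fd, table, sizeof(table), 0) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Append one grain, then publish it in the L2 table. */
static int write_grain_then_link(int fd, unsigned l2_index,
                                 const uint8_t *grain)
{
    off_t end;
    uint32_t entry;

    assert(grain != NULL);
    if (l2_index >= L2_ENTRIES)
        return -1;
    end = lseek(fd, 0, SEEK_END);
    if (end < (off_t)L2_BYTES)
        return -1;              /* the table must already be present */

    /* Step 1: the grain data. A crash here leaves unreferenced bytes at
     * the end of the file, but the L2 table is still consistent. */
    if (write_all(fd, grain, GRAIN_BYTES, end) < 0 || fsync(fd) < 0)
        return -1;

    /* Step 2: the link. The entry only ever points at data that has
     * already been flushed. */
    entry = (uint32_t)(end / 512);
    if (write_all(fd, &entry, sizeof(entry),
                  (off_t)(l2_index * sizeof(entry))) < 0 || fsync(fd) < 0)
        return -1;
    return 0;
}
```

The trade-off in this sketch is that an interrupted write can leak an unreferenced grain at the end of the file, which costs space but does not leave the table pointing at missing data.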

Regards,

Fabrice.

Igor Lvovsky wrote:
> Hi all,
>
> I am concerned about a race condition during the *write operation on
> a snapshot*.
>
> I think the problem exists in the VMDK and QCOW* formats (I haven't
> checked the others).
>
> Here is an example from block_vmdk.c:
>
> static int vmdk_write(BlockDriverState *bs, int64_t sector_num,
>                       const uint8_t *buf, int nb_sectors)
> {
>     BDRVVmdkState *s = bs->opaque;
>     int ret, index_in_cluster, n;
>     uint64_t cluster_offset;
>
>     while (nb_sectors > 0) {
>         index_in_cluster = sector_num & (s->cluster_sectors - 1);
>         n = s->cluster_sectors - index_in_cluster;
>         if (n > nb_sectors)
>             n = nb_sectors;
>         cluster_offset = get_cluster_offset(bs, sector_num << 9, 1);
>         if (!cluster_offset)
>             return -1;
>         lseek(s->fd, cluster_offset + index_in_cluster * 512, SEEK_SET);
>         ret = write(s->fd, buf, n * 512);
>         if (ret != n * 512)
>             return -1;
>         nb_sectors -= n;
>         sector_num += n;
>         buf += n * 512;
>     }
>     return 0;
> }
> 
> The get_cluster_offset(…) routine updates the L2 table in the metadata
> and returns the cluster_offset.
> After that, the vmdk_write(…) routine actually writes the grain to the
> right place.
>
> So there is a timing hole here.
>
> Assume the VM performing the write operation is destroyed at this moment.
> We are then left with a corrupted image (the L2 table is updated, but the
> grain itself was never written).
>
>             Regards,
>                         Igor Lvovsky
>
> ------------------------------------------------------------------------
> 
> _______________________________________________
> Qemu-devel mailing list
> address@hidden
> http://lists.nongnu.org/mailman/listinfo/qemu-devel





Attachment: block-vmdk.diff
Description: block-vmdk.diff

