qemu-devel

Re: [Qemu-devel] [RFC] optimization for qcow2 cache get/put


From: Stefan Hajnoczi
Subject: Re: [Qemu-devel] [RFC] optimization for qcow2 cache get/put
Date: Thu, 26 Mar 2015 14:33:58 +0000
User-agent: Mutt/1.5.23 (2014-03-12)

On Mon, Jan 26, 2015 at 09:20:00PM +0800, Zhang Haoyu wrote:
> Hi, all
> 
> With a very large qcow2 image, e.g., 2TB,
> taking a snapshot causes a long disruption,
> which is caused by cache updates and I/O wait.

I have CCed Kevin Wolf, the qcow2 maintainer.

> perf top data is shown below:
>    PerfTop:    2554 irqs/sec  kernel: 0.4%  exact:  0.0% [4000Hz cycles],  (target_pid: 34294)
> ------------------------------------------------------------------------------------------------------------------------
> 
>     33.80%  qemu-system-x86_64  [.] qcow2_cache_do_get            
>     27.59%  qemu-system-x86_64  [.] qcow2_cache_put               
>     15.19%  qemu-system-x86_64  [.] qcow2_cache_entry_mark_dirty  
>      5.49%  qemu-system-x86_64  [.] update_refcount               
>      3.02%  libpthread-2.13.so  [.] pthread_getspecific           
>      2.26%  qemu-system-x86_64  [.] get_refcount                  
>      1.95%  qemu-system-x86_64  [.] coroutine_get_thread_state    
>      1.32%  qemu-system-x86_64  [.] qcow2_update_snapshot_refcount
>      1.20%  qemu-system-x86_64  [.] qemu_coroutine_self           
>      1.16%  libz.so.1.2.7       [.] 0x0000000000003018            
>      0.95%  qemu-system-x86_64  [.] qcow2_update_cluster_refcount 
>      0.91%  qemu-system-x86_64  [.] qcow2_cache_get               
>      0.76%  libc-2.13.so        [.] 0x0000000000134e49            
>      0.73%  qemu-system-x86_64  [.] bdrv_debug_event              
>      0.16%  qemu-system-x86_64  [.] address@hidden       
>      0.12%  [kernel]            [k] _raw_spin_unlock_irqrestore   
>      0.10%  qemu-system-x86_64  [.] vga_draw_line24_32            
>      0.09%  [vdso]              [.] 0x000000000000060c            
>      0.09%  qemu-system-x86_64  [.] qcow2_check_metadata_overlap  
>      0.08%  [kernel]            [k] do_blockdev_direct_IO  
> 
> If the cache table size is expanded, I/O decreases,
> but the computation time grows,
> so it is worth optimizing the qcow2 cache get and put algorithms.
> 
> My proposal:
> get:
> use (offset >> cluster_bits) % c->size to locate the cache entry directly;
> a rough implementation:
> index = (offset >> cluster_bits) % c->size;
> if (c->entries[index].offset == offset) {
>     goto found;
> }
> 
> replace:
> c->entries[(offset >> cluster_bits) % c->size].offset = offset;
> ...
> 
> put:
> use a 64-entry table to cache
> the most recently fetched c->entries, i.e., a cache for the cache;
> then, during put, search the 64-entry table first,
> and fall back to c->entries if the entry is not found there.
> 
> Any ideas?
> 
> Thanks,
> Zhang Haoyu
> 
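
To make the quoted proposal concrete, here is a minimal standalone sketch of
what it seems to describe: a direct-mapped get plus a small table of recently
returned entries that put searches first. This is not the qcow2 code. The
CacheEntry and Cache layouts, the valid and ref fields, the RECENT_SLOTS name
and the cache_init/cache_get/cache_put helpers are all assumptions made for
illustration, and write-back and disk reads are only marked by a comment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RECENT_SLOTS 64   /* size of the small "cache for the cache" table */

/* Simplified stand-ins for the qcow2 cache structures (hypothetical layout). */
typedef struct CacheEntry {
    uint64_t offset;  /* image offset of the cached table */
    void *table;      /* buffer holding the table, owned by the caller */
    int valid;        /* nonzero once the slot holds data for 'offset' */
    int ref;          /* number of get() calls not yet matched by put() */
} CacheEntry;

typedef struct Cache {
    CacheEntry *entries;
    int size;                  /* number of entries */
    int cluster_bits;
    int recent[RECENT_SLOTS];  /* indices of recently returned entries, -1 = empty */
    int recent_next;           /* next slot of recent[] to overwrite */
} Cache;

static void cache_init(Cache *c, CacheEntry *entries, int size, int cluster_bits)
{
    assert(size > 0);
    c->entries = entries;
    c->size = size;
    c->cluster_bits = cluster_bits;
    c->recent_next = 0;
    for (int i = 0; i < RECENT_SLOTS; i++) {
        c->recent[i] = -1;
    }
}

/* get: direct-mapped lookup. Returns the table buffer, or NULL if the slot
 * is still referenced for a different offset. *hit is set to 1 when the
 * requested offset was already cached, 0 otherwise. */
static void *cache_get(Cache *c, uint64_t offset, int *hit)
{
    int i = (int)((offset >> c->cluster_bits) % (uint64_t)c->size);
    CacheEntry *e = &c->entries[i];

    *hit = e->valid && e->offset == offset;
    if (!*hit) {
        if (e->ref > 0) {
            return NULL;  /* conflict: the only candidate slot is in use */
        }
        /* replace: real code would write back a dirty entry and read the
         * new table from disk at this point */
        e->offset = offset;
        e->valid = 1;
    }
    e->ref++;

    /* remember this entry so that put can find it quickly */
    c->recent[c->recent_next] = i;
    c->recent_next = (c->recent_next + 1) % RECENT_SLOTS;
    return e->table;
}

/* put: search the recent table first, then fall back to a full scan.
 * Returns 0 on success, -1 if no referenced entry owns 'table'. */
static int cache_put(Cache *c, void *table)
{
    for (int k = 0; k < RECENT_SLOTS; k++) {
        int i = c->recent[k];
        if (i >= 0 && c->entries[i].table == table && c->entries[i].ref > 0) {
            c->entries[i].ref--;
            return 0;
        }
    }
    for (int i = 0; i < c->size; i++) {
        if (c->entries[i].table == table && c->entries[i].ref > 0) {
            c->entries[i].ref--;
            return 0;
        }
    }
    return -1;
}
```

One property of this sketch worth keeping in mind: with a direct-mapped
layout, two tables whose offsets map to the same index cannot be held at the
same time, so cache_get has to fail (or wait) in that case. Whether that is
acceptable for the snapshot path would need to be checked against the real
code.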
