Re: [Qemu-devel] vhost, iova, and dirty page tracking


From: Alex Williamson
Subject: Re: [Qemu-devel] vhost, iova, and dirty page tracking
Date: Thu, 19 Sep 2019 11:20:48 -0600

On Wed, 18 Sep 2019 07:21:05 +0000
"Tian, Kevin" <address@hidden> wrote:

> > From: Jason Wang [mailto:address@hidden]
> > Sent: Wednesday, September 18, 2019 2:04 PM
> > 
> > On 2019/9/18 9:31 AM, Tian, Kevin wrote:  
> > >> From: Alex Williamson [mailto:address@hidden]
> > >> Sent: Tuesday, September 17, 2019 10:54 PM
> > >>
> > >> On Tue, 17 Sep 2019 08:48:36 +0000
> > >> "Tian, Kevin"<address@hidden>  wrote:
> > >>  
> > >>>> From: Jason Wang [mailto:address@hidden]
> > >>>> Sent: Monday, September 16, 2019 4:33 PM
> > >>>>
> > >>>>
> > >>>> On 2019/9/16 9:51 AM, Tian, Kevin wrote:
> > >>>>> Hi, Jason
> > >>>>>
> > >>>>> We had a discussion about dirty page tracking in VFIO, when
> > >>>>> vIOMMU is enabled:
> > >>>>>
> > >>>>> https://lists.nongnu.org/archive/html/qemu-devel/2019-09/msg02690.html
> > >>>>>
> > >>>>> It's actually a similar model to vhost - Qemu cannot interpose
> > >>>>> the fast-path DMAs and thus relies on the kernel part to track
> > >>>>> and report dirty page information. Currently Qemu tracks dirty
> > >>>>> pages at GFN level, thus demanding a translation from IOVA to
> > >>>>> GPA. The open question in our discussion is where this
> > >>>>> translation should happen. Doing the translation in the kernel
> > >>>>> implies a device-iotlb flavor, which is what vhost implements
> > >>>>> today. It requires potentially large tracking structures in the
> > >>>>> host kernel, but leverages the existing log_sync flow in Qemu.
> > >>>>> On the other hand, Qemu may perform log_sync for every removal
> > >>>>> of an IOVA mapping and then do the translation itself, avoiding
> > >>>>> GPA awareness on the kernel side. That needs some change to the
> > >>>>> current Qemu log_sync flow, and may bring more overhead if IOVAs
> > >>>>> are frequently unmapped.
> > >>>>>
> > >>>>> So we'd like to hear your opinions, especially about how you
> > >>>>> came down to the current iotlb approach for vhost.
> > >>>> We didn't think about this point much when introducing vhost. Before
> > >>>> the IOTLB, vhost already knew GPAs through its mem table (GPA->HVA),
> > >>>> so it was natural and easier to track dirty pages at the GPA level,
> > >>>> and it doesn't require any changes to the existing ABI.
> > >>> This is the same situation as VFIO.  
> > >> It is?  VFIO doesn't know GPAs, it only knows HVA, HPA, and IOVA.  In
> > >> some cases IOVA is GPA, but not all.  
> > > Well, I thought vhost had a similar design, where the index of its
> > > mem table is GPA when vIOMMU is off and becomes IOVA when vIOMMU is
> > > on. But I may be wrong here. Jason, can you help clarify? I saw two
> > > interfaces which poke the mem table: VHOST_SET_MEM_TABLE (for GPA)
> > > and VHOST_IOTLB_UPDATE (for IOVA). Are they used exclusively or
> > > together?
> > >  
> > 
> > Actually, vhost maintains two interval trees: the mem table (GPA->HVA)
> > and the device IOTLB (IOVA->HVA). The device IOTLB is only used when
> > vIOMMU is enabled, and in that case the mem table is used only when
> > vhost needs to track dirty pages (doing a reverse lookup of the mem
> > table to get HVA->GPA). So in conclusion, for the datapath they are
> > used exclusively, but they need to work together for logging dirty
> > pages when the device IOTLB is enabled.
> >   
> 
> OK. Then it's different from the current VFIO design, which maintains
> only one tree, indexed exclusively by either GPA or IOVA depending on
> whether vIOMMU is in use.
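
[Editorial note: to make the two-tree scheme Jason describes concrete, here is a
minimal C sketch of the two lookups involved. This is not the actual vhost kernel
code (which uses interval trees and a log bitmap shared with QEMU); all names,
the linear-array lookup, and the 4 KiB page size are illustrative assumptions.]

/*
 * Sketch of the two lookups in vhost dirty logging with vIOMMU on.
 * Not the real vhost code; all names here are illustrative.
 */
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct mem_region  { uint64_t gpa,  hva, size; };  /* mem table:    GPA  -> HVA */
struct iotlb_entry { uint64_t iova, hva, size; };  /* device IOTLB: IOVA -> HVA */

/* Data path when vIOMMU is on: translate a guest IOVA to an HVA. */
static uint64_t iova_to_hva(const struct iotlb_entry *tlb, size_t n, uint64_t iova)
{
    for (size_t i = 0; i < n; i++) {
        if (iova >= tlb[i].iova && iova < tlb[i].iova + tlb[i].size) {
            return tlb[i].hva + (iova - tlb[i].iova);
        }
    }
    return 0; /* miss: real vhost would ask userspace to fill the IOTLB */
}

/* Logging path: reverse-lookup the mem table (HVA -> GPA), since the
 * dirty log QEMU consumes is expressed in GFN (GPA page) terms. */
static bool hva_to_gpa(const struct mem_region *mem, size_t n, uint64_t hva,
                       uint64_t *gpa)
{
    for (size_t i = 0; i < n; i++) {
        if (hva >= mem[i].hva && hva < mem[i].hva + mem[i].size) {
            *gpa = mem[i].gpa + (hva - mem[i].hva);
            return true;
        }
    }
    return false;
}

/* A device write to 'iova' is logged as IOVA -> HVA -> GPA -> GFN bit. */
static void log_write(uint64_t *bitmap, const struct iotlb_entry *tlb, size_t ntlb,
                      const struct mem_region *mem, size_t nmem, uint64_t iova)
{
    uint64_t gpa, hva = iova_to_hva(tlb, ntlb, iova);
    if (hva && hva_to_gpa(mem, nmem, hva, &gpa)) {
        uint64_t gfn = gpa >> 12;                 /* 4 KiB pages assumed */
        bitmap[gfn / 64] |= 1ULL << (gfn % 64);
    }
}

Note how the data path only ever consults the IOVA->HVA table, while the
GPA->HVA table is touched solely on the logging path - matching the
"exclusive for the datapath, cooperating for dirty logging" description above.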

Nit, the VFIO tree is only ever indexed by IOVA.  The MAP_DMA ioctl is
only ever performed with an IOVA.  Userspace decides how that IOVA maps
to GPA; VFIO only needs to know how the IOVA maps to HPA via the HVA.
Thanks,

Alex
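
[Editorial note: for reference, the MAP_DMA path Alex mentions looks roughly like
this from userspace. A minimal sketch that assumes an already-configured VFIO
type1 IOMMU container file descriptor and omits error handling; the helper name
is hypothetical.]

/*
 * Userspace hands VFIO an IOVA -> HVA mapping via VFIO_IOMMU_MAP_DMA
 * (linux/vfio.h).  Container/group setup and VFIO_SET_IOMMU are omitted.
 */
#include <linux/vfio.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>

static int map_guest_page(int container_fd, void *hva, uint64_t iova,
                          uint64_t size)
{
    struct vfio_iommu_type1_dma_map map;

    memset(&map, 0, sizeof(map));
    map.argsz = sizeof(map);
    map.flags = VFIO_DMA_MAP_FLAG_READ | VFIO_DMA_MAP_FLAG_WRITE;
    map.vaddr = (uintptr_t)hva;   /* host virtual address backing the range */
    map.iova  = iova;             /* device-visible address, chosen by the caller */
    map.size  = size;

    /* The kernel records only IOVA -> HVA (pinned to HPA); it never sees a GPA. */
    return ioctl(container_fd, VFIO_IOMMU_MAP_DMA, &map);
}

Whether the iova field happens to equal a GPA (no vIOMMU) or a guest-programmed
IOVA (vIOMMU enabled) is entirely the caller's decision, which is the point made
above: the VFIO tree is only ever indexed by IOVA.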


