qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH v3 1/5] qga: win32: fix crashes when PCI info ca


From: Tomáš Golembiovský
Subject: Re: [Qemu-devel] [PATCH v3 1/5] qga: win32: fix crashes when PCI info cannot be retrived
Date: Mon, 1 Oct 2018 14:11:35 +0200

On Thu, 27 Sep 2018 18:16:04 +0300
Sameeh Jubran <address@hidden> wrote:

> On Thu, Sep 27, 2018 at 12:06 PM Tomáš Golembiovský <address@hidden>
> wrote:
> 
> > Hi Michael,
> >
> > thanks for looking into this. My comments are below.
> >
> > Adding Sameeh...
> >
> >
> > On Wed, 26 Sep 2018 12:15:48 -0500
> > Michael Roth <address@hidden> wrote:
> >  
> > > Quoting Tomáš Golembiovský (2018-09-07 06:42:09)  
> > > > The guest-get-fsinfo command collects also information about PCI
> > > > controller where the disk is attached. When this fails for some reasons
> > > > it tries to return just the partial information. However in certain
> > > > cases the pointer to the structure was not initialized and was set to
> > > > NULL. This breaks the serializer and leads to a crash of the guest  
> > agent.  
> > > >
> > > > Signed-off-by: Tomáš Golembiovský <address@hidden>  
> > >
> > > For a win10 guest started with:
> > >
> > > qemu-system-x86_64 -m 2G -smp 2,cores=2,sockets=1 -drive  
> > file=/home/mdroth/vm/win10_pro_n_snap0.qcow2,if=virtio -drive
> > file=/home/mdroth/vm/virtio-win-0.1.102.iso,if=ide,media=cdrom -rtc
> > base=localtime,driftfix=slew -vga std -boot d -name vm4 -netdev
> > tap,script=/etc/qemu-ifup,vhost=on,id=vnet0 -device
> > virtio-net-pci,mac=52:54:00:12:34:04,id=vnic0,netdev=vnet0,disable-modern=true
> > -vnc :4 -device virtio-serial -balloon virtio -mon chardev=hmp0 -chardev
> > socket,path=/tmp/vm4-hmp0.sock,server,nowait,id=hmp0 -mon
> > chardev=qmp0,mode=control -chardev
> > socket,path=/tmp/vm4-qmp0.sock,server,nowait,id=qmp0 -device
> > virtserialport,chardev=vs0,name=vs0,id=vs_vs0 -chardev
> > socket,path=/tmp/vm4-vs0.sock,server,nowait,id=vs0 -device
> > virtserialport,chardev=vs1,name=vs1,id=vs_vs1 -chardev
> > socket,path=/tmp/vm4-vs1.sock,server,nowait,id=vs1 -device
> > virtserialport,chardev=qga,name=org.qemu.guest_agent.0,id=vs_qga -chardev
> > socket,path=/tmp/vm4-qga.sock,server,nowait,id=qga -device
> > isa-serial,chardev=serial0,id=serial_serial0 -chardev
> > socket,path=/tmp/vm4-serial0.sock,server,nowait,id=serial0 -L
> > /home/mdroth/w/build/qemu-2.11.2-build/pc-bios --enable-kvm  
> > >
> > > this yields the following:
> > >
> > > {'execute':'guest-get-fsinfo'}
> > > {"return": [{"name":  
> > "\\\\?\\Volume{83835b2d-0032-11e6-a84f-806e6f6e6963}\\", "total-bytes":
> > 160755712, "mountpoint": "D:\\", "disk": [{"bus-type": "ide", "bus": 0,
> > "unit": 0, "pci-controller": {"bus": 0, "slot": 0, "domain": 0, "function":
> > 0}, "target": 0}], "used-bytes": 160755712, "type": "CDFS"}, {"name":
> > "\\\\?\\Volume{2ea839c6-0000-0000-0000-80620c000000}\\", "mountpoint":
> > "System Reserved", "disk": [{"bus-type": "scsi", "bus": 0, "unit": 0,
> > "pci-controller": {"bus": 0, "slot": 0, "domain": 0, "function": 0},
> > "target": 0}], "type": "NTFS"}, {"name":
> > "\\\\?\\Volume{2ea839c6-0000-0000-0000-501f00000000}\\", "total-bytes":
> > 52665839616, "mountpoint": "C:\\", "disk": [{"bus-type": "scsi", "bus": 0,
> > "unit": 0, "pci-controller": {"bus": 0, "slot": 0, "domain": 0, "function":
> > 0}, "target": 0}], "used-bytes": 25265487872, "type": "NTFS"}, {"name":
> > "\\\\?\\Volume{2ea839c6-0000-0000-0000-100000000000}\\", "mountpoint":
> > "System Reserved", "disk": [{"bus-type": "scsi", "bus": 0, "unit": 0,
> > "pci-controller": {"bus": 0, "slot": 0, "domain": 0, "function": 0},
> > "target": 0}], "type": "NTFS"}]}  
> > >
> > > domain/bus/slot/function=0 are valid PCI addresses so initializing to 0  
> > is  
> > > wrong. Sameeh had a previous series that initializes to -1 that I think
> > > is more appropriate (it hasn't gone in yet since we opted not to enable
> > > CONFIG_QGA_NTDDSCSI for 3.0 since the PCI stuff seems be generally
> > > broken for Windows, also because the 2nd patch needs some fixups:  
> >  
> Can you please elaborate? What kind of fixups? I can see no comments of
> this in the 2nd patch
> 

I think he refers to this fix:

https://github.com/mdroth/qemu/commit/201db36b56d7d1ba5ff720eedcb3b62b75306fde

Plus there seems to be an issue when "calculating" function number as
described below. (Maybe related to casting addr from -1 to uint and
back?)

> > >
> > >   https://lists.gnu.org/archive/html/qemu-devel/2018-06/msg07500.html  
> >
> > I wasn't aware of that. I is trying to "fix" the same issue.
> >
> > I've been also thinking about using -1, but I didn't know what is/isn't
> > correct PCI address.
> >  
> > >
> > > With that series (and some fixups I have on top at
> > > https://github.com/mdroth/qemu/commits/qga-test), we get the following
> > > output:  
> >
> > Should I rebase on that and drop my patch?
> >  
> > >
> > > {'execute':'guest-get-fsinfo'}
> > > {"return": [{"name":  
> > "\\\\?\\Volume{83835b2d-0032-11e6-a84f-806e6f6e6963}\\", "total-bytes":
> > 160755712, "mountpoint": "D:\\", "disk": [{"bus-type": "ide", "bus": 0,
> > "unit": 0, "pci-controller": {"bus": -1, "slot": -1, "domain": -1,
> > "function": -1}, "target": 0}], "used-bytes": 160755712, "type": "CDFS"},
> > {"name": "\\\\?\\Volume{2ea839c6-0000-0000-0000-80620c000000}\\",
> > "mountpoint": "System Reserved", "disk": [{"bus-type": "scsi", "bus": 0,
> > "unit": 0, "pci-controller": {"bus": -1, "slot": -1, "domain": 0,
> > "function": 3}, "target": 0}], "type": "NTFS"}, {"name":
> > "\\\\?\\Volume{2ea839c6-0000-0000-0000-501f00000000}\\", "total-bytes":
> > 52665839616, "mountpoint": "C:\\", "disk": [{"bus-type": "scsi", "bus": 0,
> > "unit": 0, "pci-controller": {"bus": -1, "slot": -1, "domain": 0,
> > "function": 2}, "target": 0}], "used-bytes": 25267560448, "type": "NTFS"},
> > {"name": "\\\\?\\Volume{2ea839c6-0000-0000-0000-100000000000}\\",
> > "mountpoint": "System Reserved", "disk": [{"bus-type": "scsi", "bus": 0,
> > "unit": 0, "pci-controller": {"bus": -1, "slot": -1, "domain": 0,
> > "function": 1}, "target": 0}], "type": "NTFS"}]}  
> > >
> > > Here we see the non-sensical PCI topology info I mentioned previously.
> > > There are values like '"function": 3' even though there are no
> > > multifunction devices present. This will be exposed to users if we
> > > enable CONFIG_QGA_NTDDSCSI with things as they stand.  
> >
> > Sadly, I have no idea how to properly fix this code. But patch 5 of the
> > series IMHO brings it one step closer to proper solution. The original
> > code tries to fetch the addr/bus/domain/slot for volume handle (which
> > fails on missing properties) instead of disk handle.  
> 
> 
> > The remaining issue is that when listing the disks the entries are
> > cryptic names like "\Device\0000001d" instead of "\PhysicalDriveX" or
> > "\HarddiskX". And I don't know how to map one name to the other.
> >  
>     char *name = g_strdup(&guid[4]);
> This needs some investigation...
> 

This is to strip first 4 characters from UNC names.

    "\\?\abcdef" -> "\abcdef"


> >  
> > > Currently we
> > > just get an empty array for "disk" field of GuestFilesystemInfo
> > > for w32, which fortunately aligns with the current QAPI schema (it's
> > > an array since the volume can span multiple disks).  
> >
> > No. You get array with one item and useless data. Unless I missed
> > something.
> >
> >  
> > > I'm not about the
> > > SCSI bus/unit/target data either. It's a real mess (maybe things
> > > work a bit better when an actual SCSI controller is used?) and I'm
> > > not sure why/how I didn't notice during initial testing.  
> >
> > I think unit/target should work. At least with all patches from my
> > series (well notably the last patch). But please, do verify that if you
> > can.
> >
> >  
> > > I think all this needs to be addressed before we enable
> > > CONFIG_QGA_NTDDSCSI (in terms of what that's used for in the current
> > > code at least, i.e. enabling everything reported by
> > > GuestFilesystemInfo.disk).
> > >
> > > What is the oVirt use-case? It doesn't seem like you need PCI topology,
> > > but what about SCSI topology (and if so, does that information seem
> > > correct to you)? Or is just a list a serials/dev paths by themselves
> > > all that's needed? Trying to see how we might decouple things from the
> > > broken PCI topology stuff. May need to deprecate
> > > GuestFilesystemInfo.disk in favor of something less monolithic if all
> > > of this isn't fixable in a reasonable-enough timeframe.  
> >
> > We don't need the PCI info. I've been only trying to fix this to some
> > sort along the way. What we need is to get disk serial number (patch 4)
> > and "name" of the disk of some sort (patch 5). For that I use device
> > node (e.g. /dev/sda) on Linux and UNC for the disk on Windows (e.g.
> > \\?\PhysicalDrive1). But the code relies on CONFIG_QGA_NTDDSCSI being
> > enabled.
> >  
> This is already enabled in downstream version of qga-win, I don't think it
> should be
> enabled upstream if it's not necessary. ( Actually I'm not sure why it is
> disabled in the first place? I supposed because it is not stable)

So can we enable CONFIG_QGA_NTDDSCSI and disable PCI controller info by
different means?

> 
> >
> >     Tomas
> >  
> > >
> > >  
> > > > ---
> > > >  qga/commands-win32.c | 27 ++++++++++++++++++++++-----
> > > >  1 file changed, 22 insertions(+), 5 deletions(-)
> > > >
> > > > diff --git a/qga/commands-win32.c b/qga/commands-win32.c
> > > > index 98d9735389..9c959122d9 100644
> > > > --- a/qga/commands-win32.c
> > > > +++ b/qga/commands-win32.c
> > > > @@ -633,15 +633,32 @@ static GuestDiskAddressList  
> > *build_guest_disk_info(char *guid, Error **errp)  
> > > >           *  
> > https://technet.microsoft.com/en-us/library/ee851589(v=ws.10).aspx */  
> > > >          if (DeviceIoControl(vol_h, IOCTL_SCSI_GET_ADDRESS, NULL, 0,  
> > scsi_ad,  
> > > >                              sizeof(SCSI_ADDRESS), &len, NULL)) {
> > > > +            Error *local_err = NULL;
> > > >              disk->unit = addr.Lun;
> > > >              disk->target = addr.TargetId;
> > > >              disk->bus = addr.PathId;
> > > > -            disk->pci_controller = get_pci_info(name, errp);
> > > > +            g_debug("unit=%lld target=%lld bus=%lld",
> > > > +                disk->unit, disk->target, disk->bus);
> > > > +            disk->pci_controller = get_pci_info(name, &local_err);
> > > > +
> > > > +            if (local_err) {
> > > > +                g_debug("failed to get PCI controller info: %s",
> > > > +                    error_get_pretty(local_err));
> > > > +                error_free(local_err);
> > > > +            } else if (disk->pci_controller != NULL) {
> > > > +                g_debug("pci: domain=%lld bus=%lld slot=%lld  
> > function=%lld",  
> > > > +                    disk->pci_controller->domain,
> > > > +                    disk->pci_controller->bus,
> > > > +                    disk->pci_controller->slot,
> > > > +                    disk->pci_controller->function);
> > > > +            }
> > > >          }
> > > > -        /* We do not set error in this case, because we still have  
> > enough  
> > > > -         * information about volume. */
> > > > -    } else {
> > > > -         disk->pci_controller = NULL;
> > > > +    }
> > > > +    /* We do not set error in case pci_controller is NULL, because we  
> > still  
> > > > +     * have enough information about volume. */
> > > > +    if (disk->pci_controller == NULL) {
> > > > +        g_debug("no PCI controller info");
> > > > +        disk->pci_controller = g_malloc0(sizeof(GuestPCIAddress));
> > > >      }
> > > >
> > > >      list = g_malloc0(sizeof(*list));
> > > > --
> > > > 2.18.0
> > > >  
> >
> >
> > --
> > Tomáš Golembiovský <address@hidden>
> >  


-- 
Tomáš Golembiovský <address@hidden>



reply via email to

[Prev in Thread] Current Thread [Next in Thread]