
Re: Occasional VM soft lockup when a remote cdrom is attached


From: John Snow
Subject: Re: Occasional VM soft lockup when a remote cdrom is attached
Date: Mon, 14 Oct 2019 12:18:38 -0400
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.1.0


On 10/13/19 10:04 PM, Guoheyi wrote:
> Really appreciate your advice. Some comments below:
> 
> 
> On 2019/10/12 3:06, John Snow wrote:
>>
>> On 10/11/19 9:22 AM, Guoheyi wrote:
>>> Hi folks,
>>>
>>> We observed Linux on VM occasionally (at very low rate) got soft lockup
>>> when a remote cdrom is attached. The guest hangs up at below call trace:
>>>
>> That's certainly a new one to me :)
>>
>>> [Tue Oct 8 23:02:53 2019] ata_scsi_queuecmd+0xe0/0x2a0 [libata]
>>> [Tue Oct 8 23:02:53 2019] scsi_dispatch_cmd+0xec/0x288
>>> [Tue Oct 8 23:02:53 2019] scsi_queue_rq+0x5f4/0x6b8
>>> [Tue Oct 8 23:02:53 2019] blk_mq_dispatch_rq_list+0xb0/0x690
>>> [Tue Oct 8 23:02:53 2019] blk_mq_do_dispatch_sched+0x8c/0x130
>>> [Tue Oct 8 23:02:53 2019] blk_mq_sched_dispatch_requests+0x128/0x1f0
>>> [Tue Oct 8 23:02:53 2019] __blk_mq_run_hw_queue+0x9c/0x128
>>> [Tue Oct 8 23:02:53 2019] __blk_mq_delay_run_hw_queue+0x198/0x1d8
>>> [Tue Oct 8 23:02:53 2019] blk_mq_run_hw_queue+0x68/0x180
>>> [Tue Oct 8 23:02:53 2019] blk_mq_sched_insert_request+0xbc/0x210
>>> [Tue Oct 8 23:02:53 2019] blk_execute_rq_nowait+0x118/0x168
>>> [Tue Oct 8 23:02:53 2019] blk_execute_rq+0x74/0xd8
>>> [Tue Oct 8 23:02:53 2019] __scsi_execute+0xd8/0x1e0
>>> [Tue Oct 8 23:02:53 2019] sr_check_events+0xd4/0x2c8 [sr_mod]
>>> [Tue Oct 8 23:02:53 2019] cdrom_check_events+0x34/0x50 [cdrom]
>>> [Tue Oct 8 23:02:53 2019] sr_block_check_events+0xdc/0x108 [sr_mod]
>>> [Tue Oct 8 23:02:53 2019] disk_check_events+0x60/0x198
>>> [Tue Oct 8 23:02:53 2019] disk_events_workfn+0x24/0x30
>>> [Tue Oct 8 23:02:53 2019] process_one_work+0x1b4/0x3f8
>>> [Tue Oct 8 23:02:53 2019] worker_thread+0x54/0x470
>>> [Tue Oct 8 23:02:53 2019] kthread+0x134/0x138
>>> [Tue Oct 8 23:02:53 2019] ret_from_fork+0x10/0x18
>>>
>>>
>>> We are running the whole stack on ARM64 platforms, using rcdrom on the
>>> host to connect a remote cdrom, which appears as "/dev/sr0" on the
>>> host. Our Linux kernel version is 4.19.36 and our qemu version is
>>> 2.8.1, which is fairly old, but I checked mainline and found the
>>> workflow has not changed much. KVM is enabled.
>>>
>>> We provide the remote cdrom to the guest as a block device, attached
>>> under the ICH SATA bus.
>>>
>>>
>>> The workflow should be like this (please correct me if I am wrong):
>>>
>>> 1. There is a kworker thread in the guest kernel which checks the
>>> cdrom status periodically.
>>>
>>> 2. The call to "ata_scsi_queuecmd" in the guest writes the AHCI port
>>> register "PORT_CMD_ISSUE", so the VCPU thread traps out to qemu.
>>>
>>> 3. qemu grabs the BQL and then dispatches the access to
>>> ahci_port_write().
>>>
>>> 4. Since this is a "get event status notification" command, qemu
>>> finally reaches cmd_get_event_status_notification() and then
>>> cdrom_is_inserted().
>>>
>> via
>>
>> cmd_get_event_status_notification (SCSI 0x4A)
>>    event_status_media
>>      blk_is_inserted
>>
>>> 5. In cdrom_is_inserted(), an ioctl on the cdrom fd is issued.
>>>
>> Using the bdrv_host_cdrom BlockDriver, for the .bdrv_is_inserted
>> callback.
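>>
>> For reference, that callback boils down to a single blocking ioctl;
>> roughly (paraphrasing block/raw-posix.c in 2.8, block/file-posix.c in
>> newer trees -- check your tree for the exact code):
>>
>>     static bool cdrom_is_inserted(BlockDriverState *bs)
>>     {
>>         BDRVRawState *s = bs->opaque;
>>
>>         /* Blocks until the device answers; on a remote cdrom this can
>>          * take many seconds while the vCPU waits under the BQL. */
>>         return ioctl(s->fd, CDROM_DRIVE_STATUS, CDSL_CURRENT) == CDS_DISC_OK;
>>     }
>>
>> So any latency here is taken synchronously on the vCPU thread.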
>>
>>> However, in the last step, we found the ioctl() may have large
>>> latency, since it targets a virtual device backed by a remote cdrom,
>>> and the remote server may be busy or slow. We observed more than 8
>>> seconds of latency in a half-hour test, and the latency can exceed 20
>>> seconds when the guest soft lockup occurs.
>>>
>> I'm not sure what can be done here. The host_cdrom driver has a few
>> methods to query state (cdrom_is_inserted, cdrom_eject,
>> cdrom_lock_medium), and in general code is going to rely on
>> bdrv_is_inserted returning a truthful answer.
>>
>> (I'm not sure we have callbacks established to tell when the backing
>> media we are ourselves relying on has gone away. Maybe it could be
>> added, but it's not there now. We could maybe cache the answer if we had
>> something reliable.)
>>
>> You could always try using the host_device driver instead of the
>> host_cdrom one, which will just treat it as a "normal" block device
>> instead of a CDROM one, and doesn't use any cdrom-specific ioctls. It
>> might avoid the costly call.
> Following this workaround, the cdrom device is presented to the guest
> as a normal disk ("/dev/sdb"). We are not sure whether this will cause
> other functional differences.
> 

What command line are you using? You should be able to specify the
emulator (ask for the ide-cd device) separately from the backend
(specify the host_device block driver).

It might require the use of -blockdev command line flags instead of the
sugared variants (-cdrom, -drive).
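
On a reasonably recent QEMU, a minimal sketch might look like this (the
node-name is arbitrary, and the bus name depends on how your AHCI
controller is instantiated, e.g. with -device ahci,id=ahci0):

    -blockdev node-name=cd0,driver=host_device,filename=/dev/sr0 \
    -device ide-cd,drive=cd0,bus=ahci0.0

That keeps the guest-visible device a CDROM (ide-cd) while the backend
avoids the cdrom-specific ioctls.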

>>
>>> My question is: is there any way to get around this issue? Does it
>>> make sense for qemu to set up an I/O thread to issue this ioctl() and
>>> let the VCPU thread return to the guest as soon as possible? Or is it
>>> the kernel's responsibility to break up the long-running ioctl() and
>>> return to user space?
>>>
>> Yeah, I think you could probably try to make this change -- the code is
>> unfortunately very callback-hell-ish with poor abstraction boundaries,
>> but obviously the data transfer commands already defer to bottom halves.
>>
>> (Ideally, I think every ATAPI command would just immediately defer to a
>> bottom half, but I don't know what effect that would have on callers of
>> ide_atapi_cmd. I'd have to audit it, and it's quite squiggly.)
>>
>> "Patches welcome" as they say, but it's quite messy down there.
> It sounds like long-term work, especially since we are not familiar
> with SATA/AHCI devices...
> 
> Thanks,
> 
> HG
> 

Yes, probably. It's low priority right now, but if it becomes more
important you can always try formal channels at RH to get it escalated.

--js.


