qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH v3 44/44] nbd: Implement NBD_OPT_BLOCK_SIZE on c


From: Eric Blake
Subject: Re: [Qemu-devel] [PATCH v3 44/44] nbd: Implement NBD_OPT_BLOCK_SIZE on client
Date: Mon, 25 Apr 2016 13:16:59 -0600
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.3.0

On 04/25/2016 06:19 AM, Alex Bligh wrote:
> 
> On 23 Apr 2016, at 00:40, Eric Blake <address@hidden> wrote:
> 
>> The upstream NBD Protocol has defined a new extension to allow
>> the server to advertise block sizes to the client, as well as
>> a way for the client to inform the server that it intends to
>> obey block sizes.
>>
>> Pass any received sizes on to the block layer.
>>
>> Use the minimum block size as the sector size we pass to the
>> kernel - which also has the nice effect of cooperating with
>> (non-qemu) servers that don't do read-modify-write when exposing
>> a block device with 4k sectors; it can also allow us to visit a
>> file larger than 2T on a 32-bit kernel.
>>

>> +            be32_to_cpus(&info->max_block);
>> +            TRACE("Block sizes are 0x%" PRIx32 ", 0x%" PRIx32 ", 0x%" 
>> PRIx32,
>> +                  info->min_block, info->opt_block, info->max_block);
>> +            break;
>> +
> 
> You should probably check min_block <= opt_block <= max_block here

opt_block > max_block is possible if max_block is clamped to export size
(in the degenerate case where you have a small export that is too small
for the granularity of a hole or efficient I/O).  But yes, some sanity
checks that the server isn't horribly broken might be worthwhile.

> 
> Also should there be a check that BDRV_SECTOR_SIZE >= min_block?

No, because we take the server's min_block and feed it into
bs->request_align (which forces the block layer to comply with a minimum
alignment larger than 512, using code already tested on physical block
drives with 4k sectors), see the hunk in nbd-client.c.

>> int nbd_init(int fd, QIOChannelSocket *sioc, NbdExportInfo *info)
>> {
>> -    unsigned long sectors = info->size / BDRV_SECTOR_SIZE;
>> -    if (info->size / BDRV_SECTOR_SIZE != sectors) {
>> +    unsigned long sector_size = MAX(BDRV_SECTOR_SIZE, info->min_block);
>> +    unsigned long sectors = info->size / sector_size;
>> +    if (info->size / sector_size != sectors) {
>>         LOG("Export size %" PRId64 " too large for 32-bit kernel", 
>> info->size);
>>         return -E2BIG;
>>     }
>> @@ -724,18 +784,18 @@ int nbd_init(int fd, QIOChannelSocket *sioc, 
>> NbdExportInfo *info)
>>         return -serrno;
>>     }
>>
>> -    TRACE("Setting block size to %lu", (unsigned long)BDRV_SECTOR_SIZE);
>> +    TRACE("Setting block size to %lu", sector_size);
>>
>> -    if (ioctl(fd, NBD_SET_BLKSIZE, (unsigned long)BDRV_SECTOR_SIZE) < 0) {
>> +    if (ioctl(fd, NBD_SET_BLKSIZE, sector_size) < 0) {

We also feed the maximum of 512 or the advertised minimum block size
into the kernel when using ioctl() for the kernel to take over
transmission phase; although I'm not certain whether the kernel obeys
NBD_SET_BLKSIZE as a hint rather than a hard rule - but if that needs
patching, it needs patching in the kernel implementation, not in qemu.

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org

Attachment: signature.asc
Description: OpenPGP digital signature


reply via email to

[Prev in Thread] Current Thread [Next in Thread]