[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Qemu-devel] [PATCH V11 0/5] VMXNET3 paravirtual NIC device implementati
From: |
Dmitry Fleytman |
Subject: |
[Qemu-devel] [PATCH V11 0/5] VMXNET3 paravirtual NIC device implementation |
Date: |
Mon, 25 Feb 2013 22:10:58 +0200 |
This set of patches implements VMWare VMXNET3 paravirtual NIC device.
The device supports of all the device features including offload capabilties,
VLANs and etc.
The device is tested on different OSes:
Fedora 15
Ubuntu 10.4
Centos 6.2
Windows 2008R2
Windows 2008 64bit
Windows 2008 32bit
Windows 2003 64bit
Windows 2003 32bit
Changes in V11:
Rebased to master HEAD to fix multiqueue compilation issues
Changes in V10:
Reported-by: Stefan Hajnoczi <address@hidden>
Issues reported by Stefan Hajnoczi fixed.
Changes in V9:
Reported-by: Stefan Hajnoczi <address@hidden>
Issues reported by Stefan Hajnoczi fixed.
Changes in V8:
Reported-by: Stefan Hajnoczi <address@hidden>
Issues reported by Stefan Hajnoczi reviewed and mostly fixed:
> + }
> + curr_src_off += src[i].iov_len;
> + }
> + return j;
> +}
The existing iov_copy() function provides equivalent functionality. I
don't think iov_rebuild() is needed.
[DF] Done. Thanks, missed it.
> + size -= len;
> + }
> + iovec_off += iov[i].iov_len;
> + }
> + return res;
> +}
Rename this net_checksum_add_iov() and place it in net/checksum.c,
then the new dependency on net from block can be dropped.
[DF] Done.
> +vmw_shmem_read(hwaddr addr, void *buf, int len)
> {
> VMW_SHPRN("SHMEM r: %" PRIx64 ", len: %d to %p", addr, len, buf);
> cpu_physical_memory_read(addr, buf, len);
> }
All changes to this file should be squashed with the previous patch.
[DF] Done
> +#ifdef VMXNET_DEBUG_SHMEM_ACCESS
> +#define VMW_SHPRN(fmt, ...)
> \
> + do {
> \
> + printf("[%s][SH][%s]: " fmt "\n", VMXNET_DEVICE_NAME, __func__,
> \
> + ## __VA_ARGS__);
> \
> + } while (0)
> +#else
> +#define VMW_SHPRN(fmt, ...) do {} while (0)
> +#endif
Please use QEMU tracing. It eliminates all this boilerplate and
conditional compilation. Tracing can be enabled/disabled at runtime
and works with SystemTap/DTrace. See docs/tracing.txt.
[DF] We'd like to stick with compile time logic in this case becase of 2
reasons:
[DF] 1. These printouts are intended for reverse engineering/development only
and there is
[DF] no need to enable them at run time
[DF] 2. There is a big number of printouts, all driver-device communication is
traced,
[DF] they hit performance even on strongest x86 in case of run-time logic
> +struct eth_header {
> + uint8_t h_dest[ETH_ALEN]; /* destination eth addr */
> + uint8_t h_source[ETH_ALEN]; /* source ether addr */
> + uint16_t h_proto; /* packet type ID field */
> +};
Looks like it's copy-pasted stuff from /usr/include/linux/if_*.h,
/usr/include/netinet/*.h, and friends. If the system-wide headers are
included names will collide for some of the macros at least.
Did you check if the slirp/ definitions can be reused?
[DF] Yes, you are right. This is copy-pasted from different places.
[DF] Slips definishing do not fully cover our needs.
I'd rather we import network header definitions once in a generic
place into the source tree. That way vmxnet and other components
don't need to redefine these structs.
[DF] Exaclty! Our intention is to create generic header with network
definitions and make everyone use it.
[DF] We can move our header to some shared place if you want, however I'd do it
in parallel with cleanup
[DF] of similar definitions in existing code and this is a big change that os
out of scope of these patches.
> +
> *===========================================================================*/
Is this huge comment box a sign that the code should be split into a
foo_tx.c and an foo_rx.c file?
[DF] As for me this file is not that big to be splitted (<800 lines), however
I'll do this if you insist :)
> +size_t vmxnet_tx_pkt_send(VmxnetTxPktH pkt, NetClientState *vc)
'vc' is an old name that was used for VLANClientState. The struct has
since been renamed to NetClientState and the rest of QEMU uses 'nc'
instead of 'vc'.
[DF] Fixed. Thanks.
> +/* tx module context handle */
> +typedef void *VmxnetTxPktH;
Forward-declaring the struct is nicer:
typedef struct VmxnetTxPkt VmxnetTxPkt;
The definition of VmxnetTxPkt is still hidden from the caller but you
avoid the void* and casting. In vmxnet_pkt.c define using:
struct VmxnetTxPkt {
...
};
[DF] Agree, fixed. Thanks.
> diff --git a/hw/vmxnet_utils.h b/hw/vmxnet_utils.h
> index 7fd9a01..fac3b7b 100644
> --- a/hw/vmxnet_utils.h
> +++ b/hw/vmxnet_utils.h
Please squash these fixes into the previous patch.
[DF] Oops, my bad. Fixed.
> + uint8_t msix_used;
> + /* Whether MSI support was installed successfully */
> + uint8_t msi_used;
These two fields should be bool.
> + /* Whether automatic interrupts masking enabled */
> + uint8_t auto_int_masking;
bool
[DF] Done.
> +static inline void vmxnet3_flush_shmem_changes(void)
> +{
> + /*
> + * Flush shared memory changes
> + * Needed before sending interrupt to guest to ensure
> + * it gets consistent memory state
> + */
> + smp_wmb();
> +}
It's useful to document why a memory barrier is being used in each
instance. Therefore hiding smp_wmb() inside a wrapper function isn't
great.
[DF] Fixed.
Also, it's suspicious that smb_wmb() is used but no other barriers are
used. What about a read memory barrier when accessing shared memory
written by the guest?
[DF] VMWARE interface build a in safe way - you always read out data buffer
(packet descriptor) with memcopy,
[DF] and then check its last byte to see whether it was fully filled by driver.
[DF] In this case device doesn't need read barriers. Drivers need write
barriers of course to secure
[DF] proper order of writes.
Changes in V7:
Reported-by: Michael S. Tsirkin <address@hidden>
Issues reported by Michael S. Tsirkin reviewed and mostly fixed:
File vmware_utils.h:
...
> +static inline void
> +vmw_shmem_st64(target_phys_addr_t addr, uint64_t value)
> +{
> + VMW_SHPRN("SHMEM store64: %" PRIx64 " (value %" PRIx64 ")", addr, value);
> + stq_le_phys(addr, value);
> +}
> +
Pls remove these wrappers. These are just memory stores. Our codebase
is too large as it is without every driver wrapping all standard calls.
[DF] Idea behind this macros is to have printout in each of them
[DF] Printouts are really needed when reverse-engineering windows guest drivers
[DF] Since there is no windows drivers code this is the only way to understand
[DF] sequences they use
> +/* MACROS for simplification of operations on array-style registers */
UPPERCASE ABUSE
[DF] Fixed
> +#define IS_MULTIREG_ADDR(addr, base, cnt, regsize) \
> + (((addr + 1) > (base)) && ((addr) < (base) + (cnt) * (regsize)))
Same as range_covers_byte(base, cnt * regsize, addr)?
[DF] Fixed, thanks
> +
> +#define MULTIREG_IDX_BY_ADDR(addr, base, regsize) \
> + (((addr) - (base)) / (regsize))
> +
Above two macros is all that's left. No objection but it does not say
what they do - want to add minimal documentation?
And please prefix with VMWARE_ or something.
[DF] Prefix added, macros documented
File vmxnet_utils.h (and related with the same problems):
...
> + * See the COPYING file in the top-level directory.
> + *
> + */
> +
> +#ifndef _VMXNET_UTILS_H_
> +#define _VMXNET_UTILS_H_
Please do not start identifiers with _ followed by an uppercase
lattters.
[DF] Fixed
...
> +
> +struct eth_header {
> + uint8_t h_dest[ETH_ALEN]; /* destination eth addr */
> + uint8_t h_source[ETH_ALEN]; /* source ether addr */
> + uint16_t h_proto; /* packet type ID field */
> +};
> +
And fix struct definitions to
1. follow qemu coding style
[DF] It's not clear what's wrong with the coding style. Please, elaborate.
2. start with vmxnet
[DF] Since this is generic network definitions with no device specifics
[DF] I'm not sure it makes sense to add vmxnet prefix.
[DF] I'd say it makes sense to put them into some generic header files under
"net" directory instead
[DF] and clean up other devices that have their local definitions of the same
structures.
[DF] Please, advise.
I also don't really understand why are these
functions split out - vmxnet is the only user, no?
[DF] Currently vmxnet3 is the only user, hovewer we have vmxnet2 implementation
as well (not submitted yet)
[DF] and it uses the same header
File vmxnet_pkt.h:
...
> +
> +/* tx module context handle */
> +typedef void *VmxnetTxPktH;
This gets you zero type safety and makes debugging impossible.
Just use pointers like everyone does.
[DF] Handle type is added to hide VmxnetTxPkt structure into .c header
[DF] for better encapsulation.
[DF] Should we drop it anyway?
Files vmxnet3.*:
...
> +
> + if (zero_region) {
> + vmw_shmem_set(pa, 0, size*cell_size);
spaces around *
[DF] Fixed
...
> +#define vmxnet3_ring_dump(macro, ring_name, ridx, r)
> \
> + macro("%s#%d: base %" PRIx64 " size %lu cell_size %lu gen %d next %lu",
> \
> + (ring_name), (ridx),
> \
> + (r)->pa, (r)->size, (r)->cell_size, (r)->gen, (r)->next)
make macros upper case
[DF] Fixed
...
> +static inline void vmxnet3_flush_shmem_changes(void)
> +{
> + /*
> + * Flush shared memory changes
> + * Needed before transferring control to guest
what does 'transferring control to guest' mean?
[DF] Changed to
[DF] /*
[DF] * Flush shared memory changes
[DF] * Needed before sending interrupt to guest to ensure
[DF] * it gets consistent memory state
[DF] */
...
> + */
> + smp_wmb();
> +}
Don't use wrappers like this. They just hide bugs. For example
it's not helpful before an interrupt in the function below.
[DF] I guess you are talking about vmxnet3_complete_packet()
[DF] Strictly speaking barrier is a must because we change shared memory in
[DF] vmxnet3_complete_packet()
[DF] And the wrapper is a good thing because its name explains its effect
[DF] in a formal way as opposed to comments
...
> + switch (status) {
> + case VMXNET3_PKT_STATUS_OK: {
don't put {} around cases: they align incorrectly
if it's too big move to a function.
[DF] Fixed
...
> +static bool
> +vmxnet3_send_packet(VMXNET3State *s, uint32_t qidx)
> +{
> + size_t bytes_sent = 0;
> + bool res = true;
why = true? don't initialize just because.
[DF] Fixed
...
> +/*
> + * VMWARE headers we got from Linux kernel do not fully comply QEMU coding
> + * standards in sense of types and defines used.
> + * Since we didn't want to change VMWARE code, following set of typedefs
> + * and defines needed to compile these headers with QEMU introduced.
> + */
No need for this now.
You can export headers and put them under linux-headers.
[DF] Not sure it is possible because the header as-is is not stand-alone and
won't compile
[DF] without changes. We extracted definitions we use from their header and
dropped unused
[DF] and kernel-specific stuff.
[DF} Please, advise.
...
> + if (VMXNET3_OM_TSO == s->offload_mode) {
Don't do Yoda style like this
[DF] "Yoda" style removed everywhere
Changes in V6:
Fixed most of problems pointed out by Michael S. Tsirkin
The only issue still open is creation of shared place
with generic network structures and functions. Currently
all generic network code introduced by VMXNET3 resides in
vmxnet_utils.c/h files. It could be moved to some shared location however
we believe it is a matter of separate refactoring as there are a lot of
copy-pasted
definitions in almost every device and code cleanup efforts requred in order
to create truly shared codebase.
Reported-by: Michael S. Tsirkin <address@hidden>
Implemented suggestions by Anthony Liguori
Reported-by: Anthony Liguori <address@hidden>
Fixed incorrect checksum caclulation for some packets in SW offloads mode
Reported-by: Gerhard Wiesinger <address@hidden>
Changes in V5:
MSI-X save/load implemented in the device instead of pci bus as
suggested by Michael S. Tsirkin
Reported-by: Michael S. Tsirkin <address@hidden>
Patches regrouped as suggested by Paolo Bonzini
Reported-by: Paolo Bonzini <address@hidden>
Changes in V4:
Fixed a few problems uncovered by NETIO test suit
Assertion on failure to initialize MSI/MSI-X replaced with warning
message and fallback to Legacy/MSI respectively
Reported-by: Gerhard Wiesinger <address@hidden>
Various coding style adjustments and patch split-up as suggested by Anthony
Liguori
Reported-by: Anthony Liguori <address@hidden>
Live migration support added
Changes in V3:
Fixed crash when net device that is used as network fronted has no
virtio HDR support.
Task offloads emulation for cases when net device that is used as
network fronted has no virtio HDR support.
Reported-by: Gerhard Wiesinger <address@hidden>
Changes in V2:
License text changed accoring to community suggestions
Standard license header from GPLv2+ - licensed QEMU files used
Dmitry Fleytman (5):
Adding utility function net_checksum_add_cont() that allows checksum
calculation of scattered data with odd chunk sizes
Adding utility function net_checksum_add_iov() for iovec checksum
calculation
Adding common definitions for VMWARE devices
Adding packet abstraction for VMWARE network devices
Adding VMXNET3 device implementation
default-configs/pci.mak | 1 +
hw/Makefile.objs | 1 +
hw/pci.h | 1 +
hw/vmware_utils.h | 143 +++
hw/vmxnet3.c | 2437 +++++++++++++++++++++++++++++++++++++++++++++++
hw/vmxnet3.h | 762 +++++++++++++++
hw/vmxnet_debug.h | 121 +++
hw/vmxnet_pkt.c | 758 +++++++++++++++
hw/vmxnet_pkt.h | 309 ++++++
hw/vmxnet_utils.c | 219 +++++
hw/vmxnet_utils.h | 340 +++++++
iov.h | 5 +
net/checksum.c | 41 +-
net/checksum.h | 22 +-
14 files changed, 5153 insertions(+), 7 deletions(-)
create mode 100644 hw/vmware_utils.h
create mode 100644 hw/vmxnet3.c
create mode 100644 hw/vmxnet3.h
create mode 100644 hw/vmxnet_debug.h
create mode 100644 hw/vmxnet_pkt.c
create mode 100644 hw/vmxnet_pkt.h
create mode 100644 hw/vmxnet_utils.c
create mode 100644 hw/vmxnet_utils.h
--
1.7.11.7
- [Qemu-devel] [PATCH V11 0/5] VMXNET3 paravirtual NIC device implementation,
Dmitry Fleytman <=
- [Qemu-devel] [PATCH v11 2/5] iovec checksum calculation fuction, Dmitry Fleytman, 2013/02/25
- [Qemu-devel] [PATCH v11 1/5] Checksum-related utility functions, Dmitry Fleytman, 2013/02/25
- [Qemu-devel] [PATCH v11 3/5] Common definitions for VMWARE devices, Dmitry Fleytman, 2013/02/25
- [Qemu-devel] [PATCH v11 4/5] Packet abstraction for VMWARE network devices, Dmitry Fleytman, 2013/02/25
- [Qemu-devel] [PATCH v11 5/5] VMXNET3 device implementation, Dmitry Fleytman, 2013/02/25