This patch provides documentation describing the AP architecture and
design concepts behind the virtualization of AP devices. It also
includes an example of how to configure AP devices for exclusive
use of KVM guests.
Signed-off-by: Tony Krowiak <address@hidden>
---
docs/vfio-ap.txt | 624 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
1 files changed, 624 insertions(+), 0 deletions(-)
create mode 100644 docs/vfio-ap.txt
diff --git a/docs/vfio-ap.txt b/docs/vfio-ap.txt
new file mode 100644
index 0000000..54e7523
--- /dev/null
+++ b/docs/vfio-ap.txt
@@ -0,0 +1,624 @@
+Adjunct Processor (AP) Device
+=============================
+
+Contents:
+=========
+* Introduction
+* AP Architectural Overview
+* Start Interpretive Execution (SIE) Instruction
+* AP Matrix Configuration on Linux Host
+* AP Matrix Configuration for a Linux Guest
+* Starting a Linux Guest Configured with an AP Matrix
+* Example: Configure AP Matrices for Two Linux Guests
+
+Introduction:
+============
+The IBM Adjunct Processor (AP) Cryptographic Facility consists
+of three AP instructions and from 1 to 256 PCIe cryptographic adapter cards.
+These AP devices provide cryptographic functions to all CPUs assigned to a
+Linux system running in an IBM Z system LPAR.
+
+On s390x, AP adapter cards are exposed via the AP bus. This document
+describes how those cards may be made available to KVM guests using the
+VFIO mediated device framework.
+
+AP Architectural Overview:
+=========================
+In order to understand the terminology used in the rest of this document, let's
+start with some definitions:
+
+* AP adapter
+
+ An AP adapter is an IBM Z adapter card that can perform cryptographic
+ functions. There can be from 0 to 256 adapters assigned to an LPAR. Adapters
+ assigned to the LPAR in which a Linux host is running will be available to
+ the Linux host. Each adapter is identified by a number from 0 to 255. When
+ installed, an AP adapter is accessed by AP instructions executed by any CPU.
+
+* AP domain
+
+ An adapter is partitioned into domains. Each domain can be thought of as
+ a set of hardware registers for processing AP instructions. An adapter can
+ hold up to 256 domains. Each domain is identified by a number from 0 to 255.
+ Domains can be further classified into two types:
+
+ * Usage domains are domains that can be accessed directly to process AP
+ commands.
+
+ * Control domains are domains that are accessed indirectly by AP
+ commands sent to a usage domain to control or change the domain, for
+ example, to set a secure private key for the domain.
+
+* AP Queue
+
+ An AP queue is the means by which an AP command-request message is sent to
+ an AP usage domain inside a specific AP adapter. An AP queue is identified
+ by a tuple composed of an AP adapter ID (APID) and an AP queue index (APQI).
+ The APQI corresponds to a given usage domain number within the adapter. This
+ tuple forms an AP Queue Number (APQN) uniquely identifying an AP queue. AP
+ instructions include a field containing the APQN to identify the AP queue to
+ which the AP command-request message is to be sent for processing.
+
+* AP Instructions:
+
+ There are three AP instructions:
+
+ * NQAP: to enqueue an AP command-request message to a queue
+ * DQAP: to dequeue an AP command-reply message from a queue
+ * PQAP: to administer the queues
+
+Start Interpretive Execution (SIE) Instruction
+==============================================
+A KVM guest is started by executing the Start Interpretive Execution (SIE)
+instruction. The SIE state description is a control block that contains the
+state information for a KVM guest and is supplied as input to the SIE
+instruction. The SIE state description contains a field that references
+a Crypto Control Block (CRYCB). The CRYCB contains three fields to identify the
+adapters, usage domains and control domains assigned to the KVM guest:
+
+* The AP Mask (APM) field is a bit mask that identifies the AP adapters assigned
+ to the KVM guest. Each bit in the mask, from most significant to least
+ significant bit, corresponds to an APID from 0-255. If a bit is set, the
+ corresponding adapter is valid for use by the KVM guest.
+
+* The AP Queue Mask (AQM) field is a bit mask identifying the AP queues assigned
+ to the KVM guest. Each bit in the mask, from most significant to least
+ significant bit, corresponds to an AP queue index (APQI) from 0-255. If a bit
+ is set, the corresponding queue is valid for use by the KVM guest.
+
+* The AP Domain Mask (ADM) field is a bit mask that identifies the AP control
+ domains assigned to the KVM guest. The ADM bit mask controls which domains
+ can be changed by an AP command-request message sent to a usage domain from
+ the guest. Each bit in the mask, from most significant to least significant
+ bit, corresponds to a domain from 0-255. If a bit is set, the corresponding
+ domain can be modified by an AP command-request message sent to a usage
+ domain configured for the KVM guest.
+
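+As a rough illustration of the bit numbering used by the APM and AQM, the
+following shell sketch builds a 256-bit mask as 64 hex digits, with bit 0 as
+the most significant bit. The function name ap_mask is hypothetical and is
+not part of any kernel or QEMU interface; IDs must be given in decimal:
+
```shell
# Hypothetical helper: print a 256-bit mask (64 hex digits) in which the bit
# for each given ID is set, counting bit 0 as the most significant bit.
ap_mask() {
    ids=" $* "          # pad with spaces so each ID can be matched exactly
    mask=""
    nibble=0
    bit=0
    while [ "$bit" -lt 256 ]; do
        nibble=$(( nibble << 1 ))
        case "$ids" in
            *" $bit "*) nibble=$(( nibble | 1 )) ;;
        esac
        # Emit one hex digit after every fourth bit.
        if [ $(( bit % 4 )) -eq 3 ]; then
            mask="$mask$(printf '%x' "$nibble")"
            nibble=0
        fi
        bit=$(( bit + 1 ))
    done
    printf '%s\n' "$mask"
}

# Mask for adapters 1 and 2: first hex digit is 6 (binary 0110).
ap_mask 1 2
```
+
+For adapters 1 and 2 the first hex digit is 6 (binary 0110) and the remaining
+63 digits are 0.
+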
+Recall from the description of an AP queue that AP instructions include
+an APQN to identify the AP adapter and AP queue to which an AP command-request
+message is to be sent (NQAP and PQAP instructions), or from which a
+command-reply message is to be received (DQAP instruction). The validity of an
+APQN is defined by the matrix calculated from the APM and AQM; it is the
+cross product of all assigned adapter numbers (APM) with all assigned queue
+indexes (AQM). For example, if adapters 1 and 2 and usage domains 5 and 6 are
+assigned to a guest, the APQNs (1,5), (1,6), (2,5) and (2,6) will be valid for
+the guest.
+
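+The matrix calculation can be sketched in shell as every combination of an
+assigned adapter with an assigned usage domain. The function name apqn_matrix
+is an assumption made for illustration only:
+
```shell
# Hypothetical helper: print every APQN formed by pairing each adapter number
# in $1 with each usage domain number in $2 (both space-separated lists).
apqn_matrix() {
    adapters=$1
    domains=$2
    for apid in $adapters; do
        for apqi in $domains; do
            printf '(%d,%d)\n' "$apid" "$apqi"
        done
    done
}

# Adapters 1 and 2 with usage domains 5 and 6.
apqn_matrix "1 2" "5 6"
```
+
+This prints the four APQNs (1,5), (1,6), (2,5) and (2,6), one per line.
+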
+The APQNs provide secure key functionality - i.e., a private key is stored on
+the adapter card for each of its domains - so each APQN must be assigned to at
+most one guest or the Linux host.
+
+ Example 1: Valid configuration:
+ ------------------------------
+ Guest1: adapters 1,2 domains 5,6
+ Guest2: adapters 1,2 domain 7
+
+ This is valid because both guests have a unique set of APQNs: Guest1 has
+ APQNs (1,5), (1,6), (2,5) and (2,6); Guest2 has APQNs (1,7) and (2,7).
+
+ Example 2: Invalid configuration:
+ --------------------------------
+ Guest1: adapters 1,2 domains 5,6
+ Guest2: adapter 1 domains 6,7
+
+ This is an invalid configuration because both guests have access to
+ APQN (1,6).
+
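+The check behind these two examples can be sketched as follows: expand each
+guest's adapters and domains into APQNs and report any APQN that appears in
+both. The helper names apqns and shared_apqns are hypothetical:
+
```shell
# Hypothetical helper: print each APQN as "apid,apqi", one per line, for the
# space-separated adapter list $1 and usage domain list $2.
apqns() {
    for apid in $1; do
        for apqi in $2; do
            printf '%d,%d\n' "$apid" "$apqi"
        done
    done
}

# Print the APQNs shared by two guests. Arguments: guest 1 adapters, guest 1
# domains, guest 2 adapters, guest 2 domains. Empty output means no conflict.
shared_apqns() {
    first=$(apqns "$1" "$2")
    apqns "$3" "$4" | while read -r apqn; do
        if printf '%s\n' "$first" | grep -qxF "$apqn"; then
            printf '%s\n' "$apqn"
        fi
    done
}

shared_apqns "1 2" "5 6" "1 2" "7"     # Example 1: prints nothing
shared_apqns "1 2" "5 6" "1" "6 7"     # Example 2: prints 1,6
```
+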
+AP Matrix Configuration on Linux Host:
+=====================================
+A Linux system is a guest of the LPAR in which it is running and has access to
+the AP resources configured for the LPAR. The LPAR's AP matrix is
+configured using the 'Customize/Delete Activation Profiles' dialog from the HMC.
+This dialog displays the activation profiles configured for the Linux system.
+Selecting the specific activation profile to be edited and clicking the
+'Customize Profile' button will open the 'Customize Image Profiles' dialog.
+Selecting the 'Crypto' link in the tree view on the left hand side of the dialog
+will display the AP matrix configuration in the right hand panel. There, one can
+assign AP adapters - called Cryptos - and domains to the LPAR. When the Linux
+system is started using this activation profile, it will have access to the
+matrix of AP adapters and domains configured via the activation profile.
+
+When the Linux system is started, the AP adapter devices will be connected to
+the AP bus and the following AP matrix interfaces will be created in sysfs:
+
+/sys/bus/ap
+... [devices]
+...... xx.yyyy
+...... ...
+...... cardxx
+...... ...
+
+Where:
+ cardxx is adapter number xx (in hex)
+ yyyy is usage domain number yyyy (in hex)
+ xx.yyyy is APQN (xx,yyyy)
+
+For example, if AP adapters 5 and 6 and domains 4 and 71 (0x47) are configured
+for the LPAR, the sysfs representation on the Linux system would look like this:
+
+/sys/bus/ap
+... [devices]
+...... 05.0004
+...... 05.0047
+...... 06.0004
+...... 06.0047
+...... card05
+...... card06
+
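+The names above follow from formatting the adapter number as two hex digits
+and the domain number as four. A minimal sketch, using the hypothetical helper
+names ap_card_name and ap_queue_name:
+
```shell
# Hypothetical helpers: derive the sysfs device names for an adapter number
# and for an (adapter, domain) pair.
ap_card_name() {
    printf 'card%02x\n' "$1"
}

ap_queue_name() {
    printf '%02x.%04x\n' "$1" "$2"
}

ap_card_name 5        # card05
ap_queue_name 5 71    # 05.0047
```
+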
+There will also be AP device drivers created to control each type of AP matrix
+interface available to the IBM Z system:
+
+/sys/bus/ap
+... [drivers]
+...... [cex2acard] for Crypto Express 2/3 accelerator cards
+...... [cex2aqueue] for AP queues served by Crypto Express 2/3
+ accelerator cards
+...... [cex4card] for Crypto Express 4/5/6 accelerator and coprocessor
+ cards
+...... [cex4queue] for AP queues served by Crypto Express 4/5/6
+ accelerator and coprocessor cards
+...... [pcixcccard] for Crypto Express 2/3 coprocessor cards
+...... [pcixccqueue] for AP queues served by Crypto Express 2/3
+ coprocessor cards
+
+Links to the AP interfaces controlled by each AP device driver will be created
+in the device driver's sysfs directory. For example, if AP adapter 5 and domains
+4 and 71 (0x47) are assigned to the LPAR and adapter 5 is a CEX5 card, the
+following links will be created in the CEX5 drivers' sysfs directories:
+
+/sys/bus/ap
+... [drivers]
+...... [cex4card]
+......... [card05]
+...... [cex4queue]
+......... [05.0004]
+......... [05.0047]
+
+AP Matrix Configuration for a Linux Guest:
+=========================================
+In order to configure the AP matrix for a guest, the adapters, usage domains
+and control domains to be used by the guest must be assigned to the guest. This
+section describes how to configure a guest's AP matrix.
+
+The kernel interfaces for configuring an AP matrix for a Linux guest are built
+on the VFIO mediated device framework and are provided by the vfio_ap
+kernel module. By default, the vfio_ap module is a loadable module. The
+dependency chain for the vfio_ap module is:
+* vfio
+* mdev
+* vfio_mdev
+* vfio_ap
+
+When installed, the vfio_ap module is initialized. During module initialization,
+a vfio_ap driver is created and registered with the AP bus, creating the
+following sysfs interfaces:
+
+/sys/bus/ap/drivers/
+... [vfio_ap]
+...... bind
+...... unbind
+
+The vfio_ap device driver will create a 'matrix' device to hold the APQNs
+reserved for exclusive use by KVM guests:
+
+/sys/devices/
+... [vfio_ap]
+......[matrix] symlink to the matrix device directory
+
+The vfio_ap device driver serves two purposes:
+1. Provides an interface for securing APQNs, preventing their use by the host
+ Linux system and reserving them for use by one or more guests.
+2. Creates the sysfs interfaces for configuring an AP matrix for a Linux guest.
+
+Securing APQNs
+--------------
+ An APQN is reserved by unbinding an AP queue device from its AP bus device
+ driver and binding it to the vfio_ap device driver. For example, suppose we
+ want to secure APQN (05,0004). Assuming that the AP adapter card 5 is a CEX5
+ coprocessor card:
+
+ echo 05.0004 > /sys/bus/ap/drivers/cex4queue/unbind
+ echo 05.0004 > /sys/bus/ap/drivers/vfio_ap/bind
+
+ This action will store the APQN in the /sys/devices/vfio_ap/matrix device,
+ which makes it available for use by a Linux guest.
+
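+The unbind/bind pair can be wrapped in a small function. This is a sketch
+only: the function name reserve_apqn and the AP_DRIVERS variable are
+assumptions introduced here so the driver directory can be pointed at a
+scratch location for a dry run. On a real system the writes require root:
+
```shell
# AP_DRIVERS is an assumption of this sketch, not a kernel interface; it
# defaults to the driver directory described in this document.
AP_DRIVERS=${AP_DRIVERS:-/sys/bus/ap/drivers}

# Hypothetical helper: move an AP queue device from its current AP bus
# driver to the vfio_ap driver.
#   $1  queue device name, e.g. 05.0004
#   $2  driver currently bound to the queue, e.g. cex4queue
reserve_apqn() {
    queue=$1
    driver=$2
    echo "$queue" > "$AP_DRIVERS/$driver/unbind" &&
        echo "$queue" > "$AP_DRIVERS/vfio_ap/bind"
}
```
+
+With the defaults, 'reserve_apqn 05.0004 cex4queue' performs the same two
+writes as the commands shown above.
+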
+Configuring an AP matrix for a Linux guest
+------------------------------------------
+These sysfs interfaces are built on the VFIO mediated device framework. To
+configure an AP matrix for a guest, a mediated matrix device must first be
+created for the /sys/devices/vfio_ap/matrix device. The sysfs interface
+for creating a mediated matrix device is in:
+
+/sys/devices
+... [vfio_ap]
+......[matrix]
+......... [mdev_supported_types]
+............ [vfio_ap-passthrough]
+............... create
+............... [devices]
+
+A mediated AP matrix device is created by writing a UUID to the attribute
+file named 'create', for example:
+
+ uuidgen > create
+
+When a mediated AP matrix device is created, a sysfs directory named after
+the UUID is created in the 'devices' subdirectory:
+
+/sys/devices
+... [vfio_ap]
+......[matrix]
+......... [mdev_supported_types]
+............ [vfio_ap-passthrough]
+............... create
+............... [devices]
+.................. [$uuid]
+
+There will also be three sets of attribute files created in the mediated
+matrix device's sysfs directory to configure an AP matrix for the
+KVM guest:
+
+/sys/devices
+... [vfio_ap]
+......[matrix]
+......... [mdev_supported_types]
+............ [vfio_ap-passthrough]
+............... create
+............... [devices]
+.................. [$uuid]
+..................... assign_adapter
+..................... assign_control_domain
+..................... assign_domain
+..................... matrix
+..................... unassign_adapter
+..................... unassign_control_domain
+..................... unassign_domain
+
+assign_adapter
+ To assign an AP adapter to the mediated matrix device, its APID is written
+ to the 'assign_adapter' file. This may be done multiple times to assign more
+ than one adapter. The APID may be specified using conventional semantics
+ as a decimal, hexadecimal, or octal number. For example, to assign adapters
+ 4, 5 and 16 to mediated matrix device $uuid in decimal, hexadecimal and octal
+ respectively:
+
+ echo 4 > assign_adapter
+ echo 0x5 > assign_adapter
+ echo 020 > assign_adapter
+
+unassign_adapter
+ To unassign an AP adapter, its APID is written to the 'unassign_adapter'
+ file. This may also be done multiple times to unassign more than one adapter.
+
+assign_domain
+ To assign a usage domain, the APQI corresponding to the domain number is
+ written into the 'assign_domain' file. This may be done multiple times to
+ assign more than one usage domain. The APQI may be specified using
+ conventional semantics as a decimal, hexadecimal, or octal number. For
+ example, to assign usage domains 4, 8, and 71 to mediated matrix device
+ $uuid in decimal, hexadecimal and octal respectively:
+
+ echo 4 > assign_domain
+ echo 0x8 > assign_domain
+ echo 0107 > assign_domain
+
+unassign_domain
+ To unassign a usage domain, the APQI corresponding to the domain number is
+ written into the 'unassign_domain' file. This may be done multiple times to
+ unassign more than one usage domain.
+
+assign_control_domain
+ To assign a control domain, the domain number is written into the
+ 'assign_control_domain' file. This may be done multiple times to
+ assign more than one control domain. The domain number may be specified using
+ conventional semantics as a decimal, hexadecimal, or octal number. For
+ example, to assign control domains 4, 8, and 71 to mediated matrix device
+ $uuid in decimal, hexadecimal and octal respectively:
+
+ echo 4 > assign_control_domain
+ echo 0x8 > assign_control_domain
+ echo 0107 > assign_control_domain
+
+unassign_control_domain
+ To unassign a control domain, the domain number is written into the
+ 'unassign_control_domain' file. This may be done multiple times to unassign
+ more than one control domain.
+
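+The decimal, hexadecimal and octal notations used with the assign_* and
+unassign_* files can be checked with the shell's printf, which follows the
+same C-style convention (a leading 0x for hexadecimal, a leading 0 for octal).
+The helper name to_decimal is hypothetical, and the sketch only illustrates
+the notation; it makes no claim about how the driver parses its input:
+
```shell
# Hypothetical helper: print the decimal value of a number written in
# decimal, hexadecimal (0x prefix) or octal (leading 0) notation.
to_decimal() {
    printf '%d\n' "$1"
}

# The values used in the examples above.
for id in 4 0x5 020 0x8 0107; do
    to_decimal "$id"
done
```
+
+The loop prints 4, 5, 16, 8 and 71, one per line.
+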
+Notes:
+* Hot plug/unplug is not currently supported for mediated AP matrix devices,
+ so the AP matrix resulting from assignment and/or unassignment of AP
+ adapters, usage domains and control domains to a mediated AP matrix device
+ while the guest is running will not take effect until the Linux guest is
+ rebooted.
+* By architectural convention, all usage domains configured for a KVM guest
+ are also implicitly assigned as control domains, so there is no
+ need to assign control domains that are assigned as usage domains.
+
+Starting a Linux Guest Configured with an AP Matrix:
+===================================================
+In addition to providing the sysfs interfaces for configuring the AP matrix for
+a Linux guest, a mediated matrix device also acts as a communication pathway
+between QEMU and the vfio_ap device driver. To gain access to the
+device driver, the following option must be specified on the QEMU command line:
+
+ -device vfio-ap,sysfsdev=$path-to-mdev
+
+The sysfsdev parameter specifies the path to the mediated matrix device.
+There are a number of ways to specify this path:
+
+/sys/devices/vfio_ap/matrix/$uuid
+/sys/bus/mdev/devices/$uuid
+/sys/bus/mdev/drivers/vfio_mdev/$uuid
+/sys/devices/vfio_ap/matrix/mdev_supported_types/vfio_ap-passthrough/devices/$uuid
+
+When the Linux guest is subsequently started, the guest will open the mediated
+matrix device's file descriptor to get information about the mediated matrix
+device. The vfio_ap device driver will update the APM, AQM, and ADM fields in
+the guest's CRYCB with the adapters, usage domains and control domains assigned
+via the mediated matrix device's sysfs attribute files. Programs running on the
+Linux guest will then:
+
+1. Have direct access to the APQNs derived from the intersection of the AP
+ adapter and usage domain numbers specified in the APM and AQM respectively.
+
+2. Have authorization to process AP commands to change - e.g., store a new
+ secure key - a control domain identified in an AP instruction sent to a valid
+ APQN.
+
+CPU model features:
+
+Three CPU model features are available for controlling guest access to AP
+facilities:
+
+1. AP facilities feature
+
+ The AP facilities feature indicates that AP facilities are installed on the
+ guest. This feature will be enabled by the kernel only if the AP facilities
+ are installed on the host system. It will be turned on automatically for
+ guests started with CPU model zEC12 or newer. The feature is s390-specific and
+ is represented as a parameter of the -cpu option on the QEMU command line:
+
+ qemu-system-s390x -cpu $model,ap=on|off
+
+ Where:
+
+ $model is the CPU model defined for the guest (defaults to the model of
+ the host system if not specified).
+
+ ap=on|off indicates whether AP facilities are installed (on) or not
+ installed (off). The default for CPU models zEC12 or newer
+ is ap=on. AP facilities must be installed when this parameter
+ is used in conjunction with -device vfio-ap,sysfsdev=$path or
+ the guest will not start.
+
+2. Query Configuration Information (QCI) facility
+
+ The QCI facility is used by the AP bus running on the guest to query the
+ configuration of the AP facilities. This facility will be enabled by
+ the kernel only if the QCI facility is installed on the host system. It will
+ be turned on automatically for guests started with CPU model zEC12 or newer.
+ The feature is s390-specific and is represented as a parameter of the -cpu
+ option on the QEMU command line:
+
+ qemu-system-s390x -cpu $model,qci=on|off
+
+ Where:
+
+ $model is the CPU model defined for the guest
+
+ qci=on|off indicates whether the QCI facility is installed (on) or not
+ installed (off). The default for CPU models zEC12 or newer
+ is qci=on. Turning the QCI facility on makes no sense if it
+ is not used in conjunction with the
+ '-device vfio-ap,sysfsdev=$path' option. A warning will be
+ presented if QCI is turned on and the AP facilities are not
+ installed.
+
+ If the QCI facility is turned off, APQNs with an APQI
+ greater than 15 will not be accessible from the guest.
+
+3. Adjunct Processor Facility Test (APFT) facility
+
+ The APFT facility is used by the AP bus running on the guest to test the
+ AP facilities available for a given AP queue. This facility will be enabled
+ by the kernel only if the APFT facility is installed on the host system. It
+ will be turned on automatically for guests started with CPU model zEC12 or
+ newer. The feature is s390-specific and is represented as a parameter of the
+ -cpu option on the QEMU command line:
+
+ qemu-system-s390x -cpu $model,apft=on|off
+
+ Where:
+
+ $model is the CPU model defined for the guest (defaults to the model of
+ the host system if not specified).
+
+ apft=on|off indicates whether the APFT facility is installed (on) or
+ not installed (off). The default for CPU models zEC12 and
+ newer is apft=on. Turning the APFT facility on makes no
+ sense if it is not used in conjunction with the
+ -device vfio-ap,sysfsdev=$path option. A warning will be
+ presented if APFT is turned on and the AP facilities are
+ not installed.
+
+ It also makes no sense to turn APFT off when used in
+ conjunction with the vfio-ap device because the APFT
+ facility is required; the AP bus running on the guest will
+ not detect CEX4 and newer devices without it. Since only
+ CEX4 and newer devices are supported for guest usage, no AP
+ devices can be made accessible to a guest started without
+ APFT installed.
+
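+Putting the options from this section together, the relevant part of a QEMU
+command line might be assembled as sketched below. The helper name
+vfio_ap_qemu_args is hypothetical, the other options a real guest needs
+(memory, disks and so on) are omitted, and the mediated device path uses the
+/sys/bus/mdev/devices/$uuid form listed earlier:
+
```shell
# Hypothetical helper: print the -cpu and -device options that enable the
# three AP features and attach a mediated matrix device.
#   $1  CPU model for the guest
#   $2  UUID of the mediated matrix device
vfio_ap_qemu_args() {
    model=$1
    uuid=$2
    printf '%s\n' "-cpu $model,ap=on,qci=on,apft=on -device vfio-ap,sysfsdev=/sys/bus/mdev/devices/$uuid"
}
```
+
+The output is meant to be appended to a qemu-system-s390x invocation. For CPU
+models zEC12 or newer the three features default to on, so spelling them out
+only makes the intent explicit.
+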
+Example: Configure AP Matrices for Two Linux Guests:
+===================================================
+Let's now provide an example to illustrate how KVM guests may be given
+access to AP facilities. For this example, we will show how to configure
+two guests such that executing the lszcrypt command on the guests would
+look like this:
+
+Guest1
+------
+CARD.DOMAIN TYPE  MODE
+------------------------------
+05          CEX5C CCA-Coproc
+05.0004     CEX5C CCA-Coproc
+05.00ab     CEX5C CCA-Coproc
+06          CEX5A Accelerator
+06.0004     CEX5A Accelerator
+06.00ab     CEX5A Accelerator
+