summaryrefslogtreecommitdiff
diff options
context:
space:
mode:
-rw-r--r--conformance.tex2
-rw-r--r--content.tex145
2 files changed, 125 insertions, 22 deletions
diff --git a/conformance.tex b/conformance.tex
index 7b7df32..f59e360 100644
--- a/conformance.tex
+++ b/conformance.tex
@@ -105,6 +105,7 @@ A network driver MUST conform to the following normative statements:
A block driver MUST conform to the following normative statements:
\begin{itemize}
+\item \ref{drivernormative:Device Types / Block Device / Device Initialization}
\item \ref{drivernormative:Device Types / Block Device / Device Operation}
\end{itemize}
@@ -224,6 +225,7 @@ A network device MUST conform to the following normative statements:
A block device MUST conform to the following normative statements:
\begin{itemize}
+\item \ref{devicenormative:Device Types / Block Device / Device Initialization}
\item \ref{devicenormative:Device Types / Block Device / Device Operation}
\end{itemize}
diff --git a/content.tex b/content.tex
index 4be0f7d..e054919 100644
--- a/content.tex
+++ b/content.tex
@@ -4041,8 +4041,13 @@ device except where noted.
\item[VIRTIO_BLK_F_BLK_SIZE (6)] Block size of disk is in \field{blk_size}.
+\item[VIRTIO_BLK_F_FLUSH (9)] Cache flush command support.
+
\item[VIRTIO_BLK_F_TOPOLOGY (10)] Device exports information on optimal I/O
alignment.
+
+\item[VIRTIO_BLK_F_CONFIG_WCE (11)] Device can toggle its cache between writeback
+ and writethrough modes.
\end{description}
\subsubsection{Legacy Interface: Feature bits}\label{sec:Device Types / Block Device / Feature bits / Legacy Interface: Feature bits}
@@ -4051,16 +4056,12 @@ device except where noted.
\item[VIRTIO_BLK_F_BARRIER (0)] Device supports request barriers.
\item[VIRTIO_BLK_F_SCSI (7)] Device supports scsi packet commands.
-
-\item[VIRTIO_BLK_F_FLUSH (9)] Cache flush command support.
-
-\item[VIRTIO_BLK_F_CONFIG_WCE (11)] Device can toggle its cache between writeback
- and writethrough modes.
\end{description}
-VIRTIO_BLK_F_FLUSH was also called VIRTIO_BLK_F_WCE: Legacy drivers
-MUST only negotiate this feature if they are capable of sending
-VIRTIO_BLK_T_FLUSH commands.
+\begin{note}
+ In the legacy interface, VIRTIO_BLK_F_FLUSH was also
+ called VIRTIO_BLK_F_WCE.
+\end{note}
\subsection{Device configuration layout}\label{sec:Device Types / Block Device / Device configuration layout}
@@ -4089,7 +4090,7 @@ struct virtio_blk_config {
// optimal (suggested maximum) I/O size in blocks
le32 opt_io_size;
} topology;
- u8 reserved;
+ u8 writeback;
};
\end{lstlisting}
@@ -4119,21 +4120,63 @@ according to the native endian of the guest rather than
\field{topology} struct can be read to determine the physical block size and optimal
I/O lengths for the driver to use. This also does not affect the units
in the protocol, only performance.
+
+\item If the VIRTIO_BLK_F_CONFIG_WCE feature is negotiated, the cache
+ mode can be read or set through the \field{writeback} field. 0 corresponds
+ to a writethrough cache, 1 to a writeback cache\footnote{Consistent with
+ \ref{devicenormative:Device Types / Block Device / Device Operation},
+ a writethrough cache can be defined broadly as a cache that commits
+ writes to persistent device backend storage before reporting their
+ completion. For example, a battery-backed writeback cache actually
+ counts as writethrough according to this definition.}. The cache mode
+ after reset can be either writeback or writethrough. The actual
+ mode can be determined by reading \field{writeback} after feature
+ negotiation.
\end{enumerate}
+\drivernormative{\subsubsection}{Device Initialization}{Device Types / Block Device / Device Initialization}
+
+Drivers SHOULD NOT negotiate VIRTIO_BLK_F_FLUSH if they are incapable of
+sending VIRTIO_BLK_T_FLUSH commands.
+
+If neither VIRTIO_BLK_F_CONFIG_WCE nor VIRTIO_BLK_F_FLUSH are
+negotiated, the driver MAY deduce the presence of a writethrough cache.
+If VIRTIO_BLK_F_CONFIG_WCE was not negotiated but VIRTIO_BLK_F_FLUSH was,
+the driver SHOULD assume presence of a writeback cache.
+
+The driver MUST NOT read \field{writeback} before setting
+the FEATURES_OK \field{status} bit.
+
+\devicenormative{\subsubsection}{Device Initialization}{Device Types / Block Device / Device Initialization}
+
+Devices SHOULD always offer VIRTIO_BLK_F_FLUSH, and MUST offer it
+if they offer VIRTIO_BLK_F_CONFIG_WCE.
+
+If VIRTIO_BLK_F_CONFIG_WCE is negotiated but VIRTIO_BLK_F_FLUSH
+is not, the device MUST initialize \field{writeback} to 0.
+
\subsubsection{Legacy Interface: Device Initialization}\label{sec:Device Types / Block Device / Device Initialization / Legacy Interface: Device Initialization}
-The \field{reserved} field used to be called \field{writeback}. If the
-VIRTIO_BLK_F_CONFIG_WCE feature is offered, the cache mode can be
-read from \field{writeback}; the
-driver can also write to the field in order to toggle the cache
-between writethrough (0) and writeback (1) mode. If the feature is
-not available, the driver can instead look at the result of
-negotiating VIRTIO_BLK_F_FLUSH: the cache will be in writeback mode
-after reset if and only if VIRTIO_BLK_F_FLUSH is negotiated.
+Because legacy devices do not have FEATURES_OK, transitional devices
+MUST implement slightly different behavior around feature negotiation
+when used through the legacy interface. In particular, when using the
+legacy interface:
+
+\begin{itemize}
+\item the driver MAY read or write \field{writeback} before setting
+ the DRIVER or DRIVER_OK \field{status} bit
+
+\item the device MUST NOT modify the cache mode (and \field{writeback})
+ as a result of a driver setting a status bit, unless
+ the DRIVER_OK bit is being set and the driver has not set the
+ VIRTIO_BLK_F_CONFIG_WCE driver feature bit.
+
+\item the device MUST NOT modify the cache mode (and \field{writeback})
+ as a result of a driver modifying the driver feature bits, for example
+ if the driver sets the VIRTIO_BLK_F_CONFIG_WCE driver feature bit but
+ does not set the VIRTIO_BLK_F_FLUSH bit.
+\end{itemize}
-Some older legacy devices did not operate in writethrough mode even
-after a driver announced lack of support for VIRTIO_BLK_F_FLUSH.
\subsection{Device Operation}\label{sec:Device Types / Block Device / Device Operation}
@@ -4183,14 +4226,66 @@ A driver SHOULD accept the VIRTIO_BLK_F_RO feature if offered.
A driver MUST set \field{sector} to 0 for a VIRTIO_BLK_T_FLUSH request.
A driver SHOULD NOT include any data in a VIRTIO_BLK_T_FLUSH request.
+If the VIRTIO_BLK_F_CONFIG_WCE feature is negotiated, the driver MAY
+switch to writethrough or writeback mode by writing respectively 0 and
+1 to the \field{writeback} field. After writing a 0 to \field{writeback},
+the driver MUST NOT assume that any volatile writes have been committed
+to persistent device backend storage.
+
\devicenormative{\subsubsection}{Device Operation}{Device Types / Block Device / Device Operation}
A device MUST set the \field{status} byte to VIRTIO_BLK_S_IOERR
for a write request if the VIRTIO_BLK_F_RO feature if offered, and MUST NOT
write any data.
-Upon receipt of a VIRTIO_BLK_T_FLUSH request, the device SHOULD ensure
-that any writes which were completed are committed to non-volatile storage.
+A write is considered volatile when it is submitted; the contents of
+sectors covered by a volatile write are undefined in persistent device
+backend storage until the write becomes stable. A write becomes stable
+once it is completed and one or more of the following conditions is true:
+
+\begin{enumerate}
+\item\label{item:flush1} neither VIRTIO_BLK_F_CONFIG_WCE nor
+ VIRTIO_BLK_F_FLUSH feature were negotiated, but VIRTIO_BLK_F_FLUSH was
+ offered by the device;
+
+\item\label{item:flush2} the VIRTIO_BLK_F_CONFIG_WCE feature was negotiated and the
+ \field{writeback} field in configuration space was 0 \textbf{all the time between
+ the submission of the write and its completion};
+
+\item\label{item:flush3} a VIRTIO_BLK_T_FLUSH request is sent \textbf{after the write is
+ completed} and is completed itself.
+\end{enumerate}
+
+If the device is backed by persistent storage, the device MUST ensure that
+stable writes are committed to it, before reporting completion of the write
+(cases~\ref{item:flush1} and~\ref{item:flush2}) or the flush
+(case~\ref{item:flush3}). Failure to do so can cause data loss
+in case of a crash.
+
+If the driver changes \field{writeback} between the submission of the write
+and its completion, the write could be either volatile or stable when
+its completion is reported; in other words, the exact behavior is undefined.
+
+% According to the device requirements for device initialization:
+% Offer(CONFIG_WCE) => Offer(FLUSH).
+%
+% After reversing the implication:
+% not Offer(FLUSH) => not Offer(CONFIG_WCE).
+
+If VIRTIO_BLK_F_FLUSH was not offered by the
+ device\footnote{Note that in this case, according to
+ \ref{devicenormative:Device Types / Block Device / Device Initialization},
+ the device will not have offered VIRTIO_BLK_F_CONFIG_WCE either.}, the
+device MAY also commit writes to persistent device backend storage before
+reporting their completion. Unlike case~\ref{item:flush1}, however, this
+is not an absolute requirement of the specification.
+
+\begin{note}
+ An implementation that does not offer VIRTIO_BLK_F_FLUSH and does not commit
+ completed writes will not be resilient to data loss in case of crashes.
+ Not offering VIRTIO_BLK_F_FLUSH is an absolute requirement
+ for implementations that do not wish to be safe against such data losses.
+\end{note}
\subsubsection{Legacy Interface: Device Operation}\label{sec:Device Types / Block Device / Device Operation / Legacy Interface: Device Operation}
When using the legacy interface, transitional devices and drivers
@@ -4233,6 +4328,12 @@ serve as data consistency guarantee. Only a VIRTIO_BLK_T_FLUSH request
does that.
\end{note}
+Some older legacy devices did not commit completed writes to persistent
+device backend storage when VIRTIO_BLK_F_FLUSH was offered but not
+negotiated. In order to work around this, the driver MAY set the
+\field{writeback} to 0 (if available) or it MAY send an explicit flush
+request after every completed write.
+
If the device has VIRTIO_BLK_F_SCSI feature, it can also support
scsi packet command requests, each of these requests is of form:
@@ -5577,7 +5678,7 @@ contents of \field{event}. The following events are defined:
\end{lstlisting}
By sending this event, the device signals a change in the configuration parameters
- of a logical unit, for example the capacity or caching mode.
+ of a logical unit, for example the capacity or cache mode.
\field{event} is set to VIRTIO_SCSI_T_PARAM_CHANGE.
\field{lun} addresses a logical unit in the SCSI host.