\chapter{Introduction}

\input{abstract.tex}

\begin{description}
\item[Straightforward:] Virtio devices use normal bus mechanisms of
  interrupts and DMA which should be familiar to any device driver
  author. There is no exotic page-flipping or COW mechanism: it's just
  a normal device.\footnote{This lack of page-sharing implies that the implementation of the
device (e.g. the hypervisor or host) needs full access to the
guest memory. Communication with untrusted parties (i.e.
inter-guest communication) requires copying.
}

\item[Efficient:] Virtio devices consist of rings of descriptors
  for both input and output, which are neatly laid out to avoid cache
  effects from both driver and device writing to the same cache
  lines.

\item[Standard:] Virtio makes no assumptions about the environment in which
  it operates, beyond supporting the bus to which the device is attached.
  In this specification, virtio
  devices are implemented over MMIO, Channel I/O and PCI bus transports
\footnote{The Linux implementation further separates the virtio
transport code from the specific virtio drivers: these drivers are shared
between different transports.
}; earlier drafts
  have been implemented on other buses not included here.

\item[Extensible:] Virtio devices contain feature bits which are
  acknowledged by the guest operating system during device setup.
  This allows forwards and backwards compatibility: the device
  offers all the features it knows about, and the driver
  acknowledges those it understands and wishes to use.
\end{description}
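
As a minimal sketch of the feature-bit handshake described under
``Extensible'' above, the set a driver acknowledges can be computed as the
intersection of what the device offers and what the driver understands.
The helper name \texttt{negotiate\_features} and the 64-bit width of the
feature word are assumptions made for illustration only; they are not
defined in this chapter.
\begin{lstlisting}
#include <stdint.h>

/*
 * Hypothetical helper (illustration only).
 * device_features:  bits offered by the device.
 * driver_supported: bits this driver understands and wishes to use.
 */
static inline uint64_t negotiate_features(uint64_t device_features,
                                          uint64_t driver_supported)
{
        /* Only bits that are both offered and understood are acknowledged. */
        return device_features & driver_supported;
}
\end{lstlisting}
An older driver paired with a newer device simply leaves the unknown bits
unacknowledged, and a newer driver paired with an older device never sees
the newer bits offered; this is what gives compatibility in both directions.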

\section{Normative References}

\begin{longtable}{l p{5in}}
	\phantomsection\label{intro:rfc2119}\textbf{[RFC2119]} &
Bradner S., ``Key words for use in RFCs to Indicate Requirement
Levels'', BCP 14, RFC 2119, March 1997. \newline\url{http://www.ietf.org/rfc/rfc2119.txt}\\
	\phantomsection\label{intro:S390 PoP}\textbf{[S390 PoP]} & z/Architecture Principles of Operation, IBM Publication SA22-7832, \newline\url{http://publibfi.boulder.ibm.com/epubs/pdf/dz9zr009.pdf}, and any future revisions\\
	\phantomsection\label{intro:S390 Common I/O}\textbf{[S390 Common I/O]} & ESA/390 Common I/O-Device and Self-Description, IBM Publication SA22-7204, \newline\url{http://publibfp.dhe.ibm.com/cgi-bin/bookmgr/BOOKS/dz9ar501/CCONTENTS}, and any future revisions\\
	\phantomsection\label{intro:PCI}\textbf{[PCI]} &
	Conventional PCI Specifications,
	\newline\url{http://www.pcisig.com/specifications/conventional/},
	PCI-SIG\\
	\phantomsection\label{intro:PCIe}\textbf{[PCIe]} &
	PCI Express Specifications
	\newline\url{http://www.pcisig.com/specifications/pciexpress/},
	PCI-SIG\\
	\phantomsection\label{intro:IEEE 802}\textbf{[IEEE 802]} &
	IEEE Standard for Local and Metropolitan Area Networks: Overview and Architecture,
	\newline\url{http://standards.ieee.org/about/get/802/802.html},
	IEEE\\
	\phantomsection\label{intro:SAM}\textbf{[SAM]} &
        SCSI Architectural Model,
        \newline\url{http://www.t10.org/cgi-bin/ac.pl?t=f&f=sam4r05.pdf}\\
	\phantomsection\label{intro:SCSI MMC}\textbf{[SCSI MMC]} &
        SCSI Multimedia Commands,
        \newline\url{http://www.t10.org/cgi-bin/ac.pl?t=f&f=mmc6r00.pdf}\\

\end{longtable}

\section{Non-Normative References}

\begin{longtable}{l p{5in}}
	\phantomsection\label{intro:Virtio PCI Draft}\textbf{[Virtio PCI Draft]} &
	Virtio PCI Draft Specification
	\newline\url{http://ozlabs.org/~rusty/virtio-spec/virtio-0.9.5.pdf}\\
\end{longtable}

\section{Terminology}\label{Terminology}

The key words ``MUST'', ``MUST NOT'', ``REQUIRED'', ``SHALL'', ``SHALL NOT'', ``SHOULD'', ``SHOULD NOT'', ``RECOMMENDED'', ``MAY'', and ``OPTIONAL'' in this document are to be interpreted as described in \hyperref[intro:rfc2119]{[RFC2119]}.

\subsection{Legacy Interface: Terminology}\label{intro:Legacy
Interface: Terminology}

Earlier drafts of this specification (i.e. revisions before 1.0,
see e.g. \hyperref[intro:Virtio PCI Draft]{[Virtio PCI Draft]})
defined a similar but different
interface between the driver and the device.
Since implementations of these drafts are widely deployed, this specification
accommodates OPTIONAL features to simplify the transition
from these earlier draft interfaces.

Specifically, devices and drivers MAY support:
\begin{description}
\item[Legacy Interface]
        is an interface specified by an earlier draft of this specification
        (before 1.0)
\item[Legacy Device]
        is a device implemented before this specification was released,
        and implementing a legacy interface on the host side
\item[Legacy Driver]
        is a driver implemented before this specification was released,
        and implementing a legacy interface on the guest side
\end{description}

Legacy devices and legacy drivers are not compliant with this
specification.

To simplify transition from these earlier draft interfaces,
a device MAY implement:

\begin{description}
\item[Transitional Device]
        a device supporting both drivers conforming to this
        specification, and legacy drivers.
\end{description}

Similarly, a driver MAY implement:
\begin{description}
\item[Transitional Driver]
        a driver supporting both devices conforming to this
        specification, and legacy devices.
\end{description}

\begin{note}
  Legacy interfaces are not required; i.e. don't implement them unless you
  have a need for backwards compatibility!
\end{note}

Devices or drivers with no legacy compatibility are referred to as
non-transitional devices and drivers, respectively.

\subsection{Transition from earlier specification drafts}\label{sec:Transition from earlier specification drafts}

For devices and drivers already implementing the legacy
interface, some changes will have to be made to support this
specification.

In this case, it might be beneficial for the reader to focus on
sections tagged ``Legacy Interface'' in the section title.
These highlight the changes made since the earlier drafts.

\section{Structure Specifications}

Many device and driver in-memory structure layouts are documented using
the C struct syntax. All structures are assumed to be without additional
padding. To stress this, cases where common C compilers are known to insert
extra padding within structures are tagged using the GNU C
\texttt{\_\_attribute\_\_((packed))} syntax.
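
The following sketch shows the effect of that tag. The structure
\texttt{example\_hdr} and its fields are hypothetical and do not appear
anywhere in this specification; the code assumes a compiler that accepts the
GNU C attribute syntax and the C11 \texttt{\_Static\_assert} keyword.
\begin{lstlisting}
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout, for illustration only. */
struct example_hdr {
        uint8_t  type;    /* offset 0 */
        uint32_t length;  /* offset 1; without the attribute, common
                           * compilers would usually place it at 4 */
} __attribute__((packed));

/* Compile-time checks that no padding was inserted. */
_Static_assert(sizeof(struct example_hdr) == 5,
               "example_hdr must not contain padding");
_Static_assert(offsetof(struct example_hdr, length) == 1,
               "length must directly follow type");
\end{lstlisting}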

For the integer data types used in the structure definitions, the following
conventions are used:

\begin{description}
\item[u8, u16, u32, u64] An unsigned integer of the specified length in bits.

\item[le16, le32, le64] An unsigned integer of the specified length in bits,
in little-endian byte order.

\item[be16, be32, be64] An unsigned integer of the specified length in bits,
in big-endian byte order.
\end{description}
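
One possible way to read such fields portably is to assemble the value from
individual bytes, so that the result does not depend on the byte order of
the CPU running the code. The helper names \texttt{load\_le32} and
\texttt{load\_be32} below are hypothetical and are not defined by this
specification.
\begin{lstlisting}
#include <stdint.h>

/* Read a le32 field from raw structure memory (least significant
 * byte is stored first). */
static inline uint32_t load_le32(const uint8_t *p)
{
        return (uint32_t)p[0]       | (uint32_t)p[1] << 8 |
               (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

/* Read a be32 field from raw structure memory (most significant
 * byte is stored first). */
static inline uint32_t load_be32(const uint8_t *p)
{
        return (uint32_t)p[0] << 24 | (uint32_t)p[1] << 16 |
               (uint32_t)p[2] << 8  | (uint32_t)p[3];
}
\end{lstlisting}
The same four bytes therefore yield different values depending on whether
the field is documented as le32 or be32.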

Some of the fields to be defined in this specification don't
start or don't end on a byte boundary. Such fields are called bit-fields.
A set of bit-fields is always a sub-division of an integer typed field.

Bit-fields within integer fields are always listed in order,
from the least significant to the most significant bit.  The
bit-fields are considered unsigned integers of the specified
width, with the relative significance of the bits within each
bit-field preserved.

For example:
\begin{lstlisting}
struct S {
        be16 {
                A : 15;
                B : 1;
        } x;
        be16 y;
};
\end{lstlisting}
documents the value A stored in the low 15 bits of \field{x} and
the value B stored in the high bit of \field{x}, the 16-bit
integer \field{x} in turn stored using the big-endian byte order
at the beginning of the structure S,
and being followed immediately by an unsigned integer \field{y}
stored in big-endian byte order at an offset of 2 bytes (16 bits)
from the beginning of the structure.

Note that this notation somewhat resembles the C bitfield syntax but
should not be naively converted to a bitfield notation for portable
code: it matches the way bitfields are packed by C compilers on
little-endian architectures but not the way bitfields are packed by C
compilers on big-endian architectures.

Assuming that \texttt{CPU\_TO\_BE16} converts a 16-bit integer from the native
CPU byte order to the big-endian byte order, the following is the equivalent
portable C code to generate a value to be stored into \field{x}:
\begin{lstlisting}
CPU_TO_BE16(B << 15 | A)
\end{lstlisting}
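
The reverse direction can be sketched in the same spirit. The helpers
\texttt{store\_x} and \texttt{load\_x} below are hypothetical; they write and
read the two bytes of \field{x} explicitly instead of relying on a
byte-swapping macro, which sidesteps both host byte order and
compiler-specific bitfield packing.
\begin{lstlisting}
#include <stdint.h>

/* Hypothetical helpers, for illustration only. */

/* Pack A (15 bits) and B (1 bit) into the big-endian field x. */
static inline void store_x(uint8_t out[2], uint16_t a, uint16_t b)
{
        uint16_t v = (uint16_t)((b & 0x1u) << 15 | (a & 0x7fffu));

        out[0] = (uint8_t)(v >> 8);    /* most significant byte first */
        out[1] = (uint8_t)(v & 0xffu);
}

/* Recover A and B from the two bytes of x. */
static inline void load_x(const uint8_t in[2], uint16_t *a, uint16_t *b)
{
        uint16_t v = (uint16_t)((uint16_t)in[0] << 8 | in[1]);

        *a = v & 0x7fffu;  /* low 15 bits */
        *b = v >> 15;      /* high bit */
}
\end{lstlisting}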

\newpage