Quantum Mechanical Fundamentals: Part I—Representing states

We shall see how to implement and represent a single qubit succinctly.

published 5/24/2025

brook · quantum mechanics · qmf

Standard introductions to QM go into deep detail about particular QM systems, since after all, modeling particular systems is the raison d’être of physics. On the other hand, most QC introductions develop only the idealized fragment of QM relevant to QC, without delving deeper into the QM formalism that underpins it. Neither typically discusses the mathematical results that justify the QM formalism. This is fine at first, but once the QC practitioner needs to optimize circuits for hardware, work on QM applications of QC, or work on QC technologies themselves, a short introduction focusing on the QM formalism itself should prove useful. This post attempts to be that bridge. It assumes

a good grasp of undergrad analysis and linear algebra that would be typical for math graduates
and familiarity with quantum computing basics (you know about finite-dimensional Hilbert spaces, state preparation, measurement in the computational basis, and Pauli and rotation gates).

Recap: 1-qubit case

We first recall QC concepts for a single qubit. In this setting, the state is a unit vector in $ℂ^{2}$, furnished with the Euclidean norm. We identify $\begin{pmatrix} 1 \\ 0 \end{pmatrix} = | 0 ⟩$ and $\begin{pmatrix} 0 \\ 1 \end{pmatrix} = | 1 ⟩$. Then $\{| 0 ⟩, | 1 ⟩\}$ forms an orthonormal basis, called the computational basis, for $ℂ^{2}$.

A quantum circuit typically starts in the state $| 0 ⟩$, then proceeds to apply unitary operators $U \in ℂ^{2 \times 2}$ to manipulate the state. Let us look at the form of the state after applying any number of unitary operators. Recall that the composition of any number of unitary operators is again a unitary operator, so it suffices to consider $U | 0 ⟩$ for a single unitary $U$. Since $\{| 0 ⟩, | 1 ⟩\}$ is a basis, there exist unique complex numbers $α, β \in ℂ$ such that $U | 0 ⟩ = α | 0 ⟩ + β | 1 ⟩$. Because $U$ is unitary, the resulting vector also has unit magnitude; that is, $\sqrt{| α |^{2} + | β |^{2}} = 1$, and hence $| α |^{2} + | β |^{2} = 1$.

Finally we measure, and for simplicity, we measure in the computational basis. By the Born rule, $P (0) = | α |^{2}$ and $P (1) = | β |^{2}$ .
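The recap above can be sketched in a few lines of plain Python. This is only an illustration: the helper names `apply` and `born_probabilities` are made up for this sketch, and the Hadamard gate is chosen merely as a familiar example of a unitary.

```python
import math

def apply(U, state):
    """Multiply a 2x2 matrix (nested tuples) by a 2-component state vector."""
    return tuple(sum(U[r][c] * state[c] for c in range(2)) for r in range(2))

def born_probabilities(state):
    """Return (P(0), P(1)) for a measurement in the computational basis."""
    alpha, beta = state
    return (abs(alpha) ** 2, abs(beta) ** 2)

# Hadamard gate: an example unitary, not something specific to this post.
h = 1 / math.sqrt(2)
H = ((h, h), (h, -h))

ket0 = (1 + 0j, 0 + 0j)          # the initial state |0>
state = apply(H, ket0)           # alpha|0> + beta|1> with alpha = beta = 1/sqrt(2)
p0, p1 = born_probabilities(state)
print(p0, p1)                    # the two probabilities; they sum to 1 up to rounding
```

Because the gate is unitary, the two probabilities always sum to one, whatever unitary is substituted for `H`.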

Motivating implementation: photon polarization

Let us see how the class of 1-qubit circuits may be realized physically, and in doing so, validate the language of QC as a mathematically appropriate way to model these phenomena. Today, the polarization of light is a widely observed and manipulated quantum phenomenon. Polarizers on sunglasses filter out the polarized light of glare while allowing significant unpolarized light through. Liquid crystal displays manipulate the polarization of light to selectively allow or block the transmission of light from a backlight. The 1-qubit circuit class can be implemented in terms of light polarization in the following manner:

$| 0 ⟩$ and $| 1 ⟩$ shall correspond to horizontal and vertical polarization respectively.
Measurement corresponds to distinguishing between horizontal ( $| 0 ⟩$ ) and vertical ( $| 1 ⟩$ ) polarization states. To do so, we use a beam splitter that reflects light based on polarization. In more detail, a photon polarized horizontally is transmitted almost surely, a photon polarized vertically is reflected almost surely, and transmission is probabilistic when the state vector is in a superposition of these two states. Light detectors are set up to detect whether photons are reflected or transmitted.
An $R_{x} (θ)$ gate can be implemented with a half-wave plate rotated $θ / 4$ anticlockwise from the horizontal axis, sandwiched between two quarter-wave plates with the fast axis in the vertical axis.
An $R_{y} (θ)$ gate can be implemented with a half-wave plate with the fast axis in the horizontal axis, followed by a half-wave plate rotated $θ / 4$ anticlockwise from the horizontal axis.
An $R_{z} (θ)$ gate can be implemented as $R_{z} (θ) = R_{x} (π / 2) R_{y} (θ) R_{x} (- π / 2)$ (taking the usual convention $R_{a} (θ) = e^{- i θ σ_{a} / 2}$; with the opposite sign convention the two $R_{x}$ factors swap). It can also be implemented in a conceptually simpler manner with a single wave plate of a precise thickness linear in $θ$, oriented with the fast axis along the horizontal axis, even though wave plates of arbitrary thickness are not typically manufactured. We mention it now as we will reference this fact later.
We typically begin with horizontally polarized light; that is, the initial state is $| 0 ⟩$ .

Because they implement $R_{x} (θ)$ and $R_{y} (θ)$ for arbitrary $θ$, a finite number of half- and quarter-wave plates can implement any unitary operator on a single qubit (up to a global phase), including the $R_{z}$ operators. Note that this is stronger than the notion of a universal gate set, which is usually taken to mean that any unitary operator can be approximated arbitrarily well by some finite sequence of gates from the set. We have shown how a 1-qubit circuit class realized with light polarization could be represented by the QC framework, but let us look a little closer at why these were the right choices.
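As a sanity check on the claim that $R_{z}$ can be built from $R_{x}$ and $R_{y}$, here is a small numerical sketch. It assumes the common convention $R_{a} (θ) = e^{- i θ σ_{a} / 2}$, which this post does not fix explicitly; under the opposite sign convention the roles of the two orderings below are exchanged. The helper names are illustrative only.

```python
import cmath
import math

def matmul(A, B):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(A[r][k] * B[k][c] for k in range(2)) for c in range(2))
        for r in range(2)
    )

def rx(theta):
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return ((c, -1j * s), (-1j * s, c))

def ry(theta):
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return ((c, -s), (s, c))

def rz(theta):
    return ((cmath.exp(-1j * theta / 2), 0), (0, cmath.exp(1j * theta / 2)))

def close(A, B, tol=1e-12):
    """Entrywise comparison of two 2x2 matrices."""
    return all(abs(A[r][c] - B[r][c]) < tol for r in range(2) for c in range(2))

theta = 0.7  # arbitrary test angle
# Conjugating R_y(theta) by quarter-turn R_x gates gives a rotation about z.
one_order = matmul(rx(math.pi / 2), matmul(ry(theta), rx(-math.pi / 2)))
other_order = matmul(rx(-math.pi / 2), matmul(ry(theta), rx(math.pi / 2)))

print(close(one_order, rz(theta)))     # z rotation by +theta in this convention
print(close(other_order, rz(-theta)))  # z rotation by -theta in this convention
```

Either way, a z rotation is obtained from x and y rotations alone, which is the point the paragraph above relies on.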

Why a Hilbert space? Why the Born rule?

At a minimum, we want to be able to represent our outcomes as probability mass functions (pmfs) or probability density functions (pdfs). Recall that a pmf is a function $p : D \to ℝ$, with $D$ being the set of mutually exclusive outcomes, such that $\sum_{x \in D} p (x) = 1$ and each $p (x)$ is nonnegative. When $D$ is countable, we can string the probability masses $p (x)$ together as a vector

$$(p (x_{1}), p (x_{2}), \dots) \in ℝ^{| D |}.$$

In the case of light polarization, the pmfs themselves viewed as vectors lie in $ℝ^{\{0,1\}} \cong ℝ^{2}$, and let us write $b_{0} = (1,0)$ and $b_{1} = (0,1)$ as basis vectors for this space. Consider this our first attempt at describing the state space underlying light polarization. We can summarize this as follows.

Fact. The function space $ℝ^{D}$ can be viewed as a vector space over $ℝ$, using pointwise addition of functions as vector addition, and pointwise multiplication by a scalar as scalar multiplication. Then, pmfs over $D$ are the subset of this vector space where each component is nonnegative and the components sum to $1$ (they form a simplex).

However, we run into issues when we want to talk about how $R_{y}$ affects vectors in a state space like this, that is, if each $R_{y} (θ)$ with $θ \in [0,2 π)$ were to act as a map $ℝ^{2} \to ℝ^{2}$ on pmfs. In particular, we have two concerns.

First, we want rotations to affect all states in somewhat the same way. With this state space candidate, we must give up on modeling a rotation as a translation, that is, on a claim like $b_{0} - R_{y} (\frac{π}{2}) b_{0} = b_{1} - R_{y} (\frac{π}{2}) b_{1}$, because $R_{y} (\frac{π}{2}) b_{0} = R_{y} (\frac{π}{2}) b_{1} = (1 / 2,1 / 2)$, but $b_{0} \neq b_{1}$. We shall content ourselves with a norm on this state space that has the weaker property $‖ b_{0} - R_{y} (\frac{π}{2}) b_{0} ‖ = ‖ b_{1} - R_{y} (\frac{π}{2}) b_{1} ‖$, which we should expect from a rotation.¹ But even a norm of this kind is problematic for this choice of state space. Leaving the choice of norm arbitrary, and looking at the concrete values of the probability masses, we require

$b_{0} = (1,0)$ ,
$R_{y} (π / 3) b_{0} = (3 / 4,1 / 4)$ ,
$R_{y} (π / 3) R_{y} (π / 3) b_{0} = (1 / 4,3 / 4)$ .

Our requirement says that $b_{0}$ should be the same distance away from $R_{y} (π / 3) b_{0}$ as $R_{y} (π / 3) b_{0}$ itself is from $R_{y} (π / 3) R_{y} (π / 3) b_{0} = R_{y} (2 π / 3) b_{0}$. Substituting the values above, this reads

$$\frac{1}{4} ‖ (1, - 1) ‖ = ‖ b_{0} - R_{y} (\frac{π}{3}) b_{0} ‖ = ‖ R_{y} (\frac{π}{3}) b_{0} - R_{y} (\frac{2 π}{3}) b_{0} ‖ = \frac{1}{2} ‖ (1, - 1) ‖,$$

which would only hold if $‖ b_{0} - b_{1} ‖ = ‖ (1, - 1) ‖ = 0$ , that is, if $b_{1} = b_{0}$ . But this is not the case. So we have to altogether scrap the idea of using the natural function space of pmfs as our state space if we wanted it to be compatible with such a norm, and find another candidate for the state space. Summarizing, we have the following.

Lemma. There is no way to define a norm on the natural function space on pmfs such that $‖ R_{y} (0) b_{0} - R_{y} (θ) b_{0} ‖ = ‖ R_{y} (θ) b_{0} - R_{y} (2 θ) b_{0} ‖$ for all $θ$ , where $R_{y} (θ) b_{0}$ is the pmf corresponding to measurement on horizontally polarized light beamed through a wave plate implementing $R_{y} (θ)$ .
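The counterexample behind the lemma can be reproduced numerically. The sketch below uses the pmf $(\cos^{2} (θ / 2), \sin^{2} (θ / 2))$ for $R_{y} (θ) b_{0}$, which matches the values $(3 / 4,1 / 4)$ and $(1 / 4,3 / 4)$ used above; the function name `pmf_after_ry` is made up for this illustration.

```python
import math

def pmf_after_ry(theta):
    """pmf of measuring R_y(theta) applied to horizontally polarized light."""
    half = theta / 2
    return (math.cos(half) ** 2, math.sin(half) ** 2)

theta = math.pi / 3
p_start = pmf_after_ry(0)           # (1, 0)
p_once = pmf_after_ry(theta)        # approximately (3/4, 1/4)
p_twice = pmf_after_ry(2 * theta)   # approximately (1/4, 3/4)

# Successive differences in the pmf space.
d1 = tuple(a - b for a, b in zip(p_start, p_once))   # approximately (1/4)(1, -1)
d2 = tuple(a - b for a, b in zip(p_once, p_twice))   # approximately (1/2)(1, -1)

# The differences are parallel, so for ANY norm ||d2|| = ratio * ||d1||.
ratio = d2[0] / d1[0]
print(d1, d2, ratio)
```

Since every norm is homogeneous, a ratio different from one means the two steps cannot have equal length unless $‖ (1, - 1) ‖ = 0$, which no norm allows.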

Instead, let us start from the other end, with the natural way to describe rotations in $ℝ^{2}$, and work our way back to pmfs. If we continue to model our outcomes as basis vectors $b_{0}^{'} = (1,0)$ and $b_{1}^{'} = (0,1)$, and $R_{y}$ as a rotation matrix on the plane, then rotations preserve magnitude with respect to the Euclidean norm. This means that if $(u, v) = R_{y} (θ) b_{0}^{'}$, then $\sqrt{u^{2} + v^{2}} = 1$, and hence $u^{2} + v^{2} = 1$. One easily checks that $(u^{2}, v^{2})$ satisfies the requirements of a pmf, and indeed it would match the empirical pmf of an experiment implementing it. We note also that since all state vectors have unit magnitude, it becomes natural to talk about the similarity between states using the dot product defined as $(u_{1}, u_{2}) \cdot (v_{1}, v_{2}) = u_{1} v_{1} + u_{2} v_{2}$, so that $b_{0}^{'} \cdot b_{1}^{'} = 0$. This notion of a product is compatible with the Euclidean norm in the sense that $‖ s ‖^{2} = s \cdot s$, and it is compatible with the basis $\{b_{0}^{'}, b_{1}^{'}\}$, making it orthonormal with respect to the dot product. We can then redescribe the map to obtain the pmf in more linear-algebraic terms: since $b_{0}^{'} = (1,0)$ and $b_{1}^{'} = (0,1)$, the pmf of a state $s$ is given by $(| s \cdot b_{0}^{'} |^{2}, | s \cdot b_{1}^{'} |^{2})$. This allows us to naturally talk about the pmf when measuring with respect to an arbitrary orthonormal basis, and indeed to describe with a change of basis what happens if we rotate the beam splitter’s axis of polarization. We thus obtain a more suitable state space with respect to $R_{y}$ for the fragment of our experimental setup that allows only $R_{y}$ gates.

Example. The dot product on $ℝ^{2}$, given any orthonormal basis $\{e_{0}, e_{1}\}$, induces a map $m_{\{e_{0}, e_{1}\}} : s \mapsto (e_{i} \mapsto | e_{i} \cdot s |^{2})$ that maps vectors of unit magnitude to pmfs, and also induces the Euclidean norm, which satisfies $‖ R_{y} (θ) s - s ‖ = ‖ R_{y} (θ) t - t ‖$ for rotation matrices $R_{y} (θ)$ in the plane and vectors $s, t$ of unit magnitude. In particular, we recover the pmf for a light polarization setup as above by mapping a state vector through $m_{\{b_{0}^{'}, b_{1}^{'}\}}$.
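The example can be checked with a short sketch in plain Python. It models $R_{y} (θ)$ on real amplitudes as a plane rotation by $θ / 2$, which is the choice consistent with the pmf $(3 / 4,1 / 4)$ at $θ = π / 3$; all helper names, and the 45-degree analyzer basis, are assumptions made for illustration.

```python
import math

def ry_real(theta):
    """R_y(theta) on real amplitudes: a plane rotation by theta / 2."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return ((c, -s), (s, c))

def apply(M, v):
    return (M[0][0] * v[0] + M[0][1] * v[1], M[1][0] * v[0] + M[1][1] * v[1])

def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]

def pmf(state, basis):
    """The map m: squared dot products with each basis vector."""
    return tuple(dot(e, state) ** 2 for e in basis)

def dist(u, v):
    return math.hypot(u[0] - v[0], u[1] - v[1])

b0, b1 = (1.0, 0.0), (0.0, 1.0)
R = ry_real(math.pi / 3)
s1 = apply(R, b0)   # state after one rotation
s2 = apply(R, s1)   # state after two rotations

pmf_comp = pmf(s1, (b0, b1))                   # approximately (3/4, 1/4)
step1, step2 = dist(b0, s1), dist(s1, s2)      # equal: rotation is an isometry

# Measuring in a rotated orthonormal basis (analyzer turned by 45 degrees).
r = 1 / math.sqrt(2)
pmf_diag = pmf(s1, ((r, r), (-r, r)))
print(pmf_comp, step1, step2, pmf_diag)
```

Unlike in the pmf space, the two successive steps now have the same length, and changing the measurement basis is just a change of the vectors passed to `pmf`.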

We proceed to our other concern, which involves the $R_{z}$ gates. An $R_{z}$ gate, corresponding to a wave plate of variable thickness whose fast axis is aligned with the horizontal axis, changes the relative phase between the horizontal and vertical polarization components of light. By itself, an $R_{z}$ gate does not change the distribution of outcomes if the state is then measured immediately in the computational basis. But you should already know that the $R_{z}$ gates are a form of rotation that affects how subsequent unitary transformations in turn affect the distribution. To that end, it is natural to encode this rotation as the polar angle of a complex number while preserving the magnitude, so that the state space becomes $ℂ^{2}$, a vector space over $ℂ$. This means augmenting the dot product to $⟨ u, v ⟩ = u_{1}^{*} v_{1} + u_{2}^{*} v_{2}$, so that the norm is extended to $‖ u ‖ = \sqrt{⟨ u, u ⟩}$. The map to pmfs is then the Born rule (for the finite-dimensional case) $m_{\{| 0 ⟩, | 1 ⟩\}} : | s ⟩ \mapsto ((i \in \{0,1\}) \mapsto ⟨ s | i ⟩ ⟨ i | s ⟩)$. In the computational basis, our vectors now have the form $(a e^{i α}, b e^{i β})$ with $a, b \geq 0$ such that $(a^{2}, b^{2})$ forms a pmf. This increase in expressive power also brings with it a global phase of the state that has no physical significance, which we simply have to live with.
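To make the role of the relative phase concrete, the following sketch compares two runs that differ only by an $R_{z} (π)$ gate in the middle. It assumes the convention $R_{z} (θ) = \mathrm{diag} (e^{- i θ / 2}, e^{i θ / 2})$ and the real matrix for $R_{y}$; the variable names are illustrative.

```python
import cmath
import math

def apply(M, v):
    return tuple(M[r][0] * v[0] + M[r][1] * v[1] for r in range(2))

def ry(theta):
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return ((c, -s), (s, c))

def rz(theta):
    return ((cmath.exp(-1j * theta / 2), 0), (0, cmath.exp(1j * theta / 2)))

def pmf(state):
    """Born rule in the computational basis."""
    return tuple(abs(a) ** 2 for a in state)

ket0 = (1 + 0j, 0j)
plus = apply(ry(math.pi / 2), ket0)      # equal superposition
shifted = apply(rz(math.pi), plus)       # same magnitudes, different relative phase

# Measured immediately, the R_z gate is invisible.
pmf_plus, pmf_shifted = pmf(plus), pmf(shifted)

# After a further R_y(pi/2), the two runs give opposite outcomes.
final_plain = pmf(apply(ry(math.pi / 2), plus))       # approximately (0, 1)
final_shifted = pmf(apply(ry(math.pi / 2), shifted))  # approximately (1, 0)

# A global phase, by contrast, never changes the pmf.
pmf_global = pmf(tuple(cmath.exp(0.3j) * a for a in shifted))
print(pmf_plus, pmf_shifted, final_plain, final_shifted, pmf_global)
```

The relative phase is therefore physically meaningful even though a single computational-basis measurement cannot see it, while the global phase is pure redundancy in the representation.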

Note that each complex Hilbert space can also be viewed as a real Hilbert space (of twice the dimension in the finite-dimensional case), but the definitions become very unwieldy when expressed that way. Phase is a central concept in QM and the representational clarity we gain from using complex numbers is immense.

Example. The inner product $⟨ u, v ⟩ = u_{1}^{*} v_{1} + u_{2}^{*} v_{2}$ on $ℂ^{2}$, given any orthonormal basis $\{α_{0}, α_{1}\}$, induces a map $m_{\{α_{0}, α_{1}\}} : s \mapsto (α_{i} \mapsto ⟨ s, α_{i} ⟩ ⟨ α_{i}, s ⟩)$ that maps vectors of unit magnitude to pmfs, and also induces a norm satisfying $‖ R_{a} (θ) s - s ‖ = ‖ R_{a} (θ) t - t ‖$, where $R_{a}$ can be any one of $R_{x}, R_{y}, R_{z}$, and the vectors $s, t$ have unit magnitude. In particular, we recover the pmf for a light polarization setup by mapping a state vector through $m_{\{| 0 ⟩, | 1 ⟩\}}$.
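The norm property in this example can be tested numerically for all three rotation gates. The sketch assumes the convention $R_{a} (θ) = e^{- i θ σ_{a} / 2}$; under it a short calculation gives the common value $2 | \sin (θ / 4) |$ for the displacement of any unit vector, and the code checks this for two arbitrarily chosen states. Helper names are hypothetical.

```python
import cmath
import math

def rx(theta):
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return ((c, -1j * s), (-1j * s, c))

def ry(theta):
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return ((c, -s), (s, c))

def rz(theta):
    return ((cmath.exp(-1j * theta / 2), 0), (0, cmath.exp(1j * theta / 2)))

def apply(M, v):
    return tuple(M[r][0] * v[0] + M[r][1] * v[1] for r in range(2))

def inner(u, v):
    """Inner product on C^2, conjugate-linear in the first argument."""
    return sum(complex(a).conjugate() * b for a, b in zip(u, v))

def norm(v):
    return math.sqrt(inner(v, v).real)

def normalize(v):
    n = norm(v)
    return tuple(a / n for a in v)

def born(state, basis):
    """The map m for an arbitrary orthonormal basis."""
    return tuple((inner(state, e) * inner(e, state)).real for e in basis)

def displacement(U, v):
    moved = apply(U, v)
    return norm(tuple(a - b for a, b in zip(moved, v)))

s = normalize((1 + 2j, 0.5 - 1j))   # two arbitrary unit vectors
t = normalize((0.3j, -2 + 1j))
theta = 1.1

displacements = {
    name: (displacement(gate(theta), s), displacement(gate(theta), t))
    for name, gate in (("x", rx), ("y", ry), ("z", rz))
}

r = 1 / math.sqrt(2)
pmf_diagonal = born(s, ((r, r), (r, -r)))   # Born rule in a non-computational basis
print(displacements, pmf_diagonal)
```

The displacement does not depend on the state or on the rotation axis, only on $θ$, which is the uniformity the pmf space could not provide.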

Finally, we remark that $ℂ^{n}$ with the given inner product is an example of a separable Hilbert space, and the Born rule is defined abstractly over separable Hilbert spaces. The axiomatic definition of Hilbert spaces, separability, the Born rule, and the modeling of the state vector as a unit vector, are the properties of $ℂ^{n}$ so far that generalize to infinite-dimensional QM systems. It is now convenient to recap some relevant definitions, and we will refer to these definitions in later posts.

Hilbert space. A Hilbert space is a vector space over $ℝ$ or over $ℂ$ furnished with an inner product—a map of two vectors to the underlying scalar field that is conjugate symmetric, linear in the second² argument, and positive definite—such that the vector space is a complete metric space with respect to the distance induced by the norm, in turn induced by the inner product. We will use mathematical notation for arbitrary vectors, and bra-ket notation for unit-magnitude vectors.

Separability. A topological space is separable when it contains a countable dense subset. Then a Hilbert space is separable (wrt. the metric just discussed) if and only if it admits a countable orthonormal basis.

Note that the inner product also induces a canonical isomorphism $H \equiv H^{'}$ between $H$ and its dual space $H^{'}$ ; we shall usually write $⟨ s, t ⟩$ for the inner product between $s, t \in H$ , but write $s^{†} t$ or $(s^{*})^{'} t$ when we need to emphasize that we are mapping $s$ to its conjugate dual $s^{†} = (s^{*})^{'} \in H^{'}$ , then applying it to $t \in H$ to obtain a scalar. When using bra-ket notation, we write $⟨ s | t ⟩$ for the inner product, as we have already done earlier.

Unitary operator. A unitary operator $U$ on a Hilbert space $H$ is a surjective, bounded linear operator that preserves the inner product in the sense that $⟨ U s, U t ⟩ = ⟨ s, t ⟩$ for each $s, t \in H$ .

Adjoint operator. Let $A : D_{A} \to H$ be a linear operator defined on a dense subspace $D_{A}$ of the inner product space $H$. The domain $D_{A^{†}}$ of the adjoint operator $A^{†} : D_{A^{†}} \to H$ of $A$ consists of all $y \in H$ for which some $z \in H$ satisfies $y^{†} A x = z^{†} x$ for every $x \in D_{A}$. Because $D_{A}$ is dense, this $z$ is unique. Then, $A^{†} y = z$ by definition, so that $y^{†} A x = z^{†} x = (A^{†} y)^{†} x$.

Self-adjoint operator. A self-adjoint operator $A : D_{A} \to H$ on a Hilbert space $H$ is a linear operator defined on a dense subspace $D_{A}$ of $H$ that satisfies $A = A^{†}$ (including agreement of domains). Then no self-adjoint operator is a proper restriction of another.
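In the finite-dimensional case $ℂ^{2}$ every operator is defined on the whole space, and these definitions reduce to familiar matrix statements: the adjoint is the conjugate transpose. The sketch below checks the defining identities numerically; the specific matrices and vectors are arbitrary choices made for illustration.

```python
import cmath
import math

def apply(M, v):
    return tuple(M[r][0] * v[0] + M[r][1] * v[1] for r in range(2))

def inner(u, v):
    """Inner product on C^2, linear in the second argument."""
    return sum(complex(a).conjugate() * b for a, b in zip(u, v))

def dagger(M):
    """Conjugate transpose of a 2x2 matrix."""
    return tuple(
        tuple(complex(M[c][r]).conjugate() for c in range(2)) for r in range(2)
    )

A = ((1 + 2j, 3j), (0.5, -1 - 1j))   # an arbitrary, non-self-adjoint operator
x = (0.2 - 1j, 1.5 + 0.5j)
y = (-1 + 1j, 0.7j)

# Defining property of the adjoint: <y, A x> = <A† y, x>.
lhs = inner(y, apply(A, x))
rhs = inner(apply(dagger(A), y), x)

# A unitary (a rotation times a phase) preserves the inner product.
c, s = math.cos(0.4), math.sin(0.4)
phase = cmath.exp(0.9j)
U = ((phase * c, -phase * s), (phase * s, phase * c))
preserved = inner(apply(U, x), apply(U, y))

# The Pauli Y matrix is self-adjoint: it equals its conjugate transpose.
Y = ((0j, -1j), (1j, 0j))
is_self_adjoint = dagger(Y) == Y
print(abs(lhs - rhs), abs(preserved - inner(x, y)), is_self_adjoint)
```

The subtleties about dense domains in the definitions above only start to matter in infinite dimensions, where operators need not be defined everywhere.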

We have thus motivated the formalism of a 1-qubit circuit class in QC by grounding it in the quantum mechanics of light polarization. The field of QC technologies that primarily implement qubits and gates using photons and optical instruments is called linear optical quantum computing (LOQC). Among contemporary QC technologies, our light polarization setup is most similar to this paradigm of quantum computing.

¹ This makes rotation an isometry in this space.
² This is a matter of convention adopted in physics that also simplifies notation here; the opposite is conventional in mathematics.
