Quantum mechanics (QM), being a novel and revolutionary framework for describing phenomena, requires a substantially different mathematical tool-set and way of thinking about physical systems and objects. There is dispute over how exactly to interpret the mathematical system used, but we will not discuss the various interpretations here. Rather, we will describe and examine the framework and how it can be used to make predictions, all of which is agreed upon.
This is part 2 of a multi-part series giving a general introduction to quantum theory.
Hilbert, State, and Dual Spaces
Hilbert space is a generalized vector space: a sort of extended analog of the usual Euclidean space. Elements of a Hilbert space are a kind of vector, and are denoted using a label (basically just a name) and some indication of vector-hood. We will use "bra-ket notation", in which elements of the vector space are denoted as \(\left | \phi \right >\) (a ket). Here \(\phi\) is merely a label; we may sometimes use numbers or other symbols, but these are all merely labels. Every such element has a corresponding "sister" in what is called the dual space, which is denoted by \(\left < \phi \right |\) (a bra). (The name is basically a joke: "bra" and "ket" are the two halves of the word "bracket".) The use of the dual space will become apparent in our later discussion. In general, and in QM especially, the vector space is complex, meaning the vector's "components" (loosely speaking) are complex numbers.
Inner Products
To be a Hilbert space, there must also be an inner product, or a way of associating a complex number to each pair of vectors (the order may be important: the inner product of A and B need not be the same as that of B and A). The inner product of \(\left | \phi \right > \) and \(\left | \psi \right > \) is denoted by \(\left \langle \psi \right | \left. \phi \right \rangle\), that is, the dual of \(\left | \psi \right > \) acting on \(\left | \phi \right > \). In particular, to be a Hilbert space, we must have that if \(\left \langle \psi \right | \left. \phi \right \rangle = z \), then \(\left \langle \phi \right | \left. \psi \right \rangle = \bar{z} \), that is, the complex conjugate. If \(\left | \phi \right \rangle= r \left | \psi \right \rangle\) then \(\left \langle \phi \right |= \bar{r} \left \langle \psi \right |\).
Also, we must have \(\left \langle \psi \right | \left. \psi \right \rangle \geq 0\), with equality holding iff \(\left | \psi \right >\) is the zero vector. Note that \(\left \langle \psi \right | \left. \psi \right \rangle \) is automatically real, since by the conjugation rule it equals its own complex conjugate.
Beyond this, the inner product is linear in the ket and conjugate-linear in the bra. In general, if \(\left | \phi \right \rangle= a\left | \alpha \right \rangle+b\left | \beta \right \rangle \) and \( \left | \psi \right \rangle= c\left | \gamma \right \rangle+d\left | \delta \right \rangle \), then we have:
\[
\left \langle \psi \right | \left. \phi \right \rangle
=a\bar{c}\left \langle \gamma \right | \left. \alpha \right \rangle
+
a\bar{d}\left \langle \delta \right | \left. \alpha \right \rangle
+
b\bar{c}\left \langle \gamma \right | \left. \beta \right \rangle
+
b\bar{d}\left \langle \delta \right | \left. \beta \right \rangle
\]
We can also prove the famous Cauchy-Schwarz inequality, namely, that:
\[ \left |\left \langle \psi \right | \left. \phi \right \rangle \right |^2 \leq
\left \langle \psi \right | \left. \psi \right \rangle \left \langle \phi \right | \left. \phi \right \rangle \]
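As a rough numerical illustration of the properties above, the following Python sketch treats kets as lists of complex components in a small finite-dimensional space. The function name `inner` and the sample vectors are illustrative choices, not anything fixed by the theory.

```python
def inner(psi, phi):
    """Inner product <psi|phi>: conjugate the components of the bra."""
    return sum(p.conjugate() * q for p, q in zip(psi, phi))

# Two arbitrary sample kets, written as lists of complex components.
psi = [1 + 2j, 0.5j, -1.0 + 0j]
phi = [2 - 1j, 3 + 0j, 1j]

z = inner(psi, phi)

# Conjugate symmetry: <phi|psi> is the complex conjugate of <psi|phi>.
conjugate_symmetric = abs(inner(phi, psi) - z.conjugate()) < 1e-12

# <psi|psi> has zero imaginary part and is non-negative.
norm_sq = inner(psi, psi)

# Cauchy-Schwarz: |<psi|phi>|^2 <= <psi|psi><phi|phi>.
cauchy_schwarz_holds = abs(z) ** 2 <= (norm_sq * inner(phi, phi)).real
```

Conjugating the bra components is what makes the inner product conjugate-linear in its first slot and guarantees that the inner product of a vector with itself is real.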
Two vectors \(\left | \phi \right > \) and \(\left | \psi \right > \) are said to be orthogonal if \(\left \langle \psi \right | \left. \phi \right \rangle=0\). A vector is said to be normal or normalized if \(\left \langle \phi \right | \left. \phi \right \rangle =1\).
If we have a set of vectors \({| \left. \phi_1 \right \rangle} , {| \left. \phi_2 \right \rangle} , {| \left. \phi_3 \right \rangle},...\) such that \( \left \langle \phi_j \right. | \left. \phi_k \right \rangle = 0 \) for all \(j \neq k\), then the set is called an orthogonal set. If it is also the case that \( \left \langle \phi_k \right. | \left. \phi_k \right \rangle = 1 \) for all k, then the set is called orthonormal.
Operators
An operator is something which acts on a vector to produce another vector: \(A \left | \phi \right \rangle= \left | \phi' \right \rangle\). The operator \(A\) is linear if, for any \(\left | \phi \right \rangle= a\left | \alpha \right \rangle+b\left | \beta \right \rangle\), we have \(A\left | \phi \right \rangle= a A\left | \alpha \right \rangle+b A\left | \beta \right \rangle\).
Let \(A \left | \phi \right \rangle= \left | \phi' \right \rangle\) and \(B \left | \psi \right \rangle= \left | \psi' \right \rangle\). If \(\left \langle \psi \right | \left. \phi' \right \rangle=\left \langle \psi' \right | \left. \phi \right \rangle\) for all pairs of vectors \(\left | \phi \right \rangle\) and \(\left | \psi \right \rangle\), then A and B are called conjugate operators, denoted \(A=B^{\dagger}\) and \(B=A^{\dagger}\), so \(A=\left (A^{\dagger} \right )^\dagger\). We also have \(\left \langle \phi' \right |= \left \langle \phi \right | A^\dagger\). If \(A=A^\dagger\), then A is called Hermitian. If \(A=-A^\dagger\), then A is called anti-Hermitian.
If instead both vectors are acted on by the same operator, \(\left | \phi' \right \rangle=A\left | \phi \right \rangle\) and \(\left | \psi' \right \rangle=A\left | \psi \right \rangle\), and \(\left \langle \psi' \left | \right. \phi'\right \rangle = \left \langle \psi \left | \right. \phi\right \rangle \) for all pairs of vectors, then A is called unitary (in an infinite-dimensional space, A must also be invertible).
We also have the following properties:
\[
(A+B)\left | \phi \right \rangle= A\left | \phi \right \rangle+B\left | \phi \right \rangle
\]
\[
AB\left | \phi \right \rangle= A\left (B\left | \phi \right \rangle \right )
\]
Note that it is not necessarily the case that
\[
AB\left | \phi \right \rangle= BA\left | \phi \right \rangle
\]
That is, operators need not commute. In fact, we commonly use the notation \([A,B]=AB-BA\) (this is called the commutator of A and B). Non-commutativity will play an important role in the theory.
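To make the conjugate and the commutator concrete, here is a minimal Python sketch for small square matrices stored as nested lists. The helper names `matmul`, `dagger` and `commutator`, and the two sample matrices, are assumptions made for illustration only.

```python
def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def dagger(a):
    """Conjugate transpose: swap rows and columns, conjugate each entry."""
    n = len(a)
    return [[complex(a[j][i]).conjugate() for j in range(n)] for i in range(n)]

def commutator(a, b):
    """[A, B] = AB - BA."""
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return [[ab[i][j] - ba[i][j] for j in range(n)] for i in range(n)]

# Two sample Hermitian matrices (hypothetical choices).
A = [[0, 1], [1, 0]]
B = [[1, 0], [0, -1]]

is_hermitian_A = dagger(A) == A
is_hermitian_B = dagger(B) == B

# AB and BA differ, so the commutator is non-zero.
comm = commutator(A, B)
```

For these two matrices the commutator works out to a non-zero matrix, so the order in which the operators are applied matters.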
For a given A, in some cases, for certain non-zero \(\left | \phi \right \rangle\), we have that \(A\left | \phi \right \rangle= \lambda \left | \phi \right \rangle \) for some constant \(\lambda\). In this case, we call \(\lambda\) an eigenvalue of the operator A and \(\left | \phi \right \rangle\) the corresponding eigenvector.
Often it is the case that we can find a set of orthonormal vectors that are the eigenvectors of a given linear operator, such that we can also write any vector as a linear sum of the eigenvectors. In that case, \[| \left. \psi \right \rangle = a_1 | \left. \phi_1 \right \rangle +a_2 | \left. \phi_2 \right \rangle+a_3 | \left. \phi_3 \right \rangle+...\] where \(a_k=\left \langle \phi_k \right. | \left. \psi \right \rangle\) (\(a_k\) is called the projection of \(\psi\) onto the \(\phi_k\) direction). Below, \(\lambda_k\) denotes the eigenvalue of A belonging to \(| \left. \phi_k \right \rangle\).
Then \[\left \langle \psi\left. \right | \psi \right \rangle=|a_1|^2+|a_2|^2+|a_3|^2+...\]
\[A\left| \psi \right \rangle = \lambda_1 a_1 | \left. \phi_1 \right \rangle + \lambda_2 a_2 | \left. \phi_2 \right \rangle+\lambda_3 a_3 | \left. \phi_3 \right \rangle+...\]
\[\left \langle \psi \right | A\left| \psi \right \rangle = \lambda_1 \left |a_1 \right |^2 + \lambda_2 \left |a_2 \right |^2+\lambda_3 \left |a_3 \right |^2 +...\]
If the operator is also Hermitian, then we call it an observable. In particular, all the eigenvalues of a Hermitian operator are real.
If \(| \left. \psi \right \rangle \) is normalized, then we can use the notation \(\left \langle A \right \rangle_\psi=\left \langle \psi\left | A \right |\psi \right \rangle\) and \(\sigma^2_A=\left \langle A^2 \right \rangle_\psi-\left \langle A \right \rangle^2_\psi\).
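The sums above are easy to evaluate once the eigenvalues and expansion coefficients are known. The following sketch assumes a hypothetical observable with just two eigenvalues and a normalized state; all numbers are made up for illustration.

```python
# Hypothetical eigenvalues lambda_1, lambda_2 of an observable A.
eigenvalues = [1.0, -1.0]

# Expansion coefficients a_1, a_2 of a normalized state:
# |a_1|^2 + |a_2|^2 = 0.36 + 0.64 = 1.
coeffs = [complex(0.6, 0.0), complex(0.0, 0.8)]

# <psi|psi> = sum of |a_k|^2
norm_sq = sum(abs(a) ** 2 for a in coeffs)

# <A> = sum of lambda_k |a_k|^2
expect_A = sum(lam * abs(a) ** 2 for lam, a in zip(eigenvalues, coeffs))

# <A^2> = sum of lambda_k^2 |a_k|^2, since A^2 has eigenvalues lambda_k^2
expect_A2 = sum(lam ** 2 * abs(a) ** 2 for lam, a in zip(eigenvalues, coeffs))

# sigma_A^2 = <A^2> - <A>^2
variance = expect_A2 - expect_A ** 2
```

Only the magnitudes of the coefficients enter these sums, so the phase of each \(a_k\) does not affect either quantity.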
Postulates of Quantum Mechanics
Given that mathematical background, we can now lay out the fundamental postulates of QM. Exactly how to interpret these postulates will be left for later discussion.
-
Wavefunction Postulate
The state of a physical system at a given time is defined by a wavefunction which is a ket vector in the Hilbert space of possible states. Generally, the vector is required to be normalized.
-
Observable Postulate
Every physically measurable quantity corresponds to an observable operator that acts on the vectors in the Hilbert space of possible states.
-
Eigenvalue Postulate
The possible results of a measurement of a physically measurable quantity are the eigenvalues of the corresponding observable.
-
Probability Postulate
Suppose the orthonormal eigenvectors \({| \left. \phi_{k_1} \right \rangle} , {| \left. \phi_{k_2} \right \rangle} , {| \left. \phi_{k_3} \right \rangle},...\) of observable A all have the same eigenvalue \(\lambda\). Suppose the initial wavefunction can be written as
\(| \left. \psi \right \rangle = a_1 | \left. \phi_1 \right \rangle +a_2 | \left. \phi_2 \right \rangle+a_3 | \left. \phi_3 \right \rangle+...\) (i.e. as a linear sum of orthonormal eigenvectors of A). Note that \(\psi\) is a superposition of eigenstates. That is, it is a sort of combination of states that have definite properties. Each eigenstate has a well-defined value for the observable, but \(\psi\), in general, does not.
The probability of measuring the observable to have the value \(\lambda\) is given by
\(P(\lambda)=\left | a_{k_1} \right |^2+\left | a_{k_2} \right |^2+\left | a_{k_3} \right |^2+...\). More simply, if no two eigenvectors have the same eigenvalue, then the probability that we will measure the observable to have value \(\lambda_k\) is \(| \left \langle \phi_k\left | \right. \psi\right \rangle |^2\). This is called the Born Rule.
Given this, it is easy to see that \(\left \langle A\right \rangle_\psi=\left \langle \psi \left | A \right | \psi\right \rangle\) is the expected value of the operator A.
-
Collapse Postulate
Immediately after measurement, the wavefunction becomes the normalized projection of the prior wavefunction onto the subspace spanned by the eigenvectors that have the measured eigenvalue. That is, using the above description, the wavefunction immediately after measurement becomes \(\alpha \cdot( a_{k_1}| \left. \phi_{k_1}\right \rangle +a_{k_2}| \left. \phi_{k_2}\right \rangle+a_{k_3}| \left. \phi_{k_3}\right \rangle +...)\) where \(\alpha\) is a normalization constant, chosen to make the resulting vector normalized. More simply, if no two eigenvectors have the same eigenvalue, then the wavefunction immediately after we measure the observable to have value \(\lambda_k\) is \(| \left. \psi \right \rangle=| \left. \phi_k \right \rangle\).
-
Evolution Postulate
The time-evolution of the wavefunction, in the absence of measurement, is given by the time-dependent Schrödinger Equation:
\[
\hat{E} \left.|\psi \right \rangle=\hat{H}\left.|\psi \right \rangle
\]
where \(\hat{E}\) is the energy operator, which is given by \(i \hbar \frac{\partial }{\partial t}\), and \(\hat{H}\) is the Hamiltonian operator, which is defined analogously to the Hamiltonian of classical mechanics. In particular, it is the sum of the kinetic and potential energy operators.
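The Probability and Collapse postulates can be sketched numerically once a state is expressed in the eigenbasis of the observable. The example below assumes a hypothetical observable with three eigenvectors, two of which share an eigenvalue; the function names `probability` and `collapse` are illustrative only.

```python
import math

# Hypothetical observable: eigenvectors 1 and 2 share the eigenvalue 2.0.
eigenvalues = [2.0, 2.0, 5.0]

# Coefficients a_1, a_2, a_3 of a normalized state in that eigenbasis.
coeffs = [0.5 + 0j, 0.5j, math.sqrt(0.5) + 0j]

def probability(lam):
    """Born rule: sum |a_k|^2 over all eigenvectors with eigenvalue lam."""
    return sum(abs(a) ** 2 for ev, a in zip(eigenvalues, coeffs) if ev == lam)

def collapse(lam):
    """Keep only the components with eigenvalue lam, then renormalize."""
    kept = [a if ev == lam else 0j for ev, a in zip(eigenvalues, coeffs)]
    norm = math.sqrt(sum(abs(a) ** 2 for a in kept))
    return [a / norm for a in kept]

p2 = probability(2.0)
p5 = probability(5.0)

# State immediately after a measurement that yields the value 2.0.
post = collapse(2.0)
```

Here the division by `norm` plays the role of the normalization constant \(\alpha\), and the relative phase between the two surviving components is preserved by the collapse.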
Spatial Dimensions
A common Hilbert space to use is that of functions of one spatial dimension and time. This is an example of an infinite dimensional Hilbert space (at any x-coordinate, the wavefunction could take on a completely independent value). We often speak of eigenfunctions instead of eigenvectors in such a space. In this Hilbert space, we define the inner product of two wavefunctions to be \[\left \langle \phi\left | \right. \psi\right \rangle =\int_{-\infty}^{\infty}\bar{\phi}(x,t)\psi(x,t)dx\]
The momentum operator in the x-direction is given by \(P_x=\frac{\hbar}{i}\frac{\partial }{\partial x}\). The position operator is quite simply \(X=x\). The (un-normalized) eigenfunctions for each are easily found to be, respectively
\[
\left. | \psi\right \rangle_p=e^{ipx/\hbar}
\]
\[
\left. | \psi\right \rangle_{x_0}=\delta(x-x_0)
\]
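As a quick check of the momentum eigenfunction, applying \(P_x\) to \(e^{ipx/\hbar}\) gives
\[
\frac{\hbar}{i}\frac{\partial }{\partial x}e^{ipx/\hbar}=\frac{\hbar}{i}\cdot\frac{ip}{\hbar}e^{ipx/\hbar}=p\,e^{ipx/\hbar}
\]
so the function is reproduced up to the constant factor \(p\), which is the momentum eigenvalue.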
The classical kinetic energy is given by \(E_k=\frac{1}{2}mv^2=\frac{p^2}{2m}\); replacing \(p\) by the momentum operator \(P_x\) gives the kinetic energy operator \(\frac{-\hbar ^2}{2m} \frac{\partial^2 }{\partial x^2}\). The potential energy is given simply by \(E_p=V(x,t)\), that is, merely a specification of the potential energy as a function of position and possibly time. Thus, the time-dependent Schrödinger Equation can be written as
\[
i \hbar \frac{\partial }{\partial t} \left.|\psi \right \rangle=\left ( \frac{-\hbar ^2}{2m} \frac{\partial^2 }{\partial x^2}+V(x,t) \right)\left.|\psi \right \rangle
\]
If the potential does not depend on time, \(V=V(x)\), and the wavefunction is an eigenfunction of energy, with eigenvalue E, then its energy does not change with time and we can write the time-independent Schrödinger Equation:
\[
E \left.|\psi \right \rangle=\left ( \frac{-\hbar ^2}{2m} \frac{\partial^2 }{\partial x^2}+V(x) \right)\left.|\psi \right \rangle
\]
That is, \(\psi\) is an eigenfunction of the Hamiltonian. We can often then solve this to find not only the wavefunction solutions, but also the allowed energies: often such an equation will only be solvable for a discrete set of possible energies. The conditions of normalizability and normalization, as well as boundary conditions, contribute toward determining energies and solutions.
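As one standard illustration of how boundary conditions select discrete energies (the specific potential here is an assumed example, not one discussed above), take \(V=0\) for \(0<x<L\) and require \(\psi(0)=\psi(L)=0\). Inside the region the equation reduces to
\[
\frac{-\hbar ^2}{2m} \frac{\partial^2 \psi}{\partial x^2}=E\psi
\]
whose normalized solutions satisfying the boundary conditions are
\[
\psi_n(x)=\sqrt{\frac{2}{L}}\sin\left(\frac{n\pi x}{L}\right),\qquad E_n=\frac{n^2\pi^2\hbar^2}{2mL^2},\qquad n=1,2,3,...
\]
Any other value of E gives a solution that fails to vanish at both ends, so only these energies are allowed.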
The extension to multiple dimensions follows analogously.
Spin
The Hilbert space used to describe the spin state of an electron (or other spin 1/2 particle) is typically that of two-by-one complex matrices (column vectors). That is, a ket will be of the form
\[
\left. |\psi \right \rangle=
\begin{pmatrix}
a\\
b
\end{pmatrix}
\]
And the corresponding bra will be
\[
\left \langle \psi | \right.=
\begin{pmatrix}
\bar{a} & \bar{b}
\end{pmatrix}
\]
The condition for normalization is that \(|a|^2+|b|^2=1\). A similar description can be used for the polarization of photons. The operators for spin in the x, y and z directions are, respectively:
\[
S_x=\frac{\hbar}{2}\begin{pmatrix}
0 & 1\\
1 & 0
\end{pmatrix}
\]
\[
S_y=\frac{\hbar}{2}\begin{pmatrix}
0 & -i\\
i & 0
\end{pmatrix}
\]
\[
S_z=\frac{\hbar}{2}\begin{pmatrix}
1 & 0\\
0 & -1
\end{pmatrix}
\]
All of these have eigenvalues \(+\frac{\hbar}{2}\) and \(-\frac{\hbar}{2}\), with corresponding eigenvectors:
\[
\left. |+x \right \rangle=\left. |+ \right \rangle=\frac{1}{\sqrt{2}}\begin{pmatrix}
1\\
1
\end{pmatrix},\; \;
\left. |-x \right \rangle=\left. |- \right \rangle=\frac{1}{\sqrt{2}}\begin{pmatrix}
1\\
-1
\end{pmatrix}
\]
\[
\left. |+y \right \rangle=\left. |\rightarrow \right \rangle=\frac{1}{\sqrt{2}}\begin{pmatrix}
-i\\
1
\end{pmatrix},\; \;
\left. |-y \right \rangle=\left. |\leftarrow \right \rangle=\frac{1}{\sqrt{2}}\begin{pmatrix}
1\\
-i
\end{pmatrix}
\]
\[
\left. |+z \right \rangle=\left. |\uparrow \right \rangle=\begin{pmatrix}
1\\
0
\end{pmatrix},\; \;
\left. |-z \right \rangle=\left. |\downarrow \right \rangle=\begin{pmatrix}
0\\
1
\end{pmatrix}
\]
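The spin matrices and eigenvectors can be checked with a few lines of Python. This is only a sketch: it works in units where \(\hbar=1\) (an assumption made for convenience), restricts itself to the x and z directions, and uses the illustrative helper names `apply` and `inner`.

```python
import math

HBAR = 1.0  # assumed units in which hbar = 1

def apply(m, v):
    """Act with a matrix (nested lists) on a column vector (list)."""
    return [sum(m[i][k] * v[k] for k in range(len(v))) for i in range(len(m))]

def inner(psi, phi):
    """Inner product <psi|phi>, conjugating the bra components."""
    return sum(complex(p).conjugate() * q for p, q in zip(psi, phi))

S_x = [[0, HBAR / 2], [HBAR / 2, 0]]
S_z = [[HBAR / 2, 0], [0, -HBAR / 2]]

s = 1 / math.sqrt(2)
plus_x, minus_x = [s, s], [s, -s]
up_z, down_z = [1, 0], [0, 1]

# Eigenvalue check: S_x|+x> should equal (+hbar/2)|+x>.
sx_plus = apply(S_x, plus_x)

# Born rule: probabilities of measuring spin +x or -x in the state |up_z>.
p_plus_x = abs(inner(plus_x, up_z)) ** 2
p_minus_x = abs(inner(minus_x, up_z)) ** 2

# Expected value <S_x> in the state |up_z>.
expect_sx = inner(up_z, apply(S_x, up_z)).real
```

A particle with definite spin up along z therefore has equal chances of being found with spin +x or -x, and the expected value of \(S_x\) in that state is zero.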
Multiple Particles
In the case of more than one particle, we can construct a total wavefunction by composing those of each particle. For instance, if we have two particles, the first with spin up and the second with spin down, we can write that in a variety of ways. For instance:
\[
\left. |\uparrow \right \rangle_1 \otimes \left. |\downarrow \right \rangle_2=\left. |\uparrow \right \rangle_1\left. |\downarrow \right \rangle_2=\left. |\uparrow \downarrow \right \rangle
\]
Clearly this case can be described in a way that treats each particle separately: the first particle is in one state and the second particle is in another state. However, sometimes it can be the case that the total wavefunction cannot be described in such a way. For instance:
\[
\left. |\psi \right \rangle=\frac{1}{\sqrt{2}}\left ( \left. |\uparrow \downarrow \right \rangle +\left. | \downarrow \uparrow \right \rangle \right )
\]
In this case, if we measure the first particle to have spin up, the wavefunction collapses to the state \(\left. |\uparrow \downarrow \right \rangle\), so the second particle is then certain to have spin down. This is an example of entanglement, which is where two objects' states cannot be independently described.
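For two spin-1/2 particles there is a simple test for whether a state can be described particle by particle. Writing the state as four components \((c_{\uparrow\uparrow}, c_{\uparrow\downarrow}, c_{\downarrow\uparrow}, c_{\downarrow\downarrow})\), it factorizes into a product exactly when \(c_{\uparrow\uparrow}c_{\downarrow\downarrow}-c_{\uparrow\downarrow}c_{\downarrow\uparrow}=0\). The sketch below applies that test; the function names `tensor` and `is_product_state` are hypothetical.

```python
import math

def tensor(u, v):
    """Components of |u> (x) |v>, ordered as up-up, up-down, down-up, down-down."""
    return [a * b for a in u for b in v]

def is_product_state(state, tol=1e-12):
    """True when the two-spin state factorizes into one state per particle."""
    c_uu, c_ud, c_du, c_dd = state
    return abs(c_uu * c_dd - c_ud * c_du) < tol

up, down = [1, 0], [0, 1]

up_down = tensor(up, down)   # first particle up, second particle down
down_up = tensor(down, up)   # first particle down, second particle up

# The superposition (|up,down> + |down,up>) / sqrt(2).
s = 1 / math.sqrt(2)
entangled = [s * (a + b) for a, b in zip(up_down, down_up)]

separable_check = is_product_state(up_down)
entangled_check = is_product_state(entangled)
```

The first state passes the test and the superposition fails it, matching the distinction drawn above between a state that treats each particle separately and one that does not.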