Monday, April 21, 2014

Why Heisenberg uncertainty is not that weird

Whenever quantum mechanics (QM) is brought up in a popular context, whether scientific or pseudo-scientific, its 'weirdness' is almost always mentioned, and the Heisenberg uncertainty principle$^1$ is almost always the go-to example of that weirdness (although in pseudo-scientific contexts it is almost always misrepresented).

So what is Heisenberg uncertainty? Simply put, it is a restriction on the accuracy of simultaneous measurements of 'observables' (measurable quantities). The prototypical example is position and momentum; Heisenberg uncertainty states that the position and momentum of a particle$^2$ cannot be known simultaneously to arbitrary precision. The mathematical statement of the position-momentum uncertainty principle is
\begin{equation}
\Delta x\Delta p\geq\frac{\hbar}{2}
\end{equation}
where $\hbar$ is the reduced Planck constant and $\Delta x$, $\Delta p$ represent (in some sense)$^3$ the uncertainty in $x$ and $p$ respectively. Effectively, the better you know the position, the less well you know the momentum, and vice versa. This is not an experimental limitation but a fundamental theoretical one. On first viewing, this sort of restriction indeed seems very strange and certainly counter-intuitive. I will attempt to convince you that not only is it not strange, but that it should in fact be expected.
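To get a feel for the scales involved, here is a quick back-of-the-envelope calculation (the 1 nm confinement length is my own illustrative choice, not anything from the text): a particle whose position is known to within about a nanometre must have a momentum uncertainty of at least $\hbar/(2\Delta x)$.

```python
# Minimum momentum uncertainty from Delta_x * Delta_p >= hbar / 2.
# (The 1 nm confinement length is an illustrative choice, not from the post.)
hbar = 1.054571817e-34  # reduced Planck constant, J s
delta_x = 1e-9          # position uncertainty, m
delta_p_min = hbar / (2 * delta_x)
print(delta_p_min)      # about 5.3e-26 kg m/s
```

Small in everyday terms, but for an electron this corresponds to a very appreciable velocity uncertainty.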

What is important to remember about QM is that wave mechanics is a central theme. Particles are represented by wave-functions, which are complex solutions to the Schrödinger equation,$^4$ and this wave-nature contributes to a good deal of the quantum weirdness we are familiar with (an example is shown in a recent blog post of mine which relies on superposition and destructive interference of photon waves). With this in mind, let's take a look at some wave-functions.

For illustrative purposes, we will work in one spatial dimension and free space (zero potential everywhere) and only consider time-independent wave-functions. The simplest example of such a wave-function is the plane wave, which takes the form
\begin{equation}
\psi(x)=Ae^{ikx}\equiv Ae^{ipx/\hbar}.
\end{equation}
Here $A$ is the complex-valued amplitude. The amplitude in this case is not important: any wave-function must be normalisable (its probability distribution function, given by $|\psi(x)|^2$ according to the Born rule, must integrate to $1$ over all space to conserve probability) and so $A$ will need to be rescaled anyway. For those unfamiliar with complex exponential form, the waviness is more explicit in the less compact form $\exp{(ikx)}\equiv\cos{(kx)}+i\sin{(kx)}$. Here $k$ is the wave-number (or wave-vector in higher dimensions), and it appears naturally in most of the mathematics in this post. For this reason I will include the version of each equation with $k$ alongside the version with the more physically immediate momentum $p$ (the conversion is simply $p=\hbar k$).
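For anyone who wants to poke at the plane wave directly, a short numerical check (Python is my own choice of language here) confirms both Euler's formula and the fact that a plane wave's $|\psi|^2$ is constant across all $x$:

```python
import cmath
import math

k, x = 2.0, 0.7  # arbitrary illustrative wave-number and position

# The plane wave exp(ikx), with amplitude A = 1 for simplicity
plane = cmath.exp(1j * k * x)

# Euler's formula: exp(ikx) = cos(kx) + i sin(kx)
assert abs(plane - complex(math.cos(k * x), math.sin(k * x))) < 1e-12

# Born rule on a plane wave: |psi|^2 = 1 whatever x is (a flat distribution)
assert abs(abs(plane) ** 2 - 1.0) < 1e-12
```

Changing $x$ or $k$ changes the complex phase but never the modulus, which is exactly why a single mode carries no position information.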

Because the Schrödinger equation is linear, sums of solutions are themselves solutions (this property is known as the superposition principle). That means we can have wave-functions of the form
\begin{equation}
\psi(x)=\sum_{m=0}^{n}A_me^{ip_mx/\hbar}
\end{equation}
for any arbitrary $n$ (finite or infinite). Here the overall scaling of the $A$ values is still not important because of normalisation, but their relative magnitudes are, as these determine the relative probability weightings according to the Born rule. However, because we have wavelength $\lambda_m=2\pi/k_m\equiv2\pi\hbar/p_m$, we can see that the above formulation does not capture all possible modes; only a discrete set of wavelengths is included. In free space all modes are permissible and so we can take the continuum limit (let the sum turn into an integral):
\begin{equation}
\psi(x)=\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty}\tilde{\psi}(k) e^{ikx}\mathrm{d}k\equiv\frac{1}{\sqrt{2\pi\hbar}}\int_{-\infty}^{\infty}\tilde{\psi}(p) e^{ipx/\hbar}\mathrm{d}p.
\end{equation}
Because we have moved away from integer indexing, the discrete set of amplitudes $A_m$ is replaced by the continuous function suggestively denoted $\tilde{\psi}$. The function ranges over $p$ because we are integrating over all possible modes/wavelengths/momenta; in a sense $p$ takes over the role played by the index $m$ in the summation. The factor of $1/\sqrt{2\pi}$ is a matter of convention and the factor of $1/\sqrt{\hbar}$ comes from the change of variables from $k$ to $p$.
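To see the continuum limit in action, here is a hedged numerical sketch: picking, purely for illustration, Gaussian amplitudes $\tilde{\psi}(k)=e^{-k^2/2}$ (my own choice, not forced by anything above), the discretised integral reproduces the known analytic result $\psi(x)=e^{-x^2/2}$.

```python
import cmath
import math

def psi(x, kmax=10.0, n=2000):
    # Discretised version of psi(x) = (1/sqrt(2 pi)) * integral of
    # psi_tilde(k) * exp(ikx) dk, using the illustrative choice
    # psi_tilde(k) = exp(-k^2 / 2).
    dk = 2 * kmax / n
    total = 0j
    for i in range(n):
        k = -kmax + i * dk
        total += math.exp(-k * k / 2) * cmath.exp(1j * k * x) * dk
    return total / math.sqrt(2 * math.pi)

# For this particular psi_tilde the integral evaluates to exp(-x^2 / 2)
for x in (0.0, 0.5, 1.0):
    assert abs(psi(x) - math.exp(-x * x / 2)) < 1e-6
```

The sum over $k$ here is nothing more than the continuum limit of the superposition in equation (3), with one small slice of amplitude per mode.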

There is more to $\tilde{\psi}$ than meets the eye. Not only is it the amplitude function for the integral, but it's actually the wave-function itself, except not in physical space like $\psi$ but in momentum space.$^5$ In the context of QM this is known as the momentum space representation of the wave-function, but more broadly the mathematical construct is known as the Fourier transform,$^6$ and Fourier transforms occur very frequently in all manner of physical theories involving waves, be they QM, acoustics, optics, crystallography, signal analysis and so on.

So how do we determine the form of $\tilde{\psi}$? As it turns out, perhaps unsurprisingly, the Fourier transform is invertible, and so we find that
\begin{equation}
\tilde{\psi}(p)=\frac{1}{\sqrt{2\pi\hbar}}\int_{-\infty}^{\infty}\psi(x) e^{-ipx/\hbar}\mathrm{d}x.
\end{equation}
All well and good, but how do we make sense of it? Well, let's consider a limiting case. We can select out a single mode by setting $\tilde{\psi}(p)=\delta(p-p_0)$, a Dirac delta. The Dirac delta is zero everywhere except at $p=p_0$, where it is undefined; the area under it, however, is always normalised to $1$. Inserting the delta into equation (4) yields$^7$
\begin{equation}
\psi(x)=\frac{1}{\sqrt{2\pi\hbar}}\int_{-\infty}^{\infty}\delta(p-p_0)e^{ipx/\hbar}\mathrm{d}p=\frac{e^{ip_0x/\hbar}}{\sqrt{2\pi\hbar}},
\end{equation}
which is, outside a numerical factor, effectively a complex exponential in $x$, or equivalently, a flat (complex-valued) wave across all space (applying the Born rule to a plane wave yields a probability distribution of $|e^{iz}|^2=\text{const.}$). So when the momentum is maximally well-defined (put into a single mode) the position is maximally poorly-defined (the wave-function is spread evenly across all space). By the invertibility of the Fourier transform, we can expect the converse to hold (a maximally defined position, i.e., a Dirac delta in $x$, will result in a maximally poorly-defined momentum, i.e., an even wave across all momentum space). This relationship lies at the heart of the Heisenberg uncertainty principle.

[Figure: the blue curve is the probability distribution for a Gaussian wave-packet and the red curve is the Fourier transform of the blue curve. All sub-figures are to the same (arbitrary) scale. (i) The blue curve is given by $|\exp{(-x^2+ix)}|^2$. (ii) The blue curve is given by $|\exp{(-(2x)^2+i(2x))}|^2$. (iii) The blue curve is given by $|\exp{(-(4x)^2+i(4x))}|^2$. It is clear to see that as the width of the blue curve is decreased, the width of its Fourier transform correspondingly increases.]

It is easy to see why (physically speaking) a Dirac delta in momentum space produces a plane wave in position space, but perhaps not so for the other way around. One way to think of it is by considering the example of a Gaussian wave-packet in position space, which takes the Gaussian distribution as its probability distribution, making it somewhat localised (it's non-zero across all space, but asymptotes to zero approaching $\pm\infty$). It is not hard to show that the Fourier transform of the Gaussian wave-packet is also a Gaussian wave-packet. As we decrease the width of the position space wave-packet, we expect to require higher-frequency momentum modes as these will have shorter wavelengths and thus allow for more cancellation closer to the peak, therefore the momentum space wave-packet must correspondingly increase in width. Taking the limiting case, as the position space wave-packet becomes increasingly peaked the momentum space wave-packet must become increasingly spread-out, leading to the Dirac delta-plane wave correspondence we expected. This argument can also be used in reverse to achieve the more intuitive result of a Dirac delta in momentum space leading to a plane wave in position space.
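That reciprocal-width relationship is easy to check numerically. The sketch below (grid spans, step sizes and tolerances are my own illustrative choices) builds a Gaussian packet $\psi(x)=e^{-(ax)^2}$, takes a discretised Fourier transform, and measures the widths of $|\psi|^2$ and $|\tilde{\psi}|^2$. The product $\sigma_x\sigma_k$ comes out at $1/2$ whatever the value of $a$, i.e. $\sigma_x\sigma_p=\hbar/2$: the Gaussian is exactly the packet that saturates the Heisenberg bound.

```python
import cmath
import math

def width(density, grid, step):
    # Standard deviation of a sampled (unnormalised) probability density
    norm = sum(d * step for d in density)
    mean = sum(g * d * step for g, d in zip(grid, density)) / norm
    var = sum((g - mean) ** 2 * d * step for g, d in zip(grid, density)) / norm
    return math.sqrt(var)

def packet_widths(a, span=10.0, step=0.05):
    # Gaussian wave-packet psi(x) = exp(-(a x)^2); the phase factor from the
    # figure is dropped since it does not affect either width
    xs = [-span + i * step for i in range(int(2 * span / step) + 1)]
    psi = [math.exp(-((a * x) ** 2)) for x in xs]
    # Discretised Fourier transform, evaluated on the same grid of k values
    psit = [sum(p * cmath.exp(-1j * k * x) for p, x in zip(psi, xs)) * step
            for k in xs]
    sx = width([p * p for p in psi], xs, step)
    sk = width([abs(q) ** 2 for q in psit], xs, step)
    return sx, sk

for a in (1.0, 2.0):
    sx, sk = packet_widths(a)
    # Analytically sigma_x = 1/(2a) and sigma_k = a, so the product is 1/2
    assert abs(sx - 1 / (2 * a)) < 1e-3
    assert abs(sk - a) < 1e-2
    assert abs(sx * sk - 0.5) < 5e-3
```

Doubling $a$ halves the position-space width and doubles the momentum-space width, exactly the behaviour shown in the sub-figures above.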


As hinted at the start of the post, position and momentum are not the only observables that obey Heisenberg uncertainty. The next most common pairing$^8$ is energy and time,$^9$ although uncertainty relations can be generated more generally from derivatives of the (classical) action (followed by quantisation): momentum is the derivative of the action with respect to position, energy is (minus) the derivative of the action with respect to time, and so on.

Hopefully I have demonstrated that Heisenberg uncertainty is not so strange as might have first appeared. Rather than an arbitrary restriction on how accurately we can know certain measurable quantities, it is in fact a basic and unavoidable feature of any linear wave theory, of which quantum mechanics is only one example, albeit of a more fundamental and therefore perhaps more intuitively challenging sort than most.

Notes

$1$. The Heisenberg uncertainty principle is so ubiquitous in quantum physics that it is frequently referred to simply as 'the uncertainty principle'. Out of habit, however, I tend to use the less common 'Heisenberg uncertainty' to differentiate it from other, admittedly much less common, uncertainty relations. As far as I know, all of these usages are considered acceptable.

$2$. I hesitate to use the word 'particle', as it is important when talking about these quantum concepts to be very clear about what one means. A better term might have been 'quantum' or 'wave-particle', as the wave-nature of QM is central to the discussion of Heisenberg uncertainty. However, despite the slightly misleading connotations, 'particle' is by far the most commonly used term and so I will use it also.

$3$. We could, for example, take the standard deviation in $x$ and $p$, $\sigma_x$ and $\sigma_p$, as these will be represented by continuous distributions.

$4$. In this post I will use the Schrödinger representation of QM as this makes explicit the wave-nature of the wave-function. However, there are many, many representations of QM, each with their own advantages (and disadvantages) when it comes to analysing real systems, but importantly they are all exactly equivalent and so this wave-nature I have been emphasising is intrinsic to all of them. In that sense I could just as well have chosen any representation and this blog post would otherwise have been identical, although perhaps not as easy to understand.

$5$. If you don't know what momentum space is, the most important thing to understand is that there are mathematical spaces other than the space(time) we are familiar with. Let's consider the flat 3-dimensional "position" space (3-space) of ordinary life, with $x$-, $y$- and $z$-directions. Suppose we have an object at the coordinates $x=1$, $y=0$ and $z=0$. This can be represented by a vector in the 3-space going to the point $(1,0,0)$, thus describing the position. Now suppose the 3-space we are looking at is part of a 4-space that includes time, except we are going to set the time to some instant and freeze it there.

Let's say at that instant the object at $(1,0,0)$ is travelling with a momentum of $0$ in the $x$-direction, $1$ in the $y$-direction and $0$ in the $z$-direction (in arbitrary units). We could then construct a "momentum" 3-space (or 4-space including time) with directions $p_x$, $p_y$ and $p_z$ and at that instant of time the vector corresponding to the object would be at the coordinates $(0,1,0)$. So for any $n$-dimensional position space it's easy to see there is a corresponding $n$-dimensional momentum space. In fact, we can define a $2n$-dimensional space known as the phase space by combining the position and momentum spaces, and that space will describe all possible states of a physical system.
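As a trivial illustration of the bookkeeping (variable names are mine, purely for illustration), the object above can be represented as a point in the combined phase space simply by concatenating its position and momentum coordinates:

```python
# The worked example above, as data: the object sits at (1, 0, 0) in
# position 3-space and at (0, 1, 0) in momentum 3-space; its state is a
# single point in the combined 6-dimensional phase space.
position = (1, 0, 0)
momentum = (0, 1, 0)
phase_space_point = position + momentum
print(phase_space_point)  # (1, 0, 0, 0, 1, 0)
```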

$6$. I want to stress that strictly speaking the momentum representation of the wave-function is not the Fourier transform of the position representation. The Fourier transform dual to position is the wave-vector (or wave-number in 1-D) and not the momentum, although given one is a scalar multiple of the other I feel like we can be a little loose in our communication in this one respect.

$7$. In evaluating equation (6) we have made use of the so-called sifting property of the Dirac delta, whereby $\int_{-\infty}^{\infty}\delta(x-a)f(x)\mathrm{d}x=f(a)$. This property is analogous to the role the Kronecker delta plays in sums, but for integrals instead, and is arguably the Dirac delta's most useful feature.
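The sifting property is easy to verify numerically by standing in a narrow, normalised Gaussian for the Dirac delta (the width $\epsilon$, the test function and the integration grid below are my own illustrative choices):

```python
import math

EPS = 1e-3  # width of the Gaussian standing in for the Dirac delta

def delta_approx(x):
    # Narrow normalised Gaussian: unit area, sharply peaked at x = 0
    return math.exp(-x * x / (2 * EPS * EPS)) / (EPS * math.sqrt(2 * math.pi))

def sift(f, a, half_width=0.05, dx=1e-4):
    # Numerically integrate delta(x - a) * f(x) over a window around a
    n = int(2 * half_width / dx)
    return sum(delta_approx(-half_width + i * dx)
               * f(a - half_width + i * dx) * dx
               for i in range(n + 1))

# Sifting property: the integral simply picks out f(a)
assert abs(sift(math.cos, 0.7) - math.cos(0.7)) < 1e-3
```

As $\epsilon$ shrinks the approximation sharpens, recovering $f(a)$ exactly in the limit.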

$8$. These pairs of variables are typically referred to in physics as 'conjugate variables', although in this context we can also call them Fourier transform duals. It is important to remember that we would be less inclined to refer to them as such if we were working in, for example, the Heisenberg representation of QM where the Heisenberg uncertainty principle arises more directly out of the non-commutativity of Hermitian operator matrices. This is only because the Fourier transforms are implicit in that representation; they are still there in some sense due to the equivalence of representations of QM as discussed in Note 4, but are not nearly as obvious.

$9$. The energy-time Heisenberg uncertainty principle is given mathematically as $\Delta E\Delta t\geq\hbar/2$. This is analogous to position-momentum uncertainty in the sense that, just as the Fourier dual of position is wave-number $k=p/\hbar$ and not momentum directly, the Fourier dual of time is technically angular frequency $\omega=E/\hbar$ and not energy directly. As in the position-momentum case, however, the difference is only a scalar factor of $\hbar$ and so we can speak reasonably loosely with some impunity.

Thursday, April 10, 2014

News (2014/04/10)

Two items of news to report this week (well, one and a half at least). The half-piece of news is that I'm working on a new blog post which will hopefully be posted some time next week (although it might take until the week after).

The real news is that the LHCb experiment at the LHC has confirmed the existence of Z(4430), a so-called "exotic hadron". Hadrons are composite particles made of quarks (and held together by gluons). According to the quark model, hadrons can only form in one of two ways: as a quark-antiquark pairing (known as a meson) or as a quark triplet (known as a baryon).

The most common hadrons in the universe are protons and neutrons, which form atomic nuclei. Protons consist of 2 up quarks and 1 down quark ("uud") while neutrons consist of 1 up quark and 2 down quarks ("udd"). The names and details of quark types, which are known as "flavours", are not something I will go into here, as the topic is interesting enough to deserve its own post (although a proper explanation for the layperson would need a little more than that, I think).

The quark model is a simple one, though, and does not describe all of the dynamics permitted by quantum chromodynamics ("QCD", the part of the Standard Model that describes the strong interaction). This leaves the door open for exotic hadrons which are neither mesons nor baryons. Z(4430) is one such exotic hadron.

It was first 'discovered' in 2007 (although 5-sigma confirmation didn't come until 2008) and has now been observed at the LHCb experiment with 13.9-sigma significance. This means the chance of the observation being a statistical fluke is $1$ in $1.579\times10^{43}$ (a very, very, very large number indeed). It is believed to be a tetraquark made up of 1 charm quark, 1 charm antiquark, 1 down quark and 1 up antiquark ($c\bar{c}d\bar{u}$).
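As a sanity check on that number (assuming the quoted figure corresponds to the two-sided tail of a Gaussian distribution, which appears to be the convention used), the tail probability beyond $13.9\sigma$ can be computed from the complementary error function:

```python
import math

# Two-sided Gaussian tail probability beyond 13.9 sigma
p = math.erfc(13.9 / math.sqrt(2))
print(1 / p)  # roughly 1.58e43, consistent with the 1-in-1.579e43 figure
```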

While perhaps not as exciting as, for example, the BICEP2 result recently, this confirmation is still a very interesting result and will hopefully spur on further developments in the search for exotic hadrons.

Thursday, April 3, 2014

Learning to learn

It has happened a few times over the past couple of years that someone will come up to me with some piece of mathematics that they're having trouble with and ask me if I can help them with it. I'll have a look over it and say something along the lines of "I'm not familiar with this specifically, but I think I should be able to pick it up pretty easily". The usual response to this involves some degree of indignation.

Of course I understand why. Nobody likes to feel stupid or belittled, and this is certainly not my intention, but I can see how someone might feel that way. If I were in that position it would probably be my first response as well. But with that said, I don't think it's an entirely reasonable one.

I'm not trying to show off how clever I am. They were the ones to come to me asking for help (and I try my best to help if I can). Presumably they came to me because they expected I had some degree of expertise, but I get the impression the sort of expertise they were expecting was that of having been taught the subject previously. Sometimes this is the case of course, but mathematics is such a broad topic that you will rarely have seen everything at a sub-postgraduate level, no matter how educated you are.

The point of having gone to university and graduated with a degree (even a bachelor's) specialising in mathematics or a mathematics-heavy field like physics is not to simply learn as much of a field as you can and leave it there. The point is to equip yourself with the tools to teach yourself new things, and I hope to some extent I have managed to do that. I've taught myself things I never saw at university and so I know my limitations. Some concepts and techniques I'm reasonably familiar with and others I'm not; some things take more time and more effort to learn than others. It just happens that through study and experience I've managed to achieve the minimal level of competence to teach myself what they're having trouble with.

So my exasperation does not come from a place of ignorance. I have struggled with maths in the past; I know what it's like. The fact remains, however, that my degree would be worth very little if I hadn't picked up the ability to learn high school- or early undergraduate-level maths with relative ease. If someone's problem left me just as nonplussed as they were, then I should probably be seeking partial reimbursement from my university. I certainly hope they don't take it as a personal slight; I have no doubt that if they had studied what I had then they would have just as little trouble with their present difficulty as I might (and, I would hazard, quite possibly even less).