Quantum measurement, explained right - DeriveIt
In this note you'll learn where all the math behind quantum superposition and measurement actually comes from. These intuitions are pretty rare to find: I've only ever seen them stated as an axiom or as the "Born Rule"! Skip to \ref{mwe} if you already know the double slit experiment and basic quantum mechanics.

# double slit experiment - seeing why measurement, superposition, and waves are a thing

## double slit experiment

You probably already know the experiment (\detail{ We fire \b{a single electron at a time} towards a wall with two small slits in it, with a screen behind the wall. Electrons that make it through the slits hit the screen and leave a mark on it. \image{slits} }). \image{interrference_screen}

The electron here can't be described using classical physics (\detail{ The electron can't be described as a point particle in this experiment: \image{normal particles} When I first learned about this I thought electrons could be point particles that just deflect off the edges of the slits. But there are only two slits, so that idea could only give rise to two peaks like above. }). That's where quantum physics comes in.

## electron is a wave

The probability distribution of finding the electron at a location on the screen looks like it was made by a water wave going through the two slits. So we conclude that that's exactly what happened: a wave \i{did} go through the two slits. $$\text{The electron is literally a wave.}$$ $$\text{The probability of finding the electron somewhere}\\\text{depends only on the height of the wave there.} $$

Here's an example (\detail{ This wave hits the screen, and gives rise to the wave pattern below. \image{screenpluswave} }). Here's what the evolution looks like \detail{ Each frame below is the probability distribution we get on the screen, given a specific distance from the slits to the screen. The distance starts ridiculously small and gets bigger. By looking at these slices, we can see how the wave evolves as it travels.
A wave is clearly going through both slits and forming the probability distribution! }: \image{slitwave} People say "the electron is at multiple positions at once", because the electron is literally a wave, and waves occupy many positions at once. Scientists say the electron is in a \b{"superposition"} over multiple positions. ## the screen is special There's something very special about the screen. The screen causes the electron to occupy a specific position. Scientists say the screen \b{"measures"} or \b{"observes"} the wave, and causes it to \b{"collapse"} \detail{Scientists don't know why measurement happens at the screen, and not when the electron was flying through the air, for example. But they don't need to know why it happens to know that it does. }. Right after the electron hits the screen, the wave looks like this \detail{ The reasoning: Right after the electron hits the screen, we know the exact position of it. The position of the electron is wherever it left a mark. There's a 100% chance we'll find it there, and a 0% chance we'll find it anywhere else. In order for the height of the wave to give the probability of finding the electron there, it must be 100% at the mark, and 0% everywhere else. }: \image{wave collapse} ## all particles In the universe, there's light and there are things with mass. It turns out that all massive particles (like the electron) behave like waves. The electron isn't special. It turns out that light also behaves the same way. If you know a little physics, you can reason through this \detail{ Using physics, you can see that any particle's wave properties of frequency and wavelength are interchangeable with its particle properties of energy and momentum. This hints that particles of light and particles with mass might behave the same. Here's how we figure out the interchangeability: For light, \detail{ From special relativity, we know $E^2=(mc^2)^2+(pc)^2$ for any particle. 
From the photoelectric effect, we know single photons exist because there are discrete units of energy that light can carry. The photoelectric effect also tells us the energy of a single photon is $E=h\nu = \hbar \omega$. We define $\omega = 2\pi\nu$ as the angular frequency and $k=\frac{2\pi}{\lambda}$ as the spatial angular frequency, and we know the speed of light is constant, so $c=\lambda \nu$, or in other words $\omega = ck$. It's easy to combine these for light since $m=0$: the relativity equation becomes $E=pc$, so $p=\frac{E}{c}=\frac{\hbar \omega}{c}=\hbar k$. The two takeaways for light are that $p=\hbar k$ and $E=\hbar \omega$. So we know we can convert $(\omega,k) \iff (E,p)$ for light. } For particles, \detail{ The de Broglie hypothesis says that for particles, $p=\hbar k$. Apparently, you can also show $E= \hbar \omega$ in the rest frame of the particle. If you have a good derivation of either of these, please leave a comment! This is too much of an aside for me right now. } }.

\box{
1. The electron is a wave, occupying multiple positions at once, in a \b{superposition} of multiple positions.
2. The screen \b{measures} or \b{observes} the electron, aka it decides a specific position for the electron based on the wave's height and \b{collapses} the wave.
\yt{Efb-fb1xy44} }

# what about partial measurement?

## example - half screen

Say we perform the double slit experiment, but only with the right half of the screen (\detail{ \image{righthalf} }). What happens when the wave hits the screen? Well, obviously, the universe needs to decide whether the electron hit the screen or not! We can use similar reasoning as above\detail{ If the universe decides the electron did hit the screen, then the electron will be localized to a single point (where it made a mark). This happens half the time, since half the wave is there. If the universe decides the electron didn't hit the screen, the probability that the electron is on the screen is now 0, so the wave height is 0 on the screen.
(Of course, what's left of the wave will continue traveling and evolving as a wave). This happens the other half of the time. } to determine what happens: \image{half screen}

Of course, the electrons that hit the screen still make the pattern \detail{\image{halfscreen pattern}}.

\box{ If you measure the wave in a particular region, the wave either gets localized to a point in the region (based on the wave height there), or now has a height of 0 in the region: \image{arbitrary M} }

# new experiment - polarization filters \label{mwe}

## starting point

The double slit experiment is the best way to see superposition and measurement, but unfortunately it's not the best starting point for describing these things mathematically, because it has superposition and measurement over position, and there are infinitely many positions. On the other hand, the \b{"light filter experiment"} (explained here) has superposition and measurement over just two things, so it's much simpler. So we'll start with this light experiment.

## Maxwell's Equations for light

We can use Maxwell's Equations to determine what light looks like\detail{ There's a decent amount of math that goes into solving Maxwell's Equations for light (or anything else), but the details aren't that important. If you're curious, here are 2 ways: Way 1: \detail{ A complicated but general way of solving Maxwell's Equations for the equations of light in free space below is to look at the vector potential $\vec A$ in the "Lorenz gauge" and a few Fourier Transform properties of it, and then show that for a single wavevector $\vec k_0$ of light, $\vec A(\vec r, t) = \text{Re} \{ \vec \epsilon e^{i (\vec k_0 \cdot \vec r - \omega t)}\}$, where $\vec \epsilon \perp \vec{k_0}$, and $\vec \epsilon$ is complex. You can show the E-field is proportional to $\vec A$, and the B field is too, but perpendicular to the E-field.
} Way 2: \detail{ A simpler way is to use Maxwell's equations to show the general wave equation $\nabla^2\vec E= \frac{1}{c^2} \frac{\partial^2 \vec E}{\partial t^2}$ holds for the E-field, and then solve that with the assumption that you're looking for plane-wave solutions. Again, you can easily show the B field is perpendicular and proportional to the E-field. } }.

Light has an electric field (\b{"E-field"}) and a magnetic field (\b{"B-field"}), and the two are perpendicular to each other. Here's a simple case (\detail{Linearly polarized light: \image{1 EBflow}}). Since the two fields oscillate together and are always perpendicular, if we know one field, we can fully determine the other. So we only need to describe one of them. Let's just consider the E-field \detail{Linearly polarized light, E-field only: \image{2 Eflow}}.

In general, the E-field is given by: $$ \label{E} \vec E(z,t) = a_1 \cos(kz - wt+\phi_1) \, \hat x \\ \;\;\;\;\;\;\;\;\;\;\;\;\;+ a_2 \cos(kz - wt+\phi_2) \, \hat y $$ This is just two perpendicular cosine waves added together! Both cosines travel together with the same speed and frequency (described by $k$, $w$, $z$, and $t$)\detail{ $k$ is the frequency in space (just $2\pi$/wavelength), $w$ is the frequency in time (just $2\pi$/period), and $z$ and $t$ are the position and the time. }. We can change each cosine's individual size/"amplitude", $a_1$ and $a_2$, and its offset/"phase", $\phi_1$ and $\phi_2$.

(\detail{ It's called "linearly" because the E-field always stays on a line. \image{3.4 Eflow_linear_add_y} }) We can get \b{"linearly polarized"} light by setting $\phi_1=\phi_2$.

(\detail{ Here are the two possible cases of circularly polarized light (clockwise and counter-clockwise). You can easily tell them apart by the helix. Watch $\phi_1$ to see what's going on.
\image{3.6 Eflow_linear_add_phase} }) We can get \b{"circularly polarized"} light by making the waves a quarter cycle out of phase with the same amplitude ($\phi_1\pm \frac{\pi}{2} = \phi_2$ and $a_1 = a_2$).

Or we can get \b{"elliptically polarized"} light, which is anything in between linearly and circularly polarized. Here's a picture of everything: \image{Eflow-general}

## basis

Your choice of coordinate system is called a \b{"basis"}. Obviously light behaves the same no matter what coordinates you pick. So it doesn't matter what basis you use. \image{basis}

## polarization filter experiment

Now that we know how light works, we can do an experiment with light particles to try and figure out how quantum mechanics works. To do the experiment, you send a linearly polarized light particle through a "polarization filter". We observe that the light particle either passes through the filter and aligns in the same direction as the filter, or it gets blocked by the filter. \image{filter_desc}

It turns out there's a probability that a photon makes it through the filter. Interestingly, it only depends on the angle $\theta$ between the filter and the E-field\detail{ The filter doesn't care where in its cycle the light wave is! (It doesn't depend on $z$ or $t$). }! $$P[\text{photon passes through}] = \cos^2 \theta $$

One more important idea: The filter only allows photons to pass through if they align with the filter. And half a photon isn't allowed to pass through\detail{A single photon always has energy $hf$. Filters don't change photon frequency $f$, so filters can't split a photon.}. So either 100% of the photon aligns with the filter and the photon passes through, or 0% aligns and the photon gets blocked. If 0% aligns with the filter, the E-field still needs to be \i{somewhere} - specifically, it must be perpendicular to the filter. In other words, the filter causes the photon to either: $$ \begin{align*} &\text{1. align parallel.} \\ &\text{2. 
align perpendicular.} \end{align*} $$ (in case 2., the photon gets blocked). ## superposition and measurement This experiment is starting to look like the double slit experiment, where the electron became localized to a single position when it hit the screen. Here the photon is becoming localized to a single direction when it hits the filter! We can reason that the photon's E-field is in a \b{superposition} over its two cosines. When the photon hits the filter, it gets \b{measured} and collapses either parallel ($\hat y$), or perpendicular ($\hat x$) to the filter. ## why does $\cos^2\theta$ show up? Measurement is probabilistic. The filter needs to pick a probability for the incoming photon to align with $\hat x$, and a probability for it to align with $\hat y$. The only reasonable thing for the filter to do is to look at how much the incoming photon \i{currently} aligns with $\hat x$ and $\hat y$, and pick the probability based on that. In other words, decompose the photon into $a_1 \hat x + a_2 \hat y$, and pick probability based on $a_1$ and $a_2$. \image{cos theta filter} \image{prob triangle} This forces the universe to square the amplitudes to get the probabilities\detail{ We deduced $\hat x$ and $\hat y$ must be perpendicular. So we know $a_1^2+a_2^2=a^2$, or $\Big(\frac{a_1}{a}\Big)^2+\Big(\frac{a_2}{a}\Big)^2 = 1$ Since probabilities add to 1, we can easily identify that $\text{Prob}_1=\big(\frac{a_1}{a}\big)^2$, and $\text{Prob}_2=\big(\frac{a_2}{a}\big)^2$! }, and we can see that $P[\text{pass}]=\text{Prob}_2=\big(\frac{a_2}{a}\big)^2 = \cos^2\theta$! ## collapse picture Here's an analogy to the double slit experiment. The photon is a "wave" that collapses when it hits the filter, based on the height of the wave at each location! Same exact thing as the double slit experiment, just over 2 things, not infinitely many! \image{lin lin M only} \box{ A photon's E-field is in a superposition over its two cosines. 
A filter measures the incoming photon, causing its E-field to either collapse into a cosine that aligns with the filter, or a cosine that aligns perpendicular. $P[\text{pass}] = P[\text{photon is measured to align with the filter}]$. $P[\text{blocked}] = P[\text{photon is measured to align perpendicular}]$. } # probability ## what does the filter care about? To get a better idea of what the filter cares about, we can rewrite \ref{E} as: $$\vec E(z,t)= \text{Re}\Bigg\{ \Big( a_1 e^{i \phi_1} \hat x + a_2 e^{i \phi_2}\hat y \Big)\;\;\; e^{i(kz- \omega t)} \Bigg\} \label{E2}$$ The exponent term just shifts where the E-field is in its cycle, which the filter doesn't care about. The only thing the filter cares about is the "shape" of the photon, given by $a_1 e^{i \phi_1} \hat x + a_2 e^{i \phi_2}\hat y$\detail{ Note we can ignore $\text{Re}\{...\}$ because the E-field is not a fundamental description of light. The fundamental description comes from the things the E-field \i{depends on}. The E-field only depends on the shape $a_1 e^{i \phi_1} \hat x + a_2 e^{i \phi_2}\hat y$ and the place in the cycle $e^{i(kz- \omega t)}$. }! Note that the shape is usually called the \b{"state"} of the photon. Let's notate the incoming photon's state as $$\vec a = a_1 e^{i \phi_1} \,\hat x + a_2 e^{i \phi_2}\,\hat y$$ and let's notate the state of the filter's pass-through photon as $$\vec p = p_1 e^{i \gamma_1} \, \hat x + p_2 e^{i \gamma_2} \,\hat y$$ ## probability = projection$^2$ Note that when the filter decomposed the light into $\hat y$ and $\hat x$ components, it was really projecting the incoming photon onto $\hat x$ and $\hat y$. Putting this in terms of $\vec a$ and $\vec p$, $\cos^2\theta$ is what we get when we project $\vec a$ onto $\vec p$ and square the result for linearly polarized light\detail{ $\vec a = \sin \theta \hat x + \cos \theta \hat y$, and $\vec p = \hat y$. }. 
This projection idea is how we'll extend our result from linearly polarized light to all polarized light. \image{projection result 1}

## normalize

First of all, the filter doesn't care about the magnitudes of either state. So we'll have to ignore them when we get the probability. It's common to notate the magnitude of a vector as $a = |\vec a| = \sqrt{a_1^2+a_2^2}$. Typically we choose to always ignore the magnitudes by setting $a=1$ and $p=1$\detail{ The magnitude of the photon's E-field just depends on its energy, and the energy is $\hbar w$. The polarizer doesn't care about $w$, so it doesn't care about the magnitude. If we're in an orthogonal basis, we can normalize each vector to have a magnitude of 1 by scaling $a_1,a_2,p_1,p_2$: $a_1\rightarrow \frac{a_1}{a}$, $a_2\rightarrow\frac{a_2}{a}$, $p_1\rightarrow\frac{p_1}{p}$, and $p_2\rightarrow\frac{p_2}{p}$. }.

## projection

Now, the goal is to figure out how exactly to perform the projection of $\vec a$ onto $\vec p$. The projection's phase is clearly irrelevant to computing probability, so we should ignore it\detail{ Suppose we're working with linearly polarized light with an initial offset, so $\vec p = \hat y$ and $\vec a =e^{i \phi_1}\cos \theta \hat y + e^{i \phi_1}\sin \theta \hat x$, just like above. The projection of $\vec a$ onto $\vec p$ is just $e^{i \phi_1} \cos \theta$. But the filter doesn't care about the $e^{i \phi_1}$, since that just offsets it in its cycle (if you're confused about this, plug $\vec a$ into \ref{E2}). That's why the probability is just $\cos^2 \theta$, and doesn't depend on $\phi_1$. So the phase $e^{i \phi_1}$ should be ignored in the probability. }. Projection is typically done with the dot product. If we combine these two ideas, we get the initial guess of $(\text{projection of } \vec a \text{ onto } \vec p)= |\vec a \cdot \vec p|$.
This guess works for linearly polarized light, where it gives $\cos \theta$, but it gives nonsensical results for arbitrary $\vec a$ and $\vec p$\detail{ If we have a circularly polarizing filter, and the incoming light is circularly polarized in the same direction (so it already perfectly aligns with the filter), obviously the probability of passing should be 1. But the regular dot product gives us $P[\text{pass}] = \Bigg|\frac{1}{\sqrt 2} \begin{pmatrix}1\\ i \end{pmatrix} \cdot \frac{1}{\sqrt 2} \begin{pmatrix}1\\ i \end{pmatrix}\Bigg|^2=|\frac{1}{2}(1 - 1)|^2 = |0|^2 = 0$. The problem is that the projection of two complex vectors isn't well defined like this. }. But it's an easy fix: all we need to do is conjugate one of the vectors before taking the dot product \detail{ We can fix this problem by simply conjugating one of the vectors before taking the dot product. Clearly $1 = \begin{pmatrix}a_1 e^{i\phi_1}\\ a_2e^{i\phi_2}\end{pmatrix} \cdot \begin{pmatrix}a_1 e^{i\phi_1}\\ a_2e^{i\phi_2}\end{pmatrix}^*$. The reason this works: You can easily show for a complex number $c$, that $c^*c$ equals the magnitude of $c$ squared, i.e. $|c|^2$, since $c^*c=(c_1-ic_2)(c_1+ic_2)=c_1^2+c_2^2 = |c|^2$. This idea is the vector version of that, taking advantage of the fact that $a=1$, so $a_1^2+a_2^2=1$. }. The projection of $\vec v_1$ onto $\vec v_2$ is defined as: $$|\vec v_1^* \cdot \vec v_2|$$ Defining probability as $|\vec p^* \cdot \vec a| ^2$ works for all cases of linearly polarized light and one case of circularly polarized light, and we can reason that it works in general\detail{ We saw this formula works in two cases: linearly polarized light dotted with any other linearly polarized light, and circularly polarized light dotted with itself.
The dot product is linear, and these are the two extremes, so we can reason it holds in all cases: \image{dp_linear_fill} You could have also arrived at this projection result from a pure math perspective: \link{https://en.wikipedia.org/wiki/Dot_product#Complex_vectors, see wikipedia} (Note that $a p \cos \theta = |\vec a^* \cdot \vec p|$ here, but Wikipedia defines $\cos \theta$ using the real part of the dot product rather than its absolute value for some reason). }!

Now that we're dealing with complex vectors, we don't say "perpendicular", we say \b{"orthogonal"}. Naturally, vectors $\vec v_1$ and $\vec v_2$ are orthogonal when the projection of one onto the other is $0$: $$\vec v_1 \text{ is orthogonal to } \vec v_2 \iff \vec v_1^* \cdot \vec v_2 = 0$$

## big result

Putting this all together, $$ P[\text{pass}] = P\Bigg[ \begin{pmatrix}a_1 e^{i\phi_1}\\ a_2 e^{i\phi_2}\end{pmatrix} \text { is measured as } \begin{pmatrix}p_1 e^{i\gamma_1}\\ p_2 e^{i\gamma_2} \end{pmatrix}\Bigg] = \Bigg|\begin{pmatrix}p_1 e^{i\gamma_1}\\ p_2 e^{i\gamma_2} \end{pmatrix}^* \cdot \begin{pmatrix}a_1 e^{i\phi_1}\\ a_2 e^{i\phi_2}\end{pmatrix} \Bigg|^2 $$ This is a picture of measurement in general, for any incoming light described by $\vec a$, and any filter that lets through light described by $\vec p$ (of course, $\vec p$ is orthogonal to $\vec p_\perp$).
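
If you'd like to check the formula numerically, here's a minimal Python sketch. It assumes states are written as pairs of complex numbers in the $\hat x$, $\hat y$ basis; the names `normalize` and `pass_probability` are made up for illustration and don't come from any library.

```python
import cmath
import math

def normalize(v):
    """Scale a complex 2-vector so its magnitude is 1."""
    mag = math.sqrt(sum(abs(c) ** 2 for c in v))
    return [c / mag for c in v]

def pass_probability(a, p):
    """|p* . a|^2 for incoming state a and filter state p (hypothetical helper)."""
    a, p = normalize(a), normalize(p)
    # Conjugate the filter state before taking the dot product.
    overlap = sum(pc.conjugate() * ac for pc, ac in zip(p, a))
    return abs(overlap) ** 2

theta = math.radians(30)
phase = cmath.exp(1j * 0.7)  # arbitrary overall phase, chosen for illustration
lin = [phase * math.sin(theta), phase * math.cos(theta)]  # linear, angle theta from y
y_filter = [0, 1]      # linear filter along y
circ = [1, 1j]         # (x + i y) / sqrt(2) after normalizing
anti_circ = [1, -1j]   # (x - i y) / sqrt(2) after normalizing

print(pass_probability(lin, y_filter))    # compare with cos(theta)**2
print(pass_probability(circ, y_filter))   # circular into linear filter
print(pass_probability(circ, circ))       # circular into matching circular filter
print(pass_probability(anti_circ, circ))  # opposite circular into circular filter
```

The overall phase multiplying `lin` drops out because of the absolute value, which is the same point the formula makes.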
Note that the phase is ignored!: \image{projection upgrade} ## examples example - lin going into lin filter \detail{ $P[\text{pass}] =\Bigg| (\hat y)^* \cdot \Big( e^{i \phi_1} \sin \theta \hat x + e^{i \phi_1} \cos \theta \hat y \Big) \Bigg|^2 =\Bigg| \begin{pmatrix} 0 \\ 1 \end{pmatrix}^* \cdot \begin{pmatrix} e^{i \phi_1} \sin \theta \\ e^{i \phi_1} \cos \theta \end{pmatrix} \Bigg|^2 = \cos^2 \theta $ \image{lin lin} } (Note: \detail{ Note that circularly polarized light is given by $\frac{\hat x + i \hat y}{\sqrt 2} $\detail{ We get circularly polarized light when the cosines are perfectly out of phase and have the same magnitude, so $\phi_1\pm \frac{\pi}{2}=\phi_2$, and $a_1=a_2$. $a_1^2+a_2^2=1$, so both equal $\frac{1}{\sqrt 2}$. Let's just pick $\phi_1=0$ and $\phi_2 = \frac{\pi}{2}$. $a_1 e^{i \phi_1}\hat x + a_2 e^{i \phi_2}\hat y$ $ = \frac{1}{\sqrt 2} (e^{i0} \hat x + e^{i\frac{\pi}{2}} \hat y)$ $= \frac{\hat x + i \hat y}{\sqrt 2} $ }. }) example - circ going into lin filter\detail{ $P[\text{pass}] = \Bigg| \begin{pmatrix}0 \\ 1 \end{pmatrix}^* \cdot \frac{1}{\sqrt 2} \begin{pmatrix} 1 \\ i \end{pmatrix} \Bigg|^2 = \frac{1}{2} = 50\% $ \image{circ lin} } (Note: \detail{ Now, let's look at circularly polarized filters. Clockwise circularly polarized light is given by $\frac{\hat x + i \hat y}{\sqrt 2}$, and the orthogonal vector is $\frac{\hat x - i \hat y}{\sqrt 2}$ (which is counter-clockwise circularly polarized light). You can test that they're orthogonal by conjugating one of them and then dotting them together, which gives $0$. The whole point of a basis is that you can describe any vector in terms of the basis vectors. So the same way we wrote everything in terms of $\hat y$ and $\hat x$ above, we can also write everything in terms of $\frac{\hat x + i \hat y}{\sqrt 2}$ and $\frac{\hat x - i \hat y}{\sqrt 2}$ below. Obviously, the circularly polarizing filter measures the incoming light in this new orthogonal basis. 
})

example - circ going into circ filter\detail{ $P[\text{pass}]$ $= \Bigg| \frac{1}{\sqrt 2} \begin{pmatrix} 1 \\ i \end{pmatrix}^* \cdot \frac{1}{\sqrt 2} \begin{pmatrix} 1 \\ i \end{pmatrix} \Bigg|^2$ (basis of $\hat x$ and $\hat y$) $ = \Bigg| \begin{pmatrix} 1 \\ 0 \end{pmatrix}^* \cdot \begin{pmatrix} 1 \\ 0 \end{pmatrix} \Bigg|^2 $ (basis of $\frac{\hat x + i \hat y}{\sqrt 2}$ and $\frac{\hat x - i \hat y}{\sqrt 2}$) $ = 1 = 100\%$ \image{circ circ} }

example - lin going into circ filter\detail{ $P[\text{pass}]$ $ = \Bigg| \frac{1}{\sqrt 2} \begin{pmatrix}1\\ i \end{pmatrix}^* \cdot \begin{pmatrix} 0 \\ 1 \end{pmatrix} \Bigg|^2$ (basis of $\hat x$ and $\hat y$) $ = \Bigg| \begin{pmatrix} 1 \\ 0 \end{pmatrix}^* \cdot \begin{pmatrix} \frac{-i}{\sqrt 2} \\ \frac{i}{\sqrt 2} \end{pmatrix} \Bigg|^2 $ (basis of $\frac{\hat x + i \hat y}{\sqrt 2}$ and $\frac{\hat x - i \hat y}{\sqrt 2}$) $= \frac{1}{2} = 50\%$ \image{lin circ} }

example - anti circ going into circ filter\detail{ $P[\text{pass}]$ $ = \Bigg| \frac{1}{\sqrt 2} \begin{pmatrix} 1 \\ i \end{pmatrix}^* \cdot \frac{1}{\sqrt 2} \begin{pmatrix} 1 \\ -i \end{pmatrix} \Bigg|^2$ (basis of $\hat x$ and $\hat y$) $= \Bigg| \begin{pmatrix} 1 \\ 0 \end{pmatrix}^* \cdot \begin{pmatrix} 0 \\ 1 \end{pmatrix} \Bigg|^2$ (basis of $\frac{\hat x + i \hat y}{\sqrt 2}$ and $\frac{\hat x - i \hat y}{\sqrt 2}$) $ = 0\% $ \image{anti circ circ} }

\box{ The projection of a complex vector onto another one just requires conjugating one of the two vectors before we take their dot product\detail{assuming we've normalized the two vectors}. This gives: $$ \begin{align*} P[\text{pass}] &= P[\vec a \text { is measured as } \vec p] \\& = \Bigg|\begin{pmatrix}p_1 e^{i\gamma_1}\\ p_2 e^{i\gamma_2} \end{pmatrix}^* \cdot \begin{pmatrix}a_1 e^{i\phi_1}\\ a_2 e^{i\phi_2}\end{pmatrix} \Bigg|^2 \end{align*} $$ We reasoned that \b{"orthogonal"} for complex vectors should mean that the projection of one onto the other equals zero, i.e.
$\vec v_1 \text{ is orthogonal to } \vec v_2 \iff \vec v_1^* \cdot \vec v_2 = 0$. }

# wavefunction

## definition

$$\text{The wavefunction is defined as a full description of the particle.}$$

## wavefunction for photon polarization

It shouldn't come as a surprise that for the light polarization experiment, the wavefunction is defined as \detail{ We saw this is what gives rise to the probability of the particle (and probability describes the particle fully in the context of this experiment)! }: $$a_1 e^{i \phi_1} \hat x + a_2 e^{i \phi_2}\hat y$$

## wavefunction for double slit experiment

The quantity relevant to measurement in the double slit experiment is\detail{ Since light and massive particles go through the double slits and produce the same exact results, we can focus on the case that a photon goes through them, since we know the shape of a photon. It's just $a \hat q e^{i \phi} e^{i(kz-wt)}$. We assume the polarization in the double slit experiment is uniform in some $\hat q$ direction, because polarization plays no role in the double slit experiment. So the only relevant quantity is the amplitude and phase $a e^{i \phi}$. Now that position is relevant (unlike in photon polarization, where only polarization was relevant), there's an amplitude and phase for every position. You can think of this as the amplitude of the E-field at each position, but it applies to electrons too, which don't have oscillating E-fields. }: $$a(x) e^{i \phi(x)} \text{ for every position } x$$

Even though in the double slit experiment positions are not physically orthogonal to each other, we describe measurement as if they are orthogonal\detail{ Just like in the light filter experiment, if we measure the electron in one position, it must not be at any other position. And if we measure it's not at a position, it must be at one of the others. Orthogonal just means "no overlap". }, just like in the photon polarization experiment! All the math is the same!
Consider the double slit with only 3 possible positions for the electron. Then, measuring the electron's position would look like this: \image{3D M}

Clearly, the wavefunction for the double slit experiment with only 3 possible positions is this: $ a_1 e^{i \phi_1} \hat x_1 + a_2 e^{i \phi_2} \hat x_2 + a_3 e^{i \phi_3} \hat x_3 $. If we generalize this to $10$ possible positions, the double slit's wavefunction is: $$ \sum_{i=1}^{10} a_i e^{i \phi_i} \hat x_i \label{discrete} $$ In the real experiment we have infinitely many positions, and the wavefunction is: $$ \int_{-\infty}^{\infty} a(x)e^{i \phi(x)} \hat x(x) \; dx \label{continuous} $$

## normalization

The basis vectors are normalized and orthogonal to each other, as always\detail{ Formally, for discrete \ref{discrete}, $\hat x_i^* \cdot \hat x_j =\delta_{ij}$. Formally, for continuous \ref{continuous}, $\hat x(x_1)^* \cdot \hat x(x_2) = \delta (x_1 - x_2)$. A lot of the time you don't even need to identify each $\hat x$, because when you take the dot product you can just use these identities. That's important because picking a basis for the discrete case is easy: we can just set $\hat x_3 = (0,0,1,0,0,0,0,0,0,0)$ and so on. But in the continuous case \ref{continuous}, we can't write down a vector for each $\hat x(x)$ because there are infinitely many values of $x$. }. Below I notate complex numbers as an underlined letter, $\underbar a$, which makes things much cleaner. $\underbar a = a e^{i \phi}$. Here's how we normalize \ref{discrete} and \ref{continuous}, respectively: $$\sum_{i=1}^{10} a_i^2 = \sum_{i=1}^{10} |\underbar{a}_i|^2 = 1$$ $$ \int_{-\infty}^{\infty} a(x)^2 \; dx = \int_{-\infty}^{\infty} \big|\underbar{a}(x)\big|^2 \; dx = 1$$

## probability

In the discrete case, what's the probability that we measure $\vec a$ in the third position $\hat x_3$? Well, if all the math is the same in both experiments, then it should be ($\vec a$ projected onto $\hat x_3$)$^2$.
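
To make that concrete, here's a small Python sketch of the discrete case with 10 positions. The magnitudes and phases are made-up numbers, and `basis_vector` and `position_probability` are hypothetical helper names; positions are indexed from 1 to match $\hat x_1, \dots, \hat x_{10}$.

```python
import cmath
import math

# Made-up magnitudes and phases for the 10 positions (illustrative only).
magnitudes = [1, 2, 3, 4, 5, 5, 4, 3, 2, 1]
phases = [0.3 * i for i in range(10)]
state = [m * cmath.exp(1j * ph) for m, ph in zip(magnitudes, phases)]

# Normalize so that the squared magnitudes sum to 1.
norm = math.sqrt(sum(abs(c) ** 2 for c in state))
state = [c / norm for c in state]

def basis_vector(i, n=10):
    """x_i as a list with a 1 in slot i (1-indexed) and 0 everywhere else."""
    return [1 if k == i - 1 else 0 for k in range(n)]

def position_probability(state, i):
    """Squared projection of the state onto x_i, i.e. |x_i* . a|^2."""
    x_i = basis_vector(i, len(state))
    overlap = sum(b.conjugate() * c for b, c in zip(x_i, state))
    return abs(overlap) ** 2

p3 = position_probability(state, 3)
print(p3, abs(state[2]) ** 2)  # projecting onto x_3 just picks out the third amplitude
print(sum(position_probability(state, i) for i in range(1, 11)))  # should be close to 1
```

Because each basis vector has a single nonzero slot, the projection reduces to the squared magnitude of one amplitude, and the phases never enter.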
Here's the probability for \ref{discrete} and \ref{continuous}, respectively: $$ P[\vec a \text{ is measured to be in state } \hat x_i]= \big|\hat x_i^* \cdot \vec a\big|^2 = \big|\underbar a_i \big|^2$$ $$P[\vec a \text{ is measured to be in state } \hat x(x_0)] = \big|\hat x(x_0)^* \cdot \vec a\big|^2 dx \;\;\;\; (= 0 \text{, continuous!})$$ The $dx$ makes the continuous case go to $0$ as expected\detail{ There's an implicit $dx$ multiplying each $a_i$ term in the discrete case, except it's the same for each position, and so it gets normalized out. But in the continuous case, we have an infinite number of $x$s, so $dx\rightarrow 0$, and it can't be normalized out, so we leave it in, multiplying each $\hat x$ and $\vec a$ component. }.

## probability in a region

We can get the probability that the electron is in a region by just summing up all the probabilities there. Here's the probability in a region for \ref{discrete} and \ref{continuous}, respectively: $$ P[\vec a \text{ measured between } x_i \text{ and } x_j] =\sum_{k=i}^{j} \big|\hat x_{k}^* \cdot \vec a\big|^2 = \sum_{k=i}^{j} \big|\underbar a_k\big|^2 $$ $$ P[\vec a \text{ measured between } x_1 \text{ and } x_2] =\int_{x_1}^{x_2} \big|\hat x(x)^* \cdot \vec a\big|^2 \; dx = \int_{x_1}^{x_2} \big|\underbar a(x)\big|^2 \; dx $$

## general measurement

Above, the screen only measured the position of the electron, so it measured in the basis of $\hat x(x)$. But what if we have a measurement apparatus that measures, say, $\hat b$? I haven't given any other examples besides position, but I figured I'd put this in to be complete\detail{ One example of $\hat b$ is a state with a definite momentum. Another example is, say, a uniform distribution in a region, so $\hat b = \int_{x_0}^{x_0+l} \frac{1}{\sqrt l} \hat x(x) dx$ (the $\frac{1}{\sqrt l}$ keeps $\hat b$ normalized). Note that if we measure this $\hat b$, the below equation doesn't give the probability that the electron is in the region. It gives the probability the electron collapses and becomes the same exact wave as $\hat b$!
(and if it doesn't collapse, the wave in the orthogonal basis that includes $\vec b$ now becomes 0, and renormalizes everywhere else). }. Here's the probability for \ref{discrete} and \ref{continuous}, respectively: $$ P[\vec a \text{ is measured to be } \hat b]=|\hat b^* \cdot \vec a|^2 = \big|\sum_{i=1}^{10} \underbar b_i ^* \; \underbar a_i \big|^2 $$ $$ P[\vec a \text{ is measured to be } \hat b]=|\hat b^* \cdot \vec a|^2 =\big|\int_{-\infty}^{\infty} \underbar b^*(x) \underbar a(x)\; dx \big|^2 $$ Here are the full details for the continuous case\detail{ $|\hat b^* \cdot \vec a|^2$ $= \big|\Big(\int_{-\infty}^{\infty} \underbar b^*(x) \hat x(x) \; dx\Big) \cdot \Big(\int_{-\infty}^{\infty} \underbar a(x') \hat x(x') \; dx'\Big) \big|^2$ $= \big|\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \underbar b^*(x) \underbar a(x') \hat x(x) \cdot \hat x(x') \;dx \; dx' \big|^2$ $= \big|\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \underbar b^*(x) \underbar a(x') \delta(x - x') \;dx \; dx' \big|^2$ $= \big|\int_{-\infty}^{\infty} \underbar b^*(x) \underbar a(x) \;dx \big|^2$ }. ## bra-ket notation Rather than using vectors and writing $\vec a^* \cdot \vec p$, Paul Dirac invented bra-ket notation. It just changes our notation from vectors to matrices. When we start writing matrices and not just doing dot products, this notation becomes much easier to use than vectors and star. It's called bra-ket notation because: $$``\text{bracket}" = ``\text{bra-ket}" = \braket{\;\;|\;\;}$$ $$``\text{bra}" = \bra{\;\;}$$ $$``\text{ket}" = \ket{\;\;}$$ $\vec{a}=\ket{a}=\begin{pmatrix} \underbar a_1\\ \underbar a_2 \\ \underbar a_3\\.\\. \end{pmatrix}$ $\vec a^\dagger = \vec{a}^{\intercal *} =\bra{a}=\begin{pmatrix}\underbar a_1^*& \underbar a_2^* & \underbar a_3^*&.&. &.\end{pmatrix}$ $ \vec{a}^* \cdot\vec{b} =\bra{a}\ket{b} =\braket{a|b} =\begin{pmatrix}\underbar a_1^*& \underbar a_2^* & \underbar a_3^*&.&. 
&.\end{pmatrix} \begin{pmatrix} \underbar b_1\\ \underbar b_2 \\ \underbar b_3\\.\\. \end{pmatrix} $

## everything written in bra-ket
If you're confused about a result, just look at the part that's not written in bra-ket notation and compare.

The wavefunction for light polarization was $\vec a = a_1 e^{i \phi_1} \hat x+ a_2 e^{i \phi_2}\hat y$. Now we can write it as $\ket \psi = \underbar a_1\ket \leftrightarrow + \underbar a_2 \ket \updownarrow$.

For the discrete case of the double slit experiment, the wavefunction was $\vec a = \sum_{i=1}^{10} a_i e^{i \phi_i} \hat x_i$. Now we can write it as $\ket \psi = \sum_{i=1}^{10} \underbar a_i \ket{i}$.

For the continuous case of the double slit experiment, the wavefunction was $\vec a = \int_{-\infty}^{\infty} a(x)e^{i \phi(x)} \hat x(x) \; dx$. Now we can write it as $\ket \psi = \int_{-\infty}^{\infty} \underbar a(x) \ket x dx$.

All of these states $\ket \psi$ were normalized so that the sum of the squared magnitudes $|\underbar a_i|^2$ was $1$. Even the basis states are normalized. Typically, we normalize everything. A state $\ket \psi$ is normalized if and only if
$$ \braket{\psi | \psi} = 1$$
Two states $\ket \phi$ and $\ket \psi$ are orthogonal if and only if
$$ \braket{\phi| \psi } = 0$$
Measurement always takes place in an orthogonal basis. For the 3 examples above, that means $\braket{\leftrightarrow \!| \updownarrow} = 0$, $\braket{i | j} = 0$ for $i \neq j$, and $\braket{x | x'} = 0$ for $x \neq x'$. And a measurement basis is always normalized, i.e. $\braket{i|i}=1$. Putting these two ideas together, a measurement basis $\ket{b_1}, \ket{b_2}, ... 
$ always satisfies
$$ \braket{b_i | b_j} = \delta_{ij}$$
And of course, the probability of measuring any state $\ket \psi$ to be in the state $\ket \phi$ is
$$P\Big[\ket \psi \text{ is measured to be in state } \ket \phi \Big] = \big| \braket{\phi | \psi} \big|^2$$
\box{ A wavefunction is described by a sum or integral over basis vectors: $$\ket \psi = \sum_i \underbar a_i \ket i $$ We assume every wavefunction is normalized (to be clear, this applies to basis vectors too): $$\braket{\psi | \psi}=1$$ We also assume all basis vectors are orthogonal, so that: $$\braket{i|j}=\delta_{ij}$$ The probability of measuring $\ket \psi$ in state $\ket \phi$ is: $$ P\Big[ \ket \psi \text{ is measured to be in state } \ket \phi \Big] = |\braket{\phi|\psi}|^2$$ }

# final word

## global phase
$e^{i \phi} (\underbar a_1\ket \leftrightarrow + \underbar a_2 \ket \updownarrow)$ gives the same exact probability as $(\underbar a_1\ket \leftrightarrow + \underbar a_2 \ket \updownarrow)$. $e^{i \phi}$ is called "global phase" because it adds phase globally to the state. The "global phase" doesn't matter to the probability, because we take an absolute value.

A common question is whether the global phase is unknowable, or whether it's just irrelevant. The answer is that it's irrelevant. The global phase for the photon was the place in the E-field cycle, which we can certainly figure out. It just isn't relevant to probability.

## more intuitions
From \ref{E2} you should easily be able to reason why $\frac{1}{\sqrt 2} (\ket \leftrightarrow + i \ket \updownarrow)$ gives circularly polarized light $\ket \circlearrowleft$. The $\hat y$ or $\ket \updownarrow$ component just lags $90 \degree$ behind the $\hat x$ or $\ket \leftrightarrow$ component! This idea extends to all wavefunctions. In the double slit experiment, you can think of the phase of each basis state as the relative offset of the cosine wave at that position.
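If you'd like to check the global phase and relative phase claims numerically, here's a minimal sketch in plain Python. It isn't part of the derivation. The helper names `braket` and `probability`, the choice to store a ket as a list of its components in the $\ket \leftrightarrow, \ket \updownarrow$ basis, the extra `diagonal` test state, and the angle `0.7` are all illustrative assumptions.

```python
import cmath
import math

def braket(phi, psi):
    """Inner product <phi|psi>: conjugate the bra's components, multiply, sum."""
    return sum(complex(p).conjugate() * complex(q) for p, q in zip(phi, psi))

def probability(phi, psi):
    """Probability that psi is measured to be in state phi: |<phi|psi>|^2."""
    return abs(braket(phi, psi)) ** 2

s = 1 / math.sqrt(2)

# Kets stored as [horizontal component, vertical component] (assumed layout).
horizontal = [1, 0]
vertical = [0, 1]
circular = [s, s * 1j]        # (|H> + i|V>) / sqrt(2)
opposite = [s, -s * 1j]       # same magnitudes, different RELATIVE phase
diagonal = [s, s]             # hypothetical extra state to measure against

# Multiply the whole state by a global phase e^{i*0.7} (arbitrary angle).
global_phase = cmath.exp(1j * 0.7)
shifted = [global_phase * c for c in circular]

# The state is normalized and the basis is orthonormal.
print("<c|c>      =", braket(circular, circular))
print("<H|V>      =", braket(horizontal, vertical))

# Global phase: probabilities are unchanged for any measurement state.
print("P(diag|c)  =", probability(diagonal, circular))
print("P(diag|c') =", probability(diagonal, shifted))

# Relative phase: flipping the sign of i gives an orthogonal state.
print("P(c|opp)   =", probability(circular, opposite))
```

The first pair of probabilities should agree up to floating-point rounding, because the global phase factors out of $\braket{\phi|\psi}$ and disappears under the absolute value. The last probability should come out (approximately) zero. Changing the phase of just one component is a relative phase change, and it really does produce a different state.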
## next time
The next note goes over the more physics-y side of things: position, momentum, spin (Stern-Gerlach experiment), and Schrödinger's equation, which tells you how the wave evolves without measurement, i.e. how it evolves until it reaches the screen. Schrödinger's equation just says that the wavefunction evolves and interferes the way you'd expect - the wave has the same shape as the $\vec E$ field we saw here, and radiates spherically at all points that it occupies. The interesting thing is that the wavefunction really is a full description of the particle. The wavefunction at one instant in time dictates how it will evolve in the future (assuming no measurement takes place).

There's also the less physics-y side of things: quantum computing. Stay tuned.

\box{ That completes the note, although we still have more to go: the Schrödinger equation, the Stern-Gerlach experiment, entanglement, the Bloch sphere, intuitions on quantum teleportation and quantum computing, and more. Let me know if you liked this note, or if there were any places I should improve (just leave a comment!). If you got stuck somewhere, I urge you to leave a question/comment in that location. }

# there's a hidden link below: \problem{https://some.3b1b.co/feedback/ba73d9f7-83ef-4adf-b6c4-95239090a4cf}
https://www.deriveit.org/notes/49