Tuesday, October 27, 2015

Occam+Bayes=Induction

  A classic problem in philosophy and the philosophy of science is how to justify induction. That is, how to rationally go from the fact that X is true in N previously observed cases to the belief that it is true in all cases, or at least in an additional, unobserved case. We will here propose a quick and simple method to justify induction, based on the combination of Occam's razor (to choose hypotheses) and Bayesian inference to update epistemic probabilities.


Notation

Let us introduce the following notation. Let \(H\) be some hypothesis which we want to judge for plausibility. Let \(X_k\) be the fact that \(X\) is true in the kth instance. Let \(X^n\) be the fact that \(X\) is true in the first n cases, that is \[X^n=X_1 \cap X_2 \cap \cdots \cap X_n=\bigcap_{k=1}^{n}X_k\] so that \[X^{n-1}\cap X_n=X^n\] Thus \(P\left ( X^n|H \right ) \) is the (epistemic) probability that we observe X in n cases, supposing H is true, and \(P\left ( H|X^n \right ) \) is the (epistemic) probability that H is true, supposing we observe X to be the case in n cases.


Occam's Razor

There are three basic, simplest hypotheses we can form, all the rest being more complex. These three are the
  • Proinductive (P) hypothesis: the chance of X happening again increases as we see more instances of it.
  • Contrainductive (C) hypothesis: the chance of X happening again decreases as we see more instances of it.
  • Uninductive (U) hypothesis: the chance of X happening again stays the same as we see more instances of it.
For concreteness, let \(F_H(n)=P\left ( X_{n}|H \cap X^{n-1} \right )\). Thus we require that, for \(n \geq 1\) and \(m > 0\), \(F_P(n+m) > F_P(n)\), and \(\lim_{n \rightarrow \infty} F_P(n)=1\), and \(F_C(n+m) < F_C(n)\), and \(\lim_{n \rightarrow \infty} F_C(n)=0\), and \(F_U(n)=F_U(1)\).


Bayesian Inference

We want to find \(P\left ( H|X^n \right ) \) for the hypotheses listed in the previous section. We have \[ P\left ( X^n|H \right )=P\left ( X_n \cap X^{n-1}|H \right )=P\left ( X_n |X^{n-1} \cap H \right ) \cdot P\left ( X^{n-1} |H \right )=F_H(n) \cdot P\left ( X^{n-1} |H \right ) \] Therefore \[ P\left ( X^n|H \right )=\prod_{k=1}^{n} F_H(k) \] Suppose that there are \(N\) mutually exclusive and collectively exhaustive hypotheses. Then, Bayes' formula states: \[ P(H_m|A)=\frac{P(A|H_m)P(H_m)}{P(A|H_1)P(H_1)+P(A|H_2)P(H_2)+\cdots+P(A|H_N)P(H_N)} \] Thus, we have \[ P(H_m|X^n)=\frac{P(X^n|H_m)P(H_m)}{P(X^n|H_1)P(H_1)+P(X^n|H_2)P(H_2)+\cdots+P(X^n|H_N)P(H_N)} \] Therefore \[ P(H_m|X^n)=\frac{P(H_m)\prod_{k=1}^{n} F_{H_m}(k)}{P(H_1)\prod_{k=1}^{n} F_{H_1}(k)+P(H_2)\prod_{k=1}^{n} F_{H_2}(k) + \cdots + P(H_N)\prod_{k=1}^{n} F_{H_N}(k)} \] Let us suppose that the three hypotheses mentioned above are mutually exclusive and collectively exhaustive. Suppose, for concreteness, that \(F_P(n)=\frac{n}{n+1}\), \(F_C(n)=\frac{1}{n+1}\), and \(F_U(n)=\frac{1}{2}\). Thus \(\prod_{k=1}^{n} F_{P}(k)=\frac{1}{n+1}\), and \(\prod_{k=1}^{n} F_{C}(k)=\frac{1}{(n+1)!}\), and \(\prod_{k=1}^{n} F_{U}(k)=\frac{1}{2^n}\). Let \(P(P)=p\) and \(P(C)=q\) and \(P(U)=r\) where \(p+q+r=1\). Then: \[ P(P|X^n)=\frac{p\frac{1}{n+1}}{p\frac{1}{n+1}+q\frac{1}{(n+1)!}+r\frac{1}{2^n}} \] \[ P(C|X^n)=\frac{q\frac{1}{(n+1)!}}{p\frac{1}{n+1}+q\frac{1}{(n+1)!}+r\frac{1}{2^n}} \] \[ P(U|X^n)=\frac{r\frac{1}{2^n}}{p\frac{1}{n+1}+q\frac{1}{(n+1)!}+r\frac{1}{2^n}} \] A simple assessment of limits shows that the first goes to 1 quite rapidly, for increasing n, for any nonzero p, and the other two go to zero. In fact, for \(p=q=r=1/3\), for \(n>10\), \(P(P|X^n)>0.99\), and for \(n>17\), \(P(P|X^n)>0.9999\).
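The posterior formulas above are easy to check numerically. The following is a minimal sketch, assuming the illustrative choices \(F_P\), \(F_C\), \(F_U\) from this section; the function name `posteriors` and its default arguments are made up for this example. It uses exact rational arithmetic so that the thresholds quoted above can be checked without rounding concerns.

```python
from fractions import Fraction
from math import factorial


def posteriors(n, p=Fraction(1, 3), q=Fraction(1, 3), r=Fraction(1, 3)):
    """Return (P(P|X^n), P(C|X^n), P(U|X^n)) for priors p, q, r.

    Assumes the illustrative likelihoods of this section:
    prod F_P = 1/(n+1), prod F_C = 1/(n+1)!, prod F_U = 1/2^n.
    """
    like_p = Fraction(1, n + 1)
    like_c = Fraction(1, factorial(n + 1))
    like_u = Fraction(1, 2 ** n)
    total = p * like_p + q * like_c + r * like_u
    return p * like_p / total, q * like_c / total, r * like_u / total


if __name__ == "__main__":
    # Print the three posteriors for a few sample sizes.
    for n in (1, 5, 11, 18):
        pp, pc, pu = posteriors(n)
        print(n, float(pp), float(pc), float(pu))
```

Changing the priors (for instance giving the proinductive hypothesis a small but nonzero prior) only delays the convergence; it does not change the limit.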

This example is meant to be only illustrative, to show the general way in which Occam's razor, combined with Bayesian inference, leads to a support of induction. The same things happening repeatedly lends credence to the hypothesis that the same things happen repeatedly, and detracts from the hypothesis that the same things are unlikely to happen repeatedly, or always happen with the same probability. In a very similar way, a coin repeatedly coming up heads supports the hypothesis that it is biased to come up heads, and detracts from the hypotheses that it is biased to come up tails or is fair. This may seem obvious, but it is beneficial to see exactly how the mathematical machinery supports this intuition.

We may also wish to include other hypotheses, but we must first assess the prior probabilities that they are true, and Occam's razor advises taking the prior probability to be inversely related to the complexity of the hypothesis. Thus, even if observing n X's is more likely on some other hypothesis than on the three discussed, that hypothesis would need to be more complex or ad hoc, and so would have a significantly lower prior probability.

Friday, October 23, 2015

Some Introductory Quantum Mechanics: Classical Background and Non-Classical Phenomena

  Quantum mechanics (QM) is a theoretical framework that describes the fundamental nature of reality: particles of matter and light, and potentially more besides. QM arose from, and in contrast to, classical mechanics (CM), with many formulations and features still relying heavily on CM ideas. However, several phenomena established that CM cannot be the whole story and would need to be amended. A new theory had to be introduced to account for these phenomena, and it predicted other startling ones as well. However, the best way to interpret the new theory is still disputed.

This will be a multi-part series giving a general introduction to quantum theory.



Classical Mechanics


QM is distinct from CM, though similar in several respects. CM, in general, looks at the behavior of idealized geometrical bodies, rigid, elastic, and fluid. The state is always definite, and in this state, momentum, energy, position and the like are well-defined and definite (we may make an exception for statistical mechanics, but in that case, these quantities may take on distributions only in the sense of an ensemble: it would still be in principle possible to determine these properties for each element in the ensemble, as Maxwell's demon would do). CM is how we tend naively to see the world. Things look like they are definite, spatially constrained, like a bunch of tiny definite parts, or large definite volumes moving along definite paths. This is decidedly not the case in QM.

CM has three main, equivalent formulations: Newtonian, Lagrangian and Hamiltonian.

  • Newtonian: Newtonian mechanics is the typical pedagogical formulation. It deals with the position and velocity of point masses, extended bodies, fluids, etc. in terms of forces, which relate back to position via Newton's second Law (which is really more of a definition). That is, for each asymptotically infinitesimal bit of matter in the system, find the net forces (and torques/stresses), in terms of the positions of the other bits of matter, relate it to the acceleration via Newton's second Law, and then solve the big set of differential equations (or use an iterative approximation method like Runge-Kutta) to find the trajectories of each bit of matter (often the problem is much simplified by various symmetries, homogeneities, localities, redundancies, and conservation considerations).

  • Lagrangian: Lagrangian mechanics deals more in energies, specifically a certain function of time, position and velocity (of all the degrees of freedom) called the Lagrangian, which is typically just the kinetic minus the potential energy. Lagrangian mechanics allows one to deal with constraints in a simpler and more elegant way. Integrating the Lagrangian over time gives the action. The principle of stationary action states that objects move so as to make the action stationary: usually a minimum, though not always. This can be roughly and loosely interpreted as saying that objects go along the "easiest" trajectories. Lagrangians are still used extensively in modern physics, such as in quantum field theory and the path integral formulation.

  • Hamiltonian: Hamiltonian mechanics also deals in energies, specifically a certain function, related to the Lagrangian, called the Hamiltonian, which generally is equal to the total energy of the system. The trajectories are then found via Hamilton's equations, which are a set of differential equations relating changes of the Hamiltonian to changes of the position and momentum. This formalism uses rather abstract notions, such as frames of reference, generalized coordinates, phase space and the like. However, it is one of the most powerful formulations of classical mechanics and serves as one of the basic frameworks for the development of quantum mechanics.

Measurement in CM is intuitive and simple. We measure the position of a thing by looking at where it is and recording that. Measurement need not affect the thing being measured, at least in principle. But even if it can't be done in practice, the information being sought is still there and definite regardless. A hypothetical Laplace's demon could know all the parameters as they really are. This is very plausibly not the case in QM.

As a rule, in CM, if an object requires energy \(E\) to do X, but only has energy \(E'< E\), then the object won't be able to do X. For example, if a marble is in a bowl with sides at a height requiring energy E to overcome (i.e. if the object is of mass m, the height of the sides is \(E/mg\)), but the marble only has energy \(E'< E\), the marble cannot escape the bowl. There is no chance that anyone will ever make a measurement of the position of the marble and have that be outside the bowl. Interestingly, this is not the case in QM.

CM is generally deterministic in a rather strict sense (though there are certain rare exceptions). Given that all of the above formulations are equivalent, they are all reducible to a set of second-order differential equations for the positions. This means that if all initial positions and velocities are known, even if the relevant forces are time-dependent, the trajectory of each object at all future times is unique and determinable. Any apparent indeterminism is merely apparent, namely epistemic. Assigning probabilities to different states or outcomes is done not because the state is ill-defined or there is some amount of indeterminism that emerges somehow. Rather, it is due to not knowing the initial state or not knowing how the system evolves. Were we to know completely the initial state and how it evolves, there would be no indeterminism. Moreover, any correlations arise from definite facts of which we are merely ignorant. For instance, if we have two marbles, one of mass 100g and one of mass 105g, and give one to one experimenter and the other to another, without either knowing which he received, then once one experimenter weighs his marble, he immediately knows the weight of the other marble, even if it is very far away. We will find that this is not the case in QM.

A further development of CM was the inclusion of electromagnetic phenomena. These were incorporated in Maxwell's equations, which describe how electromagnetic fields are generated and changed by charges and currents. In essence, there is a ubiquitous, continuous electromagnetic field, which can be excited and disturbed in various ways, producing effects like radiation and induction (which lend themselves to a huge array of engineering and technological applications). A relatively simple theorem of electromagnetic theory is that accelerating charges radiate energy. This is most easily seen as being due to producing electric fields of varying strengths, combined with the fact that electromagnetic changes travel at a finite speed. For example, an oscillating charge will produce fields now weaker now stronger as it moves closer and further from a point. If we put a charge on a spring a distance away, it would begin oscillating, too, due to the varying force acting on it. Thus we could extract energy from the oscillating charge, and so it must be radiating energy, and so its oscillations will gradually decay. (Note that this implies that charges in orbit around one another will gradually radiate off their energy and fall into one another.) One of the outcomes of Maxwell's electromagnetic theory was the demonstration that light was electromagnetic in nature: electromagnetic disturbances propagated at the speed of light, and thinking of light as electromagnetic radiation accounted for a huge array of optical phenomena.

Also, electromagnetism is decidedly a wave-theory. The electromagnetic field is continuous and ubiquitous: it doesn't come in discrete "chunks" or "lumps" and it can have any value. It can have arbitrary energy (or energy density, as the case may be). This is opposed to particles, objects like little marbles, with definite extents and centers. When particles move, the stuff they are made of literally goes from one place to another. Whereas, when a wave moves, the field in one place increases, and decreases in another place: the pattern as opposed to the substance moves. Waves display interference effects: two waves could interfere constructively (increasing the size of the wave) or destructively (decreasing the size of the wave), whereas this seems impossible for particles. Destructive interference for particles would mean that when two particles came together, suddenly there was less substance there. We will return to this in discussing the two-slit experiment below.



Non-Classical Phenomena


There were several phenomena that indicated that CM was not the whole story, that it failed to give a full description of the world. These then paved the way for the development of QM.
  • Millikan's and Rutherford's Experiments

    Millikan discovered, by a very ingenious experiment, that charge was quantized, i.e. it came in "chunks" or "lumps". There was a smallest unit of charge. The existence of electrons as objects with a definite mass had already been discovered by Thomson, experimenting with cathode ray tubes, but it was not known whether electrons had a definite, single charge. Millikan found that charge only came in integer multiples of the fundamental charge, known to be about \(1.6 \times 10^{-19} \mathrm{C}\). Rutherford then demonstrated that the atom was structured, not as Thomson supposed, like a plum pudding, but rather with a small, dense, positively charged nucleus with the electrons in some arrangement around it.

  • Stability and Discrete Radiation of the Atom

    Rutherford's model of the atom (as well as any similar model) is impossible, according to classical electromagnetic theory. As discussed above, orbiting charges cannot persist indefinitely, as they will radiate off energy, and the orbit will eventually decay, the particles eventually colliding. As this clearly does not happen, there must be some modification to the understanding of the atom. In addition, it was noticed that an excited atom only emitted radiation at definite frequencies, not in a continuous spectrum. In the case of hydrogen, the radiation frequencies followed a very simple pattern. This behavior, however, could not be accounted for on classical mechanics, as the electron orbiting the nucleus could potentially have any energy. Moreover, if the electron could only have certain definite energies, it became difficult to see how it could go from one definite energy to another without taking on the intermediate energies. Clearly classical theory would have to be modified to allow for this.

  • Photoelectric Effect

    It was observed that shining light on a metal induced a current. This by itself was predictable by CM, given the understanding that the metal had electrons in it, and when light shone on the metal, some electrons absorbed the energy and so were able to escape the metal to produce a current. However, according to CM, the energy of the light depended solely on the amplitude (i.e. brightness): it would not depend on the frequency (i.e. color) of the light used. Also, for sufficiently dim light, there should be a lag time between when the light comes on and when electrons are emitted, due to the electrons needing to absorb a sufficient amount of light energy. However, neither of these predictions was correct: very bright light of sufficiently low frequency induced no current. And at sufficiently high frequencies, regardless of how dim the light was, the current began immediately, with no delay. This led Einstein correctly to conclude that light was quantized, in units called photons. The energy of each photon was related to the frequency of the light. The brighter the light, the greater the number of photons per unit time. This would entail that for light of a low frequency, even if bright, no electrons would be ejected from the metal, as each photon lacks enough energy to eject an electron, and the chance of multiple photons hitting the same electron is negligible (and the energy that is absorbed is dissipated as heat in the meantime). Moreover, for high enough frequencies, the energy per electron is linear with respect to frequency, with slope \(h= 6.626 \times 10^{-34} \mathrm{J}\cdot \mathrm{s}\), known as Planck's constant (the current, however, depends on the brightness of the light). This leads to the conclusion that the energy of each photon is given by \(E=hf\).

  • Black Body Radiation

    A black body is defined as a perfect radiating source: it absorbs all radiation that falls on it, at a constant temperature. Such a body is known to radiate electromagnetic radiation, but finding and making sense of the spectrum of such a body is non-trivial. According to classical electromagnetic theory, the amount of radiation produced is expected to be proportional to the square of the frequency. That is, the higher the frequency, the more radiation. This is clearly not what happens in nature: otherwise hot objects would emit huge amounts of X-rays and gamma rays, and would instantaneously reach absolute zero, transforming all the thermal energy into electromagnetic radiation, as the total radiation is unbounded. However, Planck found that, by postulating that electromagnetic radiation was quantized as photons, with energies given by \(E=hf\), the total radiation was bounded, and tailed off at higher frequencies. The resulting formula is well borne out by experiments, lending support to his postulation.

  • Double Slit Experiment

    An experiment was performed in which a very dim coherent light source was placed in front of a photographic plate, behind an opaque plate with two narrow slits. The light source was so dim that it emitted no more than one photon at a time. What was found was very strange, according to classical mechanics. The photographic plate produced a pattern of spots where each photon hit it, indicating that the light had been behaving like particles. However the pattern produced is what the classical wave theory predicted: an interference pattern. Had the photons been acting like genuine classical particles, a different pattern would have emerged, one with only two peaks as opposed to many. Classical theory had no way to account for this. In addition, whenever any sort of measuring apparatus was put in place to detect which slit the photon passed through (if it was behaving like a classical particle, it would need to have a definite position and hence pass through a definite slit), the wave-pattern disappeared and a particle-pattern emerged. Classical physics has no way to explain this. Moreover, the experiment has these same features, even when performed with electrons, atoms and even molecules. In each case, the interference pattern produced is consistent with thinking of each object as if it were a wave with wavelength \(\lambda=h/p\), where p is the momentum of the object. More generally, \(\mathbf{p}=\frac{h}{2\pi}\mathbf{k}\), where \(\mathbf{k}\) is the wave vector (a sort of generalized, multidimensional wavelength). In fact, the quantity \(\frac{h}{2\pi}\) comes up so frequently that it is given its own symbol: \(\hbar\).

  • Stern-Gerlach Experiment

    It was noticed that when a stream of certain atoms passed through an inhomogeneous magnetic field, the stream separated into several beams, two in the case of silver atoms. This demonstrated not only that the atoms had a magnetic dipole moment, but also that this moment was quantized, as otherwise it would have produced a smear, as opposed to several beams. The magnetic moment was correctly attributed to the charged particles in the atom, in particular the electrons. This implied that the electron had angular momentum. In classical mechanics, an object has angular momentum purely in terms of its structure and rotation. For example, a wheel has angular momentum given its distribution of mass combined with its rotation. A point particle in classical mechanics cannot have angular momentum. Thus, as the electron was not known to have any internal structure, nor any literal rotation, the angular momentum could not be accounted for by classical physics. The angular momentum was thus given the name spin. An electron always has a measured angular momentum of either \(+\hbar/2\) (called spin up) or \(-\hbar/2\) (called spin down), relative to the axis of measurement. This itself is non-classical: classically, if an object has angular momentum about a certain axis, its angular momentum about an orthogonal axis will be zero, but electrons are never measured to have zero spin.

  • Apparent Indeterminacy

    Suppose we have an electron with measured spin up along the x-axis. If it is measured along the y-axis, it will be found to have either spin up or spin down along that axis. Moreover, the spin measured along that axis will appear to be perfectly random: the results of such an experiment pass every known test for statistical randomness. This feature arises often in similar cases. For instance, in the two-slit experiment, where the next photon (or electron) hits the screen is also apparently random. A half-silvered mirror is a common device in optics, which transmits half the light shone on it and reflects the other half. However, if we put two detectors at points where transmitted and reflected light would go, and shine very dim light on it, such that no more than one photon is reaching the half-silvered mirror at a time, the pattern of detectors registering will also be apparently random. The pattern of detection passes every known test for statistical randomness. This type of behavior is very different from the usual CM sort. This apparent indeterminacy or randomness is a major aspect of quantum mechanics, and underlies many of the disputes and misunderstandings surrounding it.
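Several of the quantitative relations in the list above, namely \(E=hf\) for the photoelectric effect, the classical versus quantized black body spectra, and \(\lambda=h/p\), can be tried out numerically. The following is a rough sketch only. Apart from \(h\), the physical constants are standard textbook values that were not quoted in the text, and the work function (the minimum energy needed to free an electron), the light frequencies and the electron speed are hypothetical values chosen for illustration. The two spectral formulas are the usual textbook forms of the Rayleigh-Jeans and Planck laws, which the text describes only qualitatively.

```python
import math

H = 6.626e-34             # Planck's constant in J*s (value quoted in the text)
HBAR = H / (2 * math.pi)  # reduced Planck constant
K_B = 1.381e-23           # Boltzmann constant in J/K (assumed standard value)
C = 2.998e8               # speed of light in m/s (assumed standard value)
EV = 1.602e-19            # joules per electronvolt (assumed standard value)
M_E = 9.109e-31           # electron mass in kg (assumed standard value)


def max_kinetic_energy(frequency, work_function):
    """Photoelectric effect: surplus energy h*f - W, or None below threshold."""
    surplus = H * frequency - work_function
    return surplus if surplus > 0 else None


def rayleigh_jeans(frequency, temperature):
    """Classical spectral radiance, growing as the square of the frequency."""
    return 2.0 * frequency**2 * K_B * temperature / C**2


def planck(frequency, temperature):
    """Planck spectral radiance, which tails off at high frequency."""
    x = H * frequency / (K_B * temperature)
    return 2.0 * H * frequency**3 / C**2 / math.expm1(x)


def de_broglie_wavelength(momentum):
    """Wavelength associated with momentum p via lambda = h / p."""
    return H / momentum


if __name__ == "__main__":
    work_function = 2.3 * EV  # hypothetical metal, for illustration only
    print(max_kinetic_energy(4.3e14, work_function))  # red light
    print(max_kinetic_energy(1.0e15, work_function))  # ultraviolet light
    for f in (1e12, 1e13, 1e14, 1e15):
        print(f, rayleigh_jeans(f, 5000.0), planck(f, 5000.0))
    p = M_E * 1.0e6  # electron at a hypothetical 10^6 m/s
    print(de_broglie_wavelength(p))
```

At low frequencies the two spectra agree closely, while at high frequencies the Planck form is exponentially suppressed, which is what keeps the total radiation bounded.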

Tuesday, October 20, 2015

Product Formula for Sine and Some Interesting Corollaries

 

Deriving the Product Formula: The Easy Way


Recall from this post that: \[ \sum_{n=1}^{\infty} \frac{1}{x^2+n^2}=\frac{\pi}{2x} \coth(\pi x)-\frac{1}{2x^2} \] We then substitute \(x=i z\): \[ \sum_{n=1}^{\infty} \frac{1}{n^2-z^2}=-\frac{\pi}{2z} \cot(\pi z)+\frac{1}{2z^2} \] We then go down the following line of calculation: \[ \sum_{n=1}^{\infty} \frac{2z}{n^2-z^2}=\frac{1}{z}-\pi\cot(\pi z) \] \[ \int\sum_{n=1}^{\infty} \frac{2z}{n^2-z^2}dz=C+\int \frac{1}{z}-\pi\cot(\pi z) dz \] \[ \sum_{n=1}^{\infty} -\ln \left (1-\frac{z^2}{n^2} \right )=C+\ln (z) - \ln (\sin (\pi z) ) \] \[ \sin(\pi z)=C' z\prod_{n=1}^{\infty}\left ( 1-\frac{z^2}{n^2} \right ) \] We can find \(C'\) by looking at the behavior near zero, and so find that: \[ \sin(\pi z)=\pi z\prod_{n=1}^{\infty}\left ( 1-\frac{z^2}{n^2} \right ) \] Therefore: \[ \sin(z)=z\prod_{n=1}^{\infty}\left ( 1-\frac{z^2}{\pi^2 n^2} \right ) \]
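As a sanity check on the final formula, one can compare a truncated product with the library sine. This is a small sketch; the name `sine_product` and the default number of factors are arbitrary choices for the example. The truncation error shrinks only like \(x^2/(\pi^2 N)\) for \(N\) factors, so convergence is slow.

```python
import math


def sine_product(x, terms=100_000):
    """Truncated product x * prod_{n=1}^{terms} (1 - x^2 / (pi^2 n^2))."""
    prod = x
    for n in range(1, terms + 1):
        prod *= 1.0 - (x / (math.pi * n)) ** 2
    return prod


if __name__ == "__main__":
    # Compare the truncated product with math.sin at a few points.
    for x in (0.5, 1.0, 2.0, 3.0):
        print(x, sine_product(x), math.sin(x))
```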



Deriving the Product Formula: The Overkill Way, by Weierstrass' Factorization Theorem


Suppose a function can be expressed as \[ f(x)=A\frac{\prod_{n=1}^{M}\left ( x-z_n \right )}{\prod_{n=1}^{N}\left ( x-p_n \right )} \] Where \(M \leq N\) and \(N\) can be arbitrarily large, even tending to infinity. Assuming there are no poles of degree >1 (all poles are simple), we can rewrite this as \[ f(x)=K+\sum_{n=1}^{\infty} \frac{b_n}{x-p_n} \] Where some of the \(b_n\) may be zero. We can also write this as \[ f(x)=f(0)+\sum_{n=1}^{\infty} b_n \cdot \left ( \frac{1}{x-p_n}+\frac{1}{p_n} \right ) \] Suppose \(f(0) \neq 0\), and that \(f\) is an integral function (i.e. an entire function). In that case, the logarithmic derivative \(f'(x)/f(x)\) has poles of degree 1. Moreover, \[\lim_{x \rightarrow z_n} (x-z_n)\frac{f'(x)}{f(x)}=d_n \] Where \(d_n\) is the degree of the zero at \(z_n\). Thus: \[ \frac{f'(x)}{f(x)}=\frac{f'(0)}{f(0)}+\sum_{n=1}^{\infty} d_n \cdot \left ( \frac{1}{x-z_n}+\frac{1}{z_n} \right ) \] Integrating: \[ \ln(f(x))=\ln(f(0))+x \frac{f'(0)}{f(0)}+\sum_{n=1}^{\infty} d_n \cdot \left ( \ln \left (1-\frac{x}{z_n} \right ) +\frac{x}{z_n} \right ) \] \[ f(x)=f(0) e^{x \frac{f'(0)}{f(0)}} \prod_{n=1}^{\infty} \left (1-\frac{x}{z_n} \right )^{d_n} e^{x\frac{d_n}{z_n}} \] This is our main result, called the Weierstrass factorization theorem. In particular, for the function \(f(x)=\sin(x)/x\) \[ \frac{\sin(x)}{x}=\prod_{n=-\infty, n \neq 0}^{\infty} \left (1-\frac{x}{n \pi} \right ) e^{x\frac{1}{n \pi}}=\prod_{n=1}^{\infty} \left (1-\frac{x^2}{n^2 \pi^2} \right ) \] Thus \[ \sin(x)=x\prod_{n=1}^{\infty} \left (1-\frac{x^2}{\pi^2 n^2 } \right ) \]



Corollary 1: Wallis Product


Let us plug in \(x=\pi/2\): \[ \sin(\pi/2)=1=\frac{\pi}{2}\prod_{n=1}^{\infty} \left (1-\frac{1}{4 n^2 } \right ) \] \[ \pi=2\prod_{n=1}^{\infty} \left (\frac{4 n^2}{4 n^2-1 } \right )=2\frac{2 \cdot 2}{1 \cdot 3} \cdot \frac{4 \cdot 4}{3 \cdot 5} \cdot \frac{6 \cdot 6}{5 \cdot 7} \cdot \frac{8 \cdot 8}{7 \cdot 9} \cdots \] More generally: \[ \pi=\frac{N}{M} \sin(\pi M/N) \prod_{n=1}^{\infty} \left (\frac{N^2 n^2}{N^2 n^2 -M^2} \right ) \] This is useful when \(\sin(\pi M/N)\) is easily computable, such as when \(\sin(\pi M/N)\) is algebraic (e.g. \(M=1\), \(N=2^m\) ). For example: \[ \pi=2 \sqrt{2} \prod_{n=1}^{\infty} \left (\frac{4^2 n^2}{4^2 n^2 -1^2} \right ) \] \[ \pi=\frac{2}{3} \sqrt{2} \prod_{n=1}^{\infty} \left (\frac{4^2 n^2}{4^2 n^2 -3^2} \right ) \] \[ \pi=\frac{3}{2} \sqrt{3} \prod_{n=1}^{\infty} \left (\frac{3^2 n^2}{3^2 n^2 -1^2} \right ) \] \[ \pi=\frac{3}{4} \sqrt{3} \prod_{n=1}^{\infty} \left (\frac{3^2 n^2}{3^2 n^2 -2^2} \right ) \] \[ \pi=3 \prod_{n=1}^{\infty} \left (\frac{6^2 n^2}{6^2 n^2 -1^2} \right ) \] \[ \pi=\frac{3}{5} \prod_{n=1}^{\infty} \left (\frac{6^2 n^2}{6^2 n^2 -5^2} \right ) \] \[ \pi=3\sqrt{2}(-1+\sqrt{3}) \prod_{n=1}^{\infty} \left (\frac{12^2 n^2}{12^2 n^2 -1^2} \right ) \]
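These products can be evaluated numerically to see how quickly they approach \(\pi\). The sketch below assumes the algebraic value of \(\sin(\pi M/N)\) is supplied by the caller (so that \(\pi\) is not used to compute itself); the function name `pi_from_sine_product` is made up for this example.

```python
import math


def pi_from_sine_product(M, N, sine_value, terms=100_000):
    """Approximate pi as (N/M) * sin(pi*M/N) * prod N^2 n^2 / (N^2 n^2 - M^2).

    sine_value must be the known value of sin(pi * M / N).
    """
    estimate = (N / M) * sine_value
    for n in range(1, terms + 1):
        estimate *= (N * n) ** 2 / ((N * n) ** 2 - M ** 2)
    return estimate


if __name__ == "__main__":
    print(pi_from_sine_product(1, 2, 1.0))               # classic Wallis product
    print(pi_from_sine_product(1, 4, math.sqrt(2) / 2))  # M=1, N=4
    print(pi_from_sine_product(1, 6, 0.5))               # M=1, N=6
```

The relative error after \(T\) factors is roughly \(M^2/(N^2 T)\), so a smaller ratio \(M/N\) converges somewhat faster, though all of these remain slow ways to compute \(\pi\).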



Corollary 2: Product Formula for Cosine


Let us evaluate the sine formula at \(x+\pi/2\): \[ \sin(x+\pi/2)=\cos(x)=\left (x+\frac{\pi}{2} \right )\prod_{n=-\infty, n \neq 0}^{\infty} \left (1-\frac{x+\pi/2}{\pi n } \right ) \] \[ \cos(x)=\frac{\sin(x+\pi/2)}{\sin(\pi/2)}=\left (1+\frac{x}{\pi/2} \right )\prod_{n=-\infty, n \neq 0}^{\infty} \frac{\left (1-\frac{x+\pi/2}{\pi n } \right )}{\left (1-\frac{\pi/2}{\pi n } \right )} \] \[ \cos(x)=\left (1+\frac{x}{\pi/2} \right )\prod_{n=-\infty, n \neq 0}^{\infty} \left (1-\frac{x}{\pi (n-1/2) } \right )=\prod_{n=-\infty}^{\infty} \left (1-\frac{x}{\pi (n-1/2) } \right ) \] \[ \cos(x)=\prod_{n=1}^{\infty} \left (1-\frac{x^2}{\pi^2 (n-1/2)^2 } \right ) \] Alternatively, we can derive this directly from the Weierstrass factorization theorem.
Additionally, by using imaginary arguments, we can derive the formulae: \[ \sinh(x)=x\prod_{n=1}^{\infty} \left (1+\frac{x^2}{\pi^2 n^2 } \right ) \] \[ \cosh(x)=\prod_{n=1}^{\infty} \left (1+\frac{x^2}{\pi^2 (n-1/2)^2 } \right ) \]
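The cosine and hyperbolic products can be checked in the same numerical spirit. This is only a sketch with made-up function names, and truncating the products gives just a few digits of agreement.

```python
import math


def cos_product(x, terms=100_000):
    """Truncated prod (1 - x^2 / (pi^2 (n - 1/2)^2))."""
    prod = 1.0
    for n in range(1, terms + 1):
        prod *= 1.0 - (x / (math.pi * (n - 0.5))) ** 2
    return prod


def sinh_product(x, terms=100_000):
    """Truncated x * prod (1 + x^2 / (pi^2 n^2))."""
    prod = x
    for n in range(1, terms + 1):
        prod *= 1.0 + (x / (math.pi * n)) ** 2
    return prod


def cosh_product(x, terms=100_000):
    """Truncated prod (1 + x^2 / (pi^2 (n - 1/2)^2))."""
    prod = 1.0
    for n in range(1, terms + 1):
        prod *= 1.0 + (x / (math.pi * (n - 0.5))) ** 2
    return prod


if __name__ == "__main__":
    x = 1.0
    print(cos_product(x), math.cos(x))
    print(sinh_product(x), math.sinh(x))
    print(cosh_product(x), math.cosh(x))
```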



Corollary 3: Sine is Periodic


Let us evaluate the sine formula at \(x+\pi\): \[ \sin(x+\pi)=\left (x+\pi \right )\prod_{n=-\infty, n \neq 0}^{\infty} \left (1-\frac{x+\pi}{\pi n } \right ) \] \[ \sin(x+\pi)=\cdots \left (1+\frac{x+\pi}{3\pi} \right ) \left (1+\frac{x+\pi}{2\pi} \right )\left (1+\frac{x+\pi}{\pi} \right )\left (x+\pi \right ) \left (1-\frac{x+\pi}{\pi} \right )\left (1-\frac{x+\pi}{2\pi} \right ) \left (1-\frac{x+\pi}{3\pi} \right ) \cdots \] \[ \sin(x+\pi)=\cdots \left (\frac{4}{3}+\frac{x}{3\pi} \right ) \left (\frac{3}{2}+\frac{x}{2\pi} \right )\left (2+\frac{x}{\pi} \right ) \pi \left (1+\frac{x}{\pi}\right ) \left (\frac{-x}{\pi} \right )\left (\frac{1}{2}-\frac{x}{2\pi} \right ) \left (\frac{2}{3}-\frac{x}{3\pi} \right ) \cdots \] \[ \sin(x+\pi)=\cdots \frac{4}{3}\left (1+\frac{x}{4\pi} \right ) \frac{3}{2}\left (1+\frac{x}{3\pi} \right )2\left (1+\frac{x}{2\pi} \right ) \pi \left (1+\frac{x}{\pi}\right ) \left (\frac{-x}{\pi} \right ) \frac{1}{2}\left (1-\frac{x}{\pi} \right ) \frac{2}{3}\left (1-\frac{x}{2\pi} \right ) \cdots \] \[ \sin(x+\pi)=-2x\left ( \prod_{k=2}^{\infty} \frac{k^2-1}{k^2} \right ) \left ( \prod_{n=1}^{\infty} \left (1-\frac{x^2}{n^2 \pi^2} \right ) \right )=-\sin(x) \] As the first product easily telescopes. Thus \(\sin(x+2\pi)=\sin((x+\pi)+\pi)=-\sin(x+\pi)=\sin(x)\). Therefore, sine is periodic with period \(2\pi\).
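The telescoping step can be made explicit: the partial product up to \(K\) equals \((K+1)/(2K)\), which tends to \(1/2\), so the prefactor \(-2x\) becomes \(-x\). A short exact check, with a made-up function name:

```python
from fractions import Fraction


def telescoping_partial_product(K):
    """Exact value of prod_{k=2}^{K} (k^2 - 1) / k^2."""
    prod = Fraction(1)
    for k in range(2, K + 1):
        prod *= Fraction(k * k - 1, k * k)
    return prod


if __name__ == "__main__":
    # Each partial product should equal (K + 1) / (2K).
    for K in (2, 10, 100, 1000):
        value = telescoping_partial_product(K)
        print(K, value, value == Fraction(K + 1, 2 * K))
```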



Corollary 4: Some Zeta Values


Let us begin expanding the product for sine in a power series \[ \sin(x)=x\prod_{n=1}^{\infty} \left (1-\frac{x^2}{\pi^2 n^2 } \right )=x-\frac{x^3}{\pi^2}\left (\frac{1}{1^2}+\frac{1}{2^2}+\cdots \right )+\frac{x^5}{\pi^4}\left (\frac{1}{1^2 \cdot2^2}+\frac{1}{1^2 \cdot3^2}+\cdots +\frac{1}{2^2 \cdot3^2}+\frac{1}{2^2 \cdot4^2}+\cdots \right )+\cdots \] \[ \sin(x)=x-\frac{x^3}{\pi^2}\left (\sum_{k=1}^{\infty}\frac{1}{k^2} \right )+\frac{x^5}{\pi^4}\left (\sum_{m=1,n=1, m < n}^{\infty}\frac{1}{m^2n^2} \right )+\cdots \] \[ \sin(x)=x-\frac{x^3}{\pi^2}\left (\sum_{k=1}^{\infty}\frac{1}{k^2} \right )+\frac{x^5}{2\pi^4}\left (\left (\sum_{k=1}^{\infty}\frac{1}{k^2} \right )^2- \sum_{k=1}^{\infty}\frac{1}{k^4} \right )+\cdots \] By comparing this to the Taylor series for sine, we find: \[ \frac{1}{3!}=\frac{1}{\pi^2}\left (\sum_{k=1}^{\infty}\frac{1}{k^2} \right ) \] \[ \frac{1}{5!}=\frac{1}{2\pi^4}\left (\left (\sum_{k=1}^{\infty}\frac{1}{k^2} \right )^2- \sum_{k=1}^{\infty}\frac{1}{k^4} \right ) \] From which it follows that \[ \sum_{k=1}^{\infty}\frac{1}{k^2}=\frac{\pi^2}{6} \] \[ \sum_{k=1}^{\infty}\frac{1}{k^4}=\frac{\pi^4}{90} \] In fact, for the fourth term, we find, similarly, that \[ \frac{1}{7!}=\frac{1}{6\pi^6}\left ( \left (\sum_{k=1}^{\infty}\frac{1}{k^2} \right )^3-3\left (\sum_{k=1}^{\infty}\frac{1}{k^2} \right )\left (\sum_{k=1}^{\infty}\frac{1}{k^4} \right )+2\left (\sum_{k=1}^{\infty}\frac{1}{k^6} \right ) \right ) \] From which it follows that \[ \sum_{k=1}^{\infty}\frac{1}{k^6}=\frac{\pi^6}{945} \]
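The arithmetic in the three coefficient comparisons can be verified exactly. The sketch below writes each sum as a rational multiple of the matching power of \(\pi\), that is \(\sum 1/k^{2j}=z_{2j}\pi^{2j}\), and solves the three equations above in turn; the names `z2`, `z4`, `z6` are introduced only for this example.

```python
from fractions import Fraction
from math import factorial


def zeta_coefficients():
    """Return (z2, z4, z6) with sum 1/k^(2j) = z_(2j) * pi^(2j)."""
    # 1/3! = z2
    z2 = Fraction(1, factorial(3))
    # 1/5! = (z2^2 - z4) / 2
    z4 = z2**2 - 2 * Fraction(1, factorial(5))
    # 1/7! = (z2^3 - 3 z2 z4 + 2 z6) / 6
    z6 = (6 * Fraction(1, factorial(7)) - z2**3 + 3 * z2 * z4) / 2
    return z2, z4, z6


if __name__ == "__main__":
    print(zeta_coefficients())
```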

Saturday, October 10, 2015

Derivation of a Formula for the Even Values of the Riemann Zeta Function

 

Lemma 1: Fourier Series of the Dirac Comb


A Dirac comb of period T is defined as \[{\mathrm{III}}_T(x)=\sum_{k=-\infty}^{\infty} \delta(x-kT)\] Where \(\delta(x)\) is the Dirac delta function. Since the Dirac comb is periodic with period T, we can expand it as a Fourier series: \[\sum_{k=-\infty}^{\infty} \delta(x-kT)=\sum_{n=-\infty}^{\infty} A_n e^{i 2 \pi n x/T}\] We solve for the \(A_m\) in the usual way: \[ \int_{-T/2}^{T/2}\sum_{k=-\infty}^{\infty} \delta(x-kT)e^{-i 2 \pi m x/T} dx=1=\int_{-T/2}^{T/2}\sum_{n=-\infty}^{\infty} A_n e^{i 2 \pi (n-m) x/T} dx=T\cdot A_m \]\[ A_m=1/T \] Thus: \[\sum_{k=-\infty}^{\infty} \delta(x-kT)=\frac{1}{T}\sum_{n=-\infty}^{\infty} e^{i 2 \pi n x/T}\]



Lemma 2: An Infinite Series


\[ \sum_{n=-\infty}^{\infty} \frac{1}{x+i n}=\frac{1}{x}+\sum_{n=1}^{\infty} \left ( \frac{1}{x+i n}+\frac{1}{x-i n} \right )=\frac{1}{x}+2x\sum_{n=1}^{\infty} \frac{1}{x^2+n^2} \]\[ \sum_{n=-\infty}^{\infty} \frac{1}{x+i n}=\int_{0}^{\infty} \sum_{n=-\infty}^{\infty} e^{-y(x+i n)} dy \]\[ \sum_{n=-\infty}^{\infty} \frac{1}{x+i n}=\int_{0}^{\infty} e^{-yx} \sum_{n=-\infty}^{\infty} e^{-iyn} dy \]\[ \sum_{n=-\infty}^{\infty} \frac{1}{x+i n}=2\pi \int_{0}^{\infty} e^{-yx} \sum_{k=-\infty}^{\infty} \delta(y-2\pi k) dy \]\[ \sum_{n=-\infty}^{\infty} \frac{1}{x+i n}=2\pi \left (\frac{1}{2}+ \sum_{k=1}^{\infty} e^{-2\pi k x} \right ) \]\[ \sum_{n=-\infty}^{\infty} \frac{1}{x+i n}=2\pi \left (\frac{1}{2}+ \frac{e^{-2\pi x}}{1-e^{-2\pi x}} \right )= \pi \frac{e^{2\pi x}+1}{e^{2\pi x}-1} \] Here the fourth line uses Lemma 1 with period \(2\pi\), and the term \(\frac{1}{2}\) arises because the delta function at \(y=0\) sits on the boundary of the integration range. Therefore, combining the first and last expressions and rearranging, we find: \[ \sum_{n=1}^{\infty} \frac{1}{x^2+n^2}=\frac{\pi}{2x} \frac{e^{2\pi x}+1}{e^{2\pi x}-1}-\frac{1}{2x^2}=\frac{\pi}{2x} \coth(\pi x)-\frac{1}{2x^2} \] Additionally, by taking the limit as x approaches zero, we find: \[ \sum_{n=1}^{\infty} \frac{1}{n^2}=\frac{\pi^2}{6} \]
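Since the derivation handles delta functions rather freely, a numerical check of the closed form is reassuring. The following sketch compares a truncated sum with the closed form; the function names and the number of terms are arbitrary, and the truncated sum is short of the true value by roughly \(1/N\) for \(N\) terms.

```python
import math


def partial_sum(x, terms=200_000):
    """Truncated sum_{n=1}^{terms} 1 / (x^2 + n^2)."""
    return sum(1.0 / (x * x + n * n) for n in range(1, terms + 1))


def closed_form(x):
    """pi/(2x) * coth(pi x) - 1/(2 x^2), valid for x != 0."""
    return math.pi / (2.0 * x) / math.tanh(math.pi * x) - 1.0 / (2.0 * x * x)


if __name__ == "__main__":
    for x in (0.5, 1.0, 3.0):
        print(x, partial_sum(x), closed_form(x))
    # For small x the closed form approaches pi^2 / 6.
    print(closed_form(1e-3), math.pi**2 / 6)
```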



Theorem: Formula for the Even Values of the Riemann Zeta Function


Recall that, by definition: \[ \zeta(n)=\sum_{k=1}^{\infty}\frac{1}{k^n} \] Let us then analyze \[ f(x)=1-\frac{x}{2}+\sum_{n=2}^{\infty}\frac{x^{n}}{n!} A_{n} \] Where \[ A_n=-2 \cdot n! \cdot \cos(n\pi/2) \cdot 2^{-n}\pi^{-n} \zeta(n) \] Thus: \[ f(x)=1-\frac{x}{2}-2\sum_{n=1}^{\infty}\left (\frac{-x^2}{4\pi^2} \right )^n \zeta(2n) \]\[ f(x)=1-\frac{x}{2}-2\sum_{n=1}^{\infty}\left (\frac{-x^2}{4\pi^2} \right )^n \sum_{k=1}^{\infty}\frac{1}{k^{2n}} \]\[ f(x)=1-\frac{x}{2}-2\sum_{k=1}^{\infty}\sum_{n=1}^{\infty}\left (\frac{-x^2}{4\pi^2 k^2} \right )^n \]\[ f(x)=1-\frac{x}{2}-2\sum_{k=1}^{\infty} \frac{-x^2}{4\pi^2 k^2}\frac{1}{1+\frac{x^2}{4\pi^2 k^2}} \]\[ f(x)=1-\frac{x}{2}+\frac{x^2}{2\pi^2}\sum_{k=1}^{\infty} \frac{1}{k^2+\frac{x^2}{4\pi^2}} \]\[ f(x)=1-\frac{x}{2}+\frac{x^2}{2\pi^2} \left ( \frac{\pi^2}{x} \frac{e^x+1}{e^x-1} -\frac{2\pi^2}{x^2} \right ) \]\[ f(x)=\frac{x}{2} \left ( \frac{e^x+1}{e^x-1} -1 \right )=\frac{x}{e^x-1} \] Therefore, for n>1, \[ A_n=\lim_{x \rightarrow 0} \frac{\mathrm{d}^n }{\mathrm{d} x^n} \frac{x}{e^x-1} \] These numbers are called the Bernoulli Numbers, symbolized as \(B_n\) and they are easily found to be all rational. Thus, by rearranging, we find: \[ \zeta(2n)=\frac{\pi^{2n} 2^{2n-1} \left | B_{2n} \right |} {(2n)!} \] Thus, all the even values of the zeta function can be found by finding the appropriate Bernoulli number, which itself can be found by simple differentiation. Moreover, we see that all the values are rational multiples of the corresponding power of pi. Specifically, we find that: \[ \zeta(2)=\frac{\pi^2}{6} \]\[ \zeta(4)=\frac{\pi^4}{90} \]\[ \zeta(6)=\frac{\pi^6}{945} \]\[ \zeta(8)=\frac{\pi^8}{9450} \]\[ \zeta(10)=\frac{\pi^{10}}{93555} \]
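The final formula can be exercised with exact arithmetic. Note that the sketch below does not differentiate \(x/(e^x-1)\) as described above; instead it swaps in the standard recurrence \(\sum_{k=0}^{m}\binom{m+1}{k}B_k=0\) for \(m \geq 1\) with \(B_0=1\), which yields the same numbers (with \(B_1=-1/2\)). The function names are made up for this example.

```python
from fractions import Fraction
from math import comb, factorial


def bernoulli_numbers(count):
    """Return [B_0, ..., B_{count-1}] with the convention B_1 = -1/2."""
    B = [Fraction(1)]
    for m in range(1, count):
        # Solve sum_{k=0}^{m} C(m+1, k) B_k = 0 for B_m.
        s = sum(comb(m + 1, k) * B[k] for k in range(m))
        B.append(-s / (m + 1))
    return B


def zeta_even_coefficient(n):
    """Rational c with zeta(2n) = c * pi^(2n), i.e. 2^(2n-1) |B_2n| / (2n)!."""
    B = bernoulli_numbers(2 * n + 1)
    return Fraction(2 ** (2 * n - 1)) * abs(B[2 * n]) / factorial(2 * n)


if __name__ == "__main__":
    for n in range(1, 6):
        print(2 * n, zeta_even_coefficient(n))
```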