Stosszahlansatz

"Facts are stubborn things, but statistics are pliable"
-Mark Twain
All theories of microscopic physics are governed by time-reversible laws: the evolution of a system can be traced back into the past by the same equation that predicts its future. Macroscopic laws are different. The second law of thermodynamics gives macroscopic phenomena an arrow of time, so, unlike microscopic ones, they are not invariant under time reversal. In other words, both Figure 1.a and 1.b seem reasonable, whereas Figure 2.b seems quite impossible. So the natural question arises: how come the laws that govern two atoms are invariant under time reversal, i.e., Figure 1 makes sense in both directions of time, but Figure 2 doesn't?

Figure 1. Collision of two molecules, where the (b) image on the right hand side is the time reversed version of the (a) image on the left hand side.



Figure 2. Diffusion of ink, where the (b) image on the right hand side is the time reversed version of the (a) image on the left hand side.

Now we know that this has a quite persuasive explanation: macroscopic laws are statistical laws. You can refer to this post to read about what the thermodynamic arrow of time actually means, and why Figure 2.b is also not impossible. Nevertheless, why the physical laws at the micro and macro scales differ under time reversal was quite a hot debate back in the 1800s. As a result, paradoxes arose. (Long live gedankenexperiments!)

In this post, I will mention two of these paradoxes, which are considered attacks on Boltzmann's H-theorem: Loschmidt's reversibility paradox and Zermelo's recurrence paradox. The model we will use is called the Kac ring model, and it is capable of demonstrating both paradoxes. Before diving into the model, let me quickly summarize what these paradoxes are.

Loschmidt's reversibility paradox

Assume you have a vessel with perfectly reflecting walls (the walls absorb no energy from the molecules in a collision). It contains gas in a non-equilibrium state (a non-Maxwellian speed distribution). As time elapses, the system converges to an equilibrium in which the gas molecules have a Maxwellian speed distribution and the H value of the system is at its minimum (H behaves as the opposite of thermodynamic entropy: it decreases as entropy increases). After the system has reached equilibrium, assume you reverse the velocities of all particles at the same instant. Since the microscopic laws are time reversible, the system should retrace its path towards its initial state, and along the way its H value increases. This contradicts Boltzmann's H-theorem, which states that H never increases.

Zermelo's recurrence paradox

This paradox is based on the Poincaré recurrence theorem, which states that certain systems will, after a sufficiently long but finite time, return to a state very close to the initial one. Zermelo applied some of Poincaré's thinking to the particles in a gas, and argued that, given a long enough time, the particles would return to a phase distribution (by phase, I mean the combination of position and velocity in 3D space) that would be indistinguishable from the original distribution. As a result, the H function must be (quasi-)periodic instead of decreasing forever, which contradicts Boltzmann's H-theorem.

Although both paradoxes challenge the H-theorem, they picture time quite differently. Loschmidt's paradox treats time as a one-dimensional line along which we can go back and forth if certain conditions are met (like reversing the velocities of all particles at the same instant). Zermelo's paradox, on the other hand, treats time like a cursor moving around a circle on which the states of the system are placed. Both paradoxes deny that the H value of a system always decreases, and both of them are absolutely right. Macroscopic laws are fundamentally nothing but statistical descriptions. They represent the most probable behavior of a system, and we will demonstrate this here using the Kac ring model.

The Kac ring model

In order to demonstrate the paradoxes above, i.e., the process of passing from a microscopic, time reversible description to a macroscopic, thermodynamic description, we need to make some abstractions about what a time reversible physical law is. Since our model has to represent both paradoxes, it has to be both periodic and reversible in terms of time.

Assume you have a circle with \(N\) sites arranged around it, connected by edges, which are simply the arcs between neighboring sites. Each site is occupied by a black or white ball, and the color of the ball is an abstraction of its physical state. The system evolves in discrete ticks, and at each tick every ball moves to the neighboring site in the clockwise direction.

We also need to add a physical law that changes the state of a ball from black to white, or vice versa. To do that, we introduce \(n\) markers into the system, placed on the edges of our ring. These markers are abstractions of the one and only physical law in our system: a ball changes its physical state, i.e., its color, whenever it passes through a marker. You can see an illustration of such a system in Figure 3, where the red squares denote the markers. The blue diamond is there to make it easier to keep track of the system.

Figure 3: Illustration of the Kac ring model, where \(N=10\) and \(n=5\).

If we run the system backwards (counterclockwise), the balls retrace their past color sequence, which means that the system is time reversible. Moreover, after \(N\) clock ticks, each ball is back at its initial position on the circle and has changed its color \(n\) times. As a result, if \(n\) is even, the initial state recurs after \(N\) ticks. If \(n\) is odd, every ball has the opposite color after \(N\) ticks, so we need a second lap, and it takes \(2N\) clock ticks to return to the initial state. In either case, the system is recurrent. As a result, this model satisfies the two necessary conditions to demonstrate the two paradoxes.
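Both properties are easy to check numerically. Below is a minimal Python sketch of the ring, under my own assumptions: sites are numbered from 0 to N-1, `markers[i]` refers to the edge from site `i` to site `i+1`, and the function names (`step`, `step_back`, `make_ring`, `run`) are made up for this illustration.

```python
import random

def flip(color):
    return "W" if color == "B" else "B"

def step(colors, markers):
    """One clockwise tick: the ball at site i-1 moves to site i and
    changes colour if the edge it crosses (edge i-1) carries a marker."""
    return [flip(colors[i - 1]) if markers[i - 1] else colors[i - 1]
            for i in range(len(colors))]

def step_back(colors, markers):
    """Undo one tick: the ball at site i+1 moves back to site i,
    crossing edge i in the opposite direction."""
    n_sites = len(colors)
    return [flip(colors[(i + 1) % n_sites]) if markers[i] else colors[(i + 1) % n_sites]
            for i in range(n_sites)]

def make_ring(n_sites, n_markers, rng):
    """Random colours and exactly n_markers markers on distinct edges."""
    colors = [rng.choice("BW") for _ in range(n_sites)]
    marked = set(rng.sample(range(n_sites), n_markers))
    markers = [i in marked for i in range(n_sites)]
    return colors, markers

def run(colors, markers, ticks, stepper=step):
    for _ in range(ticks):
        colors = stepper(colors, markers)
    return colors

rng = random.Random(0)                      # fixed seed, arbitrary choice
colors0, markers = make_ring(10, 5, rng)    # N = 10, n = 5 as in Figure 3

# Reversibility: 7 ticks forward followed by 7 ticks backward.
print(run(run(colors0, markers, 7), markers, 7, step_back) == colors0)  # True

# n is odd here: after N ticks every colour is flipped, after 2N it is restored.
print(run(colors0, markers, 10) == [flip(c) for c in colors0])          # True
print(run(colors0, markers, 20) == colors0)                             # True
```

The checks do not depend on the seed, since reversibility and recurrence hold for every marker placement.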

Now we need to define what is microscopic and what is macroscopic here. Which quantities describe the system as a whole, and which quantities describe the balls individually?

Since the state of a ball is either black or white, we can define the total numbers of white and black balls in the system as macroscopic quantities. Of course, the time evolution of these quantities depends on the number of white balls that are about to turn black (or vice versa), i.e., those just behind a marker, which is itself a microscopic quantity. Mathematically speaking, we can write the following equations \begin{align} B(t+1) &= B(t)+w(t)-b(t), \label{eq:black} \\ W(t+1) &= W(t)+b(t)-w(t), \label{eq:white} \end{align} where \(B(t)\), \(W(t)\), \(b(t)\), and \(w(t)\) denote the total number of black balls, the total number of white balls, the number of black balls just behind a marker, and the number of white balls just behind a marker, respectively. We will investigate the behaviour of another macroscopic quantity, \(\Delta (t)\), which combines the information in \eqref{eq:black} and \eqref{eq:white}, and is defined as \begin{align} \Delta (t) &= B(t) - W(t). \label{eq:delta} \end{align} Plugging in \eqref{eq:black} and \eqref{eq:white} yields \begin{align} \Delta (t) &= B(t-1)+w(t-1)-b(t-1)-W(t-1)-b(t-1)+w(t-1), \nonumber \\ &= \Delta(t-1) +2w(t-1) - 2b(t-1). \label{eq:glob} \end{align} Equation \eqref{eq:glob} has a quite important consequence: the evolution of the macroscopic (global) quantities cannot be computed from macroscopic information alone. It is not possible to eliminate \(b(t)\) and \(w(t)\) from \eqref{eq:black}-\eqref{eq:glob}. This is also known as the closure problem\(^{1}\) in statistical mechanics.
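To make the closure problem concrete, here is a small Python sketch (my own toy example, not part of the original derivation). Two rings share the same colours, hence the same \(B\), \(W\) and \(\Delta\), but carry their single marker on different edges. The helper names `delta` and `behind_marker` and the indexing convention (`markers[i]` is the edge from site `i` to site `i+1`) are assumptions of this sketch.

```python
def step(colors, markers):
    """One clockwise tick; a ball flips colour when it crosses a marked edge."""
    flip = {"B": "W", "W": "B"}
    return [flip[colors[i - 1]] if markers[i - 1] else colors[i - 1]
            for i in range(len(colors))]

def delta(colors):
    """Macroscopic quantity Delta = B - W."""
    return colors.count("B") - colors.count("W")

def behind_marker(colors, markers):
    """Microscopic quantities (b, w): black / white balls about to cross a marker."""
    b = sum(1 for c, m in zip(colors, markers) if m and c == "B")
    w = sum(1 for c, m in zip(colors, markers) if m and c == "W")
    return b, w

colors = list("BBWW")                    # B = W = 2, so Delta(0) = 0
ring_a = [True, False, False, False]     # marker in front of a black ball
ring_b = [False, False, True, False]     # marker in front of a white ball

for markers in (ring_a, ring_b):
    b, w = behind_marker(colors, markers)
    predicted = delta(colors) + 2 * w - 2 * b    # balance equation for Delta
    actual = delta(step(colors, markers))
    print(b, w, predicted, actual)
# prints:
# 1 0 -2 -2
# 0 1 2 2
```

Both rings start from identical macroscopic data, yet one ends at \(\Delta = -2\) and the other at \(\Delta = 2\), so \(\Delta(t)\) alone cannot determine \(\Delta(t+1)\).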

Now assume that the markers are distributed randomly. The probability that a particular edge carries a marker is given by \begin{align} \mu \equiv \frac{n}{N} = \frac{b}{B} = \frac{w}{W}. \label{eq:clo} \end{align} I know something doesn't feel right with this step, but let me continue. We will come back to it.

Assuming that \eqref{eq:clo} holds is the analogue of Boltzmann's Stosszahlansatz. It overcomes the closure problem by discarding the history of the system's evolution, and consequently, we no longer need the microscopic (local) quantities to compute the macroscopic (global) ones.

Substituting \eqref{eq:clo} into \eqref{eq:glob} yields \begin{align} \Delta(t+1) = \Delta(t) + 2\mu W(t) - 2\mu B(t) = (1-2\mu)\Delta(t), \end{align} and by recursion, we can write \begin{align} \Delta(t) = (1-2\mu)^t \Delta(0). \end{align} Now we have a problem. For \(0<\mu<1\), as \(t\to \infty\), \(\Delta (t) \to 0\), which means that the numbers of black and white balls become equal, independent of the initial distribution of white and black balls, so the initial state cannot recur. But we have chosen the system such that it is indeed recurrent, and concluded that at most \(2N\) time steps are sufficient for the system to go back to its initial state. So we have an instance of Zermelo's paradox (yay!). Moreover, since \(|\Delta (t)|\) decreases monotonically with time, the process is not time reversible, and we have an instance of Loschmidt's paradox too.
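The clash is easy to see on a single simulated ring. The sketch below is an illustration under my own assumptions: an all-black initial state, \(N=500\), \(n=50\), a fixed random seed, and a boolean representation where `True` means black. It compares the simulated \(\Delta(t)\) with the Stosszahlansatz prediction \((1-2\mu)^t\Delta(0)\).

```python
import random

def step(is_black, markers):
    """One clockwise tick; crossing a marked edge toggles the colour (XOR)."""
    return [is_black[i - 1] != markers[i - 1] for i in range(len(is_black))]

def delta(is_black):
    """Delta = B - W for a list of booleans (True = black)."""
    blacks = sum(is_black)
    return 2 * blacks - len(is_black)

N, n = 500, 50                 # illustrative sizes, n chosen even
mu = n / N
rng = random.Random(1)         # fixed seed, arbitrary choice
marked = set(rng.sample(range(N), n))
markers = [i in marked for i in range(N)]

state = [True] * N             # all balls black, so Delta(0) = N
history = [delta(state)]
for _ in range(2 * N):
    state = step(state, markers)
    history.append(delta(state))

predicted = [(1 - 2 * mu) ** t * N for t in range(2 * N + 1)]
for t in (0, 1, 5, 10, N, 2 * N):
    print(t, history[t], round(predicted[t], 3))
```

For small \(t\) the two columns should stay reasonably close, but at \(t=N\) and \(t=2N\) the simulated ring is back at \(\Delta = N\) while the prediction has decayed to essentially zero.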

So what went wrong?

Remember the step that didn't feel quite right. Equating \(\mu\) to \(\frac{n}{N}\) was fine, but to \(\frac{w}{W}\) or \(\frac{b}{B}\)? Where did the variable \(t\) go? Shouldn't it be \(\mu(t) = \frac{w(t)}{W(t)} = \frac{b(t)}{B(t)}\)?

This is why Boltzmann suggested that macroscopic laws can only be valid in a statistical sense. What we did was to take the ensemble average of \(\mu\), and assume that it is valid for each and every ring. But the truth is, \(\mu\) has a distribution. What we used was its expected value, and it is no surprise to conclude that, if we had an infinite number of rings, we would end up with half of the balls black and half of them white\(^{2}\).

So let's fix this problem mathematically.

Assume that we define a variable, \(C_{i}(t)\), which denotes the color of the ball occupying the \(i\)th lattice site at time \(t\). To stay consistent with \(\Delta(t) = B(t) - W(t)\), a black ball counts as \(+1\) and a white ball as \(-1\): \begin{align} C_{i}(t) = \begin{cases} 1 & i\,\text{th lattice site is black at time}\,t, \\ -1 & i\,\text{th lattice site is white at time}\,t. \end{cases} \end{align} Moreover, let the variable \(m_{i}\) denote the absence or presence of a marker in the following way \begin{align} m_{i} = \begin{cases} -1 & i\,\,\text{and}\,\,i+1\,\,\text{are connected with a marker}, \\ 1 & i\,\,\text{and}\,\,i+1\,\,\text{are not connected with a marker}. \end{cases} \end{align} Using these variables, we can write the recurrence relation as follows \begin{align} C_{i}(t) &= m_{i-1}C_{i-1}(t-1), \\ &= m_{i-1}m_{i-2}C_{i-2}(t-2), \\ &\dots \\ &= \left( \prod_{k=1}^{t}m_{i-k} \right) C_{i-t}(0). \label{eq:recur} \end{align} Now using \eqref{eq:recur}, we can rewrite \(\Delta (t)\), such that \begin{align} \Delta (t) = \sum_{i=1}^{N} C_{i}(t) = \sum_{i=1}^{N}m_{i-1}C_{i-1}(t-1) = \dots = \sum_{i=1}^{N} m_{i-1}m_{i-2}\dots m_{i-t}C_{i-t}(0). \end{align} Now we have two things to prove. First, in order to show that each individual ring goes back to its initial state after at most \(2N\) time steps, we need to verify that \(\Delta(2N) = \Delta(0)\). Afterwards, we need to correct our mistake, and show that \(\langle \Delta(t) \rangle = (1-2\mu)^t \Delta(0)\), not \( \Delta(t) = (1-2\mu)^t \Delta(0)\).
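As a sanity check on the closed form, the following Python sketch iterates \(C_{i}(t) = m_{i-1}C_{i-1}(t-1)\) step by step and compares the result with the product formula at every tick. The ring size, the seed, the 0-based site indices (taken modulo \(N\)) and the name `closed_form` are assumptions of this sketch.

```python
import random
from math import prod

N = 12                                   # illustrative ring size
rng = random.Random(4)                   # fixed seed, arbitrary choice
C0 = [rng.choice((-1, 1)) for _ in range(N)]   # initial colours, +1 or -1
m = [rng.choice((-1, 1)) for _ in range(N)]    # -1 marks an edge with a marker

def step(C):
    """One tick of the recurrence C_i(t) = m_{i-1} * C_{i-1}(t-1)."""
    return [m[i - 1] * C[i - 1] for i in range(N)]

def closed_form(i, t):
    """(m_{i-1} m_{i-2} ... m_{i-t}) * C_{i-t}(0), indices modulo N."""
    factor = prod(m[(i - k) % N] for k in range(1, t + 1))
    return factor * C0[(i - t) % N]

C = C0
for t in range(1, 3 * N + 1):
    C = step(C)
    # The unrolled product must agree with the step-by-step evolution.
    assert C == [closed_form(i, t) for i in range(N)]

print("closed form matches the step-by-step evolution")
```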

So let's begin with the first one.

To begin with, let me write both \(\Delta (0)\) and \(\Delta (2N)\) in a clearer form, such that \begin{align} \Delta (0) &= \sum_{i=1}^{N}C_{i}(0) = C_{1}(0)+C_{2}(0)+\dots + C_{N}(0), \label{eq:n} \\ \Delta (2N) &= \sum_{i=1}^{N}C_{i}(2N) = C_{1}(2N)+C_{2}(2N)+\dots +C_{N}(2N). \label{eq:2n} \end{align} If we can verify that \(C_{i}(0)=C_{i}(2N)\,\,\forall i\), then we can easily conclude that \(\Delta (0) = \Delta (2N)\). Writing \(C_{i}(2N)\) explicitly yields \begin{align} C_{i}(2N) &= \left( \prod_{k=1}^{2N}m_{i-k} \right) C_{i-2N}(0). \label{eq:cc} \end{align} Since the lattice is periodic with period \(N\), we can immediately say that \(C_{i}(t) = C_{i\pm jN}(t)\) for \(j=0,1,2,\dots\). Similarly, since the markers do not move, we can also say that \(m_{i} = m_{i\pm jN}\) for \(j=0,1,2,\dots\). Using these identities, we can rewrite \eqref{eq:cc} such that \begin{align} C_{i}(2N) &= m_{i-1}m_{i-2}\dots m_{i-N}m_{i-(N+1)} \dots m_{i-(2N-1)}m_{i-2N} C_{i}(0), \\ & = m_{i-1}m_{i-2}\dots m_{i-N}m_{i-1} \dots m_{i-(N-1)}m_{i-N} C_{i}(0), \\ & = m_{i-1}^2 m_{i-2}^2 \dots m_{i-N}^2 C_{i}(0), \end{align} and since \(m_{i} = \pm 1\,\,\forall i\), every squared factor equals \(1\), so we will have \(C_{i}(2N) = C_{i}(0)\), and consequently, \(\Delta(0)=\Delta(2N)\).
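The key step, that a product of marker variables over two full laps always equals \(1\), can be checked in a few lines of Python. This is only a sketch: `transport_factor` is a name I made up, sites are indexed from 0 and taken modulo \(N\), and \(N=10\), \(n=5\) are borrowed from Figure 3.

```python
import random
from math import prod

N, n = 10, 5                         # ring size and marker count, as in Figure 3
rng = random.Random(2)               # fixed seed, arbitrary choice
marked = set(rng.sample(range(N), n))
m = [-1 if i in marked else 1 for i in range(N)]

def transport_factor(i, t):
    """Product m_{i-1} m_{i-2} ... m_{i-t}, with indices taken modulo N."""
    return prod(m[(i - k) % N] for k in range(1, t + 1))

# One lap: each marker variable appears once, so the factor is (-1)**n.
print(all(transport_factor(i, N) == (-1) ** n for i in range(N)))   # True

# Two laps: each marker variable appears twice, so the factor is +1.
print(all(transport_factor(i, 2 * N) == 1 for i in range(N)))       # True
```

With an odd \(n\), as here, one lap flips every colour and only the second lap restores it, which is why \(2N\) is the safe recurrence time.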

So we have shown that an individual ring indeed comes back to its initial state after at most \(2N\) time steps. Zermelo's paradox is solved. Let's deal with Loschmidt's now.

In order to resolve Loschmidt's paradox, we need to show that \(\langle \Delta(t) \rangle = (1-2\mu)^t \Delta(0)\). Recall that the initial configuration \(C_{i}(0)\) is fixed, whereas the marker positions are random: they stay put on any given ring, but differ from ring to ring across the ensemble. As a result, if we take the ensemble average, we can pull \(C_{i}(0)\) out of the averaging operator, and obtain \begin{align} \langle\Delta (t) \rangle = \sum_{i=1}^{N}\langle m_{i-1}m_{i-2} \dots m_{i-t} \rangle C_{i-t}(0). \label{eq:summa} \end{align} Since \(m_{i}\) and \(C_{i}(t)\) are periodic with period \(N\), the summation runs from \(1\) to \(N\), and the markers are distributed in the same way on every edge, \(\langle m_{i-1}m_{i-2} \dots m_{i-t} \rangle\) and \(\sum_{i=1}^{N} C_{i-t}(0)\) are invariant under index shifts. Moreover, since \(C_{i}(t) = \pm 1\), summation over all the balls gives the difference between the numbers of black and white ones, i.e., \(\Delta (0)\). As a result, we can write \begin{align} \langle\Delta (t) \rangle &= \langle m_{1}m_{2} \dots m_{t} \rangle \sum_{i=1}^{N} C_{i-t}(0),\\ &= \langle m_{1}m_{2} \dots m_{t} \rangle \Big[ C_{1-t}(0)+C_{2-t}(0)+\dots +C_{N-t}(0) \Big], \\ &= \langle m_{1}m_{2} \dots m_{t} \rangle \Delta(0). \end{align} Now we need to find an explicit expression for \(\langle m_{1}m_{2} \dots m_{t} \rangle\). Notice that the product \(m_{1}m_{2} \dots m_{t} \) runs over \(t\) consecutive edges, since \(m_{i}\) is the variable for the absence or presence of a marker on the edge connecting the \(i\)th and \((i+1)\)th lattice sites. As a result, in order to compute \(\langle m_{1}m_{2} \dots m_{t} \rangle\), we need the probability of finding \(j\) markers on \(t\) consecutive edges. Note that if \(j\) is even (an even number of markers), the product \(m_{1}m_{2} \dots m_{t}\) equals \(1\), and if \(j\) is odd, it equals \(-1\).
Thus, we can write \begin{align} \langle m_{1} m_{2} \dots m_{t}\rangle = \sum_{j=0}^{t}(-1)^jp_{j}(t), \label{eq:me} \end{align} where \(p_{j}(t)\) denotes the probability of finding \(j\) markers on \(t\) consecutive edges. If each edge carries a marker independently with probability \(\mu\), and \(t \le N\) so that no edge appears twice in the product, the number of markers follows a binomial distribution, such that \begin{align} p_{j}(t) = \dbinom{t}{j} \mu^j (1-\mu)^{t-j}. \label{eq:bin} \end{align} Plugging \eqref{eq:bin} into \eqref{eq:me} and applying the binomial theorem yields \begin{align} \langle m_{1} m_{2} \dots m_{t}\rangle &= \sum_{j=0}^{t} \dbinom{t}{j}(-\mu)^j (1-\mu)^{t-j}, \\ &= \big[ (1-\mu) + (-\mu) \big]^t, \\ &= (1-2\mu)^t. \label{eq:binn} \end{align} Inserting \eqref{eq:binn} into \eqref{eq:summa}, we have \begin{align} \langle \Delta (t) \rangle = (1-2\mu)^t\Delta(0), \end{align} and the paradox is resolved.
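The ensemble statement can also be checked numerically. The sketch below averages \(\Delta(t)\) over many rings that share one initial state but have independently drawn markers, each edge carrying a marker with probability \(\mu\). The parameter values, the seed, the all-black initial state and the boolean representation (`True` means black) are illustrative assumptions of mine, and the comparison is restricted to \(t < N\).

```python
import random

def step(is_black, markers):
    """One clockwise tick; crossing a marked edge toggles the colour (XOR)."""
    return [is_black[i - 1] != markers[i - 1] for i in range(len(is_black))]

def delta(is_black):
    """Delta = B - W for a list of booleans (True = black)."""
    blacks = sum(is_black)
    return 2 * blacks - len(is_black)

N, mu, rings, t_max = 200, 0.1, 400, 20   # illustrative values, t_max < N
rng = random.Random(3)                    # fixed seed, arbitrary choice
initial = [True] * N                      # same initial state on every ring

totals = [0] * (t_max + 1)
for _ in range(rings):
    # Markers are drawn once per ring and then stay fixed while it evolves.
    markers = [rng.random() < mu for _ in range(N)]
    state = initial
    for t in range(t_max + 1):
        totals[t] += delta(state)
        state = step(state, markers)

average = [total / rings for total in totals]
predicted = [(1 - 2 * mu) ** t * delta(initial) for t in range(t_max + 1)]
for t in (0, 1, 5, 10, 20):
    print(t, round(average[t], 2), round(predicted[t], 2))
```

Any single ring in this ensemble still returns to its initial state after \(2N\) ticks. Only the average over rings follows the exponential decay, and it matches the prediction up to sampling noise that shrinks as the number of rings grows.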

So what is the take home message of all these calculations?

As quoted from Mark Twain at the very beginning of this post, facts and statistics are quite different. The reason why we can't digest many physical phenomena, including quantum mechanics, is that we fail to grasp what a sample and what an ensemble average mean. Moreover, living in a human-scale world limits our ability to imagine billions of atoms, millions of years of evolution, or nanoscale molecular machines. Even though we can understand the behaviour of one atom, or one cell, we are not capable of imagining the outcome of putting a billion of them together. We built our intuition in a linear, non-quantum, human-scale world. This is what leads us to most of our paradoxes.

Footnotes
1. Closure implies that there is a sufficient number of equations for all the unknowns.
2. Under the condition that all rings with different initial conditions are equally likely.