Errata for the second law : Part I

I am sure that many of you are familiar with the definition of the second law of thermodynamics. After all, it is recklessly taught in Physics 201, without paying attention to the physical interpretation of the words constituting this well-memorized sentence:

In a closed system, entropy always increases.

That's why it might be the most interdisciplinary, and yet the most misunderstood, law of nature. So we need to fix it here.

Let's begin with the word always.

You all know the famous story of the ink drop. When released into a glass of water, the drop diffuses into the water, exhibiting Brownian motion. We know - because this is what we have observed up until now - that the diffused ink molecules¹ will never gather back together and form a drop. Or will they, if we wait long enough?

Brownian motion simply means random motion. Molecules move by hitting and pushing each other in a chaotic environment. Remember the time when a concert is over and you are trying to get out with your friends, together with a million other people, all at once. In this analogy, you and your friends form a "friend drop". Even though you don't take a step intentionally, as the crowd begins to pour out you are pushed along with it, and your motion gets out of your control. You begin to move randomly.

This random motion of molecules was precisely explained by Albert Einstein in his 1905 paper. What you need to know about it for now is the following: the steps you take while trying to get out of the concert hall are independent of each other. Remember that your motion is not under your control; you are pushed around by other people (in the ink case, other molecules), and you also push other people while you are being pushed. So you don't know where your next step will land. But you cannot take a step 1 km long, right? So intuitively, your hunch tells you that there must be an underlying probability distribution for your step size. In the Brownian motion case, these step sizes are (as you may guess) normally distributed.

So each step of yours becomes a normal random variable, and the cumulative sum of these variables constitutes your trajectory. But what determines the parameters of this normal distribution? Keep in mind that escaping from the concert hall is just an analogy. Molecules don't have an intention to reach the exit door, nor are they trying to take a cab home. So if they are not being pushed around, they have no reason to move. This gives us the first parameter of our normal distribution: the mean, which is equal to zero.

Back to the concert analogy. What is one of the things that determines your step size? The crowd! Your step size depends on how crowded the hall is, and from a molecule's perspective, on how dense the medium is. Of course, density is not the only thing that makes a molecule move faster or slower. Temperature, the viscosity of the fluid, and the size of the molecule itself (how fat and tall you are) are also important. All these properties combine into a single parameter called the diffusion coefficient. Physically, the diffusion coefficient is a measure of how much of a substance diffuses through a unit surface in unit time under a unit concentration gradient [1]. Intuitively, it tells us how fast the molecules diffuse. Since your speed depends on the length of your steps, the diffusion coefficient determines the variance of the step size, which is the second parameter of our normal distribution.
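As a side note, one common way these quantities combine into a single diffusion coefficient is the Stokes-Einstein relation, \(D = k_{B}T/(6\pi\eta a)\), which is not derived in this post; the sketch below just plugs in illustrative (assumed) values for a small molecule in room-temperature water.

```python
# Illustrative only: a Stokes-Einstein estimate of a diffusion coefficient.
# The particle radius and viscosity below are assumed values, not from the post.
import math

k_B = 1.380649e-23   # Boltzmann constant [J/K]
T   = 298.0          # temperature [K], roughly room temperature
eta = 1.0e-3         # dynamic viscosity of water [Pa*s]
a   = 0.5e-9         # hydrodynamic radius of a small molecule [m]

D = k_B * T / (6 * math.pi * eta * a)   # diffusion coefficient [m^2/s]
print(f"D ~ {D:.2e} m^2/s")             # on the order of 1e-10 to 1e-9 m^2/s
```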

Up until now, we have only talked about a single step. But to proceed further, we need the trajectory of our molecule. Recall that the steps are independent of each other, and their cumulative sum constitutes the trajectory. This means that, in the discrete case, if I am forced to take 10 steps in the crowd (I am not taking them intentionally), my trajectory will be the sum of 10 normally distributed, zero-mean random variables, which is again a normal random variable with zero mean. Since these variables are independent of each other, the variance of my trajectory will be equal to the variance of each step multiplied by the number of steps, which in this case is 10. For the case of continuous motion, we will adapt these discrete sums as integrals over a time interval.
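Here is a minimal numerical sketch of that variance argument (the step variance and the number of trajectories are arbitrary illustrative choices): summing \(n\) independent zero-mean normal steps gives a zero-mean trajectory whose variance is \(n\) times the per-step variance.

```python
# Sketch: the variance of a random walk grows linearly with the number of steps.
# Step variance and counts are arbitrary illustrative values.
import numpy as np

rng = np.random.default_rng(0)
n_steps   = 10         # steps per trajectory, as in the 10-step example
step_var  = 1.0        # variance of a single step
n_walkers = 100_000    # number of independent trajectories

steps = rng.normal(0.0, np.sqrt(step_var), size=(n_walkers, n_steps))
positions = steps.sum(axis=1)    # final position = cumulative sum of the steps

print("empirical mean    :", positions.mean())   # close to 0
print("empirical variance:", positions.var())    # close to n_steps * step_var = 10
```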

Enough with the analogy, let's calculate the probability to see our beloved drop back again.

Assume that you have a cylindrical glass with a radius of \(R=4\) cm, filled with water up to a height of \(h=6\) cm. You inject your ink drop into the water with a syringe, precisely in the middle. You can see an illustration of this setup in Figure 1.
Figure 1: Illustration of the diffusion system.


Let \({\mathbf{X^{i}}} = [X_{1}^{i},X_{2}^{i},X_{3}^{i}]\) denote the position vector of the \(i^{th}\) ink molecule, and let \(X_{j}^{i}\) denote the position in the \(j^{th}\) dimension at time \(t\) (for the sake of simplicity, the variable \(t\) is not included in the notation). Since the glass is 3-D, we have \(j=\{1,2,3\}\) and \(i=\{1,2,...,N\}\), where \(N\) is the total number of ink molecules in our initial drop. Statistically, the average radius of a water drop is about 1.5 mm, and such a drop contains approximately \(10^{20}\) \(H_{2}O\) molecules, so let us use these values for our ink drop too. Let's denote the radius of the drop by \(r=1.5\) mm, and set \(N=10^{20}\). Since the drop is so small, let us also assume that the initial positions of all molecules are zero, i.e., \({\mathbf{x^{i}}}=[0,0,0]\) for \(i=\{1,2,...,N\}\) at \(t=0\).

Since all the molecules exhibit Brownian motion, their positions can be modelled as a continuous-time stochastic process in which the increments in a given dimension are normally distributed and independent of each other. As time elapses, these independent increments add up to form the molecules' paths in the medium, and as they add up, the variance of the position adds up too (recall the concert hall analogy). So the position variable at time \(t\) becomes normally distributed such that \begin{align} X_{1,2,3}^{i} \sim \mathcal{N}(0,\sigma^2) \phantom{00} \text{i=\{1,2,...,N\}}, \label{eq:g} \end{align} where \(\sigma^2 = 2Dt\), and \(D\) denotes the diffusion coefficient.
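As a quick sanity check of this model (with an assumed diffusion coefficient and elapsed time, since none are fixed yet), simulating many independent sums of normal increments should give a position variance close to \(2Dt\):

```python
# Sketch: simulate 1-D Brownian increments and check Var[X(t)] ~ 2*D*t.
# D and t are assumed illustrative values.
import numpy as np

rng = np.random.default_rng(1)
D  = 1e-9          # diffusion coefficient [m^2/s], assumed
t  = 5.0           # elapsed time [s]
n_increments = 500
dt = t / n_increments
n_molecules  = 10_000

# Each increment is N(0, 2*D*dt); their sum gives the coordinate X(t).
increments = rng.normal(0.0, np.sqrt(2 * D * dt),
                        size=(n_molecules, n_increments))
x_t = increments.sum(axis=1)

print("theoretical variance 2*D*t:", 2 * D * t)
print("empirical variance        :", x_t.var())
```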

Let us simplify the problem a little. We were trying to find the probability that all \(10^{20}\) molecules form the original drop again with a proper alignment, i.e., the positions of the molecules must not overlap, and their centers must be arranged so that they produce a sphere-like shape. Instead of this, let me re-define the problem: what is the probability that all ink molecules are inside a sphere of radius \(r\), centered at \({\mathbf{C}}=[C_{1},C_{2},C_{3}]\), at the same instant \(t\)? For a single ink molecule, the probability of being in the drop at a given time \(t\) and for a given center \(\mathbf{C}\) is equal to \begin{align} P\big\{\mathbf{X^{i}} \in \mathcal{D}\mid t, \mathbf{C}\big\} = P\big\{\lVert \mathbf{X^{i}}-\mathbf{C} \rVert \leq +r\mid t, \mathbf{C}\big\}, \label{eq:ineq} \end{align} where \(\mathcal{D}\) denotes the connected set of points that constitutes the drop. But we need to calculate the probability of all ink molecules being in the drop at the same time. Since the ink molecules' trajectories are independent of each other, what we need to calculate becomes \begin{align} P\big\{\mathbf{X^{1:N}} \in \mathcal{D} \mid t, \mathbf{C} \big\} = \prod_{i=1}^{N}P\big\{\mathbf{X^{i}} \in \mathcal{D} \mid t, \mathbf{C} \big\} = \prod_{i=1}^{N}P\big\{\lVert \mathbf{X^{i}}-\mathbf{C} \rVert \leq +r\mid t, \mathbf{C}\big\}. \label{eq:prod} \end{align} Furthermore, due to the Brownian motion, the steps taken in each dimension are also independent of each other. So instead of calculating the Euclidean distance between the center of the drop and the ink molecules, we can check whether each coordinate of an ink molecule falls inside the range of the drop in the corresponding dimension. To do that, we need to be more precise about the center of the drop. Since the drop cannot get closer to the boundaries than its radius, the center of the drop must be bounded such that \begin{align} -(R-r) &\leq c_{1,2} \leq +(R-r),\nonumber \\ -(h/2-r) &\leq c_{3} \leq +(h/2-r).\nonumber \end{align} Since there is no restriction on where the drop will form, it is safe to assume that the center is uniformly distributed within these ranges. As a result, the center variables for each dimension become \begin{align} C_{1,2} \sim \mathcal{U}(-(R-r),&+(R-r)), \nonumber \\ C_{3} \sim \mathcal{U}(-(h/2-r),&+(h/2-r)). \nonumber \end{align} Calculating whether all the molecules' positions fall within the drop's range in each dimension gives us \begin{align} P\big\{\mathbf{X^{1:N}} \in \mathcal{D} \mid t, \mathbf{C} \big\} &= \prod_{i=1}^{N}P\big\{-r \leq X_{1}^{i}-C_{1}\leq +r\mid t, C_{1}\big\} \times \nonumber \\ & P\big\{-r \leq X_{2}^{i}-C_{2}\leq +r\mid t, C_{2}\big\}P\big\{-r \leq X_{3}^{i}-C_{3}\leq +r\mid t, C_{3}\big\}. \label{eq:prod12} \end{align} We need to relax the problem a little bit more here. The calculation given in \eqref{eq:prod12} requires us to calculate probabilities conditioned on \(\mathbf{C}\). This makes sense, since at time \(t=0\) we will see the drop at \(\mathbf{C}=[0,0,0]\) with probability \(1\), because we put it there in the first place. In order to make these probabilities independent of \(\mathbf{C}\), we need all the molecules to diffuse into the medium and the solution to become homogeneous. This means that the ink molecules can be anywhere with equal probability, which is the asymptotic behaviour of \eqref{eq:g}, i.e., as \(t \to \infty \).
As a result, for a homogeneous solution, we can write \begin{align} X_{1,2}^{i} \sim \mathcal{U}(-R,&+R), \nonumber \\ X_{3}^{i} \sim \mathcal{U}(-h/2,&+h/2), \nonumber \label{eq:g2} \end{align} for \({i=\{1,2,...,N\}}\). Only under this assumption does it not matter where the center is, since the ink molecules are now randomly distributed in the water. Furthermore, by assuming the solution is homogeneous, all the probability distributions become independent of time too.

Let's assume that enough time has passed, the solution has become homogeneous, and the probabilities in \eqref{eq:prod12} have become independent of \(\mathbf{C}\) and \(t\). Now we can re-write \eqref{eq:prod12} as \begin{align} P\big\{\mathbf{X^{1:N}} \in \mathcal{D} \big\} &= \prod_{i=1}^{N}P\big\{-r \leq X_{1}^{i}-C_{1}\leq +r\big\}\nonumber \\ &P\big\{-r \leq X_{2}^{i}-C_{2}\leq +r\big\}P\big\{-r \leq X_{3}^{i}-C_{3}\leq +r\big\}. \label{eq:prod2} \end{align} To calculate \eqref{eq:prod2}, we need to know the probability distribution of \((X^{i}_{j}-C_{j})\) for \(j=\{1,2,3\}\). Since \(C_{j}\) is symmetric around zero for all \(j\), the probability distribution of \((X^{i}_{j}+C_{j})\) is identical to the probability distribution of \((X^{i}_{j}-C_{j})\). Let us define a new random variable, \(Z_{j}^{i}\), such that \begin{align} Z_{j}^{i} = X^{i}_{j}+C_{j}, \end{align} which permits us to write the probability density function of \(Z_{j}^{i}\) as \begin{align} f_{Z^{i}_{j}}(z^{i}_{j}) = f_{C_{j}}(c_{j}) \ast f_{X_{j}^{i}}(x_{j}^{i}). \end{align} This is the convolution of two uniform densities. Writing the convolution explicitly yields the piecewise probability density functions $$ f_{Z^{i}_{1,2}}(z^{i}_{1,2}) = \left\{ \begin{array}{ll} \frac{z^{i}_{1,2}+2R-r}{4(R-r)R} & : z^{i}_{1,2} \in [-(2R-r), -r],\\ \frac{1}{2R} & : z^{i}_{1,2} \in [-r, r],\\ \frac{(2R-r)-z^{i}_{1,2}}{4(R-r)R} & : z^{i}_{1,2} \in [r, 2R-r], \end{array} \right.$$ $$ f_{Z^{i}_{3}}(z^{i}_{3}) = \left\{ \begin{array}{ll} \frac{z^{i}_{3}+h-r}{2(h/2-r)h} & : z^{i}_{3} \in [-(h-r), -r],\\ \frac{1}{h} & : z^{i}_{3} \in [-r, r],\\ \frac{(h-r)-z^{i}_{3}}{2(h/2-r)h} & : z^{i}_{3} \in [r, h-r], \end{array} \right.$$ which are plotted in Figure 2.



Figure 2:\( \,\,\,f_{Z^{i}_{1,2}}(z^{i}_{1,2})\) (on the left) and \( \,\,\,f_{Z^{i}_{3}}(z^{i}_{3})\) (on the right).
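These piecewise densities can be checked numerically: sampling \(X\sim\mathcal{U}(-R,+R)\) and \(C\sim\mathcal{U}(-(R-r),+(R-r))\) and looking at \(Z=X+C\) should reproduce the trapezoidal shape, with the flat part equal to \(1/(2R)\) on \([-r,+r]\). A minimal sketch with the values used in this post (in centimetres):

```python
# Sketch: Monte Carlo check of the trapezoidal density of Z = X + C
# in the radial dimensions, with X ~ U(-R, R) and C ~ U(-(R-r), R-r).
import numpy as np

R, r = 4.0, 0.15            # glass radius and drop radius [cm]
rng = np.random.default_rng(2)
n = 2_000_000

z = rng.uniform(-R, R, n) + rng.uniform(-(R - r), R - r, n)

# The empirical density on [-r, r] should match the flat value 1/(2R).
in_flat = np.abs(z) <= r
print("empirical density on [-r, r] :", in_flat.mean() / (2 * r))
print("theoretical flat value 1/(2R):", 1 / (2 * R))
```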

Now we can calculate \eqref{eq:prod2} by integrating \(f_{Z^{i}_{j}}(z^{i}_{j})\) over \([-r, +r]\), which gives \begin{align} P\Big\{{-r \leq Z_{1,2}^{i} \leq +r}\Big\} &=\int_{-r}^{+r}f_{Z^{i}_{1,2}}(z^{i}_{1,2})\,dz^{i}_{1,2}, \nonumber \\ &=\int_{-r}^{+r}\frac{1}{2R}\,dz^{i}_{1,2},\nonumber \\ &=\frac{r}{R}, \end{align} \begin{align} P\Big\{{-r \leq Z_{3}^{i} \leq +r}\Big\} &=\int_{-r}^{+r}f_{Z^{i}_{3}}(z^{i}_{3})\,dz^{i}_{3},\nonumber \\ &=\int_{-r}^{+r}\frac{1}{h}\,dz^{i}_{3},\nonumber \\ &=\frac{2r}{h}. \end{align} Plugging these results into \eqref{eq:prod2} gives us \begin{align} P\big\{\mathbf{X^{1:N}} \in \mathcal{D} \big\} = \Big(\frac{2r^3}{R^2h}\Big)^N. \label{eq:final} \end{align} Equation \eqref{eq:final} tells us that the volume of the solution, which appears in the denominator, affects \(P\big\{\mathbf{X^{1:N}} \in \mathcal{D} \big\}\) to the power of \(N\). In other words, the larger the glass, the lower the probability of seeing our drop again. Additionally, recall that \(r = 1.5\) mm, which is a really small value compared to the dimensions of the solution.
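To double-check the per-molecule factor, we can also estimate \(P\{-r \leq X_{j}^{i}-C_{j} \leq +r\}\) in each dimension by direct sampling under the homogeneous assumption, and compare the product with the closed form \((r/R)^{2}(2r/h) = 2r^{3}/(R^{2}h)\) obtained above. A minimal sketch (values in centimetres):

```python
# Sketch: Monte Carlo check of the single-molecule probability
# P{-r <= X_j - C_j <= +r} in each dimension, under the homogeneous assumption.
import numpy as np

R, h, r = 4.0, 6.0, 0.15    # glass radius, water height, drop radius [cm]
rng = np.random.default_rng(3)
n = 2_000_000

def hit_prob(a, b):
    """Estimate P{|X - C| <= r} with X ~ U(-a, a) and C ~ U(-b, b)."""
    x = rng.uniform(-a, a, n)
    c = rng.uniform(-b, b, n)
    return np.mean(np.abs(x - c) <= r)

p1 = hit_prob(R, R - r)          # dimension 1, expected ~ r/R
p2 = hit_prob(R, R - r)          # dimension 2, expected ~ r/R
p3 = hit_prob(h / 2, h / 2 - r)  # dimension 3, expected ~ 2r/h

print("Monte Carlo product      :", p1 * p2 * p3)
print("closed form 2r^3/(R^2 h) :", 2 * r**3 / (R**2 * h))
```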

Let's talk in numbers. Recall that we were trying to calculate the probability of all ink molecules being in the drop simultaneously. Before that, let's focus on the probability of only one ink molecule being in the drop, i.e., \(P\big\{\mathbf{X^{i}} \in \mathcal{D}\big\}\). The reason is the following: if \(P\big\{\mathbf{X^{i}} \in \mathcal{D} \big\}\) is very small, then multiplying it by itself \(N\) times will be practically zero due to machine precision. Although we have chosen a realistic number of molecules for an ink drop \((N=10^{20})\), we must be careful about \(N\) and increase it step by step.

Let's start calculating. You can see the probabilities for different \(N\) values in Table 1. Notice that somewhere after \(N=60\), which is really small compared to the realistic value of \(N=10^{20}\), machine precision becomes inadequate and the probabilities appear to be zero. Of course these numbers are some kind of an upper bound, since we relaxed the problem in some aspects.

\(\newcommand\T{\Rule{0pt}{1em}{.3em}}\) \begin{array}{c|c} \hline N \,\,\, & P\big\{\mathbf{X^{1:N}} \in \mathcal{D} \big\} \T \\\hline 1 & 4.6875\times 10^{-5} \\\hline 2 \T & 2.1973\times 10^{-9} \\\hline 5 \T & 2.2631\times 10^{-22} \\\hline 10 \T & 5.1217\times 10^{-44} \\\hline 15 \T & 1.1591\times 10^{-65} \\\hline 20 \T & 2.6232\times 10^{-87} \\\hline 30 \T & 1.3435\times 10^{-130} \\\hline 50 \T & 3.5242\times 10^{-217} \\\hline 60 \T & 1.8050\times 10^{-260} \\\hline 80 \T & 0 \\\hline 100 \T & 0 \\\hline \end{array}
Table 1: Probabilities of forming a drop containing \(N\) molecules.
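If we ever want to push past the point where the direct product underflows, one option is to track the probability in log-space instead. The sketch below does this with the per-molecule factor \(2r^{3}/(R^{2}h)\) from the closed form above, so it is an illustration of the technique rather than an exact reproduction of Table 1.

```python
# Sketch: evaluating p^N in log-space so the result does not underflow to zero.
# p_single below is the closed-form per-molecule probability 2r^3/(R^2 h).
import math

R, h, r = 4.0, 6.0, 0.15            # dimensions [cm]
p_single = 2 * r**3 / (R**2 * h)    # probability for a single molecule

for N in (1, 10, 100, 10**6, 10**20):
    log10_p = N * math.log10(p_single)   # log10 of the joint probability
    print(f"N = {N:.0e} -> log10 P = {log10_p:.3e}")
```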


So, what do all those numbers mean?

They mean that the second law of thermodynamics is a statistical law. They mean that even though the probabilities are extremely low, there is a probability of seeing the ink molecules form a drop again. Yes, practically it is equal to zero; but it's important to be aware that this is just a practical result. This perspective also helps us understand the transition from the microscale to the macroscale. At the microscale, if you observe only two molecules hitting each other and bouncing back to their original positions, you don't think they violate the second law. But if a homogeneous water-ink solution forms a drop out of nowhere, things become strange. This is because you are no longer talking about the behaviour of only two molecules; you are talking about trillions of trillions.

So if everybody is convinced, let's fix the first part of the sentence.

The second law doesn't say that it is impossible to see a case in which the entropy decreases. Instead, it says that most of the time (usually very close to always), systems tend toward their most probable state [2], which (as calculated above) is obviously not the state in which the entropy decreases.

References
[1] http://www.thermopedia.com/content/696/
[2] Peter M. Hoffmann, Life's Ratchet: How Molecular Machines Extract Order from Chaos.

Footnotes
1) Ink is composed of many different components, but let's assume it is made of identical "ink molecules".