I am sure that many of you are familiar with the statement of the second law of thermodynamics. After all, it is taught, somewhat recklessly, in Physics 201,
without much attention to the physical interpretation of the words that make up this well-memorized sentence:
In a closed system, entropy always increases.
That's why it might be the most interdisciplinary, yet the most misunderstood, law of nature. So we need to fix that here.
Let's begin with the word
always.
You all know the famous story of the ink drop. When dropped into a glass of water,
the drop diffuses into the water, exhibiting Brownian motion. We know - because this is
what we have observed so far - that the diffused ink molecules¹ will
never gather back together and form a drop.
Or will they, if we wait long enough?
Brownian motion simply means
random motion. Molecules move by hitting and pushing each other in a
chaotic environment. Remember the time the concert was over and you were trying to get out with
your friends, along with a million other people, all at once. In this analogy, you and your
friends form a "friend drop". Even though you don't take a single step intentionally, as the crowd begins
to pour out, you get pushed along, and your motion slips out of your control. You begin to
move randomly.
This random motion of molecules was precisely explained by Albert Einstein in his
1905
paper. What you need to know about it for now is the following: the steps you
take while trying to get out of the concert hall are
independent of each other.
Remember that your motion is not under your control; you are pushed around by other people (in the
ink case, by other molecules), and you also push others while being pushed yourself.
So you don't know where your next step will land. But you cannot take a step that is 1 km long, right?
So your intuition tells you that there must be an
underlying probability distribution of your step
size. In the Brownian motion case, these step sizes are (as you may guess) normally distributed.
So each step of yours becomes a normal random variable, and the cumulative sum of these variables
constitutes your trajectory. But what determines the parameters of this normal distribution? Keep in mind
that escaping from the concert hall is just an analogy. Molecules have no intention of reaching the
exit door, nor are they trying to take a cab home. So when they are not being pushed around,
they have no reason to move. This gives us the first parameter of our normal distribution,
the mean, which is equal to zero.
Back to the concert analogy. What is one of the things that determines the length of your steps? The crowd!
Your step size depends on how crowded the hall is, and from a molecule's perspective, how
dense
the medium is. Of course, density is not the only parameter that determines whether a molecule moves faster or slower.
Temperature, the viscosity of the fluid, and the size of the molecule itself (how fat and tall you are)
are also important. All these properties combine into a single parameter, which is called the
diffusion coefficient. Physically, the diffusion coefficient is a measure of how much of a substance diffuses through
a unit surface in unit time under a concentration gradient of unity [1]. Intuitively, it tells us how
fast the molecules diffuse. Since how fast you move depends on how long your steps are,
the diffusion coefficient plays a key role in determining the variance of the step size,
which is the second parameter of our normal distribution.
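For the curious: for a small spherical particle, these quantities are often tied together by the Stokes-Einstein relation (quoted here only for intuition; we will not need its exact form),
\begin{align}
D = \frac{k_{B}T}{6\pi\eta a}, \nonumber
\end{align}
where \(k_{B}\) is Boltzmann's constant, \(T\) is the temperature, \(\eta\) is the viscosity of the fluid, and \(a\) is the radius of the particle.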
Up until now, we have only talked about a single step. But to proceed further, we need to determine the
trajectory of our molecule. Recall that the steps are independent of each other, and that their cumulative
sum constitutes the trajectory. This means that, in the discrete case, if I am forced to take 10
steps in the crowd (I am not taking them intentionally), my trajectory will be the sum of 10 normally
distributed, zero-mean random variables, which is again a normal random variable with zero mean. Since
these variables are independent of each other, the variance of my trajectory will be equal to the variance
of a single step multiplied by the number of steps, which in this case is 10. For the case of continuous
motion, we will replace these discrete sums with integrals over a time interval.
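If you want to see this variance bookkeeping in action, here is a minimal Python sketch (the step variance, step count, and number of walkers are arbitrary illustrative values):

```python
import numpy as np

rng = np.random.default_rng(0)

n_steps = 10          # number of forced steps in the crowd
step_var = 1.0        # variance of a single step (arbitrary units)
n_walkers = 100_000   # many independent walkers, to estimate the statistics

# each row is one walker: n_steps independent, zero-mean normal steps
steps = rng.normal(0.0, np.sqrt(step_var), size=(n_walkers, n_steps))

# the trajectory is the cumulative sum of the steps; its endpoint is the final position
endpoints = steps.cumsum(axis=1)[:, -1]

print(endpoints.mean())  # ~ 0: no preferred direction
print(endpoints.var())   # ~ n_steps * step_var = 10: variances add up
```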
Enough with the analogy; let's calculate the probability of seeing our beloved drop again.
Assume that you have a cylindrical glass with a radius of \(R=4\) cm, filled with water up to a height of \(h=6\) cm.
You inject your ink drop into the water with a syringe, precisely in the middle. You can see the
illustration of this setup in Figure 1.
Figure 1: Illustration of the diffusion system.
Let \({\mathbf{X^{i}}} = [X_{1}^{i},X_{2}^{i},X_{3}^{i}]\) denote the position vector of the \(i^{th}\)
ink molecule, and let \(X_{j}^{i}\) denote the position variable in the \(j^{th}\) dimension at
time \(t\) (for the sake of simplicity, the variable \(t\) is not included in the notation). Since the glass
is 3-D, we will have \(j=\{1,2,3\}\), and \(i=\{1,2,...,N\}\), where \(N\) is the total number of ink
molecules in our initial drop. The average radius of a water drop is about 1.5 mm, and such a drop
contains approximately \(10^{20}\) \(H_{2}O\) molecules, so let us use these values for our ink drop too.
Let's denote the radius of the drop by \(r=1.5\) mm, and \(N=10^{20}\). Since the drop is so small, let us
also assume that the initial positions of all molecules are zero, i.e., \({\mathbf{x^{i}}}=[0,0,0]\) for \(i=\{1,2,...,N\}\) at \(t=0\).
Since all the molecules exhibit Brownian motion, their positions can be modelled as a continuous-time
stochastic process in which the increments in a given dimension are normally distributed and independent
of each other. As time elapses, these independent increments add up to form the molecule's path through the
medium, and as they add up, the variance of the position adds up too (recall the
concert hall analogy). So the position variable at time \(t\) becomes normally distributed such that
\begin{align}
X_{1,2,3}^{i} \sim \mathcal{N}(0,\sigma^2) \phantom{00} \text{i=\{1,2,...,N\}},
\label{eq:g}
\end{align}
where \(\sigma^2 = 2Dt\), and \(D\) denotes the diffusion coefficient.
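For a rough feeling for the numbers involved, here is a small Python sketch that samples positions from \eqref{eq:g}; the diffusion coefficient \(D \approx 10^{-9}\,\mathrm{m^2/s}\) is an assumed, purely illustrative value (roughly the order of magnitude for small molecules in water):

```python
import numpy as np

rng = np.random.default_rng(0)

D = 1e-9            # m^2/s, assumed illustrative diffusion coefficient
t = 3600.0          # one hour of diffusion, in seconds
n_molecules = 1000  # a tiny stand-in for the 10^20 molecules of the real drop

# standard deviation of each coordinate at time t: sigma^2 = 2 * D * t
sigma = np.sqrt(2 * D * t)

# positions[i] = [X1, X2, X3] of the i-th molecule, all released from the origin
positions = rng.normal(0.0, sigma, size=(n_molecules, 3))

print(sigma)                    # ~ 2.7 mm of spread per coordinate after one hour
print(np.abs(positions).max())  # the farthest coordinate reached by any molecule
```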
Let us simplify the problem a little. We were trying to find the probability that all \(10^{20}\) molecules form
back the original drop with a proper alignment, i.e., the positions of the molecules must not coincide, and
they must be arranged so that together they produce a sphere-like shape. Instead of this, let me re-define the
problem. What is the probability that all ink molecules are
in a sphere of radius \(r\), centered at
\({\mathbf{C}}=[C_{1},C_{2},C_{3}]\), at the same instant \(t\)? So for a single ink molecule, the probability of being in
the drop at a given time \(t\) and for a given center \(\mathbf{C}\) is
\begin{align}
P\big\{\mathbf{X^{i}} \in \mathcal{D}\mid t, \mathbf{C}\big\} = P\big\{\lVert \mathbf{X^{i}}-\mathbf{C} \rVert \leq +r\mid t, \mathbf{C}\big\},
\label{eq:ineq}
\end{align}
where \(\mathcal{D}\) denotes the connected set of points
that constitutes the drop. But we need to calculate the probability of
all ink molecules being in the drop at the same time. As a result,
since the ink molecules' trajectories are independent of each other, what we need to calculate becomes
\begin{align}
P\big\{\mathbf{X^{1:N}} \in \mathcal{D} \mid t, \mathbf{C} \big\} = \prod_{i=1}^{N}P\big\{\mathbf{X^{i}} \in \mathcal{D} \mid t, \mathbf{C} \big\} = \prod_{i=1}^{N}P\big\{\lVert \mathbf{X^{i}}-\mathbf{C} \rVert \leq +r\mid t, \mathbf{C}\big\}.
\label{eq:prod}
\end{align}
Furthermore, due to the Brownian motion, the steps taken in each dimension are also independent of each other. So instead of
calculating the Euclidean distance between the center of the drop and each ink molecule, we can check whether each coordinate
of an ink molecule falls inside the range of the drop in the corresponding dimension. To do that, we need
to be more precise about the center of the drop. Since the drop cannot get closer to the boundaries than its radius,
the center of the drop must be bounded such that
\begin{align}
-(R-r) &\leq C_{1,2} \leq +(R-r),\nonumber \\
-(h/2-r) &\leq C_{3} \leq +(h/2-r).\nonumber
\end{align}
Since there is no restriction on where the drop will form, it is safe to assume that the center is
uniformly distributed
within these ranges. As a result, the center variables for each dimension become
\begin{align}
C_{1,2} \sim \mathcal{U}(-(R-r),&+(R-r)), \nonumber \\
C_{3} \sim \mathcal{U}(-(h/2-r),&+(h/2-r)). \nonumber
\end{align}
Checking whether every molecule's coordinates fall within the drop's range in each dimension gives us
\begin{align}
P\big\{\mathbf{X^{1:N}} \in \mathcal{D} \mid t, \mathbf{C} \big\} &= \prod_{i=1}^{N}P\big\{-r \leq X_{1}^{i}-C_{1}\leq +r\mid t, C_{1}\big\} \times \nonumber \\
& P\big\{-r \leq X_{2}^{i}-C_{2}\leq +r\mid t, C_{2}\big\}P\big\{-r \leq X_{3}^{i}-C_{3}\leq +r\mid t, C_{3}\big\}.
\label{eq:prod12}
\end{align}
We need to relax the problem a little bit more here. The calculation in \eqref{eq:prod12} requires calculating the probabilities
conditioned on \(\mathbf{C}\).
This makes sense, since at time \(t=0\), we will see the drop at \(\mathbf{C}=[0,0,0]\) with probability \(1\), because
we put it there in the first place.
In order to make these probabilities independent of \(\mathbf{C}\), we need all the molecules to diffuse into the medium and the solution to become
homogeneous. This means that an
ink molecule can be
anywhere with equal probability, which is the asymptotic behaviour of \eqref{eq:g}, i.e., as \(t \to \infty\). As a result, for a homogeneous solution, we
can write
\begin{align}
X_{1,2}^{i} \sim \mathcal{U}(-R,&+R), \nonumber \\
X_{3}^{i} \sim \mathcal{U}(-h/2,&+h/2), \nonumber
\label{eq:g2}
\end{align}
for \({i=\{1,2,...,N\}}\). Only under this assumption does it not matter where the center is, since the ink molecules are now randomly distributed throughout the water.
Furthermore, by assuming the solution is homogeneous, all the probability distributions become independent of time too.
Let's assume that enough time has passed, the solution has become homogeneous, and the probabilities in \eqref{eq:prod12} have become independent of \(\mathbf{C}\) and \(t\).
Now we can re-write \eqref{eq:prod12} such that
\begin{align}
P\big\{\mathbf{X^{1:N}} \in \mathcal{D} \big\} &= \prod_{i=1}^{N}P\big\{-r \leq X_{1}^{i}-C_{1}\leq +r\big\}\nonumber \\
&P\big\{-r \leq X_{2}^{i}-C_{2}\leq +r\big\}P\big\{-r \leq X_{3}^{i}-C_{3}\leq +r\big\}.
\label{eq:prod2}
\end{align}
To calculate \eqref{eq:prod2}, we need to know the probability distribution of \((X^{i}_{j}-C_{j})\) for \(j=\{1,2,3\}\). Since \(C_{j}\)
is symmetric around zero for all \(j\), the probability distribution of \((X^{i}_{j}+C_{j})\) will be identical to that
of \((X^{i}_{j}-C_{j})\). Let us define a new random variable, \(Z_{j}^{i}\), such that
\begin{align}
Z_{j}^{i} = X^{i}_{j}+C_{j},
\end{align}
which, since \(X_{j}^{i}\) and \(C_{j}\) are independent, permits us to write the probability density function of \(Z_{j}^{i}\) as
\begin{align}
f_{Z^{i}_{j}}(z^{i}_{j}) = f_{C_{j}}(c_{j}) \ast f_{X_{j}^{i}}(x_{j}^{i}).
\end{align}
This is the convolution of two uniform probability densities. Writing the convolution out explicitly yields the piecewise probability
density functions
$$ f_{Z^{i}_{1,2}}(z^{i}_{1,2}) =
\left\{
\begin{array}{ll}
\frac{z^{i}_{1,2}+2R-r}{4(R-r)R} & : z^{i}_{1,2} \in [-(2R-r), -r],\\
\frac{1}{2R} & : z^{i}_{1,2} \in [-r, r],\\
\frac{(2R-r)-z^{i}_{1,2}}{4(R-r)R} & : z^{i}_{1,2} \in [r, 2R-r],
\end{array}
\right.$$
and
$$ f_{Z^{i}_{3}}(z^{i}_{3}) =
\left\{
\begin{array}{ll}
\frac{z^{i}_{3}+h-r}{2(h/2-r)h} & : z^{i}_{3} \in [-(h-r), -r],\\
\frac{1}{h} & : z^{i}_{3} \in [-r, r],\\
\frac{(h-r)-z^{i}_{3}}{2(h/2-r)h} & : z^{i}_{3} \in [r, h-r],
\end{array}
\right.$$
both of which are plotted in Figure 2.
Figure 2: \(f_{Z^{i}_{1,2}}(z^{i}_{1,2})\) (left) and \(f_{Z^{i}_{3}}(z^{i}_{3})\) (right).
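If you do not want to carry out the convolution by hand, you can also check the trapezoidal shape numerically; here is a small Python sketch for the lateral dimension, using the glass and drop dimensions from above:

```python
import numpy as np

R, r = 0.04, 0.0015   # glass radius and drop radius, in meters
dz = 1e-5             # grid resolution, in meters

z = np.arange(-2 * R, 2 * R, dz)

# densities of X (uniform over the glass) and C (uniform over the allowed centers)
f_x = np.where(np.abs(z) <= R, 1.0 / (2 * R), 0.0)
f_c = np.where(np.abs(z) <= R - r, 1.0 / (2 * (R - r)), 0.0)

# density of Z = X + C is the convolution of the two densities
f_z = np.convolve(f_x, f_c, mode="same") * dz

print(f_z.max())       # ~ 1/(2R) = 12.5, the flat top of the trapezoid
print(f_z.sum() * dz)  # ~ 1.0, the density integrates to one
```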
Now we can calculate \eqref{eq:prod2} by integrating \(f_{Z^{i}_{j}}(z^{i}_{j})\) over \([-r, +r]\), which is
\begin{align}
P\Big\{{-r \leq Z_{1,2}^{i} \leq +r}\Big\} &=\int_{-r}^{+r}f_{Z^{i}_{1,2}}(z^{i}_{1,2})\,dz^{i}_{1,2}, \nonumber \\
&=\int_{-r}^{+r}\frac{1}{2R}\,dz^{i}_{1,2},\nonumber \\
&=\frac{r}{R}.
\end{align}
\begin{align}
P\Big\{{-r \leq Z_{3}^{i} \leq +r}\Big\} &=\int_{-r}^{+r}f_{Z^{i}_{3}}(z^{i}_{3})\,dz^{i}_{3},\nonumber \\
&=\int_{-r}^{+r}\frac{1}{h}\,dz^{i}_{3},\nonumber \\
&=\frac{2r}{h}.
\end{align}
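Both results are easy to sanity check by brute force; here is a small Monte Carlo sketch under the same homogeneity assumption, using the dimensions from above:

```python
import numpy as np

rng = np.random.default_rng(0)

R, h, r = 0.04, 0.06, 0.0015  # glass radius, water height, drop radius, in meters
n = 1_000_000                 # number of Monte Carlo samples

# homogeneous solution: molecule coordinates and drop center are all uniform
x1 = rng.uniform(-R, R, n)
c1 = rng.uniform(-(R - r), R - r, n)
x3 = rng.uniform(-h / 2, h / 2, n)
c3 = rng.uniform(-(h / 2 - r), h / 2 - r, n)

# fraction of samples where the coordinate lands within +-r of the center
print(np.mean(np.abs(x1 - c1) <= r), r / R)      # both ~ 0.0375
print(np.mean(np.abs(x3 - c3) <= r), 2 * r / h)  # both ~ 0.05
```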
Plugging these results in \eqref{eq:prod2} gives us
\begin{align}
P\big\{\mathbf{X^{1:N}} \in \mathcal{D} \big\} = \Big(\frac{2r^3}{R^2h}\Big)^N.
\label{eq:final}
\end{align}
Equation \eqref{eq:final} tells us that the volume of the solution, which appears in the denominator, affects \(P\big\{\mathbf{X^{1:N}} \in \mathcal{D} \big\}\)
through the power of \(N\). In other words, the larger the glass, the lower the probability of seeing our drop again. Additionally, recall that \(r = 1.5\) mm,
which is a
really small value compared to the dimensions of the glass.
Let's talk in numbers. Recall that we were trying to calculate the probability of
all ink molecules
being
in the drop
simultaneously. Before that, let's focus on calculating the probability of
only one ink molecule
being in the drop, i.e., \(P\big\{\mathbf{X^{i}} \in \mathcal{D}\big\}\). The reason is the
following: if \(P\big\{\mathbf{X^{i}} \in \mathcal{D} \big\}\) is very small, then multiplying it
by itself \(N\) times will be practically zero due to machine precision. Although we have chosen a realistic number of molecules for
an ink drop \((N=10^{20})\), we must be careful about \(N\) and increase it step by step.
Let's start calculating. You can see the probabilities for different \(N\) values in Table 1. Notice that after approximately \(N=60\), which is
really small compared to the realistic value of \(N=10^{20}\), machine precision becomes inadequate and the probabilities appear to be zero.
Of course, these numbers are only a kind of upper bound, since we relaxed the problem in several ways.
\(\newcommand\T{\Rule{0pt}{1em}{.3em}}\)
\begin{array}{c|c}
\hline N \,\,\, & P\big\{\mathbf{X^{1:N}} \in \mathcal{D} \big\} \T \\\hline
1 & 4.6875\times 10^{-5} \\\hline
2 \T & 2.1973\times 10^{-9} \\\hline
5 \T & 2.2631\times 10^{-22} \\\hline
10 \T & 5.1217\times 10^{-44} \\\hline
15 \T & 1.1591\times 10^{-65} \\\hline
20 \T & 2.6232\times 10^{-87} \\\hline
30 \T & 1.3435\times 10^{-130} \\\hline
50 \T & 3.5242\times 10^{-217} \\\hline
60 \T & 1.8050\times 10^{-260} \\\hline
80 \T & 0 \\\hline
100 \T & 0 \\\hline
\end{array}
Table 1: Probabilities of forming a drop containing \(N\) molecules.
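If you would like to push past the point where the probabilities themselves underflow, you can work with log-probabilities instead; here is a minimal Python sketch of that trick, using the single-molecule probability from \eqref{eq:final}:

```python
import math

R, h, r = 4.0, 6.0, 0.15   # glass radius, water height, drop radius, in cm

# single-molecule probability from the derivation: (r/R) * (r/R) * (2r/h)
p_single = (r / R) ** 2 * (2 * r / h)

for n_molecules in [1, 10, 100, 10**6, 10**20]:
    # p_single ** n_molecules underflows quickly, so keep it in log10 space
    log10_p = n_molecules * math.log10(p_single)
    print(f"N = {n_molecules:g}:  log10(P) = {log10_p:.4g}")
```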
So, what do all those numbers mean?
They mean that the second law of thermodynamics is a
statistical law.
They mean that even though the probabilities are
extremely low, there is a nonzero probability of
seeing the ink molecules form a drop again. Yes,
practically it is equal to zero; but it is important to be
aware that
this is just a
practical result. This perspective also helps us understand the transition from the
microscale to the macroscale. At the microscale, if you observe only two molecules hitting each other and bouncing back to their
original positions,
you don't think they violate the second law. But if a homogeneous water-ink solution forms a
drop out of nowhere, things become
strange. This is because you are no longer talking about the behaviour of only two
molecules; you are talking about trillions of trillions of them.
So if everybody is convinced, let's fix the first part of the sentence.
The second law
doesn't say that it is
impossible to see a case in which the entropy
decreases.
Instead, it says that
most of the time (usually very close to
always), systems tend toward their most
probable state
[2],
which (as calculated above) is obviously
not the state in which the entropy
decreases.
References
[1] http://www.thermopedia.com/content/696/
[2] Peter M. Hoffmann,
Life's Ratchet: How Molecular Machines Extract Order from Chaos.
Footnotes
1) Ink is composed of many different components, but let's assume it is made of identical "ink molecules".