smallscaled: Microscopic theory of Diffusion

I have recently encountered with the great lecture series given by Prof. V. Balakrishnan, who is a theoretical physicist and a distinguished lecturer. You can access to the first part of the lecture series on the diffusion equation here. What I would like to do in this post is to write down his derivations in a more systematic manner, like lecture notes taken while I was listening to him. I strongly recommend watching the lecture though.

Let us begin with what diffusion means. Diffusion is the movement of water molecules you observe on the paper when you accidentally spill your water on it. It is the process that each dye molecule is exposed to when you put a drop of dye in a glass of water. Thanks to diffusion, you can smell the food in the kitchen from your living room. In more general terms, diffusion is the movement of a diffusing substance put inside a solvent, which can be solid, liquid, or gas, due to molecular collisions.

Diffusion tends to make eveything homogenious. It is not that it wants to, but it obeys the second law of thermodynamics, as any other process in the physical world we know. From a microscopic point of view, each molecule moves randomly, pushing and pulling each other around, exhibiting Brownian motion. But as a whole, the diffusing substance in question moves down through the concentration gradient, i.e., from high concentration to low, until it becomes homogenious in the solvent¹. Mathematically speaking, we can express this behaviour with the following equation \begin{align} \vec{J}(\vec{r},t) = - D \nabla \rho(\vec{r},t), \label{eq:law2} \end{align} where \(\vec{J}(\vec{r},t)\), \(\rho(\vec{r},t)\), and \(D\) denotes the flux (the amount of molecules crossing through the unit area per unit time), volume density (concentration), and diffusion constant, respectively. This equations tells you that the molecules flow from higher concentration to lower concentration, which is nothing but the Fick's second law of diffusion. Note that diffusion coefficient is there to even out the dimensions of the right and the left hand side of the equation.

Another, yet the simplest rule of this system, is the Fick's first law of diffusion : material never dissapears. This law leads to the following equation \begin{align} \frac{\partial \rho(\vec{r},t)}{\partial t} + \nabla \cdot \vec{J}(\vec{r},t) = 0, \label{eq:law1} \end{align} which is also known as the equation of continuity. It describes the transport of a conserved quantity, which is, in this case, the total number of diffusing molecules. You can also interpret (\ref{eq:law1}) as follows : in order concentration to increase at a given cross section, or surface, in the medium, either inward flow of molecules through that surface must increase, or magically, more molecules must be created within that surface. Since the latter is not possible, change in the concentration at each point with respect to time, plus the change in the amount of molecules crossing through unit area per unit time, i.e., the change in flux, with respect to the spatial position of this surface, is equal to zero.

If you put (\ref{eq:law2}) back in (\ref{eq:law1}), you will have \begin{align} \frac{\partial \rho(\vec{r},t)}{\partial t} = D \nabla^2 \rho(\vec{r},t), \label{eq:diffEq} \end{align} which is indeed the diffusion equation.

What we have discussed so far is actually the macroscopic picture. We have talked in terms of concentrations, which contains the information of the average behaviour of all the molecules, rather than the temporal or spatial information of each molecule. So what would be the microscopic interpretation? In other words, what would you ask for if you would like to gain information about only one molecule? You know that you can not look for its exact position at a given time, since you know that the movement of each molecule is random. But although their movement is random, luckly, we know the rules of this random process. So what we can look for is the probability of observing that molecule at a position \(\vec{r}\) at a given time \(t\). Let us denote this probability distribution with \(p(\vec{r},t)\). This probability distribution obeys exactly the same diffusion equation. All you need to do is to replace \(\rho(\vec{r},t)\) with \(p(\vec{r},t)\).

In order to see how, let us begin with investigating the random walk of only one single molecule on a discrete line. Our molecule will coss a toin each time it moves, and will choose to move left or right by one unit, depending on the outcome of the coin toss. This will assure the equivalence of probability of moving left and probability of moving right.

Figure 1: Discrete line that our molecule exhibits a random walk.

To begin with, let us denote the probability of being on the point \(j\) at a specific time \(t\) with \(P(j,t)\). In order to find the left hand side of (\ref{eq:diffEq}), we should ask for the change of \(P(j,t)\) with respect to \(t\). So, what are the possible scenarios that can affect the change of \(P(j,t)\)? First, our molecule may be at position \(j\), and move left or right with equal probabilities. This would definitely result in the decrease of \(P(j,t)\). Second, it may be moving to point \(j\), either from point \(j-1\) or \(j+1\) with equal probabilities, and this would obviously increase \(P(j,t)\). As a result, change in \(P(j,t)\) with respect to time can be written as \begin{align} \frac{\partial P(j,t)}{\partial t} &= \lambda \Bigg[\frac{1}{2}P(j-1,t)+\frac{1}{2}P(j+1,t)-\frac{1}{2}P(j,t)-\frac{1}{2}P(j,t)\Bigg], \\[1em] \frac{\partial P(j,t)}{\partial t} &= \frac{\lambda}{2}\Bigg[P(j-1,t)+P(j+1,t)-2P(j,t)\Bigg], \label{eq:lat} \\[1em] \end{align} where \(\lambda\) denotes the rate of jumps, i.e., number of jumps per unit time. Right hand side of this equation is also demonstrated in Figure 1.

Now, assume that our molecule doesn't have to move by one unit per unit time, but instead, it can choose a constant, denoted by \(a\). This is also called the lattice constant, since it denotes the distance between equally spaced consecutive points on the line. Considering the lattice constant, we can re-write (\ref{eq:lat}), such that \begin{align} \frac{\partial P(aj,t)}{\partial t} &= \frac{\lambda}{2} \Bigg[\big\{P(ja+a,t)-P(ja,t)\big\}-\big\{P(ja,t)-P(ja-a,t)\big\}\Bigg]. \end{align} Furthermore, we can play with this equation by multiplying and dividing the right hand side by \(a\), which results in \begin{align} \frac{\partial P(aj,t)}{\partial t} &= \frac{a\lambda}{2} \Bigg[\frac{P(ja+a,t)-P(ja,t)}{a}-\frac{P(ja,t)-P(ja-a,t)}{a}\Bigg]. \label{eq:lat2}\\ \end{align} Now, what are the steps to make this discrete-space random walk look like a one dimensional Brownian motion? We should manipulate it such that it becomes continuous both in space and time. First thing to do is to make this line a continuous one, by taking the lattice constant \(a\), to \(0\). Second thing to do is to make this movement also continuous in time. Since \(\lambda\) denotes the rate of jumps, \(1/\lambda\) will give you the mean time between jumps. So we can achive a continuous time movement by taking \(1/\lambda\) to \(0\), i.e., taking \(\lambda\) to infinity.

Let us see what will happen when we take the limits. If you look at the right hand side of (\ref{eq:lat2}) more carefully, you can see that the taking the limit as \(a\) goes to \(0\) leads to the right as left derivative of \(P(ja,t)\), such that \begin{align} \lim_{a \to 0} \frac{P(ja+a,t)-P(ja,t)}{a} & \text{ is the right derivative},\\[.1em] \lim_{a \to 0} \frac{P(ja,t)-P(ja-a,t)}{a} & \text{ is the left derivative}. \end{align} Since we have the difference of these two derivatives, we have one missing component to make this difference the second derivative of \(P(ja,t)\), which is another \(a\) in the denominator. So if we divide (\ref{eq:lat2}) by another \(a\), we will have \begin{align} \frac{\partial P(aj,t)}{\partial t} &= \frac{a^2\lambda}{2} \Bigg[\frac{\frac{P(ja+a,t)-P(ja,t)}{a}-\frac{P(ja,t)-P(ja-a,t)}{a}}{a}\Bigg],\\ \end{align} and when we take the limits, we will end up with the following equation \begin{align} \lim_{\substack{a\to 0 \\ \lambda \to \infty}} \frac{\partial P(aj,t)}{\partial t} &= \lim_{\substack{a\to 0 \\ \lambda \to \infty}} \frac{a^2 \lambda}{2}\frac{\frac{P(ja+a,t)-P(ja,t)}{a}-\frac{P(ja,t)-P(ja-a,t)}{a}}{a}, \end{align} and since \(ja\) becomes a continuous variable now, probabilities will become a probability density, such that \begin{align} \frac{\partial p(x,t)}{\partial t} &= D \frac{\partial^2 p(x,t)}{\partial x^2}, \label{eq:difc} \end{align} where \(x=ja\), and \begin{align} D \equiv \lim_{\substack{a\to 0 \\ \lambda \to \infty}} \frac{a^2 \lambda}{2}, \end{align} which is there to even out the dimensions of right and left hand side of the equation. Note that we ended up with exactly the same diffusion equation, (\ref{eq:diffEq}), where concentration is used rather than the probability density.

Now we should solve this equation.

In order to make a general solution, let use increase the spatial dimensions, and assume a \(d\) dimensional space where the position vector is denoted with \(\mathbf{\vec{r}}\), \begin{equation} \mathbf{\vec{r}}=\begin{bmatrix} r_{1}, r_{2}, \dots, r_{d} \end{bmatrix} ^{T}, \end{equation} and (\ref{eq:difc}) becomes, \begin{align} \frac{\partial p(\, \mathbf{\vec{r}}, t \,)}{\partial t} = D \nabla ^2 p(\, \mathbf{\vec{r}}, t \,) \end{align} Since this is a partial differential equation, first order in time and second order in space, we will an initial condition, and two boundary conditions. For the boundary conditions, we will assume the space is infinite, such that the probability of being at \(\pm \infty\) is \(0\), and write \begin{align} lim_{\, \lVert {\mathbf{\vec{r}}} \rVert \to \infty} \, p(\, \mathbf{\vec{r}}, t \,) = 0. \end{align} Note that we could also define different boundary conditions, such as reflecting or absorbing surfaces at \(\pm \infty\).

Regarding the initial condition, we will assume that we have a single molecule at the origin at \(t=0\), such that \begin{align} p(\, \mathbf{\vec{r}}, 0 \,) = \delta^{(d)}(\,{\mathbf{\vec{r}}}\,). \end{align} Equation (\ref{eq:difc}) contains partial derivaties with respect to both time and space. Since the space we have defines is infitine, position of our molecule in each dimension varies from \(-\infty\) to \(+\infty\). On the other hand, time flows only in the positive direction, i.e., \(t : 0 \to \infty\). So in order to solve this equation, we must take both the Fourier transform and Laplace transform, former with respect to space, and latter with respect to time.

Let us define the Laplace transform of \(p(\, \mathbf{\vec{r}}, t \,)\) first, such that \begin{align} \mathcal{L}\big[p(\, \mathbf{\vec{r}}, t \,)\big] = \tilde{p}\,(\,\mathbf{\vec{r}},s\,). \end{align} So when we take the Laplace transform of the whole diffusion equation given as (\ref{eq:difc}), by taking the Laplace transform properties into consideration, we will have \begin{align} s\,\tilde{p}\,(\,\mathbf{\vec{r}},s\,) - p(\, \mathbf{\vec{r}},0 \,) &= D \nabla ^2 p(\, \mathbf{\vec{r}}, s \,),\\[.1em] (s-D \nabla ^2\,)\tilde{p}\,(\,\mathbf{\vec{r}},s\,) &= p(\, \mathbf{\vec{r}},0 \,) ,\\[.1em] (s-D \nabla ^2\,)\tilde{p}\,(\,\mathbf{\vec{r}},s\,) &= \delta^{(d)}(\,{\mathbf{\vec{r}}}\,) \label{eq:takef}, \end{align} Now, as the next step, we should take the Fourier Transform of (\ref{eq:takef}) for the spatial direction, which results in \begin{align} (s+D\mathbf{{k}^2}\,)\,\tilde{f}\,(\,\mathbf{\vec{k}},s\,) &= 1. \label{eq:ft1} \end{align} where \(\tilde{f}\,(\,\mathbf{\vec{k}},s\,)\) denotes the Fourier Transform of \(\tilde{p}\,(\,\mathbf{\vec{r}},s\,)\), defined as \begin{align} \mathscr{F}[{\tilde{p}\,(\,\mathbf{\vec{r}},s\,)}] = \tilde{f}\,(\,\mathbf{\vec{k}},s\,) = \int_{-\infty}^{\infty} \tilde{p}\,(\,\mathbf{\vec{r}},s\,) \,e^{-i\mathbf{\vec{k}}\cdot\mathbf{\vec{r}}}\mathop{}\!\mathrm{d}^{d}\mathbf{\vec{r}}. \end{align} You can refer to the appendix section at the end of this post for the proof of \(\nabla ^2\,\tilde{p}\,(\,\mathbf{\vec{r}},s\,) = -\mathbf{{k}^2}\,\tilde{f}\,(\,\mathbf{\vec{k}},s\,)\). We can re-write (\ref{eq:ft1}), which leads to the equation \begin{align} \tilde{f}\,(\,\mathbf{\vec{k}},s\,) &= \frac{1}{(s+D\mathbf{{k}^2}\,)} \label{eq:fin01}. \end{align} So in order to find the solution for \(p(\mathbf{\vec{r}},t)\), we need to inverse all the transforms we have taken so far.

Which one to inverse first? If you inverse the Fourier transform first, than you have to deal with a \(d\) dimensional integral, since the space is \(d\) dimensional. On the other hand, if you invert the Laplace transform first, the only dimension you have to deal with will be time. So we will go with inverting the Laplace transform first, such that \begin{align} f\,(\,\mathbf{\vec{k}},t\,) = \mathcal{L}^{-1}\Big[\,{\tilde{f}\,(\,\mathbf{\vec{k}},s\,)}\Big]=\mathcal{L}^{-1}\Bigg[{ \frac{1}{(s+D\mathbf{{k}^2}\,)}}\Bigg] &= e^{-D\mathbf{k^2}t}, \end{align} where \(\mathbf{k}^2 = k_{1}^2 + k_{2}^2 + \dots + k_{d}^2\).

Since we can not avoid the Fourier transform forever, we need to invert it too. Taking the inverse Fourier transform leads to \begin{align} p(\, \mathbf{\vec{r}}, t \,) = \mathscr{F}^{-1}\big[e^{-D\mathbf{k^2}t}\big] = \frac{1}{(2\pi)^{d}} \int_{-\infty}^{\infty}e^{-D\mathbf{k^2}t}\, e^{i\mathbf{\vec{k}}\cdot\mathbf{\vec{r}}}\, \mathop{}\!\mathrm{d}^{d}\mathbf{\vec{k}}, \end{align} which is nothing but the multiplication of \(d\) integrals when you choose Cartesian coordinates to integrate. This multiplication can be written as \begin{align} p(\, \mathbf{\vec{r}}, t \,) &= \frac{1}{(2\pi)^{d}} \Bigg(\int_{-\infty}^{\infty} e^{ik_{1}r_{1}-Dk_{1}^{2}t} \mathop{}\!\mathrm{d} k_{1}\Bigg) \dots \Bigg(\int_{-\infty}^{\infty} e^{ik_{d}r_{d}-Dk_{d}^{2}t} \mathop{}\!\mathrm{d} k_{d}\Bigg), \\[.1em] &= \prod_{n=1}^{d}\frac{1}{2\pi}\int_{-\infty}^{\infty} e^{ik_{n}r_{n}-Dk_{n}^{2}t} \mathop{}\!\mathrm{d} k_{n}. \label{eq:pro} \end{align} Notice that each multiplier in (\ref{eq:pro}) is indeed a Gaussian integral, if you complete the square of the power of each exponential function. After completing the square for each exponential, we will have \begin{align} p(\, \mathbf{\vec{r}}, t \,) &= \prod_{n=1}^{d}\frac{1}{2\pi}\int_{-\infty}^{\infty} e^{-Dt\big[{k_{n}^{2} -\frac{ik_{n}r_{n}}{Dt} + \frac{i^2r_{n}^2}{4D^2t^2} -\frac{i^2r_{n}^2}{4D^2t^2}}\big]} \mathop{}\!\mathrm{d} k_{n}, \\[.1em] &= \prod_{n=1}^{d}\frac{1}{2\pi}\int_{-\infty}^{\infty} e^{-Dt\big[{k_{n}-\frac{ir_{n}}{2Dt}}\big]^2 + \frac{i^2r_{n}^2}{4Dt}} \mathop{}\!\mathrm{d} k_{n}, \\[.1em] &= \prod_{n=1}^{d}\,\frac{1}{2\pi}e^{-\frac{r_{n}^2}{4Dt}}\int_{-\infty}^{\infty} e^{-Dt\big[{k_{n}-\frac{ir_{n}}{2Dt}}\big]^2} \mathop{}\!\mathrm{d} k_{n}, \\[.1em] &= \prod_{n=1}^{d}\,\frac{1}{2\pi}e^{-\frac{r_{n}^2}{4Dt}} \sqrt{\frac{\pi}{Dt}}, \\[.1em] &= \frac{1}{(4\pi Dt)^{d/2}}e^{\frac{-\mathbf{r}^2}{4Dt}} \label{eq:finend}. \end{align} which is the solution for the diffusion equation.

In order to gain some intution about this solution, take a look at the animation below. Looks like we are watching a drop of coffee diffusing through a piece of paper, right? This is indeed the animation of (\ref{eq:finend}) in a \(2\) dimensional space, captured for \(4\) seconds.

APPENDIX

Let us define the Fourier transform of \(\nabla ^2 f(x,t)\), such as \begin{align} \mathscr{F}\big[\nabla ^2 f(x,t)\big] &= \int_{-\infty}^{\infty}e^{-i\omega x}\frac{\partial^{2}f(x,t)}{\partial x^2} \mathop{}\!\mathrm{d} x, \end{align} which can be solved by applying the method of integration by parts, which is the utilization of the following formula \begin{align} \int u \, \mathop{}\!\mathrm{d} v = u\,v - \int v\, \mathop{}\!\mathrm{d} u. \label{eq:ibp} \end{align} We will choose \(u\) and \(v\) such that \begin{align} \mathop{}\!\mathrm{d} v &= \frac{\partial^{2}f(x,t)}{\partial x^2} \mathop{}\!\mathrm{d} x, \\[.1em] \frac{\mathop{}\!\mathrm{d} v}{\mathop{}\!\mathrm{d} x} &= \frac{\partial^{2}f(x,t)}{\partial x^2}, \\[.1em] v &= \frac{\partial f(x,t)}{\partial x}, \end{align} and \begin{align} u &= e^{-i\omega x} \\[.1em] \mathop{}\!\mathrm{d} u &= -i\omega e^{-i\omega x} \mathop{}\!\mathrm{d} x. \end{align} Plugging them into the (\ref{eq:ibp}) leads to \begin{align} \int_{-\infty}^{\infty} u \, \mathop{}\!\mathrm{d} v &= u\, v \Big|_{-\infty}^{\infty} - \int_{-\infty}^{\infty} v \,\mathop{}\!\mathrm{d} u \\[.1em] \int_{-\infty}^{\infty} e^{-i\omega x} \frac{\partial^{2}f(x,t)}{\partial x^2} \mathop{}\!\mathrm{d} x &= e^{-i\omega x} \, \frac{\partial f(x,t)}{\partial x} \Big|_{-\infty}^{\infty} +i\omega\int_{-\infty}^{\infty} \frac{\partial f(x,t)}{\partial x} \, e^{-i\omega x} \mathop{}\!\mathrm{d} x\\[.1em] &=i\omega\int_{-\infty}^{\infty} \frac{\partial f(x,t)}{\partial x} \, e^{-i\omega x} \mathop{}\!\mathrm{d} x. \end{align} We need to repeat integration by parts one more time for the right hand side of the equation. We will choose the new \(u^*\) and \(v^*\), such that \begin{align} \mathop{}\!\mathrm{d} v^* &= \frac{\partial f(x,t)}{\partial x} \mathop{}\!\mathrm{d} x, \\[.1em] \frac{\mathop{}\!\mathrm{d} v^*}{\mathop{}\!\mathrm{d} x} &= \frac{\partial f(x,t)}{\partial x}, \\[.1em] v^* &= f(x,t), \end{align} and \begin{align} u^* &= e^{-i\omega x}, \\[.1em] \mathop{}\!\mathrm{d} u^* &= -i\omega e^{-i\omega x} \mathop{}\!\mathrm{d} x. \end{align} Again, plugging them into the (\ref{eq:ibp}) results in \begin{align} i\omega\int_{-\infty}^{\infty} \frac{\partial f(x,t)}{\partial x} \, e^{-i\omega x} \mathop{}\!\mathrm{d} x & = i\omega \bigg\{{e^{-i\omega x} \, f(x,t) \Big|_{-\infty}^{\infty} +i\omega\int_{-\infty}^{\infty} f(x,t)\, e^{-i\omega x} \mathop{}\!\mathrm{d} x\ }\bigg\}, \\[.1em] &= i^2\omega^2 \mathscr{F}\big[f(x,t)\big], \\[.1em] &= -\omega^2\mathscr{F}\big[f(x,t)\big]. \end{align} As a result, we will have \begin{align} \mathscr{F}\big[\nabla ^2 f(x,t)\big] = -\omega^2\mathscr{F}\big[f(x,t)\big]. \end{align} Footnotes

1. You can also refer to this post for more information about this random movement, i.e., Brownian motion.

smallscaled

Microscopic theory of Diffusion

Blog Archive

Category