The Feynman-Kac theorem, Kolmogorov's Forward Equation, and the Fokker-Planck Equation

Published on 8/5/2025

probability, pdes

Let $T>0$, let $B(t)$ be a Brownian motion, let $\mathcal{F}(t)$ be the filtration generated by $B(t)$, let $X(t)$ be a stochastic process adapted to $\mathcal{F}(t)$, and let $h: \mathbb{R} \to \mathbb{R}$ be a measurable function. Suppose $X(t)$ satisfies:

\begin{equation} dX(t) = \alpha(t,X(t))\, dt + \beta(t,X(t))\, dB(t), \end{equation}

where $\alpha(t,X(t))$ and $\beta(t,X(t))$ are adapted processes. For $x \in \mathbb{R}$ and $0 < t < T$, let $g(t,x) = E^{t,x}\big[h(X(T))\big]$ denote the expected value of $h(X(T))$ starting from $X(t) = x$. If $\alpha$ and $\beta$ are sufficiently nice, then $X(t)$ is a Markov process and

$$E[h(X(T)) \mid \mathcal{F}(t)] = g(t,X(t)).$$
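For a concrete instance, take $\alpha \equiv 0$ and $\beta \equiv 1$, so that $X$ is just the Brownian motion $B$, and let $h(x) = x^2$. Since $B(T) - B(t)$ is independent of $\mathcal{F}(t)$ with mean $0$ and variance $T-t$, starting from $X(t) = x$ we get

$$g(t,x) = E^{t,x}\big[X(T)^2\big] = E\big[(x + B(T) - B(t))^2\big] = x^2 + (T-t).$$

We will use this example to sanity-check the results below.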

Theorem 1 (Feynman-Kac). The function $g(t,x)$ satisfies

\begin{align} &g_t(t,x) + \alpha(t,x)\, g_x(t,x) + \frac{1}{2} \beta^2(t,x)\, g_{xx}(t,x) = 0, \\ & g(T,x) = h(x). \end{align}

Proof. We first note that the process $g(t,X(t))$ is a martingale. Indeed, for $0 < s < t$,

$$E[g(t,X(t)) \mid \mathcal{F}(s)] = E\big[ E[h(X(T)) \mid \mathcal{F}(t)] \mid \mathcal{F}(s)\big] = E[h(X(T)) \mid \mathcal{F}(s)] = g(s,X(s)).$$

We now use Itô's formula, together with $dX(t)\,dX(t) = \beta^2(t,X(t))\,dt$ (since $dB(t)\,dB(t) = dt$), to compute the differential of $g(t,X(t))$:

$$d g(t,X(t)) = g_t\, dt + g_x\, dX(t) + \frac{1}{2} g_{xx}\, dX(t)\,dX(t) = \left(g_t + \alpha(t,X(t))\, g_x + \frac{1}{2} \beta^2(t,X(t))\, g_{xx}\right)dt + \beta(t,X(t))\, g_x\, dB(t).$$

Since $g(t,X(t))$ is a martingale, the $dt$ term must vanish, so

$$g_t(t,x) + \alpha(t,x)\, g_x(t,x) + \frac{1}{2} \beta^2(t,x)\, g_{xx}(t,x) = 0.$$

Finally, since $h(X(T))$ is $\mathcal{F}(T)$-measurable,

$$g(T,X(T)) = E[h(X(T)) \mid \mathcal{F}(T)] = h(X(T)),$$

which must hold for any value of $X(T)$; hence $g(T,x) = h(x)$.
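Returning to the Brownian-motion example with $h(x) = x^2$ and $g(t,x) = x^2 + (T-t)$, the theorem is easy to verify directly: here $\alpha \equiv 0$, $\beta \equiv 1$, and

$$g_t(t,x) + \tfrac{1}{2} g_{xx}(t,x) = -1 + \tfrac{1}{2}\cdot 2 = 0, \qquad g(T,x) = x^2 = h(x).$$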

Since the process $X(t)$ is a Markov process, it has a transition density $p(t,T,x,y)$, so if $X(t)=x$, then

$$E[h(X(T)) \mid \mathcal{F}(t)] = \int h(y)\, p(t,T, x, y)\, dy.$$
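For standard Brownian motion ($\alpha \equiv 0$, $\beta \equiv 1$), for example, the transition density is the Gaussian heat kernel

$$p(t,T,x,y) = \frac{1}{\sqrt{2\pi (T-t)}}\, e^{-\frac{(y-x)^2}{2(T-t)}},$$

which we will use below to sanity-check both the backward and the forward equations.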

From the Feynman-Kac theorem, we obtain the Kolmogorov Backward Equation.

Theorem (Kolmogorov Backward Equation). Let $p(t,T,x,y)$ be the transition density for the solution to the SDE in equation (1). Then,

$$p_t + \alpha(t,x)\, p_x + \frac{1}{2} \beta^2(t,x)\, p_{xx} = 0,$$

where the derivatives are taken in the backward variables $t$ and $x$.

Proof. From the Feynman-Kac theorem we know

$$g_t(t,x) + \alpha(t,x)\, g_x(t,x) + \frac{1}{2} \beta^2(t,x)\, g_{xx}(t,x) = 0.$$

Since

$$g(t,x) = \int h(y)\, p(t,T,x,y)\, dy,$$

differentiating under the integral sign yields

$$\int h(y) \left(p_t + \alpha(t,x)\, p_x + \frac{1}{2} \beta^2(t,x)\, p_{xx}\right) dy = 0.$$

This holds for all measurable functions $h$, so

$$p_t + \alpha(t,x)\, p_x + \frac{1}{2} \beta^2(t,x)\, p_{xx} = 0.$$
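As a check, take the Brownian-motion kernel above and write $\tau = T-t$, so $p = (2\pi\tau)^{-1/2} e^{-(y-x)^2/(2\tau)}$. The classical heat-kernel identity

$$\frac{\partial p}{\partial \tau} = \frac{1}{2}\frac{\partial^2 p}{\partial x^2},$$

together with $p_t = -\partial p/\partial \tau$, gives $p_t + \frac{1}{2} p_{xx} = 0$, which is the backward equation with $\alpha \equiv 0$, $\beta \equiv 1$.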

What this theorem tells us is that, fixing the terminal point $X(T) = y$, we can solve for the transition density as a function of the starting point $x$ and starting time $t$. Similarly, we can establish a forward equation for the transition density: given the current position $X(t) = x$, we wish to find the density of $X(T) = y$ as a function of $T$ and $y$ for $T > t$.

Theorem (Kolmogorov Forward Equation / Fokker-Planck Equation). Let $p(t,T,x,y)$ be the transition density for the solution to the SDE in equation (1). Then $p(t,T,x,y)$ satisfies

$$\frac{\partial }{\partial T} p(t,T,x,y) + \frac{\partial}{\partial y} \big(\alpha(T,y)\, p(t,T,x,y)\big) - \frac{1}{2} \frac{\partial^2}{\partial y^2} \big(\beta^2(T,y)\, p(t,T,x,y)\big) = 0.$$

Proof. Let $a < b$, let $h$ be any smooth function compactly supported on $[a,b]$, and suppose $X(t) = x \in (a,b)$. Take for example

$$h(x) = \begin{cases} e^{\frac{1}{(x-a)(x-b)}} & x \in (a,b)\\ 0 & \text{otherwise.} \end{cases}$$

For $t < u < T$,

$$d\big(h(X(u))\big) = \left(\alpha(u,X(u))\, h'(X(u)) + \frac{1}{2} \beta^2(u,X(u))\, h''(X(u))\right) du + \beta(u,X(u))\, h'(X(u))\, dB(u).$$

Equivalently,

\begin{align*} h(X(T)) = h(x) &+ \int_t^T \left(\alpha(u,X(u))\, h'(X(u)) + \frac{1}{2} \beta^2(u,X(u))\, h''(X(u))\right) du \\ &+ \int_t^T \beta(u,X(u))\, h'(X(u))\, dB(u). \end{align*}

Taking expectations of both sides (the $dB(u)$ integral is a martingale starting at $0$, so its expectation vanishes) and noting that $h, h', h''$ are supported in $(a,b)$,

\begin{align*} \int_a^b h(y)\, p(t,T,x,y)\, dy &= E [h(X(T))]\\ &= h(x) + \int_t^T E \left[\alpha(u,X(u))\, h'(X(u)) + \frac{1}{2} \beta^2(u,X(u))\, h''(X(u))\right] du \\ &= h(x) + \int_t^T \int_a^b \alpha(u,y)\, h'(y)\, p(t,u,x,y)\, dy\, du + \int_t^T \int_a^b \frac{1}{2} \beta^2(u,y)\, h''(y)\, p(t,u,x,y)\, dy\, du. \end{align*}

We now apply integration by parts once to the second term and twice to the third term, using the fact that $h(a) = h'(a) = h'(b) = h(b) = 0$:

\begin{align*} &\int_a^b h(y)\, p(t,T,x,y)\, dy \\ &= h(x) - \int_t^T \int_a^b \frac{\partial}{\partial y} \big(\alpha(u,y)\, p(t,u,x,y)\big)\, h(y)\, dy\, du + \frac{1}{2} \int_t^T \int_a^b \frac{\partial^2}{\partial y^2}\big(\beta^2(u,y)\, p(t,u,x,y) \big)\, h(y)\, dy\, du. \end{align*}

Finally, differentiating both sides with respect to $T$ results in

$$\int_a^b h(y)\, p_T(t,T,x,y)\, dy = - \int_a^b \frac{\partial}{\partial y} \big(\alpha(T,y)\, p(t,T,x,y)\big)\, h(y)\, dy + \frac{1}{2} \int_a^b \frac{\partial^2}{\partial y^2}\big(\beta^2(T,y)\, p(t,T,x,y) \big)\, h(y)\, dy.$$

Hence,

$$\int_a^b h(y) \left(p_T(t,T,x,y) + \frac{\partial}{\partial y} \big(\alpha(T,y)\, p(t,T,x,y)\big) - \frac{1}{2}\frac{\partial^2}{\partial y^2}\big(\beta^2(T,y)\, p(t,T,x,y)\big)\right) dy = 0.$$

This holds for all smooth functions $h$ compactly supported in $(a,b)$. Since the set of smooth compactly supported functions on $(a,b)$ is dense in the set of continuous compactly supported functions on $(a,b)$, it follows that

$$p_T(t,T,x,y) + \frac{\partial}{\partial y} \big(\alpha(T,y)\, p(t,T,x,y)\big) - \frac{1}{2}\frac{\partial^2}{\partial y^2}\big(\beta^2(T,y)\, p(t,T,x,y)\big) = 0 \quad \text{for all } y \in (a,b).$$

Since $a < b$ are arbitrary, we thus have

$$\frac{\partial }{\partial T} p(t,T,x,y) + \frac{\partial}{\partial y} \big(\alpha(T,y)\, p(t,T,x,y)\big) - \frac{1}{2} \frac{\partial^2}{\partial y^2} \big(\beta^2(T,y)\, p(t,T,x,y)\big) = 0.$$
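The Brownian-motion kernel provides the same kind of check here: since $p$ depends on $x$ and $y$ only through $(y-x)^2$, we have $p_{yy} = p_{xx}$, and with $\tau = T-t$,

$$\frac{\partial p}{\partial T} = \frac{\partial p}{\partial \tau} = \frac{1}{2}\frac{\partial^2 p}{\partial y^2},$$

so $p_T - \frac{1}{2}p_{yy} = 0$, which is the forward equation with $\alpha \equiv 0$, $\beta \equiv 1$.

For a numerical illustration, here is a minimal Monte Carlo sketch (the Ornstein-Uhlenbeck process and the parameters $\theta$, $\sigma$, and the time horizon are arbitrary illustrative choices, not part of the derivation above). For $dX = -\theta X\, dt + \sigma\, dB$, the forward equation has a Gaussian solution with mean $x e^{-\theta(T-t)}$ and variance $\frac{\sigma^2}{2\theta}\big(1 - e^{-2\theta(T-t)}\big)$; an Euler-Maruyama simulation should reproduce these moments:

```python
import numpy as np

# Ornstein-Uhlenbeck process dX = -theta * X dt + sigma dB
# (illustrative parameter choices)
theta, sigma = 1.0, 0.5
t, T, x0 = 0.0, 1.0, 1.0            # start time, end time, starting point X(t) = x0
n_paths, n_steps = 100_000, 1_000
dt = (T - t) / n_steps

rng = np.random.default_rng(0)
X = np.full(n_paths, x0)
for _ in range(n_steps):
    dB = rng.normal(0.0, np.sqrt(dt), size=n_paths)  # Brownian increments
    X += -theta * X * dt + sigma * dB                # Euler-Maruyama step

# Moments of the Gaussian solution of the Fokker-Planck equation for this SDE
mean = x0 * np.exp(-theta * (T - t))
var = sigma**2 * (1 - np.exp(-2 * theta * (T - t))) / (2 * theta)

print("empirical mean/var:    ", X.mean(), X.var())
print("Fokker-Planck mean/var:", mean, var)
```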