GiggleLiu - The Numeric Monster

Jacobians and Hessians for Reversible Primitives

This blog covers the Jacobians and Hessians of reversible primitives, which can be used to propagate gradients and Hessians through a reversible program.

The Definition

For a function $\vec{y} = f(\vec{x})$, we define its Jacobian as

$$
J_{ij} = \frac{\partial y_i}{\partial x_j}.
$$

Its Hessian is

$$
H^k_{ij} = \frac{\partial^2 y_k}{\partial x_i \partial x_j}.
$$
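To make the propagation concrete, here is a sketch of how these two tensors enter the chain rule. The scalar loss $\mathcal{L}$ is an assumption introduced for illustration; it is not part of the definitions above.

$$
\begin{align}
\frac{\partial \mathcal{L}}{\partial x_j} &= \sum_i \frac{\partial \mathcal{L}}{\partial y_i} J_{ij},\\
\frac{\partial^2 \mathcal{L}}{\partial x_i \partial x_j} &= \sum_{k,l} J_{ki} \frac{\partial^2 \mathcal{L}}{\partial y_k \partial y_l} J_{lj} + \sum_k \frac{\partial \mathcal{L}}{\partial y_k} H^k_{ij}.
\end{align}
$$

For a primitive with $H = \mathbf{0}$, the second term of the Hessian rule vanishes and only the Jacobian is needed.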

Jacobian of Reversible Primitives

(1). $a \mathrel+= b$

$$
\begin{align}
J &= \left(\begin{matrix} 1 & 1\\ 0 & 1 \end{matrix}\right)\\
H &= \mathbf{0}
\end{align}
$$

The inverse is $a \mathrel-= b$, and its Jacobian is the inverse of the matrix above:

$$
J(f^{-1}) = J^{-1} = \left(\begin{matrix} 1 & -1\\ 0 & 1 \end{matrix}\right)
$$
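As a quick sanity check, the two matrices can be multiplied to confirm that they are inverses of each other. The sketch below is plain Python; the helper names `matmul`, `plus_eq`, and `minus_eq` are made up for illustration and are not part of any reversible-programming library.

```python
def matmul(A, B):
    """Multiply two small dense matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

def plus_eq(a, b):
    """Forward primitive `a += b`, returning the new state (a, b)."""
    return a + b, b

def minus_eq(a, b):
    """Inverse primitive `a -= b`, returning the new state (a, b)."""
    return a - b, b

# Jacobians of the forward and inverse primitives with respect to (a, b).
J_forward = [[1, 1],
             [0, 1]]
J_inverse = [[1, -1],
             [0, 1]]

product = matmul(J_inverse, J_forward)
round_trip = minus_eq(*plus_eq(3.0, 2.0))

print(product)     # [[1, 0], [0, 1]]
print(round_trip)  # (3.0, 2.0)
```

The same relation holds for the other primitives, which is why only the forward Jacobians need to be tabulated.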

In the following, we omit the Jacobians and Hessians of inverse functions.

(2). $a \mathrel+= b*c$

$$
\begin{align}
J &= \left(\begin{matrix} 1 & c & b\\ 0 & 1 & 0\\ 0 & 0 & 1 \end{matrix}\right)\\
H^a_{bc} &= H^a_{cb} = 1, \quad \text{else } 0
\end{align}
$$
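A table entry like this one can be checked against a central finite difference. The following is a minimal sketch in plain Python; `mul_add` and `numeric_jacobian` are illustrative names, and the test point is arbitrary.

```python
def mul_add(x):
    """Reversible primitive `a += b * c` acting on the state x = [a, b, c]."""
    a, b, c = x
    return [a + b * c, b, c]

def numeric_jacobian(f, x, eps=1e-6):
    """Central-difference estimate of J[i][j] = d f(x)[i] / d x[j]."""
    n = len(x)
    J = [[0.0] * n for _ in range(len(f(x)))]
    for j in range(n):
        x_plus = list(x)
        x_plus[j] += eps
        x_minus = list(x)
        x_minus[j] -= eps
        f_plus, f_minus = f(x_plus), f(x_minus)
        for i in range(len(f_plus)):
            J[i][j] = (f_plus[i] - f_minus[i]) / (2 * eps)
    return J

# Arbitrary test point.
a, b, c = 0.5, 2.0, 3.0
J_numeric = numeric_jacobian(mul_add, [a, b, c])

# Analytic Jacobian with rows and columns ordered as (a, b, c).
J_analytic = [[1.0, c, b],
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0]]

max_error = max(abs(J_numeric[i][j] - J_analytic[i][j])
                for i in range(3) for j in range(3))
print("largest deviation:", max_error)
```

Swapping `mul_add` for another primitive and updating `J_analytic` gives the same check for the remaining entries.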

(3). $a \mathrel+= b/c$

$$
\begin{align}
J &= \left(\begin{matrix} 1 & 1/c & -b/c^2\\ 0 & 1 & 0\\ 0 & 0 & 1 \end{matrix}\right)\\
H^a_{cc} &= 2b/c^3,\\
H^a_{bc} &= H^a_{cb} = -1/c^2, \quad \text{else } 0
\end{align}
$$

(4). $a \mathrel+= b^c$

$$
\begin{align}
J &= \left(\begin{matrix} 1 & cb^{c-1} & b^c \log b \\ 0 & 1 & 0\\ 0 & 0 & 1 \end{matrix}\right)\\
H^a_{bc} &= H^a_{cb} = b^{c-1} + c b^{c-1}\log b,\\
H^a_{bb} &= (c-1)c b^{c-2},\\
H^a_{cc} &= b^c\log^2 b, \quad \text{else } 0
\end{align}
$$
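The Hessian entries of the power primitive are the easiest to get wrong, so a numerical cross-check is worthwhile. Below is a sketch in plain Python that compares them with second-order central differences; `updated_a` and `second_derivative` are illustrative names, and the step size and test point are arbitrary choices.

```python
import math

def updated_a(a, b, c):
    """Value of `a` after the reversible update `a += b ** c`."""
    return a + b ** c

def second_derivative(f, x, i, j, eps=1e-4):
    """Central-difference estimate of d^2 f / (dx_i dx_j) at the point x."""
    def shifted(di, dj):
        y = list(x)
        y[i] += di * eps
        y[j] += dj * eps
        return f(*y)
    return (shifted(1, 1) - shifted(1, -1)
            - shifted(-1, 1) + shifted(-1, -1)) / (4 * eps ** 2)

A, B, C = 0, 1, 2          # positions of a, b, c in the state vector
a, b, c = 0.5, 2.0, 3.0    # arbitrary test point with b > 0
x = [a, b, c]

# Analytic entries of H^a, keyed by the pair of differentiation indices.
analytic = {
    (B, C): b ** (c - 1) + c * b ** (c - 1) * math.log(b),
    (B, B): (c - 1) * c * b ** (c - 2),
    (C, C): b ** c * math.log(b) ** 2,
    (A, B): 0.0,  # one of the entries covered by "else 0"
}

errors = {idx: abs(second_derivative(updated_a, x, *idx) - value)
          for idx, value in analytic.items()}
for idx, err in errors.items():
    print(idx, "deviation:", err)
```

The test point keeps $b > 0$ so that $\log b$ is defined.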

(5). $a \mathrel+= e^b$

$$
\begin{align}
J &= \left(\begin{matrix} 1 & e^b \\ 0 & 1 \end{matrix}\right)\\
H^a_{bb} &= e^b, \quad \text{else } 0
\end{align}
$$

(6). $a \mathrel+= \log b$

$$
\begin{align}
J &= \left(\begin{matrix} 1 & 1/b \\ 0 & 1 \end{matrix}\right)\\
H^a_{bb} &= -1/b^2, \quad \text{else } 0
\end{align}
$$

(7). $a \mathrel+= \sin b$

$$
\begin{align}
J &= \left(\begin{matrix} 1 & \cos b \\ 0 & 1 \end{matrix}\right)\\
H^a_{bb} &= -\sin b, \quad \text{else } 0
\end{align}
$$

(8). $a \mathrel+= \cos b$

$$
\begin{align}
J &= \left(\begin{matrix} 1 & -\sin b \\ 0 & 1 \end{matrix}\right)\\
H^a_{bb} &= -\cos b, \quad \text{else } 0
\end{align}
$$

(9). $a \mathrel+= \vert b\vert$

$$
\begin{align}
J &= \left(\begin{matrix} 1 & {\rm sign} (b) \\ 0 & 1 \end{matrix}\right)\\
H &= \mathbf{0}
\end{align}
$$

(10). $a = -a$

$$
\begin{align}
J &= \left(\begin{matrix} -1 \end{matrix}\right)\\
H &= \mathbf{0}
\end{align}
$$

(11). ${\rm SWAP}(a, b) = (b, a)$

$$
\begin{align}
J &= \left(\begin{matrix} 0 & 1 \\ 1 & 0 \end{matrix} \right)\\
H &= \mathbf{0}
\end{align}
$$

(12). ${\rm ROT}(a, b, \theta)$

$$
{\rm ROT}(a, b, \theta) = \left(\begin{matrix} \cos\theta & - \sin\theta\\ \sin\theta & \cos\theta \end{matrix}\right) \left(\begin{matrix} a\\ b \end{matrix}\right)
$$

$$
\begin{align*}
&J = \left(\begin{matrix} \cos\theta & - \sin\theta & -b\cos\theta-a\sin \theta\\ \sin\theta & \cos\theta & a\cos\theta -b\sin\theta\\ 0 & 0 & 1 \end{matrix}\right)\\
&H^a_{a\theta} = H^a_{\theta a} = -\sin\theta,\\
&H^a_{b\theta} = H^a_{\theta b} = -\cos\theta,\\
&H^a_{\theta\theta} = -a\cos\theta + b\sin\theta,\\
&H^b_{a\theta} = H^b_{\theta a} = \cos\theta,\\
&H^b_{b\theta} = H^b_{\theta b} = -\sin\theta,\\
&H^b_{\theta\theta} = -b\cos\theta-a\sin\theta, \quad \text{else } 0
\end{align*}
$$
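The $\theta$-dependent entries are easier to verify when the rotation is written out component by component. In the sketch below, $a'$ and $b'$ denote the values of $a$ and $b$ after the rotation; this primed notation is introduced here for illustration only.

$$
\begin{align*}
a' &= a\cos\theta - b\sin\theta, & b' &= a\sin\theta + b\cos\theta,\\
\frac{\partial a'}{\partial \theta} &= -a\sin\theta - b\cos\theta = -b', & \frac{\partial b'}{\partial \theta} &= a\cos\theta - b\sin\theta = a',\\
\frac{\partial^2 a'}{\partial \theta^2} &= -a\cos\theta + b\sin\theta = -a', & \frac{\partial^2 b'}{\partial \theta^2} &= -a\sin\theta - b\cos\theta = -b'.
\end{align*}
$$

The first-derivative row reproduces the third column of $J$, and the second-derivative row reproduces $H^a_{\theta\theta}$ and $H^b_{\theta\theta}$.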
CC BY-SA 4.0 GiggleLiu. Last modified: April 04, 2024. Website built with Franklin.jl and the Julia programming language.