1D Convolution Autodiff Primitive Rules

... and cross-correlation (which is often used instead of convolution when defining convolutional neural networks; looking at you, TensorFlow and PyTorch).

Created with ❤️ by Machine Learning & Simulation.


Throughout, assume a 1D convolution/cross-correlation with a kernel/filter of size $3$, i.e., $w = [w_1, w_2, w_3]^T \in \Re^3$, and an input of size $5$, i.e., $x=[x_1, x_2, x_3, x_4, x_5]^T \in \Re^5$.

$w \ast x \dots$ convolution of $x$ with filter $w$.
$w \star x \dots$ cross-correlation of $x$ with filter $w$.
$f(\cdot)\dots$ the flip/reverse operator.
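$p_0(\cdot)\dots$ the zero padding operator (one zero on each end); $p_0^2(\cdot)$ applies it twice, i.e., two zeros on each end.
$p_\infty(\cdot)\dots$ the periodic padding operator (one wrapped-around element on each end).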
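The operators above can be sketched in plain Python. This is a minimal illustration, not a reference implementation: the helper names (`flip`, `pad_zero`, `pad_periodic`, `cross_correlate`, `convolve`) are made up for this sketch, and it assumes that each padding operator adds one element per side and that an unpadded operation slides the filter only over positions where it fully overlaps the input.

```python
def flip(v):
    """f(v): reverse the order of the entries."""
    return list(v)[::-1]


def pad_zero(v, n=1):
    """p_0 applied n times: n zeros on each end (assumed convention)."""
    return [0] * n + list(v) + [0] * n


def pad_periodic(v, n=1):
    """p_inf: wrap the last/first n entries around (assumed convention)."""
    v = list(v)
    return v[-n:] + v + v[:n]


def cross_correlate(w, x):
    """w ⋆ x without padding: slide w over x as-is."""
    k = len(w)
    return [sum(w[j] * x[i + j] for j in range(k)) for i in range(len(x) - k + 1)]


def convolve(w, x):
    """w * x without padding: cross-correlation with the flipped filter."""
    return cross_correlate(flip(w), x)


w = [1, 2, 3]
x = [1, 2, 3, 4, 5]

y_valid = cross_correlate(w, x)                   # 3 entries
y_same = cross_correlate(w, pad_zero(x))          # 5 entries
y_periodic = cross_correlate(w, pad_periodic(x))  # 5 entries
y_full = cross_correlate(w, pad_zero(x, 2))       # 7 entries
y_conv = convolve(w, x)                           # 3 entries, filter flipped

print(y_valid, y_same, y_periodic, y_full, y_conv, sep="\n")
```

The only difference between `convolve` and `cross_correlate` is the flip of the filter, which is why the two families of rules look so similar.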

Symbolic Notation

| Primitive | Primal | Pullback/vJp into filter | Pullback/vJp into input |
| --- | --- | --- | --- |
| Periodic Padding Convolution | $ y = w \ast p_\infty(x) $ | $ \bar{w} = f(x) \ast p_\infty(\bar{y}) $ | $ \bar{x} = f(w) \ast p_\infty(\bar{y}) $ |
| Periodic Padding Cross-Correlation | $ y = w \star p_\infty(x) $ | $ \bar{w} = f(x \star p_\infty(\bar{y})) $ | $ \bar{x} = f(w) \star p_\infty(\bar{y}) $ |
| "Same" Padding Convolution | $ y = w \ast p_0(x) $ | $ \bar{w} = f(x) \ast p_0(\bar{y}) $ | $ \bar{x} = f(w) \ast p_0(\bar{y}) $ |
| "Same" Padding Cross-Correlation | $ y = w \star p_0(x) $ | $ \bar{w} = f(x \star p_0(\bar{y})) $ | $ \bar{x} = f(w) \star p_0(\bar{y}) $ |
| "Valid" Padding Convolution | $ y = w \ast x $ | $ \bar{w} = f(x) \ast p_0^2(\bar{y}) $ | $ \bar{x} = f(w) \ast p_0^2(\bar{y}) $ |
| "Valid" Padding Cross-Correlation | $ y = w \star x $ | $ \bar{w} = f(x \star p_0^2(\bar{y})) $ | $ \bar{x} = f(w) \star p_0^2(\bar{y}) $ |
| "Full" Padding Convolution | $ y = w \ast p_0^2(x) $ | $ \bar{w} = f(x) \ast \bar{y} $ | $ \bar{x} = f(w) \ast \bar{y} $ |
| "Full" Padding Cross-Correlation | $ y = w \star p_0^2(x) $ | $ \bar{w} = f(x \star \bar{y}) $ | $ \bar{x} = f(w) \star \bar{y} $ |
Matrix-Multiply Notation

| Primitive | Primal | Pullback/vJp into filter | Pullback/vJp into input |
| --- | --- | --- | --- |
| Periodic Cross-Correlation | $ \begin{bmatrix} y_1 \\ y_2 \\ y_3 \\ y_4 \\ y_5 \end{bmatrix} = \begin{bmatrix} w_2 & w_3 & 0 & 0 & w_1 \\ w_1 & w_2 & w_3 & 0 & 0 \\ 0 & w_1 & w_2 & w_3 & 0 \\ 0 & 0 & w_1 & w_2 & w_3 \\ w_3 & 0 & 0 & w_1 & w_2 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \end{bmatrix} $ | $ \begin{bmatrix} \bar{w}_1 \\ \bar{w}_2 \\ \bar{w}_3 \end{bmatrix} = \begin{bmatrix} x_5 & x_1 & x_2 & x_3 & x_4 \\ x_1 & x_2 & x_3 & x_4 & x_5 \\ x_2 & x_3 & x_4 & x_5 & x_1 \end{bmatrix} \begin{bmatrix} \bar{y}_1 \\ \bar{y}_2 \\ \bar{y}_3 \\ \bar{y}_4 \\ \bar{y}_5 \end{bmatrix} $ | $ \begin{bmatrix} \bar{x}_1 \\ \bar{x}_2 \\ \bar{x}_3 \\ \bar{x}_4 \\ \bar{x}_5 \end{bmatrix} = \begin{bmatrix} w_2 & w_1 & 0 & 0 & w_3 \\ w_3 & w_2 & w_1 & 0 & 0 \\ 0 & w_3 & w_2 & w_1 & 0 \\ 0 & 0 & w_3 & w_2 & w_1 \\ w_1 & 0 & 0 & w_3 & w_2 \end{bmatrix} \begin{bmatrix} \bar{y}_1 \\ \bar{y}_2 \\ \bar{y}_3 \\ \bar{y}_4 \\ \bar{y}_5 \end{bmatrix} $ |
| "Same" padding (one-element zero padding on both ends) Cross-Correlation | $ \begin{bmatrix} y_1 \\ y_2 \\ y_3 \\ y_4 \\ y_5 \end{bmatrix} = \begin{bmatrix} w_2 & w_3 & 0 & 0 & 0 \\ w_1 & w_2 & w_3 & 0 & 0 \\ 0 & w_1 & w_2 & w_3 & 0 \\ 0 & 0 & w_1 & w_2 & w_3 \\ 0 & 0 & 0 & w_1 & w_2 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \end{bmatrix} $ | $ \begin{bmatrix} \bar{w}_1 \\ \bar{w}_2 \\ \bar{w}_3 \end{bmatrix} = \begin{bmatrix} 0 & x_1 & x_2 & x_3 & x_4 \\ x_1 & x_2 & x_3 & x_4 & x_5 \\ x_2 & x_3 & x_4 & x_5 & 0 \end{bmatrix} \begin{bmatrix} \bar{y}_1 \\ \bar{y}_2 \\ \bar{y}_3 \\ \bar{y}_4 \\ \bar{y}_5 \end{bmatrix} $ | $ \begin{bmatrix} \bar{x}_1 \\ \bar{x}_2 \\ \bar{x}_3 \\ \bar{x}_4 \\ \bar{x}_5 \end{bmatrix} = \begin{bmatrix} w_2 & w_1 & 0 & 0 & 0 \\ w_3 & w_2 & w_1 & 0 & 0 \\ 0 & w_3 & w_2 & w_1 & 0 \\ 0 & 0 & w_3 & w_2 & w_1 \\ 0 & 0 & 0 & w_3 & w_2 \end{bmatrix} \begin{bmatrix} \bar{y}_1 \\ \bar{y}_2 \\ \bar{y}_3 \\ \bar{y}_4 \\ \bar{y}_5 \end{bmatrix} $ |
| "Valid" padding (no padding) Cross-Correlation | $ \begin{bmatrix} y_1 \\ y_2 \\ y_3 \end{bmatrix} = \begin{bmatrix} w_1 & w_2 & w_3 & 0 & 0 \\ 0 & w_1 & w_2 & w_3 & 0 \\ 0 & 0 & w_1 & w_2 & w_3 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \end{bmatrix} $ | $ \begin{bmatrix} \bar{w}_1 \\ \bar{w}_2 \\ \bar{w}_3 \end{bmatrix} = \begin{bmatrix} x_1 & x_2 & x_3 \\ x_2 & x_3 & x_4 \\ x_3 & x_4 & x_5 \end{bmatrix} \begin{bmatrix} \bar{y}_1 \\ \bar{y}_2 \\ \bar{y}_3 \end{bmatrix} $ | $ \begin{bmatrix} \bar{x}_1 \\ \bar{x}_2 \\ \bar{x}_3 \\ \bar{x}_4 \\ \bar{x}_5 \end{bmatrix} = \begin{bmatrix} w_1 & 0 & 0 \\ w_2 & w_1 & 0 \\ w_3 & w_2 & w_1 \\ 0 & w_3 & w_2 \\ 0 & 0 & w_3 \end{bmatrix} \begin{bmatrix} \bar{y}_1 \\ \bar{y}_2 \\ \bar{y}_3 \end{bmatrix} $ |
| "Full" padding (2-element zero padding on both ends) Cross-Correlation | $ \begin{bmatrix} y_1 \\ y_2 \\ y_3 \\ y_4 \\ y_5 \\ y_6 \\ y_7 \end{bmatrix} = \begin{bmatrix} w_3 & 0 & 0 & 0 & 0 \\ w_2 & w_3 & 0 & 0 & 0 \\ w_1 & w_2 & w_3 & 0 & 0 \\ 0 & w_1 & w_2 & w_3 & 0 \\ 0 & 0 & w_1 & w_2 & w_3 \\ 0 & 0 & 0 & w_1 & w_2 \\ 0 & 0 & 0 & 0 & w_1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \end{bmatrix} $ | $ \begin{bmatrix} \bar{w}_1 \\ \bar{w}_2 \\ \bar{w}_3 \end{bmatrix} = \begin{bmatrix} 0 & 0 & x_1 & x_2 & x_3 & x_4 & x_5 \\ 0 & x_1 & x_2 & x_3 & x_4 & x_5 & 0 \\ x_1 & x_2 & x_3 & x_4 & x_5 & 0 & 0 \end{bmatrix} \begin{bmatrix} \bar{y}_1 \\ \bar{y}_2 \\ \bar{y}_3 \\ \bar{y}_4 \\ \bar{y}_5 \\ \bar{y}_6 \\ \bar{y}_7 \end{bmatrix} $ | $ \begin{bmatrix} \bar{x}_1 \\ \bar{x}_2 \\ \bar{x}_3 \\ \bar{x}_4 \\ \bar{x}_5 \end{bmatrix} = \begin{bmatrix} w_3 & w_2 & w_1 & 0 & 0 & 0 & 0 \\ 0 & w_3 & w_2 & w_1 & 0 & 0 & 0 \\ 0 & 0 & w_3 & w_2 & w_1 & 0 & 0 \\ 0 & 0 & 0 & w_3 & w_2 & w_1 & 0 \\ 0 & 0 & 0 & 0 & w_3 & w_2 & w_1 \end{bmatrix} \begin{bmatrix} \bar{y}_1 \\ \bar{y}_2 \\ \bar{y}_3 \\ \bar{y}_4 \\ \bar{y}_5 \\ \bar{y}_6 \\ \bar{y}_7 \end{bmatrix} $ |
Stride & Dilation in Matrix-Multiply Notation

| Primitive | Primal | Pullback/vJp into filter | Pullback/vJp into input |
| --- | --- | --- | --- |
| "Valid" Padding with stride $2$ Cross-Correlation | $ \begin{bmatrix} y_1 \\ y_2 \end{bmatrix} = \begin{bmatrix} w_1 & w_2 & w_3 & 0 & 0 \\ 0 & 0 & w_1 & w_2 & w_3 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \end{bmatrix} $ | $ \begin{bmatrix} \bar{w}_1 \\ \bar{w}_2 \\ \bar{w}_3 \end{bmatrix} = \begin{bmatrix} x_1 & x_3 \\ x_2 & x_4 \\ x_3 & x_5 \end{bmatrix} \begin{bmatrix} \bar{y}_1 \\ \bar{y}_2 \end{bmatrix} $ | $ \begin{bmatrix} \bar{x}_1 \\ \bar{x}_2 \\ \bar{x}_3 \\ \bar{x}_4 \\ \bar{x}_5 \end{bmatrix} = \begin{bmatrix} w_1 & 0 \\ w_2 & 0 \\ w_3 & w_1 \\ 0 & w_2 \\ 0 & w_3 \end{bmatrix} \begin{bmatrix} \bar{y}_1 \\ \bar{y}_2 \end{bmatrix} $ |
| "Valid" Padding with dilation $2$ Cross-Correlation | $ \begin{bmatrix} y_1 \end{bmatrix} = \begin{bmatrix} w_1 & 0 & w_2 & 0 & w_3 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \end{bmatrix} $ | $ \begin{bmatrix} \bar{w}_1 \\ \bar{w}_2 \\ \bar{w}_3 \end{bmatrix} = \begin{bmatrix} x_1 \\ x_3 \\ x_5 \end{bmatrix} \begin{bmatrix} \bar{y}_1 \end{bmatrix} $ | $ \begin{bmatrix} \bar{x}_1 \\ \bar{x}_2 \\ \bar{x}_3 \\ \bar{x}_4 \\ \bar{x}_5 \end{bmatrix} = \begin{bmatrix} w_1 \\ 0 \\ w_2 \\ 0 \\ w_3 \end{bmatrix} \begin{bmatrix} \bar{y}_1 \end{bmatrix} $ |