
On metrics -- Advanced Notebook¤

This will get mathematical, be warned!

⚠️ ⚠️ ⚠️ ⚠️ ⚠️ This notebook is a WIP; it will be completed with a future release of Exponax ⚠️ ⚠️ ⚠️ ⚠️ ⚠️

At the moment it is a dump of ideas on metric consistency with functional norms, connection to Parseval's theorem, and the relation to the Fourier transform.

import jax
import jax.numpy as jnp
import exponax as ex

Consistency of the metrics computation¤

The discretized states in Exponax, \(u_h \in \mathbb{R}^{C \times N}\), represent continuous functions sampled at an equidistant interval \(\Delta x = L/N\), where \(L\) is the length of the domain and \(N\) is the number of discretization points. Since we only work with periodic boundary conditions, we employ the convention that the left point of the domain is a degree of freedom and the right point is not. Hence, \(u_0\) refers to the value of the continuous function at \(x = 0\), i.e., \(u(0)\), and \(u_{N-1}\) refers to its value at \(x = \frac{L}{N}(N-1)\), i.e., \(u\left(\frac{L}{N}(N-1)\right)\).
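
For concreteness, here is a minimal sketch of this grid convention in plain jax.numpy (the domain extent, resolution, and sampled function are arbitrary illustrative choices, not tied to a particular Exponax API):

L_dom = 3.0                              # domain length L (illustrative choice)
N = 64                                   # number of degrees of freedom (illustrative choice)
dx = L_dom / N
x = jnp.arange(N) * dx                   # x[0] == 0.0, x[-1] == L_dom / N * (N - 1); x == L_dom is excluded
u_h = jnp.sin(2 * jnp.pi * x / L_dom)    # one sampled periodic channel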

Now assume we want to compute the squared \(L^2\) norm of the function \(u(x)\) over the domain \(\Omega = (0, L)\)

\[ \|u\|_{L^2(\Omega)}^2 = \int_{\Omega} |u(x)|^2 \; \mathrm{d}x \]

A way to numerically approximate an integral from values given at equidistant samples is the trapezoidal rule. Assume we want to evaluate the following integral

\[ I = \int_{0}^{L} f(x) \; \mathrm{d}x \]

The trapezoidal rule states that

\[ I = \Delta x \left( \frac{f(0) + f(L)}{2} + \sum_{i=1}^{M-2} f(i \Delta x) \right) + \mathcal{O}(\Delta x^2) \]

where \(\Delta x = L/(M-1)\) is the distance between two consecutive of the \(M\) sample points. In contrast to our discretization on periodic grids, the trapezoidal rule also accounts for the point at the right end of the domain. However, since the value at the right end of the domain must equal the value at the left end, we have \(f(0) = f(L)\), and the trapezoidal rule simplifies to

\[ I = \Delta x \sum_{i=0}^{M-2} f(i \Delta x) + \mathcal{O}(\Delta x^2) \]

Or, if we had \(f(x)\) discretized as \(f_h \in \mathbb{R}^N\) with \((f_h)_i = f(i \Delta x)\) under the periodic convention (i.e., \(N = M - 1\) degrees of freedom and \(\Delta x = L/N\)), we get

\[ I = \Delta x \sum_i f_i + \mathcal{O}(\Delta x^2) \]

or expressed in terms of \(L\) and \(N\)

\[ I = \frac{L}{N} \sum_i f_i + \mathcal{O}\left(N^{-2}\right) \]

This is exactly a scaled mean

\[ I = L \; \text{mean}(f) + \mathcal{O}\left(N^{-2}\right) \]
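
As a quick numerical sanity check, reusing the grid from above (a sketch with an ad-hoc bandlimited integrand whose exact integral is \(2L\)):

f_h = 2.0 + jnp.sin(2 * jnp.pi * x / L_dom)   # periodic, bandlimited integrand; exact integral = 2 * L_dom
integral_approx = L_dom * jnp.mean(f_h)       # scaled mean == trapezoidal rule on the periodic grid
print(integral_approx)                        # 6.0 for L_dom = 3.0 (exact here, since f is bandlimited)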

Since we actually want to evaluate the integral of the squared absolute value of the function, we have that

\[ \|u\|_{L^2(\Omega)}^2 = \frac{L}{N} \sum_i |u_i|^2 + \mathcal{O}\left(N^{-2}\right) \]

or again in terms of the mean

\[ \|u\|_{L^2(\Omega)}^2 = L \; \text{mean}(|u_h|^2) + \mathcal{O}\left(N^{-2}\right) \]

Taking the mean of the element-wise square is nothing but the MSE (mean squared error) of \(u_h\), measured against zero

\[ \|u\|_{L^2(\Omega)}^2 = L \; \text{MSE}(u_h) + \mathcal{O}\left(N^{-2}\right) \]

Hence, the consistent counterpart to the squared (functional) \(L^2\) norm is the scaled MSE.
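
Continuing the sketch from above, the squared \(L^2\) norm of the sampled sine has the exact value \(L/2\):

mse = jnp.mean(u_h**2)          # MSE of u_h measured against zero
squared_norm = L_dom * mse      # consistent squared L² norm
print(squared_norm)             # L_dom / 2 = 1.5 (exact here, since sin is bandlimited)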

For the regular \(L^2\) norm, we have that

\[ \|u\|_{L^2(\Omega)} = \sqrt{\int_{\Omega} |u(x)|^2 \; \mathrm{d}x} \]

As such, we get a consistent counterpart

\[ \|u\|_{L^2(\Omega)} = \sqrt{L \; \text{MSE}(u_h) + \mathcal{O}\left(N^{-2}\right)} \]

To see why, note that \(\sqrt{a + \mathcal{O}(N^{-2})} = \sqrt{a} + \mathcal{O}(N^{-2})\) as long as \(a\) is bounded away from zero, while in the degenerate case \(a \to 0\) the error can degrade to \(\sqrt{\mathcal{O}(N^{-2})} = \mathcal{O}(N^{-1})\). Conservatively, we can therefore say that

\[ \|u\|_{L^2(\Omega)} \approx \sqrt{L \; \text{MSE}(u_h)} + \mathcal{O}\left(N^{-1}\right) \]

And we can identify the (scaled) RMSE as the consistent counterpart to the \(L^2\) norm.

\[ \|u\|_{L^2(\Omega)} \approx \sqrt{L} \text{RMSE}(u_h) + \mathcal{O}\left(N^{-1}\right) \]

It is scaled by the square root of the length of the domain.
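
Continuing the same sketch, the plain \(L^2\) norm follows from the RMSE:

rmse = jnp.sqrt(jnp.mean(u_h**2))
l2_norm_approx = jnp.sqrt(L_dom) * rmse
print(l2_norm_approx)           # sqrt(L_dom / 2) ≈ 1.2247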

Requirements¤

The quadratic convergence of the MSE-based estimate is only valid if the function is at least twice continuously differentiable. On the periodic domain, this in particular requires the function to be periodic (so that its periodic extension is smooth as well). In such a case, the estimate might even converge exponentially fast (https://en.wikipedia.org/wiki/Trapezoidal_rule#Periodic_and_peak_functions)!

As a consequence, a bandlimited discrete function representation might not even have a discretization error at all!

On the other hand, if the function is not periodic, the estimate likely does not converge quadratically. Because our grid only contains the left endpoint, the quadrature then effectively acts like a left Riemann sum (https://en.wikipedia.org/wiki/Riemann_sum#Left_rule), which only converges linearly as long as the function is at least continuous inside the domain.

Conclusion¤

Assuming we are on the periodic domain, we have:

  • A bandlimited function is exactly integrated
  • A non-bandlimited but smooth periodic function converges exponentially fast (similar to how the spectral derivative converges)
  • A discontinuous function converges linearly

Due to the special way periodic grids are laid out, we will never encounter the intermediate case of quadratic convergence.
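
These regimes can be made visible with a small empirical study (a sketch in plain jax.numpy; the integrands are ad-hoc choices, and the reference value \(\int_0^1 e^{\sin(2\pi x)}\,\mathrm{d}x = I_0(1) \approx 1.2660659\) is the modified Bessel function of the first kind):

def periodic_quadrature_error(f, exact_value, num_points, domain_extent=1.0):
    # periodic "trapezoidal" quadrature expressed as a scaled mean
    grid = jnp.arange(num_points) * domain_extent / num_points
    return jnp.abs(domain_extent * jnp.mean(f(grid)) - exact_value)

for num_points in (8, 16, 32, 64):
    smooth_error = periodic_quadrature_error(lambda s: jnp.exp(jnp.sin(2 * jnp.pi * s)), 1.2660658777520084, num_points)   # smooth & periodic
    non_periodic_error = periodic_quadrature_error(lambda s: s, 0.5, num_points)   # periodic extension has a jump at the boundary
    print(num_points, float(smooth_error), float(non_periodic_error))

The smooth periodic integrand reaches floating-point precision within a few dozen points, while the error for \(f(x) = x\) decays like \(1/(2N)\), i.e., only linearly.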

Mean Absolute Error (MAE)¤

Is the MAE consistent with the \(L^1\) norm? By the same trapezoidal-rule argument, the scaled MAE, \(L \, \text{MAE}(u_h) = L \, \text{mean}(|u_h|)\), should be the consistent counterpart to \(\|u\|_{L^1(\Omega)} = \int_{\Omega} |u(x)| \, \mathrm{d}x\).
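
A quick check of this, reusing the sampled sine from above (sketch; the reference value \(\int_0^L |\sin(2\pi x / L)| \, \mathrm{d}x = 2L/\pi\) is analytic):

l1_norm_approx = L_dom * jnp.mean(jnp.abs(u_h))   # scaled MAE measured against zero
print(l1_norm_approx)                             # ≈ 2 * L_dom / π ≈ 1.91 (not exact, since |sin| is not bandlimited)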

Higher dimensions¤

In higher dimensions, with a domain \(\Omega = (0, L)^D\), where \(D\) is the number of spatial dimensions, and the same periodic convention along each axis, we have that

\[ \|u\|_{L^2(\Omega)}^2 = \frac{L^D}{N^D} \sum_i |u_i|^2 + \mathcal{O}\left(N^{-2}\right) \]

Assuming the \(\text{mean}\) function averages over all \(N^D\) elements of the (flattened) spatial axes, we have that

\[ \|u\|_{L^2(\Omega)}^2 = L^D \; \text{mean}(|u_h|^2) + \mathcal{O}\left(N^{-2}\right) \]

or in terms of the MSE

\[ \|u\|_{L^2(\Omega)}^2 = L^D \; \text{MSE}(u_h) + \mathcal{O}\left(N^{-2}\right) \]

Correspondingly, the scaled RMSE is the consistent counterpart to the \(L^2\) norm:

\[ \|u\|_{L^2(\Omega)} \approx \sqrt{L^D} \; \text{RMSE}(u_h) + \mathcal{O}\left(N^{-1}\right) \]
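
A 2-D sanity check of the scaling (sketch; the separable product of a sine and a cosine is an arbitrary choice whose squared norm is exactly \(L^2/4\)):

D = 2
num_points_2d = 48
x_1d = jnp.arange(num_points_2d) * L_dom / num_points_2d
X, Y = jnp.meshgrid(x_1d, x_1d, indexing="ij")
u_2d = jnp.sin(2 * jnp.pi * X / L_dom) * jnp.cos(2 * jnp.pi * Y / L_dom)
squared_norm_2d = L_dom**D * jnp.mean(u_2d**2)    # mean over all N**D entries
print(squared_norm_2d)                            # exact value: L_dom**2 / 4 = 2.25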

Multiple Channels¤

If the underlying function is a vector-valued function \(u(x) \in \mathbb{R}^C\), we can compute the \(L^2\) norm of the function as

\[ \|u\|_{L^2(\Omega)}^2 = \int_{\Omega} u(x)^T u(x) \; \mathrm{d}x \]

Hence, the consistent discrete counterpart reads

\[ \|u\|_{L^2(\Omega)}^2 = L^D \; \text{mean}(u_h^T u_h) + \mathcal{O}\left(N^{-2}\right) \]

with the inner product understood only over the leading channel axis and the mean taken over the spatial axes.
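
A quick 1-D (\(D = 1\)) check with two channels, reusing `u_h` from above (sketch; the channel weights are arbitrary):

u_vec = jnp.stack([u_h, 2.0 * u_h])                              # shape (C, N) with C = 2 channels
squared_norm_vec = L_dom * jnp.mean(jnp.sum(u_vec**2, axis=0))   # inner product over channels, mean over space
print(squared_norm_vec)                                          # exact value: (1 + 4) * L_dom / 2 = 7.5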

Differences between \(p_2\), \(l_2\), and \(L_2\) norms and their relation to commonly used metrics¤

https://mathworld.wolfram.com/L2-Norm.html

Parseval's Identity: Spatial and Fourier aggregator¤

Conceptually, both do the same, but the Fourier aggregator can do more in that it also allows filtering and taking derivatives. The latter gives rise to Sobolev-based losses.

However, they are only identical if the function is bandlimited.
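
On the discrete level, Parseval's identity for the DFT makes the two aggregation routes agree exactly; a minimal sketch with plain `jnp.fft` (not any particular Exponax aggregator API), reusing `u_h` from above:

spatial_aggregate = jnp.mean(u_h**2)                              # spatial route: MSE against zero
u_hat = jnp.fft.fft(u_h)
fourier_aggregate = jnp.sum(jnp.abs(u_hat) ** 2) / u_h.size**2    # Parseval: (1/N²) Σ_k |û_k|² = (1/N) Σ_i |u_i|²
print(spatial_aggregate, fourier_aggregate)                       # agree up to floating-point error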