Gaussian Process Regression (GPR)
PyToM-3D wraps the regressors of scikit-learn, including Gaussian Process Regression (GPR). In this case, the topography is described by the following regression model:
\[z(x,y) = f(x,y) + \epsilon(0, \sigma),\]
where \(z(x,y)\) is the observed height, \(f(x,y)\) is a latent function modelling the topography, and \(\epsilon\) is Gaussian noise with zero mean and standard deviation \(\sigma\). Next, a Gaussian Process prior is placed over \(f(x,y)\):
\[f \sim GP(M(x,y), K(x,y)),\]
where \(M(x,y)\) is the mean function and \(K(x,y)\) is the kernel (covariance function). First, we define the kernel:
from sklearn.gaussian_process.kernels import RBF, WhiteKernel, ConstantKernel

kernel = (
    ConstantKernel() * RBF([1.0, 1.0], (1e-5, 1e5))
    + WhiteKernel(noise_level=1e-3, noise_level_bounds=(1e-5, 1e5))
)
which represents a typical squared exponential kernel with noise:
\[K(\mathbf{x},\mathbf{x}') = C \exp{\left(-\frac{\Vert \mathbf{x} - \mathbf{x}'\Vert^2}{2 l^2}\right)} + \sigma_{ij},\]
where:
\[\sigma_{ij} = \delta_{ij} \epsilon,\quad \mathbf{x}=\left[x,y\right],\]
and \(\delta_{ij}\) is Kronecker’s delta applied to any pair of points \(\mathbf{x}_{i}\), \(\mathbf{x}_j\).
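As a quick check of how the kernel formula maps onto the scikit-learn objects, the sketch below evaluates the composed kernel on a handful of points and compares it with the expression written out by hand. The values of `C`, `ell` and `noise` and the sample points are illustrative assumptions, not fitted hyperparameters. Because `RBF` is given two length scales, the hand-written version divides each coordinate by its own length scale, which reduces to the single \(l\) of the formula when both are equal.

```python
import numpy as np
from sklearn.gaussian_process.kernels import RBF, WhiteKernel, ConstantKernel

# Illustrative hyperparameter values (assumptions, not fitted values)
C = 2.0                      # constant factor multiplying the RBF term
ell = np.array([0.5, 1.5])   # one length scale per coordinate (x, y)
noise = 1e-3                 # white-noise level added on the diagonal

kernel = ConstantKernel(C) * RBF(ell) + WhiteKernel(noise_level=noise)

# A few sample points x_i = [x, y]
pts = np.array([[0.0, 0.0], [0.3, 0.1], [1.0, -0.5], [0.2, 0.9]])

# Covariance matrix as computed by scikit-learn
K_sklearn = kernel(pts)

# The same matrix written out by hand:
# C * exp(-||x - x'||^2 / 2) on length-scaled coordinates, plus delta_ij * noise
diff = (pts[:, None, :] - pts[None, :, :]) / ell
sq_dist = np.sum(diff**2, axis=-1)
K_manual = C * np.exp(-0.5 * sq_dist) + noise * np.eye(len(pts))

print(np.allclose(K_sklearn, K_manual))
```

The white-noise term only appears on the diagonal because it is applied to pairs of identical training points, which is what the Kronecker delta expresses.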
Finally, we invoke:
from sklearn.gaussian_process import GaussianProcessRegressor as gpr

t.fit(gpr(kernel=kernel))
and the GPR is fit. To predict data:
t.pred(X)
where \(X\) is an \(M \times 2\) NumPy array containing the \(x\) and \(y\) coordinates of an evaluation grid.
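The following sketch shows one way to build such an \(M \times 2\) evaluation grid and what the underlying regressor does with it. It calls scikit-learn directly rather than going through the PyToM-3D object `t`, and the scattered points `P` and heights `z` are synthetic data made up for illustration, so it should be read as an assumption-laden example rather than as the library's internal workflow.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel, ConstantKernel

# Synthetic topography (hypothetical data): scattered (x, y) points with noisy heights
rng = np.random.default_rng(0)
P = rng.uniform(-1.0, 1.0, size=(60, 2))
z = np.sin(np.pi * P[:, 0]) * np.cos(np.pi * P[:, 1]) + rng.normal(0.0, 0.01, 60)

# Same kernel structure as in the text
kernel = (
    ConstantKernel() * RBF([1.0, 1.0], (1e-5, 1e5))
    + WhiteKernel(noise_level=1e-3, noise_level_bounds=(1e-5, 1e5))
)

# Fitting optimises the kernel hyperparameters on the data
model = GaussianProcessRegressor(kernel=kernel).fit(P, z)

# Evaluation grid flattened into an M x 2 array of (x, y) coordinates
gx, gy = np.meshgrid(np.linspace(-1.0, 1.0, 20), np.linspace(-1.0, 1.0, 20))
X = np.column_stack([gx.ravel(), gy.ravel()])

# Posterior mean and standard deviation at every grid point
z_mean, z_std = model.predict(X, return_std=True)

print(X.shape, z_mean.shape)
print(model.kernel_)  # kernel with the optimised hyperparameters
```

The predicted values can be reshaped with `z_mean.reshape(gx.shape)` if a two-dimensional height map is needed for plotting.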
Spline fitting
PyToM-3D allows the user to fit data with bi-variate splines by wrapping scipy.interpolate.bisplrep. Modelling the topography as a two-variable function \(f(x,y)\), we approximate it as:
\[f(x,y) \approx \sum_{i=1}^{m} \sum_{j=1}^{n} C_{ij} Q_{i,r}(x) Q_{j,s}(y),\]
where \(Q_{i,r}\) and \(Q_{j,s}\) are B-spline basis functions, i.e. piecewise polynomials of degree \(r\) and \(s\) respectively. Additionally, \(m\) and \(n\) are the number of control points (aka knots) distributed along the \(x\)- and \(y\)-axis. Lastly, \(C_{ij}\) are the trainable coefficients, which are determined from the points of the topography.
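To make the double sum concrete, the sketch below evaluates it explicitly with SciPy's `BSpline` class and checks the result against `scipy.interpolate.bisplev`. The knot vectors, the random coefficients `C` and the evaluation points are arbitrary illustrative choices; here `m` and `n` denote the number of basis functions implied by the knot vectors, which is an assumption about how the indices of the formula map onto SciPy's conventions.

```python
import numpy as np
from scipy.interpolate import BSpline, bisplev

kx = ky = 3  # degrees r and s of the basis functions

# Clamped knot vectors on [0, 1] with one interior knot each (illustrative values)
tx = np.array([0.0, 0.0, 0.0, 0.0, 0.5, 1.0, 1.0, 1.0, 1.0])
ty = np.array([0.0, 0.0, 0.0, 0.0, 0.4, 1.0, 1.0, 1.0, 1.0])
m = len(tx) - kx - 1  # number of basis functions along x
n = len(ty) - ky - 1  # number of basis functions along y

# Arbitrary coefficients C_ij (in a real fit these come from the data)
rng = np.random.default_rng(1)
C = rng.normal(size=(m, n))

# Evaluation points strictly inside the knot range
x = np.linspace(0.05, 0.95, 7)
y = np.linspace(0.05, 0.95, 9)

# Column i of Qx holds the i-th basis function Q_{i,r} evaluated at x
Qx = BSpline(tx, np.eye(m), kx)(x)  # shape (len(x), m)
Qy = BSpline(ty, np.eye(n), ky)(y)  # shape (len(y), n)

# Double sum over i and j written as a matrix product
f_sum = Qx @ C @ Qy.T

# Reference evaluation from the same knots, coefficients and degrees
f_ref = bisplev(x, y, [tx, ty, C.ravel(), kx, ky])

print(np.allclose(f_sum, f_ref))
```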
To fit a bi-variate spline to the data, we call:
from scipy.interpolate import bisplrep

t.fit(bisplrep, kx, ky, tx, ty)
where kx and ky are the above-mentioned degrees \(r\) and \(s\) of the polynomials, and tx and ty are the control points, which are expected to be np.ndarray-like objects. Once the fit is done, we predict the topography elsewhere by:
t.pred(X)
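For reference, a minimal sketch of the underlying SciPy calls is given below. It uses `bisplrep` and `bisplev` directly on synthetic scattered data, not the PyToM-3D object `t`, and it lets `bisplrep` choose the knots automatically through the smoothing factor `s` instead of passing `tx` and `ty` explicitly, so it differs from the call shown above. The saddle-shaped surface and the value of `s` are assumptions made for illustration.

```python
import numpy as np
from scipy.interpolate import bisplrep, bisplev

# Synthetic scattered topography (hypothetical data): a smooth saddle
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, 300)
y = rng.uniform(-1.0, 1.0, 300)
z = x**2 - y**2

# Fit a bicubic spline; knots are selected by the routine given the smoothing factor s
tck = bisplrep(x, y, z, kx=3, ky=3, s=1e-3)
tx, ty = tck[0], tck[1]  # knot vectors actually used by the fit

# bisplev evaluates on the tensor grid spanned by two sorted 1-D arrays
xg = np.linspace(-0.9, 0.9, 20)
yg = np.linspace(-0.9, 0.9, 15)
z_grid = bisplev(xg, yg, tck)  # z_grid[i, j] is the spline value at (xg[i], yg[j])

print(z_grid.shape)
print(len(tx), len(ty))
```

Note that `bisplev` returns values on a rectangular grid, whereas an \(M \times 2\) array of arbitrary points has to be evaluated point by point or rearranged into grid axes first.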