Source code for emg3d.core

"""
The **core** contains the number-crunching functionalities of
:func:`emg3d.solver.solve`, the computationally most demanding parts. These
functions are implemented as just-in-time (jit) compiled functions using the
:func:`numba.jit`-decorator of `numba <https://numba.pydata.org>`_.

    «*Numba translates Python functions to optimized machine code at runtime
    using the industry-standard LLVM compiler library. Numba-compiled numerical
    algorithms in Python can approach the speeds of C or FORTRAN.*» (from the
    numba website.)

These functions are not meant to be called directly, particularly not from an
end-user; they are called from functions in :func:`emg3d.solver.solve`.

For an end-user it can still be insightful to look at the documentation and
code of these functions if you are interested in understanding how the
multigrid solver works, the theory and its implementation.

For a developer interested in making emg3d faster this is the right place to
start, as by far the most time is spent in these functions, particularly in
:func:`solve`.
"""
# Copyright 2018-2022 The emsig community.
#
# This file is part of emg3d.
#
# Licensed under the Apache License, Version 2.0 (the "License"); you may not
# use this file except in compliance with the License.  You may obtain a copy
# of the License at
#
#     https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.  See the
# License for the specific language governing permissions and limitations under
# the License.

import numba as nb
import numpy as np

# Numba-settings
_numba_setting = {'nogil': True, 'fastmath': True, 'cache': True}


# LinearOperator to compute A x
[docs]@nb.njit(**_numba_setting) def amat_x(rx, ry, rz, ex, ey, ez, eta_x, eta_y, eta_z, zeta, hx, hy, hz): r"""Residual with or without source term. Compute the residual as given in [Muld06]_ in middle of the right column on page 636, but without the source term: .. math:: \mathbf{r} = V \left( \mathrm{i}\omega\mu_0 \tilde{\sigma} \mathbf{E} - \nabla \times \mu_\mathrm{r}^{-1} \nabla \times \mathbf{E} \right) . The computation is carried out in a matrix-free manner; on said page 636 (or in the :doc:`../manual/theory` of the manual) are the various steps laid out to discretize the different parts such as the involved curls. This can also be understood as the left-hand-side of :math:`A x = b`, as given in Equation 2 in [Muld06]_ (here without the cell volumes :math:`V`), .. math:: \mathrm{i}\omega\mu_0 \tilde{\sigma} \mathbf{E} - \nabla \times \mu_\mathrm{r}^{-1} \nabla \times \mathbf{E} = - \mathrm{i} \omega \mu_0 \mathbf{J_\mathrm{s}} . It can therefore be used as a ``matvec`` to create a ``LinearOperator``, which can be passed to a solver. It is assumed that the PEC boundary condition is applied to the electric field :math:`\mathbf{E}` (``ex``, ``ey``, and ``ez``). The residuals are subtracted in-place from ``rx``, ``ry``, and ``rz``. That means that if ``rx``, ``ry``, and ``rz`` contain the source field, they will contain the total residual afterwards; if they are empty fields, they will contain the negative partial residuals afterwards. Parameters ---------- rx, ry, rz : ndarray Source field or pre-allocated zero residual field in x-, y-, and z-directions (:class:`emg3d.fields.Field`). ex, ey, ez : ndarray Electric fields in x-, y-, and z-directions (:class:`emg3d.fields.Field`). eta_x, eta_y, eta_z, zeta : ndarray Volume-averaged model parameters (:class:`emg3d.models.VolumeModel`). hx, hy, hz : ndarray Cell widths in x-, y-, and z-directions (:class:`emg3d.meshes.TensorMesh`). """ # Get dimensions nx = len(hx) ny = len(hy) nz = len(hz) # NOTE about `i?m = max(0, i?-1)`: # In the cases when -1 is set to 0, these indices are only used in # parameters which are not actually used in these cases, see the note # towards the end. Resetting -1 to 0 is simply to avoid index errors. # Loop over dimensions; x-fastest, then y, z for iz in range(nz): izm = max(0, iz-1) izp = iz+1 for iy in range(ny): iym = max(0, iy-1) iyp = iy+1 for ix in range(nx): ixm = max(0, ix-1) ixp = ix+1 # 1. Curl [Muld06]_ Equation 7: # v = nabla x E. v1pp = ((ez[ix, iyp, iz] - ez[ix, iy, iz])/hy[iy] - (ey[ix, iy, izp] - ey[ix, iy, iz])/hz[iz]) v1mp = ((ez[ix, iy, iz] - ez[ix, iym, iz])/hy[iym] - (ey[ix, iym, izp] - ey[ix, iym, iz])/hz[iz]) v1pm = ((ez[ix, iyp, izm] - ez[ix, iy, izm])/hy[iy] - (ey[ix, iy, iz] - ey[ix, iy, izm])/hz[izm]) v2pp = ((ex[ix, iy, izp] - ex[ix, iy, iz])/hz[iz] - (ez[ixp, iy, iz] - ez[ix, iy, iz])/hx[ix]) v2mp = ((ex[ixm, iy, izp] - ex[ixm, iy, iz])/hz[iz] - (ez[ix, iy, iz] - ez[ixm, iy, iz])/hx[ixm]) v2pm = ((ex[ix, iy, iz] - ex[ix, iy, izm])/hz[izm] - (ez[ixp, iy, izm] - ez[ix, iy, izm])/hx[ix]) v3pp = ((ey[ixp, iy, iz] - ey[ix, iy, iz])/hx[ix] - (ex[ix, iyp, iz] - ex[ix, iy, iz])/hy[iy]) v3mp = ((ey[ix, iy, iz] - ey[ixm, iy, iz])/hx[ixm] - (ex[ixm, iyp, iz] - ex[ixm, iy, iz])/hy[iy]) v3pm = ((ey[ixp, iym, iz] - ey[ix, iym, iz])/hx[ix] - (ex[ix, iy, iz] - ex[ix, iym, iz])/hy[iym]) # 2. Multiply by average of mu_r [Muld06]_ p 636 bottom-left. # u = M v = V mu_r^-1 v = V mu_r^-1 nabla x E # (Factor 0.5 to average moved to point 5.) v1pp *= zeta[ixm, iy, iz] + zeta[ix, iy, iz] v1mp *= zeta[ixm, iym, iz] + zeta[ix, iym, iz] v1pm *= zeta[ixm, iy, izm] + zeta[ix, iy, izm] v2pp *= zeta[ix, iym, iz] + zeta[ix, iy, iz] v2mp *= zeta[ixm, iym, iz] + zeta[ixm, iy, iz] v2pm *= zeta[ix, iym, izm] + zeta[ix, iy, izm] v3pp *= zeta[ix, iy, izm] + zeta[ix, iy, iz] v3mp *= zeta[ixm, iy, izm] + zeta[ixm, iy, iz] v3pm *= zeta[ix, iym, izm] + zeta[ix, iym, iz] # 3. Another curl [Muld06]_ p. 636 bottom-right; completes: # nabla x M v = nabla x V mu_r^-1 nabla x E rrx = v3pp/hy[iy] - v3pm/hy[iym] - v2pp/hz[iz] + v2pm/hz[izm] rry = v1pp/hz[iz] - v1pm/hz[izm] - v3pp/hx[ix] + v3mp/hx[ixm] rrz = v2pp/hx[ix] - v2mp/hx[ixm] - v1pp/hy[iy] + v1mp/hy[iym] # 4. Sigma-term, [Muld06]_ p. 636 top-left (average of # eta). # S = i omega mu_0 sigma~ V # (Factor 0.25 to average moved to point 5.) stx = (eta_x[ix, iym, izm] + eta_x[ix, iym, iz] + eta_x[ix, iy, izm] + eta_x[ix, iy, iz]) sty = (eta_y[ixm, iy, izm] + eta_y[ix, iy, izm] + eta_y[ixm, iy, iz] + eta_y[ix, iy, iz]) stz = (eta_z[ixm, iym, iz] + eta_z[ix, iym, iz] + eta_z[ixm, iy, iz] + eta_z[ix, iy, iz]) # NOTE re zero boundary conditions for tangential E field: # At the moment these elements are computed but now # discarded. This function could be adjusted to omit the # computation of these. But one would have to test if it makes # it actually faster. if iy == 0 or iz == 0: # assuming ex = 0 rrx = 0 if ix == 0 or iz == 0: # assuming ey = 0 rry = 0 if ix == 0 or iy == 0: # assuming ez = 0 rrz = 0 # 5. [Muld06]_ p. 636 center-right; completes # -V (i omega mu_0 sigma~ E - nabla x mu_r^-1 nabla x E) # Subtracting this from the source terms will yield the # residual. rx[ix, iy, iz] -= 0.5*rrx - 0.25*stx*ex[ix, iy, iz] ry[ix, iy, iz] -= 0.5*rry - 0.25*sty*ey[ix, iy, iz] rz[ix, iy, iz] -= 0.5*rrz - 0.25*stz*ez[ix, iy, iz]
# Smoother (Gauss-Seidel method)
[docs]@nb.njit(**_numba_setting) def gauss_seidel(ex, ey, ez, sx, sy, sz, eta_x, eta_y, eta_z, zeta, hx, hy, hz, nu): r"""Gauss-Seidel method. Solves the linear equation system :math:`A x = b` iteratively using the following method: .. math:: \mathbf{x}^{(k+1)} = L_*^{-1} \left(\mathbf{b} - U \mathbf{x}^{(k)} \right) \ , where :math:`L_*` is the lower triangular component, and :math:`U` the strictly upper triangular component, :math:`A = L_* + U`, with .. math:: L_* = \left[ \begin{array} {cccc} a_{11} & 0 & \cdots & 0 \\ a_{21} & a_{22} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{array} \right] \ , \quad U = \left[ \begin{array} {cccc} 0 & a_{12} & \cdots & a_{1n} \\ 0 & 0 & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 0 \end{array} \right] \ . On the coarsest grid it acts as a direct solver, whereas on the fine grid it acts as a smoother with only few iterations, defined by :math:`\nu` (``nu``). Odd numbers of ``nu`` use forward ordering, even numbers use backwards ordering; ``nu=2`` is therefore one symmetric Gauss-Seidel iteration, one forward ordered iteration followed by one backward ordered iteration. From [Muld06]_: «The method proposed by [ArFW00]_ is chosen as a smoother. It selects one node of the grid and simultaneously solves for the six degrees of freedom on the six edges attached to the node. If node :math:`(x_k, y_l, z_m)` is selected, the six equations, :math:`r_{x;k\pm1/2,l,m} = 0`, :math:`r_{y;k,l\pm1/2,m} = 0`, and :math:`r_{z;k,l,m\pm1/2} = 0`, are solved for :math:`e_{x;k\pm1/2,l,m}`, :math:`e_{y;k,l\pm1/2,m}`, and :math:`e_{z;k,l,m\pm1/2}`. Here, this smoother is applied in a symmetric Gauss-Seidel fashion, following the lexicographical ordering of the nodes :math:`(x_k, y_l, z_m)`, with fastest index :math:`k=1, \dots, N_x-1`, intermediate index :math:`l=1, \dots, N_y-1`, and slowest index :math:`m=1, \ldots, N_z-1`.» To actually solve the system of six equations a non-standard Cholesky factorisation is used implemented in :func:`solve`. Tangential components at the boundaries are assumed to be zero (PEC boundaries). The result is stored in the provided electric field components ``ex``, ``ey``, and ``ez``. Parameters ---------- ex, ey, ez : ndarray Electric fields in x-, y-, and z-directions (:class:`emg3d.fields.Field`). sx, sy, sz : ndarray Source fields in x-, y-, and z-directions (:class:`emg3d.fields.Field`). eta_x, eta_y, eta_z, zeta : ndarray Volume-averaged model parameters (:class:`emg3d.models.VolumeModel`). hx, hy, hz : ndarray Cell widths in x-, y-, and z-directions (:class:`emg3d.meshes.TensorMesh`). nu : int Number of Gauss-Seidel iterations. """ # Get dimensions nx = len(hx) ny = len(hy) nz = len(hz) # Get half of the inverse widths kx = 0.5/hx ky = 0.5/hy kz = 0.5/hz # Direction-switch for Gauss-Seidel iback = 0 # Pre-allocating `A` for the six edges attached to one node; it will be # overwritten at each iteration amat = np.zeros(36, dtype=ex.dtype) # Smoothing steps for _ in range(nu): # Direction of Gauss-Seidel ordering; 0=forward, 1=backward iback = 1-iback # Loop over cells, keeping boundaries fixed; x-fastest, then y, z. for izh in range(1, nz): # Back-forth-switch if iback: iz = nz-izh else: iz = izh # Minus/plus indices izm = iz-1 izp = iz+1 for iyh in range(1, ny): # Back-forth-switch if iback: iy = ny-iyh else: iy = iyh # Minus/plus indices iym = iy-1 iyp = iy+1 for ixh in range(1, nx): # Back-forth-switch if iback: ix = nx-ixh else: ix = ixh # Minus/plus indices ixm = ix-1 ixp = ix+1 # Averaging of 1/mu_r: mzyRxm etc. mzyLxm = ky[iym]*(zeta[ixm, iym, iz] + zeta[ixm, iym, izm]) mzyRxm = ky[iy]*(zeta[ixm, iy, iz] + zeta[ixm, iy, izm]) myzLxm = kz[izm]*(zeta[ixm, iy, izm] + zeta[ixm, iym, izm]) myzRxm = kz[iz]*(zeta[ixm, iy, iz] + zeta[ixm, iym, iz]) mzyLxp = ky[iym]*(zeta[ix, iym, iz] + zeta[ix, iym, izm]) mzyRxp = ky[iy]*(zeta[ix, iy, iz] + zeta[ix, iy, izm]) myzLxp = kz[izm]*(zeta[ix, iy, izm] + zeta[ix, iym, izm]) myzRxp = kz[iz]*(zeta[ix, iy, iz] + zeta[ix, iym, iz]) mzxLym = kx[ixm]*(zeta[ixm, iym, iz] + zeta[ixm, iym, izm]) mzxRym = kx[ix]*(zeta[ix, iym, iz] + zeta[ix, iym, izm]) mxzLym = kz[izm]*(zeta[ix, iym, izm] + zeta[ixm, iym, izm]) mxzRym = kz[iz]*(zeta[ix, iym, iz] + zeta[ixm, iym, iz]) mzxLyp = kx[ixm]*(zeta[ixm, iy, iz] + zeta[ixm, iy, izm]) mzxRyp = kx[ix]*(zeta[ix, iy, iz] + zeta[ix, iy, izm]) mxzLyp = kz[izm]*(zeta[ix, iy, izm] + zeta[ixm, iy, izm]) mxzRyp = kz[iz]*(zeta[ix, iy, iz] + zeta[ixm, iy, iz]) myxLzm = kx[ixm]*(zeta[ixm, iy, izm] + zeta[ixm, iym, izm]) myxRzm = kx[ix]*(zeta[ix, iy, izm] + zeta[ix, iym, izm]) mxyLzm = ky[iym]*(zeta[ix, iym, izm] + zeta[ixm, iym, izm]) mxyRzm = ky[iy]*(zeta[ix, iy, izm] + zeta[ixm, iy, izm]) myxLzp = kx[ixm]*(zeta[ixm, iy, iz] + zeta[ixm, iym, iz]) myxRzp = kx[ix]*(zeta[ix, iy, iz] + zeta[ix, iym, iz]) mxyLzp = ky[iym]*(zeta[ix, iym, iz] + zeta[ixm, iym, iz]) mxyRzp = ky[iy]*(zeta[ix, iy, iz] + zeta[ixm, iy, iz]) # Diagonal elements st0 = (eta_x[ixm, iy, iz] + eta_x[ixm, iy, izm] + eta_x[ixm, iym, iz] + eta_x[ixm, iym, izm]) st1 = (eta_x[ix, iy, iz] + eta_x[ix, iy, izm] + eta_x[ix, iym, iz] + eta_x[ix, iym, izm]) st2 = (eta_y[ix, iym, iz] + eta_y[ix, iym, izm] + eta_y[ixm, iym, iz] + eta_y[ixm, iym, izm]) st3 = (eta_y[ix, iy, iz] + eta_y[ix, iy, izm] + eta_y[ixm, iy, iz] + eta_y[ixm, iy, izm]) st4 = (eta_z[ix, iy, izm] + eta_z[ix, iym, izm] + eta_z[ixm, iy, izm] + eta_z[ixm, iym, izm]) st5 = (eta_z[ix, iy, iz] + eta_z[ix, iym, iz] + eta_z[ixm, iy, iz] + eta_z[ixm, iym, iz]) st = np.array([st0, st1, st2, st3, st4, st5])/4. # Fill amat amat[:] = 0. # Reset # Initial diagonal elements for k in range(6): amat[6*k] = -st[k] # Complete diagonals # A is symmetric and curl-curl part is real-valued amat[0] += mzyRxm/hy[iy] + mzyLxm/hy[iym] # 0,0| 0 amat[0] += myzRxm/hz[iz] + myzLxm/hz[izm] amat[6] += mzyRxp/hy[iy] + mzyLxp/hy[iym] # 1,1| 6 amat[6] += myzRxp/hz[iz] + myzLxp/hz[izm] amat[12] += mzxRym/hx[ix] + mzxLym/hx[ixm] # 2,2|12 amat[12] += mxzRym/hz[iz] + mxzLym/hz[izm] amat[18] += mzxRyp/hx[ix] + mzxLyp/hx[ixm] # 3,3|18 amat[18] += mxzRyp/hz[iz] + mxzLyp/hz[izm] amat[24] += myxRzm/hx[ix] + myxLzm/hx[ixm] # 4,4|24 amat[24] += mxyRzm/hy[iy] + mxyLzm/hy[iym] amat[30] += myxRzp/hx[ix] + myxLzp/hx[ixm] # 5,5|30 amat[30] += mxyRzp/hy[iy] + mxyLzp/hy[iym] # Off-diagonal elements # Upper triangle not needed and not set. # The elements # [1, 0] (1); [3, 2] (13); and [5, 4] (21) # are all zero. amat[2] = -mzyLxm/hx[ixm] # 2,0| 2 amat[3] = mzyRxm/hx[ixm] # 3,0| 3 amat[4] = -myzLxm/hx[ixm] # 4,0| 4 amat[5] = myzRxm/hx[ixm] # 5,0| 5 amat[7] = mzyLxp/hx[ix] # 2,1| 7 amat[8] = -mzyRxp/hx[ix] # 3,1| 8 amat[9] = myzLxp/hx[ix] # 4,1| 9 amat[10] = -myzRxp/hx[ix] # 5,1|10 amat[14] = -mxzLym/hy[iym] # 4,2|14 amat[15] = mxzRym/hy[iym] # 5,2|15 amat[19] = mxzLyp/hy[iy] # 4,3|19 amat[20] = -mxzRyp/hy[iy] # 5,3|20 # Fill residual (b - Ux^{(k)}) # Note: rhs is NOT the full residual at this point # Get the 6 edges for ix, iy, and iz rhs = np.array([sx[ixm, iy, iz], sx[ix, iy, iz], sy[ix, iym, iz], sy[ix, iy, iz], sz[ix, iy, izm], sz[ix, iy, iz]]) rhs[0] += mzyRxm*(ey[ixm, iy, iz]/hx[ixm] + ex[ixm, iyp, iz]/hy[iy]) rhs[0] += mzyLxm*(-ey[ixm, iym, iz]/hx[ixm] + ex[ixm, iym, iz]/hy[iym]) rhs[0] += myzRxm*(ez[ixm, iy, iz]/hx[ixm] + ex[ixm, iy, izp]/hz[iz]) rhs[0] += myzLxm*(-ez[ixm, iy, izm]/hx[ixm] + ex[ixm, iy, izm]/hz[izm]) rhs[1] += mzyRxp*(-ey[ixp, iy, iz]/hx[ix] + ex[ix, iyp, iz]/hy[iy]) rhs[1] += mzyLxp*(ey[ixp, iym, iz]/hx[ix] + ex[ix, iym, iz]/hy[iym]) rhs[1] += myzRxp*(-ez[ixp, iy, iz]/hx[ix] + ex[ix, iy, izp]/hz[iz]) rhs[1] += myzLxp*(ez[ixp, iy, izm]/hx[ix] + ex[ix, iy, izm]/hz[izm]) rhs[2] += mzxRym*(ey[ixp, iym, iz]/hx[ix] + ex[ix, iym, iz]/hy[iym]) rhs[2] += mzxLym*(ey[ixm, iym, iz]/hx[ixm] - ex[ixm, iym, iz]/hy[iym]) rhs[2] += mxzRym*(ez[ix, iym, iz]/hy[iym] + ey[ix, iym, izp]/hz[iz]) rhs[2] += mxzLym*(-ez[ix, iym, izm]/hy[iym] + ey[ix, iym, izm]/hz[izm]) rhs[3] += mzxRyp*(ey[ixp, iy, iz]/hx[ix] - ex[ix, iyp, iz]/hy[iy]) rhs[3] += mzxLyp*(ey[ixm, iy, iz]/hx[ixm] + ex[ixm, iyp, iz]/hy[iy]) rhs[3] += mxzRyp*(-ez[ix, iyp, iz]/hy[iy] + ey[ix, iy, izp]/hz[iz]) rhs[3] += mxzLyp*(ez[ix, iyp, izm]/hy[iy] + ey[ix, iy, izm]/hz[izm]) rhs[4] += myxRzm*(ez[ixp, iy, izm]/hx[ix] + ex[ix, iy, izm]/hz[izm]) rhs[4] += myxLzm*(ez[ixm, iy, izm]/hx[ixm] - ex[ixm, iy, izm]/hz[izm]) rhs[4] += mxyRzm*(ez[ix, iyp, izm]/hy[iy] + ey[ix, iy, izm]/hz[izm]) rhs[4] += mxyLzm*(ez[ix, iym, izm]/hy[iym] - ey[ix, iym, izm]/hz[izm]) rhs[5] += myxRzp*(ez[ixp, iy, iz]/hx[ix] - ex[ix, iy, izp]/hz[iz]) rhs[5] += myxLzp*(ez[ixm, iy, iz]/hx[ixm] + ex[ixm, iy, izp]/hz[iz]) rhs[5] += mxyRzp*(ez[ix, iyp, iz]/hy[iy] - ey[ix, iy, izp]/hz[iz]) rhs[5] += mxyLzp*(ez[ix, iym, iz]/hy[iym] + ey[ix, iym, izp]/hz[iz]) # Solve linear system A x = b solve(amat, rhs) # Update e-field (here we could apply damping weights) ex[ixm, iy, iz] = rhs[0] ex[ix, iy, iz] = rhs[1] ey[ix, iym, iz] = rhs[2] ey[ix, iy, iz] = rhs[3] ez[ix, iy, izm] = rhs[4] ez[ix, iy, iz] = rhs[5]
[docs]@nb.njit(**_numba_setting) def gauss_seidel_x(ex, ey, ez, sx, sy, sz, eta_x, eta_y, eta_z, zeta, hx, hy, hz, nu): r"""Gauss-Seidel method with line relaxation in x-direction. This is the equivalent to :func:`gauss_seidel`, but with line relaxation in the x-direction. See :func:`gauss_seidel` for more details on the smoother itself. The resulting system A x = b to solve consists of n unknowns (x-vector), and the corresponding matrix A is a banded matrix with the main diagonal and five upper and lower diagonals:: .-0 |X|\ 0 0-.-0 left: middle: right: \|X|\ (not used) 0-.-0 0- .- 0 \|X|\ \ |X |\ 0-.-0 0 \|X| 0-. . 1*1, - 4*1, | 1*4, X 4*4, \ 4*4 upper or lower The matrix A is complex and symmetric (A = A^T), and therefore only the main diagonal and the lower five off-diagonals are required. - The right-hand-side b has length 5*nx-4 (nx even). - The matrix A has length of b and 1+2*5 diagonals; we use for it an array of length 6*len(b). The values are computed in rows of 5 lines, with the indicated middle and left matrices as indicated in the above scheme. These blocks are filled into the main matrix A and vector b, and subsequently solved with a non-standard Cholesky factorisation implemented in :func:`solve`. Tangential components at the boundaries are assumed to be zero (PEC boundaries). The result is stored in the provided electric field components ``ex``, ``ey``, and ``ez``. Parameters ---------- ex, ey, ez : ndarray Electric fields in x-, y-, and z-directions (:class:`emg3d.fields.Field`). sx, sy, sz : ndarray Source fields in x-, y-, and z-directions (:class:`emg3d.fields.Field`). eta_x, eta_y, eta_z, zeta : ndarray Volume-averaged model parameters (:class:`emg3d.models.VolumeModel`). hx, hy, hz : ndarray Cell widths in x-, y-, and z-directions (:class:`emg3d.meshes.TensorMesh`). nu : int Number of Gauss-Seidel iterations. """ # Get dimensions nx = len(hx) ny = len(hy) nz = len(hz) # Get half of the inverse widths kx = 0.5/hx ky = 0.5/hy kz = 0.5/hz # Direction-switch for Gauss-Seidel iback = 0 # Pre-allocating middle and left for the 5x5-temporary middle and left # matrices; will be overwritten at each iteration middle = np.zeros(25, dtype=ex.dtype) left = np.zeros(25) # Pre-allocating full RHS (bvec) and full matrix A (amat). Will be # overwritten after each complete x-loop. nr = 5*nx-4 # Number of unknowns bvec = np.zeros(nr, dtype=ex.dtype) amat = np.zeros(6*nr, dtype=ex.dtype) # Smoothing steps for _ in range(nu): # Direction of Gauss-Seidel ordering; 0=forward, 1=backward iback = 1-iback # Loop over cells, keeping boundaries fixed; x-fastest, then y, z. for izh in range(1, nz): # Back-forth-switch if iback: iz = nz-izh else: iz = izh # Minus/plus indices izm = iz-1 izp = iz+1 for iyh in range(1, ny): # Back-forth-switch if iback: iy = ny-iyh else: iy = iyh # Minus/plus indices iym = iy-1 iyp = iy+1 # Reset vectors middle[:] = 0. left[:] = 0. bvec[:] = 0. amat[:] = 0. for ixh in range(1, nx+1): # Index and minus index ix = min(ixh, nx-1) ixm = ixh-1 # Averaging of 1/mu_r: mzyRxm etc. mzyLxm = ky[iym]*(zeta[ixm, iym, iz] + zeta[ixm, iym, izm]) mzyRxm = ky[iy]*(zeta[ixm, iy, iz] + zeta[ixm, iy, izm]) myzLxm = kz[izm]*(zeta[ixm, iy, izm] + zeta[ixm, iym, izm]) myzRxm = kz[iz]*(zeta[ixm, iy, iz] + zeta[ixm, iym, iz]) # mzyLxp = ky[iym]*(zeta[ix, iym, iz] + zeta[ix, iym, izm]) # mzyRxp = ky[iy]*(zeta[ix, iy, iz] + zeta[ix, iy, izm]) # myzLxp = kz[izm]*(zeta[ix, iy, izm] + zeta[ix, iym, izm]) # myzRxp = kz[iz]*(zeta[ix, iy, iz] + zeta[ix, iym, iz]) mzxLym = kx[ixm]*(zeta[ixm, iym, iz] + zeta[ixm, iym, izm]) mzxRym = kx[ix]*(zeta[ix, iym, iz] + zeta[ix, iym, izm]) mxzLym = kz[izm]*(zeta[ix, iym, izm] + zeta[ixm, iym, izm]) mxzRym = kz[iz]*(zeta[ix, iym, iz] + zeta[ixm, iym, iz]) mzxLyp = kx[ixm]*(zeta[ixm, iy, iz] + zeta[ixm, iy, izm]) mzxRyp = kx[ix]*(zeta[ix, iy, iz] + zeta[ix, iy, izm]) mxzLyp = kz[izm]*(zeta[ix, iy, izm] + zeta[ixm, iy, izm]) mxzRyp = kz[iz]*(zeta[ix, iy, iz] + zeta[ixm, iy, iz]) myxLzm = kx[ixm]*(zeta[ixm, iy, izm] + zeta[ixm, iym, izm]) myxRzm = kx[ix]*(zeta[ix, iy, izm] + zeta[ix, iym, izm]) mxyLzm = ky[iym]*(zeta[ix, iym, izm] + zeta[ixm, iym, izm]) mxyRzm = ky[iy]*(zeta[ix, iy, izm] + zeta[ixm, iy, izm]) myxLzp = kx[ixm]*(zeta[ixm, iy, iz] + zeta[ixm, iym, iz]) myxRzp = kx[ix]*(zeta[ix, iy, iz] + zeta[ix, iym, iz]) mxyLzp = ky[iym]*(zeta[ix, iym, iz] + zeta[ixm, iym, iz]) mxyRzp = ky[iy]*(zeta[ix, iy, iz] + zeta[ixm, iy, iz]) # Diagonal elements st0 = (eta_x[ixm, iy, iz] + eta_x[ixm, iy, izm] + eta_x[ixm, iym, iz] + eta_x[ixm, iym, izm]) # st1 = (eta_x[ix, iy, iz] + eta_x[ix, iy, izm] + # eta_x[ix, iym, iz] + eta_x[ix, iym, izm]) st2 = (eta_y[ix, iym, iz] + eta_y[ix, iym, izm] + eta_y[ixm, iym, iz] + eta_y[ixm, iym, izm]) st3 = (eta_y[ix, iy, iz] + eta_y[ix, iy, izm] + eta_y[ixm, iy, iz] + eta_y[ixm, iy, izm]) st4 = (eta_z[ix, iy, izm] + eta_z[ix, iym, izm] + eta_z[ixm, iy, izm] + eta_z[ixm, iym, izm]) st5 = (eta_z[ix, iy, iz] + eta_z[ix, iym, iz] + eta_z[ixm, iy, iz] + eta_z[ixm, iym, iz]) st = np.array([st0, st2, st3, st4, st5])/4. # Fill middle matrix # Initial diagonal elements for k in range(5): middle[6*k] = -st[k] # Complete diagonals. # middle is symmetric and curl curl part is real-valued. middle[0] += mzyRxm/hy[iy] + mzyLxm/hy[iym] # 0,0| 0 middle[0] += myzRxm/hz[iz] + myzLxm/hz[izm] middle[6] += mzxRym/hx[ix] + mzxLym/hx[ixm] # 1,1| 6 middle[6] += mxzRym/hz[iz] + mxzLym/hz[izm] middle[12] += mzxRyp/hx[ix] + mzxLyp/hx[ixm] # 2,2|12 middle[12] += mxzRyp/hz[iz] + mxzLyp/hz[izm] middle[18] += myxRzm/hx[ix] + myxLzm/hx[ixm] # 3,3|18 middle[18] += mxyRzm/hy[iy] + mxyLzm/hy[iym] middle[24] += myxRzp/hx[ix] + myxLzp/hx[ixm] # 4,4|24 middle[24] += mxyRzp/hy[iy] + mxyLzp/hy[iym] # Off-diagonal elements of middle. # Upper triangle not needed and not set. # The elements # [2, 1] (7); [1, 2] (11); [4, 3] (19); and [3, 4] (23) # are all zero. middle[1] = -mzyLxm/hx[ixm] # 1,0| 1 and 0,1| 5 middle[2] = mzyRxm/hx[ixm] # 2,0| 2 and 0,2|10 middle[3] = -myzLxm/hx[ixm] # 3,0| 3 and 0,3|15 middle[4] = myzRxm/hx[ixm] # 4,0| 4 and 0,4|20 middle[8] = -mxzLym/hy[iym] # 3,1| 8 and 1,3|16 middle[9] = mxzRym/hy[iym] # 4,1| 9 and 1,4|21 middle[13] = mxzLyp/hy[iy] # 3,2|13 and 2,3|17 middle[14] = -mxzRyp/hy[iy] # 4,2|14 and 2,4|22 # Fill left matrix left left[5] = mzyLxm/hx[ixm] # 0,1| 5 left[10] = -mzyRxm/hx[ixm] # 0,2|10 left[15] = myzLxm/hx[ixm] # 0,3|15 left[20] = -myzRxm/hx[ixm] # 0,4|20 left[6] = -mzxLym/hx[ixm] # 1,1| 6 left[12] = -mzxLyp/hx[ixm] # 2,2|12 left[18] = -myxLzm/hx[ixm] # 3,3|18 left[24] = -myxLzp/hx[ixm] # 4,4|24 # Fill residual (b - Ux^{(k)}) # Note: rhs is NOT the full residual at this point # Residual / right-hand-side r0 = sx[ixm, iy, iz] # r1 = sx[ix, iy, iz] r2 = sy[ix, iym, iz] r3 = sy[ix, iy, iz] r4 = sz[ix, iy, izm] r5 = sz[ix, iy, iz] rhs = np.array([r0, r2, r3, r4, r5]) rhs[0] += mzyRxm*ex[ixm, iyp, iz]/hy[iy] rhs[0] += mzyLxm*ex[ixm, iym, iz]/hy[iym] rhs[0] += myzRxm*ex[ixm, iy, izp]/hz[iz] rhs[0] += myzLxm*ex[ixm, iy, izm]/hz[izm] rhs[1] += (mzxRym*ex[ix, iym, iz] - mzxLym*ex[ixm, iym, iz] + mxzRym*ez[ix, iym, iz] - mxzLym*ez[ix, iym, izm])/hy[iym] rhs[1] += mxzRym*ey[ix, iym, izp]/hz[iz] rhs[1] += mxzLym*ey[ix, iym, izm]/hz[izm] rhs[2] += (mzxLyp*ex[ixm, iyp, iz] - mzxRyp*ex[ix, iyp, iz] + mxzLyp*ez[ix, iyp, izm] - mxzRyp*ez[ix, iyp, iz])/hy[iy] rhs[2] += mxzRyp*ey[ix, iy, izp]/hz[iz] rhs[2] += mxzLyp*ey[ix, iy, izm]/hz[izm] rhs[3] += (myxRzm*ex[ix, iy, izm] - myxLzm*ex[ixm, iy, izm] + mxyRzm*ey[ix, iy, izm] - mxyLzm*ey[ix, iym, izm])/hz[izm] rhs[3] += mxyRzm*ez[ix, iyp, izm]/hy[iy] rhs[3] += mxyLzm*ez[ix, iym, izm]/hy[iym] rhs[4] += (myxLzp*ex[ixm, iy, izp] - myxRzp*ex[ix, iy, izp] + mxyLzp*ey[ix, iym, izp] - mxyRzp*ey[ix, iy, izp])/hz[iz] rhs[4] += mxyRzp*ez[ix, iyp, iz]/hy[iy] rhs[4] += mxyLzp*ez[ix, iym, iz]/hy[iym] # Copy to big system blocks_to_amat(amat, bvec, middle, left, rhs, ixm, nx) # Solve linear system A x = b solve(amat, bvec) # Update efield (here we could apply damping weights) for ix in range(1, nx+1): ixm = ix-1 ex[ixm, iy, iz] = bvec[5*ixm] if ixm < nx-1: ey[ix, iym, iz] = bvec[1+5*ixm] ey[ix, iy, iz] = bvec[2+5*ixm] ez[ix, iy, izm] = bvec[3+5*ixm] ez[ix, iy, iz] = bvec[4+5*ixm]
[docs]@nb.njit(**_numba_setting) def gauss_seidel_y(ex, ey, ez, sx, sy, sz, eta_x, eta_y, eta_z, zeta, hx, hy, hz, nu): r"""Gauss-Seidel method with line relaxation in y-direction. This is the equivalent to :func:`gauss_seidel`, but with line relaxation in the y-direction. See :func:`gauss_seidel` for more details on the smoother itself. The resulting system A x = b to solve consists of n unknowns (x-vector), and the corresponding matrix A is a banded matrix with the main diagonal and five upper and lower diagonals:: .-0 |X|\ 0 0-.-0 left: middle: right: \|X|\ (not used) 0-.-0 0- .- 0 \|X|\ \ |X |\ 0-.-0 0 \|X| 0-. . 1*1, - 4*1, | 1*4, X 4*4, \ 4*4 upper or lower The matrix A is complex and symmetric (A = A^T), and therefore only the main diagonal and the lower five off-diagonals are required. - The right-hand-side b has length 5*ny-4 (ny even). - The matrix A has length of b and 1+2*5 diagonals; we use for it an array of length 6*len(b). The values are computed in rows of 5 lines, with the indicated middle and left matrices as indicated in the above scheme. These blocks are filled into the main matrix A and vector b, and subsequently solved with a non-standard Cholesky factorisation implemented in :func:`solve`. Tangential components at the boundaries are assumed to be zero (PEC boundaries). Note: The smoothing with linerelaxation in y-direction is carried out in reversed lexicographical order, in order to improve speed (memory access). All other smoothers (:func:`gauss_seidel`, :func:`gauss_seidel_x`, and :func:`gauss_seidel_z`) use lexicographical order. The result is stored in the provided electric field components ``ex``, ``ey``, and ``ez``. Parameters ---------- ex, ey, ez : ndarray Electric fields in x-, y-, and z-directions (:class:`emg3d.fields.Field`). sx, sy, sz : ndarray Source fields in x-, y-, and z-directions (:class:`emg3d.fields.Field`). eta_x, eta_y, eta_z, zeta : ndarray Volume-averaged model parameters (:class:`emg3d.models.VolumeModel`). hx, hy, hz : ndarray Cell widths in x-, y-, and z-directions (:class:`emg3d.meshes.TensorMesh`). nu : int Number of Gauss-Seidel iterations. """ # Get dimensions nx = len(hx) ny = len(hy) nz = len(hz) # Get half of the inverse widths kx = 0.5/hx ky = 0.5/hy kz = 0.5/hz # Direction-switch for Gauss-Seidel iback = 0 # Pre-allocating middle and left for the 5x5-temporary middle and left # matrices; will be overwritten at each iteration middle = np.zeros(25, dtype=ex.dtype) left = np.zeros(25) # Pre-allocating full RHS (bvec) and full matrix A (amat). Will be # overwritten after each complete y-loop. nr = 5*ny-4 # Number of unknowns bvec = np.zeros(nr, dtype=ex.dtype) amat = np.zeros(6*nr, dtype=ex.dtype) # Smoothing steps for _ in range(nu): # Direction of Gauss-Seidel ordering; 0=forward, 1=backward iback = 1-iback # Loop over cells, keeping boundaries fixed; y-fastest, then z, x. for izh in range(1, nz): # Back-forth-switch if iback: iz = nz-izh else: iz = izh # Minus/plus indices izm = iz-1 izp = iz+1 for ixh in range(1, nx): # Back-forth-switch if iback: ix = nx-ixh else: ix = ixh # Minus/plus indices ixm = ix-1 ixp = ix+1 # Reset vectors middle[:] = 0. left[:] = 0. bvec[:] = 0. amat[:] = 0. for iyh in range(1, ny+1): # Index and minus index iy = min(iyh, ny-1) iym = iyh-1 # Averaging of 1/mu_r: mzyRxm etc. mzyLxm = ky[iym]*(zeta[ixm, iym, iz] + zeta[ixm, iym, izm]) mzyRxm = ky[iy]*(zeta[ixm, iy, iz] + zeta[ixm, iy, izm]) myzLxm = kz[izm]*(zeta[ixm, iy, izm] + zeta[ixm, iym, izm]) myzRxm = kz[iz]*(zeta[ixm, iy, iz] + zeta[ixm, iym, iz]) mzyLxp = ky[iym]*(zeta[ix, iym, iz] + zeta[ix, iym, izm]) mzyRxp = ky[iy]*(zeta[ix, iy, iz] + zeta[ix, iy, izm]) myzLxp = kz[izm]*(zeta[ix, iy, izm] + zeta[ix, iym, izm]) myzRxp = kz[iz]*(zeta[ix, iy, iz] + zeta[ix, iym, iz]) mzxLym = kx[ixm]*(zeta[ixm, iym, iz] + zeta[ixm, iym, izm]) mzxRym = kx[ix]*(zeta[ix, iym, iz] + zeta[ix, iym, izm]) mxzLym = kz[izm]*(zeta[ix, iym, izm] + zeta[ixm, iym, izm]) mxzRym = kz[iz]*(zeta[ix, iym, iz] + zeta[ixm, iym, iz]) # mzxLyp = kx[ixm]*(zeta[ixm, iy, iz] + zeta[ixm, iy, izm]) # mzxRyp = kx[ix]*(zeta[ix, iy, iz] + zeta[ix, iy, izm]) # mxzLyp = kz[izm]*(zeta[ix, iy, izm] + zeta[ixm, iy, izm]) # mxzRyp = kz[iz]*(zeta[ix, iy, iz] + zeta[ixm, iy, iz]) myxLzm = kx[ixm]*(zeta[ixm, iy, izm] + zeta[ixm, iym, izm]) myxRzm = kx[ix]*(zeta[ix, iy, izm] + zeta[ix, iym, izm]) mxyLzm = ky[iym]*(zeta[ix, iym, izm] + zeta[ixm, iym, izm]) mxyRzm = ky[iy]*(zeta[ix, iy, izm] + zeta[ixm, iy, izm]) myxLzp = kx[ixm]*(zeta[ixm, iy, iz] + zeta[ixm, iym, iz]) myxRzp = kx[ix]*(zeta[ix, iy, iz] + zeta[ix, iym, iz]) mxyLzp = ky[iym]*(zeta[ix, iym, iz] + zeta[ixm, iym, iz]) mxyRzp = ky[iy]*(zeta[ix, iy, iz] + zeta[ixm, iy, iz]) # Diagonal elements st0 = (eta_x[ixm, iy, iz] + eta_x[ixm, iy, izm] + eta_x[ixm, iym, iz] + eta_x[ixm, iym, izm]) st1 = (eta_x[ix, iy, iz] + eta_x[ix, iy, izm] + eta_x[ix, iym, iz] + eta_x[ix, iym, izm]) st2 = (eta_y[ix, iym, iz] + eta_y[ix, iym, izm] + eta_y[ixm, iym, iz] + eta_y[ixm, iym, izm]) # st3 = (eta_y[ix, iy, iz] + eta_y[ix, iy, izm] + # eta_y[ixm, iy, iz] + eta_y[ixm, iy, izm]) st4 = (eta_z[ix, iy, izm] + eta_z[ix, iym, izm] + eta_z[ixm, iy, izm] + eta_z[ixm, iym, izm]) st5 = (eta_z[ix, iy, iz] + eta_z[ix, iym, iz] + eta_z[ixm, iy, iz] + eta_z[ixm, iym, iz]) st = np.array([st2, st0, st1, st4, st5])/4. # Fill middle matrix # Initial diagonal elements for k in range(5): middle[6*k] = -st[k] # Complete diagonals. # middle is symmetric and curl curl part is real-valued. middle[0] += mzxRym/hx[ix] + mzxLym/hx[ixm] # 0,0| 0 middle[0] += mxzRym/hz[iz] + mxzLym/hz[izm] middle[6] += mzyRxm/hy[iy] + mzyLxm/hy[iym] # 1,1| 6 middle[6] += myzRxm/hz[iz] + myzLxm/hz[izm] middle[12] += mzyRxp/hy[iy] + mzyLxp/hy[iym] # 2,2|12 middle[12] += myzRxp/hz[iz] + myzLxp/hz[izm] middle[18] += myxRzm/hx[ix] + myxLzm/hx[ixm] # 3,3|18 middle[18] += mxyRzm/hy[iy] + mxyLzm/hy[iym] middle[24] += myxRzp/hx[ix] + myxLzp/hx[ixm] # 4,4|24 middle[24] += mxyRzp/hy[iy] + mxyLzp/hy[iym] # Off-diagonal elements of middle. # Upper triangle not needed and not set. # The elements # [2, 1] (7); [1, 2] (11); [4, 3] (19); and [3, 4] (23) # are all zero. middle[1] = -mzyLxm/hx[ixm] # 1,0| 1 and 0,1| 5 middle[2] = mzyLxp/hx[ix] # 2,0| 2 and 0,2|10 middle[3] = -mxzLym/hy[iym] # 3,0| 3 and 0,3|15 middle[4] = mxzRym/hy[iym] # 4,0| 4 and 0,4|20 middle[8] = -myzLxm/hx[ixm] # 3,1| 8 and 1,3|16 middle[9] = myzRxm/hx[ixm] # 4,1| 9 and 1,4|21 middle[13] = myzLxp/hx[ix] # 3,2|13 and 2,3|17 middle[14] = -myzRxp/hx[ix] # 4,2|14 and 2,4|22 # Fill left matrix left left[5] = mzxLym/hy[iym] # 0,1| 5 left[10] = -mzxRym/hy[iym] # 0,2|10 left[15] = mxzLym/hy[iym] # 0,3|15 left[20] = -mxzRym/hy[iym] # 0,4|20 left[6] = -mzyLxm/hy[iym] # 1,1| 6 left[12] = -mzyLxp/hy[iym] # 2,2|12 left[18] = -mxyLzm/hy[iym] # 3,3|18 left[24] = -mxyLzp/hy[iym] # 4,4|24 # Fill residual (b - Ux^{(k)}) # Note: rhs is NOT the full residual at this point # Residual / right-hand-side r0 = sx[ixm, iy, iz] r1 = sx[ix, iy, iz] r2 = sy[ix, iym, iz] # r3 = sy[ix, iy, iz] r4 = sz[ix, iy, izm] r5 = sz[ix, iy, iz] rhs = np.array([r2, r0, r1, r4, r5]) rhs[0] += mzxRym*ey[ixp, iym, iz]/hx[ix] rhs[0] += mzxLym*ey[ixm, iym, iz]/hx[ixm] rhs[0] += mxzRym*ey[ix, iym, izp]/hz[iz] rhs[0] += mxzLym*ey[ix, iym, izm]/hz[izm] rhs[1] += (mzyRxm*ey[ixm, iy, iz] - mzyLxm*ey[ixm, iym, iz] + myzRxm*ez[ixm, iy, iz] - myzLxm*ez[ixm, iy, izm])/hx[ixm] rhs[1] += myzRxm*ex[ixm, iy, izp]/hz[iz] rhs[1] += myzLxm*ex[ixm, iy, izm]/hz[izm] rhs[2] += (mzyLxp*ey[ixp, iym, iz] - mzyRxp*ey[ixp, iy, iz] + myzLxp*ez[ixp, iy, izm] - myzRxp*ez[ixp, iy, iz])/hx[ix] rhs[2] += myzRxp*ex[ix, iy, izp]/hz[iz] rhs[2] += myzLxp*ex[ix, iy, izm]/hz[izm] rhs[3] += (myxRzm*ex[ix, iy, izm] - myxLzm*ex[ixm, iy, izm] + mxyRzm*ey[ix, iy, izm] - mxyLzm*ey[ix, iym, izm])/hz[izm] rhs[3] += myxRzm*ez[ixp, iy, izm]/hx[ix] rhs[3] += myxLzm*ez[ixm, iy, izm]/hx[ixm] rhs[4] += (myxLzp*ex[ixm, iy, izp] - myxRzp*ex[ix, iy, izp] + mxyLzp*ey[ix, iym, izp] - mxyRzp*ey[ix, iy, izp])/hz[iz] rhs[4] += myxRzp*ez[ixp, iy, iz]/hx[ix] rhs[4] += myxLzp*ez[ixm, iy, iz]/hx[ixm] # Copy to big system blocks_to_amat(amat, bvec, middle, left, rhs, iym, ny) # Solve linear system A x = b solve(amat, bvec) # Update efield (here we could apply damping weights) for iy in range(1, ny+1): iym = iy-1 ey[ix, iym, iz] = bvec[5*iym] if iym < ny-1: ex[ixm, iy, iz] = bvec[1+5*iym] ex[ix, iy, iz] = bvec[2+5*iym] ez[ix, iy, izm] = bvec[3+5*iym] ez[ix, iy, iz] = bvec[4+5*iym]
[docs]@nb.njit(**_numba_setting) def gauss_seidel_z(ex, ey, ez, sx, sy, sz, eta_x, eta_y, eta_z, zeta, hx, hy, hz, nu): r"""Gauss-Seidel method with line relaxation in z-direction. This is the equivalent to :func:`gauss_seidel`, but with line relaxation in the z-direction. See :func:`gauss_seidel` for more details on the smoother itself. The resulting system A x = b to solve consists of n unknowns (x-vector), and the corresponding matrix A is a banded matrix with the main diagonal and five upper and lower diagonals:: .-0 |X|\ 0 0-.-0 left: middle: right: \|X|\ (not used) 0-.-0 0- .- 0 \|X|\ \ |X |\ 0-.-0 0 \|X| 0-. . 1*1, - 4*1, | 1*4, X 4*4, \ 4*4 upper or lower The matrix A is complex and symmetric (A = A^T), and therefore only the main diagonal and the lower five off-diagonals are required. - The right-hand-side b has length 5*nz-4 (nz even). - The matrix A has length of b and 1+2*5 diagonals; we use for it an array of length 6*len(b). The values are computed in rows of 5 lines, with the indicated middle and left matrices as indicated in the above scheme. These blocks are filled into the main matrix A and vector b, and subsequently solved with a non-standard Cholesky factorisation implemented in :func:`solve`. Tangential components at the boundaries are assumed to be zero (PEC boundaries). The result is stored in the provided electric field components ``ex``, ``ey``, and ``ez``. Parameters ---------- ex, ey, ez : ndarray Electric fields in x-, y-, and z-directions (:class:`emg3d.fields.Field`). sx, sy, sz : ndarray Source fields in x-, y-, and z-directions (:class:`emg3d.fields.Field`). eta_x, eta_y, eta_z, zeta : ndarray Volume-averaged model parameters (:class:`emg3d.models.VolumeModel`). hx, hy, hz : ndarray Cell widths in x-, y-, and z-directions (:class:`emg3d.meshes.TensorMesh`). nu : int Number of Gauss-Seidel iterations. """ # Get dimensions nx = len(hx) ny = len(hy) nz = len(hz) # Get half of the inverse widths kx = 0.5/hx ky = 0.5/hy kz = 0.5/hz # Direction-switch for Gauss-Seidel iback = 0 # Pre-allocating middle and left for the 5x5-temporary middle and left # matrices; will be overwritten at each iteration middle = np.zeros(25, dtype=ex.dtype) left = np.zeros(25) # Pre-allocating full RHS (bvec) and full matrix A (amat). Will be # overwritten after each complete z-loop. nr = 5*nz-4 # Number of unknowns bvec = np.zeros(nr, dtype=ex.dtype) amat = np.zeros(6*nr, dtype=ex.dtype) # Smoothing steps for _ in range(nu): # Direction of Gauss-Seidel ordering; 0=forward, 1=backward iback = 1-iback # Loop over cells, keeping boundaries fixed; z-fastest, then x, y. for iyh in range(1, ny): # Back-forth-switch if iback: iy = ny-iyh else: iy = iyh # Minus/plus indices iym = iy-1 iyp = iy+1 for ixh in range(1, nx): # Back-forth-switch if iback: ix = nx-ixh else: ix = ixh # Minus/plus indices ixm = ix-1 ixp = ix+1 # Reset vectors middle[:] = 0. left[:] = 0. bvec[:] = 0. amat[:] = 0. for izh in range(1, nz+1): # Index and minus index iz = min(izh, nz-1) izm = izh-1 # Averaging of 1/mu_r: mzyRxm etc. mzyLxm = ky[iym]*(zeta[ixm, iym, iz] + zeta[ixm, iym, izm]) mzyRxm = ky[iy]*(zeta[ixm, iy, iz] + zeta[ixm, iy, izm]) myzLxm = kz[izm]*(zeta[ixm, iy, izm] + zeta[ixm, iym, izm]) myzRxm = kz[iz]*(zeta[ixm, iy, iz] + zeta[ixm, iym, iz]) mzyLxp = ky[iym]*(zeta[ix, iym, iz] + zeta[ix, iym, izm]) mzyRxp = ky[iy]*(zeta[ix, iy, iz] + zeta[ix, iy, izm]) myzLxp = kz[izm]*(zeta[ix, iy, izm] + zeta[ix, iym, izm]) myzRxp = kz[iz]*(zeta[ix, iy, iz] + zeta[ix, iym, iz]) mzxLym = kx[ixm]*(zeta[ixm, iym, iz] + zeta[ixm, iym, izm]) mzxRym = kx[ix]*(zeta[ix, iym, iz] + zeta[ix, iym, izm]) mxzLym = kz[izm]*(zeta[ix, iym, izm] + zeta[ixm, iym, izm]) mxzRym = kz[iz]*(zeta[ix, iym, iz] + zeta[ixm, iym, iz]) mzxLyp = kx[ixm]*(zeta[ixm, iy, iz] + zeta[ixm, iy, izm]) mzxRyp = kx[ix]*(zeta[ix, iy, iz] + zeta[ix, iy, izm]) mxzLyp = kz[izm]*(zeta[ix, iy, izm] + zeta[ixm, iy, izm]) mxzRyp = kz[iz]*(zeta[ix, iy, iz] + zeta[ixm, iy, iz]) myxLzm = kx[ixm]*(zeta[ixm, iy, izm] + zeta[ixm, iym, izm]) myxRzm = kx[ix]*(zeta[ix, iy, izm] + zeta[ix, iym, izm]) mxyLzm = ky[iym]*(zeta[ix, iym, izm] + zeta[ixm, iym, izm]) mxyRzm = ky[iy]*(zeta[ix, iy, izm] + zeta[ixm, iy, izm]) # myxLzp = kx[ixm]*(zeta[ixm, iy, iz] + zeta[ixm, iym, iz]) # myxRzp = kx[ix]*(zeta[ix, iy, iz] + zeta[ix, iym, iz]) # mxyLzp = ky[iym]*(zeta[ix, iym, iz] + zeta[ixm, iym, iz]) # mxyRzp = ky[iy]*(zeta[ix, iy, iz] + zeta[ixm, iy, iz]) # Diagonal elements st0 = (eta_x[ixm, iy, iz] + eta_x[ixm, iy, izm] + eta_x[ixm, iym, iz] + eta_x[ixm, iym, izm]) st1 = (eta_x[ix, iy, iz] + eta_x[ix, iy, izm] + eta_x[ix, iym, iz] + eta_x[ix, iym, izm]) st2 = (eta_y[ix, iym, iz] + eta_y[ix, iym, izm] + eta_y[ixm, iym, iz] + eta_y[ixm, iym, izm]) st3 = (eta_y[ix, iy, iz] + eta_y[ix, iy, izm] + eta_y[ixm, iy, iz] + eta_y[ixm, iy, izm]) st4 = (eta_z[ix, iy, izm] + eta_z[ix, iym, izm] + eta_z[ixm, iy, izm] + eta_z[ixm, iym, izm]) # st5 = (eta_z[ix, iy, iz] + eta_z[ix, iym, iz] + # eta_z[ixm, iy, iz] + eta_z[ixm, iym, iz]) st = np.array([st4, st0, st1, st2, st3])/4. # Fill middle matrix # Initial diagonal elements for k in range(5): middle[6*k] = -st[k] # Complete diagonals. # middle is symmetric and curl curl part is real-valued. middle[0] += myxRzm/hx[ix] + myxLzm/hx[ixm] # 0,0| 0 middle[0] += mxyRzm/hy[iy] + mxyLzm/hy[iym] middle[6] += mzyRxm/hy[iy] + mzyLxm/hy[iym] # 1,1| 6 middle[6] += myzRxm/hz[iz] + myzLxm/hz[izm] middle[12] += mzyRxp/hy[iy] + mzyLxp/hy[iym] # 2,2|12 middle[12] += myzRxp/hz[iz] + myzLxp/hz[izm] middle[18] += mzxRym/hx[ix] + mzxLym/hx[ixm] # 3,3|18 middle[18] += mxzRym/hz[iz] + mxzLym/hz[izm] middle[24] += mzxRyp/hx[ix] + mzxLyp/hx[ixm] # 4,4|24 middle[24] += mxzRyp/hz[iz] + mxzLyp/hz[izm] # Off-diagonal elements of middle. # Upper triangle not needed and not set. # The elements # [2, 1] (7); [1, 2] (11); [4, 3] (19); and [3, 4] (23) # are all zero. middle[1] = -myzLxm/hx[ixm] # 1,0| 1 and 0,1| 5 middle[2] = myzLxp/hx[ix] # 2,0| 2 and 0,2|10 middle[3] = -mxzLym/hy[iym] # 3,0| 3 and 0,3|15 middle[4] = mxzLyp/hy[iy] # 4,0| 4 and 0,4|20 middle[8] = -mzyLxm/hx[ixm] # 3,1| 8 and 1,3|16 middle[9] = mzyRxm/hx[ixm] # 4,1| 9 and 1,4|21 middle[13] = mzyLxp/hx[ix] # 3,2|13 and 2,3|17 middle[14] = -mzyRxp/hx[ix] # 4,2|14 and 2,4|22 # Fill left matrix left left[5] = myxLzm/hz[izm] # 0,1| 5 left[10] = -myxRzm/hz[izm] # 0,2|10 left[15] = mxyLzm/hz[izm] # 0,3|15 left[20] = -mxyRzm/hz[izm] # 0,4|20 left[6] = -myzLxm/hz[izm] # 1,1| 6 left[12] = -myzLxp/hz[izm] # 2,2|12 left[18] = -mxzLym/hz[izm] # 3,3|18 left[24] = -mxzLyp/hz[izm] # 4,4|24 # Fill residual (b - Ux^{(k)}) # Note: rhs is NOT the full residual at this point # Residual / right-hand-side r0 = sx[ixm, iy, iz] r1 = sx[ix, iy, iz] r2 = sy[ix, iym, iz] r3 = sy[ix, iy, iz] r4 = sz[ix, iy, izm] # r5 = sz[ix, iy, iz] rhs = np.array([r4, r0, r1, r2, r3]) rhs[0] += myxRzm*(ez[ixp, iy, izm]/hx[ix]) rhs[0] += myxLzm*(ez[ixm, iy, izm]/hx[ixm]) rhs[0] += mxyRzm*(ez[ix, iyp, izm]/hy[iy]) rhs[0] += mxyLzm*(ez[ix, iym, izm]/hy[iym]) rhs[1] += (mzyRxm*ey[ixm, iy, iz] - mzyLxm*ey[ixm, iym, iz] + myzRxm*ez[ixm, iy, iz] - myzLxm*ez[ixm, iy, izm])/hx[ixm] rhs[1] += mzyRxm*ex[ixm, iyp, iz]/hy[iy] rhs[1] += mzyLxm*ex[ixm, iym, iz]/hy[iym] rhs[2] += (mzyLxp*ey[ixp, iym, iz] - mzyRxp*ey[ixp, iy, iz] + myzLxp*ez[ixp, iy, izm] - myzRxp*ez[ixp, iy, iz])/hx[ix] rhs[2] += mzyRxp*ex[ix, iyp, iz]/hy[iy] rhs[2] += mzyLxp*ex[ix, iym, iz]/hy[iym] rhs[3] += (mzxRym*ex[ix, iym, iz] - mzxLym*ex[ixm, iym, iz] + mxzRym*ez[ix, iym, iz] - mxzLym*ez[ix, iym, izm])/hy[iym] rhs[3] += mzxRym*ey[ixp, iym, iz]/hx[ix] rhs[3] += mzxLym*ey[ixm, iym, iz]/hx[ixm] rhs[4] += (mzxLyp*ex[ixm, iyp, iz] - mzxRyp*ex[ix, iyp, iz] + mxzLyp*ez[ix, iyp, izm] - mxzRyp*ez[ix, iyp, iz])/hy[iy] rhs[4] += mzxRyp*ey[ixp, iy, iz]/hx[ix] rhs[4] += mzxLyp*ey[ixm, iy, iz]/hx[ixm] # Copy to big system blocks_to_amat(amat, bvec, middle, left, rhs, izm, nz) # Solve linear system A x = b solve(amat, bvec) # Update efield (here we could apply damping weights) for iz in range(1, nz+1): izm = iz-1 ez[ix, iy, izm] = bvec[5*izm] if izm < nz-1: ex[ixm, iy, iz] = bvec[1+5*izm] ex[ix, iy, iz] = bvec[2+5*izm] ey[ix, iym, iz] = bvec[3+5*izm] ey[ix, iy, iz] = bvec[4+5*izm]
[docs]@nb.njit(**_numba_setting) def blocks_to_amat(amat, bvec, middle, left, rhs, im, nc): r"""Insert middle, left, and rhs into main arrays amat and bvec. The banded matrix ``amat`` contains the main diagonal and the first five lower off-diagonals. They are stored one column after the other, in a 6*n ndarray. .. highlight:: none The complete main matrix ``amat`` and the ``middle`` and ``left`` blocks are given by:: .-0 |X|\ 0 0-.-0 left: middle: right: \|X|\ (not used) 0-.-0 0- .- 0 \|X|\ \ |X |\ 0-.-0 0 \|X| 0-. . 1*1, - 4*1, | 1*4, X 4*4, \ 4*4 upper or lower Both, ``middle`` and ``left``, are 5x5 matrices. The corresponding right-hand-side ``rhs`` is filled into ``bvec``. The matrices ``left`` and ``middle`` provided in a single call are horizontally aligned (not vertically). The sorting of ``amat`` (banded matrix) and ``bvec`` is given by:: amat (66,) example: n = 11 bvec (11,) -------------- -- |01 | FIRST CALL 01 |02 07 | Only `middle` and `rhs` 02 |03 08 13 | are used, not `left`. 03 |04 09 14 19 | 04 |05 10 15 20 25| 05 -------------- -------------- -- | 0 11 16 21 26|31 | SUBSEQUENT CALLS 06 | 12 17 22 27|32 37 | (normal case) 07 | 18 23 28|33 38 43 | Complete `left`, 08 | 24 29|34 39 44 49 | `middle` and `rhs` 09 | 30|35 40 45 50 55| are used. 10 -------------- -------------- --- -- | 0 41 46 51 56|61 LAST CALL 11 | 0 0 0 0| 0 Only top row of `left` | 0 0 0| 0 and the first elements | 0 0| 0 of `middle` and `rhs` | 0| 0 are used. -------------- --- | 0 Single zeros (0) show elements in amat which are 0, hence not used. Their location in amat can be deduced from their neighbours. .. highlight:: default Parameters ---------- amat : ndarray Main banded matrix (stored as array) of length 6*n. bvec : ndarray Main right-hand-side of length n. middle : ndarray Middle block of size 5x5, as ndarray of length 25. Only the diagonal and the lower triangular part are used. left : ndarray Left block of size 5x5, as ndarray of length 25. Only the diagonal and the first row are used. rhs : ndarray Corresponding right-hand-side of length 5. im : int Current minus-index of direction of line relaxation, ``i{x;y;z}m``. nc : int Total number of cells in direction of line relaxation, ``n{x;y;z}``. """ # Define two often used indices fam = 5*im mam = fam-5 if im == 0: # First block-row; only middle, no left # RHS for k in range(5): bvec[k] = rhs[k] # Middle block for k in range(5): for m in range(k+1): amat[k+5*m] = middle[k+5*m] elif im <= nc-2 and nc > 2: # Normal case; full middle and left # RHS for k in range(5): bvec[k+fam] = rhs[k] # Left block for m in range(1, 5): for k in range(m+1): amat[k+fam+5*(m+mam)] = left[k+5*m] # Middle block for k in range(5): for m in range(k+1): amat[k+fam+5*(m+fam)] = middle[k+5*m] elif im == nc-1: # The last point # RHS bvec[fam] = rhs[0] # First row from left block for m in range(1, 5): amat[fam+5*(m+mam)] = left[5*m] # First element from middle block amat[6*fam] = middle[0]
# Actual solver (the core of the core)
[docs]@nb.njit(**_numba_setting) def solve(amat, bvec): r"""Solve A x = b using a non-standard Cholesky factorisation. Solve the system A x = b using a non-standard Cholesky factorisation without pivoting for a symmetric, complex matrix A tailored to the problem of the multigrid solver. The matrix A (``amat``) is an array of length 6*n, containing the main diagonal and the first five lower off-diagonals (ordered so that the first element of the main diagonal is followed by the first elements of the off diagonals, then the second elements and so on). The vector ``bvec`` has length b. The solution is placed in b (``bvec``), and A (``amat``) is replaced by its decomposition. 1. Non-standard Cholesky factorisation. From [Muld07]_: «We use a non-standard Cholesky factorisation. The standard factorisation factors a Hermitian matrix A into L L^H, where L is a lower triangular matrix and L^H its complex conjugate transpose. In our case, the discretisation is based on the Finite Integration Technique ([Weil77]_) and provides a matrix A that is complex-valued and symmetric: A = A^T, where the superscript T denotes the transpose. The line relaxation scheme takes a matrix B that is a subset of A along the line. B is a complex symmetric band matrix with eleven diagonals. The non-standard Cholesky factorisation factors the matrix B into L L^T. Because of the symmetry, only the main diagonal and five lower diagonal elements of B need to be computed. The Cholesky factorisation replaces this matrix by L, containing six diagonals, after which the line relaxation can be carried out by simple back-substitution.» :math:`A = L D L^T` factorisation without pivoting: .. math:: D(j) &= A(j,j)-\sum_{k=1}^{j-1} L(j,k)^2 D(k),\ j=1,..,n ;\\ L(i,j) &= \frac{1}{D(j)} \left[A(i,j)-\sum_{k=1}^{j-1} L(i,k)L(j,k)D(k)\right], \ i=j+1,..,n . A and L are in this case arrays, where :math:`A(i, j) \rightarrow A(i+5j)`. 2. Solve A x = b. Solve A x = b, given L which is the result from the factorisation in the first step (and stored in A), hence, solve L x = b, where x is stored in b: .. math:: b(j) = b(j) - \sum_{k=1}^{j-1} L(j,k) x(k), j = 2,..,n . The result is equivalent with simply using :func:`numpy.linalg.solve`, but faster for the particular use-case of this code. Note that in this custom solver there is no pivoting, and the diagonals of the matrix cannot be zero. Parameters ---------- amat : ndarray Banded matrix A provided as a vector of length 6*n, containing main diagonal plus first five lower diagonals. bvec : ndarray Right-hand-side vector b of length n. """ # Number of unknowns n = len(bvec) # Pre-allocate h h = np.zeros(1, dtype=amat.dtype)[0] # 1. Get L from non-standard Cholesky L D L^T factorisation # First element (i = j = 0). Warning: Diagonals of amat cannot be 0! d = 1./amat[0] # Multiply to other elements of first column (j = 0) for i in range(1, min(n, 6)): amat[i] *= d # Other columns (1 to n) for j in range(1, n): h *= 0. # Reset h for k in range(max(0, j-5), j): h += amat[j+5*k]*amat[j+5*k]*amat[6*k] amat[6*j] -= h # Warning: Diagonals of amat cannot be 0! d = 1./amat[6*j] # Off-diagonals, rows i > j for i in range(j+1, min(n, j+6)): h *= 0. # Reset h for k in range(max(0, i-5), j): h += amat[i+5*k]*amat[j+5*k]*amat[6*k] amat[i+5*j] -= h amat[i+5*j] *= d # Replace diagonal by 1/D amat[6*(n-1)] = d # Last one is still around for j in range(n-2, -1, -1): amat[6*j] = 1./amat[6*j] # 2. Solve A x = b # All elements except first column for j in range(1, n): h *= 0. # Reset h for k in range(max(0, j-5), j): h += amat[j+5*k]*bvec[k] bvec[j] -= h # Divide by diagonal; A[j, j] (hence A[6j]) contains 1/D[j] for j in range(n): bvec[j] *= amat[6*j] # Solve L^T x = b, x stored in b, L is 1 on diagonal for j in range(n-2, -1, -1): h *= 0. # Reset h for k in range(j+1, min(n, j+6)): h += amat[k+5*j]*bvec[k] bvec[j] -= h
# Restriction
[docs]@nb.njit(**_numba_setting) def restrict(crx, cry, crz, rx, ry, rz, wx, wy, wz, sc_dir): r"""Restriction of residual from fine to coarse grid. Corresponds to Equation 8 in [Muld06]_. The equation for the x-direction, using the notation :math:`\{x,y,z\}` instead of :math:`\{1,2,3\}`, is given by .. math:: r_{x,K+1/2,L,M}^{2h} = &\sum_{j_y=-1}^1\sum_{j_z=-1}^1 w_{L,j_y}^y w_{M,j_z}^z \\ &\times \left(r_{x,k+1/2,l+j_y,m+j_z}^h+r_{x,k+3/2,l+j_y,m+j_z}^h\right) . The superscripts :math:`h, 2h` indicate quantities defined on the coarse grid and on the fine grid, respectively. The indices :math:`\{K, L, M\}` on the coarse grid correspond to :math:`\{k, l, m\} = 2\{K, L, M\}` on the fine grid. The weights :math:`w` are obtained from :func:`restrict_weights`. The restrictions of ``rx``, ``ry``, and ``rz`` are stored directly in ``crx``, ``cry``, and ``crz``. Parameters ---------- crx, cry, crz : ndarray Coarse grid {x,y,z}-directed residual (pre-allocated empty arrays). rx, ry, rz : ndarray Fine grid {x,y,z}-directed residual. wx, wy, wz: (ndarray, ndarray, ndarray) Tuples containing the weights (``wl``, ``w0``, ``wr``) as returned from :func:`restrict_weights` for the {x,y,z}-directions. sc_dir : int Direction of semicoarsening; 0 for no semicoarsening. """ # Number of coarse grid edges. cnx, cny, cnz = cry.shape[0], crx.shape[1], crx.shape[2] # Number of fine grid edges. nx, ny, nz = ry.shape[0], rx.shape[1], rx.shape[2] # Get weights wxl, wx0, wxr = wx wyl, wy0, wyr = wy wzl, wz0, wzr = wz if sc_dir == 0: # Standard # Loop over coarse z-edges. for ciz in range(cnz): iz = 2*ciz izm = max(0, iz-1) izp = min(nz-1, iz+1) # Loop over coarse y-edges. for ciy in range(cny): iy = 2*ciy iym = max(0, iy-1) iyp = min(ny-1, iy+1) # Loop over coarse x-edges. for cix in range(cnx): ix = 2*cix ixm = max(0, ix-1) ixp = min(nx-1, ix+1) # Sum the terms for x-field. if cix < cnx-1: crx[cix, ciy, ciz] = wy0[ciy]*( wz0[ciz]*(rx[ix, iy, iz] + rx[ixp, iy, iz]) + wzl[ciz]*(rx[ix, iy, izm] + rx[ixp, iy, izm]) + wzr[ciz]*(rx[ix, iy, izp] + rx[ixp, iy, izp]) ) crx[cix, ciy, ciz] += wyl[ciy]*( wz0[ciz]*(rx[ix, iym, iz] + rx[ixp, iym, iz]) + wzl[ciz]*(rx[ix, iym, izm] + rx[ixp, iym, izm]) + wzr[ciz]*(rx[ix, iym, izp] + rx[ixp, iym, izp]) ) crx[cix, ciy, ciz] += wyr[ciy]*( wz0[ciz]*(rx[ix, iyp, iz] + rx[ixp, iyp, iz]) + wzl[ciz]*(rx[ix, iyp, izm] + rx[ixp, iyp, izm]) + wzr[ciz]*(rx[ix, iyp, izp] + rx[ixp, iyp, izp]) ) # Sum the terms for y-field. if ciy < cny-1: cry[cix, ciy, ciz] = wx0[cix]*( wz0[ciz]*(ry[ix, iy, iz] + ry[ix, iyp, iz]) + wzl[ciz]*(ry[ix, iy, izm] + ry[ix, iyp, izm]) + wzr[ciz]*(ry[ix, iy, izp] + ry[ix, iyp, izp]) ) cry[cix, ciy, ciz] += wxl[cix]*( wz0[ciz]*(ry[ixm, iy, iz] + ry[ixm, iyp, iz]) + wzl[ciz]*(ry[ixm, iy, izm] + ry[ixm, iyp, izm]) + wzr[ciz]*(ry[ixm, iy, izp] + ry[ixm, iyp, izp]) ) cry[cix, ciy, ciz] += wxr[cix]*( wz0[ciz]*(ry[ixp, iy, iz] + ry[ixp, iyp, iz]) + wzl[ciz]*(ry[ixp, iy, izm] + ry[ixp, iyp, izm]) + wzr[ciz]*(ry[ixp, iy, izp] + ry[ixp, iyp, izp]) ) # Sum the terms for z-field. if ciz < cnz-1: crz[cix, ciy, ciz] = wx0[cix]*( wy0[ciy]*(rz[ix, iy, iz] + rz[ix, iy, izp]) + wyl[ciy]*(rz[ix, iym, iz] + rz[ix, iym, izp]) + wyr[ciy]*(rz[ix, iyp, iz] + rz[ix, iyp, izp]) ) crz[cix, ciy, ciz] += wxl[cix]*( wy0[ciy]*(rz[ixm, iy, iz] + rz[ixm, iy, izp]) + wyl[ciy]*(rz[ixm, iym, iz] + rz[ixm, iym, izp]) + wyr[ciy]*(rz[ixm, iyp, iz] + rz[ixm, iyp, izp]) ) crz[cix, ciy, ciz] += wxr[cix]*( wy0[ciy]*(rz[ixp, iy, iz] + rz[ixp, iy, izp]) + wyl[ciy]*(rz[ixp, iym, iz] + rz[ixp, iym, izp]) + wyr[ciy]*(rz[ixp, iyp, iz] + rz[ixp, iyp, izp]) ) elif sc_dir == 1: # Restrict in y- and z-directions # Loop over coarse z-edges. for ciz in range(cnz): iz = 2*ciz izm = max(0, iz-1) izp = min(nz-1, iz+1) # Loop over coarse y-edges. for ciy in range(cny): iy = 2*ciy iym = max(0, iy-1) iyp = min(ny-1, iy+1) # Loop over coarse x-edges. for cix in range(cnx): # Sum the terms for x-field. if cix < cnx-1: crx[cix, ciy, ciz] = wy0[ciy]*( wz0[ciz]*rx[cix, iy, iz] + wzl[ciz]*rx[cix, iy, izm] + wzr[ciz]*rx[cix, iy, izp] ) crx[cix, ciy, ciz] += wyl[ciy]*( wz0[ciz]*rx[cix, iym, iz] + wzl[ciz]*rx[cix, iym, izm] + wzr[ciz]*rx[cix, iym, izp] ) crx[cix, ciy, ciz] += wyr[ciy]*( wz0[ciz]*rx[cix, iyp, iz] + wzl[ciz]*rx[cix, iyp, izm] + wzr[ciz]*rx[cix, iyp, izp] ) # Sum the terms for y-field. if ciy < cny-1: cry[cix, ciy, ciz] = ( wz0[ciz]*(ry[cix, iy, iz] + ry[cix, iyp, iz]) + wzl[ciz]*(ry[cix, iy, izm] + ry[cix, iyp, izm]) + wzr[ciz]*(ry[cix, iy, izp] + ry[cix, iyp, izp]) ) # Sum the terms for z-field. if ciz < cnz-1: crz[cix, ciy, ciz] = ( wy0[ciy]*(rz[cix, iy, iz] + rz[cix, iy, izp]) + wyl[ciy]*(rz[cix, iym, iz] + rz[cix, iym, izp]) + wyr[ciy]*(rz[cix, iyp, iz] + rz[cix, iyp, izp]) ) elif sc_dir == 2: # Restrict in x- and z-directions # Loop over coarse z-edges. for ciz in range(cnz): iz = 2*ciz izm = max(0, iz-1) izp = min(nz-1, iz+1) # Loop over coarse y-edges. for ciy in range(cny): # Loop over coarse x-edges. for cix in range(cnx): ix = 2*cix ixm = max(0, ix-1) ixp = min(nx-1, ix+1) # Sum the terms for x-field. if cix < cnx-1: crx[cix, ciy, ciz] = ( wz0[ciz]*(rx[ix, ciy, iz] + rx[ixp, ciy, iz]) + wzl[ciz]*(rx[ix, ciy, izm] + rx[ixp, ciy, izm]) + wzr[ciz]*(rx[ix, ciy, izp] + rx[ixp, ciy, izp]) ) # Sum the terms for y-field. if ciy < cny-1: cry[cix, ciy, ciz] = wx0[cix]*( wz0[ciz]*ry[ix, ciy, iz] + wzl[ciz]*ry[ix, ciy, izm] + wzr[ciz]*ry[ix, ciy, izp] ) cry[cix, ciy, ciz] += wxl[cix]*( wz0[ciz]*ry[ixm, ciy, iz] + wzl[ciz]*ry[ixm, ciy, izm] + wzr[ciz]*ry[ixm, ciy, izp] ) cry[cix, ciy, ciz] += wxr[cix]*( wz0[ciz]*ry[ixp, ciy, iz] + wzl[ciz]*ry[ixp, ciy, izm] + wzr[ciz]*ry[ixp, ciy, izp] ) # Sum the terms for z-field. if ciz < cnz-1: crz[cix, ciy, ciz] = ( wx0[cix]*(rz[ix, ciy, iz] + rz[ix, ciy, izp]) + wxl[cix]*(rz[ixm, ciy, iz] + rz[ixm, ciy, izp]) + wxr[cix]*(rz[ixp, ciy, iz] + rz[ixp, ciy, izp]) ) elif sc_dir == 3: # Restrict in x- and y-directions # Loop over coarse z-edges. for ciz in range(cnz): # Loop over coarse y-edges. for ciy in range(cny): iy = 2*ciy iym = max(0, iy-1) iyp = min(ny-1, iy+1) # Loop over coarse x-edges. for cix in range(cnx): ix = 2*cix ixm = max(0, ix-1) ixp = min(nx-1, ix+1) # Sum the term for x-field. if cix < cnx-1: crx[cix, ciy, ciz] = ( wy0[ciy]*(rx[ix, iy, ciz] + rx[ixp, iy, ciz]) + wyl[ciy]*(rx[ix, iym, ciz] + rx[ixp, iym, ciz]) + wyr[ciy]*(rx[ix, iyp, ciz] + rx[ixp, iyp, ciz]) ) # Sum the term for y-field. if ciy < cny-1: cry[cix, ciy, ciz] = ( wx0[cix]*(ry[ix, iy, ciz] + ry[ix, iyp, ciz]) + wxl[cix]*(ry[ixm, iy, ciz] + ry[ixm, iyp, ciz]) + wxr[cix]*(ry[ixp, iy, ciz] + ry[ixp, iyp, ciz]) ) # Sum the terms for z-field. if ciz < cnz-1: crz[cix, ciy, ciz] = wx0[cix]*( wy0[ciy]*rz[ix, iy, ciz] + wyl[ciy]*rz[ix, iym, ciz] + wyr[ciy]*rz[ix, iyp, ciz] ) crz[cix, ciy, ciz] += wxl[cix]*( wy0[ciy]*rz[ixm, iy, ciz] + wyl[ciy]*rz[ixm, iym, ciz] + wyr[ciy]*rz[ixm, iyp, ciz] ) crz[cix, ciy, ciz] += wxr[cix]*( wy0[ciy]*rz[ixp, iy, ciz] + wyl[ciy]*rz[ixp, iym, ciz] + wyr[ciy]*rz[ixp, iyp, ciz] ) elif sc_dir == 4: # Restrict in x-direction # Loop over coarse z-edges. for ciz in range(cnz): # Loop over coarse y-edges. for ciy in range(cny): # Loop over coarse x-edges. for cix in range(cnx): ix = 2*cix ixm = max(0, ix-1) ixp = min(nx-1, ix+1) # Sum the terms for x-field. if cix < cnx-1: crx[cix, ciy, ciz] = rx[ix, ciy, ciz] crx[cix, ciy, ciz] += rx[ixp, ciy, ciz] # Sum the terms for y-field. if ciy < cny-1: cry[cix, ciy, ciz] = wx0[cix]*ry[ix, ciy, ciz] cry[cix, ciy, ciz] += wxl[cix]*ry[ixm, ciy, ciz] cry[cix, ciy, ciz] += wxr[cix]*ry[ixp, ciy, ciz] # Sum the terms for z-field. if ciz < cnz-1: crz[cix, ciy, ciz] = wx0[cix]*rz[ix, ciy, ciz] crz[cix, ciy, ciz] += wxl[cix]*rz[ixm, ciy, ciz] crz[cix, ciy, ciz] += wxr[cix]*rz[ixp, ciy, ciz] elif sc_dir == 5: # Restrict in y-direction # Loop over coarse z-edges. for ciz in range(cnz): # Loop over coarse y-edges. for ciy in range(cny): iy = 2*ciy iym = max(0, iy-1) iyp = min(ny-1, iy+1) # Loop over coarse x-edges. for cix in range(cnx): # Sum the terms for x-field. if cix < cnx-1: crx[cix, ciy, ciz] = wy0[ciy]*rx[cix, iy, ciz] crx[cix, ciy, ciz] += wyl[ciy]*rx[cix, iym, ciz] crx[cix, ciy, ciz] += wyr[ciy]*rx[cix, iyp, ciz] # Sum the terms for y-field. if ciy < cny-1: cry[cix, ciy, ciz] = ry[cix, iy, ciz] cry[cix, ciy, ciz] += ry[cix, iyp, ciz] # Sum the terms for z-field. if ciz < cnz-1: crz[cix, ciy, ciz] = wy0[ciy]*rz[cix, iy, ciz] crz[cix, ciy, ciz] += wyl[ciy]*rz[cix, iym, ciz] crz[cix, ciy, ciz] += wyr[ciy]*rz[cix, iyp, ciz] elif sc_dir == 6: # Restrict in z-direction # Loop over coarse z-edges. for ciz in range(cnz): iz = 2*ciz izm = max(0, iz-1) izp = min(nz-1, iz+1) # Loop over coarse y-edges. for ciy in range(cny): # Loop over coarse x-edges. for cix in range(cnx): # Sum the terms for x-field. if cix < cnx-1: crx[cix, ciy, ciz] = wz0[ciz]*rx[cix, ciy, iz] crx[cix, ciy, ciz] += wzl[ciz]*rx[cix, ciy, izm] crx[cix, ciy, ciz] += wzr[ciz]*rx[cix, ciy, izp] # Sum the terms for y-field. if ciy < cny-1: cry[cix, ciy, ciz] = wz0[ciz]*ry[cix, ciy, iz] cry[cix, ciy, ciz] += wzl[ciz]*ry[cix, ciy, izm] cry[cix, ciy, ciz] += wzr[ciz]*ry[cix, ciy, izp] # Sum the terms for z-field. if ciz < cnz-1: crz[cix, ciy, ciz] = rz[cix, ciy, iz] crz[cix, ciy, ciz] += rz[cix, ciy, izp]
[docs]@nb.njit(**_numba_setting) def restrict_weights(nodes, cell_centers, h, cnodes, ccell_centers, ch): r"""Restriction weights for the coarse-grid correction operator. Corresponds to Equation 9 in [Muld06]_. A generalized version of that equation is given by .. math:: w_{Q,-1}^v &= \left(v_{q-1/2}^h-v_{Q-1/2}^{2h}\right)/d_{q-1}^v ,\\ w_{Q,0}^v &= 1 ,\\ w_{Q,1}^v &= \left(v_{Q+1/2}^{2h}-v_{q+1/2}^h \right)/d_{q+1}^v , where :math:`d` are the dual grid cell widths, :math:`v` is one of :math:`\{x, y, z\}`, and :math:`Q, q` the corresponding entries of :math:`\{K, L, M\}, \{k, l, m\}`, respectively. The superscripts :math:`h, 2h` indicate quantities defined on the coarse grid and on the fine grid, respectively. The indices :math:`\{K, L, M\}` on the coarse grid correspond to :math:`\{k, l, m\} = 2\{K, L, M\}` on the fine grid. For the dual volume cell widths at the boundaries the scheme of [MoSu94]_ is applied, where :math:`d_0^x = h_{1/2}^x/2` at :math:`k = 0`, :math:`d_{N_x}^x = h_{N_x-1/2}^x` at :math:`k = N_x`, and so on. The following parameters must all be in the same direction, hence, all must be either for the x, the y, or the z direction. The returned weights are for this direction. Parameters ---------- nodes, cnodes : ndarray Cell edges of the fine (``nodes``) and coarse (``cnodes``) grids. cell_centers, ccell_centers : ndarray Cell centers of the fine (``cell_centers``) and coarse (``ccell_centers``) grids. h, ch : ndarray Cell widths of the fine (``h``) and coarse (``ch``) grids. Returns ------- wl, w0, wr : ndarray Left, central, and right weights in the direction provided in the input. """ # Get length of weights n = len(cnodes) # Dual grid cell widths d = np.empty(n+1) d[0] = h[0]/2 d[-1] = h[-1]/2 for i in range(1, n): d[i] = (h[2*i-2]+h[2*i-1])/2. # Left weight wl = 1/d[:-1] wl[0] *= (nodes[0]-h[0]/2) - (cnodes[0]-ch[0]/2) for i in range(1, n): wl[i] *= cell_centers[2*i-1]-ccell_centers[i-1] # Central weight w0 = np.ones(n) # Right weight wr = 1/d[1:] wr[-1] *= (cnodes[-1]+ch[-1]/2) - (nodes[-1]+h[-1]/2) for i in range(n-1): wr[i] *= ccell_centers[i]-cell_centers[2*i] return wl, w0, wr