Figures of Merit in `vipdopt`¶

This notebook serves as a guide to using the figure of merit (FoM) class FoM. This class has a number of tools designed to abstract the notion of a FoM.

The FoMs in vipdopt are based on the adjoint state method (see this paper)

For later cells to work, please run the following block first.

[29]:

# imports
from pathlib import Path
import sys

import numpy as np

np.set_printoptions(threshold=100)

# Get vipdopt directory path from Notebook
parent_dir = str(Path().resolve().parents[2])

# Add to sys.path
sys.path.insert(0, parent_dir)

# Imports from vipdopt
from vipdopt.optimization import FoM, BayerFilterFoM, UniformMAEFoM, SuperFoM
from vipdopt.simulation import Power, Profile, LumericalSimulation, GaussianSource, DipoleSource, LumericalFDTD

Using the `FoM` Class¶

The FoM class at its core is defined by two functions:

A function that computes the FoM (fom_func)
A function that computes the gradient of the FoM for optimization (grad_func)

To instantiate a FoM object, these two functions will need to be provided as well as a number of other arguments:

Which polarization to use out of TE, TM, or TE + TM (polarization)
A list of sources to enable in the forward / adjoint simulations (fwd_srcs / adj_srcs)
A list of monitors in the forward / adjoint simulations from which data will be accessed in computing the FoM (fwd_monitors / adj_monitors)
A list of the indices of frequencies being minimized / maximized in the optimization (pos_max_freqs / neg_min_freqs)
A list of weights to apply to each frequency (discussed more in “Spectral Weighting”)
A function for reducing the FoM from per-voxel to a single value (defaults to sum)

Monitor Data and Frequencies¶

Note that monitors will output data of shape N x … x L where L is the total number of frequencies. So specifying that pos_max_freqs=range(L) is equivalent to saying that we’re maximizing over every frequency.

Reducing `FoM`s¶

When the FoM is computed the function obtains a value at each point in space, for every frequency. These values are then reduced to return a final single value. This can be changed by setting reduce=False when calling fom.compute_fom(). By default, the reduction function used is sum, (so all of the values will be summed together), but this can be replaced with another function should the user wish. For example, UniformMAEFoM uses mean instead.

When the gradient is computed, the fom may need to be computed, due to the nature of differentiation, and in this case the values will not be reduced.

Example¶

In the following example, we make use of the BayerFilterFoM subclass, which has pre-defined functions for the FoM and gradient.

[ ]:

# Note: all object names are based on those in "simulation_example.json"


# List of sources needed to make forward / adjoint simulations
forward_sources = [
    GaussianSource('forward_src_x'),
]
adjoint_sources = [
    GaussianSource('forward_src_x'),
    DipoleSource('adjoint_src_0x'),
]

# Monitors whose data is necessary for computing the FoM
forward_monitors = [
    Power('focal_monitor_0'),
    Power('transmission_monitor_0'),
    Profile('design_efield_monitor'),
]
adjoint_monitors = [
    Profile('design_efield_monitor'),
]

fom = BayerFilterFoM(
    polarization='TE',
    fwd_srcs=forward_sources,
    adj_srcs=adjoint_sources,
    fwd_monitors=forward_monitors,
    adj_monitors=adjoint_monitors,
    pos_max_freqs=list(range(60)),  # We're maximizing all 60 frequencies
    neg_min_freqs=[],
)

Each FoM is also able to create its forward and adjoint simulations given a provided “base simulation” to start from. It achieves this by disabling all sources except for those specified, and by replacing its Monitor’s with their counterparts in the simulation.

So long as the FoM’s monitors and sources share a name with a corresponding object in the base simulation, the create_fwd_sim and create_adj_sim methods will create their respective simulation variants automatically.

[ ]:

base_sim = LumericalSimulation('simulation_example.json')

fwd_sim = fom.create_forward_sim(base_sim)[0]  # Access first element, as method returns a list
adj_sim = fom.create_adjoint_sim(base_sim)[0]

# Run the simulations and save the monitor data
fdtd = LumericalFDTD()

fdtd.save('fwd_sim.fsp', fwd_sim)
fdtd.save('adj_sim.fsp', adj_sim)

fdtd.addjob('fwd_sim.fsp')
fdtd.addjob('adj_sim.fsp')

fdtd.runjobs()
fdtd.reformat_monitor_data([fwd_sim, adj_sim])

Now that the FoM has been created, and its forward and adjoint simulations have been run, we can finally compute its functions.

[ ]:

f = fom.compute_fom()

g = fom.compute_grad()

print(f'The FoM is {f}')
print(f'The gradient is {g}')

Arithmetic Functions on `FoM`’s¶

Every FoM is equipped with arithmetic functions in order to combine separate figures. The supported operations are as follows:

Addition / subtraction
Multiplication
Scalar multiplication / division

Combining FoM in this way will result in a SuperFoM which represents a weighted sum of the different FoM’s. A SuperFoM contains two lists:

a list of tuples of FoMs
a list of weights to corresponding to each tuple

IMPORTANT: When combining FoMs, they must use the same arugments in compute_fom and compute_gradient. Calling SuperFom.compute_fom(*args) will pass those arguments to all of the FoMs.

Each tuple is a product of all of the FoMs it contains. So if you had the tuple (f(x),), that represents just \(f(x)\) by itself, whereas (f(x), g(x)) would correspond to a \(f(x)\cdot g(x)\) factor. Using the supported operations can therefore produce combined FoMs of the form \(\text{FoM}(x) = w_1 f(x) + w_2 g(x) h(x)\) , where \(f, g, h\) are FoMs, possibly SuperFoMs.

[30]:

# This FoM takes an array and measures the absolute error from a pre-determined
# constant (i.e. 0.5) at each point. It doesn't require monitors, or even a
# simulation to be run and is primarily for examples / testing.
fom1 = UniformMAEFoM([0], [0], constant=0.5)
fom2 = UniformMAEFoM([0], [0], constant=1.5)

combined_fom = fom1 + fom2

test_array = np.arange(9).reshape((3, 3))

# The combination should evaluate to the same value
fom1_val = fom1.compute_fom(test_array)
fom2_val = fom2.compute_fom(test_array)
assert fom1_val + fom2_val == combined_fom.compute_fom(test_array)

# We can also change the weights of the FoMs
combined_fom = 1.5 * fom1 + 2 * fom2
assert combined_fom.compute_fom(test_array) == 1.5 * fom1_val + 2 * fom2_val

# Multiplication works as expected
combined_fom = fom1 * fom2
assert combined_fom.compute_fom(test_array) == fom1_val * fom2_val

# Computing the gradient also works as expected (applying the product rule)
fom1_grad = fom1.compute_grad(test_array)
fom2_grad = fom2.compute_grad(test_array)
fom1_arr = fom1.compute_fom(test_array, reduce=False)  # When used in computing the gradient, the FoM is not summed
fom2_arr = fom2.compute_fom(test_array, reduce=False)
np.testing.assert_equal(combined_fom.compute_grad(test_array), fom1_grad * fom2_arr + fom1_arr * fom2_grad)

You can also create a SuperFoM manually, by providing the list of FoMs and weights. This method is not recommended as it is more prone to errors in formatting the list.

[31]:

fom1 = UniformMAEFoM([0], [0], constant=0.5)
fom2 = UniformMAEFoM([0], [0], constant=1.5)

combined_fom = SuperFoM(foms=[(fom1,), (fom2,)], weights=[1.0, 1.0])

# All of the above tests should work the same
assert fom1_val + fom2_val == combined_fom.compute_fom(test_array)

# Changing weights
combined_fom.weights = [1.5, 2.0]
assert combined_fom.compute_fom(test_array) == 1.5 * fom1_val + 2 * fom2_val

# Multiplication
combined_fom = SuperFoM(foms=[(fom1, fom2)], weights=[1.0])
assert combined_fom.compute_fom(test_array) == fom1_val * fom2_val

# Computing the gradient
fom1_grad = fom1.compute_grad(test_array)
fom2_grad = fom2.compute_grad(test_array)
fom1_arr = fom1.compute_fom(test_array, reduce=False)  # When used in computing the gradient, the FoM is not summed
fom2_arr = fom2.compute_fom(test_array, reduce=False)
np.testing.assert_equal(combined_fom.compute_grad(test_array), fom1_grad * fom2_arr + fom1_arr * fom2_grad)

Spectral Weighting and Performance Weighting¶

FoMs come with two additional features, spectral weighting and performance weighting.

Spectral Weighting¶

Spectral weighting refers to weights applied to each frequency when computing the FoM. This is an argument that is passed into the constructor when instantiating a FoM object. By default, each frequency is given an equal weight.

Below we the first \(n\) non-negative integers arranged in a \(2 \times 2 \times n\) array \(A\) as input, where \(n\) is the number of frequecny channels. For the spectral weights, we use a vector \(w\) defiend by \(w_i = i\). The FoM is the MAE with 2n, i.e. \(f_{ijk} = w_k(1 - | a_{ijk} - 2n |)\), which makes the gradient \(g_{ijk} = w_k \cdot \text{sign}(2n - a_{ijk})\). These values are then used to compute the weighted sum over the frequency axis, meaning

\[g_{ij} = \sum_{k=1}^{n} w_k \cdot \text{sign}(2n - a_{ijk})\]

In the example where \(n=3\), this gives

\[\begin{split} G = \begin{bmatrix} 0 & -2\\ 0 & -4 \end{bmatrix}\end{split}\]

Try playing around with the value of n_freq below.

[32]:

# Use n frequency channels, with weights 1, 2, ..., n respectively
n_freq = 3
spectral_weights = np.linspace(1.0, n_freq, n_freq, endpoint=True)

fom = UniformMAEFoM(range(n_freq), [], 2 * n_freq, spectral_weights=spectral_weights)

# First 4n whole numbers in 2x2xn shape
multi_freq_array = np.arange(4 * n_freq).reshape((2, 2, n_freq), order='F')

expected_fom = np.dot(1 - np.abs(multi_freq_array - 2 * n_freq), spectral_weights).mean()
np.testing.assert_allclose(fom.compute_fom(multi_freq_array), expected_fom)

# Gradient at each point g_{ij} = (Aw)_{ij}, where A is defined by a_{ijk} = sign(2n - x_{ijk}) and w is the column vector [1, 2, ..., n]

print(f'Gradient at each point: {fom.compute_grad(multi_freq_array)}')

Gradient at each point: [[ 0. -2.]
 [ 0. -4.]]

Performance Weighting¶

Performance weighting allows the weights of a FoM to adapt to the current state of an optimization. It will recompute weights based on the values of the various FoMs, and then apply those weights when computing the gradient. The following two equations are used to compute the gradient when performance weighting is enabled:

\[w_{i} = \max \left( {\frac{2}{N} - \frac{{FoM_{i}^{2} }}{{\sum FoM_{j}^{2} }},\; 0} \right)\]

\[g(x,\;y) = \sum w_{i} \frac{{\partial FoM_{i} }}{\partial \epsilon }\]

This process and these equations come from Mechanically reconfigurable multi-functional meta-optics studied at microwave frequencies.

Performance weights will be computed automatically whenever SuperFoM.compute_fom is called and stored for the next time SuperFoM.compute_grad is called. The weights will only be used in the gradient if apply_performance_weights is enabled.

[33]:

fom1 = UniformMAEFoM([0], [0], constant=0.0)
fom2 = UniformMAEFoM([0], [0], constant=5.0)
fom3 = UniformMAEFoM([0], [0], constant=10.0)

combined_fom = fom1 + fom2 + fom3

print(f'The initial weights are: {combined_fom.weights}')
print(f'The initial performance weights are: {combined_fom.performance_weights}')

# Computes the FoM computes new performance weights.
test_array = np.arange(9).reshape((3, 3))
combined_fom.compute_fom(test_array)
print(f'\nThe weights are unchanged: {combined_fom.weights}...')
print(f'... but the performance weights have changed: {combined_fom.performance_weights}')

# Comute gradient using the performance weights
normal_grad = combined_fom.compute_grad(test_array)
performance_grad = combined_fom.compute_grad(test_array, apply_performance_weights=True)
print(f'\nGradient using normal weights:\n{normal_grad}')
print(f'Gradient using performance weights:\n{performance_grad}')

The initial weights are: [1.0, 1.0, 1.0]
The initial performance weights are: [1. 1. 1.]

The weights are unchanged: [1.0, 1.0, 1.0]...
... but the performance weights have changed: [0.40793201 0.59206799 0.        ]

Gradient using normal weights:
[[ 2.  1.  1.]
 [ 1.  1.  0.]
 [-1. -1. -1.]]
Gradient using performance weights:
[[ 0.59206799  0.18413598  0.18413598]
 [ 0.18413598  0.18413598 -0.40793201]
 [-1.         -1.         -1.        ]]

Defining a New Figure of Merit¶

The FoM class at its core is defined by two functions:

A function that computes the FoM (fom_func)
A function that computes the gradient of the FoM for optimization (grad_func)

That’s it! In fact, to implement your own FoM class, all you need to do is provide these two functions and it will be compatible with all other optimization code.

In the following examples, we use the following FoM:

\[FoM(x) = x^2 \qquad FoM'(x) = 2 x\]

[34]:

def fom_func(x):
    return x ** 2

def gradient_func(x):
    return 2 * x

# Here we create our FoM using the FoM class directly
fom = FoM('TE+TM', [], [], [], [], fom_func, gradient_func, [0], [])
assert fom.compute_fom(5) == 25
assert fom.compute_grad(5) == 10

[35]:

# We can instead define our own FoM subclass
class SquareFoM(FoM):

    # We don't need the majority of the arguments for a normal FoM since we dont need a simulation
    def __init__(
            self,
            pos_max_freqs,
            neg_min_freqs,
            spectral_weights=np.array(1),
    ) -> None:
        # For all of the unnecessary arguments we use arbitrary values
        super().__init__(
            'TE+TM',
            [],
            [],
            [],
            [],
            self._square_fom,
            self._square_gradient,
            pos_max_freqs,
            neg_min_freqs,
            spectral_weights,
        )

    def _square_fom(self, x):
        return x ** 2

    def _square_gradient(self, x):
        return 2 * x

fom = SquareFoM([1], [])
assert fom.compute_fom(5) == 25
assert fom.compute_grad(5) == 10

Figures of Merit in vipdopt¶

Using the FoM Class¶