{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Figures of Merit in `vipdopt`\n",
    "\n",
    "This notebook serves as a guide to using the figure of merit (FoM) class `FoM`. This\n",
    "class has a number of tools designed to abstract the notion of a FoM.\n",
    "\n",
    "The FoMs in `vipdopt` are based on the adjoint state method (see [this paper](https://doi.org/10.1364/OE.21.021693))\n",
    "\n",
    "For later cells to work, please run the following block first.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "metadata": {},
   "outputs": [],
   "source": [
    "# imports\n",
    "from pathlib import Path\n",
    "import sys  \n",
    "\n",
    "import numpy as np\n",
    "\n",
    "np.set_printoptions(threshold=100)\n",
    "\n",
    "# Get vipdopt directory path from Notebook\n",
    "parent_dir = str(Path().resolve().parents[2])\n",
    "\n",
    "# Add to sys.path\n",
    "sys.path.insert(0, parent_dir)\n",
    "\n",
    "# Imports from vipdopt\n",
    "from vipdopt.optimization import FoM, BayerFilterFoM, UniformMAEFoM, SuperFoM\n",
    "from vipdopt.simulation import Power, Profile, LumericalSimulation, GaussianSource, DipoleSource, LumericalFDTD\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Using the `FoM` Class \n",
    "\n",
    "The `FoM` class at its core is defined by two functions:\n",
    "\n",
    "* A function that computes the FoM (`fom_func`)\n",
    "* A function that computes the gradient of the FoM for optimization (`grad_func`)\n",
    "\n",
    "To instantiate a `FoM` object, these two functions will need to be provided as well as\n",
    "a number of other arguments:\n",
    "\n",
    "* Which polarization to use out of TE, TM, or TE + TM (`polarization`)\n",
    "* A list of sources to enable in the forward / adjoint simulations (`fwd_srcs` / `adj_srcs`)\n",
    "* A list of monitors in the forward / adjoint simulations from which data will be accessed in computing the FoM (`fwd_monitors` / `adj_monitors`)\n",
    "* A list of the indices of frequencies being minimized / maximized in the optimization (`pos_max_freqs` / `neg_min_freqs`)\n",
    "* A list of weights to apply to each frequency (discussed more in [\"Spectral Weighting\"](#spectral-weighting))\n",
    "* A function for reducing the FoM from per-voxel to a single value (defaults to sum)\n",
    "\n",
    "### Monitor Data and Frequencies\n",
    "Note that monitors will output data of shape N x ... x L where L is the total number of\n",
    "frequencies. So specifying that `pos_max_freqs=range(L)` is equivalent to saying that\n",
    "we're maximizing over every frequency.\n",
    "\n",
    "### Reducing `FoM`s\n",
    "When the FoM is computed the function obtains a value at each point in space, for every frequency.\n",
    "These values are then reduced to return a final single value. This can be changed by setting \n",
    "`reduce=False` when calling `fom.compute_fom()`. By default, the reduction function used is `sum`,\n",
    "(so all of the values will be summed together), but this can be replaced with another function\n",
    "should the user wish. For example, `UniformMAEFoM` uses `mean` instead.\n",
    "\n",
    "When the gradient is computed, the fom may need to be computed, due to the nature of differentiation,\n",
    "and in this case the values will not be reduced.\n",
    "\n",
    "#### Example\n",
    "In the following example, we make use of the `BayerFilterFoM` subclass, which has\n",
    "pre-defined functions for the FoM and gradient."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Note: all object names are based on those in \"simulation_example.json\"\n",
    "\n",
    "\n",
    "# List of sources needed to make forward / adjoint simulations\n",
    "forward_sources = [\n",
    "    GaussianSource('forward_src_x'),\n",
    "]\n",
    "adjoint_sources = [\n",
    "    GaussianSource('forward_src_x'),\n",
    "    DipoleSource('adjoint_src_0x'),\n",
    "]\n",
    "\n",
    "# Monitors whose data is necessary for computing the FoM\n",
    "forward_monitors = [\n",
    "    Power('focal_monitor_0'),\n",
    "    Power('transmission_monitor_0'),\n",
    "    Profile('design_efield_monitor'),\n",
    "]\n",
    "adjoint_monitors = [\n",
    "    Profile('design_efield_monitor'),\n",
    "]\n",
    "\n",
    "fom = BayerFilterFoM(\n",
    "    polarization='TE',\n",
    "    fwd_srcs=forward_sources,\n",
    "    adj_srcs=adjoint_sources,\n",
    "    fwd_monitors=forward_monitors,\n",
    "    adj_monitors=adjoint_monitors,\n",
    "    pos_max_freqs=list(range(60)),  # We're maximizing all 60 frequencies \n",
    "    neg_min_freqs=[],\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Each FoM is also able to create its forward and adjoint simulations given a provided\n",
    "\"base simulation\" to start from. It achieves this by disabling all sources except for\n",
    "those specified, and by replacing its `Monitor`'s with their counterparts in the \n",
    "simulation.\n",
    "\n",
    "So long as the FoM's monitors and sources share a name with a corresponding object in \n",
    "the base simulation, the `create_fwd_sim` and `create_adj_sim` methods will create \n",
    "their respective simulation variants automatically."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "base_sim = LumericalSimulation('simulation_example.json')\n",
    "\n",
    "fwd_sim = fom.create_forward_sim(base_sim)[0]  # Access first element, as method returns a list\n",
    "adj_sim = fom.create_adjoint_sim(base_sim)[0]\n",
    "\n",
    "# Run the simulations and save the monitor data\n",
    "fdtd = LumericalFDTD()\n",
    "\n",
    "fdtd.save('fwd_sim.fsp', fwd_sim)\n",
    "fdtd.save('adj_sim.fsp', adj_sim)\n",
    "\n",
    "fdtd.addjob('fwd_sim.fsp')\n",
    "fdtd.addjob('adj_sim.fsp')\n",
    "\n",
    "fdtd.runjobs()\n",
    "fdtd.reformat_monitor_data([fwd_sim, adj_sim])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now that the FoM has been created, and its forward and adjoint simulations have been \n",
    "run, we can finally compute its functions."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "f = fom.compute_fom()\n",
    "\n",
    "g = fom.compute_grad()\n",
    "\n",
    "print(f'The FoM is {f}')\n",
    "print(f'The gradient is {g}')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Arithmetic Functions on `FoM`'s\n",
    "\n",
    "Every `FoM` is equipped with arithmetic functions in order to combine separate figures.  The supported operations are as follows:\n",
    "\n",
    "* Addition / subtraction\n",
    "* Multiplication\n",
    "* Scalar multiplication / division\n",
    "\n",
    "Combining `FoM` in this way will result in a `SuperFoM` which represents a weighted sum of the different `FoM`'s. A `SuperFoM` contains two lists:\n",
    "\n",
    "* a list of tuples of `FoM`s\n",
    "* a list of weights to corresponding to each tuple\n",
    "\n",
    "IMPORTANT: When combining `FoM`s, they must use the same arugments in `compute_fom` and `compute_gradient`.\n",
    "Calling `SuperFom.compute_fom(*args)` will pass those arguments to all of the `FoM`s.\n",
    "\n",
    "Each tuple is a product of all of the FoMs it contains. So if you had the tuple\n",
    "`(f(x),)`, that represents just $f(x)$ by itself, whereas `(f(x), g(x))` would correspond to a $f(x)\\cdot g(x)$ factor. Using the supported operations can therefore produce combined FoMs of the form $\\text{FoM}(x) = w_1 f(x) + w_2 g(x) h(x)$ , where $f, g, h$ are FoMs, possibly `SuperFoM`s."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 30,
   "metadata": {},
   "outputs": [],
   "source": [
    "# This FoM takes an array and measures the absolute error from a pre-determined\n",
    "# constant (i.e. 0.5) at each point. It doesn't require monitors, or even a\n",
    "# simulation to be run and is primarily for examples / testing.\n",
    "fom1 = UniformMAEFoM([0], [0], constant=0.5)\n",
    "fom2 = UniformMAEFoM([0], [0], constant=1.5)\n",
    "\n",
    "combined_fom = fom1 + fom2\n",
    "\n",
    "test_array = np.arange(9).reshape((3, 3))\n",
    "\n",
    "# The combination should evaluate to the same value\n",
    "fom1_val = fom1.compute_fom(test_array)\n",
    "fom2_val = fom2.compute_fom(test_array)\n",
    "assert fom1_val + fom2_val == combined_fom.compute_fom(test_array)\n",
    "\n",
    "# We can also change the weights of the FoMs\n",
    "combined_fom = 1.5 * fom1 + 2 * fom2\n",
    "assert combined_fom.compute_fom(test_array) == 1.5 * fom1_val + 2 * fom2_val\n",
    "\n",
    "# Multiplication works as expected\n",
    "combined_fom = fom1 * fom2\n",
    "assert combined_fom.compute_fom(test_array) == fom1_val * fom2_val\n",
    "\n",
    "# Computing the gradient also works as expected (applying the product rule)\n",
    "fom1_grad = fom1.compute_grad(test_array)\n",
    "fom2_grad = fom2.compute_grad(test_array)\n",
    "fom1_arr = fom1.compute_fom(test_array, reduce=False)  # When used in computing the gradient, the FoM is not summed\n",
    "fom2_arr = fom2.compute_fom(test_array, reduce=False)\n",
    "np.testing.assert_equal(combined_fom.compute_grad(test_array), fom1_grad * fom2_arr + fom1_arr * fom2_grad)\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "You can also create a `SuperFoM` manually, by providing the list of `FoM`s and weights.\n",
    "This method is not recommended as it is more prone to errors in formatting the list."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 31,
   "metadata": {},
   "outputs": [],
   "source": [
    "fom1 = UniformMAEFoM([0], [0], constant=0.5)\n",
    "fom2 = UniformMAEFoM([0], [0], constant=1.5)\n",
    "\n",
    "combined_fom = SuperFoM(foms=[(fom1,), (fom2,)], weights=[1.0, 1.0])\n",
    "\n",
    "# All of the above tests should work the same\n",
    "assert fom1_val + fom2_val == combined_fom.compute_fom(test_array)\n",
    "\n",
    "# Changing weights\n",
    "combined_fom.weights = [1.5, 2.0]\n",
    "assert combined_fom.compute_fom(test_array) == 1.5 * fom1_val + 2 * fom2_val\n",
    "\n",
    "# Multiplication\n",
    "combined_fom = SuperFoM(foms=[(fom1, fom2)], weights=[1.0])\n",
    "assert combined_fom.compute_fom(test_array) == fom1_val * fom2_val\n",
    "\n",
    "# Computing the gradient\n",
    "fom1_grad = fom1.compute_grad(test_array)\n",
    "fom2_grad = fom2.compute_grad(test_array)\n",
    "fom1_arr = fom1.compute_fom(test_array, reduce=False)  # When used in computing the gradient, the FoM is not summed\n",
    "fom2_arr = fom2.compute_fom(test_array, reduce=False)\n",
    "np.testing.assert_equal(combined_fom.compute_grad(test_array), fom1_grad * fom2_arr + fom1_arr * fom2_grad)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Spectral Weighting and Performance Weighting\n",
    "\n",
    "FoMs come with two additional features, spectral weighting and performance weighting.\n",
    "\n",
    "#### Spectral Weighting\n",
    "Spectral weighting refers to weights applied to each frequency when computing the FoM. This is an\n",
    "argument that is passed into the constructor when instantiating a `FoM` object. By default, each\n",
    "frequency is given an equal weight.\n",
    "\n",
    "Below we the first $n$ non-negative integers arranged in a $2 \\times 2 \\times n$ array $A$ as input, where $n$ is\n",
    "the number of frequecny channels. For the spectral weights, we use a vector $w$ defiend by $w_i = i$.\n",
    "The FoM is the MAE with 2n, i.e. $f_{ijk} = w_k(1 - | a_{ijk} - 2n |)$, which makes the gradient $g_{ijk} = w_k \\cdot \\text{sign}(2n - a_{ijk})$. These values\n",
    "are then used to compute the weighted sum over the frequency axis, meaning\n",
    "$$g_{ij} = \\sum_{k=1}^{n} w_k \\cdot \\text{sign}(2n - a_{ijk})$$\n",
    "\n",
    "In the example where $n=3$, this gives \n",
    "$$ G = \\begin{bmatrix}\n",
    "0 & -2\\\\\n",
    "0 & -4\n",
    "\\end{bmatrix}\n",
    "$$\n",
    "\n",
    "\n",
    "Try playing around with the value of `n_freq` below."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 32,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Gradient at each point: [[ 0. -2.]\n",
      " [ 0. -4.]]\n"
     ]
    }
   ],
   "source": [
    "# Use n frequency channels, with weights 1, 2, ..., n respectively\n",
    "n_freq = 3\n",
    "spectral_weights = np.linspace(1.0, n_freq, n_freq, endpoint=True)\n",
    "\n",
    "fom = UniformMAEFoM(range(n_freq), [], 2 * n_freq, spectral_weights=spectral_weights)\n",
    "\n",
    "# First 4n whole numbers in 2x2xn shape\n",
    "multi_freq_array = np.arange(4 * n_freq).reshape((2, 2, n_freq), order='F')\n",
    "\n",
    "expected_fom = np.dot(1 - np.abs(multi_freq_array - 2 * n_freq), spectral_weights).mean()\n",
    "np.testing.assert_allclose(fom.compute_fom(multi_freq_array), expected_fom)\n",
    "\n",
    "# Gradient at each point g_{ij} = (Aw)_{ij}, where A is defined by a_{ijk} = sign(2n - x_{ijk}) and w is the column vector [1, 2, ..., n]\n",
    "\n",
    "print(f'Gradient at each point: {fom.compute_grad(multi_freq_array)}') "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Performance Weighting\n",
    "Performance weighting allows the weights of a FoM to adapt to the current state of an optimization.\n",
    "It will recompute weights based on the values of the various FoMs, and then apply those weights when\n",
    "computing the gradient. The following two equations are used to compute the gradient when performance\n",
    "weighting is enabled:\n",
    "\n",
    "$$w_{i} = \\max \\left( {\\frac{2}{N} - \\frac{{FoM_{i}^{2} }}{{\\sum FoM_{j}^{2} }},\\; 0} \\right)$$\n",
    "$$g(x,\\;y) = \\sum w_{i} \\frac{{\\partial FoM_{i} }}{\\partial \\epsilon }$$\n",
    "\n",
    "This process and these equations come from [Mechanically reconfigurable multi-functional meta-optics studied at microwave frequencies](https://doi.org/10.1038/s41598-021-88785-5).\n",
    "\n",
    "Performance weights will be computed automatically whenever `SuperFoM.compute_fom` is called and stored\n",
    "for the next time `SuperFoM.compute_grad` is called. The weights will only be used in the gradient\n",
    "if `apply_performance_weights` is enabled.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 33,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "The initial weights are: [1.0, 1.0, 1.0]\n",
      "The initial performance weights are: [1. 1. 1.]\n",
      "\n",
      "The weights are unchanged: [1.0, 1.0, 1.0]...\n",
      "... but the performance weights have changed: [0.40793201 0.59206799 0.        ]\n",
      "\n",
      "Gradient using normal weights:\n",
      "[[ 2.  1.  1.]\n",
      " [ 1.  1.  0.]\n",
      " [-1. -1. -1.]]\n",
      "Gradient using performance weights:\n",
      "[[ 0.59206799  0.18413598  0.18413598]\n",
      " [ 0.18413598  0.18413598 -0.40793201]\n",
      " [-1.         -1.         -1.        ]]\n"
     ]
    }
   ],
   "source": [
    "fom1 = UniformMAEFoM([0], [0], constant=0.0)\n",
    "fom2 = UniformMAEFoM([0], [0], constant=5.0)\n",
    "fom3 = UniformMAEFoM([0], [0], constant=10.0)\n",
    "\n",
    "combined_fom = fom1 + fom2 + fom3\n",
    "\n",
    "print(f'The initial weights are: {combined_fom.weights}')\n",
    "print(f'The initial performance weights are: {combined_fom.performance_weights}')\n",
    "\n",
    "# Computes the FoM computes new performance weights.\n",
    "test_array = np.arange(9).reshape((3, 3))\n",
    "combined_fom.compute_fom(test_array)\n",
    "print(f'\\nThe weights are unchanged: {combined_fom.weights}...')\n",
    "print(f'... but the performance weights have changed: {combined_fom.performance_weights}')\n",
    "\n",
    "# Comute gradient using the performance weights\n",
    "normal_grad = combined_fom.compute_grad(test_array)\n",
    "performance_grad = combined_fom.compute_grad(test_array, apply_performance_weights=True)\n",
    "print(f'\\nGradient using normal weights:\\n{normal_grad}')\n",
    "print(f'Gradient using performance weights:\\n{performance_grad}')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Defining a New Figure of Merit\n",
    "\n",
    "The `FoM` class at its core is defined by two functions:\n",
    "\n",
    "* A function that computes the FoM (`fom_func`)\n",
    "* A function that computes the gradient of the FoM for optimization (`grad_func`)\n",
    "\n",
    "That's it! In fact, to implement your own FoM class, all you need to do is provide these\n",
    "two functions and it will be compatible with all other optimization code.\n",
    "\n",
    "In the following examples, we use the following FoM:\n",
    "$$FoM(x) = x^2 \\qquad FoM'(x) = 2 x$$"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 34,
   "metadata": {},
   "outputs": [],
   "source": [
    "def fom_func(x):\n",
    "    return x ** 2\n",
    "\n",
    "def gradient_func(x):\n",
    "    return 2 * x\n",
    "\n",
    "# Here we create our FoM using the FoM class directly\n",
    "fom = FoM('TE+TM', [], [], [], [], fom_func, gradient_func, [0], [])\n",
    "assert fom.compute_fom(5) == 25\n",
    "assert fom.compute_grad(5) == 10"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 35,
   "metadata": {},
   "outputs": [],
   "source": [
    "# We can instead define our own FoM subclass\n",
    "class SquareFoM(FoM):\n",
    "\n",
    "    # We don't need the majority of the arguments for a normal FoM since we dont need a simulation\n",
    "    def __init__(\n",
    "            self,\n",
    "            pos_max_freqs,\n",
    "            neg_min_freqs,\n",
    "            spectral_weights=np.array(1),\n",
    "    ) -> None:\n",
    "        # For all of the unnecessary arguments we use arbitrary values\n",
    "        super().__init__(\n",
    "            'TE+TM',\n",
    "            [],\n",
    "            [],\n",
    "            [],\n",
    "            [],\n",
    "            self._square_fom,\n",
    "            self._square_gradient,\n",
    "            pos_max_freqs,\n",
    "            neg_min_freqs,\n",
    "            spectral_weights,\n",
    "        )\n",
    "    \n",
    "    def _square_fom(self, x):\n",
    "        return x ** 2\n",
    "    \n",
    "    def _square_gradient(self, x):\n",
    "        return 2 * x\n",
    "\n",
    "fom = SquareFoM([1], [])\n",
    "assert fom.compute_fom(5) == 25\n",
    "assert fom.compute_grad(5) == 10"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.5"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}