# Optimizing a Function that is Permutation Invariant

In this example, we explore BayBE's capabilities for handling optimization problems with symmetry via automatic data augmentation and/or constraints.

## Imports

```python
import os
```

```python
import numpy as np
import pandas as pd
import seaborn as sns
from matplotlib import pyplot as plt
from matplotlib.ticker import MaxNLocator
```

```python
from baybe import Campaign
from baybe.constraints import DiscretePermutationInvarianceConstraint
from baybe.parameters import NumericalDiscreteParameter
from baybe.recommenders import (
    BotorchRecommender,
    TwoPhaseMetaRecommender,
)
from baybe.searchspace import SearchSpace
from baybe.simulation import simulate_scenarios
from baybe.surrogates import NGBoostSurrogate
from baybe.targets import NumericalTarget
from baybe.utils.random import set_random_seed
```

## Settings

```python
set_random_seed(1337)
SMOKE_TEST = "SMOKE_TEST" in os.environ

N_MC_ITERATIONS = 2 if SMOKE_TEST else 100
N_DOE_ITERATIONS = 2 if SMOKE_TEST else 50
```

## The Scenario

We will explore a two-dimensional function that is permutation-invariant, i.e. $f(x,y) = f(y,x)$. The function was crafted to exhibit no additional mirror symmetry (which can also give rise to permutation invariance) and to have multiple minima. In practice, permutation invariance can arise e.g. for [mixtures when modeled with a slot-based approach](/examples/Mixtures/slot_based). BayBE supports other kinds of symmetries as well (not part of this example).

There are several ways of handling such symmetries. The simplest one is to augment your data: in the case of permutation invariance, augmentation means that for each measurement $(x, y)$ you also add a measurement with swapped values, $(y, x)$. This has the advantage of being fully model-agnostic, but it can come at the cost of increased training time due to the larger effective number of training points. Other ways of treating symmetry, such as using special kernels for a GP, are not discussed in this example.

```python
LBOUND = -2.0
UBOUND = 2.0
```

```python
def lookup(df: pd.DataFrame, a=1.0, b=1.0, c=1.0, d=1.0, phi=0.5) -> pd.DataFrame:
    """A lookup modeling a permutation-invariant 2D function with multiple minima."""
    x = df["x"].values
    y = df["y"].values
    result = (
        (x - y) ** 2
        + a * (x**3 + y**3)
        + b * ((x**2 - 1) ** 2 + (y**2 - 1) ** 2)
        + c * np.sin(3 * (x + y)) ** 2
        # The absolute value keeps this term invariant under swapping x and y,
        # which a plain phase-shifted sine of (x - y) would not be.
        + d * np.sin(3 * np.abs(x - y) + phi) ** 2
    )
    df_z = pd.DataFrame({"f": result}, index=df.index)
    return df_z
```

```python
# Grid and dataframe for plotting
x = np.linspace(LBOUND, UBOUND, 25)
y = np.linspace(LBOUND, UBOUND, 25)
xx, yy = np.meshgrid(x, y)
df_plot = lookup(pd.DataFrame({"x": xx.ravel(), "y": yy.ravel()}))
zz = df_plot["f"].values.reshape(xx.shape)
line_vals = np.linspace(LBOUND, UBOUND, 2)
```

```python
# Plot the contour and diagonal
fig, axs = plt.subplots(1, 2, figsize=(15, 6))
contour = axs[0].contourf(xx, yy, zz, levels=50, cmap="viridis")
fig.colorbar(contour, ax=axs[0])
axs[0].plot(line_vals, line_vals, "r--", alpha=0.5, linewidth=2)
axs[0].set_title("Ground Truth: $f(x, y)$ = $f(y, x)$ (Permutation Invariant)")
axs[0].set_xlabel("x")
axs[0].set_ylabel("y");
```

The first subplot shows the function we want to minimize. The dashed red line illustrates the permutation invariance, which resembles a mirror symmetry, only along the diagonal rather than one of the parameter axes. We can also see several local minima.
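Since the entire example rests on this invariance, we can quickly verify it numerically with the lookup (a small sanity-check sketch, not part of the original workflow):

```python
# Evaluating the lookup at random points (x, y) and at the swapped
# points (y, x) must yield identical function values.
rng = np.random.default_rng(0)
pts = pd.DataFrame(rng.uniform(LBOUND, UBOUND, size=(5, 2)), columns=["x", "y"])
swapped = pts.rename(columns={"x": "y", "y": "x"})
assert np.allclose(lookup(pts)["f"], lookup(swapped)["f"])
```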
Such a situation can be challenging for optimization algorithms if no information about the invariance is taken into account. For instance, if no {class}`~baybe.constraints.discrete.DiscretePermutationInvarianceConstraint` were used at all, BayBE would search for the optima across the entire 2D space. But it is clear that the search can be restricted to the lower (or, equivalently, the upper) triangle of the searchspace. This is exactly what the {class}`~baybe.constraints.discrete.DiscretePermutationInvarianceConstraint` does: it removes entries that are "duplicated" in the sense of already being represented by another invariant point.

If the surrogate is additionally configured with `symmetries` that use `use_data_augmentation=True`, the model will be fit with an extended set of points, including augmented ones. As a user, you thus don't have to generate permutations and add them manually. Depending on the surrogate model, this can have different impacts. We can expect a strong effect for tree-based models because their splits are always parallel to the parameter axes: without augmented measurements, it is easy to fall into suboptimal splits and overfit. We illustrate this by using the {class}`~baybe.surrogates.ngboost.NGBoostSurrogate`.

## The Optimization Problem

```python
p1 = NumericalDiscreteParameter("x", np.linspace(LBOUND, UBOUND, 51))
p2 = NumericalDiscreteParameter("y", np.linspace(LBOUND, UBOUND, 51))
objective = NumericalTarget("f", minimize=True).to_objective()
```

We set up a constrained and an unconstrained searchspace to demonstrate the impact of the constraint on optimization performance.

```python
constraint = DiscretePermutationInvarianceConstraint(["x", "y"])
searchspace_plain = SearchSpace.from_product([p1, p2])
searchspace_constrained = SearchSpace.from_product([p1, p2], [constraint])
```

```python
print("Number of Points in the Searchspace")
print(f"{'Without Constraint:':<35} {len(searchspace_plain.discrete.exp_rep)}")
print(f"{'With Constraint:':<35} {len(searchspace_constrained.discrete.exp_rep)}")
```

```
Number of Points in the Searchspace
Without Constraint:                 2601
With Constraint:                    1275
```

We can see that the unconstrained searchspace contains roughly twice as many points as the constrained one. This is the effect of the utilized {class}`~baybe.constraints.discrete.DiscretePermutationInvarianceConstraint`, which filters out entries that are degenerate due to the permutation symmetry. As a result, the optimization will only be performed within the lower triangle shown in the first subplot.

In addition to filtering the searchspace, BayBE can automatically perform the data augmentation discussed earlier if configured to do so. Specifically, surrogate models have the `Surrogate.symmetries` attribute. If any of these symmetries has `use_data_augmentation=True` (enabled by default), BayBE will automatically augment measurements internally before performing the model fit.
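To make the idea of this augmentation concrete, here is a minimal sketch of what the permuted duplicates look like when generated by hand with pandas (purely illustrative; this is not BayBE's internal code, and you don't need to do this yourself):

```python
# Illustrative only: manually augmenting two measurements with their
# permuted counterparts. BayBE performs an equivalent expansion
# internally before the surrogate fit.
measurements = pd.DataFrame({"x": [0.5, -1.0], "y": [1.5, 0.0], "f": [2.1, 0.7]})
swapped = measurements.rename(columns={"x": "y", "y": "x"})
augmented = pd.concat([measurements, swapped], ignore_index=True).drop_duplicates()
print(augmented)  # four rows: each (x, y) plus its (y, x) counterpart
```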
To construct symmetries quickly, we use the `to_symmetry` method of the constraint:

```python
symmetry = constraint.to_symmetry(use_data_augmentation=True)

recommender_plain = TwoPhaseMetaRecommender(
    recommender=BotorchRecommender(surrogate_model=NGBoostSurrogate())
)
recommender_symmetric = TwoPhaseMetaRecommender(
    recommender=BotorchRecommender(
        surrogate_model=NGBoostSurrogate(symmetries=[symmetry])
    )
)
```

The combination of constraint and augmentation settings results in four different campaigns:

```python
campaign_plain = Campaign(searchspace_plain, objective, recommender_plain)
campaign_c = Campaign(searchspace_constrained, objective, recommender_plain)
campaign_s = Campaign(searchspace_plain, objective, recommender_symmetric)
campaign_cs = Campaign(searchspace_constrained, objective, recommender_symmetric)
```

## Simulating the Optimization Loop

```python
scenarios = {
    "Unconstrained, Unsymmetric": campaign_plain,
    "Constrained, Unsymmetric": campaign_c,
    "Unconstrained, Symmetric": campaign_s,
    "Constrained, Symmetric": campaign_cs,
}
```

```python
results = simulate_scenarios(
    scenarios,
    lookup,
    n_doe_iterations=N_DOE_ITERATIONS,
    n_mc_iterations=N_MC_ITERATIONS,
).rename(
    columns={
        "f_CumBest": "$f(x,y)$ (cumulative best)",
        "Num_Experiments": "# Experiments",
    }
)
```

## Results

Let us visualize the optimization process in the second subplot:

```python
sns.lineplot(
    data=results,
    x="# Experiments",
    y="$f(x,y)$ (cumulative best)",
    hue="Scenario",
    marker="o",
    ax=axs[1],
)
axs[1].xaxis.set_major_locator(MaxNLocator(integer=True))
axs[1].set_ylim(axs[1].get_ylim()[0], 3)
axs[1].set_title("Minimization Performance")
plt.tight_layout()
plt.show()
```

We find that the campaigns utilizing the permutation invariance constraint perform better than the ones without, which can be attributed to the reduced number of searchspace points they operate on. However, this effect is rather minor compared to that of the symmetry-based data augmentation: whether augmentation is used or not has a strong impact, exactly the effect we expected for a tree-based surrogate model. Indeed, the campaign with the constraint but without augmentation is barely better than the campaign not utilizing the constraint at all, whereas the data-augmented campaigns perform clearly better. The best result is achieved by using both the constraint and data augmentation.

```{image} permutation.svg
:align: center
```
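As a numeric complement to the plot, one could also summarize the final outcome per scenario, e.g. by averaging the cumulative best value after the last DOE iteration over all Monte Carlo runs (a small post-hoc sketch using plain pandas on the `results` frame):

```python
# Mean cumulative best after the final iteration, per scenario.
last = results["# Experiments"] == results["# Experiments"].max()
summary = results[last].groupby("Scenario")["$f(x,y)$ (cumulative best)"].mean()
print(summary.sort_values())
```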