Train a surrogate
This guide will cover how to link models together and train a surrogate in amisc.
Define a multidisciplinary system
The primary object for surrogate construction is the System. A System is constructed by passing all the component models:
from amisc import System

def first_model(x1, x2):
    y1 = x1 * x2
    return y1

def second_model(y1, x3):
    y2 = y1 ** 2 + x3
    return y2

system = System(first_model, second_model)
More generally, you may pass the Component wrapper objects themselves with extra configurations to the system:
from amisc import Component

system = System(Component(first_model, data_fidelity=(2, 2), ...),
                Component(second_model, data_fidelity=(3, 2), ...))
The System may also accept only a single component model as a limiting case. A multidisciplinary (MD) system is compactly summarized by a directed graph data structure, with the nodes being the component models and the edges being the coupling variables passing between the components. You may view the system graph using networkx.
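To make the graph structure concrete, here is a minimal sketch in plain Python that uses neither amisc nor networkx. It infers each model's inputs from its function signature and draws an edge wherever one model's output is another model's input. The `build_graph` helper and the hand-written `outputs` table are hypothetical and only for illustration; they are not part of the amisc API.

```python
import inspect

def first_model(x1, x2):
    y1 = x1 * x2
    return y1

def second_model(y1, x3):
    y2 = y1 ** 2 + x3
    return y2

# Assumption for this sketch: output names are declared by hand.
outputs = {"first_model": ["y1"], "second_model": ["y2"]}

def build_graph(models, outputs):
    """Return directed edges (upstream, downstream, coupling_variable)."""
    # Input names are read from each function signature.
    inputs = {m.__name__: list(inspect.signature(m).parameters) for m in models}
    edges = []
    for upstream, out_vars in outputs.items():
        for downstream, in_vars in inputs.items():
            if upstream == downstream:
                continue  # feedback of a model onto itself is ignored here
            for var in out_vars:
                if var in in_vars:
                    edges.append((upstream, downstream, var))
    return edges

edges = build_graph([first_model, second_model], outputs)
print(edges)
```

Here the only coupling variable is `y1`, so the graph has two nodes and a single edge from `first_model` to `second_model`.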
If you want to save a variety of surrogate build products and logs, set the root_dir attribute.
This will create a new amisc_{timestamp} save directory with the current timestamp under the specified directory, where all build products and save files will be written. The structure of the amisc root directory is summarized below:
📁 amisc_2024-12-10T11.00.00/
├── 📁 components/ # folder for saving model outputs
│ ├── 📁 comp1/ # outputs for 'comp1'
│ └── 📁 comp2/ # etc.
├── 📁 surrogates/ # surrogate save files
│ ├── 📁 system_iter0/
│ └── 📁 system_iter1/
├── 📄 amisc_2024-12-10T11.00.00.log # log file
└── 📄 plots.pdf # plots generated during training
Partial surrogates for an MD system
By default, the System will try to build a surrogate for each component model in the system. If you don't want a surrogate to be built for a particular component model (e.g. if it's cheap to evaluate), then leave all fidelity options of the Component empty. This indicates that there is no way to "refine" your model, and so the System will skip building a surrogate for the component. You can check the Component.has_surrogate property to verify. During surrogate prediction, the underlying model function will be called instead for any components that do not have a surrogate.
Train a surrogate
Surrogate training is handled by System.fit. From a high level, surrogate training proceeds by taking a series of adaptive refinement steps until an end criterion is reached. There are three criteria for terminating the adaptive training loop:
- Maximum iteration - train for a set number of iterations,
- Runtime - train for at least a set length of time, then terminate at the end of the current iteration, or
- Tolerance - when relative improvement between iterations is below a tolerance level.
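The three stopping criteria can be sketched as a plain Python loop. This is only an illustration of the control flow described above, not the code inside System.fit. The `adaptive_loop` function and the `refine_step` callback are hypothetical; the keyword names mirror the `max_iter`, `runtime_hr`, and `max_tol` arguments used in the example further down.

```python
import time

def adaptive_loop(refine_step, max_iter=100, runtime_hr=1.0, max_tol=1e-3):
    """Run refinement steps until one of three stopping criteria is met.

    refine_step(i) is a stand-in for one adaptive refinement iteration and
    must return the relative improvement achieved by that iteration.
    """
    start = time.time()
    for i in range(max_iter):
        rel_improvement = refine_step(i)
        if rel_improvement < max_tol:
            return i + 1, "tolerance"
        # Runtime is only checked once the current iteration has finished.
        if (time.time() - start) / 3600 >= runtime_hr:
            return i + 1, "runtime"
    return max_iter, "max_iter"

# Toy refinement step: the improvement halves at every iteration.
n_iter, reason = adaptive_loop(lambda i: 0.5 ** i, max_iter=50)
print(n_iter, reason)
```

Whichever criterion is met first ends the loop, so in practice the three settings act as upper bounds on cost rather than as targets.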
For expensive models, it is highly recommended to parallelize model evaluations by passing an instance of a concurrent.futures.Executor. At each sequential iteration, the parallel executor will manage evaluating the models on all new training data in a parallel loop.
It is also highly recommended to pass an independent test set to evaluate the surrogate's generalization on unseen data. A test set is a tuple of two Datasets: one dataset for model inputs and one dataset for the corresponding model outputs. Test sets do not guide any aspect of the training -- they are just used as a metric for monitoring performance during training.
Coupling variable bounds
Coupling variables are the inputs of any component model which are computed by and passed as outputs of an upstream component model. Since coupling variables are computed by a model, it may be difficult to know their domains a priori. When passing a test set to System.fit(), you may also set estimate_bounds=True to estimate all coupling variable bounds from the test set. Otherwise, you must manually set the coupling variable domains with a best guess for their expected ranges.
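As a rough picture of what estimating bounds from a test set could look like, the sketch below takes the minimum and maximum of each coupling variable over the test outputs. The `estimate_bounds` helper is hypothetical, and the way amisc actually derives the bounds (for example, whether it pads the range) is not specified here.

```python
def estimate_bounds(ytest, coupling_vars):
    """Return (min, max) over the test outputs for each coupling variable."""
    return {var: (min(ytest[var]), max(ytest[var])) for var in coupling_vars}

# Hypothetical test-set outputs, keyed by variable name.
ytest = {"y1": [0.2, 1.5, 0.9], "y2": [3.0, 4.1, 2.2]}

# Only y1 is a coupling variable in the two-model system above.
bounds = estimate_bounds(ytest, ["y1"])
print(bounds)
```

Bounds obtained this way are only as good as the coverage of the test set, which is one reason a manual best guess is still required when no test set is available.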
We leave the details of the adaptive refinement to the AMISC journal paper, but you can view the logs and error plots during training to get an idea. Generally, at each iteration, the System loops over all candidate search directions for each component model, evaluates an "error indicator" metric that indicates potential improvement, and selects the most promising direction for more refinement. Once a direction is selected for refinement, new training points are sampled, the model is computed and stored, and the surrogate approximation is updated.
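The greedy selection described above can be sketched in a few lines. This is a toy under stated assumptions: the direction names and indicator values are made up, and multiplying the chosen indicator by `decay` is only a stand-in for sampling new points, running the model, and updating the surrogate.

```python
def refine(indicators, n_steps, decay=0.5):
    """Greedily pick the candidate with the largest error indicator each step."""
    indicators = dict(indicators)  # work on a copy
    history = []
    for _ in range(n_steps):
        best = max(indicators, key=indicators.get)
        history.append(best)
        # Stand-in for "sample, evaluate the model, update the surrogate":
        # refining a direction reduces its remaining potential improvement.
        indicators[best] *= decay
    return history

# Hypothetical (component, direction) candidates and their error indicators.
candidates = {
    ("comp1", "more_training_data"): 0.02,
    ("comp1", "higher_model_fidelity"): 0.15,
    ("comp2", "more_training_data"): 0.07,
}
history = refine(candidates, n_steps=3)
print(history)
```

The first two steps refine the same direction of `comp1`, after which `comp2` becomes the most promising candidate.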
Example
from concurrent.futures import ProcessPoolExecutor

with ProcessPoolExecutor(max_workers=8) as executor:
    system.fit(max_iter=100,             # max number of refinement steps
               runtime_hr=3,             # run for at least 3 hrs and stop at end of current iteration
               max_tol=1e-3,             # or terminate once relative improvement falls below this threshold
               save_interval=5,          # save the surrogate to file every 5 iterations
               plot_interval=1,          # plot error indicators every iteration
               test_set=(xtest, ytest),  # test set of unseen inputs/outputs
               estimate_bounds=True,     # estimate coupling variable bounds from ytest
               executor=executor)        # parallelize model evaluations
Predict with a surrogate
Surrogate predictions are obtained using System.predict. The surrogate expects to be called with a Dataset of model inputs, which is a dictionary with variable names as keys and corresponding numeric values. The values for each input may be arrays, in which case the surrogate will be computed over all inputs in the array. You may use System.sample_inputs to obtain a dataset of random input samples.
Important
The input Dataset must contain normalized inputs, and compressed inputs for field quantities, before being passed to System.predict. This is because the surrogate was trained in the normalized space for Variables with the norm attribute, and in the latent space for field quantity Variables with the compression attribute. Likewise, return values from predict will be normalized and compressed outputs. See the dataset conversion section for more information.
You may also call System.predict(..., use_model='best') as an alias for calling the true component models instead of the surrogate (the inputs should still be normalized when passed in though -- they will get denormalized as needed under the hood).
Finally, there are two "modes" for evaluating the surrogate: training mode and evaluation (or "testing") mode.
In training mode, only the active index sets are used in the MISC combination technique approximation (see theory for details). Training mode uses only a subset of all available training data, and so its accuracy is generally worse than that of evaluation mode.
Evaluate surrogate performance
The System object provides three methods for evaluating the surrogate performance:
- test_set_performance - computes the relative error between the surrogate and the true model on a test set,
- plot_slice - plots 1d slices of surrogate outputs over the inputs, and optionally compares to the true model,
- plot_allocation - plots a bar chart that shows how computational resources were allocated during training.
Example
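As an illustration of the kind of metric a test-set check reports, the sketch below computes a relative L2 error between surrogate and model outputs in plain Python. The exact error definition used by test_set_performance is an assumption here, and the `relative_l2_error` helper and the sample values are hypothetical.

```python
import math

def relative_l2_error(y_surrogate, y_model):
    """Relative L2 error: ||y_surrogate - y_model|| / ||y_model||."""
    diff = math.sqrt(sum((s - m) ** 2 for s, m in zip(y_surrogate, y_model)))
    ref = math.sqrt(sum(m ** 2 for m in y_model))
    return diff / ref

# Hypothetical outputs on three unseen test inputs.
y_model = [1.0, 2.0, 2.0]      # true model outputs
y_surrogate = [1.0, 2.0, 2.3]  # surrogate predictions

error = relative_l2_error(y_surrogate, y_model)
print(f"relative L2 error: {error:.3f}")
```

A value near zero means the surrogate reproduces the model well on data it was not trained on.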
Saving to file
The System object provides two methods for saving and loading the surrogate from file.
By default, these methods use the YamlLoader class to read and write the System surrogate object from YAML files. If the System.root_dir property has been set, then save files will default to the root_dir/surrogates directory. If a save file is located within an amisc_{timestamp} directory, then the root_dir property will be set when loading from file.
YAML files
YAML files are a plain-text format, which allows easy inspection of the surrogate data saved in the file. You can also edit the surrogate properties directly in the file before loading back into memory. The save files closely mirror the format of configuration files and can be used as a template for future configurations.
Convert datasets for model or surrogate usage
Datasets for passing input/output arrays have two formats:
- Model dataset: all values in the dataset are not normalized, and field quantities are in their full high-dimensional form (i.e. not compressed). This is how the model wrapper functions should expect their inputs to be formatted.
- Surrogate dataset: all values in the dataset are normalized, and field quantities are split into r arrays with the special LATENT ID string, enumerated from 0 to r-1, where r is the rank of the compressed latent space for the field quantity.
By default, System.predict expects inputs in a normalized form for surrogate evaluation (but this may be toggled via the normalized_inputs flag). The System.sample_inputs method will also return normalized/compressed inputs by default.
To convert between dataset formats (i.e. for comparing surrogate outputs to model outputs or vice versa), you may use the to_model_dataset and to_surrogate_dataset utilities. These methods will use the Variable objects to perform the appropriate normalizations and compression/reconstruction during conversion.
Example
import numpy as np
from amisc import Variable, to_model_dataset, to_surrogate_dataset

x = Variable(norm='log10', domain=(10, 100))  # a scalar
f = Variable(compression=...)                 # a field quantity

model_dataset = {'x': 100, 'f': np.array([1, 2, 3, ...])}

surr_dataset, surr_vars = to_surrogate_dataset(model_dataset, [x, f])  # also returns the names of the latent variables
model_dataset, coords = to_model_dataset(surr_dataset, [x, f])         # also returns grid coordinates for field quantities
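To give a feel for what normalization does to the scalar x above, here is a plain-Python sketch of a log10 transform and its inverse. It assumes that norm='log10' amounts to applying log10 to the raw value with no extra scaling or offset; the helper names are hypothetical and the sketch does not call amisc.

```python
import math

def normalize_log10(value):
    """Map a raw model value to the assumed log10-normalized space."""
    return math.log10(value)

def denormalize_log10(value):
    """Inverse transform: map a normalized value back to the model space."""
    return 10.0 ** value

domain = (10, 100)                                       # raw domain of x
norm_domain = tuple(normalize_log10(v) for v in domain)  # domain seen by the surrogate

x_norm = normalize_log10(100)        # value stored in a surrogate dataset
x_back = denormalize_log10(x_norm)   # value passed to the model wrapper
print(norm_domain, x_norm, x_back)
```

Under this assumption, the raw domain (10, 100) corresponds to the much narrower normalized range from 1 to 2.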