amisc.component

A Component is an amisc wrapper around a single discipline model. It manages surrogate construction and a hierarchy of modeling fidelities.

Multi-indices in the MISC approximation

A multi-index is a tuple of natural numbers, each specifying a level of fidelity. You will frequently see two multi-indices: alpha and beta. The alpha (or \(\alpha\)) indices specify physical model fidelity and are passed to the model as an additional argument (e.g. discretization level, time step size, etc.). The beta (or \(\beta\)) indices specify surrogate refinement level, typically indicating the amount of training data used or the complexity of the surrogate model. We divide \(\beta\) into data_fidelity and surrogate_fidelity for specifying training data and surrogate model complexity, respectively.
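
For example (an illustrative sketch, not taken from the library), a model with two physical fidelity knobs and one training-data knob might see indices like:

alpha = (2, 1)  # hypothetical: mesh refinement level 2, time-step refinement level 1
beta = (3,)     # hypothetical: training-data refinement level 3 (data_fidelity only)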

Includes:

  • ModelKwargs — a dataclass for storing model keyword arguments
  • StringKwargs — a dataclass for storing model keyword arguments as a string
  • IndexSet — a dataclass that maintains a set of multi-indices
  • MiscTree — a dataclass that maintains MISC data in a dict tree, indexed by alpha and beta
  • Component — a class that manages a single discipline model and its surrogate hierarchy

Component(model, *args, inputs=None, outputs=None, name=None, **kwargs)

Bases: BaseModel, Serializable

A Component wrapper around a single discipline model. It manages MISC surrogate construction and a hierarchy of modeling fidelities.

A Component can be constructed by specifying a model, input and output variables, and additional configurations such as the maximum fidelity levels, the interpolator type, and the training data type. If model_fidelity, data_fidelity, and surrogate_fidelity are all left empty, then the Component will not use a surrogate model, instead calling the underlying model directly. The Component can be serialized to a YAML file and deserialized back into a Python object.

A simple Component

from amisc import Component, Variable

x = Variable(domain=(0, 1))                      # input variable (name 'x' inferred from the assignment)
y = Variable()                                   # output variable
model = lambda x: {'y': x['x']**2}               # maps an input dict to an output dict
comp = Component(model=model, inputs=[x], outputs=[y])
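
The same component can be round-tripped through a plain dict using the serialize and deserialize methods documented below (a minimal sketch):

config = comp.serialize()              # a dict of standard Python types
comp2 = Component.deserialize(config)  # reconstruct an equivalent Component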

Each fidelity index in \(\alpha\) increases in refinement from \(0\) up to the corresponding maximum in model_fidelity. Each fidelity index in \(\beta\) increases from \(0\) up to the corresponding maximum in the concatenation of (data_fidelity, surrogate_fidelity). From the Component's perspective, the concatenation of \((\alpha, \beta)\) fully specifies a single fidelity "level". The Component forms an approximation of the model by summing contributions over many such \((\alpha, \beta)\) multi-indices.
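
A multi-fidelity Component

As a hedged sketch (the truncated-Taylor refinement scheme below is purely illustrative), a model can request the model_fidelity keyword to receive its \(\alpha\) index:

import math
import numpy as np
from amisc import Component, Variable

x = Variable(domain=(0, 1))
y = Variable()

def model(inputs, model_fidelity=(0,)):
    # alpha=(k,) keeps k+1 Taylor terms of e^x (hypothetical refinement scheme)
    n_terms = model_fidelity[0] + 1
    xi = np.atleast_1d(inputs['x'])
    return {'y': sum(xi**k / math.factorial(k) for k in range(n_terms))}

comp = Component(model, inputs=[x], outputs=[y],
                 model_fidelity=(3,),  # alpha[0] ranges over 0..3
                 data_fidelity=(2,))   # beta[0] ranges over 0..2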

ATTRIBUTE DESCRIPTION
name

the name of the Component

TYPE: Optional[str]

model

the model or function that is to be approximated, callable as y = f(x)

TYPE: str | Callable[[dict | Dataset, ...], dict | Dataset]

inputs

the input variables to the model

TYPE: _VariableLike

outputs

the output variables from the model

TYPE: _VariableLike

model_kwargs

extra keyword arguments to pass to the model

TYPE: str | dict | ModelKwargs

model_fidelity

the maximum level of refinement for each fidelity index in \(\alpha\) for model fidelity

TYPE: str | tuple

data_fidelity

the maximum level of refinement for each fidelity index in \(\beta\) for training data

TYPE: str | tuple

surrogate_fidelity

the maximum level of refinement for each fidelity index in \(\beta\) for the surrogate

TYPE: str | tuple

interpolator

the interpolator to use as the underlying surrogate model

TYPE: Any | Interpolator

vectorized

whether the model supports vectorized input/output (i.e. datasets with arbitrary shape (...,))

TYPE: bool

call_unpacked

whether the model expects unpacked input arguments (i.e. func(x1, x2, ...))

TYPE: Optional[bool]

ret_unpacked

whether the model returns unpacked output arguments (i.e. func() -> (y1, y2, ...))

TYPE: Optional[bool]

active_set

the current active set of multi-indices in the MISC approximation

TYPE: list | set | IndexSet

candidate_set

all neighboring multi-indices that are candidates for inclusion in active_set

TYPE: list | set | IndexSet

misc_states

the interpolator states for each multi-index in the MISC approximation

TYPE: dict | MiscTree

misc_costs

the computational cost associated with each multi-index in the MISC approximation

TYPE: dict | MiscTree

misc_coeff_train

the combination technique coefficients for the active set multi-indices

TYPE: dict | MiscTree

misc_coeff_test

the combination technique coefficients for the active and candidate set multi-indices

TYPE: dict | MiscTree

model_costs

the average cost of a single model evaluation for each model fidelity \(\alpha\)

TYPE: dict

training_data

the training data storage structure for the surrogate model

TYPE: Any | TrainingData

serializers

the custom serializers for the [model_kwargs, interpolator, training_data] Component attributes -- these should be the types of the serializer objects, which will be inferred from the data passed in if not explicitly set

TYPE: Optional[ComponentSerializers]

_logger

the logger for the Component

TYPE: Optional[Logger]

Source code in src/amisc/component.py
def __init__(self, /, model, *args, inputs=None, outputs=None, name=None, **kwargs):
    if name is None:
        name = _inspect_assignment('Component')  # try to assign the name from inspection
    name = name or model.__name__ or "Component_" + "".join(random.choices(string.digits, k=3))

    # Determine how the model expects to be called and gather inputs/outputs
    _ = self._validate_model_signature(model, args, inputs, outputs, kwargs.get('call_unpacked', None),
                                       kwargs.get('ret_unpacked', None))
    model, inputs, outputs, call_unpacked, ret_unpacked = _
    kwargs['call_unpacked'] = call_unpacked
    kwargs['ret_unpacked'] = ret_unpacked

    # Gather all model kwargs (anything else passed in for kwargs is assumed to be a model kwarg)
    model_kwargs = kwargs.get('model_kwargs', {})
    for key in kwargs.keys() - self.model_fields.keys():
        model_kwargs[key] = kwargs.pop(key)
    kwargs['model_kwargs'] = model_kwargs

    # Gather data serializers from type checks (if not passed in as a kwarg)
    serializers = kwargs.get('serializers', {})  # directly passing serializers will override type checks
    for key in ComponentSerializers.__annotations__.keys():
        field = kwargs.get(key, None)
        if isinstance(field, dict):
            field_super = next(filter(lambda x: issubclass(x, Serializable),
                                      typing.get_args(self.model_fields[key].annotation)), None)
            field = field_super.from_dict(field) if field_super is not None else field
        if not serializers.get(key, None):
            serializers[key] = type(field) if isinstance(field, Serializable) else (
                type(self.model_fields[key].default))
    kwargs['serializers'] = serializers

    super().__init__(model=model, inputs=inputs, outputs=outputs, name=name, **kwargs)  # Runs pydantic validation

    # Set internal properties
    assert self.is_downward_closed(self.active_set.union(self.candidate_set))
    self.set_logger()

has_surrogate: bool property

The component has no surrogate model if there are no fidelity indices.

max_alpha: MultiIndex property

The maximum model fidelity multi-index (alias for model_fidelity).

max_beta: MultiIndex property

The maximum surrogate fidelity multi-index is a combination of training and interpolator indices.

activate_index(alpha, beta, model_dir=None, executor=None)

Add a multi-index to the active set and all neighbors to the candidate set.

Warning

The user of this function is responsible for ensuring that the index set maintains downward-closedness. That is, only activate indices that are neighbors of the current active set.

PARAMETER DESCRIPTION
alpha

A multi-index specifying model fidelity

TYPE: MultiIndex

beta

A multi-index specifying surrogate fidelity

TYPE: MultiIndex

model_dir

Directory to save model output files

TYPE: str | Path DEFAULT: None

executor

Executor for parallel execution of model on training data if the model is not vectorized

TYPE: Executor DEFAULT: None

Source code in src/amisc/component.py
def activate_index(self, alpha: MultiIndex, beta: MultiIndex, model_dir: str | Path = None,
                   executor: Executor = None):
    """Add a multi-index to the active set and all neighbors to the candidate set.

    !!! Warning
        The user of this function is responsible for ensuring that the index set maintains downward-closedness.
        That is, only activate indices that are neighbors of the current active set.

    :param alpha: A multi-index specifying model fidelity
    :param beta: A multi-index specifying surrogate fidelity
    :param model_dir: Directory to save model output files
    :param executor: Executor for parallel execution of model on training data if the model is not vectorized
    """
    if (alpha, beta) in self.active_set:
        self.logger.warning(f'Multi-index {(alpha, beta)} is already in the active index set. Ignoring...')
        return
    if (alpha, beta) not in self.candidate_set and (sum(alpha) + sum(beta)) > 0:
        # Can only activate the initial index (0, 0, ... 0) without it being in the candidate set
        self.logger.warning(f'Multi-index {(alpha, beta)} is not a neighbor of the active index set, so it '
                            f'cannot be activated. Please only add multi-indices from the candidate set. '
                            f'Ignoring...')
        return

    # Collect all neighbor candidate indices
    neighbors = self._neighbors(alpha, beta, forward=True)
    indices = list(itertools.chain([(alpha, beta)] if (alpha, beta) not in self.candidate_set else [], neighbors))

    # Refine and collect all new model inputs (i.e. training points) requested by the new candidates
    alpha_list = []    # keep track of model fidelities
    design_list = []   # keep track of training data coordinates/locations/indices
    model_inputs = {}  # concatenate all model inputs
    field_coords = {f'{var}_coords': self.model_kwargs.get(f'{var}_coords', None) for var in self.inputs}
    domains = self.inputs.get_domains()
    weight_fcns = self.inputs.get_pdfs()
    for a, b in indices:
        design_coords, design_pts = self.training_data.refine(a, b[:len(self.data_fidelity)],
                                                              domains, weight_fcns)
        design_pts, fc = to_model_dataset(design_pts, self.inputs, del_latent=True, **field_coords)

        # Remove duplicate (alpha, coords) pairs -- so you don't evaluate the model twice for the same input
        i = 0
        del_idx = []
        for other_design in design_list:
            for other_coord in other_design:
                for j, curr_coord in enumerate(design_coords):
                    if curr_coord == other_coord and a == alpha_list[i] and j not in del_idx:
                        del_idx.append(j)
                i += 1
        design_coords = [design_coords[j] for j in range(len(design_coords)) if j not in del_idx]
        design_pts = {var: np.delete(arr, del_idx, axis=0) for var, arr in design_pts.items()}

        alpha_list.extend([tuple(a)] * len(design_coords))
        design_list.append(design_coords)
        field_coords.update(fc)
        for var in design_pts:
            model_inputs[var] = design_pts[var] if model_inputs.get(var) is None else (
                np.concatenate((model_inputs[var], design_pts[var]), axis=0))

    # Evaluate model at designed training points
    if len(alpha_list) > 0:
        self.logger.info(f"Running {len(alpha_list)} total model evaluations for component "
                         f"'{self.name}' new candidate indices: {indices}...")
        model_outputs = self.call_model(model_inputs, model_fidelity=alpha_list, output_path=model_dir,
                                        executor=executor, **field_coords)
        errors = model_outputs.pop('errors', {})

    # Unpack model outputs and update states
    start_idx = 0
    for i, (a, b) in enumerate(indices):
        num_train_pts = len(design_list[i])
        end_idx = start_idx + num_train_pts  # Ensure loop dim of 1 gets its own axis (might have been squeezed)
        yi_dict = {var: arr[np.newaxis, ...] if len(alpha_list) == 1 and arr.shape[0] != 1 else
                   arr[start_idx:end_idx, ...] for var, arr in model_outputs.items()}

        # Check for errors and store
        err_coords = []
        err_list = []
        for idx in list(errors.keys()):
            if idx < end_idx:
                err_info = errors.pop(idx)
                err_info['index'] = idx - start_idx
                err_coords.append(design_list[i][idx - start_idx])
                err_list.append(err_info)
        if len(err_list) > 0:
            self.logger.warning(f"Model errors occurred while adding candidate ({a}, {b}) for component "
                                f"{self.name}. Leaving NaN values in training data...")
            self.training_data.set_errors(a, b[:len(self.data_fidelity)], err_coords, err_list)

        # Compress field quantities and normalize
        yi_dict, y_vars = to_surrogate_dataset(yi_dict, self.outputs, del_fields=False, **field_coords)

        # Store training data, computational cost, and new interpolator state
        self.training_data.set(a, b[:len(self.data_fidelity)], design_list[i], yi_dict)
        self.training_data.impute_missing_data(a, b[:len(self.data_fidelity)])
        self.misc_costs[a, b] = self.model_costs.get(a, 1.) * num_train_pts
        self.misc_states[a, b] = self.interpolator.refine(b[len(self.data_fidelity):],
                                                          self.training_data.get(a, b[:len(self.data_fidelity)],
                                                                                 y_vars=y_vars, skip_nan=True),
                                                          self.misc_states.get((alpha, beta)),
                                                          domains)
        start_idx = end_idx

    # Move to the active index set
    s = set()
    s.add((alpha, beta))
    self.update_misc_coeff(IndexSet(s), index_set='train')
    if (alpha, beta) in self.candidate_set:
        self.candidate_set.remove((alpha, beta))
    else:
        # Only for initial index which didn't come from the candidate set
        self.update_misc_coeff(IndexSet(s), index_set='test')
    self.active_set.update(s)

    self.update_misc_coeff(neighbors, index_set='test')  # neighbors will only ever pass through here once
    self.candidate_set.update(neighbors)

call_model(inputs, model_fidelity=None, output_path=None, executor=None, **kwds)

Wrapper function for calling the underlying component model.

This function formats the input data, calls the model, and processes the output data. It supports vectorized calls, parallel execution using an executor, or serial execution. These options are checked in that order, with the first available method used. Must set Component.vectorized=True if the model supports input arrays of the form (N,) or even arbitrary shape (...,).
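
For instance, a minimal sketch of a vectorized component (assuming the model is built from numpy broadcasting):

import numpy as np
from amisc import Component, Variable

x = Variable(domain=(0, 1))
y = Variable()
model = lambda x: {'y': np.sin(x['x'])}  # numpy ops broadcast over any input shape
comp = Component(model=model, inputs=[x], outputs=[y], vectorized=True)
out = comp.call_model({'x': np.random.rand(100)})  # a single call; no per-sample loop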

Parallel Execution

The underlying model must be defined in a global module scope if pickle is the serialization method for the provided Executor.

Additional return values

The model can return additional items that are not part of Component.outputs. These items are returned as object arrays in the output dict. Two special return values are model_cost and output_path. Returning model_cost will store the computational cost of a single model evaluation (which is used by amisc adaptive surrogate training). Returning output_path will store the output file name if the model wrote any files to disk.
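
A hedged sketch of a model returning model_cost (timing with wall-clock time here is an assumption for illustration):

import time

def timed_model(inputs, output_path=None):
    t0 = time.time()
    y = inputs['x'] ** 2  # ...the actual computation...
    # 'model_cost' is stored by amisc and informs adaptive surrogate training
    return {'y': y, 'model_cost': time.time() - t0}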

Handling errors

If the underlying component model raises an exception, the error is stored in output_dict['errors'] with the index of the input data that caused the error. The output data for that index is set to np.nan for each output variable.
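
For a non-vectorized model, any per-sample failures can be inspected after the call (a minimal sketch, assuming comp wraps a non-vectorized model):

import numpy as np

out = comp.call_model({'x': np.linspace(0, 1, 10)})
for idx, err in out.get('errors', {}).items():
    print(f"input index {idx} failed:\n{err['error']}")  # 'error' holds the full traceback string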

PARAMETER DESCRIPTION
inputs

The input data for the model, formatted as a dict with a key for each input variable and a corresponding value that is an array of the input data. If specified as a plain list, then the order is assumed the same as Component.inputs.

TYPE: dict | Dataset

model_fidelity

Fidelity indices to tune the model fidelity (model must request this in its keyword arguments).

TYPE: Literal['best', 'worst'] | tuple | list DEFAULT: None

output_path

Directory to save model output files (model must request this in its keyword arguments).

TYPE: str | Path DEFAULT: None

executor

Executor for parallel execution if the model is not vectorized (optional).

TYPE: Executor DEFAULT: None

kwds

Additional keyword arguments to pass to the model (model must request these in its keyword args).

DEFAULT: {}

RETURNS DESCRIPTION
Dataset

The output data from the model, formatted as a dict with a key for each output variable and a corresponding value that is an array of the output data.

Source code in src/amisc/component.py
def call_model(self, inputs: dict | Dataset,
               model_fidelity: Literal['best', 'worst'] | tuple | list = None,
               output_path: str | Path = None,
               executor: Executor = None,
               **kwds) -> Dataset:
    """Wrapper function for calling the underlying component model.

    This function formats the input data, calls the model, and processes the output data.
    It supports vectorized calls, parallel execution using an executor, or serial execution. These options are
    checked in that order, with the first available method used. Must set `Component.vectorized=True` if the
    model supports input arrays of the form `(N,)` or even arbitrary shape `(...,)`.

    !!! Warning "Parallel Execution"
        The underlying model must be defined in a global module scope if `pickle` is the serialization method for
        the provided `Executor`.

    !!! Note "Additional return values"
        The model can return additional items that are not part of `Component.outputs`. These items are returned
        as object arrays in the output `dict`. Two special return values are `model_cost` and `output_path`.
        Returning `model_cost` will store the computational cost of a single model evaluation (which is used by
        `amisc` adaptive surrogate training). Returning `output_path` will store the output file name if the model
        wrote any files to disk.

    !!! Note "Handling errors"
        If the underlying component model raises an exception, the error is stored in `output_dict['errors']` with
        the index of the input data that caused the error. The output data for that index is set to `np.nan`
        for each output variable.

    :param inputs: The input data for the model, formatted as a `dict` with a key for each input variable and
                   a corresponding value that is an array of the input data. If specified as a plain list, then the
                   order is assumed the same as `Component.inputs`.
    :param model_fidelity: Fidelity indices to tune the model fidelity (model must request this
                           in its keyword arguments).
    :param output_path: Directory to save model output files (model must request this in its keyword arguments).
    :param executor: Executor for parallel execution if the model is not vectorized (optional).
    :param kwds: Additional keyword arguments to pass to the model (model must request these in its keyword args).
    :returns: The output data from the model, formatted as a `dict` with a key for each output variable and a
              corresponding value that is an array of the output data.
    """
    # Format inputs to a common loop shape (fail if missing any)
    if len(inputs) == 0:
        return {}  # nothing to do for empty inputs
    if isinstance(inputs, list | np.ndarray):
        inputs = np.atleast_1d(inputs)
        inputs = {var.name: inputs[..., i] for i, var in enumerate(self.inputs)}

    var_shape = {}
    for var in self.inputs:
        s = None
        if (arr := kwds.get(f'{var.name}_coords')) is not None:
            s = arr.shape if len(arr.shape) == 1 else arr.shape[:-1]  # skip the coordinate dim (last axis)
        if var.compression is not None:
            for field in var.compression.fields:
                var_shape[field] = s
        else:
            var_shape[var.name] = s
    inputs, loop_shape = format_inputs(inputs, var_shape=var_shape)

    N = int(np.prod(loop_shape))
    list_alpha = isinstance(model_fidelity, list | np.ndarray)
    alpha_requested = self.model_kwarg_requested('model_fidelity')
    for var in self.inputs:
        if var.compression is not None:
            for field in var.compression.fields:
                if field not in inputs:
                    raise ValueError(f"Missing field '{field}' for input variable '{var}'.")
        elif var.name not in inputs:
            raise ValueError(f"Missing input variable '{var.name}'.")

    # Pass extra requested items to the model kwargs
    kwargs = copy.deepcopy(self.model_kwargs.data)
    if self.model_kwarg_requested('output_path'):
        kwargs['output_path'] = output_path
    if self.model_kwarg_requested('input_vars'):
        kwargs['input_vars'] = self.inputs
    if self.model_kwarg_requested('output_vars'):
        kwargs['output_vars'] = self.outputs
    if alpha_requested:
        if not list_alpha:
            model_fidelity = [model_fidelity] * N
        for i in range(N):
            if model_fidelity[i] == 'best':
                model_fidelity[i] = self.max_alpha
            elif model_fidelity[i] == 'worst':
                model_fidelity[i] = (0,) * len(self.model_fidelity)

    for k, v in kwds.items():
        if self.model_kwarg_requested(k):
            kwargs[k] = v

    # Compute model (vectorized, executor parallel, or serial)
    errors = {}
    if self.vectorized:
        if alpha_requested:
            kwargs['model_fidelity'] = np.atleast_1d(model_fidelity).reshape((N, -1))
        output_dict = self.model(*[inputs[var.name] for var in self.inputs], **kwargs) if self.call_unpacked \
            else self.model(inputs, **kwargs)
        if self.ret_unpacked:
            output_dict = (output_dict,) if not isinstance(output_dict, tuple) else output_dict
            output_dict = {out_var.name: output_dict[i] for i, out_var in enumerate(self.outputs)}
    else:
        if executor is None:  # Serial
            results = deque(maxlen=N)
            for i in range(N):
                try:
                    if alpha_requested:
                        kwargs['model_fidelity'] = model_fidelity[i]
                    ret = self.model(*[{k: v[i] for k, v in inputs.items()}[var.name] for var in self.inputs],
                                     **kwargs) if self.call_unpacked else (
                        self.model({k: v[i] for k, v in inputs.items()}, **kwargs))
                    if self.ret_unpacked:
                        ret = (ret,) if not isinstance(ret, tuple) else ret
                        ret = {out_var.name: ret[i] for i, out_var in enumerate(self.outputs)}
                    results.append(ret)
                except Exception:
                    results.append({'inputs': {k: v[i] for k, v in inputs.items()}, 'index': i,
                                    'model_kwargs': kwargs.copy(), 'error': traceback.format_exc()})
        else:  # Parallel
            results = deque(maxlen=N)
            futures = []
            for i in range(N):
                if alpha_requested:
                    kwargs['model_fidelity'] = model_fidelity[i]
                fs = executor.submit(self.model,
                                     *[{k: v[i] for k, v in inputs.items()}[var.name] for var in self.inputs],
                                     **kwargs) if self.call_unpacked else (
                    executor.submit(self.model, {k: v[i] for k, v in inputs.items()}, **kwargs))
                futures.append(fs)
            wait(futures, timeout=None, return_when=ALL_COMPLETED)

            for i, fs in enumerate(futures):
                try:
                    if alpha_requested:
                        kwargs['model_fidelity'] = model_fidelity[i]
                    ret = fs.result()
                    if self.ret_unpacked:
                        ret = (ret,) if not isinstance(ret, tuple) else ret
                        ret = {out_var.name: ret[i] for i, out_var in enumerate(self.outputs)}
                    results.append(ret)
                except Exception:
                    results.append({'inputs': {k: v[i] for k, v in inputs.items()}, 'index': i,
                                    'model_kwargs': kwargs.copy(), 'error': traceback.format_exc()})

        # Collect parallel/serial results
        output_dict = {}
        for i in range(N):
            res = results.popleft()
            if 'error' in res:
                errors[i] = res
            else:
                for key, val in res.items():
                    # Save numeric outputs
                    numeric_flag = False
                    for var in self.outputs:
                        if var.compression is not None:  # field quantity return values
                            if key in var.compression.fields:
                                if output_dict.get(key) is None:
                                    output_dict.setdefault(key, np.full((N, *np.atleast_1d(val).shape), np.nan))
                                output_dict[key][i, ...] = np.atleast_1d(val)
                                numeric_flag = True
                                break
                        elif key == var:
                            if output_dict.get(key) is None:
                                output_dict.setdefault(key, np.full((N, *np.atleast_1d(val).shape), np.nan))
                            output_dict[key][i, ...] = np.atleast_1d(val)
                            numeric_flag = True
                            break
                    # Otherwise, save other objects
                    if not numeric_flag:
                        if key == 'model_cost':
                            if output_dict.get(key) is None:
                                output_dict.setdefault(key, np.full((N,), np.nan))
                            output_dict[key][i] = val
                        else:
                            if output_dict.get(key) is None:
                                output_dict.setdefault(key, np.full((N,), None, dtype=object))
                            output_dict[key][i] = val

    # Save average model costs for each alpha fidelity
    if model_fidelity is not None and output_dict.get('model_cost') is not None:
        alpha_costs = {}
        for i, cost in enumerate(output_dict['model_cost']):
            alpha_costs.setdefault(MultiIndex(model_fidelity[i]), [])
            alpha_costs[MultiIndex(model_fidelity[i])].append(cost)
        for a, costs in alpha_costs.items():
            self.model_costs.setdefault(a, np.empty(0))
            self.model_costs[a] = np.nanmean(np.hstack((costs, self.model_costs[a])))

    # Reshape loop dimensions to match the original input shape
    output_dict = format_outputs(output_dict, loop_shape)

    for var in self.outputs:
        if var.compression is not None:
            for field in var.compression.fields:
                if field not in output_dict:
                    self.logger.warning(f"Model return missing field '{field}' for output variable '{var}'. "
                                        f"Returning NaNs...")
                    output_dict.setdefault(field, np.full((N,), np.nan))
        elif var.name not in output_dict:
            self.logger.warning(f"Model return missing output variable '{var.name}'. Returning NaNs...")
            output_dict[var.name] = np.full((N,), np.nan)

    # Return the output dictionary and any errors
    if errors:
        output_dict['errors'] = errors
    return output_dict

clear()

Clear the component of all training data, index sets, and MISC states.

Source code in src/amisc/component.py
def clear(self):
    """Clear the component of all training data, index sets, and MISC states."""
    self.active_set.clear()
    self.candidate_set.clear()
    self.misc_states.clear()
    self.misc_costs.clear()
    self.misc_coeff_train.clear()
    self.misc_coeff_test.clear()
    self.model_costs.clear()
    self.training_data.clear()

deserialize(serialized_data, search_paths=None, search_keys=None) classmethod

Return a Component from serialized data. Let pydantic handle field validation and conversion. If any component data has been saved to file and the save file doesn't exist, then the loader will search for the file in the current working directory and any additional search paths provided.

PARAMETER DESCRIPTION
serialized_data

the serialized data to construct the object from

TYPE: dict

search_paths

paths to try and find any save files (i.e. if they moved since they were serialized), will always search in the current working directory by default

TYPE: list[str | Path] DEFAULT: None

search_keys

keys to search for save files in each component (default is all keys in ComponentSerializers, in addition to variable inputs and outputs)

TYPE: list[str] DEFAULT: None

Source code in src/amisc/component.py
@classmethod
def deserialize(cls, serialized_data: dict, search_paths: list[str | Path] = None,
                search_keys: list[str] = None) -> Component:
    """Return a `Component` from `data`. Let pydantic handle field validation and conversion. If any component
    data has been saved to file and the save file doesn't exist, then the loader will search for the file
    in the current working directory and any additional search paths provided.

    :param serialized_data: the serialized data to construct the object from
    :param search_paths: paths to try and find any save files (i.e. if they moved since they were serialized),
                         will always search in the current working directory by default
    :param search_keys: keys to search for save files in each component (default is all keys in
                        [`ComponentSerializers`][amisc.component.ComponentSerializers], in addition to variable
                        inputs and outputs)
    """
    if isinstance(serialized_data, Component):
        return serialized_data
    elif callable(serialized_data):
        # try to construct a component from a raw function (assume data fidelity is (2,) for each inspected input)
        return cls(serialized_data, data_fidelity=(2,) * len(_inspect_function(serialized_data)[0]))

    search_paths = search_paths or []
    search_keys = search_keys or []
    search_keys.extend(ComponentSerializers.__annotations__.keys())
    comp = serialized_data

    for key in search_keys:
        if (filename := comp.get(key, None)) is not None:
            comp[key] = search_for_file(filename, search_paths=search_paths)  # will ret original str if not found

    for key in ['inputs', 'outputs']:
        for var in comp.get(key, []):
            if isinstance(var, dict):
                if (compression := var.get('compression', None)) is not None:
                    var['compression'] = search_for_file(compression, search_paths=search_paths)

    return cls(**comp)

get_cost(alpha, beta)

Return the total cost (wall time in seconds) required to add \((\alpha, \beta)\) to the MISC approximation.

PARAMETER DESCRIPTION
alpha

A multi-index specifying model fidelity

TYPE: MultiIndex

beta

A multi-index specifying surrogate fidelity

TYPE: MultiIndex

RETURNS DESCRIPTION
float

the total cost of adding this multi-index pair to the MISC approximation

Source code in src/amisc/component.py
def get_cost(self, alpha: MultiIndex, beta: MultiIndex) -> float:
    """Return the total cost (wall time s) required to add $(\\alpha, \\beta)$ to the MISC approximation.

    :param alpha: A multi-index specifying model fidelity
    :param beta: A multi-index specifying surrogate fidelity
    :returns: the total cost of adding this multi-index pair to the MISC approximation
    """
    try:
        return self.misc_costs[alpha, beta]
    except Exception:
        return 0.0

get_training_data(alpha='best', beta='best', y_vars=None)

Get all training data for a given multi-index pair (alpha, beta).

PARAMETER DESCRIPTION
alpha

the model fidelity index (defaults to the maximum available model fidelity)

TYPE: Literal['best', 'worst'] | MultiIndex DEFAULT: 'best'

beta

the surrogate fidelity index (defaults to the maximum available surrogate fidelity)

TYPE: Literal['best', 'worst'] | MultiIndex DEFAULT: 'best'

y_vars

the training data to return (defaults to all stored data)

TYPE: list DEFAULT: None

RETURNS DESCRIPTION
tuple[Dataset, Dataset]

(xtrain, ytrain) - the training data for the given multi-indices

Source code in src/amisc/component.py
def get_training_data(self, alpha: Literal['best', 'worst'] | MultiIndex = 'best',
                      beta: Literal['best', 'worst'] | MultiIndex = 'best',
                      y_vars: list = None) -> tuple[Dataset, Dataset]:
    """Get all training data for a given multi-index pair `(alpha, beta)`.

    :param alpha: the model fidelity index (defaults to the maximum available model fidelity)
    :param beta: the surrogate fidelity index (defaults to the maximum available surrogate fidelity)
    :param y_vars: the training data to return (defaults to all stored data)
    :returns: `(xtrain, ytrain)` - the training data for the given multi-indices
    """
    # Find the best alpha
    if alpha == 'best':
        alpha_best = ()
        for a, _ in self.active_set.union(self.candidate_set):
            if sum(a) > sum(alpha_best):
                alpha_best = a
        alpha = alpha_best
    elif alpha == 'worst':
        alpha = (0,) * len(self.max_alpha)

    # Find the best beta for the given alpha
    if beta == 'best':
        beta_best = ()
        for a, b in self.active_set.union(self.candidate_set):
            if a == alpha and sum(b) > sum(beta_best):
                beta_best = b
        beta = beta_best
    elif beta == 'worst':
        beta = (0,) * len(self.max_beta)

    try:
        return self.training_data.get(alpha, beta, y_vars=y_vars, skip_nan=True)
    except Exception as e:
        self.logger.error(f"Error getting training data for alpha={alpha}, beta={beta}.")
        raise e

gradient(inputs, index_set='test', misc_coeff=None, derivative='first', executor=None)

Evaluate the Jacobian or Hessian of the MISC surrogate approximation at new inputs, i.e. the first or second derivatives, respectively.
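
A minimal usage sketch (assuming comp already holds a trained surrogate over a scalar input x):

import numpy as np

jac = comp.gradient({'x': np.random.rand(50)})                        # Jacobian (first derivatives)
hess = comp.gradient({'x': np.random.rand(50)}, derivative='second')  # Hessian; same as comp.hessian(...)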

PARAMETER DESCRIPTION
inputs

dict of input arrays for each variable input

TYPE: dict | Dataset

index_set

the active index set, defaults to self.active_set if 'train' or both self.active_set + self.candidate_set if 'test'

TYPE: Literal['train', 'test'] | IndexSet DEFAULT: 'test'

misc_coeff

the data structure holding the MISC coefficients to use, which defaults to the training or testing coefficients depending on the index_set parameter.

TYPE: MiscTree DEFAULT: None

derivative

whether to compute the first or second derivative (i.e. Jacobian or Hessian)

TYPE: Literal['first', 'second'] DEFAULT: 'first'

executor

executor for looping over MISC coefficients (optional)

TYPE: Executor DEFAULT: None

RETURNS DESCRIPTION
Dataset

a dict of the Jacobian or Hessian of the surrogate approximation for each output variable

Source code in src/amisc/component.py
def gradient(self, inputs: dict | Dataset,
             index_set: Literal['train', 'test'] | IndexSet = 'test',
             misc_coeff: MiscTree = None,
             derivative: Literal['first', 'second'] = 'first',
             executor: Executor = None) -> Dataset:
    """Evaluate the Jacobian or Hessian of the MISC surrogate approximation at new `inputs`, i.e.
    the first or second derivatives, respectively.

    :param inputs: `dict` of input arrays for each variable input
    :param index_set: the active index set, defaults to `self.active_set` if `'train'` or both
                      `self.active_set + self.candidate_set` if `'test'`
    :param misc_coeff: the data structure holding the MISC coefficients to use, which defaults to the
                       training or testing coefficients depending on the `index_set` parameter.
    :param derivative: whether to compute the first or second derivative (i.e. Jacobian or Hessian)
    :param executor: executor for looping over MISC coefficients (optional)
    :returns: a `dict` of the Jacobian or Hessian of the surrogate approximation for each output variable
    """
    if not self.has_surrogate:
        self.logger.warning("No surrogate model available for gradient computation.")
        return None

    index_set, misc_coeff = self._match_index_set(index_set, misc_coeff)
    inputs, loop_shape = format_inputs(inputs)  # {'x': (N,)}
    outputs = {}

    if len(index_set) == 0:
        for var in self.outputs:
            outputs[var] = np.full(loop_shape, np.nan)
        return outputs
    y_vars = self._surrogate_outputs()

    # Combination technique MISC gradient prediction
    results = []
    coeffs = []
    for alpha, beta in index_set:
        comb_coeff = misc_coeff[alpha, beta]
        if np.abs(comb_coeff) > 0:
            coeffs.append(comb_coeff)
            func = self.interpolator.gradient if derivative == 'first' else self.interpolator.hessian
            args = (self.misc_states.get((alpha, beta)),
                    self.training_data.get(alpha, beta[:len(self.data_fidelity)], skip_nan=True, y_vars=y_vars))

            results.append(func(inputs, *args) if executor is None else executor.submit(func, inputs, *args))

    if executor is not None:
        wait(results, timeout=None, return_when=ALL_COMPLETED)
        results = [future.result() for future in results]

    for coeff, interp_pred in zip(coeffs, results):
        for var, arr in interp_pred.items():
            if outputs.get(var) is None:
                outputs[str(var)] = coeff * arr
            else:
                outputs[str(var)] += coeff * arr

    return format_outputs(outputs, loop_shape)

hessian(*args, **kwargs)

Alias for Component.gradient(*args, derivative='second', **kwargs).

Source code in src/amisc/component.py
def hessian(self, *args, **kwargs):
    """Alias for `Component.gradient(*args, derivative='second', **kwargs)`."""
    return self.gradient(*args, derivative='second', **kwargs)

is_downward_closed(indices) staticmethod

Return whether a list of \((\alpha, \beta)\) multi-indices is downward-closed.

MISC approximations require a downward-closed set in order to use the combination-technique formula for the coefficients (as implemented by Component.update_misc_coeff()).

Example

The list [( (0,), (0,) ), ( (1,), (0,) ), ( (1,), (1,) )] is downward-closed. You can visualize this as building a stack of cubes: in order to place a cube, all adjacent cubes must be present (does the logo make sense now?).
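
The same check in code (a minimal sketch):

from amisc.component import Component, IndexSet

closed = IndexSet([((0,), (0,)), ((1,), (0,)), ((1,), (1,))])
Component.is_downward_closed(closed)  # True
gapped = IndexSet([((0,), (0,)), ((1,), (1,))])  # missing ((1,), (0,)) and ((0,), (1,))
Component.is_downward_closed(gapped)  # False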

PARAMETER DESCRIPTION
indices

IndexSet of (alpha, beta) multi-indices

TYPE: IndexSet

RETURNS DESCRIPTION
bool

whether the set of indices is downward-closed

Source code in src/amisc/component.py
@staticmethod
def is_downward_closed(indices: IndexSet) -> bool:
    """Return if a list of $(\\alpha, \\beta)$ multi-indices is downward-closed.

    MISC approximations require a downward-closed set in order to use the combination-technique formula for the
    coefficients (as implemented by `Component.update_misc_coeff()`).

    !!! Example
        The list `[( (0,), (0,) ), ( (1,), (0,) ), ( (1,), (1,) )]` is downward-closed. You can visualize this as
        building a stack of cubes: in order to place a cube, all adjacent cubes must be present (does the logo
        make sense now?).

    :param indices: `IndexSet` of (`alpha`, `beta`) multi-indices
    :returns: whether the set of indices is downward-closed
    """
    # Iterate over every multi-index
    for alpha, beta in indices:
        # Every smaller multi-index must also be included in the indices list
        sub_sets = [np.arange(tuple(alpha + beta)[i] + 1) for i in range(len(alpha) + len(beta))]
        for ele in itertools.product(*sub_sets):
            tup = (MultiIndex(ele[:len(alpha)]), MultiIndex(ele[len(alpha):]))
            if tup not in indices:
                return False
    return True

model_kwarg_requested(kwarg_name)

Return whether the underlying component model requested this kwarg_name. Special kwargs include:

  • output_path — a save directory created by amisc will be passed to the model for saving model output files.
  • model_fidelity — a tuple or list of model fidelity indices will be passed to the model to adjust fidelity.
  • input_vars — a list of Variable objects will be passed to the model for input variable information.
  • output_vars — a list of Variable objects will be passed to the model for output variable information.
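
For example (a minimal sketch):

from amisc import Component, Variable

x = Variable(domain=(0, 1))
y = Variable()

def my_model(inputs, model_fidelity=(0,), output_path=None):
    return {'y': inputs['x'] ** 2}

comp = Component(my_model, inputs=[x], outputs=[y])
comp.model_kwarg_requested('output_path')  # True: a keyword argument with a default
comp.model_kwarg_requested('input_vars')   # False: not in the model's signature
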
PARAMETER DESCRIPTION
kwarg_name

the argument to check for in the underlying component model's function signature kwargs

TYPE: str

RETURNS DESCRIPTION
bool

whether the component model requests this kwarg argument

Source code in src/amisc/component.py
def model_kwarg_requested(self, kwarg_name: str) -> bool:
    """Return whether the underlying component model requested this `kwarg_name`. Special kwargs include:

    - `output_path` — a save directory created by `amisc` will be passed to the model for saving model output files.
    - `model_fidelity` — a tuple or list of model fidelity indices will be passed to the model to adjust fidelity.
    - `input_vars` — a list of `Variable` objects will be passed to the model for input variable information.
    - `output_vars` — a list of `Variable` objects will be passed to the model for output variable information.

    :param kwarg_name: the argument to check for in the underlying component model's function signature kwargs
    :returns: whether the component model requests this `kwarg` argument
    """
    signature = inspect.signature(self.model)
    for param in signature.parameters.values():
        if param.name == kwarg_name and param.default != param.empty:
            return True
    return False

predict(inputs, use_model=None, model_dir=None, index_set='test', misc_coeff=None, incremental=False, executor=None, **kwds)

Evaluate the MISC surrogate approximation at new inputs x.

Using the underlying model

By default this will predict the MISC surrogate approximation; all inputs are assumed to be in a compressed and normalized form. If the component does not have a surrogate (i.e. it is analytical), then the inputs will be converted to model form and the underlying model will be called in place. If you instead want to override the surrogate, passing use_model will call the underlying model directly. In that case, the inputs should be passed in already in model form (i.e. full fields, denormalized).
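
A minimal usage sketch (assuming comp from the earlier examples):

import numpy as np

y_surr = comp.predict({'x': np.random.rand(1000)})                    # MISC surrogate prediction
y_best = comp.predict({'x': np.random.rand(1000)}, use_model='best')  # bypass the surrogate entirely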

PARAMETER DESCRIPTION
inputs

dict of input arrays for each variable input

TYPE: dict | Dataset

use_model

'best'=high-fidelity, 'worst'=low-fidelity, tuple=a specific alpha, None=surrogate (default)

TYPE: Literal['best', 'worst'] | tuple DEFAULT: None

model_dir

directory to save output files if use_model is specified, ignored otherwise

TYPE: str | Path DEFAULT: None

index_set

the active index set, defaults to self.active_set if 'train' or both self.active_set + self.candidate_set if 'test'

TYPE: Literal['train', 'test'] | IndexSet DEFAULT: 'test'

misc_coeff

the data structure holding the MISC coefficients to use, which defaults to the training or testing coefficients depending on the index_set parameter.

TYPE: MiscTree DEFAULT: None

incremental

a special flag to use if the provided index_set is an incremental update to the active index set. A temporary copy of the internal misc_coeff data structure will be updated and used to incorporate the new indices.

TYPE: bool DEFAULT: False

executor

executor for parallel execution if the model is not vectorized (optional), will use the executor for looping over MISC coefficients if evaluating the surrogate rather than the model

TYPE: Executor DEFAULT: None

kwds

additional keyword arguments to pass to the model (if using the underlying model)

DEFAULT: {}

RETURNS DESCRIPTION
Dataset

the surrogate approximation of the model (or the model return itself if use_model)

Source code in src/amisc/component.py
def predict(self, inputs: dict | Dataset,
            use_model: Literal['best', 'worst'] | tuple = None,
            model_dir: str | Path = None,
            index_set: Literal['train', 'test'] | IndexSet = 'test',
            misc_coeff: MiscTree = None,
            incremental: bool = False,
            executor: Executor = None,
            **kwds) -> Dataset:
    """Evaluate the MISC surrogate approximation at new inputs `x`.

    !!! Note "Using the underlying model"
        By default this will predict the MISC surrogate approximation; all inputs are assumed to be in a compressed
        and normalized form. If the component does not have a surrogate (i.e. it is analytical), then the inputs
        will be converted to model form and the underlying model will be called in place. If you instead want to
        override the surrogate, passing `use_model` will call the underlying model directly. In that case, the
        inputs should be passed in already in model form (i.e. full fields, denormalized).

    :param inputs: `dict` of input arrays for each variable input
    :param use_model: 'best'=high-fidelity, 'worst'=low-fidelity, tuple=a specific `alpha`, None=surrogate (default)
    :param model_dir: directory to save output files if `use_model` is specified, ignored otherwise
    :param index_set: the active index set, defaults to `self.active_set` if `'train'` or both
                      `self.active_set + self.candidate_set` if `'test'`
    :param misc_coeff: the data structure holding the MISC coefficients to use, which defaults to the
                       training or testing coefficients depending on the `index_set` parameter.
    :param incremental: a special flag to use if the provided `index_set` is an incremental update to the active
                        index set. A temporary copy of the internal `misc_coeff` data structure will be updated
                        and used to incorporate the new indices.
    :param executor: executor for parallel execution if the model is not vectorized (optional), will use the
                     executor for looping over MISC coefficients if evaluating the surrogate rather than the model
    :param kwds: additional keyword arguments to pass to the model (if using the underlying model)
    :returns: the surrogate approximation of the model (or the model return itself if `use_model`)
    """
    # Use raw model inputs/outputs
    if use_model is not None:
        outputs = self.call_model(inputs, model_fidelity=use_model, output_path=model_dir, executor=executor, **kwds)
        ret = {}
        for var in self.outputs:
            if var in outputs:
                ret[var.name] = outputs[var.name]
            elif var.compression is not None:
                for field in var.compression.fields:
                    ret[field] = outputs[field]
        return ret

    # Convert inputs/outputs to/from model if no surrogate (i.e. analytical models)
    if not self.has_surrogate:
        field_coords = {f'{var}_coords': self.model_kwargs.get(f'{var}_coords', kwds.get(f'{var}_coords', None))
                        for var in self.inputs}
        inputs, field_coords = to_model_dataset(inputs, self.inputs, del_latent=True, **field_coords)
        field_coords.update(kwds)
        outputs = self.call_model(inputs, model_fidelity=use_model or 'best', output_path=model_dir,
                                  executor=executor, **field_coords)
        outputs, surr_vars = to_surrogate_dataset(outputs, self.outputs, del_fields=True, **field_coords)
        return {str(var): outputs[var] for var in surr_vars}

    # Choose the correct index set and misc_coeff data structures
    if incremental:
        misc_coeff = copy.deepcopy(self.misc_coeff_train)
        self.update_misc_coeff(index_set, self.active_set, misc_coeff)
        index_set = self.active_set.union(index_set)
    else:
        index_set, misc_coeff = self._match_index_set(index_set, misc_coeff)

    # Format inputs for surrogate prediction (all scalars at this point, including latent coeffs)
    inputs, loop_shape = format_inputs(inputs)  # {'x': (N,)}
    outputs = {}

    # Handle prediction with empty active set (return nan)
    if len(index_set) == 0:
        for var in self.outputs:
            outputs[var.name] = np.full(loop_shape, np.nan)
        return outputs

    y_vars = self._surrogate_outputs()  # Only request this component's specified outputs (ignore all extras)

    # Combination technique MISC surrogate prediction
    results = []
    coeffs = []
    for alpha, beta in index_set:
        comb_coeff = misc_coeff[alpha, beta]
        if np.abs(comb_coeff) > 0:
            coeffs.append(comb_coeff)
            args = (self.misc_states.get((alpha, beta)),
                    self.training_data.get(alpha, beta[:len(self.data_fidelity)], skip_nan=True, y_vars=y_vars))

            results.append(self.interpolator.predict(inputs, *args) if executor is None else
                           executor.submit(self.interpolator.predict, inputs, *args))

    if executor is not None:
        wait(results, timeout=None, return_when=ALL_COMPLETED)
        results = [future.result() for future in results]

    for coeff, interp_pred in zip(coeffs, results):
        for var, arr in interp_pred.items():
            if outputs.get(var) is None:
                outputs[str(var)] = coeff * arr
            else:
                outputs[str(var)] += coeff * arr

    return format_outputs(outputs, loop_shape)

serialize(keep_yaml_objects=False, serialize_args=None, serialize_kwargs=None)

Convert to a dict with only standard Python types as fields and values.

PARAMETER DESCRIPTION
keep_yaml_objects

whether to keep Variable or other yaml serializable objects instead of also serializing them (default is False)

TYPE: bool DEFAULT: False

serialize_args

additional arguments to pass to the serialize method of each Component attribute; specify as a dict of attribute names to tuple of arguments to pass

TYPE: dict[str, tuple] DEFAULT: None

serialize_kwargs

additional keyword arguments to pass to the serialize method of each Component attribute

TYPE: dict[str, dict] DEFAULT: None

RETURNS DESCRIPTION
dict

a dict representation of the Component object

Source code in src/amisc/component.py
def serialize(self, keep_yaml_objects: bool = False, serialize_args: dict[str, tuple] = None,
              serialize_kwargs: dict[str, dict] = None) -> dict:
    """Convert to a `dict` with only standard Python types as fields and values.

    :param keep_yaml_objects: whether to keep `Variable` or other yaml serializable objects instead of
                              also serializing them (default is False)
    :param serialize_args: additional arguments to pass to the `serialize` method of each `Component` attribute;
                           specify as a `dict` of attribute names to tuple of arguments to pass
    :param serialize_kwargs: additional keyword arguments to pass to the `serialize` method of each
                             `Component` attribute
    :returns: a `dict` representation of the `Component` object
    """
    serialize_args = serialize_args or dict()
    serialize_kwargs = serialize_kwargs or dict()
    d = {}
    for key, value in self.__dict__.items():
        if value is not None and not key.startswith('_'):
            if key == 'serializers':
                # Update the serializers
                serializers = self._validate_serializers({k: type(getattr(self, k)) for k in value.keys()})
                d[key] = {k: (v.obj if keep_yaml_objects else v.serialize()) for k, v in serializers.items()}
            elif key in ['inputs', 'outputs'] and not keep_yaml_objects:
                d[key] = value.serialize(**serialize_kwargs.get(key, {}))
            elif key == 'model' and not keep_yaml_objects:
                d[key] = YamlSerializable(obj=value).serialize()
            elif key in ['data_fidelity', 'surrogate_fidelity', 'model_fidelity']:
                if len(value) > 0:
                    d[key] = str(value)
            elif key in ['active_set', 'candidate_set']:
                if len(value) > 0:
                    d[key] = value.serialize()
            elif key in ['misc_costs', 'misc_coeff_train', 'misc_coeff_test', 'misc_states']:
                if len(value) > 0:
                    d[key] = value.serialize(keep_yaml_objects=keep_yaml_objects)
            elif key in ['model_costs']:
                if len(value) > 0:
                    d[key] = {str(k): float(v) for k, v in value.items()}
            elif key in ComponentSerializers.__annotations__.keys():
                if key in ['training_data', 'interpolator'] and not self.has_surrogate:
                    continue
                else:
                    d[key] = value.serialize(*serialize_args.get(key, ()), **serialize_kwargs.get(key, {}))
            else:
                d[key] = value
    return d

set_logger(log_file=None, stdout=None, logger=None, level=logging.INFO)

Set a new logging.Logger object.

PARAMETER DESCRIPTION
log_file

log to file (if provided)

TYPE: str | Path DEFAULT: None

stdout

whether to connect the logger to console (defaults to whatever is currently set or False)

TYPE: bool DEFAULT: None

logger

the logging object to use (if None, then a new logger is created; this will override the log_file and stdout arguments if set)

TYPE: Logger DEFAULT: None

level

the logging level to set (default is logging.INFO)

TYPE: int DEFAULT: INFO

Source code in src/amisc/component.py
def set_logger(self, log_file: str | Path = None, stdout: bool = None, logger: logging.Logger = None,
               level: int = logging.INFO):
    """Set a new `logging.Logger` object.

    :param log_file: log to file (if provided)
    :param stdout: whether to connect the logger to console (defaults to whatever is currently set or False)
    :param logger: the logging object to use (if None, then a new logger is created; this will override
                   the `log_file` and `stdout` arguments if set)
    :param level: the logging level to set (default is `logging.INFO`)
    """
    if stdout is None:
        stdout = False
        if self._logger is not None:
            for handler in self._logger.handlers:
                if isinstance(handler, logging.StreamHandler):
                    stdout = True
                    break
    self._logger = logger or get_logger(self.name, log_file=log_file, stdout=stdout, level=level)

update_misc_coeff(new_indices, index_set='train', misc_coeff=None)

Update MISC coefficients incrementally resulting from the addition of new indices to an index set.

Incremental updates

This function is used to update the MISC coefficients stored in misc_coeff after adding new indices to the given index_set. If a custom index_set or misc_coeff are provided, the user is responsible for ensuring the data structures are consistent. Since this is an incremental update, this means all existing coefficients for every index in index_set should be precomputed and stored in misc_coeff.
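
Concretely (as implemented in the source below), activating a new index \(k = (\alpha, \beta)\) increments the coefficient of every index \(j\) in the set with \(k - j \in \{0, 1\}^n\) componentwise by \((-1)^{\lVert k - j \rVert_1}\). Accumulated over all activations, this reproduces the standard combination-technique formula \(c_j = \sum_{d \in \{0, 1\}^n,\; j + d \in \mathcal{I}} (-1)^{\lVert d \rVert_1}\) for a downward-closed index set \(\mathcal{I}\).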

PARAMETER DESCRIPTION
new_indices

a set of \((\alpha, \beta)\) tuples that are being added to the index_set

TYPE: IndexSet

index_set

the active index set, defaults to self.active_set if 'train' or both self.active_set + self.candidate_set if 'test'

TYPE: Literal['test', 'train'] | IndexSet DEFAULT: 'train'

misc_coeff

the data structure holding the MISC coefficients to update, which defaults to the training or testing coefficients depending on the index_set parameter. This data structure is modified in place.

TYPE: MiscTree DEFAULT: None

Source code in src/amisc/component.py
def update_misc_coeff(self, new_indices: IndexSet, index_set: Literal['test', 'train'] | IndexSet = 'train',
                      misc_coeff: MiscTree = None):
    """Update MISC coefficients incrementally resulting from the addition of new indices to an index set.

    !!! Warning "Incremental updates"
        This function is used to update the MISC coefficients stored in `misc_coeff` after adding new indices
        to the given `index_set`. If a custom `index_set` or `misc_coeff` are provided, the user is responsible
        for ensuring the data structures are consistent. Since this is an incremental update, this means all
        existing coefficients for every index in `index_set` should be precomputed and stored in `misc_coeff`.

    :param new_indices: a set of $(\\alpha, \\beta)$ tuples that are being added to the `index_set`
    :param index_set: the active index set, defaults to `self.active_set` if `'train'` or both
                      `self.active_set + self.candidate_set` if `'test'`
    :param misc_coeff: the data structure holding the MISC coefficients to update, which defaults to the
                       training or testing coefficients depending on the `index_set` parameter. This data structure
                       is modified in place.
    """
    index_set, misc_coeff = self._match_index_set(index_set, misc_coeff)

    for new_alpha, new_beta in new_indices:
        new_ind = np.array(new_alpha + new_beta)

        # Update all existing/new coefficients if they are a distance of [0, 1] "below" the new index
        # Note that new indices can only be [0, 1] away from themselves -- not any other new indices
        for old_alpha, old_beta in itertools.chain(index_set, [(new_alpha, new_beta)]):
            old_ind = np.array(old_alpha + old_beta)
            diff = new_ind - old_ind
            if np.all(np.isin(diff, [0, 1])):
                if misc_coeff.get((old_alpha, old_beta)) is None:
                    misc_coeff[old_alpha, old_beta] = 0
                misc_coeff[old_alpha, old_beta] += (-1) ** int(np.sum(np.abs(diff)))
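
For reference, the increment applied in the loop above is a sketch of the standard combination-technique form of the MISC coefficient, with \(n\) the total length of the concatenated \((\alpha, \beta)\) index:

\(c_{\alpha, \beta} = \sum_{d \in \{0, 1\}^n} (-1)^{\lVert d \rVert_1} \, \mathbb{1}\left[(\alpha, \beta) + d \in \mathcal{I}\right]\)

Adding a new index \((\alpha, \beta) + d\) to \(\mathcal{I}\) therefore contributes \((-1)^{\lVert d \rVert_1}\) to the coefficient of every index sitting a \(\{0, 1\}\) offset below it, including the new index itself (the \(d = 0\) term).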

update_model(new_model=None, model_kwargs=None, **kwargs)

Update the underlying component model or its kwargs.

Source code in src/amisc/component.py
def update_model(self, new_model: callable = None, model_kwargs: dict = None, **kwargs):
    """Update the underlying component model or its kwargs."""
    if new_model is not None:
        self.model = new_model
    new_kwargs = self.model_kwargs.data
    new_kwargs.update(model_kwargs or {})
    new_kwargs.update(kwargs)
    self.model_kwargs = new_kwargs
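
A brief usage sketch (the model function and keyword names here are hypothetical):

# my_new_model, 'tol', and 'max_iter' are placeholders
comp.update_model(new_model=my_new_model)                    # swap the model, keep existing kwargs
comp.update_model(model_kwargs={'tol': 1e-6}, max_iter=50)   # merge new kwargs into model_kwargs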

ComponentSerializers

Bases: TypedDict

Type hint for the Component class data serializers.

ATTRIBUTE DESCRIPTION
model_kwargs

the model kwarg object class

TYPE: str | type[Serializable] | YamlSerializable

interpolator

the interpolator object class

TYPE: str | type[Serializable] | YamlSerializable

training_data

the training data object class

TYPE: str | type[Serializable] | YamlSerializable

IndexSet(s=())

Bases: set, Serializable

Dataclass that maintains a list of multi-indices. Overrides basic set functionality to ensure elements are formatted correctly as (alpha, beta); that is, as a tuple of alpha and beta, each of which is itself a MultiIndex tuple.

An example index set

\(\mathcal{I} = [(\alpha, \beta)_1, (\alpha, \beta)_2, (\alpha, \beta)_3, ...]\) would be specified as I = [((0, 0), (0, 0, 0)), ((0, 1), (0, 1, 0)), ...].

Source code in src/amisc/component.py
def __init__(self, s=()):
    s = [self._validate_element(ele) for ele in s]
    super().__init__(s)

deserialize(serialized_data) classmethod

Deserialize a list of string-serialized tuples to an IndexSet.

Source code in src/amisc/component.py
@classmethod
def deserialize(cls, serialized_data: list[str]) -> IndexSet:
    """Deserialize a list of tuples to an `IndexSet`."""
    return cls(serialized_data)

serialize()

Return a list of each multi-index in the set serialized to a string.

Source code in src/amisc/component.py
def serialize(self) -> list[str]:
    """Return a list of each multi-index in the set serialized to a string."""
    return [str(ele) for ele in self]
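
A hedged round-trip sketch (it assumes element validation accepts the string form produced by serialize, as the deserialize signature suggests; the index values are arbitrary):

from amisc.component import IndexSet

s = IndexSet([((0, 0), (0, 0, 0)), ((0, 1), (0, 1, 0))])
as_strings = s.serialize()            # e.g. ['((0, 0), (0, 0, 0))', ...]
restored = IndexSet.deserialize(as_strings)
assert restored == s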

MiscTree(data=None, **kwargs)

Bases: UserDict, Serializable

Dataclass that maintains MISC data in a dict tree, indexed by alpha and beta. Overrides basic dict functionality to ensure elements are formatted correctly as (alpha, beta) -> data. Used to store MISC coefficients, model costs, and interpolator states.

The underlying data structure is: dict[MultiIndex, dict[MultiIndex, float | InterpolatorState]].

Source code in src/amisc/component.py
def __init__(self, data: dict = None, **kwargs):
    data_dict = data or {}
    if isinstance(data_dict, MiscTree):
        data_dict = data_dict.data
    data_dict.update(kwargs)
    super().__init__(self._validate_data(data_dict))
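
A short sketch of the (alpha, beta) indexing (the tuple-style access mirrors its use in update_misc_coeff above; the values are illustrative):

from amisc.component import MiscTree

tree = MiscTree()
tree[(0,), (0, 0)] = 1.0      # store a float (e.g. a MISC coefficient) at (alpha, beta)
tree[(1,), (0, 0)] = -1.0
print(tree[(0,), (0, 0)])     # 1.0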

clear()

Clear the MiscTree data.

Source code in src/amisc/component.py
def clear(self):
    """Clear the `MiscTree` data."""
    for key in list(self.data.keys()):
        del self.data[key]

deserialize(serialized_data) classmethod

Deserialize a dict to a MiscTree.

PARAMETER DESCRIPTION
serialized_data

the data to deserialize to a MiscTree object

TYPE: dict

Source code in src/amisc/component.py
@classmethod
def deserialize(cls, serialized_data: dict) -> MiscTree:
    """Deserialize a `dict` to a `MiscTree`.

    :param serialized_data: the data to deserialize to a `MiscTree` object
    """
    return cls(serialized_data)

serialize(*args, keep_yaml_objects=False, **kwargs)

Serialize alpha, beta indices to string and return a dict of internal data.

PARAMETER DESCRIPTION
args

extra serialization arguments for internal InterpolatorState

DEFAULT: ()

keep_yaml_objects

whether to keep YamlSerializable instances in the serialization

DEFAULT: False

kwargs

extra serialization keyword arguments for internal InterpolatorState

DEFAULT: {}

Source code in src/amisc/component.py
def serialize(self, *args, keep_yaml_objects=False, **kwargs) -> dict:
    """Serialize `alpha, beta` indices to string and return a `dict` of internal data.

    :param args: extra serialization arguments for internal `InterpolatorState`
    :param keep_yaml_objects: whether to keep `YamlSerializable` instances in the serialization
    :param kwargs: extra serialization keyword arguments for internal `InterpolatorState`
    """
    ret_dict = {}
    if state_serializer := self.state_serializer(self.data):
        ret_dict[self.SERIALIZER_KEY] = state_serializer.obj if keep_yaml_objects else state_serializer.serialize()
    for alpha, beta, data in self:
        ret_dict.setdefault(str(alpha), dict())
        serialized_data = data.serialize(*args, **kwargs) if isinstance(data, InterpolatorState) else float(data)
        ret_dict[str(alpha)][str(beta)] = serialized_data
    return ret_dict
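
Continuing the MiscTree sketch above, the serialized form nests stringified beta indices under stringified alpha indices (values illustrative):

tree.serialize()
# {'(0,)': {'(0, 0)': 1.0}, '(1,)': {'(0, 0)': -1.0}}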

state_serializer(data) classmethod

Infer and return the interpolator state serializer from the MiscTree data (if possible). If no InterpolatorState instance can be found, return None.

Source code in src/amisc/component.py
@classmethod
def state_serializer(cls, data: dict) -> YamlSerializable | None:
    """Infer and return the interpolator state serializer from the `MiscTree` data (if possible). If no
    `InterpolatorState` instance could be found, return `None`.
    """
    serializer = data.get(cls.SERIALIZER_KEY, None)  # if `data` is serialized
    if serializer is None:  # Otherwise search for an InterpolatorState
        for alpha, beta_dict in data.items():
            if alpha == cls.SERIALIZER_KEY:
                continue
            for beta, value in beta_dict.items():
                if isinstance(value, InterpolatorState):
                    serializer = type(value)
                    break
            if serializer is not None:
                break
    return cls._validate_state_serializer(serializer)

update(data_dict=None, **kwargs)

Force dict.update() through the validator.

Source code in src/amisc/component.py
def update(self, data_dict: dict = None, **kwargs):
    """Force `dict.update()` through the validator."""
    data_dict = data_dict or dict()
    data_dict.update(kwargs)
    super().update(self._validate_data(data_dict))

ModelKwargs

Bases: UserDict, Serializable

Default dataclass for storing model keyword arguments in a dict. If your kwargs require more complicated serialization or specification than a plain dict allows, subclass ModelKwargs.

from_dict(config) classmethod

Create a ModelKwargs object from a dict configuration.

Source code in src/amisc/component.py
@classmethod
def from_dict(cls, config: dict) -> ModelKwargs:
    """Create a `ModelKwargs` object from a `dict` configuration."""
    method = config.pop('method', 'default_kwargs').lower()
    match method:
        case 'default_kwargs':
            return ModelKwargs(**config)
        case 'string_kwargs':
            return StringKwargs(**config)
        case other:
            config['method'] = other
            return ModelKwargs(**config)  # Pass the method through
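
A short sketch of the method dispatch (the keyword values are illustrative):

kw = ModelKwargs.from_dict({'tol': 1e-6})                                # -> ModelKwargs
skw = ModelKwargs.from_dict({'method': 'string_kwargs', 'tol': 1e-6})    # -> StringKwargs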

StringKwargs

Bases: StringSerializable, ModelKwargs

Dataclass for storing model keyword arguments as a string.