4.4. Design Spaces
A Design Space defines a set of materials that should be searched over when performing a material design. Design Spaces must be registered to be used in a design workflow. Currently, there are four Design Spaces:
4.4.1. Product Design Space
Materials from a ProductDesignSpace are composed of the Cartesian product of independent factors.
Each factor can be a separate design space or a univariate dimension.
Any other type of design space can be a valid subspace.
Subspaces are defined anonymously and embedded in the product design space.
A Dimension defines valid values of a single variable.
Valid values can be discrete sets (i.e., enumerated using a list) or continuous ranges (i.e., defined by upper and lower bounds on real numbers).
The product design space samples materials by taking all combinations of one element from each dimension.
For example, given dimensions temperature = [300, 400] and time = [1, 5, 10] the Cartesian product is:
candidates = [
{"temperature": 300, "time": 1},
{"temperature": 300, "time": 5},
{"temperature": 300, "time": 10},
{"temperature": 400, "time": 1},
{"temperature": 400, "time": 5},
{"temperature": 400, "time": 10}
]
The defining characteristic of a design space is the ability to draw samples from it. If a continuous range is included, a random sample is drawn for that variable, and finite variables are exhaustively enumerated. Once all combinations of finite variables have been sampled, the cycle repeats while continuing to sample new values from the infinite dimension.
Finite sets of value are defined using an EnumeratedDimension.
Valid variable values are specified using a list of strings.
An enumerated dimension of two temperatures, for example, can be specified using the Citrine Python client via:
from citrine.informatics.descriptors import RealDescriptor
from citrine.informatics.dimensions import EnumeratedDimension
descriptor = RealDescriptor(key='Temperature', lower_bound=273, upper_bound=1000, units='K')
dimension = EnumeratedDimension(descriptor=descriptor, values=['300', '400'])
Continuous ranges of values are defined using a ContinuousDimension.
Upper and lower bounds define the range of values we wish to uniformly sample from.
If, using the previous example, temperature can be any value between 300 and 400K the dimension would be created using:
from citrine.informatics.dimensions import ContinuousDimension
dimension = ContinuousDimension(descriptor, lower_bound=300, upper_bound=400)
Note, the upper and lower bounds of the dimension do not need to match those of the descriptor. The bounds of the descriptor define the minimum and maximum temperatures that could be considered valid, e.g. our furnace can only reach 1000K. The bounds of the dimension are the bounds we wish to search between, e.g., restrict the search to between 300 and 400K (even though the furnace can go to much higher temperatures).
A product design space combines subspaces in a similar manner, although subspaces are often multivariate. However the same principle holds for sampling: all combinations of finite factors are enumerated, while infinite factors are sampled continuously. Note, each factor must be independent. This means that the same descriptor may not appear more than once in a product design space.
As an example, let’s create a produt design space that defines the ways in which we might mix two pigments together and stir at some temperature. We are only interested in specific amounts of each pigment, so we create a data source design space that references a data source defining the amounts we wish to test. The mixing speed is discrete, so we describe it with an enumerated dimension. And temperature is described by a continuous dimension.
from citrine.informatics.data_sources import GemTableDataSource
from citrine.informatics.descriptors import CategoricalDescriptor, RealDescriptor
from citrine.informatics.design_spaces import DataSourceDesignSpace, ProductDesignSpace
from citrine.informatics.dimensions import ContinuousDimension, EnumeratedDimension
pigment_data_source = data_source=GemTableDataSource(table_id=table_id, table_version=table_version)
enumerated_space = DataSourceDesignSpace(
name="amounts of pigments A and B",
description="total amount of pigment is 100 grams",
data_source=pigment_data_source
)
temp_descriptor = RealDescriptor(key='Temperature', lower_bound=273, upper_bound=1000, units='K')
temp_dimension = ContinuousDimension(descriptor=temp_descriptor, lower_bound=300, upper_bound=400)
speed_descriptor = CategoricalDescriptor(key='Mixing Speed', categories=["Slow", "Medium", "Fast"])
speed_dimension = EnumeratedDimension(descriptor=speed_descriptor, values=["Slow", "Fast"])
product_space = ProductDesignSpace(
name="Mix 2 pigments at some speed and temperature",
description="Pigments A and B, temperatures between 300 and 400 K, and either Slow or Fast",
subspaces=[enumerated_space],
dimensions=[temp_dimension, speed_dimension]
)
product_space = project.design_spaces.register(product_space)
The enumerated design space defined in this way might product the following candidates:
candidates = [
{"Amount of Pigment A": 10.0, "Amount of Pigment B": 90.0, "Mixing Speed": "Slow", "Temperature": 329.1356},
{"Amount of Pigment A": 10.0, "Amount of Pigment B": 90.0, "Mixing Speed": "Fast", "Temperature": 391.5329},
{"Amount of Pigment A": 15.0, "Amount of Pigment B": 85.0, "Mixing Speed": "Slow", "Temperature": 388.2350},
{"Amount of Pigment A": 15.0, "Amount of Pigment B": 85.0, "Mixing Speed": "Fast", "Temperature": 347.9817},
{"Amount of Pigment A": 20.0, "Amount of Pigment B": 80.0, "Mixing Speed": "Slow", "Temperature": 381.8395},
{"Amount of Pigment A": 20.0, "Amount of Pigment B": 80.0, "Mixing Speed": "Fast", "Temperature": 305.8001},
{"Amount of Pigment A": 10.0, "Amount of Pigment B": 90.0, "Mixing Speed": "Slow", "Temperature": 338.1545},
... # enumerated factors repeat while continuously sampling Temperature
]
4.4.2. Hierarchical Design Space
A HierarchicalDesignSpace produces candidates that represent full material histories.
Unlike a ProductDesignSpace, which produces flat candidates with composition represented only by raw ingredients, a hierarchical design space generates candidates with a tree structure: a terminal (root) material connected to sub-materials through formulation ingredients.
The design space is defined by a root node and zero or more subspace nodes, each represented by a MaterialNodeDefinition.
The root node defines the attributes and formulation contents of the terminal material in each candidate.
Subspaces define any new materials that appear in the history of the terminal material.
Commonly, each node in a hierarchical design space contains a FormulationDesignSpace as its formulation_subspace, which defines the ingredients, labels, and constraints for that level of the material history.
See Formulation Design Space below for details on configuring formulation subspaces.
Nodes are connected through formulation ingredient names.
If the root node contains a formulation subspace with an ingredient named "New Mixture-001", and a sub-node has its name set to "New Mixture-001", the resulting candidate will include that sub-node’s material as an ingredient in the root material’s formulation.
This linking can be extended to sub-nodes referencing other sub-nodes, allowing arbitrarily deep material history shapes.
4.4.2.1. Material Node Definitions
Each MaterialNodeDefinition describes a single node in the material history and has the following fields:
name: A unique identifier used to reference materials produced by this node. When a formulation subspace on another node includes this name as an ingredient, the nodes become linked in the resulting candidate.attributes: A list ofDimensionobjects defining the processing parameters on the materials produced by this node. These dimensions are explored independently during Candidate Generation.formulation_subspace: An optionalFormulationDesignSpacedefining the ingredients, labels, and constraints for formulations on materials produced by this node.template_link: An optionalTemplateLinklinking the node to material and process templates on the Citrine Platform.scope: An optional custom scope used to identify the materials produced by this node.display_name: An optional display name for identifying the node on the Citrine Platform (does not appear in generated candidates).
4.4.2.2. Template Links
A TemplateLink associates a node with on-platform material and process templates via their UUIDs.
Template names can optionally be provided for readability.
4.4.2.3. Data Sources
DataSource objects can be included on the design space to allow design over “known” materials.
When constructing candidates, the Citrine Platform looks up ingredient names from formulation subspaces in the provided data sources and injects their composition and properties into the material history.
When constructing a default hierarchical design space, the platform includes any data sources found on the associated predictor configuration.
See our documentation on Data Sources for more information.
4.4.2.4. Creating a Hierarchical Design Space
The easiest way to build a hierarchical design space is to use create_default() with mode=DefaultDesignSpaceMode.HIERARCHICAL.
This inspects the predictor’s training data to infer the material history shape and automatically constructs the root node, sub-nodes, dimensions, and formulation subspaces:
from citrine.informatics.design_spaces import DefaultDesignSpaceMode
design_space = project.design_spaces.create_default(
predictor_id=predictor_id,
predictor_version=predictor_version,
mode=DefaultDesignSpaceMode.HIERARCHICAL,
)
registered_design_space = project.design_spaces.register(design_space)
If you already have a registered ProductDesignSpace, you can convert it into an equivalent hierarchical design space using convert_to_hierarchical():
hierarchical_ds = project.design_spaces.convert_to_hierarchical(
product_design_space.uid,
predictor_id=predictor_id,
predictor_version=predictor_version,
)
registered_design_space = project.design_spaces.register(hierarchical_ds)
4.4.2.5. Manual Construction Example
The following example creates a hierarchical design space for a coating material manually. The root node represents the final coating, which is a formulation of a binder and a pigment mixture. The pigment mixture is defined as a sub-node with its own formulation and a processing temperature attribute.
from citrine.informatics.constraints import IngredientCountConstraint
from citrine.informatics.data_sources import GemTableDataSource
from citrine.informatics.descriptors import FormulationDescriptor, RealDescriptor
from citrine.informatics.design_spaces import (
FormulationDesignSpace,
HierarchicalDesignSpace,
)
from citrine.informatics.design_spaces.hierarchical_design_space import (
MaterialNodeDefinition,
)
from citrine.informatics.dimensions import ContinuousDimension
# Define the formulation descriptor used across the design space
formulation_descriptor = FormulationDescriptor.hierarchical()
# --- Sub-node: Pigment Mixture ---
# This node defines a mixture of two pigments with a processing temperature.
pigment_formulation = FormulationDesignSpace(
name="Pigment blend",
description="Blend of pigment A and pigment B",
formulation_descriptor=formulation_descriptor,
ingredients={"Pigment A", "Pigment B"},
labels={"pigment": {"Pigment A", "Pigment B"}},
constraints={
IngredientCountConstraint(
formulation_descriptor=formulation_descriptor, min=2, max=2
)
},
)
temp_descriptor = RealDescriptor(
key="Mixing Temperature", lower_bound=273, upper_bound=500, units="K"
)
temp_dimension = ContinuousDimension(
descriptor=temp_descriptor, lower_bound=300, upper_bound=400
)
pigment_node = MaterialNodeDefinition(
name="Pigment Mixture",
attributes=[temp_dimension],
formulation_subspace=pigment_formulation,
display_name="Pigment Mixture",
)
# --- Root node: Final Coating ---
# The coating is a formulation of a binder and the pigment mixture defined above.
# The ingredient name "Pigment Mixture" matches the sub-node's name,
# which links them in the resulting candidates.
coating_formulation = FormulationDesignSpace(
name="Coating formulation",
description="Binder plus pigment mixture",
formulation_descriptor=formulation_descriptor,
ingredients={"Binder", "Pigment Mixture"},
labels={"resin": {"Binder"}, "filler": {"Pigment Mixture"}},
constraints={
IngredientCountConstraint(
formulation_descriptor=formulation_descriptor, min=2, max=2
)
},
)
root_node = MaterialNodeDefinition(
name="Final Coating",
formulation_subspace=coating_formulation,
display_name="Final Coating",
)
# --- Assemble the hierarchical design space ---
design_space = HierarchicalDesignSpace(
name="Coating design space",
description="Designs coatings from binder and a pigment mixture",
root=root_node,
subspaces=[pigment_node],
data_sources=[
GemTableDataSource(table_id=table_id, table_version=table_version)
],
)
registered_design_space = project.design_spaces.register(design_space)
In this example, each candidate produced by the design space will contain a terminal coating material whose formulation includes a “Binder” (resolved from the data source) and a “Pigment Mixture” (a newly designed material with its own formulation and mixing temperature).
4.4.3. Data Source Design Space
A DataSourceDesignSpace draws its candidates from an existing data source.
Any data source can be used and no additional information is needed.
When registered, this type of design space must be a subspace of a ProductDesignSpace.
For example, assume you have a GemTable that contains one
Row for each candidate that you wish to test.
Assume the table’s table_id and table_version are known.
The example code below creates a registers a design space based on this Gem Table.
from citrine.informatics.data_sources import GemTableDataSource
from citrine.informatics.design_spaces import DataSourceDesignSpace, ProductDesignSpace
data_source = GemTableDataSource(
table_id=table_id,
table_version=table_version
)
data_source_design_space = DataSourceDesignSpace(
name="my candidates",
description="450 potential formulations",
data_source=data_source
)
design_space = ProductDesignSpace(
name="top-level design space",
description="contains a single data source design space.",
subspaces=[data_source_design_space]
)
registered_design_space = project.design_spaces.register(design_space)
4.4.4. Formulation Design Space
A FormulationDesignSpace defines the set of formulations that can be produced from a given set of ingredient names, labels, and constraints.
Ingredient names are specified as a set of strings, each mapping to a unique ingredient in a design space.
For example, {"water","salt"} may be the set of names for a design space with two ingredients.
Labels provide a way to map a string to a set of ingredient names.
For example, salt can be labeled as a solute by specifying the mapping {"solute": {"salt"}}.
An ingredient may be given multiple labels, and an ingredient will always be given all applicable labels when present in a formulation.
Constraints restrict the total number or fractional amount of ingredients in formulations sampled from the design space. There are three types of constraint that can be specified as part of a formulation design space:
IngredientCountConstraintconstrains the total number of ingredients in a formulation. At least one constraint on the total number of ingredients is required. Formulation Design Spaces without this constraint will fail validation. Additional ingredient count constraints may specify a label. If specified, only ingredients with the given label count towards the constraint total. This could be used, for example, to constrain the total number of solutes in a formulation without constraining the number of solvents.IngredientFractionConstraintrestricts the fractional amount of a single formulation ingredient between minimum and maximum bounds.LabelFractionConstraintplaces minimum and maximum bounds on the sum of fractional amounts of ingredients that have a specified label. This could be used, for example, to ensure the total fraction of ingredients labeled as solute is within a given range.
All minimum and maximum bounds for these three formulation constraints are inclusive.
IngredientFractionConstraint and LabelFractionConstraint also have an is_required flag.
By default is_required == True, indicating that ingredient and label fractions unconditionally must be within the minimum and maximum bound defined by the constraint.
If set to False, the fractional amount may be either zero or within the specified bounds.
In other words, the fractional amount is restricted to the specified bounds only when the formulation contains the constrained ingredient (for ingredient fraction constraints) or any ingredient with the given label (for label fraction constraints).
Setting is_required to False effectively adds 0 as a valid value.
Formulation Design Spaces define an inherent resolution for formulations sampled from the domain.
This resolution defines the minimum step size between consecutive formulations sampled from the space.
Resolution does not impose a grid over fractional ingredient amounts.
Instead, it provides a way to specify the characteristic length scale for the problem.
The resolution should be set to the minimum change in fractional ingredient amount that can be expected to make a difference in your problem.
The default resolution is 0.0001, which means that at least one ingredient fraction will differ by at least 0.0001 between consecutive candidates sampled from the formulation design space.
Formulations sampled from the design space are stored using the FormulationDescriptor passed to the design space when it is configured.
Each formulation contains two pieces of information: a recipe and a collection of ingredient labels.
Each recipe can be thought of as a map from ingredient name to its fractional amount, e.g., {"water": 0.99, "salt": 0.01}.
Ingredient fractions in recipes sampled from a formulation design space will always sum to 1.
Label information defines which labels are applied to each ingredient in the recipe.
These labels will always be a subset of all labels from the design space.
When registered, this type of design space must be a subspace of a ProductDesignSpace or part of a HierarchicalDesignSpace.
The following demonstrates how to create a formulation design space of saline solutions containing three ingredients: water, salt, and boric acid (a common antiseptic). We will require that formulations contain 2 ingredients, that no more than 1 solute is present, and that the total fraction of water is between 0.95 and 0.99.
from citrine.informatics.descriptors import FormulationDescriptor
from citrine.informatics.design_spaces import FormulationDesignSpace
from citrine.informatics.constraints import IngredientCountConstraint, IngredientFractionConstraint
# define the default descriptor to store formulations
descriptor = FormulationDescriptor.hierarchical()
# set of unique ingredient names
ingredients = {"water", "salt", "boric acid"}
# labels for each ingredient
labels = {
"solute": {"water"},
"solvent": {"salt", "boric acid"}
}
# constraints on formulations emitted from the design space
constraints = {
IngredientCountConstraint(formulation_descriptor=descriptor, min=2, max=2),
IngredientCountConstraint(formulation_descriptor=descriptor, label="solute", min=1, max=1),
IngredientFractionConstraint(formulation_descriptor=descriptor, ingredient="water", min=0.95, max=0.99)
}
formulation_design_space = FormulationDesignSpace(
name = "Saline solution design space",
description = "Composes formulations from water, salt, and boric acid",
formulation_descriptor = descriptor,
ingredients = ingredients,
labels = labels,
constraints = constraints
)
design_space = ProductDesignSpace(
name="top-level design space",
description="contains a single formulation design space.",
subspaces=[formulation_design_space]
)
registered_design_space = project.design_spaces.register(design_space)
4.4.5. Sampling from Design Spaces
To sample candidates from a registered design space, follow the example below:
from citrine import Citrine
from citrine.jobs.waiting import wait_while_executing
from citrine.informatics.design_spaces.sample_design_space import SampleDesignSpaceInput
sample_input = SampleDesignSpaceInput(n_candidates=50)
sample_design_space_collection = design_space.sample_design_space_executions
sample_design_space_execution = sample_design_space_collection.trigger(sample_input)
execution = wait_while_executing(
collection=sample_design_space_collection, execution=sample_design_space_execution
)
sampled_candidates = [candidate.material for candidate in execution.results()]
root_materials = [candidate.root for candidate in sampled_candidates]