citrine.gemtables.variables module

Variable definitions for GEM Tables.

class citrine.gemtables.variables.AttributeByTemplate(name: str, *, headers: List[str], template: UUID | str | LinkByUID | AttributeTemplate, attribute_constraints: List[Tuple[UUID | str | LinkByUID | AttributeTemplate, BaseBounds]] | None = None, type_selector: DataObjectTypeSelector = DataObjectTypeSelector.PREFER_RUN)

Bases: Serializable[AttributeByTemplate], Variable

Attribute marked by an attribute template.

Parameters:
  • name (str) – a short human-readable name to use when referencing the variable

  • headers (list[str]) – sequence of column headers

  • template (Union[UUID, str, LinkByUID, AttributeTemplate]) – attribute template that identifies the attribute to assign to the variable

  • attribute_constraints (List[Tuple[Union[UUID, str, LinkByUID, AttributeTemplate], BaseBounds]]) – Optional constraints on attributes of the target object that must be satisfied. Each constraint pairs an attribute, expressed as a link, with the Bounds it must satisfy. The attribute that the variable is being set to may itself be the target of a constraint.

  • type_selector (DataObjectTypeSelector) – strategy for selecting data object types to consider when matching, defaults to PREFER_RUN

classmethod build(data: dict) Self

Build an instance of this object from given data.

dump() dict

Dump this instance.

classmethod get_type(data) Type[Serializable]

Return the subtype.

attribute_constraints = None
attribute_type

alias of Union[UUID, str, LinkByUID, AttributeTemplate]

constraint_type

alias of Tuple[Union[UUID, str, LinkByUID, AttributeTemplate], BaseBounds]

headers = None
name = None
template = None
typ = 'attribute_by_template'
type_selector = None
class citrine.gemtables.variables.AttributeByTemplateAfterProcessTemplate(name: str, *, headers: List[str], attribute_template: UUID | str | LinkByUID | AttributeTemplate, process_template: UUID | str | LinkByUID | ProcessTemplate, attribute_constraints: List[Tuple[UUID | str | LinkByUID | AttributeTemplate, BaseBounds]] | None = None, type_selector: DataObjectTypeSelector = DataObjectTypeSelector.PREFER_RUN)

Bases: Serializable[AttributeByTemplateAfterProcessTemplate], Variable

Attribute of an object marked by an attribute template and a parent process template.

Parameters:
  • name (str) – a short human-readable name to use when referencing the variable

  • headers (list[str]) – sequence of column headers

  • attribute_template (Union[UUID, str, LinkByUID, AttributeTemplate]) – attribute template that identifies the attribute to assign to the variable

  • process_template (Union[UUID, str, LinkByUID, ProcessTemplate]) – process template that identifies the originating process

  • attribute_constraints (List[Tuple[Union[UUID, str, LinkByUID, AttributeTemplate], BaseBounds]]) – Optional constraints on attributes of the target object that must be satisfied. Each constraint pairs an attribute, expressed as a link, with the Bounds it must satisfy. The attribute that the variable is being set to may itself be the target of a constraint.

  • type_selector (DataObjectTypeSelector) – strategy for selecting data object types to consider when matching, defaults to PREFER_RUN

classmethod build(data: dict) Self

Build an instance of this object from given data.

dump() dict

Dump this instance.

classmethod get_type(data) Type[Serializable]

Return the subtype.

attribute_constraints = None
attribute_template = None
attribute_type

alias of Union[UUID, str, LinkByUID, AttributeTemplate]

constraint_type

alias of Tuple[Union[UUID, str, LinkByUID, AttributeTemplate], BaseBounds]

headers = None
name = None
process_template = None
process_type

alias of Union[UUID, str, LinkByUID, ProcessTemplate]

typ = 'attribute_after_process'
type_selector = None
class citrine.gemtables.variables.AttributeByTemplateAndObjectTemplate(name: str, *, headers: List[str], attribute_template: UUID | str | LinkByUID | AttributeTemplate, object_template: UUID | str | LinkByUID | BaseTemplate, attribute_constraints: List[Tuple[UUID | str | LinkByUID | AttributeTemplate, BaseBounds]] | None = None, type_selector: DataObjectTypeSelector = DataObjectTypeSelector.PREFER_RUN)

Bases: Serializable[AttributeByTemplateAndObjectTemplate], Variable

Attribute marked by an attribute template and an object template.

For example, one property may be measured by two different measurement techniques. In that case, both measurements would carry the same attribute template. Filtering by measurement template, which identifies the measurement technique, disambiguates the otherwise ambiguous property.
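The filtering idea above can be illustrated with a small standard-library sketch. It is a toy model, not the platform's matching logic: the Measurement record, the select helper, and the template names "density", "pycnometry", and "hydrometer" are all hypothetical.

```python
from typing import List, NamedTuple, Optional


class Measurement(NamedTuple):
    """Toy record: a measured value plus the templates that mark it (hypothetical)."""
    attribute_template: str
    object_template: str
    value: float


def select(records: List[Measurement], attribute_template: str,
           object_template: Optional[str] = None) -> List[float]:
    """Return values whose attribute template (and, if given, object template) match."""
    return [
        r.value
        for r in records
        if r.attribute_template == attribute_template
        and (object_template is None or r.object_template == object_template)
    ]


records = [
    Measurement("density", "pycnometry", 1.02),
    Measurement("density", "hydrometer", 1.05),
]

# The attribute template alone matches both measurements: ambiguous.
print(select(records, "density"))                # [1.02, 1.05]
# Adding the object (measurement) template leaves exactly one match.
print(select(records, "density", "pycnometry"))  # [1.02]
```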

Parameters:
  • name (str) – a short human-readable name to use when referencing the variable

  • headers (list[str]) – sequence of column headers

  • attribute_template (Union[UUID, str, LinkByUID, AttributeTemplate]) – attribute template that identifies the attribute to assign to the variable

  • object_template (Union[UUID, str, LinkByUID, BaseTemplate]) – template that identifies the associated object

  • attribute_constraints (List[Tuple[Union[UUID, str, LinkByUID, AttributeTemplate], BaseBounds]]) – Optional constraints on attributes of the target object that must be satisfied. Each constraint pairs an attribute, expressed as a link, with the Bounds it must satisfy. The attribute that the variable is being set to may itself be the target of a constraint.

  • type_selector (DataObjectTypeSelector) – strategy for selecting data object types to consider when matching, defaults to PREFER_RUN

classmethod build(data: dict) Self

Build an instance of this object from given data.

dump() dict

Dump this instance.

classmethod get_type(data) Type[Serializable]

Return the subtype.

attribute_constraints = None
attribute_template = None
attribute_type

alias of Union[UUID, str, LinkByUID, AttributeTemplate]

constraint_type

alias of Tuple[Union[UUID, str, LinkByUID, AttributeTemplate], BaseBounds]

headers = None
name = None
object_template = None
object_type

alias of Union[UUID, str, LinkByUID, BaseTemplate]

typ = 'attribute_by_object'
type_selector = None
class citrine.gemtables.variables.AttributeInOutput(name: str, *, headers: List[str], attribute_template: UUID | str | LinkByUID | AttributeTemplate, process_templates: List[UUID | str | LinkByUID | ProcessTemplate], attribute_constraints: List[Tuple[UUID | str | LinkByUID | AttributeTemplate, BaseBounds]] | None = None, type_selector: DataObjectTypeSelector = DataObjectTypeSelector.PREFER_RUN)

Bases: Serializable[AttributeInOutput], Variable

Attribute marked by an attribute template in the trunk of the history tree.

The search for an attribute marked by the given attribute template starts at the terminal of the material history tree and proceeds until any of the given process templates is reached. Those templates block the search from continuing into their ingredients but do not halt the search entirely. This variable definition allows attributes that are present in both the output and the inputs of a process to be distinguished.

For example, a material “paint” might be produced by mixing and then resting “pigments” and a “base”. The color of the pigments and base could be measured and recorded as attributes in addition to the color of the resulting paint. To define a variable as the color of the resulting paint, AttributeInOutput can be used with the mixing process included in the list of process templates. Then, when the platform looks for the color of a paint, it will find it but won’t traverse through the mixing process and also find the colors of the pigments and base, which would result in an ambiguous variable match.

Unlike AttributeByTemplateAfterProcessTemplate, AttributeInOutput will also match on the color attribute of the pigments in the rows that correspond to those pigments. This way, all the colors can be assigned to the same variable and rendered into the same columns in the GEM table.
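The trunk search described above can be sketched with a toy history tree. This is a simplified standard-library model under stated assumptions, not the platform's implementation: the Material class, the find_in_trunk helper, and the single "mixing" step (the resting step is omitted) are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class Material:
    """Toy material node: its attributes plus the process that produced it."""
    name: str
    attributes: Dict[str, str] = field(default_factory=dict)
    process_template: Optional[str] = None  # template of the producing process
    ingredients: List["Material"] = field(default_factory=list)


def find_in_trunk(terminal: Material, attribute: str, cutoffs: Set[str]) -> List[str]:
    """Collect values of `attribute`, never descending into the
    ingredients of a process whose template is listed in `cutoffs`."""
    found: List[str] = []
    stack = [terminal]
    while stack:
        node = stack.pop()
        if attribute in node.attributes:
            found.append(node.attributes[attribute])
        if node.process_template not in cutoffs:
            stack.extend(node.ingredients)
    return found


pigment = Material("pigment", {"color": "red"})
base = Material("base", {"color": "white"})
paint = Material("paint", {"color": "pink"},
                 process_template="mixing", ingredients=[pigment, base])

# With the mixing template as a cutoff, only the paint's own color matches.
print(find_in_trunk(paint, "color", cutoffs={"mixing"}))       # ['pink']
# Without a cutoff, the ingredient colors match too: an ambiguous result.
print(sorted(find_in_trunk(paint, "color", cutoffs=set())))    # ['pink', 'red', 'white']
# In the pigment's own row, the same definition matches the pigment color.
print(find_in_trunk(pigment, "color", cutoffs={"mixing"}))     # ['red']
```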

Parameters:
  • name (str) – a short human-readable name to use when referencing the variable

  • headers (list[str]) – sequence of column headers

  • attribute_template (Union[UUID, str, LinkByUID, AttributeTemplate]) – attribute template that identifies the attribute to assign to the variable

  • process_templates (list[Union[UUID, str, LinkByUID, ProcessTemplate]]) – process templates that should not be traversed through when searching for a matching attribute. The attribute may be present in these processes but not their ingredients.

  • attribute_constraints (List[Tuple[Union[UUID, str, LinkByUID, AttributeTemplate], BaseBounds]]) – Optional constraints on attributes of the target object that must be satisfied. Each constraint pairs an attribute, expressed as a link, with the Bounds it must satisfy. The attribute that the variable is being set to may itself be the target of a constraint.

  • type_selector (DataObjectTypeSelector) – strategy for selecting data object types to consider when matching, defaults to PREFER_RUN

classmethod build(data: dict) Self

Build an instance of this object from given data.

dump() dict

Dump this instance.

classmethod get_type(data) Type[Serializable]

Return the subtype.

attribute_constraints = None
attribute_template = None
attribute_type

alias of Union[UUID, str, LinkByUID, AttributeTemplate]

constraint_type

alias of Tuple[Union[UUID, str, LinkByUID, AttributeTemplate], BaseBounds]

headers = None
name = None
process_templates = None
process_type

alias of Union[UUID, str, LinkByUID, ProcessTemplate]

typ = 'attribute_in_trunk'
type_selector = None
class citrine.gemtables.variables.DataObjectTypeSelector(value)

Bases: BaseEnumeration

The strategy for selecting types to consider for variable matching.

Variables can potentially match many objects in a material history, creating ambiguity around which value should be assigned. In particular, associated runs and specs often share attributes and thus will often match the same variable. To enable disambiguation in such circumstances, many variables allow specification of a type_selector, with the following choices:

  • RUN_ONLY only match run objects

  • SPEC_ONLY only match spec objects

  • PREFER_RUN match either run or spec objects, and if both types match, only return the result for runs

  • ANY match either run or spec objects, and if both types match, return an ambiguous error result
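The four strategies can be summarized in a short standard-library sketch. It is a toy model of the rules listed above, not the library's code: the Selector enum is a hypothetical stand-in that reuses the documented member values, and raising an exception is a simplification of the "ambiguous error result".

```python
from enum import Enum
from typing import Optional


class Selector(str, Enum):
    """Hypothetical stand-in for DataObjectTypeSelector (same member values)."""
    RUN_ONLY = "run_only"
    SPEC_ONLY = "spec_only"
    PREFER_RUN = "prefer_run"
    ANY = "any"


class AmbiguousMatch(Exception):
    """Raised in this sketch when both a run and a spec match under ANY."""


def resolve(selector: Selector, run_value: Optional[str],
            spec_value: Optional[str]) -> Optional[str]:
    """Pick the value to report, given what matched on the run and on the spec.

    A value of None means that object type produced no match.
    """
    if selector is Selector.RUN_ONLY:
        return run_value
    if selector is Selector.SPEC_ONLY:
        return spec_value
    if selector is Selector.PREFER_RUN:
        return run_value if run_value is not None else spec_value
    # Selector.ANY: either type may match, but both matching is ambiguous.
    if run_value is not None and spec_value is not None:
        raise AmbiguousMatch("both a run and a spec matched")
    return run_value if run_value is not None else spec_value


print(resolve(Selector.PREFER_RUN, "from run", "from spec"))  # from run
print(resolve(Selector.PREFER_RUN, None, "from spec"))        # from spec
```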

capitalize()

Return a capitalized version of the string.

More specifically, make the first character have upper case and the rest lower case.

casefold()

Return a version of the string suitable for caseless comparisons.

center(width, fillchar=' ', /)

Return a centered string of length width.

Padding is done using the specified fill character (default is a space).

count(sub[, start[, end]]) int

Return the number of non-overlapping occurrences of substring sub in string S[start:end]. Optional arguments start and end are interpreted as in slice notation.

encode(encoding='utf-8', errors='strict')

Encode the string using the codec registered for encoding.

encoding

The encoding in which to encode the string.

errors

The error handling scheme to use for encoding errors. The default is ‘strict’ meaning that encoding errors raise a UnicodeEncodeError. Other possible values are ‘ignore’, ‘replace’ and ‘xmlcharrefreplace’ as well as any other name registered with codecs.register_error that can handle UnicodeEncodeErrors.

endswith(suffix[, start[, end]]) bool

Return True if S ends with the specified suffix, False otherwise. With optional start, test S beginning at that position. With optional end, stop comparing S at that position. suffix can also be a tuple of strings to try.

expandtabs(tabsize=8)

Return a copy where all tab characters are expanded using spaces.

If tabsize is not given, a tab size of 8 characters is assumed.

find(sub[, start[, end]]) int

Return the lowest index in S where substring sub is found, such that sub is contained within S[start:end]. Optional arguments start and end are interpreted as in slice notation.

Return -1 on failure.

format(*args, **kwargs) str

Return a formatted version of S, using substitutions from args and kwargs. The substitutions are identified by braces (‘{’ and ‘}’).

format_map(mapping) str

Return a formatted version of S, using substitutions from mapping. The substitutions are identified by braces (‘{’ and ‘}’).

index(sub[, start[, end]]) int

Return the lowest index in S where substring sub is found, such that sub is contained within S[start:end]. Optional arguments start and end are interpreted as in slice notation.

Raises ValueError when the substring is not found.

isalnum()

Return True if the string is an alpha-numeric string, False otherwise.

A string is alpha-numeric if all characters in the string are alpha-numeric and there is at least one character in the string.

isalpha()

Return True if the string is an alphabetic string, False otherwise.

A string is alphabetic if all characters in the string are alphabetic and there is at least one character in the string.

isascii()

Return True if all characters in the string are ASCII, False otherwise.

ASCII characters have code points in the range U+0000-U+007F. Empty string is ASCII too.

isdecimal()

Return True if the string is a decimal string, False otherwise.

A string is a decimal string if all characters in the string are decimal and there is at least one character in the string.

isdigit()

Return True if the string is a digit string, False otherwise.

A string is a digit string if all characters in the string are digits and there is at least one character in the string.

isidentifier()

Return True if the string is a valid Python identifier, False otherwise.

Call keyword.iskeyword(s) to test whether string s is a reserved identifier, such as “def” or “class”.

islower()

Return True if the string is a lowercase string, False otherwise.

A string is lowercase if all cased characters in the string are lowercase and there is at least one cased character in the string.

isnumeric()

Return True if the string is a numeric string, False otherwise.

A string is numeric if all characters in the string are numeric and there is at least one character in the string.

isprintable()

Return True if the string is printable, False otherwise.

A string is printable if all of its characters are considered printable in repr() or if it is empty.

isspace()

Return True if the string is a whitespace string, False otherwise.

A string is whitespace if all characters in the string are whitespace and there is at least one character in the string.

istitle()

Return True if the string is a title-cased string, False otherwise.

In a title-cased string, upper- and title-case characters may only follow uncased characters and lowercase characters only cased ones.

isupper()

Return True if the string is an uppercase string, False otherwise.

A string is uppercase if all cased characters in the string are uppercase and there is at least one cased character in the string.

join(iterable, /)

Concatenate any number of strings.

The string whose method is called is inserted in between each given string. The result is returned as a new string.

Example: '.'.join(['ab', 'pq', 'rs']) -> 'ab.pq.rs'

ljust(width, fillchar=' ', /)

Return a left-justified string of length width.

Padding is done using the specified fill character (default is a space).

lower()

Return a copy of the string converted to lowercase.

lstrip(chars=None, /)

Return a copy of the string with leading whitespace removed.

If chars is given and not None, remove characters in chars instead.

static maketrans()

Return a translation table usable for str.translate().

If there is only one argument, it must be a dictionary mapping Unicode ordinals (integers) or characters to Unicode ordinals, strings or None. Character keys will be then converted to ordinals. If there are two arguments, they must be strings of equal length, and in the resulting dictionary, each character in x will be mapped to the character at the same position in y. If there is a third argument, it must be a string, whose characters will be mapped to None in the result.

partition(sep, /)

Partition the string into three parts using the given separator.

This will search for the separator in the string. If the separator is found, returns a 3-tuple containing the part before the separator, the separator itself, and the part after it.

If the separator is not found, returns a 3-tuple containing the original string and two empty strings.

removeprefix(prefix, /)

Return a str with the given prefix string removed if present.

If the string starts with the prefix string, return string[len(prefix):]. Otherwise, return a copy of the original string.

removesuffix(suffix, /)

Return a str with the given suffix string removed if present.

If the string ends with the suffix string and that suffix is not empty, return string[:-len(suffix)]. Otherwise, return a copy of the original string.

replace(old, new, count=-1, /)

Return a copy with all occurrences of substring old replaced by new.

count

Maximum number of occurrences to replace. -1 (the default value) means replace all occurrences.

If the optional argument count is given, only the first count occurrences are replaced.

rfind(sub[, start[, end]]) int

Return the highest index in S where substring sub is found, such that sub is contained within S[start:end]. Optional arguments start and end are interpreted as in slice notation.

Return -1 on failure.

rindex(sub[, start[, end]]) int

Return the highest index in S where substring sub is found, such that sub is contained within S[start:end]. Optional arguments start and end are interpreted as in slice notation.

Raises ValueError when the substring is not found.

rjust(width, fillchar=' ', /)

Return a right-justified string of length width.

Padding is done using the specified fill character (default is a space).

rpartition(sep, /)

Partition the string into three parts using the given separator.

This will search for the separator in the string, starting at the end. If the separator is found, returns a 3-tuple containing the part before the separator, the separator itself, and the part after it.

If the separator is not found, returns a 3-tuple containing two empty strings and the original string.

rsplit(sep=None, maxsplit=-1)

Return a list of the substrings in the string, using sep as the separator string.

sep

The separator used to split the string.

When set to None (the default value), will split on any whitespace character (including \n \r \t \f and spaces) and will discard empty strings from the result.

maxsplit

Maximum number of splits. -1 (the default value) means no limit.

Splitting starts at the end of the string and works to the front.

rstrip(chars=None, /)

Return a copy of the string with trailing whitespace removed.

If chars is given and not None, remove characters in chars instead.

split(sep=None, maxsplit=-1)

Return a list of the substrings in the string, using sep as the separator string.

sep

The separator used to split the string.

When set to None (the default value), will split on any whitespace character (including \n \r \t \f and spaces) and will discard empty strings from the result.

maxsplit

Maximum number of splits. -1 (the default value) means no limit.

Splitting starts at the front of the string and works to the end.

Note, str.split() is mainly useful for data that has been intentionally delimited. With natural text that includes punctuation, consider using the regular expression module.

splitlines(keepends=False)

Return a list of the lines in the string, breaking at line boundaries.

Line breaks are not included in the resulting list unless keepends is given and true.

startswith(prefix[, start[, end]]) bool

Return True if S starts with the specified prefix, False otherwise. With optional start, test S beginning at that position. With optional end, stop comparing S at that position. prefix can also be a tuple of strings to try.

strip(chars=None, /)

Return a copy of the string with leading and trailing whitespace removed.

If chars is given and not None, remove characters in chars instead.

swapcase()

Convert uppercase characters to lowercase and lowercase characters to uppercase.

title()

Return a version of the string where each word is titlecased.

More specifically, words start with uppercased characters and all remaining cased characters have lower case.

translate(table, /)

Replace each character in the string using the given translation table.

table

Translation table, which must be a mapping of Unicode ordinals to Unicode ordinals, strings, or None.

The table must implement lookup/indexing via __getitem__, for instance a dictionary or list. If this operation raises LookupError, the character is left untouched. Characters mapped to None are deleted.

upper()

Return a copy of the string converted to uppercase.

zfill(width, /)

Pad a numeric string with zeros on the left, to fill a field of the given width.

The string is never truncated.

ANY = 'any'
PREFER_RUN = 'prefer_run'
RUN_ONLY = 'run_only'
SPEC_ONLY = 'spec_only'
class citrine.gemtables.variables.IngredientIdentifierByProcessTemplateAndName(name: str, *, headers: List[str], process_template: UUID | str | LinkByUID | ProcessTemplate, ingredient_name: str, scope: str, type_selector: DataObjectTypeSelector = DataObjectTypeSelector.PREFER_RUN)

Bases: Serializable[IngredientIdentifierByProcessAndName], Variable

Ingredient identifier associated with a process template and a name.

Parameters:
  • name (str) – a short human-readable name to use when referencing the variable

  • headers (list[str]) – sequence of column headers

  • process_template (Union[UUID, str, LinkByUID, ProcessTemplate]) – process template associated with this ingredient identifier

  • ingredient_name (str) – name of ingredient

  • scope (str) – scope of the identifier to extract; the signature above declares no default, so a value must be passed explicitly

  • type_selector (DataObjectTypeSelector) – strategy for selecting data object types to consider when matching, defaults to PREFER_RUN

classmethod build(data: dict) Self

Build an instance of this object from given data.

dump() dict

Dump this instance.

classmethod get_type(data) Type[Serializable]

Return the subtype.

headers = None
ingredient_name = None
name = None
process_template = None
process_type

alias of Union[UUID, str, LinkByUID, ProcessTemplate]

scope = None
typ = 'ing_id_by_process_and_name'
type_selector = None
class citrine.gemtables.variables.IngredientIdentifierInOutput(name: str, *, headers: List[str], ingredient_name: str, process_templates: List[UUID | str | LinkByUID | ProcessTemplate], scope: str = 'id', type_selector: DataObjectTypeSelector = DataObjectTypeSelector.PREFER_RUN)

Bases: Serializable[IngredientIdentifierInOutput], Variable

Ingredient identifier in the trunk of a material history tree.

The search for an ingredient starts at the terminal of the material history tree and proceeds until any of the given process templates are reached. Those templates block the search from continuing but are inclusive: a match is extracted if an ingredient with the specified ingredient name is found at or before a cutoff.

This variable definition allows an identifier to be extracted when an ingredient is used in multiple processes. As an example, consider a paint formed by mixing red and yellow pigments, where the red pigment is formed by mixing yellow and magenta. This variable could be used to represent the identifier of yellow in both mixing processes (red and the final paint) in a single column provided the process templates that mixed red and the final paint are included as cutoffs.

In general, this variable should be preferred over an IngredientIdentifierByProcessTemplateAndName when mixtures are hierarchical (i.e., blends of blends). It allows an ingredient with a single name to be used in multiple processes without defining additional variables that manifest as additional columns in your GEM table, and must be used in place of the former if the same process template is used to represent mixing at multiple levels in the material history hierarchy. Going back to the previous example, this variable must be used in place of an IngredientIdentifierByProcessTemplateAndName if the same process template was used to represent the process that mixed red and the final paint. Using IngredientIdentifierByProcessTemplateAndName would result in an ambiguous match because yellow would be found twice in the material history, once when mixing red and again when mixing the final paint.
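The ambiguity described above can be reproduced with a toy history. This is a simplified standard-library sketch under stated assumptions, not the platform's search: the Mixture class, the find_ingredient_ids helper, the shared "mix" template, and the identifier strings are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class Mixture:
    """Toy material: the template of its producing process and its named ingredients."""
    process_template: Optional[str] = None
    ingredients: Dict[str, "Mixture"] = field(default_factory=dict)
    identifier: str = ""


def find_ingredient_ids(terminal: Mixture, ingredient_name: str,
                        cutoffs: Optional[Set[str]] = None) -> List[str]:
    """Return the identifier of every ingredient called `ingredient_name`.

    With `cutoffs`, a process whose template is listed is still inspected
    (inclusive) but its ingredients are not searched further. With
    cutoffs=None the whole history is searched.
    """
    matches: List[str] = []
    stack = [terminal]
    while stack:
        node = stack.pop()
        if ingredient_name in node.ingredients:
            matches.append(node.ingredients[ingredient_name].identifier)
        if cutoffs is None or node.process_template not in cutoffs:
            stack.extend(node.ingredients.values())
    return matches


yellow = Mixture(identifier="yellow-001")
magenta = Mixture(identifier="magenta-001")
# The same process template, "mix", is used at both levels of the hierarchy.
red = Mixture("mix", {"yellow": yellow, "magenta": magenta}, "red-001")
paint = Mixture("mix", {"red": red, "yellow": yellow}, "paint-001")

# Searching the whole history finds yellow twice: an ambiguous match.
print(len(find_ingredient_ids(paint, "yellow")))             # 2
# With "mix" as an inclusive cutoff, each row yields exactly one match.
print(find_ingredient_ids(paint, "yellow", cutoffs={"mix"}))  # ['yellow-001']
print(find_ingredient_ids(red, "yellow", cutoffs={"mix"}))    # ['yellow-001']
```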

Parameters:
  • name (str) – a short human-readable name to use when referencing the variable

  • headers (list[str]) – sequence of column headers

  • ingredient_name (str) – Name of the ingredient to search for

  • process_templates (list[Union[UUID, str, LinkByUID, ProcessTemplate]]) – process templates that halt the search for a matching ingredient name. These process templates are inclusive: the ingredient may be present in these processes but not before.

  • scope (str) – scope of the identifier to extract, defaults to 'id'

  • type_selector (DataObjectTypeSelector) – strategy for selecting data object types to consider when matching, defaults to PREFER_RUN

classmethod build(data: dict) Self

Build an instance of this object from given data.

dump() dict

Dump this instance.

classmethod get_type(data) Type[Serializable]

Return the subtype.

headers = None
ingredient_name = None
name = None
process_templates = None
process_type

alias of Union[UUID, str, LinkByUID, ProcessTemplate]

scope = None
typ = 'ing_id_in_output'
type_selector = None
class citrine.gemtables.variables.IngredientLabelByProcessAndName(name: str, *, headers: List[str], process_template: UUID | str | LinkByUID | ProcessTemplate, ingredient_name: str, label: str, type_selector: DataObjectTypeSelector = DataObjectTypeSelector.PREFER_RUN)

Bases: Serializable[IngredientLabelByProcessAndName], Variable

A boolean variable indicating whether a given label is applied.

Matches by process template, ingredient name, and the label string to check.

For example, a column might indicate whether or not the ingredient “ethanol” is labeled as a “solvent” in the “second mixing” process. Many such columns would then support the downstream analysis “get the volumetric average density of the solvents”.
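As an illustration of the downstream analysis mentioned above, the following standard-library sketch computes a volume-weighted average density over the ingredients flagged as solvents. The row layout, the key names, and the numeric values are hypothetical and are not produced by the library.

```python
from typing import Dict, List

# Hypothetical per-ingredient data: a boolean label column like the one this
# variable would populate, plus a volume fraction and a density.
ingredients: List[Dict] = [
    {"name": "ethanol", "is_solvent": True, "volume_fraction": 0.3, "density": 0.789},
    {"name": "water", "is_solvent": True, "volume_fraction": 0.5, "density": 1.0},
    {"name": "resin", "is_solvent": False, "volume_fraction": 0.2, "density": 1.2},
]


def volumetric_average_density(rows: List[Dict], label_key: str) -> float:
    """Volume-weighted mean density of the rows whose label column is True."""
    selected = [r for r in rows if r[label_key]]
    total_volume = sum(r["volume_fraction"] for r in selected)
    if total_volume == 0:
        raise ValueError("no labeled ingredients with nonzero volume")
    weighted = sum(r["volume_fraction"] * r["density"] for r in selected)
    return weighted / total_volume


# (0.3 * 0.789 + 0.5 * 1.0) / (0.3 + 0.5)
print(volumetric_average_density(ingredients, "is_solvent"))
```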

Parameters:
  • name (str) – a short human-readable name to use when referencing the variable

  • headers (list[str]) – sequence of column headers

  • process_template (Union[UUID, str, LinkByUID, ProcessTemplate]) – process template associated with this ingredient

  • ingredient_name (str) – name of ingredient

  • label (str) – label to test

  • type_selector (DataObjectTypeSelector) – strategy for selecting data object types to consider when matching, defaults to PREFER_RUN

classmethod build(data: dict) Self

Build an instance of this object from given data.

dump() dict

Dump this instance.

classmethod get_type(data) Type[Serializable]

Return the subtype.

headers = None
ingredient_name = None
label = None
name = None
process_template = None
process_type

alias of Union[UUID, str, LinkByUID, ProcessTemplate]

typ = 'ing_label_by_process_and_name'
type_selector = None
class citrine.gemtables.variables.IngredientLabelsSetByProcessAndName(name: str, *, headers: List[str], process_template: UUID | str | LinkByUID | ProcessTemplate, ingredient_name: str)

Bases: Serializable[IngredientLabelsSetByProcessAndName], Variable

The set of labels on an ingredient when used in a process.

For example, the ingredient “ethanol” might be labeled “solvent”, “alcohol” and “VOC”. The column would then contain that set of strings.

Parameters:
  • name (str) – a short human-readable name to use when referencing the variable

  • headers (list[str]) – sequence of column headers

  • process_template (Union[UUID, str, LinkByUID, ProcessTemplate]) – process template associated with this ingredient

  • ingredient_name (str) – name of ingredient

classmethod build(data: dict) Self

Build an instance of this object from given data.

dump() dict

Dump this instance.

classmethod get_type(data) Type[Serializable]

Return the subtype.

headers = None
ingredient_name = None
name = None
process_template = None
process_type

alias of Union[UUID, str, LinkByUID, ProcessTemplate]

typ = 'ing_labels_set_by_process_and_name'
class citrine.gemtables.variables.IngredientLabelsSetInOutput(name: str, *, headers: List[str], process_templates: List[UUID | str | LinkByUID | ProcessTemplate], ingredient_name: str)

Bases: Serializable[IngredientLabelsSetInOutput], Variable

The set of labels on an ingredient in the trunk of a material history tree.

The search for an ingredient starts at the terminal of the material history tree and proceeds until any of the given process templates are reached. Those templates block the search from continuing but are inclusive: a match is extracted if an ingredient with the specified ingredient name is found at or before a cutoff.

This variable definition allows a set of labels to be extracted when an ingredient is used in multiple processes. As an example, consider a paint formed by mixing red and yellow pigments, where the red pigment is formed by mixing yellow and magenta. This variable could be used to represent the labels applied to yellow in both mixing processes (red and the final paint) in a single column provided the process templates that mixed red and the final paint are included as cutoffs.

In general, this variable should be preferred over an IngredientLabelsSetByProcessAndName when mixtures are hierarchical (i.e., blends of blends). It allows an ingredient with a single name to be used in multiple processes without defining additional variables that manifest as additional columns in your GEM table, and must be used in place of the former if the same process template is used to represent mixing at multiple levels in the material history hierarchy. Going back to the previous example, this variable must be used in place of an IngredientLabelsSetByProcessAndName if the same process template was used to represent the process that mixed red and the final paint. Using IngredientLabelsSetByProcessAndName would result in an ambiguous match because yellow would be found twice in the material history, once when mixing red and again when mixing the final paint.

Parameters:
  • name (str) – a short human-readable name to use when referencing the variable

  • headers (list[str]) – sequence of column headers

  • process_templates (list[Union[UUID, str, LinkByUID, ProcessTemplate]]) – process templates that halt the search for a matching ingredient name. These process templates are inclusive: the ingredient may be present in these processes but not before.

  • ingredient_name (str) – name of ingredient

classmethod build(data: dict) Self

Build an instance of this object from given data.

dump() dict

Dump this instance.

classmethod get_type(data) Type[Serializable]

Return the subtype.

headers = None
ingredient_name = None
name = None
process_templates = None
process_type

alias of Union[UUID, str, LinkByUID, ProcessTemplate]

typ = 'ing_label_set_in_output'
class citrine.gemtables.variables.IngredientQuantityByProcessAndName(name: str, *, headers: List[str], process_template: UUID | str | LinkByUID | ProcessTemplate, ingredient_name: str, quantity_dimension: IngredientQuantityDimension, type_selector: DataObjectTypeSelector = DataObjectTypeSelector.PREFER_RUN, unit: str | None = None)

Bases: Serializable[IngredientQuantityByProcessAndName], Variable

The quantity of an ingredient associated with a process template and a name.

Parameters:
  • name (str) – a short human-readable name to use when referencing the variable

  • headers (list[str]) – sequence of column headers

  • process_template (Union[UUID, str, LinkByUID, ProcessTemplate]) – process template associated with this ingredient

  • ingredient_name (str) – name of ingredient

  • quantity_dimension (IngredientQuantityDimension) – Dimension of the ingredient quantity: absolute quantity, number fraction, mass fraction, or volume fraction. Valid options are defined by IngredientQuantityDimension.

  • type_selector (DataObjectTypeSelector) – strategy for selecting data object types to consider when matching, defaults to PREFER_RUN

  • unit (str) – An optional unit: only ingredient quantities that are convertible to this unit will be matched. Note that this parameter is mandatory when quantity_dimension is IngredientQuantityDimension.ABSOLUTE.
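
The rule that unit is mandatory for absolute quantities can be sketched as a small check. This is a minimal illustration, not the library's validation code: QuantityDimension is a stand-in enum that only mirrors the documented member values, and check_unit is a hypothetical helper.

```python
from enum import Enum
from typing import Optional


class QuantityDimension(str, Enum):
    """Stand-in mirroring the documented values of IngredientQuantityDimension."""
    ABSOLUTE = "absolute"
    MASS = "mass"
    NUMBER = "number"
    VOLUME = "volume"


def check_unit(dimension: QuantityDimension, unit: Optional[str]) -> None:
    """Raise ValueError if an absolute quantity is requested without a unit.

    Hypothetical helper: it only models the documented rule.
    """
    if dimension is QuantityDimension.ABSOLUTE and unit is None:
        raise ValueError("unit is required when quantity_dimension is ABSOLUTE")


check_unit(QuantityDimension.MASS, None)      # fractions need no unit
check_unit(QuantityDimension.ABSOLUTE, "kg")  # absolute quantity with a unit
```

Calling check_unit(QuantityDimension.ABSOLUTE, None) raises ValueError in this sketch.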

classmethod build(data: dict) Self

Build an instance of this object from given data.

dump() dict

Dump this instance.

classmethod get_type(data) Type[Serializable]

Return the subtype.

headers = None
ingredient_name = None
name = None
process_template = None
process_type

alias of Union[UUID, str, LinkByUID, ProcessTemplate]

quantity_dimension = None
typ = 'ing_quantity_by_process_and_name'
type_selector = None
unit = None
class citrine.gemtables.variables.IngredientQuantityDimension(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)

Bases: BaseEnumeration

The dimension of an ingredient quantity.

  • ABSOLUTE corresponds to the absolute quantity

  • MASS corresponds to the mass fraction

  • VOLUME corresponds to the volume fraction

  • NUMBER corresponds to the number fraction

capitalize()

Return a capitalized version of the string.

More specifically, make the first character have upper case and the rest lower case.

casefold()

Return a version of the string suitable for caseless comparisons.

center(width, fillchar=' ', /)

Return a centered string of length width.

Padding is done using the specified fill character (default is a space).

count(sub[, start[, end]]) int

Return the number of non-overlapping occurrences of substring sub in string S[start:end]. Optional arguments start and end are interpreted as in slice notation.

encode(encoding='utf-8', errors='strict')

Encode the string using the codec registered for encoding.

encoding

The encoding in which to encode the string.

errors

The error handling scheme to use for encoding errors. The default is ‘strict’ meaning that encoding errors raise a UnicodeEncodeError. Other possible values are ‘ignore’, ‘replace’ and ‘xmlcharrefreplace’ as well as any other name registered with codecs.register_error that can handle UnicodeEncodeErrors.

endswith(suffix[, start[, end]]) bool

Return True if S ends with the specified suffix, False otherwise. With optional start, test S beginning at that position. With optional end, stop comparing S at that position. suffix can also be a tuple of strings to try.

expandtabs(tabsize=8)

Return a copy where all tab characters are expanded using spaces.

If tabsize is not given, a tab size of 8 characters is assumed.

find(sub[, start[, end]]) int

Return the lowest index in S where substring sub is found, such that sub is contained within S[start:end]. Optional arguments start and end are interpreted as in slice notation.

Return -1 on failure.

format(*args, **kwargs) str

Return a formatted version of S, using substitutions from args and kwargs. The substitutions are identified by braces (‘{’ and ‘}’).

format_map(mapping) str

Return a formatted version of S, using substitutions from mapping. The substitutions are identified by braces (‘{’ and ‘}’).

index(sub[, start[, end]]) int

Return the lowest index in S where substring sub is found, such that sub is contained within S[start:end]. Optional arguments start and end are interpreted as in slice notation.

Raises ValueError when the substring is not found.

isalnum()

Return True if the string is an alpha-numeric string, False otherwise.

A string is alpha-numeric if all characters in the string are alpha-numeric and there is at least one character in the string.

isalpha()

Return True if the string is an alphabetic string, False otherwise.

A string is alphabetic if all characters in the string are alphabetic and there is at least one character in the string.

isascii()

Return True if all characters in the string are ASCII, False otherwise.

ASCII characters have code points in the range U+0000-U+007F. Empty string is ASCII too.

isdecimal()

Return True if the string is a decimal string, False otherwise.

A string is a decimal string if all characters in the string are decimal and there is at least one character in the string.

isdigit()

Return True if the string is a digit string, False otherwise.

A string is a digit string if all characters in the string are digits and there is at least one character in the string.

isidentifier()

Return True if the string is a valid Python identifier, False otherwise.

Call keyword.iskeyword(s) to test whether string s is a reserved identifier, such as “def” or “class”.

islower()

Return True if the string is a lowercase string, False otherwise.

A string is lowercase if all cased characters in the string are lowercase and there is at least one cased character in the string.

isnumeric()

Return True if the string is a numeric string, False otherwise.

A string is numeric if all characters in the string are numeric and there is at least one character in the string.

isprintable()

Return True if the string is printable, False otherwise.

A string is printable if all of its characters are considered printable in repr() or if it is empty.

isspace()

Return True if the string is a whitespace string, False otherwise.

A string is whitespace if all characters in the string are whitespace and there is at least one character in the string.

istitle()

Return True if the string is a title-cased string, False otherwise.

In a title-cased string, upper- and title-case characters may only follow uncased characters and lowercase characters only cased ones.

isupper()

Return True if the string is an uppercase string, False otherwise.

A string is uppercase if all cased characters in the string are uppercase and there is at least one cased character in the string.

join(iterable, /)

Concatenate any number of strings.

The string whose method is called is inserted in between each given string. The result is returned as a new string.

Example: ‘.’.join([‘ab’, ‘pq’, ‘rs’]) -> ‘ab.pq.rs’

ljust(width, fillchar=' ', /)

Return a left-justified string of length width.

Padding is done using the specified fill character (default is a space).

lower()

Return a copy of the string converted to lowercase.

lstrip(chars=None, /)

Return a copy of the string with leading whitespace removed.

If chars is given and not None, remove characters in chars instead.

static maketrans()

Return a translation table usable for str.translate().

If there is only one argument, it must be a dictionary mapping Unicode ordinals (integers) or characters to Unicode ordinals, strings or None. Character keys will be then converted to ordinals. If there are two arguments, they must be strings of equal length, and in the resulting dictionary, each character in x will be mapped to the character at the same position in y. If there is a third argument, it must be a string, whose characters will be mapped to None in the result.

partition(sep, /)

Partition the string into three parts using the given separator.

This will search for the separator in the string. If the separator is found, returns a 3-tuple containing the part before the separator, the separator itself, and the part after it.

If the separator is not found, returns a 3-tuple containing the original string and two empty strings.

removeprefix(prefix, /)

Return a str with the given prefix string removed if present.

If the string starts with the prefix string, return string[len(prefix):]. Otherwise, return a copy of the original string.

removesuffix(suffix, /)

Return a str with the given suffix string removed if present.

If the string ends with the suffix string and that suffix is not empty, return string[:-len(suffix)]. Otherwise, return a copy of the original string.

replace(old, new, count=-1, /)

Return a copy with all occurrences of substring old replaced by new.

count

Maximum number of occurrences to replace. -1 (the default value) means replace all occurrences.

If the optional argument count is given, only the first count occurrences are replaced.

rfind(sub[, start[, end]]) int

Return the highest index in S where substring sub is found, such that sub is contained within S[start:end]. Optional arguments start and end are interpreted as in slice notation.

Return -1 on failure.

rindex(sub[, start[, end]]) int

Return the highest index in S where substring sub is found, such that sub is contained within S[start:end]. Optional arguments start and end are interpreted as in slice notation.

Raises ValueError when the substring is not found.

rjust(width, fillchar=' ', /)

Return a right-justified string of length width.

Padding is done using the specified fill character (default is a space).

rpartition(sep, /)

Partition the string into three parts using the given separator.

This will search for the separator in the string, starting at the end. If the separator is found, returns a 3-tuple containing the part before the separator, the separator itself, and the part after it.

If the separator is not found, returns a 3-tuple containing two empty strings and the original string.

rsplit(sep=None, maxsplit=-1)

Return a list of the substrings in the string, using sep as the separator string.

sep

The separator used to split the string.

When set to None (the default value), will split on any whitespace character (including \n \r \t \f and spaces) and will discard empty strings from the result.

maxsplit

Maximum number of splits. -1 (the default value) means no limit.

Splitting starts at the end of the string and works to the front.

rstrip(chars=None, /)

Return a copy of the string with trailing whitespace removed.

If chars is given and not None, remove characters in chars instead.

split(sep=None, maxsplit=-1)

Return a list of the substrings in the string, using sep as the separator string.

sep

The separator used to split the string.

When set to None (the default value), will split on any whitespace character (including \n \r \t \f and spaces) and will discard empty strings from the result.

maxsplit

Maximum number of splits. -1 (the default value) means no limit.

Splitting starts at the front of the string and works to the end.

Note, str.split() is mainly useful for data that has been intentionally delimited. With natural text that includes punctuation, consider using the regular expression module.

splitlines(keepends=False)

Return a list of the lines in the string, breaking at line boundaries.

Line breaks are not included in the resulting list unless keepends is given and true.

startswith(prefix[, start[, end]]) bool

Return True if S starts with the specified prefix, False otherwise. With optional start, test S beginning at that position. With optional end, stop comparing S at that position. prefix can also be a tuple of strings to try.

strip(chars=None, /)

Return a copy of the string with leading and trailing whitespace removed.

If chars is given and not None, remove characters in chars instead.

swapcase()

Convert uppercase characters to lowercase and lowercase characters to uppercase.

title()

Return a version of the string where each word is titlecased.

More specifically, words start with uppercased characters and all remaining cased characters have lower case.

translate(table, /)

Replace each character in the string using the given translation table.

table

Translation table, which must be a mapping of Unicode ordinals to Unicode ordinals, strings, or None.

The table must implement lookup/indexing via __getitem__, for instance a dictionary or list. If this operation raises LookupError, the character is left untouched. Characters mapped to None are deleted.

upper()

Return a copy of the string converted to uppercase.

zfill(width, /)

Pad a numeric string with zeros on the left, to fill a field of the given width.

The string is never truncated.

ABSOLUTE = 'absolute'
MASS = 'mass'
NUMBER = 'number'
VOLUME = 'volume'
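
The string methods listed above suggest that the enumeration is string-backed. A minimal sketch of what that implies, assuming the base class behaves like a standard str-based Enum; Dimension is a stand-in defined here, not the library class.

```python
from enum import Enum


class Dimension(str, Enum):
    """Stand-in with the same member values as IngredientQuantityDimension."""
    ABSOLUTE = "absolute"
    MASS = "mass"
    NUMBER = "number"
    VOLUME = "volume"


# Look a member up from its serialized string value.
mass = Dimension("mass")

# Members are also str instances, so they compare equal to their values
# and expose the usual str methods.
is_same_member = mass is Dimension.MASS
equals_plain_string = mass == "mass"
upper_value = mass.upper()
```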
class citrine.gemtables.variables.IngredientQuantityInOutput(name: str, *, headers: List[str], ingredient_name: str, quantity_dimension: IngredientQuantityDimension, process_templates: List[UUID | str | LinkByUID | ProcessTemplate], type_selector: DataObjectTypeSelector = DataObjectTypeSelector.PREFER_RUN, unit: str | None = None)

Bases: Serializable[IngredientQuantityInOutput], Variable

Ingredient quantity in the trunk of a material history tree.

The search for an ingredient starts at the terminal of the material history tree and proceeds until any of the given process templates are reached. Those templates block the search from continuing but are inclusive: a match is extracted if an ingredient with the specified ingredient name is found at or before a cutoff.

This variable definition allows a quantity to be extracted when an ingredient is used in multiple processes. As an example, consider a paint formed by mixing red and yellow pigments, where the red pigment is formed by mixing yellow and magenta. This variable could be used to represent the quantity of yellow in both mixing processes (red and the final paint) in a single column provided the process templates that mixed red and the final paint are included as cutoffs.

In general, this variable should be preferred over an IngredientQuantityByProcessAndName when mixtures are hierarchical (i.e., blends of blends). It allows an ingredient with a single name to be used in multiple processes without defining additional variables that manifest as additional columns in your table, and must be used in place of the former if the same process template is used to represent mixing at multiple levels in the material history hierarchy. Going back to the previous example, this variable must be used in place of an IngredientQuantityByProcessAndName if the same process template was used to represent the process that mixed red and the final paint. Using IngredientQuantityByProcessAndName would result in an ambiguous match because yellow would be found twice in the material history, once when mixing red and again when mixing the final paint.

Parameters:
  • name (str) – a short human-readable name to use when referencing the variable

  • headers (list[str]) – sequence of column headers

  • ingredient_name (str) – Name of the ingredient to search for

  • quantity_dimension (IngredientQuantityDimension) – Dimension of the ingredient quantity: absolute quantity, number, mass, or volume fraction. Valid options are defined by IngredientQuantityDimension

  • process_templates (list[Union[UUID, str, LinkByUID, ProcessTemplate]]) – process templates that halt the search for a matching ingredient name. These process templates are inclusive: the ingredient may be present in these processes but not before them.

  • type_selector (DataObjectTypeSelector) – strategy for selecting data object types to consider when matching, defaults to PREFER_RUN

  • unit (str) – An optional unit: only ingredient quantities that are convertible to this unit will be matched. Note that this parameter is mandatory when quantity_dimension is IngredientQuantityDimension.ABSOLUTE.
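
The cutoff behavior and the paint example can be illustrated with a toy model. This is a sketch under stated assumptions, not the platform's matching algorithm: Process, Ingredient, quantities_in_output, the template name "mixing", and the quantities are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class Process:
    """Toy process run: a template name plus named ingredients."""
    template: str
    ingredients: Dict[str, "Ingredient"] = field(default_factory=dict)


@dataclass
class Ingredient:
    """Toy ingredient: a quantity and the process that made its material, if any."""
    quantity: float
    made_by: Optional[Process] = None


def quantities_in_output(terminal: Process, ingredient_name: str,
                         cutoffs: Set[str]) -> List[float]:
    """Collect quantities of ingredient_name, searching back until a cutoff template."""
    matches: List[float] = []
    pending = [terminal]
    while pending:
        process = pending.pop()
        if ingredient_name in process.ingredients:
            matches.append(process.ingredients[ingredient_name].quantity)
        if process.template in cutoffs:
            continue  # inclusive cutoff: matched above, but do not look further back
        for ingredient in process.ingredients.values():
            if ingredient.made_by is not None:
                pending.append(ingredient.made_by)
    return matches


# Red is mixed from yellow and magenta; the paint is mixed from red and yellow.
# Both mixing steps share one (hypothetical) process template, "mixing".
mix_red = Process("mixing", {"yellow": Ingredient(0.5), "magenta": Ingredient(0.5)})
mix_paint = Process("mixing", {"red": Ingredient(0.7, made_by=mix_red),
                               "yellow": Ingredient(0.3)})

with_cutoff = quantities_in_output(mix_paint, "yellow", {"mixing"})
without_cutoff = quantities_in_output(mix_paint, "yellow", set())
```

With the shared template as a cutoff, the search stops at the final mixing step and finds yellow once. Without a cutoff it finds yellow in both mixing steps, which corresponds to the ambiguous match described above.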

classmethod build(data: dict) Self

Build an instance of this object from given data.

dump() dict

Dump this instance.

classmethod get_type(data) Type[Serializable]

Return the subtype.

headers = None
ingredient_name = None
name = None
process_templates = None
process_type

alias of Union[UUID, str, LinkByUID, ProcessTemplate]

quantity_dimension = None
typ = 'ing_quantity_in_output'
type_selector = None
unit = None
class citrine.gemtables.variables.LocalAttribute(name: str, *, headers: List[str], template: UUID | str | LinkByUID | AttributeTemplate, attribute_constraints: List[Tuple[UUID | str | LinkByUID | AttributeTemplate, BaseBounds]] | None = None, type_selector: DataObjectTypeSelector = DataObjectTypeSelector.PREFER_RUN)

Bases: Serializable[LocalAttribute], Variable

[ALPHA] Attribute marked by an attribute template for the root of a material history tree.

Parameters:
  • name (str) – a short human-readable name to use when referencing the variable

  • headers (list[str]) – sequence of column headers

  • template (Union[UUID, str, LinkByUID, AttributeTemplate]) – attribute template that identifies the attribute to assign to the variable

  • attribute_constraints (List[Tuple[Union[UUID, str, LinkByUID, AttributeTemplate], Bounds]]) – Optional constraints on object attributes in the target object that must be satisfied. Constraints are expressed as Bounds. Attributes are expressed with links. The attribute that the variable is being set to may be the target of a constraint as well.

  • type_selector (DataObjectTypeSelector) – strategy for selecting data object types to consider when matching, defaults to PREFER_RUN
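
How attribute_constraints narrow a match can be sketched with plain Python. This is an illustration only: Bounds is modeled as a closed numeric interval, attributes are keyed by hypothetical template names, and treating a missing constrained attribute as a failed constraint is an assumption.

```python
from typing import Dict, List, Optional, Tuple

# Closed interval standing in for a real-valued bounds object (assumption).
Bounds = Tuple[float, float]


def select_attribute(attributes: Dict[str, float], template: str,
                     constraints: List[Tuple[str, Bounds]]) -> Optional[float]:
    """Return the value for template if every constraint holds on the same object."""
    for constrained_template, (low, high) in constraints:
        value = attributes.get(constrained_template)
        if value is None or not (low <= value <= high):
            return None  # a constraint is not satisfied, so nothing is matched
    return attributes.get(template)


# Hypothetical attributes on one target object.
measured = {"temperature": 350.0, "viscosity": 1.2}

matched = select_attribute(measured, "viscosity", [("temperature", (300.0, 400.0))])
rejected = select_attribute(measured, "viscosity", [("temperature", (400.0, 500.0))])
# The attribute being extracted may itself be the target of a constraint.
self_rejected = select_attribute(measured, "viscosity", [("viscosity", (0.0, 1.0))])
```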

classmethod build(data: dict) Self

Build an instance of this object from given data.

dump() dict

Dump this instance.

classmethod get_type(data) Type[Serializable]

Return the subtype.

attribute_constraints = None
attribute_type

alias of Union[UUID, str, LinkByUID, AttributeTemplate]

constraint_type

alias of Tuple[Union[UUID, str, LinkByUID, AttributeTemplate], BaseBounds]

headers = None
name = None
template = None
typ = 'local_attribute'
type_selector = None
class citrine.gemtables.variables.LocalAttributeAndObject(name: str, *, headers: List[str], template: UUID | str | LinkByUID | AttributeTemplate, object_template: UUID | str | LinkByUID | BaseTemplate, attribute_constraints: List[Tuple[UUID | str | LinkByUID | AttributeTemplate, BaseBounds]] | None = None, type_selector: DataObjectTypeSelector = DataObjectTypeSelector.PREFER_RUN)

Bases: Serializable[LocalAttributeAndObject], Variable

[ALPHA] Attribute marked by an attribute template for the root of a material history tree.

Parameters:
  • name (str) – a short human-readable name to use when referencing the variable

  • headers (list[str]) – sequence of column headers

  • template (Union[UUID, str, LinkByUID, AttributeTemplate]) – attribute template that identifies the attribute to assign to the variable

  • object_template (Union[UUID, str, LinkByUID, BaseTemplate]) – template that identifies the object associated with the attribute

  • attribute_constraints (List[Tuple[Union[UUID, str, LinkByUID, AttributeTemplate], Bounds]]) – Optional constraints on object attributes in the target object that must be satisfied. Constraints are expressed as Bounds. Attributes are expressed with links. The attribute that the variable is being set to may be the target of a constraint as well.

  • type_selector (DataObjectTypeSelector) – strategy for selecting data object types to consider when matching, defaults to PREFER_RUN

classmethod build(data: dict) Self

Build an instance of this object from given data.

dump() dict

Dump this instance.

classmethod get_type(data) Type[Serializable]

Return the subtype.

attribute_constraints = None
attribute_type

alias of Union[UUID, str, LinkByUID, AttributeTemplate]

constraint_type

alias of Tuple[Union[UUID, str, LinkByUID, AttributeTemplate], BaseBounds]

headers = None
name = None
object_template = None
object_type

alias of Union[UUID, str, LinkByUID, BaseTemplate]

template = None
typ = 'local_attribute_and_object'
type_selector = None
class citrine.gemtables.variables.LocalIngredientIdentifier(name: str, *, headers: List[str], ingredient_name: str, scope: str = 'id', type_selector: DataObjectTypeSelector = DataObjectTypeSelector.PREFER_RUN)

Bases: Serializable[LocalIngredientIdentifier], Variable

Ingredient identifier for the root process of a material history tree.

Get ingredient identifier by name. Stop traversal when encountering any ingredient. This class exists because of a common pattern of using IngredientIdentifierInOutput variable definitions with the process_templates list populated with every single process template. This class has the same terminating behavior without the need to populate the table config with a long list of redundant process template IDs.

Parameters:
  • name (str) – a short human-readable name to use when referencing the variable

  • headers (list[str]) – sequence of column headers

  • ingredient_name (str) – Name of the ingredient to search for

  • scope (str) – scope of the identifier (default: the Citrine scope)

  • type_selector (DataObjectTypeSelector) – strategy for selecting data object types to consider when matching, defaults to PREFER_RUN

classmethod build(data: dict) Self

Build an instance of this object from given data.

dump() dict

Dump this instance.

classmethod get_type(data) Type[Serializable]

Return the subtype.

headers = None
ingredient_name = None
name = None
process_type

alias of Union[UUID, str, LinkByUID, ProcessTemplate]

scope = None
typ = 'local_ing_id'
type_selector = None
class citrine.gemtables.variables.LocalIngredientLabelsSet(name: str, *, headers: List[str], ingredient_name: str)

Bases: Serializable[LocalIngredientLabelsSet], Variable

The set of labels on an ingredient for the root process of a material history tree.

Defines a variable that contains the set of labels present on the ingredient.

For example, the labels might be “solvent” and “alcohol” for the variable “what roles is the ethanol playing?”. Many such columns would then support the downstream analysis “get the volumetric average density of the solvents”.

Parameters:
  • name (str) – a short human-readable name to use when referencing the variable

  • headers (list[str]) – sequence of column headers

  • ingredient_name (str) – name of ingredient
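
The downstream analysis mentioned above ("get the volumetric average density of the solvents") could look roughly like this once label sets, volume fractions, and densities are available as columns. The row layout, ingredient names, and numbers are illustrative assumptions, not data from a real table.

```python
from typing import Dict, Set, Tuple

# Hypothetical rows: ingredient -> (label set, volume fraction, density in g/mL).
# The numbers are illustrative only.
ingredients: Dict[str, Tuple[Set[str], float, float]] = {
    "ethanol": ({"solvent", "alcohol"}, 0.3, 0.8),
    "water": ({"solvent"}, 0.5, 1.0),
    "pigment": ({"colorant"}, 0.2, 4.0),
}


def volumetric_average_density(rows: Dict[str, Tuple[Set[str], float, float]],
                               label: str) -> float:
    """Volume-fraction-weighted mean density of the ingredients carrying label."""
    selected = [(vol, rho) for labels, vol, rho in rows.values() if label in labels]
    total_volume = sum(vol for vol, _ in selected)
    if total_volume == 0:
        raise ValueError(f"no ingredient with label {label!r}")
    return sum(vol * rho for vol, rho in selected) / total_volume


solvent_density = volumetric_average_density(ingredients, "solvent")
```

Only ethanol and water carry the "solvent" label here, so the pigment does not contribute to the average.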

classmethod build(data: dict) Self

Build an instance of this object from given data.

dump() dict

Dump this instance.

classmethod get_type(data) Type[Serializable]

Return the subtype.

headers = None
ingredient_name = None
name = None
process_type

alias of Union[UUID, str, LinkByUID, ProcessTemplate]

typ = 'local_ing_label_set'
class citrine.gemtables.variables.LocalIngredientQuantity(name: str, *, headers: List[str], ingredient_name: str, quantity_dimension: IngredientQuantityDimension, type_selector: DataObjectTypeSelector = DataObjectTypeSelector.PREFER_RUN, unit: str | None = None)

Bases: Serializable[LocalIngredientQuantity], Variable

The quantity of an ingredient for the root process of a material history tree.

Get ingredient quantity by name. Stop traversal when encountering any ingredient. This class exists because of a common pattern of using IngredientQuantityInOutput with the process_templates list populated with every single process template in a dataset. This class has the same terminating behavior without the need to populate the table config with a long list of redundant process template IDs.

Parameters:
  • name (str) – a short human-readable name to use when referencing the variable

  • headers (list[str]) – sequence of column headers

  • ingredient_name (str) – Name of the ingredient to search for

  • quantity_dimension (IngredientQuantityDimension) – Dimension of the ingredient quantity: absolute quantity, number, mass, or volume fraction. Valid options are defined by IngredientQuantityDimension

  • type_selector (DataObjectTypeSelector) – strategy for selecting data object types to consider when matching, defaults to PREFER_RUN

  • unit (str) – An optional unit: only ingredient quantities that are convertible to this unit will be matched. Note that this parameter is mandatory when quantity_dimension is IngredientQuantityDimension.ABSOLUTE.
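
A minimal sketch of the "root process only" behavior, assuming a toy material history in which each process maps ingredient names to a quantity and an optional upstream process. The data layout and the local_quantity helper are hypothetical and do not reflect how the platform stores histories.

```python
from typing import Dict, Optional, Tuple

# Toy history: process -> {ingredient name: (quantity, upstream process or None)}.
mix_red: Dict[str, Tuple[float, Optional[dict]]] = {
    "yellow": (0.5, None),
    "magenta": (0.5, None),
}
mix_paint: Dict[str, Tuple[float, Optional[dict]]] = {
    "red": (0.7, mix_red),
    "yellow": (0.3, None),
}


def local_quantity(root_process: Dict[str, Tuple[float, Optional[dict]]],
                   ingredient_name: str) -> Optional[float]:
    """Return the quantity of the named ingredient in the root process only."""
    entry = root_process.get(ingredient_name)
    return None if entry is None else entry[0]


yellow = local_quantity(mix_paint, "yellow")    # present in the root process
magenta = local_quantity(mix_paint, "magenta")  # only upstream, so not matched
```

Because the lookup never descends into upstream processes, no list of cutoff templates is needed.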

classmethod build(data: dict) Self

Build an instance of this object from given data.

dump() dict

Dump this instance.

classmethod get_type(data) Type[Serializable]

Return the subtype.

headers = None
ingredient_name = None
name = None
process_type

alias of Union[UUID, str, LinkByUID, ProcessTemplate]

quantity_dimension = None
typ = 'local_ing_quantity'
type_selector = None
unit = None
class citrine.gemtables.variables.TerminalMaterialIdentifier(name: str, *, headers: List[str], scope: str = 'id')

Bases: Serializable[TerminalMaterialIdentifier], Variable

A unique identifier of the terminal material of the material history, by scope.

Parameters:
  • name (str) – a short human-readable name to use when referencing the variable

  • headers (list[str]) – sequence of column headers

  • scope (str) – scope of the identifier (default: the Citrine scope)

classmethod build(data: dict) Self

Build an instance of this object from given data.

dump() dict

Dump this instance.

classmethod get_type(data) Type[Serializable]

Return the subtype.

headers = None
name = None
scope = None
typ = 'root_id'
class citrine.gemtables.variables.TerminalMaterialInfo(name: str, *, headers: List[str], field: str)

Bases: Serializable[TerminalMaterialInfo], Variable

Metadata from the terminal material of the material history.

Parameters:
  • name (str) – a short human-readable name to use when referencing the variable

  • headers (list[str]) – sequence of column headers

  • field (str) – name of the field to assign the variable to, for example, “sample_type” would assign the sample type of the terminal material run

classmethod build(data: dict) Self

Build an instance of this object from given data.

dump() dict

Dump this instance.

classmethod get_type(data) Type[Serializable]

Return the subtype.

field = None
headers = None
name = None
typ = 'root_info'
class citrine.gemtables.variables.Variable

Bases: PolymorphicSerializable[Variable]

A variable that can be assigned values present in material histories.

Abstract type that returns the proper type given a serialized dict.

classmethod build(data: dict) SelfType

Build the underlying type.

classmethod get_type(data) Type[Serializable]

Return the subtype.
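
The idea of returning the proper type from a serialized dict can be sketched as a lookup on a type string. This is a toy model, not the library's implementation: the "type" key, the ToyVariable classes, and the registry are assumptions, and only the typ strings 'root_id' and 'root_info' are taken from this page.

```python
from typing import Dict, List, Type


class ToyVariable:
    """Toy base class that dispatches on the serialized type string."""
    typ: str = ""
    _registry: Dict[str, Type["ToyVariable"]] = {}

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # Register each subclass under its typ string.
        ToyVariable._registry[cls.typ] = cls

    def __init__(self, name: str, headers: List[str]):
        self.name = name
        self.headers = headers

    @classmethod
    def get_type(cls, data: dict) -> Type["ToyVariable"]:
        """Return the subclass registered for the dict's type string."""
        return ToyVariable._registry[data["type"]]

    @classmethod
    def build(cls, data: dict) -> "ToyVariable":
        """Build an instance of whichever subclass the dict describes."""
        subtype = cls.get_type(data)
        return subtype(data["name"], data["headers"])


class ToyRootId(ToyVariable):
    typ = "root_id"


class ToyRootInfo(ToyVariable):
    typ = "root_info"


built = ToyVariable.build({"type": "root_id", "name": "id", "headers": ["Identifier"]})
```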

class citrine.gemtables.variables.XOR(name, *, headers, variables)

Bases: Serializable[XOR], Variable

Logical exclusive OR for GEM table variables.

This variable combines the results of 2 or more variables into a single variable according to exclusive OR logic. XOR is defined when exactly one of its inputs is defined. Otherwise it is undefined.

XOR can only operate on inputs with the same output type. For example, you may XOR TerminalMaterialIdentifier with IngredientIdentifierByProcessTemplateAndName because they both produce simple strings, but not with IngredientQuantityInOutput which produces a real numeric quantity.

The input variables to XOR need not exist elsewhere in the table config, and the name and headers assigned to them have no bearing on how the table is constructed. That they are required at all is simply a limitation of the current API.

Parameters:
  • name (str) – a short human-readable name to use when referencing the variable

  • headers (list[str]) – sequence of column headers

  • variables (list[Variable]) – set of 2 or more Variables to XOR
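
The exclusive-OR rule described above can be sketched in a few lines. In this illustration None stands for an undefined value and xor_value is a hypothetical helper, not part of the library.

```python
from typing import List, Optional, Sequence, TypeVar

T = TypeVar("T")


def xor_value(values: Sequence[Optional[T]]) -> Optional[T]:
    """Return the single defined value, or None unless exactly one is defined."""
    defined: List[T] = [value for value in values if value is not None]
    return defined[0] if len(defined) == 1 else None


only_first = xor_value(["sample-1", None])        # exactly one defined
neither = xor_value([None, None])                 # none defined
both = xor_value(["sample-1", "ingredient-7"])    # more than one defined
```

Only the first call yields a value; the other two are undefined (None) in this sketch.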

classmethod build(data: dict) Self

Build an instance of this object from given data.

dump() dict

Dump this instance.

classmethod get_type(data) Type[Serializable]

Return the subtype.

headers = None
name = None
typ = 'xor'
variables = None