citrine.gemtables.columns module
Column definitions for GEM Tables.
- class citrine.gemtables.columns.ChemicalDisplayFormat(value)
Bases:
BaseEnumeration
Format to use when rendering a molecular structure.
SMILES
Simplified molecular-input line-entry system
INCHI
International Chemical Identifier
- INCHI = 'inchi'
- SMILES = 'smiles'
- class citrine.gemtables.columns.Column
Bases:
PolymorphicSerializable[Column]
A column in the GEM Table, defined as some operation on a variable.
Abstract type that returns the proper type given a serialized dict.
- classmethod build(data: dict) SelfType
Build the underlying type.
- classmethod get_type(data) Type[Serializable]
Return the subtype.
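The relationship between get_type and build can be illustrated with a minimal, standalone sketch. It does not use the citrine package: the ColumnSketch classes, the registry, and the "type" key in the serialized dict are all assumptions made for illustration, and only the typ strings are taken from this page.

```python
from typing import Dict, Type


class ColumnSketch:
    """Hypothetical stand-in for the abstract Column type."""

    # Registry mapping a typ string to a concrete subclass. This is an
    # assumption of the sketch, not the library's internal mechanism.
    _registry: Dict[str, Type["ColumnSketch"]] = {}
    typ: str = ""

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        ColumnSketch._registry[cls.typ] = cls

    def __init__(self, *, data_source: str):
        self.data_source = data_source

    @classmethod
    def get_type(cls, data: dict) -> Type["ColumnSketch"]:
        """Return the subtype named by the (assumed) "type" key."""
        return cls._registry[data["type"]]

    @classmethod
    def build(cls, data: dict) -> "ColumnSketch":
        """Build an instance of the subtype selected by get_type."""
        subtype = cls.get_type(data)
        return subtype(data_source=data["data_source"])


class MeanColumnSketch(ColumnSketch):
    typ = "mean_column"


class IdentityColumnSketch(ColumnSketch):
    typ = "identity_column"


column = ColumnSketch.build({"type": "identity_column", "data_source": "name"})
print(type(column).__name__, column.data_source)
```

Calling build on the abstract type returns an instance of the concrete subclass, so callers do not need to know in advance which kind of column a serialized dict describes.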
- class citrine.gemtables.columns.ComponentQuantityColumn(*, data_source: str | Variable, component_name: str, normalize: bool = False)
Bases:
Serializable[ComponentQuantityColumn], Column
Column that extracts the quantity of a given component.
If the component is not present in the composition, then the value in the column will be 0.0.
- Parameters:
data_source (Union[str, Variable]) – name of the variable to use when populating the column
component_name (str) – name of the component from which to extract the quantity
normalize (bool) – whether to normalize the quantity by the sum of all component amounts. Defaults to False.
- classmethod build(data: dict) Self
Build an instance of this object from given data.
- dump() dict
Dump this instance.
- classmethod get_type(data) Type[Serializable]
Return the subtype.
- component_name = None
- data_source = None
- normalize = None
- typ = 'component_quantity_column'
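The behavior described for ComponentQuantityColumn (a missing component yields 0.0, and normalize divides by the sum of all component amounts) can be sketched in plain Python. The function below is a hypothetical illustration, not the library's implementation. The composition is assumed to be a mapping from component name to quantity, and the handling of a zero total is an assumption of this sketch.

```python
from typing import Mapping


def component_quantity(
    composition: Mapping[str, float],
    component_name: str,
    normalize: bool = False,
) -> float:
    """Return the quantity of one component, or 0.0 if it is absent."""
    quantity = composition.get(component_name, 0.0)
    if not normalize:
        return quantity
    total = sum(composition.values())
    # Guard against an empty or all-zero composition (assumption of this sketch).
    return quantity / total if total else 0.0


composition = {"spam": 4.0, "eggs": 1.0}
print(component_quantity(composition, "spam"))                  # 4.0
print(component_quantity(composition, "spam", normalize=True))  # 0.8
print(component_quantity(composition, "ham"))                   # 0.0
```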
- class citrine.gemtables.columns.CompositionSortOrder(value)
Bases:
BaseEnumeration
Order to use when sorting the components in a composition.
ALPHABETICAL
Alphanumeric order by component name
QUANTITY
Order from largest to smallest quantity, with ties broken alphabetically
- ALPHABETICAL = 'alphabetical'
- QUANTITY = 'quantity'
- class citrine.gemtables.columns.ConcatColumn(*, data_source: str | Variable, subcolumn: Column)
Bases:
Serializable[ConcatColumn], Column
Column that concatenates multiple values produced by a list- or set-valued variable.
The input subcolumn need not exist elsewhere in the table config, and its parameters have no bearing on how the table is constructed. Only the type of column is relevant. That a complete Column object is required is simply a limitation of the current API.
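- Parameters:
data_source (Union[str, Variable]) – name of the variable to use when populating the column
subcolumn (Column) – a column of the type of the individual values to be concatenated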
- classmethod build(data: dict) Self
Build an instance of this object from given data.
- dump() dict
Dump this instance.
- classmethod get_type(data) Type[Serializable]
Return the subtype.
- data_source = None
- subcolumn = None
- typ = 'concat_column'
- class citrine.gemtables.columns.FlatCompositionColumn(*, data_source: str | Variable, sort_order: CompositionSortOrder)
Bases:
Serializable[FlatCompositionColumn], Column
Column that flattens the composition into a string of names and quantities.
The numeric formatting tries to be human readable. For example, if all of the quantities are round numbers like
{"spam": 4.0, "eggs": 1.0}
then the result omits the decimal points, like
"(spam)4(eggs)1"
(if sort_order is by quantity).
- Parameters:
data_source (Union[str, Variable]) – name of the variable to use when populating the column
sort_order (CompositionSortOrder) – order with which to sort the components when generating the flat string
- classmethod build(data: dict) Self
Build an instance of this object from given data.
- dump() dict
Dump this instance.
- classmethod get_type(data) Type[Serializable]
Return the subtype.
- data_source = None
- sort_order = None
- typ = 'flat_composition_column'
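The flattening described for FlatCompositionColumn, together with the two CompositionSortOrder values, can be sketched as follows. This is a hypothetical illustration rather than the library's code. It reproduces the "(spam)4(eggs)1" example from this page, while the formatting of non-round quantities (plain str of the float) is an assumption, since the exact rule is not stated here.

```python
from typing import Mapping


def flatten_composition(composition: Mapping[str, float], sort_order: str) -> str:
    """Flatten a composition into a string of names and quantities."""
    if sort_order == "alphabetical":
        items = sorted(composition.items(), key=lambda kv: kv[0])
    elif sort_order == "quantity":
        # Largest quantity first; ties broken alphabetically by name.
        items = sorted(composition.items(), key=lambda kv: (-kv[1], kv[0]))
    else:
        raise ValueError(f"unknown sort order: {sort_order!r}")

    # Omit decimal points only when every quantity is a round number.
    all_round = all(float(quantity).is_integer() for _, quantity in items)

    def fmt(quantity: float) -> str:
        # Formatting of non-round values is an assumption of this sketch.
        return str(int(quantity)) if all_round else str(quantity)

    return "".join(f"({name}){fmt(quantity)}" for name, quantity in items)


composition = {"spam": 4.0, "eggs": 1.0}
print(flatten_composition(composition, "quantity"))      # (spam)4(eggs)1
print(flatten_composition(composition, "alphabetical"))  # (eggs)1(spam)4
```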
- class citrine.gemtables.columns.IdentityColumn(*, data_source: str | Variable)
Bases:
Serializable[IdentityColumn], Column
Column containing the value of a string-valued variable.
- Parameters:
data_source (Union[str, Variable]) – name of the variable to use when populating the column
- classmethod build(data: dict) Self
Build an instance of this object from given data.
- dump() dict
Dump this instance.
- classmethod get_type(data) Type[Serializable]
Return the subtype.
- data_source = None
- typ = 'identity_column'
- class citrine.gemtables.columns.MeanColumn(*, data_source: str | Variable, target_units: str | None = None)
Bases:
Serializable[MeanColumn], Column
Column containing the mean of a real-valued variable.
- Parameters:
data_source (Union[str, Variable]) – name of the variable to use when populating the column
target_units (Optional[str]) – units to convert the real variable into. If not specified:
    - If there is an OriginalUnitsColumnDefinition for that source, no conversion will be made.
    - If not, the real variable will be converted by using the default_units from the associated template.
- classmethod build(data: dict) Self
Build an instance of this object from given data.
- dump() dict
Dump this instance.
- classmethod get_type(data) Type[Serializable]
Return the subtype.
- data_source = None
- target_units = None
- typ = 'mean_column'
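The target_units rule above (shared by MeanColumn, QuantileColumn, and StdDevColumn) amounts to a three-way decision. The sketch below only restates that rule; the function name, its arguments, and the use of None to mean "no conversion" are hypothetical and not part of the citrine API.

```python
from typing import Optional


def resolve_units(
    target_units: Optional[str],
    has_original_units_column: bool,
    template_default_units: str,
) -> Optional[str]:
    """Return the units to convert into, or None when no conversion is made."""
    if target_units is not None:
        # Explicit target units always win.
        return target_units
    if has_original_units_column:
        # An original-units column exists for this source: keep values as entered.
        return None
    # Otherwise fall back to the default units of the associated template.
    return template_default_units


print(resolve_units("MPa", True, "Pa"))   # MPa
print(resolve_units(None, True, "Pa"))    # None
print(resolve_units(None, False, "Pa"))   # Pa
```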
- class citrine.gemtables.columns.MolecularStructureColumn(*, data_source: str | Variable, format: ChemicalDisplayFormat)
Bases:
Serializable[MolecularStructureColumn], Column
Column containing a representation of a molecular structure.
- Parameters:
data_source (Union[str, Variable]) – name of the variable to use when populating the column
format (ChemicalDisplayFormat) – the format in which to display the molecular structure
- classmethod build(data: dict) Self
Build an instance of this object from given data.
- dump() dict
Dump this instance.
- classmethod get_type(data) Type[Serializable]
Return the subtype.
- data_source = None
- format = None
- typ = 'molecular_structure_column'
- class citrine.gemtables.columns.MostLikelyCategoryColumn(*, data_source: str | Variable)
Bases:
Serializable[MostLikelyCategoryColumn], Column
Column containing the most likely category.
- Parameters:
data_source (Union[str, Variable]) – name of the variable to use when populating the column
- classmethod build(data: dict) Self
Build an instance of this object from given data.
- dump() dict
Dump this instance.
- classmethod get_type(data) Type[Serializable]
Return the subtype.
- data_source = None
- typ = 'most_likely_category_column'
- class citrine.gemtables.columns.MostLikelyProbabilityColumn(*, data_source: str | Variable)
Bases:
Serializable[MostLikelyProbabilityColumn], Column
Column containing the probability of the most likely category.
- Parameters:
data_source (Union[str, Variable]) – name of the variable to use when populating the column
- classmethod build(data: dict) Self
Build an instance of this object from given data.
- dump() dict
Dump this instance.
- classmethod get_type(data) Type[Serializable]
Return the subtype.
- data_source = None
- typ = 'most_likely_probability_column'
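MostLikelyCategoryColumn and MostLikelyProbabilityColumn report two halves of the same lookup. A minimal sketch, assuming the categorical variable is available as a mapping from category name to probability, is shown below. The function name and the alphabetical tie-breaking rule are assumptions of this sketch and are not stated on this page.

```python
from typing import Mapping, Tuple


def most_likely(probabilities: Mapping[str, float]) -> Tuple[str, float]:
    """Return the most likely category and its probability."""
    # Highest probability first; ties go to the alphabetically first
    # category (an assumption of this sketch).
    category = min(probabilities, key=lambda name: (-probabilities[name], name))
    return category, probabilities[category]


category, probability = most_likely({"solid": 0.7, "liquid": 0.2, "gas": 0.1})
print(category)     # solid
print(probability)  # 0.7
```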
- class citrine.gemtables.columns.NthBiggestComponentNameColumn(*, data_source: str | Variable, n: int)
Bases:
Serializable[NthBiggestComponentNameColumn], Column
Name of the Nth biggest component.
If there are fewer than N components in the composition, then this column will be empty.
- Parameters:
data_source (Union[str, Variable]) – name of the variable to use when populating the column
n (int) – index of the component name to extract, starting with 1 for the biggest
- classmethod build(data: dict) Self
Build an instance of this object from given data.
- dump() dict
Dump this instance.
- classmethod get_type(data) Type[Serializable]
Return the subtype.
- data_source = None
- n = None
- typ = 'biggest_component_name_column'
- class citrine.gemtables.columns.NthBiggestComponentQuantityColumn(*, data_source: str | Variable, n: int, normalize: bool = False)
Bases:
Serializable[NthBiggestComponentQuantityColumn], Column
Quantity of the Nth biggest component.
If there are fewer than N components in the composition, then this column will be empty.
- Parameters:
data_source (Union[str, Variable]) – name of the variable to use when populating the column
n (int) – index of the component quantity to extract, starting with 1 for the biggest
normalize (bool) – whether to normalize the quantity by the sum of all component amounts. Defaults to False.
- classmethod build(data: dict) Self
Build an instance of this object from given data.
- dump() dict
Dump this instance.
- classmethod get_type(data) Type[Serializable]
Return the subtype.
- data_source = None
- n = None
- normalize = None
- typ = 'biggest_component_quantity_column'
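The 1-based indexing and the empty result for short compositions, as described for NthBiggestComponentNameColumn and NthBiggestComponentQuantityColumn, can be sketched together. The function is hypothetical, not the library's implementation. Representing an empty cell as None and breaking ties alphabetically are assumptions of this sketch.

```python
from typing import Mapping, Optional, Tuple


def nth_biggest(
    composition: Mapping[str, float],
    n: int,
    normalize: bool = False,
) -> Optional[Tuple[str, float]]:
    """Return (name, quantity) of the Nth biggest component, or None."""
    if n < 1:
        raise ValueError("n starts at 1 for the biggest component")
    # Largest quantity first; ties broken alphabetically (assumption).
    ranked = sorted(composition.items(), key=lambda kv: (-kv[1], kv[0]))
    if n > len(ranked):
        # Fewer than N components: the column would be empty.
        return None
    name, quantity = ranked[n - 1]
    if normalize:
        total = sum(composition.values())
        quantity = quantity / total if total else 0.0
    return name, quantity


composition = {"spam": 4.0, "eggs": 1.0}
print(nth_biggest(composition, 1))                  # ('spam', 4.0)
print(nth_biggest(composition, 2, normalize=True))  # ('eggs', 0.2)
print(nth_biggest(composition, 3))                  # None
```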
- class citrine.gemtables.columns.OriginalUnitsColumn(*, data_source: str | Variable)
Bases:
Serializable[OriginalUnitsColumn], Column
Column containing the units as entered in the source data.
- Parameters:
data_source (Union[str, Variable]) – name of the variable to use when populating the column
- classmethod build(data: dict) Self
Build an instance of this object from given data.
- dump() dict
Dump this instance.
- classmethod get_type(data) Type[Serializable]
Return the subtype.
- data_source = None
- typ = 'original_units_column'
- class citrine.gemtables.columns.QuantileColumn(*, data_source: str | Variable, quantile: float, target_units: str | None = None)
Bases:
Serializable[QuantileColumn], Column
Column containing a quantile of the variable.
The column is populated with the quantile function of the distribution evaluated at “quantile”. For example, for a uniform distribution parameterized by a lower and upper bound, the value in the column would be:
\[\mathrm{lower} + (\mathrm{upper} - \mathrm{lower}) \cdot \mathrm{quantile}\]
while for a normal distribution parameterized by a mean and stddev, the value would be:
\[\mathrm{mean} + \mathrm{stddev} \cdot \sqrt{2} \cdot \operatorname{erf}^{-1}(2 \cdot \mathrm{quantile} - 1)\]
- Parameters:
data_source (Union[str, Variable]) – name of the variable to use when populating the column
quantile (float) – the quantile to use for the column, defined between 0.0 and 1.0
target_units (Optional[str]) – units to convert the real variable into. If not specified:
    - If there is an OriginalUnitsColumnDefinition for that source, no conversion will be made.
    - If not, the real variable will be converted by using the default_units from the associated template.
- classmethod build(data: dict) Self
Build an instance of this object from given data.
- dump() dict
Dump this instance.
- classmethod get_type(data) Type[Serializable]
Return the subtype.
- data_source = None
- quantile = None
- target_units = None
- typ = 'quantile_column'
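The two quantile formulas given for QuantileColumn can be checked numerically with the Python standard library alone. This is an illustrative sketch, not the library's implementation, and the function names are hypothetical. It relies on the identity that the standard normal inverse CDF equals sqrt(2) * erf^{-1}(2 * quantile - 1), so statistics.NormalDist().inv_cdf stands in for the inverse error function term.

```python
from statistics import NormalDist


def uniform_quantile(lower: float, upper: float, quantile: float) -> float:
    """Quantile function of a uniform distribution on [lower, upper]."""
    return lower + (upper - lower) * quantile


def normal_quantile(mean: float, stddev: float, quantile: float) -> float:
    """Quantile function of a normal distribution.

    NormalDist().inv_cdf(q) equals sqrt(2) * erfinv(2q - 1); it only
    accepts quantiles strictly between 0.0 and 1.0.
    """
    return mean + stddev * NormalDist().inv_cdf(quantile)


print(uniform_quantile(0.0, 10.0, 0.25))  # 2.5
print(normal_quantile(10.0, 2.0, 0.5))    # the median equals the mean
print(normal_quantile(0.0, 1.0, 0.975))   # roughly 1.96
```

A quantile of 0.5 gives the median, and for a symmetric distribution such as the normal this coincides with the value a MeanColumn would report.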
- class citrine.gemtables.columns.StdDevColumn(*, data_source: str | Variable, target_units: str | None = None)
Bases:
Serializable[StdDevColumn], Column
Column containing the standard deviation of a real-valued variable.
- Parameters:
data_source (Union[str, Variable]) – name of the variable to use when populating the column
target_units (Optional[str]) – units to convert the real variable into. If not specified:
    - If there is an OriginalUnitsColumnDefinition for that source, no conversion will be made.
    - If not, the real variable will be converted by using the default_units from the associated template.
- classmethod build(data: dict) Self
Build an instance of this object from given data.
- dump() dict
Dump this instance.
- classmethod get_type(data) Type[Serializable]
Return the subtype.
- data_source = None
- target_units = None
- typ = 'std_dev_column'