citrine.resources.predictor module

Resources that represent collections of predictors.

class citrine.resources.predictor.AsyncDefaultPredictor

Bases: Resource[AsyncDefaultPredictor]

Return type for async default predictor generation and retrieval.

access_control_dict() dict

Return an access control entity representation of this resource. Internal use only.

classmethod build(data: dict) Self

Build an instance of this object from given data.

dump() dict

Dump this instance.

predictor = None

The generated predictor, or None until generation has succeeded.

Type:

Optional[GraphPredictor]

status = None

short description of the resource’s status

Type:

str

status_detail = []

a list of structured status info, containing the message and level

Type:

List[StatusDetail]

uid = None

Citrine Platform unique identifier for this task.

Type:

UUID

class citrine.resources.predictor.AutoConfigureMode(value)

Bases: BaseEnumeration

The format to use in building auto-configured assets.

  • PLAIN corresponds to a single-row GEM table and plain predictor

  • FORMULATION corresponds to a multi-row GEM table and formulations predictor

  • INFER auto-detects the GEM table and predictor type

FORMULATION = 'FORMULATION'
INFER = 'INFER'
PLAIN = 'PLAIN'

class citrine.resources.predictor.PredictorCollection(project_id: UUID, session: Session)

Bases: Collection[GraphPredictor]

Represents the collection of all predictors for a project.

Parameters:

project_id (UUID) – the UUID of the project

archive(uid: UUID | str)

[UNSUPPORTED] Use archive_root or archive_version instead.

archive_root(uid: UUID | str)

Archive a root predictor.

Parameters:

uid (Union[UUID, str]) – Unique identifier of the predictor to archive.

archive_version(uid: UUID | str, *, version: int | str) GraphPredictor

Archive a predictor version.

build(data: dict) GraphPredictor

Build an individual Predictor.

check_for_update(uid: UUID | str) GraphPredictor | None

Check if there are updates available for a predictor.

Typically these are updates to the training data. For example, a GEM Table may have been re-built to include additional rows.

This check does not update the predictor; it only returns the update that is available. To perform the update, pass the returned predictor to PredictorCollection.update.

Parameters:

uid (Union[UUID, str]) – Unique identifier of the predictor to check

Returns:

The update, if an update is available; None otherwise.

Return type:

Optional[GraphPredictor]
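
The check-then-update flow described above can be sketched as a small helper. This is a minimal illustration, not part of the library: the helper name apply_update_if_available is hypothetical, and predictors is assumed to be a PredictorCollection (or any object exposing the check_for_update and update methods documented here).

```python
def apply_update_if_available(predictors, uid, train=True):
    """Apply a pending update to a predictor, if one exists.

    `predictors` is assumed to behave like a PredictorCollection.
    Returns the updated predictor, or None when no update is available.
    """
    # check_for_update only reports the available update; it changes nothing.
    update = predictors.check_for_update(uid)
    if update is None:
        return None
    # Passing the returned predictor to update() performs the actual update.
    return predictors.update(update, train=train)
```

Returning None when nothing changed lets the caller distinguish "already up to date" from "updated" without a second call.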

create_default(*, training_data: DataSource, pattern: str | AutoConfigureMode = AutoConfigureMode.INFER, prefer_valid: bool = True) GraphPredictor

Create a default predictor for some training data.

This method will return an unregistered predictor generated by inspecting the training data and attempting to automatically configure the predictor.

The configuration generated with the AutoConfigureMode.PLAIN pattern includes featurizers for chemical formulas and molecular structures, and AutoMLPredictors for any variables identified as responses in the training data. The configuration generated with the AutoConfigureMode.FORMULATION pattern includes these same components, as well as a SimpleMixturePredictor, LabelFractionsPredictor, IngredientFractionsPredictor, and a series of MeanPropertyPredictors to handle featurization of formulation quantities and ingredient properties. The AutoConfigureMode.INFER pattern chooses an appropriate mode based on whether the data source contains formulations data.

Parameters:
  • training_data (DataSource) – The data to configure the predictor to model.

  • pattern (AutoConfigureMode or str) – The predictor pattern to use, either “PLAIN”, “FORMULATION”, or “INFER”. The “INFER” pattern auto-detects whether the DataSource contains formulations data or not. If it does, then a formulation predictor is created. If not, then a plain predictor is created.

  • prefer_valid (Boolean) – If True, enables filtering of sparse descriptors and trimming of excess graph components in an attempt to return a default configuration that will pass validation. Default: True.

Returns:

Automatically configured predictor for the training data

Return type:

GraphPredictor
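
Because create_default returns an unregistered predictor, a typical next step is to pass the result to register. The sketch below shows that sequence under stated assumptions: the helper name create_and_register_default is hypothetical, predictors is assumed to be a PredictorCollection, and training_data is assumed to be an already-constructed DataSource.

```python
def create_and_register_default(predictors, training_data, pattern="INFER", train=True):
    """Generate a default predictor for the training data and register it.

    `predictors` is assumed to behave like a PredictorCollection and
    `training_data` like a DataSource; neither is constructed here.
    """
    # create_default inspects the training data and returns an
    # unregistered predictor; nothing is stored on the platform yet.
    candidate = predictors.create_default(
        training_data=training_data,
        pattern=pattern,       # "PLAIN", "FORMULATION", or "INFER"
        prefer_valid=True,
    )
    # register() stores the predictor and, if train is True, starts training.
    return predictors.register(candidate, train=train)
```

Keeping generation and registration as separate calls leaves room to inspect or adjust the generated configuration before it is registered.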

create_default_async(*, training_data: DataSource, pattern: str | AutoConfigureMode = AutoConfigureMode.INFER, prefer_valid: bool = True) AsyncDefaultPredictor

Similar to PredictorCollection.create_default, except asynchronous.

This begins a long-running task to generate the predictor. The returned object contains an ID which can be used to track its status and get the resulting predictor once complete. PredictorCollection.get_default_async is intended for that purpose.

See PredictorCollection.create_default for more details on the generation process and parameter specifics.

Parameters:
  • training_data (DataSource) – The data to configure the predictor to model.

  • pattern (AutoConfigureMode or str) – The predictor pattern to use, either “PLAIN”, “FORMULATION”, or “INFER”. The “INFER” pattern auto-detects whether the DataSource contains formulations data or not. If it does, then a formulation predictor is created. If not, then a plain predictor is created.

  • prefer_valid (Boolean) – If True, enables filtering of sparse descriptors and trimming of excess graph components in an attempt to return a default configuration that will pass validation. Default: True.

Returns:

Information on the long-running default predictor generation task.

Return type:

AsyncDefaultPredictor

delete(uid: UUID | str)

Predictors cannot be deleted at this time.

get(uid: UUID | str, *, version: int | str = 'most_recent') GraphPredictor

Get a predictor by ID and (optionally) version.

If version is omitted, the most recent version will be retrieved.

get_default_async(*, task_id: UUID | str) AsyncDefaultPredictor

Get the current async default predictor generation result.

The status field indicates whether the task is INPROGRESS, SUCCEEDED, or FAILED. While it is INPROGRESS, the predictor field is None. Once it has SUCCEEDED, the predictor field is populated with a GraphPredictor, which can then be registered to the platform. If it has FAILED, see the status_detail field for more information on what went wrong.
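
One way to use the status values described above is a simple polling loop. This is a sketch under assumptions, not a library utility: the helper name wait_for_default_predictor, the interval and timeout defaults, and the injectable sleep parameter are all hypothetical. It relies only on the status, predictor, and status_detail fields documented for AsyncDefaultPredictor, and compares status against the strings listed above.

```python
import time


def wait_for_default_predictor(predictors, task_id, *, interval=5.0,
                               timeout=600.0, sleep=time.sleep):
    """Poll get_default_async until the generation task finishes.

    `predictors` is assumed to behave like a PredictorCollection.
    Returns the generated predictor, raises RuntimeError if the task
    failed, and raises TimeoutError if it is still running at the deadline.
    """
    deadline = time.monotonic() + timeout
    while True:
        task = predictors.get_default_async(task_id=task_id)
        if task.status == "SUCCEEDED":
            # The predictor field is populated once the task has succeeded.
            return task.predictor
        if task.status == "FAILED":
            # status_detail carries the structured failure information.
            raise RuntimeError(
                f"Default predictor generation failed: {task.status_detail!r}"
            )
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Task {task_id} still running after {timeout} s")
        sleep(interval)
```

The task_id would typically be the uid of the AsyncDefaultPredictor returned by create_default_async. The returned predictor is still unregistered and can be passed to register.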

get_featurized_training_data(uid: UUID | str, *, version: int | str = 'most_recent') List[HierarchicalDesignMaterial]

Retrieve a list of featurized materials for a trained predictor.

Featurized materials contain the input variables found in the training data source along with any internal features generated by the predictor while training. If not available, retraining the predictor will generate new featurized data.

Parameters:
  • uid (UUID) – the UUID of the predictor

  • version (str) – the version of the predictor (if omitted, the most recent will be used)

Returns:

A list of featurized materials, formatted as design materials

Return type:

List[HierarchicalDesignMaterial]

is_stale(uid: UUID | str, *, version: int | str) bool

Returns True if a predictor is stale, False otherwise.

A predictor is stale if it’s in the READY state, but the platform cannot load the previously trained object.

list(*, per_page: int = 20) Iterable[GraphPredictor]

List the most recent version of all non-archived predictors.

list_all(*, per_page: int = 20) Iterable[GraphPredictor]

List the most recent version of all predictors.

list_archived(*, per_page: int = 20) Iterable[GraphPredictor]

List the most recent version of all archived predictors.

list_archived_versions(uid: UUID | str | None = None, *, per_page: int = 20) Iterable[GraphPredictor]

List all archived versions of the given Predictor.

list_versions(uid: UUID | str | None = None, *, per_page: int = 100) Iterable[GraphPredictor]

List all non-archived versions of the given Predictor.

register(predictor: GraphPredictor, *, train: bool = True) GraphPredictor

Register and optionally train a Predictor.

This predictor will be version 1, and its draft flag will be True. If train is True and training completes successfully, the draft flag will be set to False. Otherwise, it will remain True.

rename(uid: UUID | str, *, version: int | str, name: str | None = None, description: str | None = None) GraphPredictor

Rename an existing predictor.

Both the name and description can be changed. This does not trigger retraining. The version may be any existing version of the predictor, or “most_recent”.

restore(uid: UUID | str)

[UNSUPPORTED] Use restore_root or restore_version instead.

restore_root(uid: UUID | str)

Restore an archived root predictor.

Parameters:

uid (Union[UUID, str]) – Unique identifier of the predictor to restore.

restore_version(uid: UUID | str, *, version: int | str) GraphPredictor

Restore a predictor version.

retrain_stale(uid: UUID | str, *, version: int | str) GraphPredictor

Begins retraining a stale predictor.

This can only be used on a stale predictor, meaning one that is in the READY state but whose previously trained object cannot be loaded by the platform. Using it on a non-stale predictor will result in an error.
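
Since retrain_stale raises an error for a non-stale predictor, it can be guarded with is_stale. The following is a minimal sketch: the helper name retrain_if_stale is hypothetical, and predictors is assumed to be a PredictorCollection.

```python
def retrain_if_stale(predictors, uid, version):
    """Retrain a predictor version only when the platform reports it stale.

    `predictors` is assumed to behave like a PredictorCollection.
    Returns the predictor being retrained, or None if it was not stale.
    """
    # Checking first avoids the error retrain_stale raises for a
    # predictor that is not stale.
    if predictors.is_stale(uid, version=version):
        return predictors.retrain_stale(uid, version=version)
    return None
```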

root_is_archived(uid: UUID | str) bool

Determine if the predictor root is archived.

Parameters:

uid (Union[UUID, str]) – Unique identifier of the predictor to check.

train(uid: UUID | str) GraphPredictor

Train a predictor.

If the predictor is not a draft, a new draft version is created as a copy of the current version, and that new version is trained. Either way, if training completes successfully, the predictor will no longer be a draft.

update(predictor: GraphPredictor, *, train: bool = True) GraphPredictor

Update and optionally train a Predictor.

If the predictor is a draft, this will overwrite its contents. If it’s not a draft, a new version will be created with the update.

In either case, training will begin after the update if train is True. If training completes successfully, the predictor will no longer be a draft.