citrine.informatics.predictor_evaluator module

class citrine.informatics.predictor_evaluator.CrossValidationEvaluator(name: str, *, description: str = '', responses: Set[str], n_folds: int = 5, n_trials: int = 3, metrics: Set[PredictorEvaluationMetric] | None = None, ignore_when_grouping: Set[str] | None = None)

Bases: Serializable[CrossValidationEvaluator], PredictorEvaluator

Evaluate a predictor via cross validation.

Performs cross-validation on requested predictor responses and computes the requested metrics on each response. For a discussion of how many folds and trials to use, please see the documentation.

In addition to a name, a set of responses to validate, the number of trials and folds, and the metrics to compute, this evaluator defines a set of descriptor keys to ignore when grouping. Candidates with different values for the ignored keys and identical values for all other predictor inputs will be placed in the same fold. For example, if you are baking cakes with different ingredients and different oven temperatures and want to group the data by the ingredients, you can set ignore_when_grouping={"oven temperature"}. That way, two recipes that differ only in their oven temperature will always end up in the same fold.

Parameters:
  • name (str) – Name of the evaluator

  • description (str) – Description of the evaluator

  • responses (Set[str]) – Set of descriptor keys to evaluate

  • n_folds (int) – Number of cross-validation folds

  • n_trials (int) – Number of cross-validation trials; each trial contains n_folds folds

  • metrics (Optional[Set[PredictorEvaluationMetric]]) – Optional set of metrics to compute for each response. Default is all metrics.

  • ignore_when_grouping (Optional[Set[str]]) – Set of descriptor keys to ignore when grouping. Candidates with different values for the given keys and identical values for all other descriptors will be placed in the same group.
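
A minimal construction sketch; the descriptor keys below are hypothetical and stand in for keys from your own predictor. Omitting metrics computes all available metrics, per the default noted above.

    from citrine.informatics.predictor_evaluator import CrossValidationEvaluator

    # Hypothetical descriptor keys for illustration only.
    evaluator = CrossValidationEvaluator(
        name="cake cross-validation",
        description="5-fold, 3-trial cross-validation of cake properties",
        responses={"cake tastiness", "cake density"},
        n_folds=5,
        n_trials=3,
        ignore_when_grouping={"oven temperature"},  # recipes differing only here share a fold
    )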

classmethod build(data: dict) Self

Build an instance of this object from given data.

dump() dict

Dump this instance.

classmethod get_type(data) Type[Serializable]

Return the subtype.
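
A round-trip sketch, assuming evaluator is the CrossValidationEvaluator constructed above and that build() accepts the dictionary produced by dump():

    data = evaluator.dump()                         # plain-dict representation
    rebuilt = CrossValidationEvaluator.build(data)  # reconstruct an equivalent evaluator
    assert rebuilt.name == evaluator.name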

description: str = None
ignore_when_grouping: Set[str] | None = None
property metrics: Set[PredictorEvaluationMetric]

Set of metrics computed during cross-validation.

n_folds: int = None
n_trials: int = None
name: str = None
property responses: Set[str]

Set of predictor responses cross-validated by the evaluator.

typ = 'CrossValidationEvaluator'
class citrine.informatics.predictor_evaluator.HoldoutSetEvaluator(name: str, *, description: str = '', responses: Set[str], data_source: DataSource, metrics: Set[PredictorEvaluationMetric] | None = None)

Bases: Serializable[HoldoutSetEvaluator], PredictorEvaluator

Evaluate a predictor using a holdout set.

For each response, the ground-truth values in the holdout set are masked off and the predictor makes predictions. Those predictions are then compared against the ground-truth values using the specified metrics.

Parameters:
  • name (str) – Name of the evaluator

  • description (str) – Description of the evaluator

  • responses (Set[str]) – Set of descriptor keys to evaluate

  • data_source (DataSource) – Source of holdout data

  • metrics (Optional[Set[PredictorEvaluationMetric]]) – Optional set of metrics to compute for each response. Default is all metrics.
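
A minimal construction sketch; holdout_source is a hypothetical, already-built DataSource (for example, a data source from citrine.informatics.data_sources pointing at held-out rows):

    from citrine.informatics.predictor_evaluator import HoldoutSetEvaluator

    # holdout_source is assumed to be a DataSource built elsewhere.
    evaluator = HoldoutSetEvaluator(
        name="holdout evaluation",
        description="compare predictions against held-out measurements",
        responses={"cake tastiness"},
        data_source=holdout_source,
    )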

classmethod build(data: dict) Self

Build an instance of this object from given data.

dump() dict

Dump this instance.

classmethod get_type(data) Type[Serializable]

Return the subtype.

data_source = None
description: str = None
property metrics: Set[PredictorEvaluationMetric]

Set of metrics computed on the predictions.

name: str = None
property responses: Set[str]

Set of responses to predict and compare against the ground-truth values.

typ = 'HoldoutSetEvaluator'
class citrine.informatics.predictor_evaluator.PredictorEvaluator

Bases: PolymorphicSerializable[PredictorEvaluator]

A Citrine Predictor Evaluator computes metrics on a predictor.

classmethod build(data: dict) SelfType

Build the underlying type.

classmethod get_type(data) Type[Serializable]

Return the subtype.
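
A dispatch sketch, assuming the serialized dictionary carries the typ discriminator shown on the subclasses above (cross_validation_evaluator is a hypothetical, previously constructed instance):

    from citrine.informatics.predictor_evaluator import (
        CrossValidationEvaluator,
        PredictorEvaluator,
    )

    data = cross_validation_evaluator.dump()   # contains typ == 'CrossValidationEvaluator'
    rebuilt = PredictorEvaluator.build(data)   # base-class build returns the subtype
    assert isinstance(rebuilt, CrossValidationEvaluator)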

property metrics: Set[PredictorEvaluationMetric]

Metrics to compute for each response.

property name: str

Name of the evaluator.

A name is required by all evaluators because it is used as the top-level key in the results returned by a citrine.informatics.workflows.PredictorEvaluationWorkflow. As such, the names of all evaluators within a single workflow must be unique.
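
A sketch of how unique names are used, assuming a PredictorEvaluationWorkflow that accepts a list of evaluators and keys its results by evaluator name (the evaluator variables are hypothetical instances constructed as above):

    from citrine.informatics.workflows import PredictorEvaluationWorkflow

    # Each evaluator's name becomes the top-level key for its results,
    # so names must be distinct within one workflow.
    workflow = PredictorEvaluationWorkflow(
        name="evaluate cake predictor",
        evaluators=[cross_validation_evaluator, holdout_evaluator],
    )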

property responses: Set[str]

Responses to compute metrics for.