citrine.informatics.predictor_evaluator module
- class citrine.informatics.predictor_evaluator.CrossValidationEvaluator(name: str, *, description: str = '', responses: Set[str], n_folds: int = 5, n_trials: int = 3, metrics: Set[PredictorEvaluationMetric] | None = None, ignore_when_grouping: Set[str] | None = None)
Bases: Serializable[CrossValidationEvaluator], PredictorEvaluator
Evaluate a predictor via cross validation.
Performs cross-validation on requested predictor responses and computes the requested metrics on each response. For a discussion of how many folds and trials to use, please see the documentation.
In addition to a name, a set of responses to validate, the number of trials and folds, and metrics to compute, this evaluator defines a set of descriptor keys to ignore when grouping. Candidates with different values for the ignored keys and identical values for all other predictor inputs will be placed in the same fold. For example, if you are baking cakes with different ingredients and different oven temperatures and want to group the data by ingredients, set ignore_when_grouping={"oven temperature"}. That way, two recipes that differ only in their oven temperature will always end up in the same fold.
- Parameters:
name (str) – Name of the evaluator
description (str) – Description of the evaluator
responses (Set[str]) – Set of descriptor keys to evaluate
n_folds (int) – Number of cross-validation folds
n_trials (int) – Number of cross-validation trials, each containing n_folds folds
metrics (Optional[Set[PredictorEvaluationMetric]]) – Optional set of metrics to compute for each response. Default is all metrics.
ignore_when_grouping (Optional[Set[str]]) – Set of descriptor keys to ignore when grouping. Candidates with different values for the given keys and identical values for all other descriptors will be placed in the same group.
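The grouping rule described above can be illustrated in plain Python. This is a hedged sketch of the *semantics* only, not the library's implementation: the candidate rows, descriptor keys, and the `grouping_key` helper are all hypothetical names invented for this example.

```python
# Hypothetical candidate rows: descriptor key -> value.
candidates = [
    {"flour": "rye",   "sugar": 10, "oven temperature": 350},
    {"flour": "rye",   "sugar": 10, "oven temperature": 400},
    {"flour": "wheat", "sugar": 5,  "oven temperature": 350},
]

def grouping_key(candidate, ignore_when_grouping):
    """Key used to group candidates: every descriptor except the ignored ones."""
    return tuple(sorted(
        (k, v) for k, v in candidate.items() if k not in ignore_when_grouping
    ))

ignore = {"oven temperature"}
keys = [grouping_key(c, ignore) for c in candidates]

# The first two candidates differ only in oven temperature, so they share a
# grouping key and would be assigned to the same cross-validation fold; the
# third differs in its other descriptors and may land in a different fold.
print(keys[0] == keys[1])
print(keys[0] == keys[2])
```

With an empty `ignore_when_grouping` set, every distinct combination of descriptor values forms its own group, which is the default behavior.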
- classmethod build(data: dict) → Self
Build an instance of this object from given data.
- dump() → dict
Dump this instance.
- classmethod get_type(data) → Type[Serializable]
Return the subtype.
- description: str = None
- ignore_when_grouping: Set[str] | None = None
- property metrics: Set[PredictorEvaluationMetric]
Set of metrics computed during cross-validation.
- n_folds: int = None
- n_trials: int = None
- name: str = None
- property responses: Set[str]
Set of predictor responses cross-validated by the evaluator.
- typ = 'CrossValidationEvaluator'
- class citrine.informatics.predictor_evaluator.HoldoutSetEvaluator(name: str, *, description: str = '', responses: Set[str], data_source: DataSource, metrics: Set[PredictorEvaluationMetric] | None = None)
Bases: Serializable[HoldoutSetEvaluator], PredictorEvaluator
Evaluate a predictor using a holdout set.
For each response, the actual values are masked off and the predictor makes predictions. These predictions are compared with the ground-truth values in the holdout set using specified metrics.
- Parameters:
name (str) – Name of the evaluator
description (str) – Description of the evaluator
responses (Set[str]) – Set of descriptor keys to evaluate
data_source (DataSource) – Source of holdout data
metrics (Optional[Set[PredictorEvaluationMetric]]) – Optional set of metrics to compute for each response. Default is all metrics.
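The mask-predict-compare loop described above can be sketched in plain Python. This is an illustration of the evaluation procedure under stated assumptions, not the platform's implementation: the holdout rows, the `predict` stand-in, and the choice of RMSE as the example metric are all hypothetical.

```python
import math

# Hypothetical holdout set: inputs plus ground-truth response values.
holdout = [
    {"x": 1.0, "strength": 2.1},
    {"x": 2.0, "strength": 3.9},
    {"x": 3.0, "strength": 6.2},
]

def predict(inputs):
    """Stand-in predictor; a real Citrine predictor runs on the platform."""
    return 2.0 * inputs["x"]

# Mask off the actual response, predict from the remaining inputs, then
# compare predictions against the ground truth with a metric (RMSE here).
actuals = [row["strength"] for row in holdout]
predictions = [
    predict({k: v for k, v in row.items() if k != "strength"})
    for row in holdout
]

rmse = math.sqrt(
    sum((p - a) ** 2 for p, a in zip(predictions, actuals)) / len(actuals)
)
print(rmse)
```

In the real evaluator, the same comparison is performed for each requested response, using each metric in the `metrics` set.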
- classmethod build(data: dict) → Self
Build an instance of this object from given data.
- dump() → dict
Dump this instance.
- classmethod get_type(data) → Type[Serializable]
Return the subtype.
- data_source = None
- description: str = None
- property metrics: Set[PredictorEvaluationMetric]
Set of metrics computed on the predictions.
- name: str = None
- property responses: Set[str]
Set of responses to predict and compare against the ground-truth values.
- typ = 'HoldoutSetEvaluator'
- class citrine.informatics.predictor_evaluator.PredictorEvaluator
Bases: PolymorphicSerializable[PredictorEvaluator]
A Citrine Predictor Evaluator computes metrics on a predictor.
- classmethod build(data: dict) → SelfType
Build the underlying type.
- classmethod get_type(data) → Type[Serializable]
Return the subtype.
- property metrics: Set[PredictorEvaluationMetric]
Metrics to compute for each response.
- property name: str
Name of the evaluator.
A name is required by all evaluators because it is used as the top-level key in the results returned by a citrine.informatics.workflows.PredictorEvaluationWorkflow. As such, the names of all evaluators within a single workflow must be unique.
- property responses: Set[str]
Responses to compute metrics for.