4.8. Predictor Reports
Training a predictor generally produces a set of interconnected models.
A predictor report describes those models, for example their settings and which features are important to each model.
It does not include predictor evaluation metrics; to learn more about those, please see PredictorEvaluationMetrics.
The report can be accessed via predictor.report.
A task to generate a predictor report is scheduled when a predictor is registered.
The report has status and json member variables.
Status can be one of:

PENDING: The report has been scheduled.
ERROR: An error was thrown while generating the report.
OK: The report was generated successfully and the results are ready.
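For instance, the status can be checked before reading any results.
A minimal sketch (project and predictor are assumed to exist, as in the full example at the end of this section):

report = project.predictors.get(predictor.uid).report
if report.status == 'OK':
    print('report is ready')  # results can be read
elif report.status == 'ERROR':
    print('report generation failed')
else:
    print('report is still pending')  # check again later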
Results of the report are in the descriptors and model_summaries attributes.
descriptors is a list of Descriptor objects that may be inputs or outputs to models in the predictor.
model_summaries is a list of ModelSummary objects, each one corresponding to a single model in the predictor.
Each ModelSummary includes the name of the model, a list of input descriptors, a list of output descriptors, the model's settings, and its feature importances.
model_settings is a dictionary of settings and values, the details of which depend on the type of model.
One possible model settings dictionary is shown below:
{
    'Algorithm': 'Ensemble of non-linear estimators',
    'Number of estimators': 64,
    'Use jackknife': True
}
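To make this structure concrete, the sketch below walks a finished report and prints each model's name and settings.
It assumes report is a completed report (status 'OK') retrieved as shown above:

for summary in report.model_summaries:
    print(summary.name)
    # model_settings maps setting names to their values
    for setting, value in summary.model_settings.items():
        print('  {}: {}'.format(setting, value))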
feature_importances is a list of FeatureImportanceReport objects, each of which corresponds to a single output of the model.
Each report has two fields: output_key, which is the key of the output descriptor, and importances, which is a dictionary from input keys to their importance.
The input and output keys correspond to descriptors that can be found in the predictor report's descriptors field.
An example is shown below:
{
    'output_key': 'shear modulus',
    'importances': {
        "Young's modulus": 0.85,
        "Poisson's ratio": 0.15
    }
}
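Because importances is a plain dictionary, inputs can, for example, be ranked by importance for each output.
A sketch, again assuming a completed report:

for summary in report.model_summaries:
    for feature_importance in summary.feature_importances:
        print('importances for {}:'.format(feature_importance.output_key))
        # sort input keys from most to least important
        ranked = sorted(feature_importance.importances.items(),
                        key=lambda pair: pair[1], reverse=True)
        for input_key, importance in ranked:
            print('  {}: {:.2f}'.format(input_key, importance))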
For simple models, such as those that featurize inputs, the model_settings and feature_importances fields might be empty.
As an example, consider an AutoMLPredictor with numeric inputs x and y and a numeric output z.
This predictor will produce a single model to predict z from x and y.
In cases involving multiple ML predictors and/or input featurization, more models will be produced.
The code below shows how to create the predictor, register it, and view the report.
Assume that there is a training data table with a known id and version.
from time import sleep

from citrine.informatics.data_sources import GemTableDataSource
from citrine.informatics.descriptors import RealDescriptor
from citrine.informatics.predictors import AutoMLPredictor, GraphPredictor

# create the input and output descriptors
x = RealDescriptor(key='x', lower_bound=0, upper_bound=10, units='')
y = RealDescriptor(key='y', lower_bound=0, upper_bound=10, units='')
z = RealDescriptor(key='z', lower_bound=0, upper_bound=10, units='')

# create an ML predictor that maps x and y to z
auto_ml_predictor = AutoMLPredictor(
    name='ML predictor for z',
    description='Predicts z from x and y',
    inputs=[x, y],
    outputs=[z],
    training_data=[GemTableDataSource(
        table_id=training_table_id,
        table_version=training_table_version
    )]
)

# register the predictor with a project; this schedules report generation
predictor = project.predictors.register(
    GraphPredictor(
        name='ML predictor for z',
        description='Predicts z from x and y',
        predictors=[auto_ml_predictor]
    )
)

# wait for the predictor report to be ready
while project.predictors.get(predictor.uid).report.status == 'PENDING':
    sleep(10)

# print the raw json report
report = project.predictors.get(predictor.uid).report
print(report.json)
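The raw JSON is convenient for debugging, but the structured attributes described above can be read directly from the same report object.
A sketch, assuming each Descriptor exposes the key it was created with:

# list the keys of all descriptors known to the report
print([descriptor.key for descriptor in report.descriptors])

# summarize each model in the predictor
for summary in report.model_summaries:
    print(summary.name, summary.model_settings)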