abacusai.model_version

Classes

ModelVersion

A version of a model

Module Contents

class abacusai.model_version.ModelVersion(client, modelVersion=None, modelConfigType=None, status=None, modelId=None, modelPredictionConfig=None, trainingStartedAt=None, trainingCompletedAt=None, featureGroupVersions=None, customAlgorithms=None, builtinAlgorithms=None, error=None, pendingDeploymentIds=None, failedDeploymentIds=None, cpuSize=None, memory=None, automlComplete=None, trainingFeatureGroupIds=None, trainingDocumentRetrieverVersions=None, documentRetrieverMappings=None, bestAlgorithm=None, defaultAlgorithm=None, featureAnalysisStatus=None, dataClusterInfo=None, customAlgorithmConfigs=None, trainedModelTypes=None, useGpu=None, partialComplete=None, modelFeatureGroupSchemaMappings=None, trainingConfigUpdated=None, codeSource={}, modelConfig={}, deployableAlgorithms={})

Bases: abacusai.return_class.AbstractApiClass

A version of a model

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • modelVersion (str) – The unique identifier of a model version.

  • modelConfigType (str) – Name of the TrainingConfig class of the model_config.

  • status (str) – The current status of the model.

  • modelId (str) – A reference to the model this version belongs to.

  • modelPredictionConfig (dict) – The prediction config options for the model.

  • trainingStartedAt (str) – The start time and date of the training process in ISO-8601 format.

  • trainingCompletedAt (str) – The end time and date of the training process in ISO-8601 format.

  • featureGroupVersions (list) – A list of Feature Group version IDs used for model training.

  • customAlgorithms (list) – List of user-defined algorithms used for model training.

  • builtinAlgorithms (list) – List of names of the built-in algorithms provided by Abacus.AI that were used for model training.

  • error (str) – Relevant error if the status is FAILED.

  • pendingDeploymentIds (list) – List of deployment IDs where deployment is pending.

  • failedDeploymentIds (list) – List of failed deployment IDs.

  • cpuSize (str) – CPU size specified for the python model training.

  • memory (int) – Memory in GB specified for the python model training.

  • automlComplete (bool) – If true, all algorithms have completed training.

  • trainingFeatureGroupIds (list) – The unique identifiers of the feature groups used as inputs during training to create this ModelVersion.

  • trainingDocumentRetrieverVersions (list) – The document retriever version IDs used as inputs during training to create this ModelVersion.

  • documentRetrieverMappings (dict) – Mapping of document retriever versions to their respective information.

  • bestAlgorithm (dict) – Best performing algorithm.

  • defaultAlgorithm (dict) – Default algorithm that the user has selected.

  • featureAnalysisStatus (str) – Lifecycle of the feature analysis stage.

  • dataClusterInfo (dict) – Information about the models for different data clusters.

  • customAlgorithmConfigs (dict) – User-defined configs for each of the user-defined custom algorithms.

  • trainedModelTypes (list) – List of trained model types.

  • useGpu (bool) – Whether this model version uses a GPU.

  • partialComplete (bool) – If true, all required algorithms have completed training.

  • modelFeatureGroupSchemaMappings (dict) – Mapping of feature groups to schema versions.

  • trainingConfigUpdated (bool) – If the training config has been updated since the instance was created.

  • codeSource (CodeSource) – If a python model, information on where the source code is located.

  • modelConfig (TrainingConfig) – The training config options used to train this model.

  • deployableAlgorithms (DeployableAlgorithm) – List of deployable algorithms.
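Because trainingStartedAt and trainingCompletedAt are documented as ISO-8601 strings, the training duration can be derived with the standard library. The sketch below is illustrative only: the helper name and the timestamp values are hypothetical, and the exact ISO-8601 variant returned by the API (for example, whether an offset is included) is an assumption.

```python
from datetime import datetime


def training_duration_seconds(started_at: str, completed_at: str) -> float:
    """Return the elapsed seconds between two ISO-8601 timestamps."""
    start = datetime.fromisoformat(started_at)
    end = datetime.fromisoformat(completed_at)
    # Both values must be either timezone-aware or naive; mixing them raises TypeError.
    return (end - start).total_seconds()


# Hypothetical values standing in for training_started_at / training_completed_at.
started = "2024-01-15T10:00:00+00:00"
completed = "2024-01-15T10:42:30+00:00"
duration = training_duration_seconds(started, completed)
print(duration)
```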

model_version
model_config_type
status
model_id
model_prediction_config
training_started_at
training_completed_at
feature_group_versions
custom_algorithms
builtin_algorithms
error
pending_deployment_ids
failed_deployment_ids
cpu_size
memory
automl_complete
training_feature_group_ids
training_document_retriever_versions
document_retriever_mappings
best_algorithm
default_algorithm
feature_analysis_status
data_cluster_info
custom_algorithm_configs
trained_model_types
use_gpu
partial_complete
model_feature_group_schema_mappings
training_config_updated
code_source
model_config
deployable_algorithms
deprecated_keys
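Most of the attribute names listed above are the snake_case counterparts of the camelCase constructor arguments. A minimal sketch of that naming correspondence follows; camel_to_snake is a hypothetical helper written for illustration, not a function of the abacusai package, and it only handles simple names like the ones in this class.

```python
import re


def camel_to_snake(name: str) -> str:
    """Insert an underscore before each interior capital letter, then lowercase."""
    return re.sub(r"(?<!^)(?=[A-Z])", "_", name).lower()


# A few constructor argument names taken from the signature above.
examples = ["modelVersion", "trainingStartedAt", "useGpu", "error"]
converted = [camel_to_snake(name) for name in examples]
print(converted)
```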
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

describe_train_test_data_split_feature_group_version()

Get the train and test data split for a trained model by model version. This is only supported for models with custom algorithms.

Parameters:

model_version (str) – The unique version ID of the model version.

Returns:

The feature group version containing the training data and folds information.

Return type:

FeatureGroupVersion

set_model_objective(metric=None)

Sets the best model for all model instances of the model based on the specified metric, and updates the training configuration to use the specified metric for any future model versions.

If metric is None, the default selection is used.

Parameters:

metric (str) – The metric to use to determine the best model.

get_feature_group_schemas_for()

Gets the schema (including feature mappings) for all feature groups used in the model version.

Parameters:

model_version (str) – Unique string identifier for the version of the model.

Returns:

List of schema for all feature groups used in the model version.

Return type:

list[ModelVersionFeatureGroupSchema]

delete()

Deletes the specified model version. Model versions which are currently used in deployments cannot be deleted.

Parameters:

model_version (str) – The unique identifier of the model version to delete.

export_model_artifact_as_feature_group(table_name, artifact_type=None)

Exports metric artifact data for a model as a feature group.

Parameters:
  • table_name (str) – Name of the feature group table to create.

  • artifact_type (EvalArtifactType) – The evaluation artifact type to export.

Returns:

The created feature group.

Return type:

FeatureGroup

refresh()

Calls describe and refreshes the current object’s fields.

Returns:

The current object

Return type:

ModelVersion
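The refresh pattern described above (call describe, copy the fresh fields onto the current object, return the current object) can be sketched with a stand-in class. Record, its fetch callable, and the status strings are all hypothetical; this illustrates the pattern and is not the library's implementation.

```python
class Record:
    """Hypothetical stand-in for an API object; not the abacusai class."""

    def __init__(self, fetch, status=None):
        self._fetch = fetch  # callable returning the latest field values as a dict
        self.status = status

    def describe(self):
        # Build a fresh object from the latest remote state.
        return Record(self._fetch, **self._fetch())

    def refresh(self):
        # Copy the fresh object's fields onto this instance and return this instance.
        self.__dict__.update(self.describe().__dict__)
        return self


# Simulated remote states returned by two successive describe calls.
states = iter([{"status": "TRAINING"}, {"status": "COMPLETE"}])
record = Record(lambda: next(states), status="PENDING")

returned = record.refresh()
first = returned.status
second = record.refresh().status
```

Returning the same instance lets callers chain the call, as in record.refresh().status, while any other references to the object also see the updated fields.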

describe()

Retrieves a full description of the specified model version.

Parameters:

model_version (str) – Unique string identifier of the model version.

Returns:

A model version.

Return type:

ModelVersion

get_feature_importance_by()

Gets the feature importance calculated by various methods for the model.

Parameters:

model_version (str) – Unique string identifier for the model version.

Returns:

Feature importances for the model.

Return type:

FeatureImportance

get_training_data_logs()

Retrieves the data preparation logs generated during model training.

Parameters:

model_version (str) – The unique version ID of the model version.

Returns:

A list of logs.

Return type:

list[DataPrepLogs]

get_training_logs(stdout=False, stderr=False)

Returns training logs for the model.

Parameters:
  • stdout (bool) – Set True to get info logs.

  • stderr (bool) – Set True to get error logs.

Returns:

A function logs object.

Return type:

FunctionLogs

export_custom(output_location, algorithm=None)

Bundles custom model artifacts into a zip file and exports it to the specified location.

Parameters:
  • output_location (str) – Location to export the model artifacts to. For example, s3://a-bucket/

  • algorithm (str) – The algorithm to be exported. Optional if there’s only one custom algorithm in the model version.

Returns:

Object describing the export and its status.

Return type:

ModelArtifactsExport

wait_for_training(timeout=None)

Waits until model training has finished.

Parameters:

timeout (int) – The maximum time to wait for the call to finish. If it has not finished within the allotted time, the call times out.

wait_for_full_automl(timeout=None)

Waits until the full AutoML cycle has completed.

Parameters:

timeout (int) – The maximum time to wait for the call to finish. If it has not finished within the allotted time, the call times out.
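The two waiting calls above follow a poll-until-done pattern with an optional timeout. The sketch below shows one generic way to write such a loop. It is not the library's implementation: wait_until, the poll interval, the status strings, the choice of TimeoutError, and treating timeout=None as "wait indefinitely" are all assumptions made for illustration.

```python
import time


def wait_until(get_status, done_states, timeout=None, poll_interval=0.01):
    """Poll get_status() until it returns a value in done_states.

    Raises TimeoutError if `timeout` seconds elapse first. With timeout=None
    the loop keeps polling indefinitely.
    """
    deadline = None if timeout is None else time.monotonic() + timeout
    while True:
        status = get_status()
        if status in done_states:
            return status
        if deadline is not None and time.monotonic() >= deadline:
            raise TimeoutError(f"still {status!r} after {timeout} seconds")
        time.sleep(poll_interval)


# Simulated status sequence; a real caller would pass a function that queries the service.
statuses = iter(["PENDING", "TRAINING", "COMPLETE"])
final = wait_until(lambda: next(statuses), {"COMPLETE", "FAILED"}, timeout=5)

# A status that never reaches a terminal state triggers the timeout.
try:
    wait_until(lambda: "TRAINING", {"COMPLETE", "FAILED"}, timeout=0.05)
    timed_out = False
except TimeoutError:
    timed_out = True
```

Using time.monotonic() for the deadline keeps the loop unaffected by system clock adjustments.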

get_status()

Gets the status of the model version under training.

Returns:

A string describing the status of a model training (pending, complete, etc.).

Return type:

str
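Since the error field is documented as relevant when the status is FAILED, a caller might combine get_status() with refresh() to report the outcome. The sketch below uses a stand-in object so it can run without credentials. FakeVersion, summarize_outcome, and the literal status and error strings are hypothetical; only the get_status(), refresh(), status, and error names come from this page.

```python
def summarize_outcome(version) -> str:
    """Return the status, with the error message appended when training failed."""
    status = version.get_status()
    if status == "FAILED":
        # Refresh first so the local `error` field reflects the latest state.
        version.refresh()
        return f"FAILED: {version.error}"
    return status


class FakeVersion:
    """Hypothetical stand-in exposing the same method and attribute names."""

    def __init__(self, status, error=None):
        self._remote = {"status": status, "error": error}
        self.status = None
        self.error = None

    def get_status(self):
        return self._remote["status"]

    def refresh(self):
        self.status = self._remote["status"]
        self.error = self._remote["error"]
        return self


ok_message = summarize_outcome(FakeVersion("COMPLETE"))
failed_message = summarize_outcome(FakeVersion("FAILED", "out of memory"))
print(ok_message)
print(failed_message)
```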

get_train_test_feature_group_as_pandas()

Gets the train/test data split feature group of the model version as a pandas DataFrame.

Returns:

A pandas DataFrame containing the training data with a fold column.

Return type:

pandas.DataFrame