abacusai.deployment

Classes

Deployment

A model deployment

Module Contents

class abacusai.deployment.Deployment(client, deploymentId=None, name=None, status=None, description=None, deployedAt=None, createdAt=None, projectId=None, modelId=None, modelVersion=None, featureGroupId=None, featureGroupVersion=None, callsPerSecond=None, autoDeploy=None, skipMetricsCheck=None, algoName=None, regions=None, error=None, batchStreamingUpdates=None, algorithm=None, pendingModelVersion=None, modelDeploymentConfig=None, predictionOperatorId=None, predictionOperatorVersion=None, pendingPredictionOperatorVersion=None, onlineFeatureGroupId=None, outputOnlineFeatureGroupId=None, realtimeMonitorId=None, refreshSchedules={}, featureGroupExportConfig={}, defaultPredictionArguments={})

Bases: abacusai.return_class.AbstractApiClass

A model deployment

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • deploymentId (str) – A unique identifier for the deployment.

  • name (str) – A user-friendly name for the deployment.

  • status (str) – The status of the deployment.

  • description (str) – A description of the deployment.

  • deployedAt (str) – The date and time when the deployment became active, in ISO-8601 format.

  • createdAt (str) – The date and time when the deployment was created, in ISO-8601 format.

  • projectId (str) – A unique identifier for the project this deployment belongs to.

  • modelId (str) – The model that is currently deployed.

  • modelVersion (str) – The model version ID that is currently deployed.

  • featureGroupId (str) – The feature group that is currently deployed.

  • featureGroupVersion (str) – The feature group version ID that is currently deployed.

  • callsPerSecond (int) – The number of calls per second the deployment can handle.

  • autoDeploy (bool) – A flag marking the deployment as eligible for auto deployments whenever any model in the project finishes training.

  • skipMetricsCheck (bool) – A flag to skip the metric regression check against the current deployment. This field is only relevant when auto_deploy is on.

  • algoName (str) – The name of the algorithm that is currently deployed.

  • regions (list) – A list of regions that the deployment has been deployed to.

  • error (str) – The relevant error, if the status is FAILED.

  • batchStreamingUpdates (bool) – A flag marking the feature group deployment as having enabled a background process which caches streamed-in rows for quicker lookup.

  • algorithm (str) – The algorithm that is currently deployed.

  • pendingModelVersion (dict) – The model version that the deployment is switching to or that is being stopped.

  • modelDeploymentConfig (dict) – The config for which model to be deployed.

  • predictionOperatorId (str) – The prediction operator ID that is currently deployed.

  • predictionOperatorVersion (str) – The prediction operator version ID that is currently deployed.

  • pendingPredictionOperatorVersion (str) – The prediction operator version ID that the deployment is switching to or that is being stopped.

  • onlineFeatureGroupId (id) – The online feature group ID that the deployment is running on.

  • outputOnlineFeatureGroupId (id) – The online feature group ID that the deployment is outputting results to.

  • realtimeMonitorId (id) – The ID of the realtime monitor that is associated with the deployment.

  • refreshSchedules (RefreshSchedule) – A list of refresh schedules that indicate when the deployment will be updated to the latest model version.

  • featureGroupExportConfig (FeatureGroupExportConfig) – The export config (file connector or database connector information) for feature group deployment exports.

  • defaultPredictionArguments (PredictionArguments) – The default prediction arguments for prediction APIs

deployment_id
name
status
description
deployed_at
created_at
project_id
model_id
model_version
feature_group_id
feature_group_version
calls_per_second
auto_deploy
skip_metrics_check
algo_name
regions
error
batch_streaming_updates
algorithm
pending_model_version
model_deployment_config
prediction_operator_id
prediction_operator_version
pending_prediction_operator_version
online_feature_group_id
output_online_feature_group_id
realtime_monitor_id
refresh_schedules
feature_group_export_config
default_prediction_arguments
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict
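The constructor keywords listed above are camelCase, while the instance attributes are snake_case. The sketch below only illustrates that naming correspondence; camel_to_snake is a hypothetical helper written for this example and is not part of the abacusai package.

```python
import re


def camel_to_snake(name: str) -> str:
    """Convert a camelCase keyword such as 'deploymentId' to snake_case."""
    # Insert an underscore before every uppercase letter that is not the
    # first character, then lowercase the whole string.
    return re.sub(r"(?<!^)(?=[A-Z])", "_", name).lower()


# A few constructor keywords from the signature above.
for key in ("deploymentId", "callsPerSecond", "pendingPredictionOperatorVersion"):
    print(key, "->", camel_to_snake(key))
```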

create_webhook(endpoint, webhook_event_type, payload_template=None)

Create a webhook attached to a given deployment ID.

Parameters:
  • endpoint (str) – URI that the webhook will send HTTP POST requests to.

  • webhook_event_type (str) – One of ‘DEPLOYMENT_START’, ‘DEPLOYMENT_SUCCESS’, or ‘DEPLOYMENT_FAILED’.

  • payload_template (dict) – Template for the body of the HTTP POST requests. Defaults to {}.

Returns:

The webhook attached to the deployment.

Return type:

Webhook

list_webhooks()

List all the webhooks attached to a given deployment.

Parameters:

deployment_id (str) – Unique identifier of the target deployment.

Returns:

List of the webhooks attached to the given deployment ID.

Return type:

list[Webhook]

refresh()

Calls describe and refreshes the current object’s fields.

Returns:

The current object

Return type:

Deployment

describe()

Retrieves a full description of the specified deployment.

Parameters:

deployment_id (str) – Unique string identifier associated with the deployment.

Returns:

Description of the deployment.

Return type:

Deployment

update(description=None, auto_deploy=None, skip_metrics_check=None)

Updates a deployment’s properties.

Parameters:
  • description (str) – The new description for the deployment.

  • auto_deploy (bool) – Flag to enable the automatic deployment when a new Model Version finishes training.

  • skip_metrics_check (bool) – Flag to skip the metric regression check against the current deployment. This field is only relevant when auto_deploy is on.

rename(name)

Updates a deployment’s name.

Parameters:

name (str) – The new deployment name.

set_auto(enable=None)

Enable or disable auto deployment for the specified deployment.

When a model is scheduled to retrain, deployments with auto deployment enabled will be marked to automatically promote the new model version. After the newly trained model completes, a check on its metrics in comparison to the currently deployed model version will be performed. If the metrics are comparable or better, the newly trained model version is automatically promoted. If not, it will be marked as a failed model version promotion with an error indicating poor metrics performance.

Parameters:

enable (bool) – Enable or disable the autoDeploy property of the deployment.
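The promotion rule described above can be sketched as a simple comparison. This is an illustration only, not the service's actual check: it assumes a single metric where higher is better, and the tolerance that defines "comparable" is a made-up value.

```python
def should_promote(current_metric, new_metric, tolerance=0.01, skip_metrics_check=False):
    """Decide whether a newly trained model version would be promoted.

    Assumptions (not from the library): higher metric values are better and
    "comparable" means within `tolerance` of the current metric.
    """
    if skip_metrics_check:
        # Mirrors the skip_metrics_check flag: no metric comparison is made.
        return True
    return new_metric >= current_metric - tolerance


print(should_promote(0.90, 0.95))  # better metric
print(should_promote(0.90, 0.80))  # clearly worse metric
```

With a real deployment, the behaviour itself is switched on through set_auto(enable=True).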

set_model_version(model_version, algorithm=None, model_deployment_config=None)

Promotes a model version and/or algorithm to be the active served deployment version.

Parameters:
  • model_version (str) – A unique identifier for the model version.

  • algorithm (str) – The algorithm to use for the model version. If not specified, the algorithm will be inferred from the model version.

  • model_deployment_config (dict) – The deployment configuration for the model to deploy.

set_feature_group_version(feature_group_version)

Promotes a feature group version to be served in the deployment.

Parameters:

feature_group_version (str) – Unique string identifier for the feature group version.

set_prediction_operator_version(prediction_operator_version)

Promotes a prediction operator version to be served in the deployment.

Parameters:

prediction_operator_version (str) – Unique string identifier for the prediction operator version.

start()

Restarts the specified deployment that was previously suspended.

Parameters:

deployment_id (str) – A unique string identifier associated with the deployment.

stop()

Stops the specified deployment.

Parameters:

deployment_id (str) – Unique string identifier of the deployment to be stopped.

delete()

Deletes the specified deployment. The deployment’s models will not be affected. Note that the deployments are not recoverable after they are deleted.

Parameters:

deployment_id (str) – Unique string identifier of the deployment to delete.

set_feature_group_export_file_connector_output(file_format=None, output_location=None)

Sets the export output for the Feature Group Deployment to be a file connector.

Parameters:
  • file_format (str) – The type of export output, either CSV or JSON.

  • output_location (str) – The file connector (cloud) location where the output should be exported.

set_feature_group_export_database_connector_output(database_connector_id, object_name, write_mode, database_feature_mapping, id_column=None, additional_id_columns=None)

Sets the export output for the Feature Group Deployment to a Database connector.

Parameters:
  • database_connector_id (str) – The unique string identifier of the database connector used.

  • object_name (str) – The object of the database connector to write to.

  • write_mode (str) – The write mode to use when writing to the database connector, either UPSERT or INSERT.

  • database_feature_mapping (dict) – The column/feature pairs mapping the features to the database columns.

  • id_column (str) – The id column to use as the upsert key.

  • additional_id_columns (list) – For database connectors which support it, a list of additional ID columns to use as a complex key for upserting.
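The arguments above can be assembled and checked before the call. This is a sketch under stated assumptions: build_database_export_args is a hypothetical helper, the rule that UPSERT needs an id_column is inferred from the parameter description rather than confirmed, and the identifiers in the usage lines are placeholders.

```python
VALID_WRITE_MODES = {"UPSERT", "INSERT"}


def build_database_export_args(database_connector_id, object_name, write_mode,
                               database_feature_mapping, id_column=None,
                               additional_id_columns=None):
    """Return keyword arguments for the database connector export call."""
    if write_mode not in VALID_WRITE_MODES:
        raise ValueError(f"write_mode must be one of {sorted(VALID_WRITE_MODES)}")
    if write_mode == "UPSERT" and not id_column:
        # Assumption: an upsert needs a key column to match existing rows.
        raise ValueError("id_column is required when write_mode is 'UPSERT'")
    args = {
        "database_connector_id": database_connector_id,
        "object_name": object_name,
        "write_mode": write_mode,
        "database_feature_mapping": database_feature_mapping,
    }
    if id_column is not None:
        args["id_column"] = id_column
    if additional_id_columns:
        args["additional_id_columns"] = list(additional_id_columns)
    return args


# Placeholder identifiers; the mapping's key/value direction is not specified here.
args = build_database_export_args(
    "connector_id_placeholder", "PREDICTIONS", "UPSERT",
    {"user_id": "USER_ID"}, id_column="user_id",
)
print(args)
# With a real Deployment object:
# deployment.set_feature_group_export_database_connector_output(**args)
```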

remove_feature_group_export_output()

Removes the export type that is set for the Feature Group Deployment.

Parameters:

deployment_id (str) – The ID of the deployment for which the export type is set.

set_default_prediction_arguments(prediction_arguments, set_as_override=False)

Sets the default prediction arguments in the deployment config.

Parameters:
  • prediction_arguments (PredictionArguments) – The prediction arguments to set.

  • set_as_override (bool) – If True, use these arguments as overrides instead of defaults for predict calls.

Returns:

Description of the updated deployment.

Return type:

Deployment

get_prediction_logs_records(limit=10, last_log_request_id='', last_log_timestamp=None)

Retrieves the prediction request IDs for the most recent predictions made to the deployment.

Parameters:
  • limit (int) – The maximum number of prediction log entries to retrieve.

  • last_log_request_id (str) – The request ID of the last log entry to retrieve.

  • last_log_timestamp (int) – A Unix timestamp in milliseconds specifying the timestamp for the last log entry.

Returns:

A list of prediction log records.

Return type:

list[PredictionLogRecord]

create_alert(alert_name, condition_config, action_config)

Create a deployment alert for the given conditions.

Currently only supports batch prediction usage.

Parameters:
  • alert_name (str) – Name of the alert.

  • condition_config (AlertConditionConfig) – Condition to run the actions for the alert.

  • action_config (AlertActionConfig) – Configuration for the action of the alert.

Returns:

Object describing the deployment alert.

Return type:

MonitorAlert

list_alerts()

List the monitor alerts associated with the deployment id.

Parameters:

deployment_id (str) – Unique string identifier for the deployment.

Returns:

An array of deployment alerts.

Return type:

list[MonitorAlert]

create_realtime_monitor(realtime_monitor_schedule=None, lookback_time=None)

Real time monitors compute and monitor metrics of real time prediction data.

Parameters:
  • realtime_monitor_schedule (str) – The cron expression for triggering monitor.

  • lookback_time (int) – Lookback time (in seconds) for each monitor trigger.

Returns:

Object describing the real-time monitor.

Return type:

RealtimeMonitor

get_conversation_response(message, deployment_token, deployment_conversation_id=None, external_session_id=None, llm_name=None, num_completion_tokens=None, system_message=None, temperature=0.0, filter_key_values=None, search_score_cutoff=None, chat_config=None, doc_infos=None)

Returns a conversation response that continues the conversation based on the input message and the deployment conversation ID (if one exists).

Parameters:
  • message (str) – A message from the user

  • deployment_token (str) – A token used to authenticate access to deployments created in this project. This token is only authorized to predict on deployments in this project, so it is safe to embed this model inside of an application or website.

  • deployment_conversation_id (str) – The unique identifier of a deployment conversation to continue. If not specified, a new one will be created.

  • external_session_id (str) – The user-supplied unique identifier of a deployment conversation to continue. If specified, it is used instead of an internal deployment conversation ID.

  • llm_name (str) – Name of the specific LLM backend to use to power the chat experience

  • num_completion_tokens (int) – Default for maximum number of tokens for chat answers

  • system_message (str) – The generative LLM system message

  • temperature (float) – The generative LLM temperature

  • filter_key_values (dict) – A dictionary mapping column names to a list of values to restrict the retrieved search results.

  • search_score_cutoff (float) – Cutoff for the document retriever score. Matching search results below this score will be ignored.

  • chat_config (dict) – A dictionary specifying the query chat config override.

  • doc_infos (list) – An optional list of documents to use for the conversation. A keyword ‘doc_id’ is expected to be present in each document for retrieving contents from the docstore.
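A multi-turn exchange can be sketched by passing the conversation ID back on each call. The continue_conversation helper and the _EchoDeployment stand-in are hypothetical; the shape of the real response is not described on this page, so the stand-in simply returns a dict.

```python
def continue_conversation(deployment, deployment_token, message,
                          conversation_id=None, **overrides):
    """Send one user message, optionally continuing an existing conversation."""
    kwargs = {"message": message, "deployment_token": deployment_token}
    if conversation_id is not None:
        # Omitting this key lets the service create a new conversation.
        kwargs["deployment_conversation_id"] = conversation_id
    # Optional documented overrides, e.g. temperature or system_message.
    kwargs.update(overrides)
    return deployment.get_conversation_response(**kwargs)


class _EchoDeployment:
    """Hypothetical stand-in so the sketch runs without API access."""

    def get_conversation_response(self, message, deployment_token,
                                  deployment_conversation_id=None, **kwargs):
        return {
            "conversation_id": deployment_conversation_id or "new",
            "echo": message,
            "extra": kwargs,
        }


first = continue_conversation(_EchoDeployment(), "TOKEN_PLACEHOLDER", "Hello")
second = continue_conversation(_EchoDeployment(), "TOKEN_PLACEHOLDER", "And then?",
                               conversation_id="abc", temperature=0.2)
print(first)
print(second)
```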

get_conversation_response_with_binary_data(deployment_token, message, deployment_conversation_id=None, external_session_id=None, llm_name=None, num_completion_tokens=None, system_message=None, temperature=0.0, filter_key_values=None, search_score_cutoff=None, chat_config=None, attachments=None)

Returns a conversation response that continues the conversation based on the input message and the deployment conversation ID (if one exists).

Parameters:
  • deployment_token (str) – A token used to authenticate access to deployments created in this project. This token is only authorized to predict on deployments in this project, so it is safe to embed this model inside of an application or website.

  • message (str) – A message from the user

  • deployment_conversation_id (str) – The unique identifier of a deployment conversation to continue. If not specified, a new one will be created.

  • external_session_id (str) – The user-supplied unique identifier of a deployment conversation to continue. If specified, it is used instead of an internal deployment conversation ID.

  • llm_name (str) – Name of the specific LLM backend to use to power the chat experience

  • num_completion_tokens (int) – Default for maximum number of tokens for chat answers

  • system_message (str) – The generative LLM system message

  • temperature (float) – The generative LLM temperature

  • filter_key_values (dict) – A dictionary mapping column names to a list of values to restrict the retrieved search results.

  • search_score_cutoff (float) – Cutoff for the document retriever score. Matching search results below this score will be ignored.

  • chat_config (dict) – A dictionary specifying the query chat config override.

  • attachments (dict) – A dictionary of binary data to use to answer the queries.

create_batch_prediction(table_name=None, name=None, global_prediction_args=None, batch_prediction_args=None, explanations=False, output_format=None, output_location=None, database_connector_id=None, database_output_config=None, refresh_schedule=None, csv_input_prefix=None, csv_prediction_prefix=None, csv_explanations_prefix=None, output_includes_metadata=None, result_input_columns=None, input_feature_groups=None)

Creates a batch prediction job description for the given deployment.

Parameters:
  • table_name (str) – Name of the feature group table to write the results of the batch prediction. Can only be specified if outputLocation and databaseConnectorId are not specified. If tableName is specified, the outputType will be enforced as CSV.

  • name (str) – Name of the batch prediction job.

  • batch_prediction_args (BatchPredictionArgs) – Batch Prediction args specific to problem type.

  • output_format (str) – Format of the batch prediction output (CSV or JSON).

  • output_location (str) – Location to write the prediction results. Otherwise, results will be stored in Abacus.AI.

  • database_connector_id (str) – Unique identifier of a Database Connection to write predictions to. Cannot be specified in conjunction with outputLocation.

  • database_output_config (dict) – Key-value pair of columns/values to write to the database connector. Only available if databaseConnectorId is specified.

  • refresh_schedule (str) – Cron-style string that describes a schedule in UTC to automatically run the batch prediction.

  • csv_input_prefix (str) – Prefix to prepend to the input columns, only applies when output format is CSV.

  • csv_prediction_prefix (str) – Prefix to prepend to the prediction columns, only applies when output format is CSV.

  • csv_explanations_prefix (str) – Prefix to prepend to the explanation columns, only applies when output format is CSV.

  • output_includes_metadata (bool) – If true, output will contain columns including prediction start time, batch prediction version, and model version.

  • result_input_columns (list) – If present, will limit result files or feature groups to only include columns present in this list.

  • input_feature_groups (dict) – A dict of {‘<feature_group_type>’: ‘<feature_group_id>’} which overrides the default input data of that type for the Batch Prediction. Default input data is the training data that was used for training the deployed model.

  • global_prediction_args (Union[dict, abacusai.api_class.BatchPredictionArgs])

  • explanations (bool)

Returns:

The batch prediction description.

Return type:

BatchPrediction
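The output-related parameters above constrain one another. The sketch below encodes only the rules stated in the parameter descriptions as a local pre-flight check; check_batch_prediction_outputs is a hypothetical helper, not a function of the abacusai package, and the service may enforce further rules.

```python
def check_batch_prediction_outputs(table_name=None, output_location=None,
                                   database_connector_id=None,
                                   database_output_config=None,
                                   output_format=None):
    """Return a list of conflicts among the output arguments (empty if none)."""
    problems = []
    if table_name and (output_location or database_connector_id):
        problems.append("table_name cannot be combined with output_location "
                        "or database_connector_id")
    if database_connector_id and output_location:
        problems.append("database_connector_id cannot be combined with output_location")
    if database_output_config and not database_connector_id:
        problems.append("database_output_config requires database_connector_id")
    if output_format is not None and output_format not in {"CSV", "JSON"}:
        problems.append("output_format must be CSV or JSON")
    return problems


print(check_batch_prediction_outputs(table_name="predictions_out"))
print(check_batch_prediction_outputs(table_name="predictions_out",
                                     output_location="s3://bucket/prefix"))
```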

get_statistics_over_time(start_date, end_date)

Returns basic access statistics for the given window.

Parameters:
  • start_date (str) – Timeline start date in ISO format.

  • end_date (str) – Timeline end date in ISO format. The date range must be 7 days or less.

Returns:

Object describing time series data of the number of requests and latency over the specified time period.

Return type:

DeploymentStatistics
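Because the date range is limited to 7 days, it can help to derive both arguments from a single end date. This sketch assumes that date-only ISO-8601 strings are accepted and that the limit is measured as the difference between the two dates; statistics_window is a hypothetical helper.

```python
from datetime import date, timedelta

MAX_WINDOW_DAYS = 7  # limit stated in the end_date description


def statistics_window(end, days=MAX_WINDOW_DAYS):
    """Return (start_date, end_date) ISO strings spanning `days` days up to `end`."""
    if not 1 <= days <= MAX_WINDOW_DAYS:
        raise ValueError(f"days must be between 1 and {MAX_WINDOW_DAYS}")
    start = end - timedelta(days=days)
    return start.isoformat(), end.isoformat()


start_date, end_date = statistics_window(date(2024, 1, 8))
print(start_date, end_date)
# With a real Deployment object:
# stats = deployment.get_statistics_over_time(start_date, end_date)
```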

describe_feature_group_row_process_by_key(primary_key_value)

Gets the feature group row process.

Parameters:

primary_key_value (str) – The primary key value

Returns:

An object representing the feature group row process

Return type:

FeatureGroupRowProcess

list_feature_group_row_processes(limit=None, status=None)

Gets a list of feature group row processes.

Parameters:
  • limit (int) – The maximum number of processes to return. Defaults to None.

  • status (str) – The status of the processes to return. Defaults to None.

Returns:

A list of objects representing the feature group row processes.

Return type:

list[FeatureGroupRowProcess]

get_feature_group_row_process_summary()

Gets a summary of the statuses of the individual feature group processes.

Parameters:

deployment_id (str) – The deployment id for the process

Returns:

An object representing the summary of the statuses of the individual feature group processes

Return type:

FeatureGroupRowProcessSummary

reset_feature_group_row_process_by_key(primary_key_value)

Resets a feature group row process so that it can be reprocessed.

Parameters:

primary_key_value (str) – The primary key value

Returns:

An object representing the feature group row process.

Return type:

FeatureGroupRowProcess

get_feature_group_row_process_logs_by_key(primary_key_value)

Gets the logs for a feature group row process.

Parameters:

primary_key_value (str) – The primary key value

Returns:

An object representing the logs for the feature group row process

Return type:

FeatureGroupRowProcessLogs

create_conversation(name=None, external_application_id=None)

Creates a deployment conversation.

Parameters:
  • name (str) – The name of the conversation.

  • external_application_id (str) – The external application id associated with the deployment conversation.

Returns:

The deployment conversation.

Return type:

DeploymentConversation

list_conversations(external_application_id=None, conversation_type=None, fetch_last_llm_info=False)

Lists all conversations for the given deployment and current user.

Parameters:
  • external_application_id (str) – The external application id associated with the deployment conversation. If specified, only conversations created on that application will be listed.

  • conversation_type (DeploymentConversationType) – The type of the conversation indicating its origin.

  • fetch_last_llm_info (bool) – If true, the LLM info for the most recent conversation will be fetched. Only applicable for system-created bots.

Returns:

The deployment conversations.

Return type:

list[DeploymentConversation]

create_external_application(name=None, description=None, logo=None, theme=None)

Creates a new External Application from an existing ChatLLM Deployment.

Parameters:
  • name (str) – The name of the External Application. If not provided, the name of the deployment will be used.

  • description (str) – The description of the External Application. This will be shown to users when they access the External Application. If not provided, the description of the deployment will be used.

  • logo (str) – The logo to be displayed.

  • theme (dict) – The visual theme of the External Application.

Returns:

The newly created External Application.

Return type:

ExternalApplication

download_agent_attachment(attachment_id)

Return an agent attachment.

Parameters:

attachment_id (str) – The attachment ID.

wait_for_deployment(wait_states={'PENDING', 'DEPLOYING'}, timeout=900)

Waits until the deployment is completed.

Parameters:

timeout (int) – The maximum time to wait for the call to finish. If it does not finish within the allotted time, the call times out.

wait_for_pending_deployment_update(timeout=900)

Waits until the deployment is in a stable state, that is, until the pending model switch is completed and the previous model is stopped.

Parameters:

timeout (int) – The maximum time to wait for the call to finish. If it does not finish within the allotted time, the call times out.

Returns:

The latest deployment object.

Return type:

Deployment

get_status()

Gets the status of the deployment.

Returns:

A string describing the status of a deployment (pending, deploying, active, etc.).

Return type:

str
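A polling loop in the spirit of wait_for_deployment can be built on get_status(). This is a sketch, not the library's implementation: wait_until_settled, its poll_interval parameter, the assumption that timeout is in seconds, and the _FakeDeployment stand-in (including the 'ACTIVE' status string) are all introduced for illustration.

```python
import time


def wait_until_settled(deployment, wait_states=frozenset({"PENDING", "DEPLOYING"}),
                       timeout=900, poll_interval=5.0,
                       sleep=time.sleep, clock=time.monotonic):
    """Poll get_status() until it leaves `wait_states` or `timeout` seconds pass."""
    deadline = clock() + timeout
    while True:
        status = deployment.get_status()
        if status not in wait_states:
            return status
        if clock() >= deadline:
            raise TimeoutError(f"Deployment still {status!r} after {timeout} seconds")
        sleep(poll_interval)


class _FakeDeployment:
    """Hypothetical stand-in that walks through a fixed list of statuses."""

    def __init__(self, statuses):
        self._statuses = iter(statuses)

    def get_status(self):
        return next(self._statuses)


# The no-op sleep keeps the demonstration instant.
final_status = wait_until_settled(
    _FakeDeployment(["PENDING", "DEPLOYING", "ACTIVE"]),
    sleep=lambda seconds: None,
)
print(final_status)
```

Injecting sleep and clock keeps the loop testable; with a real deployment the defaults can be left as they are.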

create_refresh_policy(cron)

Creates a refresh policy for the deployment.

Parameters:

cron (str) – A cron style string to set the refresh time.

Returns:

The refresh policy object.

Return type:

RefreshPolicy
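As an illustration of the cron argument, the sketch below does a rough shape check before the call. It assumes the common five-field cron layout (minute, hour, day of month, month, day of week); the service may accept other variants, and looks_like_cron is a hypothetical helper that does not validate field values.

```python
def looks_like_cron(expr):
    """Rough check: exactly five whitespace-separated fields."""
    return len(expr.split()) == 5


daily_at_two = "0 2 * * *"  # every day at 02:00 under the five-field convention
print(looks_like_cron(daily_at_two))
# With a real Deployment object:
# policy = deployment.create_refresh_policy(daily_at_two)
```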

list_refresh_policies()

Gets the refresh policies in a list.

Returns:

A list of refresh policy objects.

Return type:

list[RefreshPolicy]