abacusai.batch_prediction

Classes

BatchPrediction

Make batch predictions.

Module Contents

class abacusai.batch_prediction.BatchPrediction(client, batchPredictionId=None, createdAt=None, name=None, deploymentId=None, fileConnectorOutputLocation=None, databaseConnectorId=None, databaseOutputConfiguration=None, fileOutputFormat=None, connectorType=None, legacyInputLocation=None, outputFeatureGroupId=None, featureGroupTableName=None, outputFeatureGroupTableName=None, summaryFeatureGroupTableName=None, csvInputPrefix=None, csvPredictionPrefix=None, csvExplanationsPrefix=None, outputIncludesMetadata=None, resultInputColumns=None, modelMonitorId=None, modelVersion=None, bpAcrossVersionsMonitorId=None, algorithm=None, batchPredictionArgsType=None, batchInputs={}, latestBatchPredictionVersion={}, refreshSchedules={}, inputFeatureGroups={}, globalPredictionArgs={}, batchPredictionArgs={})

Bases: abacusai.return_class.AbstractApiClass

Make batch predictions.

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • batchPredictionId (str) – The unique identifier of the batch prediction request.

  • createdAt (str) – When the batch prediction was created, in ISO-8601 format.

  • name (str) – Name given to the batch prediction object.

  • deploymentId (str) – The deployment used to make the predictions.

  • fileConnectorOutputLocation (str) – Contains information about where the batch predictions are written to.

  • databaseConnectorId (str) – The database connector to write the results to.

  • databaseOutputConfiguration (dict) – Contains information about where the batch predictions are written to.

  • fileOutputFormat (str) – The format of the batch prediction output (CSV or JSON).

  • connectorType (str) – Null when results are written to the internal console; otherwise one of FEATURE_GROUP, FILE_CONNECTOR, or DATABASE_CONNECTOR.

  • legacyInputLocation (str) – The location of the input data.

  • outputFeatureGroupId (str) – The ID of the batch prediction output feature group, if applicable.

  • featureGroupTableName (str) – The table name of the Batch Prediction output feature group.

  • outputFeatureGroupTableName (str) – The table name of the Batch Prediction output feature group.

  • summaryFeatureGroupTableName (str) – The table name of the metrics summary feature group output by Batch Prediction.

  • csvInputPrefix (str) – A prefix to prepend to the input columns, only applies when output format is CSV.

  • csvPredictionPrefix (str) – A prefix to prepend to the prediction columns, only applies when output format is CSV.

  • csvExplanationsPrefix (str) – A prefix to prepend to the explanation columns, only applies when output format is CSV.

  • outputIncludesMetadata (bool) – If true, output will contain columns including prediction start time, batch prediction version, and model version.

  • resultInputColumns (list) – If present, will limit result files or feature groups to only include columns present in this list.

  • modelMonitorId (str) – The model monitor for this batch prediction.

  • modelVersion (str) – The model instance used in the deployment for the batch prediction.

  • bpAcrossVersionsMonitorId (str) – The model monitor for this batch prediction across versions.

  • algorithm (str) – The algorithm that is currently deployed.

  • batchPredictionArgsType (str) – The type of batch prediction arguments used for this batch prediction.

  • batchInputs (PredictionInput) – Inputs to the batch prediction.

  • latestBatchPredictionVersion (BatchPredictionVersion) – The latest batch prediction version.

  • refreshSchedules (RefreshSchedule) – List of refresh schedules that dictate the next time the batch prediction will be run.

  • inputFeatureGroups (PredictionFeatureGroup) – List of prediction feature groups.

  • globalPredictionArgs (BatchPredictionArgs)

  • batchPredictionArgs (BatchPredictionArgs) – Argument(s) passed to every prediction call.
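
The constructor arguments above are camelCase, while the attributes on the class use the snake_case form of the same names (batchPredictionId is stored as batch_prediction_id). The following is a minimal sketch of that naming rule. The helper camel_to_snake is hypothetical and is not part of the abacusai package; it only illustrates the correspondence.

```python
import re


def camel_to_snake(name: str) -> str:
    """Convert a camelCase name to snake_case (illustrative helper, not an abacusai API)."""
    # Insert an underscore before every uppercase letter except a leading one,
    # then lowercase the whole string.
    return re.sub(r"(?<!^)(?=[A-Z])", "_", name).lower()


examples = ["batchPredictionId", "csvInputPrefix", "bpAcrossVersionsMonitorId"]
converted = [camel_to_snake(n) for n in examples]
print(converted)
```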

batch_prediction_id = None
created_at = None
name = None
deployment_id = None
file_connector_output_location = None
database_connector_id = None
database_output_configuration = None
file_output_format = None
connector_type = None
legacy_input_location = None
output_feature_group_id = None
feature_group_table_name = None
output_feature_group_table_name = None
summary_feature_group_table_name = None
csv_input_prefix = None
csv_prediction_prefix = None
csv_explanations_prefix = None
output_includes_metadata = None
result_input_columns = None
model_monitor_id = None
model_version = None
bp_across_versions_monitor_id = None
algorithm = None
batch_prediction_args_type = None
batch_inputs
latest_batch_prediction_version
refresh_schedules
input_feature_groups
global_prediction_args
batch_prediction_args
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict
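
Because to_dict() returns a plain dict, it can be used to log or persist a batch prediction's configuration. The sketch below serializes it to JSON. The _FakeBatchPrediction class and the keys it returns are hypothetical stand-ins so the example runs without an API key; a real BatchPrediction object would be passed instead.

```python
import json


def dump_batch_prediction(bp) -> str:
    """Serialize the dict returned by to_dict() as pretty-printed JSON."""
    # default=str is a precaution for values json cannot encode natively;
    # whether such values occur in the real dict is an assumption.
    return json.dumps(bp.to_dict(), indent=2, sort_keys=True, default=str)


class _FakeBatchPrediction:
    """Stand-in used only so this sketch runs offline; not part of abacusai."""

    def to_dict(self):
        # Hypothetical keys and values, for illustration only.
        return {"name": "demo", "batch_prediction_id": "bp_123"}


text = dump_batch_prediction(_FakeBatchPrediction())
print(text)
```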

start()

Creates a new batch prediction version job for a given batch prediction job description.

Parameters:

batch_prediction_id (str) – The unique identifier of the batch prediction to create a new version of. Taken from this object's batch_prediction_id; the method itself takes no arguments.

Returns:

The batch prediction version started by this method call.

Return type:

BatchPredictionVersion

refresh()

Calls describe() and refreshes the current object’s fields.

Returns:

The current object

Return type:

BatchPrediction

describe()

Describe the batch prediction.

Parameters:

batch_prediction_id (str) – The unique identifier associated with the batch prediction. Taken from this object's batch_prediction_id; the method itself takes no arguments.

Returns:

The batch prediction description.

Return type:

BatchPrediction

list_versions(limit=100, start_after_version=None)

Retrieves a list of versions of this batch prediction.

Parameters:
  • limit (int) – Number of versions to list.

  • start_after_version (str) – Version to start after.

Returns:

List of batch prediction versions.

Return type:

list[BatchPredictionVersion]

update(deployment_id=None, global_prediction_args=None, batch_prediction_args=None, explanations=None, output_format=None, csv_input_prefix=None, csv_prediction_prefix=None, csv_explanations_prefix=None, output_includes_metadata=None, result_input_columns=None, name=None)

Update a batch prediction job description.

Parameters:
  • deployment_id (str) – Unique identifier of the deployment.

  • batch_prediction_args (BatchPredictionArgs) – Batch Prediction args specific to problem type.

  • output_format (str) – If specified, sets the format of the batch prediction output (CSV or JSON).

  • csv_input_prefix (str) – Prefix to prepend to the input columns, only applies when output format is CSV.

  • csv_prediction_prefix (str) – Prefix to prepend to the prediction columns, only applies when output format is CSV.

  • csv_explanations_prefix (str) – Prefix to prepend to the explanation columns, only applies when output format is CSV.

  • output_includes_metadata (bool) – If True, output will contain columns including prediction start time, batch prediction version, and model version.

  • result_input_columns (list) – If present, will limit result files or feature groups to only include columns present in this list.

  • name (str) – If present, will rename the batch prediction.

  • global_prediction_args (Union[dict, abacusai.api_class.BatchPredictionArgs])

  • explanations (bool)

Returns:

The batch prediction.

Return type:

BatchPrediction
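
As an illustration of update(), the sketch below switches the output to CSV and sets column prefixes using only keyword arguments from the signature above. The helper name, the prefix values, and the _Recorder stand-in are hypothetical. That arguments left as None keep their current server-side values is an assumption, not something this page states.

```python
def prefix_csv_columns(bp, input_prefix="in_", prediction_prefix="pred_"):
    """Request CSV output with prefixed input and prediction columns."""
    return bp.update(
        output_format="CSV",
        csv_input_prefix=input_prefix,
        csv_prediction_prefix=prediction_prefix,
    )


class _Recorder:
    """Stand-in that records method calls; not part of abacusai."""

    def __init__(self):
        self.calls = []

    def __getattr__(self, name):
        # Only reached for attributes that do not exist, i.e. the API methods.
        def method(*args, **kwargs):
            self.calls.append((name, args, kwargs))
            return self
        return method


fake = _Recorder()
prefix_csv_columns(fake)
print(fake.calls)
```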

set_file_connector_output(output_format=None, output_location=None)

Updates the file connector output configuration of the batch prediction.

Parameters:
  • output_format (str) – The format of the batch prediction output (CSV or JSON). If not specified, the default format will be used.

  • output_location (str) – The location to write the prediction results. If not specified, results will be stored in Abacus.AI.

Returns:

The batch prediction description.

Return type:

BatchPrediction

set_database_connector_output(database_connector_id=None, database_output_config=None)

Updates the database connector output configuration of the batch prediction.

Parameters:
  • database_connector_id (str) – Unique string identifier of the database connector to write predictions to.

  • database_output_config (dict) – Key-value pair of columns/values to write to the database connector.

Returns:

Description of the batch prediction.

Return type:

BatchPrediction

set_feature_group_output(table_name)

Creates a feature group and sets it as the batch prediction output.

Parameters:

table_name (str) – Name of the feature group table to create.

Returns:

Batch prediction after the output has been applied.

Return type:

BatchPrediction

set_output_to_console()

Sets the batch prediction output to the console, clearing both the file connector and database connector configurations.

Parameters:

batch_prediction_id (str) – The unique identifier of the batch prediction. Taken from this object's batch_prediction_id; the method itself takes no arguments.

Returns:

The batch prediction description.

Return type:

BatchPrediction
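
The four output setters (set_file_connector_output, set_database_connector_output, set_feature_group_output, set_output_to_console) each select one destination. The sketch below wraps them in a single dispatcher. The function configure_output, its destination labels, and the _Recorder stand-in are hypothetical conveniences for illustration; only the method names and parameters come from this page.

```python
def configure_output(bp, destination, **options):
    """Call the output setter that matches a destination label."""
    if destination == "file":
        return bp.set_file_connector_output(
            output_format=options.get("output_format"),
            output_location=options.get("output_location"),
        )
    if destination == "database":
        return bp.set_database_connector_output(
            database_connector_id=options["database_connector_id"],
            database_output_config=options.get("database_output_config"),
        )
    if destination == "feature_group":
        return bp.set_feature_group_output(options["table_name"])
    if destination == "console":
        return bp.set_output_to_console()
    raise ValueError(f"unknown destination: {destination!r}")


class _Recorder:
    """Stand-in that records method calls; not part of abacusai."""

    def __init__(self):
        self.calls = []

    def __getattr__(self, name):
        def method(*args, **kwargs):
            self.calls.append((name, args, kwargs))
            return self
        return method


fake = _Recorder()
configure_output(fake, "feature_group", table_name="bp_results")
configure_output(fake, "console")
print(fake.calls)
```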

set_feature_group(feature_group_type, feature_group_id=None)

Sets the batch prediction input feature group.

Parameters:
  • feature_group_type (str) – Enum string representing the feature group type to set. The type is based on the use case under which the feature group is being created (e.g. Catalog Attributes for personalized recommendation use case).

  • feature_group_id (str) – Unique identifier of the feature group to set as input to the batch prediction.

Returns:

Description of the batch prediction.

Return type:

BatchPrediction

set_dataset_remap(dataset_id_remap)

Swaps out datasets in the training feature groups for the purposes of this batch prediction only.

Parameters:

dataset_id_remap (dict) – Key/value pairs of dataset ids to be replaced during the batch prediction.

Returns:

Batch prediction object.

Return type:

BatchPrediction
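
A small sketch of preparing the dataset_id_remap argument follows. It assumes that keys are the dataset IDs to replace and values are their replacements; the description above does not spell out the direction. The helper remap_datasets, the example IDs, and the _Recorder stand-in are hypothetical.

```python
def remap_datasets(bp, replacements):
    """Validate a dataset ID mapping and pass it to set_dataset_remap."""
    # Drop entries that map a dataset to itself, since they change nothing.
    remap = {str(src): str(dst) for src, dst in replacements.items() if src != dst}
    if not remap:
        raise ValueError("no dataset replacements given")
    return bp.set_dataset_remap(remap)


class _Recorder:
    """Stand-in that records method calls; not part of abacusai."""

    def __init__(self):
        self.calls = []

    def __getattr__(self, name):
        def method(*args, **kwargs):
            self.calls.append((name, args, kwargs))
            return self
        return method


fake = _Recorder()
remap_datasets(fake, {"ds_old": "ds_new", "ds_same": "ds_same"})
print(fake.calls)
```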

delete()

Deletes a batch prediction and associated data, such as associated monitors.

Parameters:

batch_prediction_id (str) – Unique string identifier of the batch prediction. Taken from this object's batch_prediction_id; the method itself takes no arguments.

wait_for_predictions(timeout=86400)

Blocks until the batch predictions are ready.

Parameters:

timeout (int) – The maximum time to wait for the call to finish. If the predictions are not ready within this time, the call times out.

wait_for_drift_monitor(timeout=86400)

Blocks until the batch prediction drift monitor calculations are ready.

Parameters:

timeout (int) – The maximum time to wait for the call to finish. If the calculations are not ready within this time, the call times out.

get_status()

Gets the status of the latest batch prediction version.

Returns:

A string describing the status of the latest batch prediction version, e.g., pending or complete.

Return type:

str
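
Taken together, start(), wait_for_predictions(), and get_status() form a simple run-and-check workflow, sketched below. The helper run_and_wait and the _FakeBatchPrediction stand-in (including the status string it returns) are hypothetical. The timeout is assumed to be in seconds, which the default of 86400 (24 hours) suggests but this page does not state.

```python
def run_and_wait(bp, timeout=3600):
    """Start a new version, block until it finishes, and return (version, status)."""
    version = bp.start()
    # Assumption: timeout is expressed in seconds.
    bp.wait_for_predictions(timeout=timeout)
    return version, bp.get_status()


class _FakeBatchPrediction:
    """Stand-in so the sketch runs without credentials; not part of abacusai."""

    def __init__(self):
        self.calls = []

    def start(self):
        self.calls.append("start")
        return "version-1"

    def wait_for_predictions(self, timeout=86400):
        self.calls.append(("wait_for_predictions", timeout))
        return self

    def get_status(self):
        self.calls.append("get_status")
        return "COMPLETE"  # hypothetical status value


fake = _FakeBatchPrediction()
version, status = run_and_wait(fake, timeout=600)
print(version, status)
```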

create_refresh_policy(cron)

Creates a refresh policy for the batch prediction.

Parameters:

cron (str) – A cron style string to set the refresh time.

Returns:

The refresh policy object.

Return type:

RefreshPolicy
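
The cron argument is described only as "a cron style string". The sketch below builds a daily schedule on the assumption that the standard five-field layout (minute, hour, day of month, month, day of week) is accepted; the time zone used by the service is not stated here. The helpers daily_cron and schedule_daily and the _Recorder stand-in are hypothetical.

```python
def daily_cron(hour: int, minute: int = 0) -> str:
    """Build a five-field cron expression that fires once a day."""
    if not (0 <= hour <= 23 and 0 <= minute <= 59):
        raise ValueError("hour must be 0-23 and minute 0-59")
    # Field order: minute hour day-of-month month day-of-week.
    return f"{minute} {hour} * * *"


def schedule_daily(bp, hour, minute=0):
    """Create a refresh policy that reruns the batch prediction daily."""
    return bp.create_refresh_policy(daily_cron(hour, minute))


class _Recorder:
    """Stand-in that records method calls; not part of abacusai."""

    def __init__(self):
        self.calls = []

    def __getattr__(self, name):
        def method(*args, **kwargs):
            self.calls.append((name, args, kwargs))
            return self
        return method


fake = _Recorder()
schedule_daily(fake, 6, 30)
print(fake.calls)
```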

list_refresh_policies()

Gets the refresh policies in a list.

Returns:

A list of refresh policy objects.

Return type:

list[RefreshPolicy]

describe_output_feature_group()

Gets the results feature group for this batch prediction.

Returns:

A feature group object.

Return type:

FeatureGroup

load_results_as_pandas()

Loads the output feature groups into a pandas DataFrame.

Returns:

A pandas dataframe with annotations and text_snippet columns.

Return type:

DataFrame