abacusai

Submodules

Attributes

DocumentRetrieverConfig

A config for document retriever creation.

Segment

_request_context

__version__

Exceptions

ApiException

Default ApiException raised by APIs

Classes

AbacusApi

An Abacus API.

Address

Address object

Agent

An AI agent.

AgentChatMessage

A single chat message with Agent Chat.

AgentConversation

A list of messages in an Agent chat conversation.

AgentDataDocumentInfo

Information for documents uploaded to agents.

AgentDataExecutionResult

Results of agent execution with uploaded data.

AgentVersion

A version of an AI agent.

AiBuildingTask

A task for Data Science Co-pilot to help build AI.

Algorithm

Customer-created algorithm

Annotation

An Annotation Store Annotation

AnnotationConfig

Annotation config for a feature group

AnnotationDocument

Document to be annotated.

AnnotationEntry

An Annotation Store entry for an Annotation

AnnotationsStatus

The status of annotations for a feature group

ApiClass

Helper class that provides a standard way to create an ABC using inheritance.

FieldDescriptor

Config describing a single field to be extracted from a document.

JSONSchema

WorkflowNodeInputMapping

Represents a mapping of inputs to a workflow node.

WorkflowNodeInputSchema

A schema conformant to react-jsonschema-form for workflow node input.

WorkflowNodeOutputMapping

Represents a mapping of output from a workflow node.

WorkflowNodeOutputSchema

A schema conformant to react-jsonschema-form for a workflow node output.

WorkflowGraphNode

Represents a node in an Agent workflow graph.

WorkflowGraphEdge

Represents an edge in an Agent workflow graph.

WorkflowGraph

Represents an Agent workflow graph.

AgentConversationMessage

Message format for agent conversation

WorkflowNodeTemplateConfig

Represents a WorkflowNode template config.

WorkflowNodeTemplateInput

Represents an input to a workflow node generated using a template.

WorkflowNodeTemplateOutput

Represents an output returned by a workflow node generated using a template.

HotkeyPrompt

A config class for a Data Science Co-Pilot Hotkey

_ApiClassFactory

Helper class that provides a standard way to create an ABC using inheritance.

BatchPredictionArgs

An abstract class for Batch Prediction args specific to problem type.

ForecastingBatchPredictionArgs

Batch Prediction Config for the FORECASTING problem type

NamedEntityExtractionBatchPredictionArgs

Batch Prediction Config for the NAMED_ENTITY_EXTRACTION problem type

PersonalizationBatchPredictionArgs

Batch Prediction Config for the PERSONALIZATION problem type

PredictiveModelingBatchPredictionArgs

Batch Prediction Config for the PREDICTIVE_MODELING problem type

PretrainedModelsBatchPredictionArgs

Batch Prediction Config for the PRETRAINED_MODELS problem type

SentenceBoundaryDetectionBatchPredictionArgs

Batch Prediction Config for the SENTENCE_BOUNDARY_DETECTION problem type

ThemeAnalysisBatchPredictionArgs

Batch Prediction Config for the THEME_ANALYSIS problem type

ChatLLMBatchPredictionArgs

Batch Prediction Config for the ChatLLM problem type

TrainablePlugAndPlayBatchPredictionArgs

Batch Prediction Config for the TrainablePlugAndPlay problem type

AIAgentBatchPredictionArgs

Batch Prediction Config for the AIAgents problem type

_BatchPredictionArgsFactory

Helper class that provides a standard way to create an ABC using inheritance.

Blob

An object for storing and passing file data.

BlobInput

An object for storing and passing file data.

DatasetConfig

An abstract class for dataset configs

StreamingConnectorDatasetConfig

An abstract class for dataset configs specific to streaming connectors.

KafkaDatasetConfig

Dataset config for Kafka Streaming Connector

_StreamingConnectorDatasetConfigFactory

Helper class that provides a standard way to create an ABC using inheritance.

DocumentType

Generic enumeration.

OcrMode

Generic enumeration.

ParsingConfig

Custom config for dataset parsing.

DocumentProcessingConfig

Document processing configuration.

DatasetDocumentProcessingConfig

Document processing configuration for dataset imports.

IncrementalDatabaseConnectorConfig

Config information for incremental datasets from database connectors

AttachmentParsingConfig

Config information for parsing attachments

ApplicationConnectorDatasetConfig

An abstract class for dataset configs specific to application connectors.

ConfluenceDatasetConfig

Dataset config for Confluence Application Connector

GoogleAnalyticsDatasetConfig

Dataset config for Google Analytics Application Connector

GoogleDriveDatasetConfig

Dataset config for Google Drive Application Connector

JiraDatasetConfig

Dataset config for Jira Application Connector

OneDriveDatasetConfig

Dataset config for OneDrive Application Connector

SharepointDatasetConfig

Dataset config for Sharepoint Application Connector

ZendeskDatasetConfig

Dataset config for Zendesk Application Connector

AbacusUsageMetricsDatasetConfig

Dataset config for Abacus Usage Metrics Application Connector

TeamsScraperDatasetConfig

Dataset config for Teams Scraper Application Connector

FreshserviceDatasetConfig

Dataset config for Freshservice Application Connector

_ApplicationConnectorDatasetConfigFactory

Helper class that provides a standard way to create an ABC using inheritance.

PredictionArguments

An abstract class for prediction arguments specific to problem type.

OptimizationPredictionArguments

Prediction arguments for the OPTIMIZATION problem type

TimeseriesAnomalyPredictionArguments

Prediction arguments for the TS_ANOMALY problem type

ChatLLMPredictionArguments

Prediction arguments for the CHAT_LLM problem type

RegressionPredictionArguments

Prediction arguments for the PREDICTIVE_MODELING problem type

ForecastingPredictionArguments

Prediction arguments for the FORECASTING problem type

CumulativeForecastingPredictionArguments

Prediction arguments for the CUMULATIVE_FORECASTING problem type

NaturalLanguageSearchPredictionArguments

Prediction arguments for the NATURAL_LANGUAGE_SEARCH problem type

FeatureStorePredictionArguments

Prediction arguments for the FEATURE_STORE problem type

_PredictionArgumentsFactory

Helper class that provides a standard way to create an ABC using inheritance.

VectorStoreTextEncoder

Generic enumeration.

VectorStoreConfig

Config for indexing options of a document retriever. Default values of optional arguments are heuristically selected by the Abacus.AI platform based on the underlying data.

ApiEnum

Generic enumeration.

ProblemType

Description of a problem type which is the common underlying problem for different use cases.

RegressionObjective

Generic enumeration.

RegressionTreeHPOMode

Generic enumeration.

PartialDependenceAnalysis

Generic enumeration.

RegressionAugmentationStrategy

Generic enumeration.

RegressionTargetTransform

Generic enumeration.

RegressionTypeOfSplit

Generic enumeration.

RegressionTimeSplitMethod

Generic enumeration.

RegressionLossFunction

Generic enumeration.

ExplainerType

Generic enumeration.

SamplingMethodType

Generic enumeration.

MergeMode

Generic enumeration.

OperatorType

Generic enumeration.

MarkdownOperatorInputType

Generic enumeration.

FillLogic

Generic enumeration.

BatchSize

Generic enumeration.

HolidayCalendars

Generic enumeration.

FileFormat

Generic enumeration.

ExperimentationMode

Generic enumeration.

PersonalizationTrainingMode

Generic enumeration.

PersonalizationObjective

Generic enumeration.

ForecastingObjective

Generic enumeration.

ForecastingFrequency

Generic enumeration.

ForecastingDataSplitType

Generic enumeration.

ForecastingLossFunction

Generic enumeration.

ForecastingLocalScaling

Generic enumeration.

ForecastingFillMethod

Generic enumeration.

ForecastingQuanitlesExtensionMethod

Generic enumeration.

TimeseriesAnomalyDataSplitType

Generic enumeration.

TimeseriesAnomalyTypeOfAnomaly

Generic enumeration.

TimeseriesAnomalyUseHeuristic

Generic enumeration.

NERObjective

Generic enumeration.

NERModelType

Generic enumeration.

NLPDocumentFormat

Generic enumeration.

SentimentType

Generic enumeration.

ClusteringImputationMethod

Generic enumeration.

ConnectorType

Generic enumeration.

ApplicationConnectorType

Generic enumeration.

StreamingConnectorType

Generic enumeration.

PythonFunctionArgumentType

Generic enumeration.

PythonFunctionOutputArgumentType

Generic enumeration.

LLMName

Generic enumeration.

MonitorAlertType

Generic enumeration.

FeatureDriftType

Generic enumeration.

DataIntegrityViolationType

Generic enumeration.

BiasType

Generic enumeration.

AlertActionType

Generic enumeration.

PythonFunctionType

Generic enumeration.

EvalArtifactType

Generic enumeration.

FieldDescriptorType

Generic enumeration.

WorkflowNodeInputType

Generic enumeration.

WorkflowNodeOutputType

Generic enumeration.

StdDevThresholdType

Generic enumeration.

DataType

Generic enumeration.

AgentInterface

Generic enumeration.

WorkflowNodeTemplateType

Generic enumeration.

ProjectConfigType

Generic enumeration.

CPUSize

Generic enumeration.

MemorySize

Generic enumeration.

ResponseSectionType

Generic enumeration.

CodeLanguage

Generic enumeration.

DeploymentConversationType

Generic enumeration.

SamplingConfig

An abstract class for the sampling config of a feature group

NSamplingConfig

The number of distinct values of the key columns to include in the sample, or the number of rows if key columns are not specified.

PercentSamplingConfig

The fraction of distinct values of the feature group to include in the sample.

_SamplingConfigFactory

Helper class that provides a standard way to create an ABC using inheritance.

MergeConfig

An abstract class for the merge config of a feature group

LastNMergeConfig

Merge LAST N chunks/versions of an incremental dataset.

TimeWindowMergeConfig

Merge rows within a given time window of the most recent timestamp.

_MergeConfigFactory

Helper class that provides a standard way to create an ABC using inheritance.

OperatorConfig

Configuration for a template Feature Group Operation

UnpivotConfig

Unpivot Columns in a FeatureGroup.

MarkdownConfig

Transform an input column to a markdown column.

CrawlerTransformConfig

Transform an input column of URLs to HTML text

ExtractDocumentDataConfig

Extracts data from documents.

DataGenerationConfig

Generate synthetic data using a model for finetuning an LLM.

UnionTransformConfig

Takes the union of the current feature group with one or more selected feature groups of the same type.

_OperatorConfigFactory

A class to select and return the correct type of Operator Config based on a serialized OperatorConfig instance.

TrainingConfig

An abstract class for the training config options used to train the model.

PersonalizationTrainingConfig

Training config for the PERSONALIZATION problem type

RegressionTrainingConfig

Training config for the PREDICTIVE_MODELING problem type

ForecastingTrainingConfig

Training config for the FORECASTING problem type

NamedEntityExtractionTrainingConfig

Training config for the NAMED_ENTITY_EXTRACTION problem type

NaturalLanguageSearchTrainingConfig

Training config for the NATURAL_LANGUAGE_SEARCH problem type

ChatLLMTrainingConfig

Training config for the CHAT_LLM problem type

SentenceBoundaryDetectionTrainingConfig

Training config for the SENTENCE_BOUNDARY_DETECTION problem type

SentimentDetectionTrainingConfig

Training config for the SENTIMENT_DETECTION problem type

DocumentClassificationTrainingConfig

Training config for the DOCUMENT_CLASSIFICATION problem type

DocumentSummarizationTrainingConfig

Training config for the DOCUMENT_SUMMARIZATION problem type

DocumentVisualizationTrainingConfig

Training config for the DOCUMENT_VISUALIZATION problem type

ClusteringTrainingConfig

Training config for the CLUSTERING problem type

ClusteringTimeseriesTrainingConfig

Training config for the CLUSTERING_TIMESERIES problem type

EventAnomalyTrainingConfig

Training config for the EVENT_ANOMALY problem type

TimeseriesAnomalyTrainingConfig

Training config for the TS_ANOMALY problem type

CumulativeForecastingTrainingConfig

Training config for the CUMULATIVE_FORECASTING problem type

ThemeAnalysisTrainingConfig

Training config for the THEME_ANALYSIS problem type

AIAgentTrainingConfig

Training config for the AI_AGENT problem type

CustomTrainedModelTrainingConfig

Training config for the CUSTOM_TRAINED_MODEL problem type

CustomAlgorithmTrainingConfig

Training config for the CUSTOM_ALGORITHM problem type

OptimizationTrainingConfig

Training config for the OPTIMIZATION problem type

_TrainingConfigFactory

Helper class that provides a standard way to create an ABC using inheritance.

DeployableAlgorithm

Algorithm that can be deployed to a model.

TimeWindowConfig

Time Window Configuration

ForecastingMonitorConfig

Forecasting Monitor Configuration

StdDevThreshold

Std Dev Threshold types

ItemAttributesStdDevThreshold

Item Attributes Std Dev Threshold for Monitor Alerts

RestrictFeatureMappings

Restrict Feature Mappings for Monitor Filtering

MonitorFilteringConfig

Monitor Filtering Configuration

AlertConditionConfig

An abstract class for alert condition configs

AccuracyBelowThresholdConditionConfig

Accuracy Below Threshold Condition Config for Monitor Alerts

FeatureDriftConditionConfig

Feature Drift Condition Config for Monitor Alerts

TargetDriftConditionConfig

Target Drift Condition Config for Monitor Alerts

HistoryLengthDriftConditionConfig

History Length Drift Condition Config for Monitor Alerts

DataIntegrityViolationConditionConfig

Data Integrity Violation Condition Config for Monitor Alerts

BiasViolationConditionConfig

Bias Violation Condition Config for Monitor Alerts

PredictionCountConditionConfig

Deployment Prediction Condition Config for Deployment Alerts. By default, we monitor whether the number of predictions made over a time window has dropped significantly.

_AlertConditionConfigFactory

Helper class that provides a standard way to create an ABC using inheritance.

AlertActionConfig

An abstract class for alert action configs

EmailActionConfig

Email Action Config for Monitor Alerts

_AlertActionConfigFactory

Helper class that provides a standard way to create an ABC using inheritance.

MonitorThresholdConfig

Monitor Threshold Config for Monitor Alerts

FeatureMappingConfig

Feature mapping configuration for a feature group type.

ProjectFeatureGroupTypeMappingsConfig

Project feature group type mappings.

ConstraintConfig

Constraint configuration.

ProjectFeatureGroupConfig

An abstract class for project feature group configuration.

ConstraintProjectFeatureGroupConfig

Constraint project feature group configuration.

ReviewModeProjectFeatureGroupConfig

Review mode project feature group configuration.

_ProjectFeatureGroupConfigFactory

Helper class that provides a standard way to create an ABC using inheritance.

PythonFunctionArgument

A config class for python function arguments

OutputVariableMapping

A config class for python function output variable mappings

FeatureGroupExportConfig

Export configuration (file connector or database connector information) for feature group exports.

FileConnectorExportConfig

File connector export config for feature groups

DatabaseConnectorExportConfig

Database connector export config for feature groups

_FeatureGroupExportConfigFactory

Helper class that provides a standard way to create an ABC using inheritance.

ResponseSection

A response section that an agent can return to render specific UI elements.

AgentFlowButtonResponseSection

A response section that an AI Agent can return to render a button.

ImageUrlResponseSection

A response section that an agent can return to render an image.

TextResponseSection

A response section that an agent can return to render text.

RuntimeSchemaResponseSection

A segment that an agent can return to render json and ui schema in react-jsonschema-form format for workflow nodes.

CodeResponseSection

A response section that an agent can return to render code.

Base64ImageResponseSection

A response section that an agent can return to render a base64 image.

CollapseResponseSection

A response section that an agent can return to render a collapsible component.

ListResponseSection

A response section that an agent can return to render a list.

ChartResponseSection

A response section that an agent can return to render a chart.

DataframeResponseSection

A response section that an agent can return to render a pandas dataframe.

ApiEndpoint

A collection of endpoints that can be used to make requests, such as API calls or predict calls

ApiKey

An API Key to authenticate requests to the Abacus.AI API

AppUserGroup

An app user group. This is used to determine which users have permissions for external chatbots.

ApplicationConnector

A connector to an external service

BatchPrediction

Make batch predictions.

BatchPredictionVersion

Batch Prediction Version

BatchPredictionVersionLogs

Logs from batch prediction version.

BotInfo

Information about an external application and LLM.

CategoricalRangeViolation

Summary of important range mismatches for a categorical feature discovered by a model monitoring instance

ChatMessage

A single chat message with Abacus Chat.

ChatSession

A chat session with Abacus Data Science Co-pilot.

ChatllmReferralInvite

The response of the Chatllm Referral Invite for different emails

AgentResponse

Response object for agent to support attachments, section data and normal data

ApiClient

Abacus.AI API Client

ClientOptions

Options for configuring the ApiClient

ReadOnlyClient

Abacus.AI Read Only API Client. Only contains GET methods

CodeAutocompleteResponse

An autocomplete response from an LLM

CodeEditResponse

A code edit response from an LLM

CodeSource

Code source for python-based custom feature groups and models

ComputePointInfo

The compute point info of the organization

ConcatenationConfig

Feature Group Concatenation Config

CpuGpuMemorySpecs

Includes the memory specs of the CPU/GPU

CustomChatInstructions

Custom Chat Instructions

CustomLossFunction

Custom Loss Function

CustomMetric

Custom metric.

CustomMetricVersion

Custom metric version

CustomTrainFunctionInfo

Information about how to call the customer provided train function.

DataConsistencyDuplication

Data Consistency for duplication within data

DataMetrics

Processed Metrics and Schema for a dataset version or feature group version

DataPrepLogs

Logs from data preparation.

DataQualityResults

Data Quality results from normalization stage

DataUploadResult

Results of uploading data to an agent.

DatabaseColumnFeatureMapping

Mapping for export of feature group version to database column

DatabaseConnector

A connector to an external service

DatabaseConnectorColumn

A schema description for a column from a database connector

DatabaseConnectorSchema

A schema description for a table from a database connector

Dataset

A dataset reference

DatasetColumn

A schema description for a column

DatasetVersion

A specific version of a dataset

DatasetVersionLogs

Logs from dataset version.

Deployment

A model deployment

DeploymentAuthToken

A deployment authentication token that is used to authenticate prediction requests

DeploymentConversation

A deployment conversation.

DeploymentConversationEvent

A single deployment conversation message.

DeploymentConversationExport

A deployment conversation html export, to be used for downloading the conversation.

DeploymentStatistics

A set of statistics for a realtime deployment.

DocumentData

Data extracted from a docstore document.

DocumentRetriever

A vector store that stores embeddings for a list of document chunks.

DocumentRetrieverConfig

A config for document retriever creation.

DocumentRetrieverLookupResult

Result of a document retriever lookup.

DocumentRetrieverVersion

A version of a document retriever.

DriftDistribution

How actuals or predicted values have changed in the training data versus predicted data

DriftDistributions

For either actuals or predicted values, how it has changed in the training data versus some specified window

Eda

An exploratory data analysis object

EdaChartDescription

Eda Chart Description.

EdaCollinearity

Eda Collinearity of the latest version of the data between all the features.

EdaDataConsistency

Eda Data Consistency, containing the duplicates in the base and comparison versions, the deletions between the base and comparison, and the feature transformations between the base and comparison data.

EdaFeatureAssociation

Eda Feature Association between two features in the data.

EdaFeatureCollinearity

Eda Collinearity of the latest version of the data for a given feature.

EdaForecastingAnalysis

Eda Forecasting Analysis of the latest version of the data.

EdaVersion

A version of an EDA object

EmbeddingFeatureDriftDistribution

Feature distribution for embeddings

ExecuteFeatureGroupOperation

The result of executing a SQL query

ExternalApplication

An external application.

ExternalInvite

The response of the invites for different emails

ExtractedFields

The fields extracted from a document.

Feature

A feature in a feature group

FeatureDistribution

For a single feature, how it has changed in the training data versus some specified window

FeatureDriftRecord

Value of each type of drift

FeatureDriftSummary

Summary of important model monitoring statistics for features available in a model monitoring instance

FeatureGroup

A feature group.

FeatureGroupDocument

A document of a feature group.

FeatureGroupExport

A Feature Group Export Job

FeatureGroupExportConfig

Export configuration (file connector or database connector information) for feature group exports.

FeatureGroupExportDownloadUrl

A Feature Group Export Download URL, which is used to download the feature group version

FeatureGroupLineage

Directed acyclic graph of feature group lineage for all feature groups in a project

FeatureGroupRefreshExportConfig

A Feature Group Connector Export Config outlines the export configuration for a feature group.

FeatureGroupRow

A row of a feature group.

FeatureGroupRowProcess

A feature group row process

FeatureGroupRowProcessLogs

Logs for the feature group row process.

FeatureGroupRowProcessSummary

A summary of the feature group processes for a deployment.

FeatureGroupTemplate

A template for creating feature groups.

FeatureGroupTemplateVariableOptions

Feature Group Template Variable Options

FeatureGroupVersion

A materialized version of a feature group

FeatureGroupVersionLogs

Logs from feature group version.

FeatureImportance

Feature importance for a specified model monitor

FeatureMapping

A description of the data use for a feature

FeaturePerformanceAnalysis

A feature performance analysis for Monitor

FeatureRecord

A feature record

FileConnector

Verification result for an external storage service

FileConnectorInstructions

An object with a full description of the cloud storage bucket authentication options and bucket policy. Returns an error message if the parameters are invalid.

FileConnectorVerification

The verification status of a file connector

FinetunedPretrainedModel

A finetuned pretrained model

ForecastingAnalysisGraphData

Forecasting Analysis Graph Data representation.

ForecastingMonitorItemAnalysis

Forecasting Monitor Item Analysis of the latest version of the data.

ForecastingMonitorSummary

Forecasting Monitor Summary of the latest version of the data.

FunctionLogs

Logs from an invocation of a function.

GeneratedPitFeatureConfigOption

The options to display for possible generated PIT aggregation functions

GraphDashboard

A Graph Dashboard

HoldoutAnalysis

A holdout analysis object.

HoldoutAnalysisVersion

A holdout analysis version object.

HostedModelToken

A hosted model authentication token that is used to authenticate requests to an Abacus-hosted model

ImageGenSettings

Image generation settings

IndexingConfig

The indexing config for a Feature Group

InferredDatabaseColumnToFeatureMappings

Autocompleted mappings from database connector columns to features

InferredFeatureMappings

A description of the data use for a feature

ItemStatistics

ItemStatistics representation.

LlmApp

An LLM App that can be used for generation. LLM Apps are specifically crafted to help with certain tasks like code generation or question answering.

LlmCodeBlock

Parsed code block from an LLM response

LlmExecutionPreview

Preview of executing queries using LLM.

LlmExecutionResult

Results of executing queries using LLM.

LlmGeneratedCode

Code generated by LLM.

LlmInput

The result of encoding an object as input for a language model.

LlmParameters

The parameters of the LLM for the given inputs.

LlmResponse

The response returned by the LLM

MemoryOptions

The overall memory options for executing a job

MessagingConnectorResponse

The response to view label data for Teams

Model

A model

ModelArtifactsExport

A Model Artifacts Export Job

ModelBlueprintExport

Model Blueprint

ModelBlueprintStage

A stage in the model blueprint export process.

ModelLocation

Provide location information for the plug-and-play model.

ModelMetrics

Metrics of the trained model.

ModelMonitor

A model monitor

ModelMonitorOrgSummary

A summary of an organization's model monitors

ModelMonitorSummary

A summary of a model monitor

ModelMonitorSummaryFromOrg

A summary of a model monitor for a given organization

ModelMonitorVersion

A version of a model monitor

ModelMonitorVersionMetricData

Data for displaying model monitor version metric data

ModelTrainingTypeForDeployment

Model training types for deployment.

ModelUpload

A model version that includes the upload identifiers for the various required files.

ModelVersion

A version of a model

ModelVersionFeatureGroupSchema

Schema for a feature group used in model version

ModificationLockInfo

Information about a modification lock for a certain object

Module

Customer-created Python module

MonitorAlert

A Monitor Alert

MonitorAlertVersion

A monitor alert version

MonitorDriftAndDistributions

Summary of important model monitoring statistics for features available in a model monitoring instance

NaturalLanguageExplanation

Natural language explanation of an artifact/object

NestedFeature

A nested feature in a feature group

NestedFeatureSchema

A schema description for a nested feature

NewsSearchResult

A single news search result.

NlpChatResponse

A chat response from an LLM

NullViolation

Summary of anomalous null frequencies for a feature discovered by a model monitoring instance

OrganizationExternalApplicationSettings

The External Application Settings for an Organization.

OrganizationGroup

An Organization Group. Defines the permissions available to the users who are members of the group.

OrganizationSearchResult

A search result object which contains the retrieved artifact and its relevance score

OrganizationSecret

Organization secret

PageData

Data extracted from a docstore page.

Pipeline

A Pipeline For Steps.

PipelineReference

A reference from a pipeline to the objects it is run on.

PipelineStep

A step in a pipeline.

PipelineStepVersion

A version of a pipeline step.

PipelineStepVersionLogs

Logs for a given pipeline step version.

PipelineStepVersionReference

A reference from a pipeline step version to the versions that were output from the pipeline step.

PipelineVersion

A version of a pipeline.

PipelineVersionLogs

Logs for a given pipeline version.

PlaygroundText

The text content inside of a playground segment.

PointInTimeFeature

A point-in-time feature description

PointInTimeFeatureInfo

Point-in-time info for a feature

PointInTimeGroup

A point in time group containing point in time features

PointInTimeGroupFeature

A point in time group feature

PredictionClient

Abacus.AI Prediction API Client. Does not utilize authentication and only contains public prediction methods

PredictionDataset

Batch Input Datasets

PredictionFeatureGroup

Batch Input Feature Group

PredictionInput

Batch inputs

PredictionLogRecord

A Record for a prediction request log.

PredictionOperator

A prediction operator.

PredictionOperatorVersion

A prediction operator version.

ProblemType

Description of a problem type which is the common underlying problem for different use cases.

Project

A project is a container which holds datasets, models and deployments

ProjectConfig

Project-specific config for a feature group

ProjectFeatureGroup

A feature group along with project specific mappings

ProjectFeatureGroupSchema

A schema description for a project feature group

ProjectFeatureGroupSchemaVersion

A version of a schema

ProjectValidation

A validation result for a project

PythonFunction

Customer-created Python function

PythonPlotFunction

Create a Plot for a Dashboard

RangeViolation

Summary of important range mismatches for a numerical feature discovered by a model monitoring instance

RealtimeMonitor

A real-time monitor

RefreshPipelineRun

This keeps track of the overall status of a refresh. A refresh can span multiple resources such as the creation of new dataset versions and the training of a new model version based on them.

RefreshPolicy

A Refresh Policy describes the frequency at which one or more datasets/models/deployments/batch_predictions can be updated.

RefreshSchedule

A refresh schedule for an object. Defines when the next version of the object will be created

RegenerateLlmExternalApplication

An external application that specifies an LLM that a user can regenerate with in RouteLLM.

ResolvedFeatureGroupTemplate

Final SQL from resolving a feature group template.

RoutingAction

Routing action

Schema

A schema description for a feature

StreamingAuthToken

A streaming authentication token that is used to authenticate requests to append data to streaming datasets

StreamingClient

Abacus.AI Streaming API Client. Does not utilize authentication and only contains public streaming methods

StreamingConnector

A connector to an external service

StreamingRowCount

Returns the number of rows in a streaming feature group from the specified time

StreamingSampleCode

Sample code for adding to a streaming feature group with examples from different locations.

TemplateNodeDetails

Details about WorkflowGraphNode object and notebook code for adding template nodes in workflow.

TestPointPredictions

Test Point Predictions

ToneDetails

Tone details for audio

TrainingConfigOptions

Training options for a model

TwitterSearchResult

A single twitter search result.

Upload

An Upload reference for uploading file parts

UploadPart

Unique identifiers for a part

UseCase

A Project Use Case

UseCaseRequirements

Use Case Requirements

User

An Abacus.AI User

UserException

Exception information for errors in usercode.

VideoGenSettings

Video generation settings

VideoSearchResult

A single video search result.

WebSearchResponse

Result of running a web search with optional content fetching.

WebSearchResult

A single search result.

Webhook

An Abacus.AI Webhook attached to an endpoint and event trigger for a given object.

WorkflowNodeTemplate

A workflow node template.

Functions

get_clean_function_source_code_for_agent(func)

validate_constructor_arg_types([friendly_class_name])

validate_input_dict_param(dict_object, friendly_class_name)

deprecated_enums(*enum_values)

Package Contents

class abacusai.AbacusApi(client, method=None, docstring=None, score=None)

Bases: abacusai.return_class.AbstractApiClass

An Abacus API.

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • method (str) – The name of the API method.

  • docstring (str) – The docstring of the API method.

  • score (str) – The relevance score of the API method.

method
docstring
score
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.Address(client, addressLine1=None, addressLine2=None, city=None, stateOrProvince=None, postalCode=None, country=None, additionalInfo=None)

Bases: abacusai.return_class.AbstractApiClass

Address object

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • addressLine1 (str) – The first line of the address

  • addressLine2 (str) – The second line of the address

  • city (str) – The city

  • stateOrProvince (str) – The state or province

  • postalCode (str) – The postal code

  • country (str) – The country

  • additionalInfo (str) – Additional information for invoice

address_line_1
address_line_2
city
state_or_province
postal_code
country
additional_info
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.Agent(client, name=None, agentId=None, createdAt=None, projectId=None, notebookId=None, predictFunctionName=None, sourceCode=None, agentConfig=None, memory=None, trainingRequired=None, agentExecutionConfig=None, codeSource={}, latestAgentVersion={}, draftWorkflowGraph={}, workflowGraph={})

Bases: abacusai.return_class.AbstractApiClass

An AI agent.

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • name (str) – The user-friendly name for the agent.

  • agentId (str) – The unique identifier of the agent.

  • createdAt (str) – Date and time at which the agent was created.

  • projectId (str) – The project this agent belongs to.

  • notebookId (str) – The notebook associated with the agent.

  • predictFunctionName (str) – Name of the function found in the source code that will be executed to run predictions through the agent. It is not executed when this function is run.

  • sourceCode (str) – Python code used to make the agent.

  • agentConfig (dict) – The config options used to create this agent.

  • memory (int) – Memory in GB specified for the deployment resources for the agent.

  • trainingRequired (bool) – Whether training is required to deploy the latest agent code.

  • agentExecutionConfig (dict) – The config for arguments used to execute the agent.

  • latestAgentVersion (AgentVersion) – The latest agent version.

  • codeSource (CodeSource) – If a python model, information on the source code

  • draftWorkflowGraph (WorkflowGraph) – The saved draft state of the workflow graph for the agent.

  • workflowGraph (WorkflowGraph) – The workflow graph for the agent.

name
agent_id
created_at
project_id
notebook_id
predict_function_name
source_code
agent_config
memory
training_required
agent_execution_config
code_source
latest_agent_version
draft_workflow_graph
workflow_graph
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

refresh()

Calls describe and refreshes the current object’s fields

Returns:

The current object

Return type:

Agent

describe()

Retrieves a full description of the specified agent.

Parameters:

agent_id (str) – Unique string identifier associated with the agent.

Returns:

Description of the agent.

Return type:

Agent

list_versions(limit=100, start_after_version=None)

List all versions of an agent.

Parameters:
  • limit (int) – If provided, limits the number of agent versions returned.

  • start_after_version (str) – Unique string identifier of the version after which the list starts.

Returns:

An array of Agent versions.

Return type:

list[AgentVersion]

property description: str

The description of the agent.

Return type:

str

property agent_interface: str

The interface that the agent will be deployed with.

Return type:

str

property agent_connectors: dict

A dictionary mapping ApplicationConnectorType keys to lists of OAuth scopes. Each key represents a specific application connector, while the value is a list of scopes that define the permissions granted to the application.

Return type:

dict

wait_for_publish(timeout=None)

A waiting call until the agent is published.

Parameters:

timeout (int) – The waiting time given to the call to finish, if it doesn’t finish by the allocated time, the call is said to be timed out.

get_status()

Gets the status of the agent publishing.

Returns:

A string describing the status of agent publishing (pending, complete, etc.).

Return type:

str

republish()

Re-publishes the Agent and creates a new Agent Version.

Returns:

The new Agent Version.

Return type:

AgentVersion
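
A minimal usage sketch of the methods above. The API key and agent ID are placeholders, and describe_agent is assumed here to be the client-side counterpart of Agent.describe():

  from abacusai import ApiClient

  client = ApiClient(api_key='YOUR_API_KEY')      # placeholder credentials
  agent = client.describe_agent('YOUR_AGENT_ID')  # assumed client method returning an Agent

  agent.refresh()                        # re-runs describe and updates the object's fields
  print(agent.get_status())              # publishing status, e.g. 'PENDING' or 'COMPLETE'
  for version in agent.list_versions(limit=5):
      print(version.agent_version, version.status)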

class abacusai.AgentChatMessage(client, role=None, text=None, docIds=None, keywordArguments=None, segments=None, streamedData=None, streamedSectionData=None, agentWorkflowNodeId=None)

Bases: abacusai.return_class.AbstractApiClass

A single chat message with Agent Chat.

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • role (str) – The role of the message sender

  • text (list[dict]) – A list of text segments for the message

  • docIds (list[str]) – A list of IDs of the uploaded documents, if the message has any attached.

  • keywordArguments (dict) – User message only. A dictionary of keyword arguments used to generate response.

  • segments (list[dict]) – A list of segments for the message

  • streamedData (str) – The streamed data for the message

  • streamedSectionData (list) – A list of streamed section data for the message

  • agentWorkflowNodeId (str) – The workflow node name associated with the agent response.

role
text
doc_ids
keyword_arguments
segments
streamed_data
streamed_section_data
agent_workflow_node_id
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.AgentConversation(client, messages={})

Bases: abacusai.return_class.AbstractApiClass

A list of messages in an Agent chat conversation.

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • messages (AgentConversationMessage) – The list of messages in the conversation with the agent

messages
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.AgentDataDocumentInfo(client, docId=None, filename=None, mimeType=None, size=None, pageCount=None)

Bases: abacusai.return_class.AbstractApiClass

Information for documents uploaded to agents.

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • docId (str) – The docstore Document ID of the document.

  • filename (str) – The file name of the uploaded document.

  • mimeType (str) – The mime type of the uploaded document.

  • size (int) – The total size of the uploaded document.

  • pageCount (int) – The total number of pages in the uploaded document.

doc_id
filename
mime_type
size
page_count
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.AgentDataExecutionResult(client, response=None, deploymentConversationId=None, docInfos={})

Bases: abacusai.return_class.AbstractApiClass

Results of agent execution with uploaded data.

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • response (str) – The result of agent conversation execution.

  • deploymentConversationId (id) – The unique identifier of the deployment conversation.

  • docInfos (AgentDataDocumentInfo) – A list of dicts containing information on the documents uploaded to the agent.

response
deployment_conversation_id
doc_infos
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.AgentVersion(client, agentVersion=None, status=None, agentId=None, agentConfig=None, publishingStartedAt=None, publishingCompletedAt=None, pendingDeploymentIds=None, failedDeploymentIds=None, error=None, agentExecutionConfig=None, codeSource={}, workflowGraph={})

Bases: abacusai.return_class.AbstractApiClass

A version of an AI agent.

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • agentVersion (str) – The unique identifier of an agent version.

  • status (str) – The current status of the model.

  • agentId (str) – A reference to the agent this version belongs to.

  • agentConfig (dict) – The config options used to create this agent.

  • publishingStartedAt (str) – The start time and date of the training process in ISO-8601 format.

  • publishingCompletedAt (str) – The end time and date of the training process in ISO-8601 format.

  • pendingDeploymentIds (list) – List of deployment IDs where deployment is pending.

  • failedDeploymentIds (list) – List of failed deployment IDs.

  • error (str) – Relevant error if the status is FAILED.

  • agentExecutionConfig (dict) – The config for arguments used to execute the agent.

  • codeSource (CodeSource) – If a python model, information on where the source code is located.

  • workflowGraph (WorkflowGraph) – The workflow graph for the agent.

agent_version
status
agent_id
agent_config
publishing_started_at
publishing_completed_at
pending_deployment_ids
failed_deployment_ids
error
agent_execution_config
code_source
workflow_graph
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

refresh()

Calls describe and refreshes the current object’s fields

Returns:

The current object

Return type:

AgentVersion

describe()

Retrieves a full description of the specified agent version.

Parameters:

agent_version (str) – Unique string identifier of the agent version.

Returns:

An agent version.

Return type:

AgentVersion

wait_for_publish(timeout=None)

A waiting call until the agent is published.

Parameters:

timeout (int) – The waiting time given to the call to finish, if it doesn’t finish by the allocated time, the call is said to be timed out.

get_status()

Gets the status of the model version under training.

Returns:

A string describing the status of a model training (pending, complete, etc.).

Return type:

str

class abacusai.AiBuildingTask(client, task=None, taskType=None)

Bases: abacusai.return_class.AbstractApiClass

A task for Data Science Co-pilot to help build AI.

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • task (str) – The task to be performed

  • taskType (str) – The type of task

task
task_type
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.Algorithm(client, name=None, problemType=None, createdAt=None, updatedAt=None, isDefaultEnabled=None, trainingInputMappings=None, trainFunctionName=None, predictFunctionName=None, predictManyFunctionName=None, initializeFunctionName=None, configOptions=None, algorithmId=None, useGpu=None, algorithmTrainingConfig=None, onlyOfflineDeployable=None, codeSource={})

Bases: abacusai.return_class.AbstractApiClass

Customer created algorithm

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • name (str) – The name of the algorithm

  • problemType (str) – The type of the problem this algorithm will work on

  • createdAt (str) – When the algorithm was created

  • updatedAt (str) – When the algorithm was last updated

  • isDefaultEnabled (bool) – Whether to train with the algorithm by default

  • trainingInputMappings (dict) – The mappings for train function parameters’ names, e.g. names for training data, name for training config

  • trainFunctionName (str) – Name of the function found in the source code that will be executed to train the model. It is not executed when this function is run.

  • predictFunctionName (str) – Name of the function found in the source code that will be executed to run predictions through the model. It is not executed when this function is run.

  • predictManyFunctionName (str) – Name of the function found in the source code that will be executed for batch prediction of the model. It is not executed when this function is run.

  • initializeFunctionName (str) – Name of the function found in the source code to initialize the trained model before using it to make predictions

  • configOptions (dict) – Map dataset types and configs to train function parameter names

  • algorithmId (str) – The unique identifier of the algorithm

  • useGpu (bool) – Whether to use gpu for model training

  • algorithmTrainingConfig (dict) – The algorithm specific training config

  • onlyOfflineDeployable (bool) – Whether or not the algorithm is only allowed to be deployed offline

  • codeSource (CodeSource) – Info about the source code of the algorithm

name
problem_type
created_at
updated_at
is_default_enabled
training_input_mappings
train_function_name
predict_function_name
predict_many_function_name
initialize_function_name
config_options
algorithm_id
use_gpu
algorithm_training_config
only_offline_deployable
code_source
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.Annotation(client, annotationType=None, annotationValue=None, comments=None, metadata=None)

Bases: abacusai.return_class.AbstractApiClass

An Annotation Store Annotation

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • annotationType (str) – A name determining the type of annotation and how to interpret the annotation value data, e.g. as a label, bounding box, etc.

  • annotationValue (dict) – JSON-compatible value of the annotation. The format of the value is determined by the annotation type.

  • comments (dict) – Comments about the annotation. This is a dictionary of feature name to the corresponding comment.

  • metadata (dict) – Metadata about the annotation.

annotation_type
annotation_value
comments
metadata
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.AnnotationConfig(client, featureAnnotationConfigs=None, labels=None, statusFeature=None, commentsFeatures=None, metadataFeature=None)

Bases: abacusai.return_class.AbstractApiClass

Annotation config for a feature group

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • featureAnnotationConfigs (list) – List of feature annotation configs

  • labels (list) – List of labels

  • statusFeature (str) – Name of the feature that contains the status of the annotation (Optional)

  • commentsFeatures (list) – Features that contain comments for the annotation (Optional)

  • metadataFeature (str) – Name of the feature that contains the metadata for the annotation (Optional)

feature_annotation_configs
labels
status_feature
comments_features
metadata_feature
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.AnnotationDocument(client, docId=None, featureGroupRowIdentifier=None, featureGroupRowIndex=None, totalRows=None, isAnnotationPresent=None)

Bases: abacusai.return_class.AbstractApiClass

Document to be annotated.

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • docId (str) – The docstore Document ID of the document.

  • featureGroupRowIdentifier (str) – The key value of the feature group row the annotation is on. Usually the primary key value.

  • featureGroupRowIndex (int) – The index of the document row in the feature group.

  • totalRows (int) – The total number of rows in the feature group.

  • isAnnotationPresent (bool) – Whether the document already has an annotation. Returns None if the feature group is not in annotations review mode.

doc_id
feature_group_row_identifier
feature_group_row_index
total_rows
is_annotation_present
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.AnnotationEntry(client, featureGroupId=None, featureName=None, docId=None, featureGroupRowIdentifier=None, updatedAt=None, annotationEntryMarker=None, status=None, lockedUntil=None, verificationInfo=None, annotation={})

Bases: abacusai.return_class.AbstractApiClass

An Annotation Store entry for an Annotation

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • featureGroupId (str) – The ID of the feature group this annotation belongs to.

  • featureName (str) – Name of the feature this annotation is on.

  • docId (str) – The ID of the primary document the annotation is on.

  • featureGroupRowIdentifier (str) – The key value of the feature group row the annotation is on (cast to string). Usually the primary key value.

  • updatedAt (str) – Most recent time the annotation entry was modified, e.g. creation or update time.

  • annotationEntryMarker (str) – The entry marker for the annotation.

  • status (str) – The status of labeling the document.

  • lockedUntil (str) – The time until which the document is locked for editing, in ISO-8601 format.

  • verificationInfo (dict) – The verification info for the annotation.

  • annotation (Annotation) – json-compatible structure holding the type and value of the annotation.

feature_group_id
feature_name
doc_id
feature_group_row_identifier
updated_at
annotation_entry_marker
status
locked_until
verification_info
annotation
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.AnnotationsStatus(client, total=None, done=None, inProgress=None, todo=None, latestUpdatedAt=None, isMaterializationNeeded=None, latestMaterializedAnnotationConfig={})

Bases: abacusai.return_class.AbstractApiClass

The status of annotations for a feature group

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • total (int) – The total number of documents annotated

  • done (int) – The number of documents annotated

  • inProgress (int) – The number of documents currently being annotated

  • todo (int) – The number of documents that need to be annotated

  • latestUpdatedAt (str) – The latest time an annotation was updated (ISO-8601 format)

  • isMaterializationNeeded (bool) – Whether feature group needs to be materialized before using for annotations

  • latestMaterializedAnnotationConfig (AnnotationConfig) – The annotation config corresponding to the latest materialized feature group

total
done
in_progress
todo
latest_updated_at
is_materialization_needed
latest_materialized_annotation_config
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.ApiClass

Bases: abc.ABC

Helper class that provides a standard way to create an ABC using inheritance.

_upper_snake_case_keys: bool
_support_kwargs: bool
__post_init__()
classmethod _get_builder()
__str__()
_repr_html_()
__getitem__(item)
Parameters:

item (str)

__setitem__(item, value)
Parameters:
  • item (str)

  • value (Any)

_unset_item(item)
Parameters:

item (str)

get(item, default=None)
Parameters:
  • item (str)

  • default (Any)

pop(item, default=NotImplemented)
Parameters:
  • item (str)

  • default (Any)

to_dict()

Standardizes converting an ApiClass to a dictionary. Keys of the response dictionary are converted to camel case. This also validates the fields (type, value, etc.) received in the dictionary.

classmethod from_dict(input_dict)
Parameters:

input_dict (dict)

abacusai.get_clean_function_source_code_for_agent(func)
Parameters:

func (Callable)

abacusai.validate_constructor_arg_types(friendly_class_name=None)
abacusai.validate_input_dict_param(dict_object, friendly_class_name, must_contain=[])
class abacusai.FieldDescriptor

Bases: abacusai.api_class.abstract.ApiClass

Config describing a single field to be extracted from a document.

Parameters:
  • field (str) – The field to be extracted. This will be used as the key in the response.

  • description (str) – The description of this field. If not included, the response_field will be used.

  • example_extraction (Union[str, int, bool, float]) – An example of this extracted field.

  • type (FieldDescriptorType) – The type of this field. If not provided, the default type is STRING.

field: str
description: str
example_extraction: str | int | bool | float | list | dict
type: abacusai.api_class.enums.FieldDescriptorType
class abacusai.JSONSchema
classmethod from_fields_list(fields_list)
Parameters:

fields_list (List[str])

classmethod to_fields_list(json_schema)
Return type:

List[str]

class abacusai.WorkflowNodeInputMapping

Bases: abacusai.api_class.abstract.ApiClass

Represents a mapping of inputs to a workflow node.

Parameters:
  • name (str) – The name of the input variable of the node function.

  • variable_type (WorkflowNodeInputType) – The type of the input.

  • variable_source (str) – The name of the node this variable is sourced from. If the type is WORKFLOW_VARIABLE, the value given by the source node will be directly used. If the type is USER_INPUT, the value given by the source node will be used as the default initial value before the user edits it. Set to None if the type is USER_INPUT and the variable doesn’t need a pre-filled initial value.

  • is_required (bool) – Indicates whether the input is required. Defaults to True.

name: str
variable_type: abacusai.api_class.enums.WorkflowNodeInputType
variable_source: str
source_prop: str
is_required: bool
default_value: Any
to_dict()

Standardizes converting an ApiClass to a dictionary. Keys of the response dictionary are converted to camel case. This also validates the fields (type, value, etc.) received in the dictionary.

classmethod from_dict(mapping)
Parameters:

mapping (dict)

class abacusai.WorkflowNodeInputSchema

Bases: abacusai.api_class.abstract.ApiClass, JSONSchema

A schema conformant to react-jsonschema-form for workflow node input.

To initialize a WorkflowNodeInputSchema dependent on another node’s output, use the from_workflow_node method.

Parameters:
  • json_schema (dict) – The JSON schema for the input, conformant to react-jsonschema-form specification. Must define keys like “title”, “type”, and “properties”. Supported elements include Checkbox, Radio Button, Dropdown, Textarea, Number, Date, and file upload. Nested elements, arrays, and other complex types are not supported.

  • ui_schema (dict) – The UI schema for the input, conformant to react-jsonschema-form specification.

json_schema: dict
ui_schema: dict
schema_source: str
schema_prop: str
runtime_schema: bool
to_dict()

Standardizes converting an ApiClass to a dictionary. Keys of the response dictionary are converted to camel case. This also validates the fields (type, value, etc.) received in the dictionary.

classmethod from_dict(schema)
Parameters:

schema (dict)

classmethod from_workflow_node(schema_source, schema_prop)

Creates a WorkflowNodeInputSchema instance which references the schema generated by a WorkflowGraphNode.

Parameters:
  • schema_source (str) – The name of the source WorkflowGraphNode.

  • schema_prop (str) – The name of the input schema parameter which source node outputs.
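
For example, a hedged sketch (both names below are hypothetical and would have to match an actual upstream node and its output):

  input_schema = WorkflowNodeInputSchema.from_workflow_node(
      schema_source='schema_builder',  # hypothetical upstream WorkflowGraphNode name
      schema_prop='output_schema',     # hypothetical output property carrying the schema
  )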

class abacusai.WorkflowNodeOutputMapping

Bases: abacusai.api_class.abstract.ApiClass

Represents a mapping of output from a workflow node.

Parameters:
  • name (str) – The name of the output.

  • variable_type (Union[WorkflowNodeOutputType, str]) – The type of the output in the form of an enum or a string.

name: str
variable_type: abacusai.api_class.enums.WorkflowNodeOutputType | str
__post_init__()
to_dict()

Standardizes converting an ApiClass to a dictionary. Keys of the response dictionary are converted to camel case. This also validates the fields (type, value, etc.) received in the dictionary.

classmethod from_dict(mapping)
Parameters:

mapping (dict)

class abacusai.WorkflowNodeOutputSchema

Bases: abacusai.api_class.abstract.ApiClass, JSONSchema

A schema conformant to react-jsonschema-form for a workflow node output.

Parameters:

json_schema (dict) – The JSON schema for the output, conformant to react-jsonschema-form specification.

json_schema: dict
to_dict()

Standardizes converting an ApiClass to a dictionary. Keys of the response dictionary are converted to camel case. This also validates the fields (type, value, etc.) received in the dictionary.

classmethod from_dict(schema)
Parameters:

schema (dict)

class abacusai.WorkflowGraphNode(name, input_mappings=None, output_mappings=None, function=None, function_name=None, source_code=None, input_schema=None, output_schema=None, template_metadata=None)

Bases: abacusai.api_class.abstract.ApiClass

Represents a node in an Agent workflow graph.

Parameters:
  • name (str) – A unique name for the workflow node.

  • input_mappings (List[WorkflowNodeInputMapping]) – List of input mappings for the node. Each arg/kwarg of the node function should have a corresponding input mapping.

  • output_mappings (List[WorkflowNodeOutputMapping]) – List of output mappings for the node. Each field in the returned dict/AgentResponse must have a corresponding output mapping.

  • function (callable) – The callable node function reference.

  • input_schema (WorkflowNodeInputSchema) – The react json schema for the user input variables.

  • output_schema (WorkflowNodeOutputSchema) – The react json schema for the output to be shown on UI.

  • function_name (str)

  • source_code (str)

  • template_metadata (dict)

Additional Attributes:

function_name (str): The name of the function.

source_code (str): The source code of the function.

template_metadata
classmethod _raw_init(name, input_mappings=None, output_mappings=None, function=None, function_name=None, source_code=None, input_schema=None, output_schema=None, template_metadata=None)
classmethod from_template(template_name, name, configs=None, input_mappings=None, input_schema=None, output_schema=None, sleep_time=None)
to_dict()

Standardizes converting an ApiClass to a dictionary. Keys of the response dictionary are converted to camel case. This also validates the fields (type, value, etc.) received in the dictionary.

classmethod from_dict(node)
Parameters:

node (dict)

__setattr__(name, value)
__getattribute__(name)
class Outputs(node)
Parameters:

node (WorkflowGraphNode)

node
__getattr__(name)
property outputs
class abacusai.WorkflowGraphEdge(source, target, details=None)

Bases: abacusai.api_class.abstract.ApiClass

Represents an edge in an Agent workflow graph.

To make an edge conditional, provide {‘EXECUTION_CONDITION’: ‘<condition>’} key-value in the details dictionary. The condition should be a Pythonic expression string that evaluates to a boolean value and only depends on the outputs of the source node of the edge.

Parameters:
  • source (str) – The name of the source node of the edge.

  • target (str) – The name of the target node of the edge.

  • details (dict) – Additional details about the edge. Like the condition for edge execution.

source: str | WorkflowGraphNode
target: str | WorkflowGraphNode
details: dict
to_nx_edge()
class abacusai.WorkflowGraph

Bases: abacusai.api_class.abstract.ApiClass

Represents an Agent workflow graph.

The edges define the node invocation order.

Parameters:
  • nodes (List[WorkflowGraphNode]) – A list of nodes in the workflow graph.

  • edges (List[WorkflowGraphEdge]) – A list of edges in the workflow graph, where each edge is a tuple of source, target, and details.

  • primary_start_node (Union[str, WorkflowGraphNode]) – The primary node to start the workflow from.

nodes: List[WorkflowGraphNode]
edges: List[WorkflowGraphEdge]
primary_start_node: str | WorkflowGraphNode
common_source_code: str
to_dict()

Standardizes converting an ApiClass to a dictionary. Keys of the response dictionary are converted to camel case. This also validates the fields (type, value, etc.) received in the dictionary.

classmethod from_dict(graph)
Parameters:

graph (dict)
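
As a sketch of how nodes, edges, and the graph compose, including a conditional edge as described above; the node functions, names, and condition are illustrative only:

    from abacusai import (
        WorkflowGraph, WorkflowGraphEdge, WorkflowGraphNode,
        WorkflowNodeInputMapping, WorkflowNodeInputType,
        WorkflowNodeOutputMapping, WorkflowNodeOutputType,
    )

    # Hypothetical node functions.
    def fetch_page(url: str):
        return {'text': 'stub page contents', 'status': 'ok'}

    def summarize_page(text: str):
        return {'summary': text[:100]}

    fetcher = WorkflowGraphNode(
        name='fetcher',
        function=fetch_page,
        input_mappings=[WorkflowNodeInputMapping(
            name='url', variable_type=WorkflowNodeInputType.USER_INPUT)],
        output_mappings=[
            WorkflowNodeOutputMapping(name='text', variable_type=WorkflowNodeOutputType.STRING),
            WorkflowNodeOutputMapping(name='status', variable_type=WorkflowNodeOutputType.STRING),
        ],
    )

    summarizer = WorkflowGraphNode(
        name='summarizer',
        function=summarize_page,
        input_mappings=[WorkflowNodeInputMapping(
            name='text',
            variable_type=WorkflowNodeInputType.WORKFLOW_VARIABLE,
            variable_source='fetcher',
        )],
        output_mappings=[WorkflowNodeOutputMapping(
            name='summary', variable_type=WorkflowNodeOutputType.STRING)],
    )

    graph = WorkflowGraph(
        nodes=[fetcher, summarizer],
        edges=[WorkflowGraphEdge(
            source='fetcher',
            target='summarizer',
            # Traverse the edge only when the fetch succeeded; the condition
            # may only reference outputs of the source node.
            details={'EXECUTION_CONDITION': "status == 'ok'"},
        )],
        primary_start_node='fetcher',
    )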

class abacusai.AgentConversationMessage

Bases: abacusai.api_class.abstract.ApiClass

Message format for agent conversation

Parameters:
  • is_user (bool) – Whether the message is from the user.

  • text (str) – The message’s text.

  • document_contents (dict) – Dict of document name to document text in case of any document present.

is_user: bool
text: str
document_contents: dict
to_dict()

Standardizes converting an ApiClass to a dictionary. Keys of the response dictionary are converted to camel case. This also validates the fields (type, value, etc.) received in the dictionary.
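
A minimal sketch of constructing a message; the document name and text are placeholders:

    from abacusai import AgentConversationMessage

    message = AgentConversationMessage(
        is_user=True,
        text='Summarize the attached report.',
        document_contents={'report.pdf': 'extracted report text...'},
    )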

class abacusai.WorkflowNodeTemplateConfig

Bases: abacusai.api_class.abstract.ApiClass

Represents a WorkflowNode template config.

Parameters:
  • name (str) – A unique name of the config.

  • description (str) – The description of this config.

  • default_value (str) – Default value of the config to be used if value is not provided during node initialization.

  • is_required (bool) – Indicates whether the config is required. Defaults to False.

name: str
description: str
default_value: str
is_required: bool
to_dict()

Standardizes converting an ApiClass to a dictionary. Keys of the response dictionary are converted to camel case. This also validates the fields (type, value, etc.) received in the dictionary.

classmethod from_dict(mapping)
Parameters:

mapping (dict)

class abacusai.WorkflowNodeTemplateInput

Bases: abacusai.api_class.abstract.ApiClass

Represents an input to the workflow node generated using template.

Parameters:
  • name (str) – A unique name of the input.

  • is_required (bool) – Indicates whether the input is required. Defaults to False.

  • description (str) – The description of this input.

name: str
is_required: bool
description: str
to_dict()

Standardizes converting an ApiClass to a dictionary. Keys of the response dictionary are converted to camel case. This also validates the fields (type, value, etc.) received in the dictionary.

classmethod from_dict(mapping)
Parameters:

mapping (dict)

class abacusai.WorkflowNodeTemplateOutput

Bases: abacusai.api_class.abstract.ApiClass

Represents an output returned by the workflow node generated using template.

Parameters:
  • name (str) – The name of the output.

  • variable_type (WorkflowNodeOutputType) – The type of the output.

  • description (str) – The description of this output.

name: str
variable_type: abacusai.api_class.enums.WorkflowNodeOutputType
description: str
to_dict()

Standardizes converting an ApiClass to a dictionary. Keys of the response dictionary are converted to camel case. This also validates the fields (type, value, etc.) received in the dictionary.

classmethod from_dict(mapping)
Parameters:

mapping (dict)

class abacusai.HotkeyPrompt

Bases: abacusai.api_class.abstract.ApiClass

A config class for a Data Science Co-Pilot Hotkey

Parameters:
  • prompt (str) – The prompt to send to Data Science Co-Pilot

  • title (str) – A short, descriptive title for the prompt. If not provided, one will be automatically generated.

prompt: str
title: str
disable_problem_type_context: bool
ignore_history: bool
class abacusai._ApiClassFactory

Bases: abc.ABC

Helper class that provides a standard way to create an ABC using inheritance.

config_abstract_class = None
config_class_key = None
config_class_map
classmethod from_dict(config)
Parameters:

config (dict)

Return type:

ApiClass

class abacusai.BatchPredictionArgs

Bases: abacusai.api_class.abstract.ApiClass

An abstract class for Batch Prediction args specific to problem type.

_support_kwargs: bool
kwargs: dict
problem_type: abacusai.api_class.enums.ProblemType
classmethod _get_builder()
class abacusai.ForecastingBatchPredictionArgs

Bases: BatchPredictionArgs

Batch Prediction Config for the FORECASTING problem type

Parameters:
  • for_eval (bool) – If True, the test fold which was created during training and used for metrics calculation will be used as input data. These predictions are hence used for model evaluation.

  • predictions_start_date (str) – The start date for predictions. Accepts timestamp integers and strings in many standard formats such as YYYY-MM-DD, YYYY-MM-DD HH:MM:SS, or YYYY-MM-DDTHH:MM:SS. If not specified, the prediction start date will be automatically defined.

  • use_prediction_offset (bool) – If True, use prediction offset.

  • start_date_offset (int) – Sets the prediction start date as this offset relative to the default prediction start date.

  • forecasting_horizon (int) – The number of timestamps to predict in the future. Range: [1, 1000].

  • item_attributes_to_include_in_the_result (list) – List of columns to include in the prediction output.

  • explain_predictions (bool) – If True, calculates explanations for the forecasted values along with predictions.

  • create_monitor (bool) – Controls whether to automatically create a monitor to calculate the drift each time the batch prediction is run. Defaults to True if not specified.

for_eval: bool
predictions_start_date: str
use_prediction_offset: bool
start_date_offset: int
forecasting_horizon: int
item_attributes_to_include_in_the_result: list
explain_predictions: bool
create_monitor: bool
__post_init__()
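
A sketch of typical forecasting batch prediction args; the date, horizon, and column names are illustrative:

    from abacusai import ForecastingBatchPredictionArgs

    bp_args = ForecastingBatchPredictionArgs(
        for_eval=False,
        predictions_start_date='2024-01-01',
        forecasting_horizon=30,
        item_attributes_to_include_in_the_result=['item_name'],
    )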
class abacusai.NamedEntityExtractionBatchPredictionArgs

Bases: BatchPredictionArgs

Batch Prediction Config for the NAMED_ENTITY_EXTRACTION problem type

Parameters:

for_eval (bool) – If True, the test fold which was created during training and used for metrics calculation will be used as input data. These predictions are hence used for model evaluation.

for_eval: bool
__post_init__()
class abacusai.PersonalizationBatchPredictionArgs

Bases: BatchPredictionArgs

Batch Prediction Config for the PERSONALIZATION problem type

Parameters:
  • for_eval (bool) – If True, the test fold which was created during training and used for metrics calculation will be used as input data. These predictions are hence used for model evaluation.

  • number_of_items (int) – Number of items to recommend.

  • item_attributes_to_include_in_the_result (list) – List of columns to include in the prediction output.

  • score_field (str) – If specified, relative item scores will be returned using a field with this name

for_eval: bool
number_of_items: int
item_attributes_to_include_in_the_result: list
score_field: str
__post_init__()
class abacusai.PredictiveModelingBatchPredictionArgs

Bases: BatchPredictionArgs

Batch Prediction Config for the PREDICTIVE_MODELING problem type

Parameters:
  • for_eval (bool) – If True, the test fold which was created during training and used for metrics calculation will be used as input data. These predictions are hence used for model evaluation.

  • explainer_type (enums.ExplainerType) – The type of explainer to use to generate explanations on the batch prediction.

  • number_of_samples_to_use_for_explainer (int) – Number Of Samples To Use For Kernel Explainer.

  • include_multi_class_explanations (bool) – If True, Includes explanations for all classes in multi-class classification.

  • features_considered_constant_for_explanations (str) – Comma-separated list of fields to treat as constant in SHAP explanations.

  • importance_of_records_in_nested_columns (str) – Returns importance of each index in the specified nested column instead of SHAP column explanations.

  • explanation_filter_lower_bound (float) – If set, explanations will be limited to predictions above this value. Range: [0, 1].

  • explanation_filter_upper_bound (float) – If set, explanations will be limited to predictions below this value. Range: [0, 1].

  • explanation_filter_label (str) – For classification problems specifies the label to which the explanation bounds are applied.

  • output_columns (list) – A list of column names to include in the prediction result.

  • explain_predictions (bool) – If True, calculates explanations for the predicted values along with predictions.

  • create_monitor (bool) – Controls whether to automatically create a monitor to calculate the drift each time the batch prediction is run. Defaults to True if not specified.

for_eval: bool
explainer_type: abacusai.api_class.enums.ExplainerType
number_of_samples_to_use_for_explainer: int
include_multi_class_explanations: bool
features_considered_constant_for_explanations: str
importance_of_records_in_nested_columns: str
explanation_filter_lower_bound: float
explanation_filter_upper_bound: float
explanation_filter_label: str
output_columns: list
explain_predictions: bool
create_monitor: bool
__post_init__()
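
A sketch of explanation-enabled args for this problem type; the output columns are hypothetical:

    from abacusai import ExplainerType, PredictiveModelingBatchPredictionArgs

    bp_args = PredictiveModelingBatchPredictionArgs(
        explain_predictions=True,
        explainer_type=ExplainerType.KERNEL_EXPLAINER,
        output_columns=['customer_id', 'prediction'],
    )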
class abacusai.PretrainedModelsBatchPredictionArgs

Bases: BatchPredictionArgs

Batch Prediction Config for the PRETRAINED_MODELS problem type

Parameters:
  • for_eval (bool) – If True, the test fold which was created during training and used for metrics calculation will be used as input data. These predictions are hence used for model evaluation.

  • files_output_location_prefix (str) – The output location prefix for the files.

  • channel_id_to_label_map (str) – JSON string for the map from channel ids to their labels.

for_eval: bool
files_output_location_prefix: str
channel_id_to_label_map: str
__post_init__()
class abacusai.SentenceBoundaryDetectionBatchPredictionArgs

Bases: BatchPredictionArgs

Batch Prediction Config for the SENTENCE_BOUNDARY_DETECTION problem type

Parameters:
  • for_eval (bool) – If True, the test fold which was created during training and used for metrics calculation will be used as input data. These predictions are hence used for model evaluation.

  • explode_output (bool) – Explode data so there is one sentence per row.

for_eval: bool
explode_output: bool
__post_init__()
class abacusai.ThemeAnalysisBatchPredictionArgs

Bases: BatchPredictionArgs

Batch Prediction Config for the THEME_ANALYSIS problem type

Parameters:
  • for_eval (bool) – If True, the test fold which was created during training and used for metrics calculation will be used as input data. These predictions are hence used for model evaluation.

  • analysis_frequency (str) – The length of each analysis interval.

  • start_date (str) – The end point for predictions.

  • analysis_days (int) – How many days to analyze.

for_eval: bool
analysis_frequency: str
start_date: str
analysis_days: int
__post_init__()
class abacusai.ChatLLMBatchPredictionArgs

Bases: BatchPredictionArgs

Batch Prediction Config for the ChatLLM problem type

Parameters:

for_eval (bool) – If True, the test fold which was created during training and used for metrics calculation will be used as input data. These predictions are hence used for model evaluation.

for_eval: bool
__post_init__()
class abacusai.TrainablePlugAndPlayBatchPredictionArgs

Bases: BatchPredictionArgs

Batch Prediction Config for the TrainablePlugAndPlay problem type

Parameters:
  • for_eval (bool) – If True, the test fold which was created during training and used for metrics calculation will be used as input data. These predictions are hence used for model evaluation.

  • create_monitor (bool) – Controls whether to automatically create a monitor to calculate the drift each time the batch prediction is run. Defaults to True if not specified.

for_eval: bool
create_monitor: bool
__post_init__()
class abacusai.AIAgentBatchPredictionArgs

Bases: BatchPredictionArgs

Batch Prediction Config for the AIAgents problem type

__post_init__()
class abacusai._BatchPredictionArgsFactory

Bases: abacusai.api_class.abstract._ApiClassFactory

Helper class that provides a standard way to create an ABC using inheritance.

config_abstract_class
config_class_key = 'problem_type'
config_class_map
class abacusai.Blob(contents, mime_type=None, filename=None, size=None)

Bases: abacusai.api_class.abstract.ApiClass

An object for storing and passing file data. In AI Agents, if a function accepts file upload as an argument, the uploaded file is passed as a Blob object. If a function returns a Blob object, it will be rendered as a file download.

Parameters:
  • contents (bytes) – The binary contents of the blob.

  • mime_type (str) – The mime type of the blob.

  • filename (str) – The original filename of the blob.

  • size (int) – The size of the blob in bytes.

filename: str
contents: bytes
mime_type: str
size: int
classmethod from_local_file(file_path)
Parameters:

file_path (str)

Return type:

Blob

classmethod from_contents(contents, filename=None, mime_type=None)
Return type:

Blob
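
A minimal sketch of both constructors, assuming the file path exists locally:

    from abacusai import Blob

    # Wrap an existing file on disk.
    report = Blob.from_local_file('report.pdf')

    # Or build a Blob directly from in-memory bytes.
    note = Blob.from_contents(b'hello', filename='note.txt', mime_type='text/plain')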

class abacusai.BlobInput(filename=None, contents=None, mime_type=None, size=None)

Bases: Blob

An object for storing and passing file data. In AI Agents, if a function accepts file upload as an argument, the uploaded file is passed as a BlobInput object.

Parameters:
  • filename (str) – The original filename of the blob.

  • contents (bytes) – The binary contents of the blob.

  • mime_type (str) – The mime type of the blob.

  • size (int) – The size of the blob in bytes.

class abacusai.DatasetConfig

Bases: abacusai.api_class.abstract.ApiClass

An abstract class for dataset configs

Parameters:

is_documentset (bool) – Whether the dataset is a document set

is_documentset: bool
class abacusai.StreamingConnectorDatasetConfig

Bases: abacusai.api_class.dataset.DatasetConfig

An abstract class for dataset configs specific to streaming connectors.

Parameters:

streaming_connector_type (StreamingConnectorType) – The type of streaming connector

streaming_connector_type: abacusai.api_class.enums.StreamingConnectorType
classmethod _get_builder()
class abacusai.KafkaDatasetConfig

Bases: StreamingConnectorDatasetConfig

Dataset config for Kafka Streaming Connector

Parameters:

topic (str) – The kafka topic to consume

topic: str
__post_init__()
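
For example, a config consuming a hypothetical topic:

    from abacusai import KafkaDatasetConfig

    kafka_config = KafkaDatasetConfig(topic='clickstream-events')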
class abacusai._StreamingConnectorDatasetConfigFactory

Bases: abacusai.api_class.abstract._ApiClassFactory

Helper class that provides a standard way to create an ABC using inheritance.

config_abstract_class
config_class_key = 'streaming_connector_type'
config_class_map
class abacusai.DocumentType

Bases: ApiEnum

Generic enumeration.

Derive from this class to define new enumerations.

SIMPLE_TEXT = 'SIMPLE_TEXT'
TEXT = 'TEXT'
TABLES_AND_FORMS = 'TABLES_AND_FORMS'
EMBEDDED_IMAGES = 'EMBEDDED_IMAGES'
SCANNED_TEXT = 'SCANNED_TEXT'
classmethod is_ocr_forced(document_type)
Parameters:

document_type (DocumentType)

class abacusai.OcrMode

Bases: ApiEnum

Generic enumeration.

Derive from this class to define new enumerations.

AUTO = 'AUTO'
DEFAULT = 'DEFAULT'
LAYOUT = 'LAYOUT'
SCANNED = 'SCANNED'
COMPREHENSIVE = 'COMPREHENSIVE'
COMPREHENSIVE_V2 = 'COMPREHENSIVE_V2'
COMPREHENSIVE_TABLE_MD = 'COMPREHENSIVE_TABLE_MD'
COMPREHENSIVE_FORM_MD = 'COMPREHENSIVE_FORM_MD'
COMPREHENSIVE_FORM_AND_TABLE_MD = 'COMPREHENSIVE_FORM_AND_TABLE_MD'
TESSERACT_FAST = 'TESSERACT_FAST'
LLM = 'LLM'
AUGMENTED_LLM = 'AUGMENTED_LLM'
classmethod aws_ocr_modes()
class abacusai.ParsingConfig

Bases: abacusai.api_class.abstract.ApiClass

Custom config for dataset parsing.

Parameters:
  • escape (str) – Escape character for CSV files. Defaults to ‘”’.

  • csv_delimiter (str) – Delimiter for CSV files. Defaults to None.

  • file_path_with_schema (str) – Path to the file with schema. Defaults to None.

escape: str
csv_delimiter: str
file_path_with_schema: str
class abacusai.DocumentProcessingConfig

Bases: abacusai.api_class.abstract.ApiClass

Document processing configuration.

Parameters:
  • document_type (DocumentType) – Type of document. Can be one of Text, Tables and Forms, Embedded Images, etc. If not specified, type will be decided automatically.

  • highlight_relevant_text (bool) – Whether to extract bounding boxes and highlight relevant text in search results. Defaults to False.

  • extract_bounding_boxes (bool) – Whether to perform OCR and extract bounding boxes. If False, no OCR will be done but only the embedded text from digital documents will be extracted. Defaults to False.

  • ocr_mode (OcrMode) – OCR mode. There are different OCR modes available for different kinds of documents and use cases. This option only takes effect when extract_bounding_boxes is True.

  • use_full_ocr (bool) – Whether to perform full OCR. If True, OCR will be performed on the full page. If False, OCR will be performed on the non-text regions only. By default, it will be decided automatically based on the OCR mode and the document type. This option only takes effect when extract_bounding_boxes is True.

  • remove_header_footer (bool) – Whether to remove headers and footers. Defaults to False. This option only takes effect when extract_bounding_boxes is True.

  • remove_watermarks (bool) – Whether to remove watermarks. By default, it will be decided automatically based on the OCR mode and the document type. This option only takes effect when extract_bounding_boxes is True.

  • convert_to_markdown (bool) – Whether to convert extracted text to markdown. Defaults to False. This option only takes effect when extract_bounding_boxes is True.

  • mask_pii (bool) – Whether to mask personally identifiable information (PII) in the document text/tokens. Defaults to False.

document_type: abacusai.api_class.enums.DocumentType = None
highlight_relevant_text: bool = None
extract_bounding_boxes: bool = False
ocr_mode: abacusai.api_class.enums.OcrMode
use_full_ocr: bool = None
remove_header_footer: bool = False
remove_watermarks: bool = True
convert_to_markdown: bool = False
mask_pii: bool = False
__post_init__()
_detect_ocr_mode()
classmethod _get_filtered_dict(config)

Filters out default values from the config

Parameters:

config (dict)
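
A sketch of a config for scanned documents; the chosen option values are illustrative, not defaults:

    from abacusai import DocumentProcessingConfig, DocumentType, OcrMode

    doc_config = DocumentProcessingConfig(
        document_type=DocumentType.SCANNED_TEXT,
        extract_bounding_boxes=True,   # required for the OCR options below
        ocr_mode=OcrMode.SCANNED,
        remove_header_footer=True,
        convert_to_markdown=True,
    )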

class abacusai.DatasetDocumentProcessingConfig

Bases: DocumentProcessingConfig

Document processing configuration for dataset imports.

Parameters:
  • extract_bounding_boxes (bool) – Whether to perform OCR and extract bounding boxes. If False, no OCR will be done but only the embedded text from digital documents will be extracted. Defaults to False.

  • ocr_mode (OcrMode) – OCR mode. There are different OCR modes available for different kinds of documents and use cases. This option only takes effect when extract_bounding_boxes is True.

  • use_full_ocr (bool) – Whether to perform full OCR. If True, OCR will be performed on the full page. If False, OCR will be performed on the non-text regions only. By default, it will be decided automatically based on the OCR mode and the document type. This option only takes effect when extract_bounding_boxes is True.

  • remove_header_footer (bool) – Whether to remove headers and footers. Defaults to False. This option only takes effect when extract_bounding_boxes is True.

  • remove_watermarks (bool) – Whether to remove watermarks. By default, it will be decided automatically based on the OCR mode and the document type. This option only takes effect when extract_bounding_boxes is True.

  • convert_to_markdown (bool) – Whether to convert extracted text to markdown. Defaults to False. This option only takes effect when extract_bounding_boxes is True.

  • page_text_column (str) – Name of the output column which contains the extracted text for each page. If not provided, no column will be created.

page_text_column: str = None
class abacusai.IncrementalDatabaseConnectorConfig

Bases: abacusai.api_class.abstract.ApiClass

Config information for incremental datasets from database connectors

Parameters:

timestamp_column (str) – If dataset is incremental, this is the column name of the required column in the dataset. This column must contain timestamps in descending order which are used to determine the increments of the incremental dataset.

timestamp_column: str
class abacusai.AttachmentParsingConfig

Bases: abacusai.api_class.abstract.ApiClass

Config information for parsing attachments

Parameters:
  • feature_group_name (str) – The feature group name.

  • column_name (str) – The column name.

  • urls (str) – A list of URLs.

feature_group_name: str
column_name: str
urls: str
class abacusai.ApplicationConnectorDatasetConfig

Bases: abacusai.api_class.dataset.DatasetConfig

An abstract class for dataset configs specific to application connectors.

Parameters:
  • application_connector_type (enums.ApplicationConnectorType) – The type of application connector

  • application_connector_id (str) – The ID of the application connector

  • document_processing_config (DatasetDocumentProcessingConfig) – The document processing configuration. Only valid if is_documentset is True for the dataset.

application_connector_type: abacusai.api_class.enums.ApplicationConnectorType
application_connector_id: str
document_processing_config: abacusai.api_class.dataset.DatasetDocumentProcessingConfig
classmethod _get_builder()
class abacusai.ConfluenceDatasetConfig

Bases: ApplicationConnectorDatasetConfig

Dataset config for Confluence Application Connector

Parameters:
  • location (str) – The location of the pages to fetch

  • space_key (str) – The space key of the space from which we fetch pages

  • pull_attachments (bool) – Whether to pull attachments for each page

  • extract_bounding_boxes (bool) – Whether to extract bounding boxes from the documents

location: str
space_key: str
pull_attachments: bool
extract_bounding_boxes: bool
__post_init__()
class abacusai.GoogleAnalyticsDatasetConfig

Bases: ApplicationConnectorDatasetConfig

Dataset config for Google Analytics Application Connector

Parameters:
  • location (str) – The view id of the report in the connector to fetch

  • start_timestamp (int) – Unix timestamp of the start of the period that will be queried

  • end_timestamp (int) – Unix timestamp of the end of the period that will be queried

location: str
start_timestamp: int
end_timestamp: int
__post_init__()
class abacusai.GoogleDriveDatasetConfig

Bases: ApplicationConnectorDatasetConfig

Dataset config for Google Drive Application Connector

Parameters:
  • location (str) – The regex location of the files to fetch

  • csv_delimiter (str) – If the file format is CSV, use a specific csv delimiter

  • extract_bounding_boxes (bool) – Signifies whether to extract bounding boxes out of the documents. Only valid if is_documentset is True.

  • merge_file_schemas (bool) – Signifies if the merge file schema policy is enabled. Not applicable if is_documentset is True

location: str
csv_delimiter: str
extract_bounding_boxes: bool
merge_file_schemas: bool
__post_init__()
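
For example, fetching all CSVs under a hypothetical folder (location is a regex):

    from abacusai import GoogleDriveDatasetConfig

    gdrive_config = GoogleDriveDatasetConfig(
        location=r'Shared/data/.*\.csv',
        csv_delimiter=',',
        merge_file_schemas=True,
    )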
class abacusai.JiraDatasetConfig

Bases: ApplicationConnectorDatasetConfig

Dataset config for Jira Application Connector

Parameters:
  • jql (str) – The JQL query for fetching issues

  • custom_fields (list) – A list of custom fields to include in the dataset

  • include_comments (bool) – Fetch comments for each issue

  • include_watchers (bool) – Fetch watchers for each issue

jql: str
custom_fields: list
include_comments: bool
include_watchers: bool
__post_init__()
class abacusai.OneDriveDatasetConfig

Bases: ApplicationConnectorDatasetConfig

Dataset config for OneDrive Application Connector

Parameters:
  • location (str) – The regex location of the files to fetch

  • csv_delimiter (str) – If the file format is CSV, use a specific csv delimiter

  • extract_bounding_boxes (bool) – Signifies whether to extract bounding boxes out of the documents. Only valid if is_documentset is True.

  • merge_file_schemas (bool) – Signifies if the merge file schema policy is enabled. Not applicable if is_documentset is True

location: str
csv_delimiter: str
extract_bounding_boxes: bool
merge_file_schemas: bool
__post_init__()
class abacusai.SharepointDatasetConfig

Bases: ApplicationConnectorDatasetConfig

Dataset config for Sharepoint Application Connector

Parameters:
  • location (str) – The regex location of the files to fetch

  • csv_delimiter (str) – If the file format is CSV, use a specific csv delimiter

  • extract_bounding_boxes (bool) – Signifies whether to extract bounding boxes out of the documents. Only valid if is_documentset is True.

  • merge_file_schemas (bool) – Signifies if the merge file schema policy is enabled. Not applicable if is_documentset is True

location: str
csv_delimiter: str
extract_bounding_boxes: bool
merge_file_schemas: bool
__post_init__()
class abacusai.ZendeskDatasetConfig

Bases: ApplicationConnectorDatasetConfig

Dataset config for Zendesk Application Connector

Parameters:

location (str) – The regex location of the files to fetch

location: str
__post_init__()
class abacusai.AbacusUsageMetricsDatasetConfig

Bases: ApplicationConnectorDatasetConfig

Dataset config for Abacus Usage Metrics Application Connector

Parameters:
  • include_entire_conversation_history (bool) – Whether to show the entire history for this deployment conversation

  • include_all_feedback (bool) – Whether to include all feedback for this deployment conversation

include_entire_conversation_history: bool
include_all_feedback: bool
__post_init__()
class abacusai.TeamsScraperDatasetConfig

Bases: ApplicationConnectorDatasetConfig

Dataset config for Teams Scraper Application Connector

Parameters:
  • pull_chat_messages (bool) – Whether to pull teams chat messages

  • pull_channel_posts (bool) – Whether to pull posts for each channel

  • pull_transcripts (bool) – Whether to pull transcripts for calendar meetings

pull_chat_messages: bool
pull_channel_posts: bool
pull_transcripts: bool
__post_init__()
class abacusai.FreshserviceDatasetConfig

Bases: ApplicationConnectorDatasetConfig

Dataset config for Freshservice Application Connector

__post_init__()
class abacusai._ApplicationConnectorDatasetConfigFactory

Bases: abacusai.api_class.abstract._ApiClassFactory

Helper class that provides a standard way to create an ABC using inheritance.

config_abstract_class
config_class_key = 'application_connector_type'
config_class_map
class abacusai.PredictionArguments

Bases: abacusai.api_class.abstract.ApiClass

An abstract class for prediction arguments specific to problem type.

_support_kwargs: bool
kwargs: dict
problem_type: abacusai.api_class.enums.ProblemType
classmethod _get_builder()
class abacusai.OptimizationPredictionArguments

Bases: PredictionArguments

Prediction arguments for the OPTIMIZATION problem type

Parameters:
  • forced_assignments (dict) – Set of assignments to force and resolve before returning query results.

  • solve_time_limit_seconds (float) – Maximum time in seconds to spend solving the query.

  • include_all_assignments (bool) – If True, will return all assignments, including assignments with value 0. Default is False.

forced_assignments: dict
solve_time_limit_seconds: float
include_all_assignments: bool
__post_init__()
class abacusai.TimeseriesAnomalyPredictionArguments

Bases: PredictionArguments

Prediction arguments for the TS_ANOMALY problem type

Parameters:
  • start_timestamp (str) – Timestamp from which anomalies have to be detected in the training data

  • end_timestamp (str) – Timestamp to which anomalies have to be detected in the training data

  • get_all_item_data (bool) – If True, anomaly detection has to be performed on all the data related to input ids

start_timestamp: str
end_timestamp: str
get_all_item_data: bool
__post_init__()
class abacusai.ChatLLMPredictionArguments

Bases: PredictionArguments

Prediction arguments for the CHAT_LLM problem type

Parameters:
  • llm_name (str) – Name of the specific LLM backend to use to power the chat experience.

  • num_completion_tokens (int) – Default for maximum number of tokens for chat answers.

  • system_message (str) – The generative LLM system message.

  • temperature (float) – The generative LLM temperature.

  • search_score_cutoff (float) – Cutoff for the document retriever score. Matching search results below this score will be ignored.

  • ignore_documents (bool) – If True, will ignore any documents and search results, and only use the messages to generate a response.

llm_name: str
num_completion_tokens: int
system_message: str
temperature: float
search_score_cutoff: float
ignore_documents: bool
__post_init__()
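
A sketch of per-request overrides; the values are illustrative and the LLM name must be one available to the deployment:

    from abacusai import ChatLLMPredictionArguments

    args = ChatLLMPredictionArguments(
        llm_name='OPENAI_GPT4O',
        num_completion_tokens=512,
        temperature=0.2,
        search_score_cutoff=0.5,
    )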
class abacusai.RegressionPredictionArguments

Bases: PredictionArguments

Prediction arguments for the PREDICTIVE_MODELING problem type

Parameters:
  • explain_predictions (bool) – If True, will explain predictions.

  • explainer_type (str) – Type of explainer to use for explanations.

explain_predictions: bool
explainer_type: str
__post_init__()
class abacusai.ForecastingPredictionArguments

Bases: PredictionArguments

Prediction arguments for the FORECASTING problem type

Parameters:
  • num_predictions (int) – The number of timestamps to predict in the future.

  • prediction_start (str) – The start date for predictions (e.g., “2015-08-01T00:00:00” as input for mid-night of 2015-08-01).

  • explain_predictions (bool) – If True, explain predictions for forecasting.

  • explainer_type (str) – Type of explainer to use for explanations.

  • get_item_data (bool) – If True, will return the data corresponding to items as well.

num_predictions: int
prediction_start: str
explain_predictions: bool
explainer_type: str
get_item_data: bool
__post_init__()
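
Using the date format from the description above, a minimal sketch:

    from abacusai import ForecastingPredictionArguments

    args = ForecastingPredictionArguments(
        num_predictions=14,
        prediction_start='2015-08-01T00:00:00',
        explain_predictions=False,
    )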
class abacusai.CumulativeForecastingPredictionArguments

Bases: PredictionArguments

Prediction arguments for the CUMULATIVE_FORECASTING problem type

Parameters:
  • num_predictions (int) – The number of timestamps to predict in the future.

  • prediction_start (str) – The start date for predictions (e.g., “2015-08-01T00:00:00” as input for mid-night of 2015-08-01).

  • explain_predictions (bool) – If True, explain predictions for forecasting.

  • explainer_type (str) – Type of explainer to use for explanations.

  • get_item_data (bool) – If True, will return the data corresponding to items as well.

num_predictions: int
prediction_start: str
explain_predictions: bool
explainer_type: str
get_item_data: bool
__post_init__()
class abacusai.NaturalLanguageSearchPredictionArguments

Bases: PredictionArguments

Prediction arguments for the NATURAL_LANGUAGE_SEARCH problem type

Parameters:
  • llm_name (str) – Name of the specific LLM backend to use to power the chat experience.

  • num_completion_tokens (int) – Default for maximum number of tokens for chat answers.

  • system_message (str) – The generative LLM system message.

  • temperature (float) – The generative LLM temperature.

  • search_score_cutoff (float) – Cutoff for the document retriever score. Matching search results below this score will be ignored.

  • ignore_documents (bool) – If True, will ignore any documents and search results, and only use the messages to generate a response.

llm_name: str
num_completion_tokens: int
system_message: str
temperature: float
search_score_cutoff: float
ignore_documents: bool
__post_init__()
class abacusai.FeatureStorePredictionArguments

Bases: PredictionArguments

Prediction arguments for the FEATURE_STORE problem type

Parameters:

limit_results (int) – If provided, will limit the number of results to the value specified.

limit_results: int
__post_init__()
class abacusai._PredictionArgumentsFactory

Bases: abacusai.api_class.abstract._ApiClassFactory

Helper class that provides a standard way to create an ABC using inheritance.

config_abstract_class
config_class_key = 'problem_type'
config_class_map
class abacusai.VectorStoreTextEncoder

Bases: ApiEnum

Generic enumeration.

Derive from this class to define new enumerations.

E5 = 'E5'
OPENAI = 'OPENAI'
OPENAI_COMPACT = 'OPENAI_COMPACT'
OPENAI_LARGE = 'OPENAI_LARGE'
SENTENCE_BERT = 'SENTENCE_BERT'
E5_SMALL = 'E5_SMALL'
CODE_BERT = 'CODE_BERT'
class abacusai.VectorStoreConfig

Bases: abacusai.api_class.abstract.ApiClass

Config for indexing options of a document retriever. Default values of optional arguments are heuristically selected by the Abacus.AI platform based on the underlying data.

Parameters:
  • chunk_size (int) – The size of text chunks in the vector store.

  • chunk_overlap_fraction (float) – The fraction of overlap between chunks.

  • text_encoder (VectorStoreTextEncoder) – Encoder used to index texts from the documents.

  • chunk_size_factors (list) – Chunking data with multiple sizes. The specified list of factors are used to calculate more sizes, in addition to chunk_size.

  • score_multiplier_column (str) – If provided, will use the values in this metadata column to modify the relevance score of returned chunks for all queries.

  • prune_vectors (bool) – Transform vectors using SVD so that the average component of vectors in the corpus are removed.

  • index_metadata_columns (bool) – If True, metadata columns of the FG will also be used for indexing and querying.

  • use_document_summary (bool) – If True, uses the summary of the document in addition to chunks of the document for indexing and querying.

  • summary_instructions (str) – Instructions for the LLM to generate the document summary.

chunk_size: int
chunk_overlap_fraction: float
text_encoder: abacusai.api_class.enums.VectorStoreTextEncoder
chunk_size_factors: list
score_multiplier_column: str
prune_vectors: bool
index_metadata_columns: bool
use_document_summary: bool
summary_instructions: str
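
A sketch of indexing options; the values are illustrative (not the platform's heuristic defaults) and the metadata column is hypothetical:

    from abacusai import VectorStoreConfig, VectorStoreTextEncoder

    vs_config = VectorStoreConfig(
        chunk_size=512,
        chunk_overlap_fraction=0.1,
        text_encoder=VectorStoreTextEncoder.OPENAI_LARGE,
        score_multiplier_column='priority',
    )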
abacusai.DocumentRetrieverConfig
abacusai.deprecated_enums(*enum_values)
class abacusai.ApiEnum

Bases: enum.Enum

Generic enumeration.

Derive from this class to define new enumerations.

__deprecated_values__ = []
is_deprecated()
__eq__(other)
__hash__()
class abacusai.ProblemType

Bases: ApiEnum

Generic enumeration.

Derive from this class to define new enumerations.

AI_AGENT = 'ai_agent'
EVENT_ANOMALY = 'event_anomaly'
CLUSTERING = 'clustering'
CLUSTERING_TIMESERIES = 'clustering_timeseries'
CUMULATIVE_FORECASTING = 'cumulative_forecasting'
NAMED_ENTITY_EXTRACTION = 'nlp_ner'
CHAT_LLM = 'chat_llm'
SENTENCE_BOUNDARY_DETECTION = 'nlp_sentence_boundary_detection'
SENTIMENT_DETECTION = 'nlp_sentiment'
DOCUMENT_CLASSIFICATION = 'nlp_classification'
DOCUMENT_SUMMARIZATION = 'nlp_summarization'
DOCUMENT_VISUALIZATION = 'nlp_document_visualization'
PERSONALIZATION = 'personalization'
PREDICTIVE_MODELING = 'regression'
FINETUNED_LLM = 'finetuned_llm'
FORECASTING = 'forecasting'
CUSTOM_TRAINED_MODEL = 'plug_and_play'
CUSTOM_ALGORITHM = 'trainable_plug_and_play'
FEATURE_STORE = 'feature_store'
IMAGE_CLASSIFICATION = 'vision_classification'
OBJECT_DETECTION = 'vision_object_detection'
IMAGE_VALUE_PREDICTION = 'vision_regression'
MODEL_MONITORING = 'model_monitoring'
LANGUAGE_DETECTION = 'language_detection'
OPTIMIZATION = 'optimization'
PRETRAINED_MODELS = 'pretrained'
THEME_ANALYSIS = 'theme_analysis'
TS_ANOMALY = 'ts_anomaly'
class abacusai.RegressionObjective

Bases: ApiEnum

Generic enumeration.

Derive from this class to define new enumerations.

AUC = 'auc'
ACCURACY = 'acc'
LOG_LOSS = 'log_loss'
PRECISION = 'precision'
RECALL = 'recall'
F1_SCORE = 'fscore'
MAE = 'mae'
MAPE = 'mape'
WAPE = 'wape'
RMSE = 'rmse'
R_SQUARED_COEFFICIENT_OF_DETERMINATION = 'r^2'
class abacusai.RegressionTreeHPOMode

Bases: ApiEnum

Generic enumeration.

Derive from this class to define new enumerations.

RAPID = 'rapid'
THOROUGH = 'thorough'
class abacusai.PartialDependenceAnalysis

Bases: ApiEnum

Generic enumeration.

Derive from this class to define new enumerations.

RAPID = 'rapid'
THOROUGH = 'thorough'
class abacusai.RegressionAugmentationStrategy

Bases: ApiEnum

Generic enumeration.

Derive from this class to define new enumerations.

SMOTE = 'smote'
RESAMPLE = 'resample'
class abacusai.RegressionTargetTransform

Bases: ApiEnum

Generic enumeration.

Derive from this class to define new enumerations.

LOG = 'log'
QUANTILE = 'quantile'
YEO_JOHNSON = 'yeo-johnson'
BOX_COX = 'box-cox'
class abacusai.RegressionTypeOfSplit

Bases: ApiEnum

Generic enumeration.

Derive from this class to define new enumerations.

RANDOM = 'Random Sampling'
TIMESTAMP_BASED = 'Timestamp Based'
ROW_INDICATOR_BASED = 'Row Indicator Based'
STRATIFIED_RANDOM_SAMPLING = 'Stratified Random Sampling'
class abacusai.RegressionTimeSplitMethod

Bases: ApiEnum

Generic enumeration.

Derive from this class to define new enumerations.

TEST_SPLIT_PERCENTAGE_BASED = 'Test Split Percentage Based'
TEST_START_TIMESTAMP_BASED = 'Test Start Timestamp Based'
class abacusai.RegressionLossFunction

Bases: ApiEnum

Generic enumeration.

Derive from this class to define new enumerations.

HUBER = 'Huber'
MSE = 'Mean Squared Error'
MAE = 'Mean Absolute Error'
MAPE = 'Mean Absolute Percentage Error'
MSLE = 'Mean Squared Logarithmic Error'
TWEEDIE = 'Tweedie'
CROSS_ENTROPY = 'Cross Entropy'
FOCAL_CROSS_ENTROPY = 'Focal Cross Entropy'
AUTOMATIC = 'Automatic'
CUSTOM = 'Custom'
class abacusai.ExplainerType

Bases: ApiEnum

Generic enumeration.

Derive from this class to define new enumerations.

KERNEL_EXPLAINER = 'KERNEL_EXPLAINER'
LIME_EXPLAINER = 'LIME_EXPLAINER'
TREE_EXPLAINER = 'TREE_EXPLAINER'
EBM_EXPLAINER = 'EBM_EXPLAINER'
class abacusai.SamplingMethodType

Bases: ApiEnum

Generic enumeration.

Derive from this class to define new enumerations.

N_SAMPLING = 'N_SAMPLING'
PERCENT_SAMPLING = 'PERCENT_SAMPLING'
class abacusai.MergeMode

Bases: ApiEnum

Generic enumeration.

Derive from this class to define new enumerations.

LAST_N = 'LAST_N'
TIME_WINDOW = 'TIME_WINDOW'
class abacusai.OperatorType

Bases: ApiEnum

Generic enumeration.

Derive from this class to define new enumerations.

UNPIVOT = 'UNPIVOT'
MARKDOWN = 'MARKDOWN'
CRAWLER = 'CRAWLER'
EXTRACT_DOCUMENT_DATA = 'EXTRACT_DOCUMENT_DATA'
DATA_GENERATION = 'DATA_GENERATION'
UNION = 'UNION'
class abacusai.MarkdownOperatorInputType

Bases: ApiEnum

Generic enumeration.

Derive from this class to define new enumerations.

HTML = 'HTML'
class abacusai.FillLogic

Bases: ApiEnum

Generic enumeration.

Derive from this class to define new enumerations.

AVERAGE = 'average'
MAX = 'max'
MEDIAN = 'median'
MIN = 'min'
CUSTOM = 'custom'
BACKFILL = 'bfill'
FORWARDFILL = 'ffill'
LINEAR = 'linear'
NEAREST = 'nearest'
class abacusai.BatchSize

Bases: ApiEnum

Generic enumeration.

Derive from this class to define new enumerations.

BATCH_8 = 8
BATCH_16 = 16
BATCH_32 = 32
BATCH_64 = 64
BATCH_128 = 128
BATCH_256 = 256
BATCH_384 = 384
BATCH_512 = 512
BATCH_740 = 740
BATCH_1024 = 1024
class abacusai.HolidayCalendars

Bases: ApiEnum

Generic enumeration.

Derive from this class to define new enumerations.

AU = 'AU'
UK = 'UK'
US = 'US'
class abacusai.FileFormat

Bases: ApiEnum

Generic enumeration.

Derive from this class to define new enumerations.

AVRO = 'AVRO'
PARQUET = 'PARQUET'
TFRECORD = 'TFRECORD'
TSV = 'TSV'
CSV = 'CSV'
ORC = 'ORC'
JSON = 'JSON'
ODS = 'ODS'
XLS = 'XLS'
GZ = 'GZ'
ZIP = 'ZIP'
TAR = 'TAR'
DOCX = 'DOCX'
PDF = 'PDF'
MD = 'md'
RAR = 'RAR'
JPEG = 'JPG'
PNG = 'PNG'
TIF = 'TIFF'
NUMBERS = 'NUMBERS'
PPTX = 'PPTX'
PPT = 'PPT'
HTML = 'HTML'
TXT = 'txt'
EML = 'eml'
MP3 = 'MP3'
MP4 = 'MP4'
FLV = 'flv'
MOV = 'mov'
MPG = 'mpg'
MPEG = 'mpeg'
WEBM = 'webm'
WMV = 'wmv'
MSG = 'msg'
class abacusai.ExperimentationMode

Bases: ApiEnum

Generic enumeration.

Derive from this class to define new enumerations.

RAPID = 'rapid'
THOROUGH = 'thorough'
class abacusai.PersonalizationTrainingMode

Bases: ApiEnum

Generic enumeration.

Derive from this class to define new enumerations.

EXPERIMENTAL = 'EXP'
PRODUCTION = 'PROD'
class abacusai.PersonalizationObjective

Bases: ApiEnum

Generic enumeration.

Derive from this class to define new enumerations.

NDCG = 'ndcg'
NDCG_5 = 'ndcg@5'
NDCG_10 = 'ndcg@10'
MAP = 'map'
MAP_5 = 'map@5'
MAP_10 = 'map@10'
MRR = 'mrr'
PERSONALIZATION = 'personalization@10'
COVERAGE = 'coverage'
class abacusai.ForecastingObjective

Bases: ApiEnum

Generic enumeration.

Derive from this class to define new enumerations.

ACCURACY = 'w_c_accuracy'
WAPE = 'wape'
MAPE = 'mape'
CMAPE = 'cmape'
RMSE = 'rmse'
CV = 'coefficient_of_variation'
BIAS = 'bias'
SRMSE = 'srmse'
class abacusai.ForecastingFrequency

Bases: ApiEnum

Generic enumeration.

Derive from this class to define new enumerations.

HOURLY = '1H'
DAILY = '1D'
WEEKLY_SUNDAY_START = '1W'
WEEKLY_MONDAY_START = 'W-MON'
WEEKLY_SATURDAY_START = 'W-SAT'
MONTH_START = 'MS'
MONTH_END = '1M'
QUARTER_START = 'QS'
QUARTER_END = '1Q'
YEARLY = '1Y'
class abacusai.ForecastingDataSplitType

Bases: ApiEnum

Generic enumeration.

Derive from this class to define new enumerations.

AUTO = 'Automatic Time Based'
TIMESTAMP = 'Timestamp Based'
ITEM = 'Item Based'
PREDICTION_LENGTH = 'Force Prediction Length'
L_SHAPED_AUTO = 'L-shaped Split - Automatic Time Based'
L_SHAPED_TIMESTAMP = 'L-shaped Split - Timestamp Based'
class abacusai.ForecastingLossFunction

Bases: ApiEnum

Generic enumeration.

Derive from this class to define new enumerations.

CUSTOM = 'Custom'
MEAN_ABSOLUTE_ERROR = 'mae'
NORMALIZED_MEAN_ABSOLUTE_ERROR = 'nmae'
PEAKS_MEAN_ABSOLUTE_ERROR = 'peaks_mae'
MEAN_ABSOLUTE_PERCENTAGE_ERROR = 'stable_mape'
POINTWISE_ACCURACY = 'accuracy'
ROOT_MEAN_SQUARE_ERROR = 'rmse'
NORMALIZED_ROOT_MEAN_SQUARE_ERROR = 'nrmse'
ASYMMETRIC_MEAN_ABSOLUTE_PERCENTAGE_ERROR = 'asymmetric_mape'
STABLE_STANDARDIZED_MEAN_ABSOLUTE_PERCENTAGE_ERROR = 'stable_standardized_mape_with_cmape'
GAUSSIAN = 'mle_gaussian_local'
GAUSSIAN_FULL_COVARIANCE = 'mle_gaussfullcov'
GUASSIAN_EXPONENTIAL = 'mle_gaussexp'
MIX_GAUSSIANS = 'mle_gaussmix'
WEIBULL = 'mle_weibull'
NEGATIVE_BINOMIAL = 'mle_negbinom'
LOG_ROOT_MEAN_SQUARE_ERROR = 'log_rmse'
class abacusai.ForecastingLocalScaling

Bases: ApiEnum

Generic enumeration.

Derive from this class to define new enumerations.

ZSCORE = 'zscore'
SLIDING_ZSCORE = 'sliding_zscore'
LAST_POINT = 'lastpoint'
MIN_MAX = 'minmax'
MIN_STD = 'minstd'
ROBUST = 'robust'
ITEM = 'item'
class abacusai.ForecastingFillMethod

Bases: ApiEnum

Generic enumeration.

Derive from this class to define new enumerations.

BACK = 'BACK'
MIDDLE = 'MIDDLE'
FUTURE = 'FUTURE'
class abacusai.ForecastingQuanitlesExtensionMethod

Bases: ApiEnum

Generic enumeration.

Derive from this class to define new enumerations.

DIRECT = 'direct'
QUADRATIC = 'quadratic'
ANCESTRAL_SIMULATION = 'simulation'
class abacusai.TimeseriesAnomalyDataSplitType

Bases: ApiEnum

Generic enumeration.

Derive from this class to define new enumerations.

AUTO = 'Automatic Time Based'
TIMESTAMP = 'Fixed Timestamp Based'
class abacusai.TimeseriesAnomalyTypeOfAnomaly

Bases: ApiEnum

Generic enumeration.

Derive from this class to define new enumerations.

HIGH_PEAK = 'high_peak'
LOW_PEAK = 'low_peak'
class abacusai.TimeseriesAnomalyUseHeuristic

Bases: ApiEnum

Generic enumeration.

Derive from this class to define new enumerations.

ENABLE = 'enable'
DISABLE = 'disable'
AUTOMATIC = 'automatic'
class abacusai.NERObjective

Bases: ApiEnum

Generic enumeration.

Derive from this class to define new enumerations.

LOG_LOSS = 'log_loss'
AUC = 'auc'
PRECISION = 'precision'
RECALL = 'recall'
ANNOTATIONS_PRECISION = 'annotations_precision'
ANNOTATIONS_RECALL = 'annotations_recall'
class abacusai.NERModelType

Bases: ApiEnum

Generic enumeration.

Derive from this class to define new enumerations.

PRETRAINED_BERT = 'pretrained_bert'
PRETRAINED_ROBERTA_27 = 'pretrained_roberta_27'
PRETRAINED_ROBERTA_43 = 'pretrained_roberta_43'
PRETRAINED_MULTILINGUAL = 'pretrained_multilingual'
LEARNED = 'learned'
class abacusai.NLPDocumentFormat

Bases: ApiEnum

Generic enumeration.

Derive from this class to define new enumerations.

AUTO = 'auto'
TEXT = 'text'
DOC = 'doc'
TOKENS = 'tokens'
class abacusai.SentimentType

Bases: ApiEnum

Generic enumeration.

Derive from this class to define new enumerations.

VALENCE = 'valence'
EMOTION = 'emotion'
class abacusai.ClusteringImputationMethod

Bases: ApiEnum

Generic enumeration.

Derive from this class to define new enumerations.

AUTOMATIC = 'Automatic'
ZEROS = 'Zeros'
INTERPOLATE = 'Interpolate'
class abacusai.ConnectorType

Bases: ApiEnum

Generic enumeration.

Derive from this class to define new enumerations.

FILE = 'FILE'
DATABASE = 'DATABASE'
STREAMING = 'STREAMING'
APPLICATION = 'APPLICATION'
class abacusai.ApplicationConnectorType

Bases: ApiEnum

Generic enumeration.

Derive from this class to define new enumerations.

GOOGLEANALYTICS = 'GOOGLEANALYTICS'
GOOGLEDRIVE = 'GOOGLEDRIVE'
GIT = 'GIT'
CONFLUENCE = 'CONFLUENCE'
JIRA = 'JIRA'
ONEDRIVE = 'ONEDRIVE'
ZENDESK = 'ZENDESK'
SLACK = 'SLACK'
SHAREPOINT = 'SHAREPOINT'
TEAMS = 'TEAMS'
ABACUSUSAGEMETRICS = 'ABACUSUSAGEMETRICS'
MICROSOFTAUTH = 'MICROSOFTAUTH'
FRESHSERVICE = 'FRESHSERVICE'
ZENDESKSUNSHINEMESSAGING = 'ZENDESKSUNSHINEMESSAGING'
GOOGLEDRIVEUSER = 'GOOGLEDRIVEUSER'
GOOGLEWORKSPACEUSER = 'GOOGLEWORKSPACEUSER'
GMAILUSER = 'GMAILUSER'
GOOGLECALENDAR = 'GOOGLECALENDAR'
GOOGLESHEETS = 'GOOGLESHEETS'
GOOGLEDOCS = 'GOOGLEDOCS'
ONEDRIVEUSER = 'ONEDRIVEUSER'
TEAMSSCRAPER = 'TEAMSSCRAPER'
GITHUBUSER = 'GITHUBUSER'
OKTASAML = 'OKTASAML'
class abacusai.StreamingConnectorType

Bases: ApiEnum

Generic enumeration.

Derive from this class to define new enumerations.

KAFKA = 'KAFKA'
class abacusai.PythonFunctionArgumentType

Bases: ApiEnum

Generic enumeration.

Derive from this class to define new enumerations.

FEATURE_GROUP = 'FEATURE_GROUP'
INTEGER = 'INTEGER'
STRING = 'STRING'
BOOLEAN = 'BOOLEAN'
FLOAT = 'FLOAT'
JSON = 'JSON'
LIST = 'LIST'
DATASET_ID = 'DATASET_ID'
MODEL_ID = 'MODEL_ID'
FEATURE_GROUP_ID = 'FEATURE_GROUP_ID'
MONITOR_ID = 'MONITOR_ID'
BATCH_PREDICTION_ID = 'BATCH_PREDICTION_ID'
DEPLOYMENT_ID = 'DEPLOYMENT_ID'
class abacusai.PythonFunctionOutputArgumentType

Bases: ApiEnum

Generic enumeration.

Derive from this class to define new enumerations.

NTEGER = 'INTEGER'
STRING = 'STRING'
BOOLEAN = 'BOOLEAN'
FLOAT = 'FLOAT'
JSON = 'JSON'
LIST = 'LIST'
DATASET_ID = 'DATASET_ID'
MODEL_ID = 'MODEL_ID'
FEATURE_GROUP_ID = 'FEATURE_GROUP_ID'
MONITOR_ID = 'MONITOR_ID'
BATCH_PREDICTION_ID = 'BATCH_PREDICTION_ID'
DEPLOYMENT_ID = 'DEPLOYMENT_ID'
ANY = 'ANY'
class abacusai.LLMName

Bases: ApiEnum

Generic enumeration.

Derive from this class to define new enumerations.

OPENAI_GPT4 = 'OPENAI_GPT4'
OPENAI_GPT4_32K = 'OPENAI_GPT4_32K'
OPENAI_GPT4_128K = 'OPENAI_GPT4_128K'
OPENAI_GPT4_128K_LATEST = 'OPENAI_GPT4_128K_LATEST'
OPENAI_GPT4O = 'OPENAI_GPT4O'
OPENAI_GPT4O_MINI = 'OPENAI_GPT4O_MINI'
OPENAI_GPT3_5 = 'OPENAI_GPT3_5'
OPENAI_GPT3_5_TEXT = 'OPENAI_GPT3_5_TEXT'
LLAMA3_1_405B = 'LLAMA3_1_405B'
LLAMA3_1_70B = 'LLAMA3_1_70B'
LLAMA3_1_8B = 'LLAMA3_1_8B'
LLAMA3_LARGE_CHAT = 'LLAMA3_LARGE_CHAT'
CLAUDE_V3_OPUS = 'CLAUDE_V3_OPUS'
CLAUDE_V3_SONNET = 'CLAUDE_V3_SONNET'
CLAUDE_V3_HAIKU = 'CLAUDE_V3_HAIKU'
CLAUDE_V3_5_SONNET = 'CLAUDE_V3_5_SONNET'
CLAUDE_V3_5_HAIKU = 'CLAUDE_V3_5_HAIKU'
GEMINI_1_5_PRO = 'GEMINI_1_5_PRO'
ABACUS_SMAUG3 = 'ABACUS_SMAUG3'
ABACUS_DRACARYS = 'ABACUS_DRACARYS'
QWEN_2_5_32B = 'QWEN_2_5_32B'
GEMINI_1_5_FLASH = 'GEMINI_1_5_FLASH'
class abacusai.MonitorAlertType

Bases: ApiEnum

Generic enumeration.

Derive from this class to define new enumerations.

ACCURACY_BELOW_THRESHOLD = 'AccuracyBelowThreshold'
FEATURE_DRIFT = 'FeatureDrift'
DATA_INTEGRITY_VIOLATIONS = 'DataIntegrityViolations'
BIAS_VIOLATIONS = 'BiasViolations'
HISTORY_LENGTH_DRIFT = 'HistoryLengthDrift'
TARGET_DRIFT = 'TargetDrift'
PREDICTION_COUNT = 'PredictionCount'
class abacusai.FeatureDriftType

Bases: ApiEnum

Generic enumeration.

Derive from this class to define new enumerations.

KL = 'kl'
KS = 'ks'
WS = 'ws'
JS = 'js'
PSI = 'psi'
CHI_SQUARE = 'chi_square'
CSI = 'csi'
class abacusai.DataIntegrityViolationType

Bases: ApiEnum

Generic enumeration.

Derive from this class to define new enumerations.

NULL_VIOLATIONS = 'null_violations'
RANGE_VIOLATIONS = 'range_violations'
CATEGORICAL_RANGE_VIOLATION = 'categorical_range_violations'
TOTAL_VIOLATIONS = 'total_violations'
class abacusai.BiasType

Bases: ApiEnum

Generic enumeration.

Derive from this class to define new enumerations.

DEMOGRAPHIC_PARITY = 'demographic_parity'
EQUAL_OPPORTUNITY = 'equal_opportunity'
GROUP_BENEFIT_EQUALITY = 'group_benefit'
TOTAL = 'total'
class abacusai.AlertActionType

Bases: ApiEnum

Generic enumeration.

Derive from this class to define new enumerations.

EMAIL = 'Email'
class abacusai.PythonFunctionType

Bases: ApiEnum

Generic enumeration.

Derive from this class to define new enumerations.

FEATURE_GROUP = 'FEATURE_GROUP'
PLOTLY_FIG = 'PLOTLY_FIG'
STEP_FUNCTION = 'STEP_FUNCTION'
class abacusai.EvalArtifactType

Bases: ApiEnum

Generic enumeration.

Derive from this class to define new enumerations.

FORECASTING_ACCURACY = 'bar_chart'
FORECASTING_VOLUME = 'bar_chart_volume'
FORECASTING_HISTORY_LENGTH_ACCURACY = 'bar_chart_accuracy_by_history'
class abacusai.FieldDescriptorType

Bases: ApiEnum

Generic enumeration.

Derive from this class to define new enumerations.

STRING = 'STRING'
INTEGER = 'INTEGER'
FLOAT = 'FLOAT'
BOOLEAN = 'BOOLEAN'
DATETIME = 'DATETIME'
DATE = 'DATE'
class abacusai.WorkflowNodeInputType

Bases: ApiEnum

Generic enumeration.

Derive from this class to define new enumerations.

USER_INPUT = 'USER_INPUT'
WORKFLOW_VARIABLE = 'WORKFLOW_VARIABLE'
class abacusai.WorkflowNodeOutputType

Bases: ApiEnum

Generic enumeration.

Derive from this class to define new enumerations.

ATTACHMENT = 'ATTACHMENT'
BOOLEAN = 'BOOLEAN'
FLOAT = 'FLOAT'
INTEGER = 'INTEGER'
DICT = 'DICT'
LIST = 'LIST'
STRING = 'STRING'
RUNTIME_SCHEMA = 'RUNTIME_SCHEMA'
ANY = 'ANY'
classmethod normalize_type(python_type)
Parameters:

python_type (Union[str, type, None, WorkflowNodeOutputType])

Return type:

WorkflowNodeOutputType

class abacusai.StdDevThresholdType

Bases: ApiEnum

Generic enumeration.

Derive from this class to define new enumerations.

ABSOLUTE = 'ABSOLUTE'
PERCENTILE = 'PERCENTILE'
STDDEV = 'STDDEV'
class abacusai.DataType

Bases: ApiEnum

Generic enumeration.

Derive from this class to define new enumerations.

INTEGER = 'integer'
FLOAT = 'float'
STRING = 'string'
DATE = 'date'
DATETIME = 'datetime'
BOOLEAN = 'boolean'
LIST = 'list'
STRUCT = 'struct'
NULL = 'null'
BINARY = 'binary'
class abacusai.AgentInterface

Bases: ApiEnum

Generic enumeration.

Derive from this class to define new enumerations.

DEFAULT = 'DEFAULT'
CHAT = 'CHAT'
MATRIX = 'MATRIX'
AUTONOMOUS = 'AUTONOMOUS'
class abacusai.WorkflowNodeTemplateType

Bases: ApiEnum

Generic enumeration.

Derive from this class to define new enumerations.

TRIGGER = 'trigger'
DEFAULT = 'default'
class abacusai.ProjectConfigType

Bases: ApiEnum

Generic enumeration.

Derive from this class to define new enumerations.

CONSTRAINTS = 'CONSTRAINTS'
CHAT_FEEDBACK = 'CHAT_FEEDBACK'
REVIEW_MODE = 'REVIEW_MODE'
class abacusai.CPUSize

Bases: ApiEnum

Generic enumeration.

Derive from this class to define new enumerations.

SMALL = 'small'
MEDIUM = 'medium'
LARGE = 'large'
class abacusai.MemorySize

Bases: ApiEnum

Generic enumeration.

Derive from this class to define new enumerations.

SMALL = 16
MEDIUM = 32
LARGE = 64
XLARGE = 128
classmethod from_value(value)
class abacusai.ResponseSectionType

Bases: ApiEnum

Generic enumeration.

Derive from this class to define new enumerations.

AGENT_FLOW_BUTTON = 'agent_flow_button'
ATTACHMENTS = 'attachments'
BASE64_IMAGE = 'base64_image'
CHART = 'chart'
CODE = 'code'
COLLAPSIBLE_COMPONENT = 'collapsible_component'
IMAGE_URL = 'image_url'
RUNTIME_SCHEMA = 'runtime_schema'
LIST = 'list'
TABLE = 'table'
TEXT = 'text'
class abacusai.CodeLanguage

Bases: ApiEnum

Generic enumeration.

Derive from this class to define new enumerations.

PYTHON = 'python'
SQL = 'sql'
class abacusai.DeploymentConversationType

Bases: ApiEnum

Generic enumeration.

Derive from this class to define new enumerations.

CHAT_LLM = 'CHATLLM'
SIMPLE_AGENT = 'SIMPLE_AGENT'
COMPLEX_AGENT = 'COMPLEX_AGENT'
WORKFLOW_AGENT = 'WORKFLOW_AGENT'
COPILOT = 'COPILOT'
AGENT_CONTROLLER = 'AGENT_CONTROLLER'
CODE_LLM = 'CODE_LLM'
class abacusai.SamplingConfig

Bases: abacusai.api_class.abstract.ApiClass

An abstract class for the sampling config of a feature group

sampling_method: abacusai.api_class.enums.SamplingMethodType
classmethod _get_builder()
__post_init__()
class abacusai.NSamplingConfig

Bases: SamplingConfig

The number of distinct values of the key columns to include in the sample, or the number of rows if key columns are not specified.

Parameters:
  • sample_count (int) – The number of rows to include in the sample

  • key_columns (List[str]) – The feature(s) to use as the key(s) when sampling

sample_count: int
key_columns: List[str]
__post_init__()
class abacusai.PercentSamplingConfig

Bases: SamplingConfig

The fraction of distinct values of the feature group to include in the sample.

Parameters:
  • sample_percent (float) – The percentage of the rows to sample

  • key_columns (List[str]) – The feature(s) to use as the key(s) when sampling

sample_percent: float
key_columns: List[str]
__post_init__()
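
Sketches of both sampling configs, assuming sample_percent is expressed as a percentage per the description above; the key column is hypothetical:

    from abacusai import NSamplingConfig, PercentSamplingConfig

    # Sample 1000 distinct user_id values.
    n_sample = NSamplingConfig(sample_count=1000, key_columns=['user_id'])

    # Sample 20 percent of distinct user_id values.
    pct_sample = PercentSamplingConfig(sample_percent=20, key_columns=['user_id'])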
class abacusai._SamplingConfigFactory

Bases: abacusai.api_class.abstract._ApiClassFactory

Helper class that provides a standard way to create an ABC using inheritance.

config_class_key = 'sampling_method'
config_abstract_class
config_class_map
class abacusai.MergeConfig

Bases: abacusai.api_class.abstract.ApiClass

An abstract class for the merge config of a feature group

merge_mode: abacusai.api_class.enums.MergeMode
classmethod _get_builder()
__post_init__()
class abacusai.LastNMergeConfig

Bases: MergeConfig

Merge LAST N chunks/versions of an incremental dataset.

Parameters:
  • num_versions (int) – The number of versions to merge. num_versions == 0 means merge all versions.

  • include_version_timestamp_column (bool) – If set, include a column with the creation timestamp of source FG versions.

num_versions: int
include_version_timestamp_column: bool
__post_init__()
class abacusai.TimeWindowMergeConfig

Bases: MergeConfig

Merge rows within a given time window of the most recent timestamp

Parameters:
  • feature_name (str) – Time based column to index on

  • time_window_size_ms (int) – Range of merged rows will be [MAX_TIME - time_window_size_ms, MAX_TIME]

  • include_version_timestamp_column (bool) – If set, include a column with the creation timestamp of source FG versions.

feature_name: str
time_window_size_ms: int
include_version_timestamp_column: bool
__post_init__()
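
Example (a short sketch of the two merge modes, assuming keyword construction of the dataclasses; 'event_time' is a hypothetical timestamp column):

    from abacusai import LastNMergeConfig, TimeWindowMergeConfig

    # Merge only the 3 most recent dataset versions.
    last_n = LastNMergeConfig(num_versions=3, include_version_timestamp_column=True)

    # Merge rows from the last 24 hours relative to the max timestamp.
    window = TimeWindowMergeConfig(
        feature_name='event_time',
        time_window_size_ms=24 * 60 * 60 * 1000,
    )
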
class abacusai._MergeConfigFactory

Bases: abacusai.api_class.abstract._ApiClassFactory

Helper class that provides a standard way to create an ABC using inheritance.

config_class_key = 'merge_mode'
config_abstract_class
config_class_map
class abacusai.OperatorConfig

Bases: abacusai.api_class.abstract.ApiClass

Configuration for a template Feature Group Operation

operator_type: abacusai.api_class.enums.OperatorType
classmethod _get_builder()
__post_init__()
class abacusai.UnpivotConfig

Bases: OperatorConfig

Unpivot Columns in a FeatureGroup.

Parameters:
  • columns (List[str]) – Which columns to unpivot.

  • index_column (str) – Name of new column containing the unpivoted column names as its values

  • value_column (str) – Name of new column containing the row values that were unpivoted.

  • exclude (bool) – If True, the unpivoted columns are all the columns EXCEPT the ones in the columns argument. Default is False.

columns: List[str]
index_column: str
value_column: str
exclude: bool
__post_init__()
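
Example (a sketch with illustrative column names, turning wide monthly columns into (month, sales) rows):

    from abacusai import UnpivotConfig

    config = UnpivotConfig(
        columns=['jan_sales', 'feb_sales', 'mar_sales'],  # hypothetical wide columns
        index_column='month',   # new column holding the unpivoted column names
        value_column='sales',   # new column holding the unpivoted values
        exclude=False,
    )
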
class abacusai.MarkdownConfig

Bases: OperatorConfig

Transform an input column to a markdown column.

Parameters:
  • input_column (str) – Name of input column to transform.

  • output_column (str) – Name of output column to store transformed data.

  • input_column_type (MarkdownOperatorInputType) – Type of input column to transform.

input_column: str
output_column: str
input_column_type: abacusai.api_class.enums.MarkdownOperatorInputType
__post_init__()
class abacusai.CrawlerTransformConfig

Bases: OperatorConfig

Transform an input column of URLs to HTML text

Parameters:
  • input_column (str) – Name of input column to transform.

  • output_column (str) – Name of output column to store transformed data.

  • depth_column (str) – Column controlling the crawl depth; increasing depth explores more links, capturing more content

  • disable_host_restriction (bool) – If True, will not restrict crawling to the same host.

  • honour_website_rules (bool) – If True, will respect robots.txt rules.

  • user_agent (str) – If provided, will use this user agent instead of randomly selecting one.

input_column: str
output_column: str
depth_column: str
input_column_type: str
crawl_depth: int
disable_host_restriction: bool
honour_website_rules: bool
user_agent: str
__post_init__()
class abacusai.ExtractDocumentDataConfig

Bases: OperatorConfig

Extracts data from documents.

Parameters:
  • doc_id_column (str) – Name of input document ID column.

  • document_column (str) – Name of the input document column which contains the page information. This column will be transformed to include the document processing config in the output feature group.

  • document_processing_config (DocumentProcessingConfig) – Document processing configuration.

doc_id_column: str
document_column: str
document_processing_config: abacusai.api_class.dataset.DocumentProcessingConfig
__post_init__()
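
Example (a minimal sketch, assuming DocumentProcessingConfig is importable from the top-level package and that default processing options are acceptable; column names are illustrative):

    from abacusai import DocumentProcessingConfig, ExtractDocumentDataConfig

    config = ExtractDocumentDataConfig(
        doc_id_column='doc_id',
        document_column='page_infos',
        document_processing_config=DocumentProcessingConfig(),
    )
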
class abacusai.DataGenerationConfig

Bases: OperatorConfig

Generate synthetic data using a model for finetuning an LLM.

Parameters:
  • prompt_col (str) – Name of the input prompt column.

  • completion_col (str) – Name of the output completion column.

  • description_col (str) – Name of the description column.

  • id_col (str) – Name of the identifier column.

  • generation_instructions (str) – Instructions for the data generation model.

  • temperature (float) – Sampling temperature for the model.

  • fewshot_examples (int) – Number of fewshot examples used to prompt the model.

  • concurrency (int) – Number of concurrent processes.

  • examples_per_target (int) – Number of examples per target.

  • subset_size (Optional[int]) – Size of the subset to use for generation.

  • verify_response (bool) – Whether to verify the response.

  • token_budget (int) – Token budget for generation.

  • oversample (bool) – Whether to oversample the data.

  • documentation_char_limit (int) – Character limit for documentation.

  • frequency_penalty (float) – Penalty for frequency of token appearance.

  • model (str) – Model to use for data generation.

  • seed (Optional[int]) – Seed for random number generation.

prompt_col: str
completion_col: str
description_col: str
id_col: str
generation_instructions: str
temperature: float
fewshot_examples: int
concurrency: int
examples_per_target: int
subset_size: int
verify_response: bool
token_budget: int
oversample: bool
documentation_char_limit: int
frequency_penalty: float
model: str
seed: int
__post_init__()
class abacusai.UnionTransformConfig

Bases: OperatorConfig

Takes the union of the current feature group with one or more selected feature groups of the same type.

Parameters:
  • feature_group_ids (List[str]) – List of feature group IDs to union with source FG.

  • drop_non_intersecting_columns (bool) – If true, will drop columns that are not present in all feature groups. If false, fills missing columns with nulls.

feature_group_ids: List[str]
drop_non_intersecting_columns: bool
__post_init__()
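
Example (a sketch with hypothetical feature group IDs):

    from abacusai import UnionTransformConfig

    config = UnionTransformConfig(
        feature_group_ids=['fg_sales_2023', 'fg_sales_2024'],
        drop_non_intersecting_columns=True,  # keep only columns shared by all inputs
    )
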
class abacusai._OperatorConfigFactory

Bases: abacusai.api_class.abstract._ApiClassFactory

A class to select and return the correct type of Operator Config based on a serialized OperatorConfig instance.

config_abstract_class
config_class_key = 'operator_type'
config_class_map
class abacusai.TrainingConfig

Bases: abacusai.api_class.abstract.ApiClass

An abstract class for the training config options used to train the model.

_upper_snake_case_keys: bool
_support_kwargs: bool
kwargs: dict
problem_type: abacusai.api_class.enums.ProblemType
algorithm: str
classmethod _get_builder()
class abacusai.PersonalizationTrainingConfig

Bases: TrainingConfig

Training config for the PERSONALIZATION problem type

Parameters:
  • objective (PersonalizationObjective) – Ranking scheme used to select final best model.

  • sort_objective (PersonalizationObjective) – Ranking scheme used to sort models on the metrics page.

  • training_mode (PersonalizationTrainingMode) – Whether to train in production or experimental mode. Defaults to EXP.

  • target_action_types (List[str]) – List of action types to use as targets for training.

  • target_action_weights (Dict[str, float]) – Dictionary of action types to weights for training.

  • session_event_types (List[str]) – List of event types to treat as occurrences of sessions.

  • test_split (int) – Percent of dataset to use for test data. We support using between 6% and 20% of your dataset as test data.

  • recent_days_for_training (int) – Limit training data to a certain latest number of days.

  • training_start_date (str) – Only consider training interaction data after this date. Specified in the timezone of the dataset.

  • test_on_user_split (bool) – Use user splits instead of using time splits, when validating and testing the model.

  • test_split_on_last_k_items (bool) – Use last k items instead of global timestamp splits, when validating and testing the model.

  • test_last_items_length (int) – Number of items to leave out for each user when using leave k out folds.

  • test_window_length_hours (int) – Duration (in hours) of most recent time window to use when validating and testing the model.

  • explicit_time_split (bool) – Sets an explicit time-based test boundary.

  • test_row_indicator (str) – Column indicating which rows to use for training (TRAIN), validation (VAL) and testing (TEST).

  • full_data_retraining (bool) – Train models separately with all the data.

  • sequential_training (bool) – Train a model sequentially through time.

  • data_split_feature_group_table_name (str) – Specify the table name of the feature group to export training data with the fold column.

  • optimized_event_type (str) – The final event type to optimize for and compute metrics on.

  • dropout_rate (int) – Dropout rate for neural network.

  • batch_size (BatchSize) – Batch size for neural network.

  • disable_transformer (bool) – Disable training the transformer algorithm.

  • disable_gpu (bool) – Disable training on GPU.

  • filter_history (bool) – Do not recommend items the user has already interacted with.

  • action_types_exclusion_days (Dict[str, float]) – Mapping from action type to number of days for which we exclude previously interacted items from prediction

  • session_dedupe_mins (float) – Minimum number of minutes between two sessions for a user.

  • max_history_length (int) – Maximum length of user-item history to include user in training examples.

  • compute_rerank_metrics (bool) – Compute metrics based on rerank results.

  • add_time_features (bool) – Include interaction time as a feature.

  • disable_timestamp_scalar_features (bool) – Exclude timestamp scalar features.

  • compute_session_metrics (bool) – Evaluate models based on how well they are able to predict the next session of interactions.

  • max_user_history_len_percentile (int) – Filter out users with history length above this percentile.

  • downsample_item_popularity_percentile (float) – Downsample items more popular than this percentile.

  • use_user_id_feature (bool) – Use user id as a feature in CTR models.

  • min_item_history (int) – Minimum number of interactions an item must have to be included in training.

  • query_column (str) – Name of column in the interactions table that represents a natural language query, e.g. ‘blue t-shirt’.

  • item_query_column (str) – Name of column in the item catalog that will be matched to the query column in the interactions table.

  • include_item_id_feature (bool) – Add Item-Id to the input features of the model. Applicable for Embedding distance and CTR models.

objective: abacusai.api_class.enums.PersonalizationObjective
sort_objective: abacusai.api_class.enums.PersonalizationObjective
training_mode: abacusai.api_class.enums.PersonalizationTrainingMode
target_action_types: List[str]
target_action_weights: Dict[str, float]
session_event_types: List[str]
test_split: int
recent_days_for_training: int
training_start_date: str
test_on_user_split: bool
test_split_on_last_k_items: bool
test_last_items_length: int
test_window_length_hours: int
explicit_time_split: bool
test_row_indicator: str
full_data_retraining: bool
sequential_training: bool
data_split_feature_group_table_name: str
optimized_event_type: str
dropout_rate: int
batch_size: abacusai.api_class.enums.BatchSize
disable_transformer: bool
disable_gpu: bool
filter_history: bool
action_types_exclusion_days: Dict[str, float]
max_history_length: int
compute_rerank_metrics: bool
add_time_features: bool
disable_timestamp_scalar_features: bool
compute_session_metrics: bool
query_column: str
item_query_column: str
use_user_id_feature: bool
session_dedupe_mins: float
include_item_id_feature: bool
max_user_history_len_percentile: int
downsample_item_popularity_percentile: float
min_item_history: int
__post_init__()
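
Example (most fields are optional; a minimal sketch where the PersonalizationObjective.NDCG member and the action type names are assumptions):

    from abacusai import PersonalizationTrainingConfig
    from abacusai.api_class import PersonalizationObjective

    config = PersonalizationTrainingConfig(
        objective=PersonalizationObjective.NDCG,  # assumed member name
        target_action_types=['click', 'purchase'],  # hypothetical action types
        test_split=10,
        filter_history=True,
    )
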
class abacusai.RegressionTrainingConfig

Bases: TrainingConfig

Training config for the PREDICTIVE_MODELING problem type

Parameters:
  • objective (RegressionObjective) – Ranking scheme used to select final best model.

  • sort_objective (RegressionObjective) – Ranking scheme used to sort models on the metrics page.

  • tree_hpo_mode (RegressionTreeHPOMode) – Turning off Rapid Experimentation will make training take longer.

  • type_of_split (RegressionTypeOfSplit) – Type of data splitting into train/test (validation also).

  • test_split (int) – Percent of dataset to use for test data. We support using between 5% and 20% of your dataset as test data.

  • disable_test_val_fold (bool) – Do not create a TEST_VAL set. All records which would be part of the TEST_VAL fold otherwise, remain in the TEST fold.

  • k_fold_cross_validation (bool) – Use this to force k-fold cross validation bagging on or off.

  • num_cv_folds (int) – Specify the value of k in k-fold cross validation.

  • timestamp_based_splitting_column (str) – Timestamp column selected for splitting into test and train.

  • timestamp_based_splitting_method (RegressionTimeSplitMethod) – Method of selecting TEST set, top percentile wise or after a given timestamp.

  • test_splitting_timestamp (str) – Rows with timestamp greater than this will be considered to be in the test set.

  • sampling_unit_keys (List[str]) – Constrain train/test separation to partition a column.

  • test_row_indicator (str) – Column indicating which rows to use for training (TRAIN) and testing (TEST). Validation (VAL) can also be specified.

  • full_data_retraining (bool) – Train models separately with all the data.

  • rebalance_classes (bool) – Class weights are computed as the inverse of the class frequency from the training dataset when this option is selected as “Yes”. It is useful when the classes in the dataset are unbalanced. Re-balancing classes generally boosts recall at the cost of precision on rare classes.

  • rare_class_augmentation_threshold (float) – Augments any rare class whose relative frequency with respect to the most frequent class is less than this threshold. Default = 0.1 for classification problems with rare classes.

  • augmentation_strategy (RegressionAugmentationStrategy) – Strategy to deal with class imbalance and data augmentation.

  • training_rows_downsample_ratio (float) – Uses this ratio to train on a sample of the dataset provided.

  • active_labels_column (str) – Specify a column to use as the active columns in a multi label setting.

  • min_categorical_count (int) – Minimum threshold to consider a value different from the unknown placeholder.

  • sample_weight (str) – Specify a column to use as the weight of a sample for training and eval.

  • numeric_clipping_percentile (float) – Uses this option to clip the top and bottom x percentile of numeric feature columns where x is the value of this option.

  • target_transform (RegressionTargetTransform) – Specify a transform (e.g. log, quantile) to apply to the target variable.

  • ignore_datetime_features (bool) – Remove all datetime features from the model. Useful while generalizing to different time periods.

  • max_text_words (int) – Maximum number of words to use from text fields.

  • perform_feature_selection (bool) – If enabled, additional algorithms which support feature selection as a pretraining step will be trained separately with the selected subset of features. The details about their selected features can be found in their respective logs.

  • feature_selection_intensity (int) – This determines the strictness with which features will be filtered out. 1 being very lenient (more features kept), 100 being very strict.

  • batch_size (BatchSize) – Batch size.

  • dropout_rate (int) – Dropout percentage rate.

  • pretrained_model_name (str) – Enable algorithms which process text using pretrained multilingual NLP models.

  • pretrained_llm_name (str) – Enable algorithms which process text using pretrained large language models.

  • is_multilingual (bool) – Enable algorithms which process text using pretrained multilingual NLP models.

  • loss_function (RegressionLossFunction) – Loss function to be used as objective for model training.

  • loss_parameters (str) – Loss function params in format <key>=<value>;<key>=<value>;…..

  • target_encode_categoricals (bool) – Use this to turn target encoding on categorical features on or off.

  • drop_original_categoricals (bool) – This option helps us choose whether to also feed the original label encoded categorical columns to the models along with their target encoded versions.

  • monotonically_increasing_features (List[str]) – Constrain the model such that it behaves as if the target feature is monotonically increasing with the selected features

  • monotonically_decreasing_features (List[str]) – Constrain the model such that it behaves as if the target feature is monotonically decreasing with the selected features

  • data_split_feature_group_table_name (str) – Specify the table name of the feature group to export training data with the fold column.

  • custom_loss_functions (List[str]) – Registered custom losses available for selection.

  • custom_metrics (List[str]) – Registered custom metrics available for selection.

  • partial_dependence_analysis (PartialDependenceAnalysis) – Specify whether to run partial dependence plots for all features or only some features.

  • do_masked_language_model_pretraining (bool) – Specify whether to run a masked language model unsupervised pretraining step before supervised training in certain supported algorithms which use BERT-like backbones.

  • max_tokens_in_sentence (int) – Specify the max tokens to be kept in a sentence based on the truncation strategy.

  • truncation_strategy (str) – What strategy to use to deal with text rows with more than a given number of tokens (if num of tokens is more than “max_tokens_in_sentence”).

objective: abacusai.api_class.enums.RegressionObjective
sort_objective: abacusai.api_class.enums.RegressionObjective
tree_hpo_mode: abacusai.api_class.enums.RegressionTreeHPOMode
partial_dependence_analysis: abacusai.api_class.enums.PartialDependenceAnalysis
type_of_split: abacusai.api_class.enums.RegressionTypeOfSplit
test_split: int
disable_test_val_fold: bool
k_fold_cross_validation: bool
num_cv_folds: int
timestamp_based_splitting_column: str
timestamp_based_splitting_method: abacusai.api_class.enums.RegressionTimeSplitMethod
test_splitting_timestamp: str
sampling_unit_keys: List[str]
test_row_indicator: str
full_data_retraining: bool
rebalance_classes: bool
rare_class_augmentation_threshold: float
augmentation_strategy: abacusai.api_class.enums.RegressionAugmentationStrategy
training_rows_downsample_ratio: float
active_labels_column: str
min_categorical_count: int
sample_weight: str
numeric_clipping_percentile: float
target_transform: abacusai.api_class.enums.RegressionTargetTransform
ignore_datetime_features: bool
max_text_words: int
perform_feature_selection: bool
feature_selection_intensity: int
batch_size: abacusai.api_class.enums.BatchSize
dropout_rate: int
pretrained_model_name: str
pretrained_llm_name: str
is_multilingual: bool
do_masked_language_model_pretraining: bool
max_tokens_in_sentence: int
truncation_strategy: str
loss_function: abacusai.api_class.enums.RegressionLossFunction
loss_parameters: str
target_encode_categoricals: bool
drop_original_categoricals: bool
monotonically_increasing_features: List[str]
monotonically_decreasing_features: List[str]
data_split_feature_group_table_name: str
custom_loss_functions: List[str]
custom_metrics: List[str]
__post_init__()
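
Example (a minimal sketch for a classification-style setup; the RegressionObjective.AUC member name is an assumption):

    from abacusai import RegressionTrainingConfig
    from abacusai.api_class import RegressionObjective

    config = RegressionTrainingConfig(
        objective=RegressionObjective.AUC,  # assumed member; pick per problem
        test_split=10,
        rebalance_classes=True,
        target_encode_categoricals=True,
    )
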
class abacusai.ForecastingTrainingConfig

Bases: TrainingConfig

Training config for the FORECASTING problem type

Parameters:
  • prediction_length (int) – How many timesteps in the future to predict.

  • objective (ForecastingObjective) – Ranking scheme used to select final best model.

  • sort_objective (ForecastingObjective) – Ranking scheme used to sort models on the metrics page.

  • forecast_frequency (ForecastingFrequency) – Forecast frequency.

  • probability_quantiles (List[float]) – Prediction quantiles.

  • force_prediction_length (bool) – Force the length of the test window to be the same as the prediction length.

  • filter_items (bool) – Filter items with small history and volume.

  • enable_feature_selection (bool) – Enable feature selection.

  • enable_padding (bool) – Pad series to the max_date of the dataset

  • enable_cold_start (bool) – Enable cold start forecasting by training/predicting for zero history items.

  • enable_multiple_backtests (bool) – Whether to enable multiple backtesting or not.

  • num_backtesting_windows (int) – Total backtesting windows to use for the training.

  • backtesting_window_step_size (int) – Use this step size to shift backtesting windows for model training.

  • full_data_retraining (bool) – Train models separately with all the data.

  • additional_forecast_keys (List[str]) – List of categorical columns in the timeseries that can act as multi-identifiers.

  • experimentation_mode (ExperimentationMode) – Selecting Thorough Experimentation will take longer to train.

  • type_of_split (ForecastingDataSplitType) – Type of data splitting into train/test.

  • test_by_item (bool) – Partition train/test data by item rather than time if true.

  • test_start (str) – Limit training data to dates before the given test start.

  • test_split (int) – Percent of dataset to use for test data. We support using between 5% and 20% of your dataset as test data.

  • loss_function (ForecastingLossFunction) – Loss function for training neural network.

  • underprediction_weight (float) – Weight for underpredictions

  • disable_networks_without_analytic_quantiles (bool) – Disable neural networks whose quantile functions do not have analytic expressions (e.g., mixture models)

  • initial_learning_rate (float) – Initial learning rate.

  • l2_regularization_factor (float) – L2 regularization factor.

  • dropout_rate (int) – Dropout percentage rate.

  • recurrent_layers (int) – Number of recurrent layers to stack in network.

  • recurrent_units (int) – Number of units in each recurrent layer.

  • convolutional_layers (int) – Number of convolutional layers to stack on top of recurrent layers in network.

  • convolution_filters (int) – Number of filters in each convolution.

  • local_scaling_mode (ForecastingLocalScaling) – Options to make NN inputs stationary in high dynamic range datasets.

  • zero_predictor (bool) – Include subnetwork to classify points where target equals zero.

  • skip_missing (bool) – Make the RNN ignore missing entries instead of processing them.

  • batch_size (ForecastingBatchSize) – Batch size.

  • batch_renormalization (bool) – Enable batch renormalization between layers.

  • history_length (int) – While training, how much history to consider.

  • prediction_step_size (int) – Number of future periods to include in objective for each training sample.

  • training_point_overlap (float) – Amount of overlap to allow between training samples.

  • max_scale_context (int) – Maximum context to use for local scaling.

  • quantiles_extension_method (ForecastingQuanitlesExtensionMethod) – Quantile extension method

  • number_of_samples (int) – Number of samples for ancestral simulation

  • symmetrize_quantiles (bool) – Force symmetric quantiles (like in Gaussian distribution)

  • use_log_transforms (bool) – Apply logarithmic transformations to input data.

  • smooth_history (float) – Smooth (low pass filter) the timeseries.

  • local_scale_target (bool) – Using per training/prediction window target scaling.

  • use_clipping (bool) – Apply clipping to input data to stabilize the training.

  • timeseries_weight_column (str) – If set, we use the values in this column from timeseries data to assign time dependent item weights during training and evaluation.

  • item_attributes_weight_column (str) – If set, we use the values in this column from item attributes data to assign weights to items during training and evaluation.

  • use_timeseries_weights_in_objective (bool) – If True, we include weights from column set as “TIMESERIES WEIGHT COLUMN” in objective functions.

  • use_item_weights_in_objective (bool) – If True, we include weights from column set as “ITEM ATTRIBUTES WEIGHT COLUMN” in objective functions.

  • skip_timeseries_weight_scaling (bool) – If True, we will avoid normalizing the weights.

  • timeseries_loss_weight_column (str) – Use value in this column to weight the loss while training.

  • use_item_id (bool) – Include a feature to indicate the item being forecast.

  • use_all_item_totals (bool) – Include as input total target across items.

  • handle_zeros_as_missing_values (bool) – If True, handle zero values in demand as missing data.

  • datetime_holiday_calendars (List[HolidayCalendars]) – Holiday calendars to augment training with.

  • fill_missing_values (List[List[dict]]) – Strategy for filling in missing values.

  • enable_clustering (bool) – Enable clustering in forecasting.

  • data_split_feature_group_table_name (str) – Specify the table name of the feature group to export training data with the fold column.

  • custom_loss_functions (List[str]) – Registered custom losses available for selection.

  • custom_metrics (List[str]) – Registered custom metrics available for selection.

  • return_fractional_forecasts (bool) – Use this to return fractional forecast values during prediction

  • allow_training_with_small_history (bool) – Allows training with fewer than 100 rows in the dataset

prediction_length: int
objective: abacusai.api_class.enums.ForecastingObjective
sort_objective: abacusai.api_class.enums.ForecastingObjective
forecast_frequency: abacusai.api_class.enums.ForecastingFrequency
probability_quantiles: List[float]
force_prediction_length: bool
filter_items: bool
enable_feature_selection: bool
enable_padding: bool
enable_cold_start: bool
enable_multiple_backtests: bool
num_backtesting_windows: int
backtesting_window_step_size: int
full_data_retraining: bool
additional_forecast_keys: List[str]
experimentation_mode: abacusai.api_class.enums.ExperimentationMode
type_of_split: abacusai.api_class.enums.ForecastingDataSplitType
test_by_item: bool
test_start: str
test_split: int
loss_function: abacusai.api_class.enums.ForecastingLossFunction
underprediction_weight: float
disable_networks_without_analytic_quantiles: bool
initial_learning_rate: float
l2_regularization_factor: float
dropout_rate: int
recurrent_layers: int
recurrent_units: int
convolutional_layers: int
convolution_filters: int
local_scaling_mode: abacusai.api_class.enums.ForecastingLocalScaling
zero_predictor: bool
skip_missing: bool
batch_size: abacusai.api_class.enums.BatchSize
batch_renormalization: bool
history_length: int
prediction_step_size: int
training_point_overlap: float
max_scale_context: int
quantiles_extension_method: abacusai.api_class.enums.ForecastingQuanitlesExtensionMethod
number_of_samples: int
symmetrize_quantiles: bool
use_log_transforms: bool
smooth_history: float
local_scale_target: bool
use_clipping: bool
timeseries_weight_column: str
item_attributes_weight_column: str
use_timeseries_weights_in_objective: bool
use_item_weights_in_objective: bool
skip_timeseries_weight_scaling: bool
timeseries_loss_weight_column: str
use_item_id: bool
use_all_item_totals: bool
handle_zeros_as_missing_values: bool
datetime_holiday_calendars: List[abacusai.api_class.enums.HolidayCalendars]
fill_missing_values: List[List[dict]]
enable_clustering: bool
data_split_feature_group_table_name: str
custom_loss_functions: List[str]
custom_metrics: List[str]
return_fractional_forecasts: bool
allow_training_with_small_history: bool
__post_init__()
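
Example (a minimal sketch using only documented fields; the values are illustrative):

    from abacusai import ForecastingTrainingConfig

    config = ForecastingTrainingConfig(
        prediction_length=28,                    # forecast 28 timesteps ahead
        probability_quantiles=[0.1, 0.5, 0.9],
        enable_multiple_backtests=True,
        num_backtesting_windows=3,
        backtesting_window_step_size=7,
    )
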
class abacusai.NamedEntityExtractionTrainingConfig

Bases: TrainingConfig

Training config for the NAMED_ENTITY_EXTRACTION problem type

Parameters:
  • llm_for_ner (NERForLLM) – LLM to use for NER from among the available LLMs

  • test_split (int) – Percent of dataset to use for test data. We support using between 5% and 20% of your dataset.

  • test_row_indicator (str) – Column indicating which rows to use for training (TRAIN) and testing (TEST).

  • active_labels_column (str) – Entities that have been marked in a particular text

  • document_format (NLPDocumentFormat) – Format of the input documents.

  • minimum_bounding_box_overlap_ratio (float) – Tokens are considered to belong to an annotation if a user bounding box is provided and the ratio of (token_bounding_box ∩ annotation_bounding_box) / token_bounding_area is greater than the provided value.

  • save_predicted_pdf (bool) – Whether to save predicted PDF documents

  • enhanced_ocr (bool) – Enhanced text extraction from predicted digital documents

  • additional_extraction_instructions (str) – Additional instructions to guide the LLM in extracting the entities. Only used with LLM algorithms.

llm_for_ner: abacusai.api_class.enums.LLMName = None
test_split: int = None
test_row_indicator: str = None
active_labels_column: str = None
document_format: abacusai.api_class.enums.NLPDocumentFormat = None
minimum_bounding_box_overlap_ratio: float = 0.0
save_predicted_pdf: bool = True
enhanced_ocr: bool = False
additional_extraction_instructions: str = None
__post_init__()
class abacusai.NaturalLanguageSearchTrainingConfig

Bases: TrainingConfig

Training config for the NATURAL_LANGUAGE_SEARCH problem type

Parameters:
  • abacus_internal_model (bool) – Use an Abacus.AI LLM to answer questions about your data without using any external APIs

  • num_completion_tokens (int) – Default for maximum number of tokens for chat answers. Reducing this will get faster responses which are more succinct

  • larger_embeddings (bool) – Use a higher dimension embedding model.

  • search_chunk_size (int) – Chunk size for indexing the documents.

  • chunk_overlap_fraction (float) – Overlap in chunks while indexing the documents.

  • index_fraction (float) – Fraction of the chunk to use for indexing.

abacus_internal_model: bool
num_completion_tokens: int
larger_embeddings: bool
search_chunk_size: int
index_fraction: float
chunk_overlap_fraction: float
__post_init__()
class abacusai.ChatLLMTrainingConfig

Bases: TrainingConfig

Training config for the CHAT_LLM problem type

Parameters:
  • document_retrievers (List[str]) – List of names or IDs of document retrievers to use as vector stores of information for RAG responses.

  • num_completion_tokens (int) – Default for maximum number of tokens for chat answers. Reducing this will get faster responses which are more succinct.

  • temperature (float) – The generative LLM temperature.

  • retrieval_columns (list) – Include the metadata column values in the retrieved search results.

  • filter_columns (list) – Allow users to filter the document retrievers on these metadata columns.

  • include_general_knowledge (bool) – Allow the LLM to rely not just on RAG search results, but to fall back on general knowledge. Disabled by default.

  • enable_web_search (bool) – Allow the LLM to use Web Search Engines to retrieve information for better results.

  • behavior_instructions (str) – Customize the overall behaviour of the model. This controls things like when to execute code (if enabled), write SQL queries, search the web (if enabled), etc.

  • response_instructions (str) – Customized instructions for how the model should respond, including the format, persona and tone of the answers.

  • enable_llm_rewrite (bool) – If enabled, an LLM will rewrite the RAG queries sent to document retriever. Disabled by default.

  • column_filtering_instructions (str) – Instructions for a LLM call to automatically generate filter expressions on document metadata to retrieve relevant documents for the conversation.

  • keyword_requirement_instructions (str) – Instructions for a LLM call to automatically generate keyword requirements to retrieve relevant documents for the conversation.

  • query_rewrite_instructions (str) – Special instructions for the LLM which rewrites the RAG query.

  • max_search_results (int) – Maximum number of search results in the retrieval augmentation step. If we know that the questions are likely to have snippets which are easily matched in the documents, then a lower number will help with accuracy.

  • data_feature_group_ids – (List[str]): List of feature group IDs to use to possibly query for the ChatLLM. The created ChatLLM is commonly referred to as DataLLM.

  • data_prompt_context (str) – Prompt context for the data feature group IDs.

  • data_prompt_table_context (Dict[str, str]) – Dict of table name and table context pairs to provide table wise context for each structured data table.

  • data_prompt_column_context (Dict[str, str]) – Dict of ‘table_name.column_name’ and ‘column_context’ pairs to provide column context for some selected columns in the selected structured data table. This replaces the default auto-generated information about the column data.

  • hide_sql_and_code (bool) – When running data queries, this will hide the generated SQL and Code in the response.

  • disable_data_summarization (bool) – After executing a query, skip summarizing the response and reply back with only the table and the query that was run.

  • data_columns_to_ignore (List[str]) – Columns to ignore while encoding information about structured data tables in context for the LLM. A list of strings of format “<table_name>.<column_name>”

  • search_score_cutoff (float) – Minimum search score to consider a document as a valid search result.

  • include_bm25_retrieval (bool) – Combine BM25 search score with vector search using reciprocal rank fusion.

  • database_connector_id (str) – Database connector ID to use for connecting external database that gives access to structured data to the LLM.

  • database_connector_tables (List[str]) – List of tables to use from the database connector for the ChatLLM.

  • enable_code_execution (bool) – Enable python code execution in the ChatLLM. This equips the LLM with a python kernel in which all its code is executed.

  • enable_response_caching (bool) – Enable caching of LLM responses to speed up response times and improve reproducibility.

  • unknown_answer_phrase (str) – Fallback response when the LLM can’t find an answer.

  • enable_tool_bar (bool) – Enable the tool bar in Enterprise ChatLLM to provide additional functionalities like tool_use, web_search, image_gen, etc.

  • enable_inline_source_citations (bool) – Enable inline citations of the sources in the response.

  • response_format – (str): When set to ‘JSON’, the LLM will generate a JSON formatted string.

  • json_response_instructions (str) – Instructions to be followed while generating the json_response if response_format is set to “JSON”. This can include the schema information if the schema is dynamic and its keys cannot be pre-determined.

  • json_response_schema (str) – Specifies the JSON schema that the model should adhere to if response_format is set to “JSON”. This should be a json-formatted string where each field of the expected schema is mapped to a dictionary containing the fields ‘type’, ‘required’ and ‘description’. For example - ‘{“sample_field”: {“type”: “integer”, “required”: true, “description”: “Sample Field”}}’

document_retrievers: List[str]
num_completion_tokens: int
temperature: float
retrieval_columns: list
filter_columns: list
include_general_knowledge: bool
behavior_instructions: str
response_instructions: str
enable_llm_rewrite: bool
column_filtering_instructions: str
keyword_requirement_instructions: str
query_rewrite_instructions: str
max_search_results: int
data_feature_group_ids: List[str]
data_prompt_context: str
data_prompt_table_context: Dict[str, str]
data_prompt_column_context: Dict[str, str]
hide_sql_and_code: bool
disable_data_summarization: bool
data_columns_to_ignore: List[str]
search_score_cutoff: float
include_bm25_retrieval: bool
database_connector_id: str
database_connector_tables: List[str]
enable_code_execution: bool
metadata_columns: list
lookup_rewrite_instructions: str
enable_response_caching: bool
unknown_answer_phrase: str
enable_tool_bar: bool
enable_inline_source_citations: bool
response_format: str
json_response_instructions: str
json_response_schema: str
__post_init__()
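
Example (a minimal RAG-style sketch; the retriever name and instruction text are illustrative):

    from abacusai import ChatLLMTrainingConfig

    config = ChatLLMTrainingConfig(
        document_retrievers=['my_doc_retriever'],  # hypothetical retriever name
        num_completion_tokens=1024,
        temperature=0.0,
        enable_inline_source_citations=True,
        response_format='JSON',
        json_response_instructions='Return {"answer": ..., "sources": ...}.',
    )
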
class abacusai.SentenceBoundaryDetectionTrainingConfig

Bases: TrainingConfig

Training config for the SENTENCE_BOUNDARY_DETECTION problem type

Parameters:
  • test_split (int) – Percent of dataset to use for test data. We support using between 5% and 20% of your dataset.

  • dropout_rate (float) – Dropout rate for neural network.

  • batch_size (BatchSize) – Batch size for neural network.

test_split: int
dropout_rate: float
batch_size: abacusai.api_class.enums.BatchSize
__post_init__()
class abacusai.SentimentDetectionTrainingConfig

Bases: TrainingConfig

Training config for the SENTIMENT_DETECTION problem type

Parameters:
  • sentiment_type (SentimentType) – Type of sentiment to detect.

  • test_split (int) – Percent of dataset to use for test data. We support using between 5% and 20% of your dataset.

sentiment_type: abacusai.api_class.enums.SentimentType
test_split: int
__post_init__()
class abacusai.DocumentClassificationTrainingConfig

Bases: TrainingConfig

Training config for the DOCUMENT_CLASSIFICATION problem type

Parameters:
  • zero_shot_hypotheses (List[str]) – Zero shot hypotheses. Example text: ‘This text is about pricing’.

  • test_split (int) – Percent of dataset to use for test data. We support using between 5% and 20% of your dataset.

zero_shot_hypotheses: List[str]
test_split: int
__post_init__()
class abacusai.DocumentSummarizationTrainingConfig

Bases: TrainingConfig

Training config for the DOCUMENT_SUMMARIZATION problem type

Parameters:
  • test_split (int) – Percent of dataset to use for test data. We support using between 5% and 20% of your dataset.

  • dropout_rate (float) – Dropout rate for neural network.

  • batch_size (BatchSize) – Batch size for neural network.

test_split: int
dropout_rate: float
batch_size: abacusai.api_class.enums.BatchSize
__post_init__()
class abacusai.DocumentVisualizationTrainingConfig

Bases: TrainingConfig

Training config for the DOCUMENT_VISUALIZATION problem type

Parameters:
  • test_split (int) – Percent of dataset to use for test data. We support using between 5% and 20% of your dataset.

  • dropout_rate (float) – Dropout rate for neural network.

  • batch_size (BatchSize) – Batch size for neural network.

test_split: int
dropout_rate: float
batch_size: abacusai.api_class.enums.BatchSize
__post_init__()
class abacusai.ClusteringTrainingConfig

Bases: TrainingConfig

Training config for the CLUSTERING problem type

Parameters:

num_clusters_selection (int) – Number of clusters. If None, will be selected automatically.

num_clusters_selection: int
__post_init__()
class abacusai.ClusteringTimeseriesTrainingConfig

Bases: TrainingConfig

Training config for the CLUSTERING_TIMESERIES problem type

Parameters:
  • num_clusters_selection (int) – Number of clusters. If None, will be selected automatically.

  • imputation (ClusteringImputationMethod) – Imputation method for missing values.

num_clusters_selection: int
imputation: abacusai.api_class.enums.ClusteringImputationMethod
__post_init__()
class abacusai.EventAnomalyTrainingConfig

Bases: TrainingConfig

Training config for the EVENT_ANOMALY problem type

Parameters:

anomaly_fraction (float) – The fraction of the dataset to classify as anomalous, between 0 and 0.5

anomaly_fraction: float
__post_init__()
class abacusai.TimeseriesAnomalyTrainingConfig

Bases: TrainingConfig

Training config for the TS_ANOMALY problem type

Parameters:
  • type_of_split (TimeseriesAnomalyDataSplitType) – Type of data splitting into train/test.

  • test_start (str) – Limit training data to dates before the given test start.

  • test_split (int) – Percent of dataset to use for test data. We support using between 5% and 20% of your dataset.

  • fill_missing_values (List[List[dict]]) – strategies to fill missing values and missing timestamps

  • handle_zeros_as_missing_values (bool) – If True, handle zero values in numeric columns as missing data

  • timeseries_frequency (str) – set this to control frequency of filling missing values

  • min_samples_in_normal_region (int) – Adjust this to fine-tune the number of anomalies to be identified.

  • anomaly_type (TimeseriesAnomalyTypeOfAnomaly) – select what kind of peaks to detect as anomalies

  • hyperparameter_calculation_with_heuristics (TimeseriesAnomalyUseHeuristic) – Enable heuristic calculation to get hyperparameters for the model

  • threshold_score (float) – Threshold score for anomaly detection

type_of_split: abacusai.api_class.enums.TimeseriesAnomalyDataSplitType
test_start: str
test_split: int
fill_missing_values: List[List[dict]]
handle_zeros_as_missing_values: bool
timeseries_frequency: str
min_samples_in_normal_region: int
anomaly_type: abacusai.api_class.enums.TimeseriesAnomalyTypeOfAnomaly
hyperparameter_calculation_with_heuristics: abacusai.api_class.enums.TimeseriesAnomalyUseHeuristic
threshold_score: float
__post_init__()
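
Example (a minimal sketch; the date and the numeric values are illustrative):

    from abacusai import TimeseriesAnomalyTrainingConfig

    config = TimeseriesAnomalyTrainingConfig(
        test_start='2024-01-01',          # train on data before this date
        min_samples_in_normal_region=100,
        threshold_score=0.9,
    )
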
class abacusai.CumulativeForecastingTrainingConfig

Bases: TrainingConfig

Training config for the CUMULATIVE_FORECASTING problem type

Parameters:
  • test_split (int) – Percent of dataset to use for test data. We support using between 5% and 20% of your dataset.

  • historical_frequency (str) – Forecast frequency

  • cumulative_prediction_lengths (List[int]) – List of Cumulative Prediction Frequencies. Each prediction length must be between 1 and 365.

  • skip_input_transform (bool) – Avoid doing numeric scaling transformations on the input.

  • skip_target_transform (bool) – Avoid doing numeric scaling transformations on the target.

  • predict_residuals (bool) – Predict residuals instead of totals at each prediction step.

test_split: int
historical_frequency: str
cumulative_prediction_lengths: List[int]
skip_input_transform: bool
skip_target_transform: bool
predict_residuals: bool
__post_init__()
class abacusai.ThemeAnalysisTrainingConfig

Bases: TrainingConfig

Training config for the THEME ANALYSIS problem type

__post_init__()
class abacusai.AIAgentTrainingConfig

Bases: TrainingConfig

Training config for the AI_AGENT problem type

Parameters:
  • description (str) – Description of the agent function.

  • agent_interface (AgentInterface) – The interface that the agent will be deployed with.

  • agent_connectors – (List[enums.ApplicationConnectorType]): The connectors needed for the agent to function.

description: str
agent_interface: abacusai.api_class.enums.AgentInterface
agent_connectors: List[abacusai.api_class.enums.ApplicationConnectorType]
enable_binary_input: bool
agent_input_schema: dict
agent_output_schema: dict
__post_init__()
class abacusai.CustomTrainedModelTrainingConfig

Bases: TrainingConfig

Training config for the CUSTOM_TRAINED_MODEL problem type

Parameters:
  • max_catalog_size (int) – Maximum expected catalog size.

  • max_dimension (int) – Maximum expected dimension of the catalog.

  • index_output_path (str) – Fully qualified cloud location (GCS, S3, etc) to export snapshots of the embedding to.

  • docker_image_uri (str) – Docker image URI.

  • service_port (int) – Service port.

  • streaming_embeddings (bool) – Flag to enable streaming embeddings.

max_catalog_size: int
max_dimension: int
index_output_path: str
docker_image_uri: str
service_port: int
streaming_embeddings: bool
__post_init__()
class abacusai.CustomAlgorithmTrainingConfig

Bases: TrainingConfig

Training config for the CUSTOM_ALGORITHM problem type

Parameters:

timeout_minutes (int) – Timeout for the model training in minutes.

timeout_minutes: int
__post_init__()
class abacusai.OptimizationTrainingConfig

Bases: TrainingConfig

Training config for the OPTIMIZATION problem type

Parameters:
  • solve_time_limit (float) – The maximum time in seconds to spend solving the problem. Accepts values between 0 and 86400.

  • optimality_gap_limit (float) – The stopping optimality gap limit. Optimality gap is fractional difference between the best known solution and the best possible solution. Accepts values between 0 and 1.

solve_time_limit: float
optimality_gap_limit: float
__post_init__()
class abacusai._TrainingConfigFactory

Bases: abacusai.api_class.abstract._ApiClassFactory

Helper class that provides a standard way to create an ABC using inheritance.

config_abstract_class
config_class_key = 'problem_type'
config_class_map
class abacusai.DeployableAlgorithm

Bases: abacusai.api_class.abstract.ApiClass

Algorithm that can be deployed to a model.

Parameters:
  • algorithm (str) – ID of the algorithm.

  • name (str) – Name of the algorithm.

  • only_offline_deployable (bool) – Whether the algorithm can only be deployed offline.

  • trained_model_types (List[dict]) – List of trained model types.

algorithm: str
name: str
only_offline_deployable: bool
trained_model_types: List[dict]
class abacusai.TimeWindowConfig

Bases: abacusai.api_class.abstract.ApiClass

Time Window Configuration

Parameters:
  • window_duration (int) – The duration of the window.

  • window_from_start (bool) – Whether the window should be from the start of the time series.

window_duration: int
window_from_start: bool
to_dict()

Standardizes converting an ApiClass to dictionary. Keys of response dictionary are converted to camel case. This also validates the fields ( type, value, etc ) received in the dictionary.

class abacusai.ForecastingMonitorConfig

Bases: abacusai.api_class.abstract.ApiClass

Forecasting Monitor Configuration

Parameters:
  • id_column (str) – The name of the column that contains the unique identifier for the time series.

  • timestamp_column (str) – The name of the column that contains the timestamp for the time series.

  • target_column (str) – The name of the column that contains the target value for the time series.

  • start_time (str) – The start time of the time series data.

  • end_time (str) – The end time of the time series data.

  • window_config (TimeWindowConfig) – The windowing configuration for the time series data.

id_column: str
timestamp_column: str
target_column: str
start_time: str
end_time: str
window_config: TimeWindowConfig
to_dict()

Standardizes converting an ApiClass to dictionary. Keys of response dictionary are converted to camel case. This also validates the fields ( type, value, etc ) received in the dictionary.
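
Example (a sketch combining the two config classes; the window_duration units are not stated in the docstring, so the value is illustrative):

    from abacusai import ForecastingMonitorConfig, TimeWindowConfig

    config = ForecastingMonitorConfig(
        id_column='item_id',
        timestamp_column='date',
        target_column='demand',
        window_config=TimeWindowConfig(window_duration=7, window_from_start=False),
    )
    payload = config.to_dict()  # camel-cased keys for monitor-creation APIs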

class abacusai.StdDevThreshold

Bases: abacusai.api_class.abstract.ApiClass

Std Dev Threshold types

Parameters:
  • threshold_type (StdDevThresholdType) – Type of threshold to apply to the item attributes.

  • value (float) – Value to use for the threshold.

threshold_type: abacusai.api_class.enums.StdDevThresholdType
value: float
to_dict()

Standardizes converting an ApiClass to dictionary. Keys of response dictionary are converted to camel case. This also validates the fields ( type, value, etc ) received in the dictionary.

class abacusai.ItemAttributesStdDevThreshold

Bases: abacusai.api_class.abstract.ApiClass

Item Attributes Std Dev Threshold for Monitor Alerts

Parameters:
  • lower_bound (StdDevThreshold) – Lower bound of the threshold.

  • upper_bound (StdDevThreshold) – Upper bound of the threshold.

lower_bound: StdDevThreshold
upper_bound: StdDevThreshold
to_dict()

Standardizes converting an ApiClass to dictionary. Keys of response dictionary are converted to camel case. This also validates the fields ( type, value, etc ) received in the dictionary.

class abacusai.RestrictFeatureMappings

Bases: abacusai.api_class.abstract.ApiClass

Restrict Feature Mappings for Monitor Filtering

Parameters:
  • feature_name (str) – The name of the feature to restrict the monitor to.

  • restricted_feature_values (list) – The values of the feature to restrict the monitor to if the feature is categorical.

  • start_time (str) – The start time of the timestamp feature to filter from

  • end_time (str) – The end time of the timestamp feature to filter until

  • min_value (float) – Value to filter the numerical feature above

  • max_value (float) – Value to filter the numerical feature below

feature_name: str
restricted_feature_values: list
start_time: str
end_time: str
min_value: float
max_value: float
to_dict()

Standardizes converting an ApiClass to dictionary. Keys of response dictionary are converted to camel case. This also validates the fields ( type, value, etc ) received in the dictionary.

class abacusai.MonitorFilteringConfig

Bases: abacusai.api_class.abstract.ApiClass

Monitor Filtering Configuration

Parameters:
  • start_time (str) – The start time of the prediction time column

  • end_time (str) – The end time of the prediction time column

  • restrict_feature_mappings (RestrictFeatureMappings) – The feature mapping to restrict the monitor to.

  • target_class (str) – The target class to restrict the monitor to.

  • train_target_feature (str) – Set the target feature for the training data.

  • prediction_target_feature (str) – Set the target feature for the prediction data.

start_time: str
end_time: str
restrict_feature_mappings: List[RestrictFeatureMappings]
target_class: str
train_target_feature: str
prediction_target_feature: str
to_dict()

Standardizes converting an ApiClass to dictionary. Keys of response dictionary are converted to camel case. This also validates the fields ( type, value, etc ) received in the dictionary.
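
Example (a sketch restricting a monitor to two countries; all names and values are hypothetical):

    from abacusai import MonitorFilteringConfig, RestrictFeatureMappings

    config = MonitorFilteringConfig(
        restrict_feature_mappings=[
            RestrictFeatureMappings(
                feature_name='country',
                restricted_feature_values=['US', 'CA'],
            ),
        ],
        target_class='churned',
    )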

class abacusai.AlertConditionConfig

Bases: abacusai.api_class.abstract.ApiClass

An abstract class for alert condition configs

alert_type: abacusai.api_class.enums.MonitorAlertType
classmethod _get_builder()
class abacusai.AccuracyBelowThresholdConditionConfig

Bases: AlertConditionConfig

Accuracy Below Threshold Condition Config for Monitor Alerts

Parameters:

threshold (float) – Threshold for when to consider a column to be in violation. The alert will only fire when the drift value is strictly greater than the threshold.

threshold: float
__post_init__()
class abacusai.FeatureDriftConditionConfig

Bases: AlertConditionConfig

Feature Drift Condition Config for Monitor Alerts

Parameters:
  • feature_drift_type (FeatureDriftType) – Feature drift type to apply the threshold on to determine whether a column has drifted significantly enough to be a violation.

  • threshold (float) – Threshold for when to consider a column to be in violation. The alert will only fire when the drift value is strictly greater than the threshold.

  • minimum_violations (int) – Number of columns that must exceed the specified threshold to trigger an alert.

  • feature_names (List[str]) – List of feature names to monitor for this alert.

feature_drift_type: abacusai.api_class.enums.FeatureDriftType
threshold: float
minimum_violations: int
feature_names: List[str]
__post_init__()
class abacusai.TargetDriftConditionConfig

Bases: AlertConditionConfig

Target Drift Condition Config for Monitor Alerts

Parameters:
  • feature_drift_type (FeatureDriftType) – Target drift type to apply the threshold on to determine whether a column has drifted significantly enough to be a violation.

  • threshold (float) – Threshold for when to consider the target column to be in violation. The alert will only fire when the drift value is strictly greater than the threshold.

feature_drift_type: abacusai.api_class.enums.FeatureDriftType
threshold: float
__post_init__()
class abacusai.HistoryLengthDriftConditionConfig

Bases: AlertConditionConfig

History Length Drift Condition Config for Monitor Alerts

Parameters:
  • feature_drift_type (FeatureDriftType) – History length drift type to apply the threshold on to determine whether the history length has drifted significantly enough to be a violation.

  • threshold (float) – Threshold for when to consider the history length to be in violation. The alert will only fire when the drift value is strictly greater than the threshold.

feature_drift_type: abacusai.api_class.enums.FeatureDriftType
threshold: float
__post_init__()
class abacusai.DataIntegrityViolationConditionConfig

Bases: AlertConditionConfig

Data Integrity Violation Condition Config for Monitor Alerts

Parameters:
  • data_integrity_type (DataIntegrityViolationType) – This option selects the data integrity violations to monitor for this alert.

  • minimum_violations (int) – Number of columns that must exceed the specified threshold to trigger an alert.

data_integrity_type: abacusai.api_class.enums.DataIntegrityViolationType
minimum_violations: int
__post_init__()
class abacusai.BiasViolationConditionConfig

Bases: AlertConditionConfig

Bias Violation Condition Config for Monitor Alerts

Parameters:
  • bias_type (BiasType) – This option selects the bias metric to monitor for this alert.

  • threshold (float) – Threshold for when to consider a column to be in violation. The alert will only fire when the drift value is strictly greater than the threshold.

  • minimum_violations (int) – Number of columns that must exceed the specified threshold to trigger an alert.

bias_type: abacusai.api_class.enums.BiasType
threshold: float
minimum_violations: int
__post_init__()
class abacusai.PredictionCountConditionConfig

Bases: AlertConditionConfig

Deployment Prediction Condition Config for Deployment Alerts. By default we monitor whether predictions made over a time window have reduced significantly.

Parameters:
  • threshold (float) – Threshold for when to consider it a violation. Negative means alert on reduction, positive means alert on increase.

  • aggregation_window (str) – Time window to aggregate the predictions over, e.g. 1h, 10m. Only h(hour), m(minute) and s(second) are supported.

  • aggregation_type (str) – Aggregation type to use for the aggregation window, e.g. sum, avg.

threshold: float
aggregation_window: str
aggregation_type: str
__post_init__()
class abacusai._AlertConditionConfigFactory

Bases: abacusai.api_class.abstract._ApiClassFactory

Helper class that provides a standard way to create an ABC using inheritance.

config_abstract_class
config_class_key = 'alert_type'
config_class_key_value_camel_case = True
config_class_map
class abacusai.AlertActionConfig

Bases: abacusai.api_class.abstract.ApiClass

An abstract class for alert action configs

action_type: abacusai.api_class.enums.AlertActionType
classmethod _get_builder()
class abacusai.EmailActionConfig

Bases: AlertActionConfig

Email Action Config for Monitor Alerts

Parameters:
  • email_recipients (List[str]) – List of email addresses to send the alert to.

  • email_body (str) – Body of the email to send.

email_recipients: List[str]
email_body: str
__post_init__()
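
Example (alert setup typically pairs a condition config with an action config; a sketch where the FeatureDriftType.PSI member name is an assumption and the address is a placeholder):

    from abacusai import EmailActionConfig, FeatureDriftConditionConfig
    from abacusai.api_class import FeatureDriftType

    condition = FeatureDriftConditionConfig(
        feature_drift_type=FeatureDriftType.PSI,  # assumed member name
        threshold=0.3,
        minimum_violations=2,
    )
    action = EmailActionConfig(
        email_recipients=['ml-oncall@example.com'],
        email_body='Feature drift exceeded the configured threshold.',
    )
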
class abacusai._AlertActionConfigFactory

Bases: abacusai.api_class.abstract._ApiClassFactory

Helper class that provides a standard way to create an ABC using inheritance.

config_abstract_class
config_class_key = 'action_type'
config_class_map
class abacusai.MonitorThresholdConfig

Bases: abacusai.api_class.abstract.ApiClass

Monitor Threshold Config for Monitor Alerts

Parameters:
  • drift_type (FeatureDriftType) – Feature drift type to apply the threshold on to determine whether a column has drifted significantly enough to be a violation.

  • threshold_config (ThresholdConfigs) – Thresholds for when to consider a column to be in violation. The alert will only fire when the drift value is strictly greater than the threshold.

drift_type: abacusai.api_class.enums.FeatureDriftType
at_risk_threshold: float
severely_drifting_threshold: float
to_dict()

Standardizes converting an ApiClass to dictionary. Keys of response dictionary are converted to camel case. This also validates the fields ( type, value, etc ) received in the dictionary.

class abacusai.FeatureMappingConfig

Bases: abacusai.api_class.abstract.ApiClass

Feature mapping configuration for a feature group type.

Parameters:
  • feature_name (str) – The name of the feature in the feature group.

  • feature_mapping (str) – The desired feature mapping for the feature.

  • nested_feature_name (str) – The name of the nested feature in the feature group.

feature_name: str
feature_mapping: str
nested_feature_name: str
class abacusai.ProjectFeatureGroupTypeMappingsConfig

Bases: abacusai.api_class.abstract.ApiClass

Project feature group type mappings.

Parameters:
  • feature_group_id (str) – The unique identifier for the feature group.

  • feature_group_type (str) – The feature group type.

  • feature_mappings (List[FeatureMappingConfig]) – The feature mappings for the feature group.

feature_group_id: str
feature_group_type: str
feature_mappings: List[FeatureMappingConfig]
classmethod from_dict(input_dict)
Parameters:

input_dict (dict)
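
Example (a sketch of from_dict, assuming it accepts snake_case keys matching the documented fields; all IDs and names are hypothetical):

    from abacusai import ProjectFeatureGroupTypeMappingsConfig

    config = ProjectFeatureGroupTypeMappingsConfig.from_dict({
        'feature_group_id': 'fg_123',
        'feature_group_type': 'CUSTOM_TABLE',
        'feature_mappings': [
            {'feature_name': 'user_id', 'feature_mapping': 'USER_ID'},
        ],
    })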

class abacusai.ConstraintConfig

Bases: abacusai.api_class.abstract.ApiClass

Constraint configuration.

Parameters:
  • constant (float) – The constant value for the constraint.

  • operator (str) – The operator for the constraint. Could be ‘EQ’, ‘LE’, ‘GE’

  • enforcement (str) – The enforcement for the constraint. Could be ‘HARD’ or ‘SOFT’. Default is ‘HARD’

  • code (str) – The code for the constraint.

  • penalty (float) – The penalty for violating the constraint.

constant: float
operator: str
enforcement: str | None
code: str | None
penalty: float | None
class abacusai.ProjectFeatureGroupConfig

Bases: abacusai.api_class.abstract.ApiClass

An abstract class for project feature group configuration.

type: abacusai.api_class.enums.ProjectConfigType
classmethod _get_builder()
class abacusai.ConstraintProjectFeatureGroupConfig

Bases: ProjectFeatureGroupConfig

Constraint project feature group configuration.

Parameters:

constraints (List[ConstraintConfig]) – The constraint for the feature group. Should be a list of one ConstraintConfig.

constraints: List[ConstraintConfig]
__post_init__()
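
A sketch of a hard less-than-or-equal constraint attached to a feature group config; the constant is illustrative:

   from abacusai import ConstraintConfig, ConstraintProjectFeatureGroupConfig

   config = ConstraintProjectFeatureGroupConfig(
       constraints=[
           ConstraintConfig(constant=100.0, operator='LE', enforcement='HARD'),
       ],
   )
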
class abacusai.ReviewModeProjectFeatureGroupConfig

Bases: ProjectFeatureGroupConfig

Review mode project feature group configuration.

Parameters:

is_review_mode (bool) – The review mode for the feature group.

is_review_mode: bool
__post_init__()
class abacusai._ProjectFeatureGroupConfigFactory

Bases: abacusai.api_class.abstract._ApiClassFactory

Helper class that provides a standard way to create an ABC using inheritance.

config_abstract_class
config_class_key = 'type'
config_class_map
class abacusai.PythonFunctionArgument

Bases: abacusai.api_class.abstract.ApiClass

A config class for python function arguments

Parameters:
  • variable_type (PythonFunctionArgumentType) – The type of the python function argument

  • name (str) – The name of the python function variable

  • is_required (bool) – Whether the argument is required

  • value (Any) – The value of the argument

  • pipeline_variable (str) – The name of the pipeline variable to use as the value

variable_type: abacusai.api_class.enums.PythonFunctionArgumentType
name: str
is_required: bool
value: Any
pipeline_variable: str
class abacusai.OutputVariableMapping

Bases: abacusai.api_class.abstract.ApiClass

A config class for python function outputs

Parameters:
  • variable_type (PythonFunctionOutputArgumentType)

  • name (str)

variable_type: abacusai.api_class.enums.PythonFunctionOutputArgumentType
name: str
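
A sketch pairing an input argument with an output mapping; the FEATURE_GROUP members are assumed to exist on both enums, and the variable names are placeholders:

   from abacusai import OutputVariableMapping, PythonFunctionArgument
   from abacusai.api_class.enums import (
       PythonFunctionArgumentType,
       PythonFunctionOutputArgumentType,
   )

   arg = PythonFunctionArgument(
       variable_type=PythonFunctionArgumentType.FEATURE_GROUP,  # assumed member
       name='input_fg',
       is_required=True,
   )
   out = OutputVariableMapping(
       variable_type=PythonFunctionOutputArgumentType.FEATURE_GROUP,  # assumed member
       name='output_fg',
   )
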
class abacusai.FeatureGroupExportConfig

Bases: abacusai.api_class.abstract.ApiClass

An abstract class for feature group exports.

connector_type: abacusai.api_class.enums.ConnectorType
classmethod _get_builder()
class abacusai.FileConnectorExportConfig

Bases: FeatureGroupExportConfig

File connector export config for feature groups

Parameters:
  • location (str) – The location to export the feature group to

  • export_file_format (str) – The file format to export the feature group to

location: str
export_file_format: str
__post_init__()
to_dict()

Standardizes converting an ApiClass to dictionary. Keys of response dictionary are converted to camel case. This also validates the fields (type, value, etc.) received in the dictionary.

class abacusai.DatabaseConnectorExportConfig

Bases: FeatureGroupExportConfig

Database connector export config for feature groups

Parameters:
  • database_connector_id (str) – The ID of the database connector to export the feature group to

  • mode (str) – The mode to export the feature group in

  • object_name (str) – The name of the object to export the feature group to

  • id_column (str) – The name of the ID column

  • additional_id_columns (List[str]) – Additional ID columns

  • data_columns (Dict[str, str]) – The data columns to export the feature group to

database_connector_id: str
mode: str
object_name: str
id_column: str
additional_id_columns: List[str]
data_columns: Dict[str, str]
__post_init__()
to_dict()

Standardizes converting an ApiClass to dictionary. Keys of response dictionary are converted to camel case. This also validates the fields (type, value, etc.) received in the dictionary.
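
A sketch of both connector export configs; the location, connector ID, and mode string are placeholders:

   from abacusai import DatabaseConnectorExportConfig, FileConnectorExportConfig

   file_export = FileConnectorExportConfig(
       location='s3://my-bucket/exports/',  # placeholder location
       export_file_format='CSV',
   )

   db_export = DatabaseConnectorExportConfig(
       database_connector_id='connector_placeholder_id',
       mode='UPSERT',  # placeholder mode string
       object_name='predictions',
       id_column='id',
       data_columns={'score': 'score_column'},  # feature -> database column
   )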

class abacusai._FeatureGroupExportConfigFactory

Bases: abacusai.api_class.abstract._ApiClassFactory

Helper class that provides a standard way to create an ABC using inheritance.

config_abstract_class
config_class_key = 'connector_type'
config_class_map
class abacusai.ResponseSection

Bases: abacusai.api_class.abstract.ApiClass

A response section that an agent can return to render specific UI elements.

Parameters:
  • type (ResponseSectionType)

  • id (str)

type: abacusai.api_class.enums.ResponseSectionType
id: str
to_dict()

Standardizes converting an ApiClass to dictionary. Keys of response dictionary are converted to camel case. This also validates the fields (type, value, etc.) received in the dictionary.

abacusai.Segment
class abacusai.AgentFlowButtonResponseSection(label, agent_workflow_node_name, section_key=None)

Bases: ResponseSection

A response section that an AI Agent can return to render a button.

Parameters:
  • label (str) – The label of the button.

  • agent_workflow_node_name (str) – The workflow start node to be executed when the button is clicked.

  • section_key (str)

label: str
agent_workflow_node_name: str
class abacusai.ImageUrlResponseSection(url, height, width, section_key=None)

Bases: ResponseSection

A response section that an agent can return to render an image.

Parameters:
  • url (str) – The url of the image to be displayed.

  • height (int) – The height of the image.

  • width (int) – The width of the image.

  • section_key (str)

url: str
height: int
width: int
class abacusai.TextResponseSection(text, section_key=None)

Bases: ResponseSection

A response section that an agent can return to render text.

Parameters:
  • segment (str) – The text to be displayed.

  • text (str)

  • section_key (str)

segment: str
class abacusai.RuntimeSchemaResponseSection(json_schema, ui_schema=None, schema_prop=None)

Bases: ResponseSection

A segment that an agent can return to render json and ui schema in react-jsonschema-form format for workflow nodes. This is primarily used to generate dynamic forms at runtime. If a node returns a runtime schema variable, the UI will render the form upon node execution.

Parameters:
  • json_schema (dict) – json schema in RJSF format.

  • ui_schema (dict) – ui schema in RJSF format.

  • schema_prop (str)

json_schema: dict
ui_schema: dict
class abacusai.CodeResponseSection(code, language, section_key=None)

Bases: ResponseSection

A response section that an agent can return to render code.

Parameters:
  • code (str) – The code to be displayed.

  • language (CodeLanguage) – The language of the code.

  • section_key (str)

code: str
language: abacusai.api_class.enums.CodeLanguage
class abacusai.Base64ImageResponseSection(b64_image, section_key=None)

Bases: ResponseSection

A response section that an agent can return to render a base64 image.

Parameters:
  • b64_image (str) – The base64 image to be displayed.

  • section_key (str)

b64_image: str
class abacusai.CollapseResponseSection(title, content, section_key=None)

Bases: ResponseSection

A response section that an agent can return to render a collapsible component.

Parameters:
  • title (str) – The title of the collapsible component.

  • content (ResponseSection) – The response section constituting the content of collapsible component

  • section_key (str)

title: str
content: ResponseSection
to_dict()

Standardizes converting an ApiClass to dictionary. Keys of response dictionary are converted to camel case. This also validates the fields (type, value, etc.) received in the dictionary.

class abacusai.ListResponseSection(items, section_key=None)

Bases: ResponseSection

A response section that an agent can return to render a list.

Parameters:
  • items (List[str]) – The list items to be displayed.

  • section_key (str)

items: List[str]
class abacusai.ChartResponseSection(chart, section_key=None)

Bases: ResponseSection

A response section that an agent can return to render a chart.

Parameters:
  • chart (dict) – The chart to be displayed.

  • section_key (str)

chart: dict
class abacusai.DataframeResponseSection(df, header=None, section_key=None)

Bases: ResponseSection

A response section that an agent can return to render a pandas dataframe.

Parameters:
  • df (pandas.DataFrame) – The dataframe to be displayed.

  • header (str) – Heading of the table to be displayed.

  • section_key (str)

df: Any
header: str
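
A sketch composing a few of the sections above, following the positional signatures in the class headers; CodeLanguage.PYTHON is assumed to be a member of the enum:

   from abacusai import CodeResponseSection, ListResponseSection, TextResponseSection
   from abacusai.api_class.enums import CodeLanguage

   sections = [
       TextResponseSection('Here is the generated snippet:'),
       CodeResponseSection('print("hello")', CodeLanguage.PYTHON),  # assumed member
       ListResponseSection(['step one', 'step two']),
   ]
   payload = [section.to_dict() for section in sections]  # camelCase dicts for the UI
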
class abacusai.ApiEndpoint(client, apiEndpoint=None, predictEndpoint=None, proxyEndpoint=None, llmEndpoint=None, externalChatEndpoint=None, dashboardEndpoint=None)

Bases: abacusai.return_class.AbstractApiClass

A collection of endpoints that can be used to make requests, such as API calls or predict calls

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • apiEndpoint (str) – The URI that can be used to make API calls

  • predictEndpoint (str) – The URI that can be used to make predict calls against Deployments

  • proxyEndpoint (str) – The URI that can be used to make proxy server calls

  • llmEndpoint (str) – The URI that can be used to make LLM API calls

  • externalChatEndpoint (str) – The URI that can be used to access the external chat

  • dashboardEndpoint (str) – The URI that the external chat will use to go back to the dashboard

api_endpoint
predict_endpoint
proxy_endpoint
llm_endpoint
external_chat_endpoint
dashboard_endpoint
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.ApiKey(client, apiKeyId=None, apiKey=None, apiKeySuffix=None, tag=None, type=None, createdAt=None, expiresAt=None, isExpired=None)

Bases: abacusai.return_class.AbstractApiClass

An API Key to authenticate requests to the Abacus.AI API

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • apiKeyId (str) – The unique ID for the API key

  • apiKey (str) – The unique API key scoped to a specific organization. Value will be partially obscured.

  • apiKeySuffix (str) – The last 4 characters of the API key.

  • tag (str) – A user-friendly tag for the API key.

  • type (str) – The type of the API key, either ‘default’ or ‘code-llm’.

  • createdAt (str) – The timestamp when the API key was created.

  • expiresAt (str) – The timestamp when the API key will expire.

  • isExpired (bool) – Whether the API key has expired.

api_key_id
api_key
api_key_suffix
tag
type
created_at
expires_at
is_expired
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

delete()

Delete a specified API key.

Parameters:

api_key_id (str) – The ID of the API key to delete.

class abacusai.AppUserGroup(client, name=None, userGroupId=None, externalApplicationIds=None, invitedUserEmails=None, publicUserGroup=None, hasExternalApplicationReporting=None, isExternalServiceGroup=None, externalServiceGroupId=None, users={})

Bases: abacusai.return_class.AbstractApiClass

An app user group. This is used to determine which users have permissions for external chatbots.

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • name (str) – The name of the user group.

  • userGroupId (str) – The unique identifier of the user group.

  • externalApplicationIds (list[str]) – The ids of the external applications the group has access to.

  • invitedUserEmails (list[str]) – The emails of the users invited to the user group who have not yet accepted the invite.

  • publicUserGroup (bool) – Boolean flag whether the app user group is the public user group of the org or not.

  • hasExternalApplicationReporting (bool) – Whether users in the App User Group have permission to view all reports in their organization.

  • isExternalServiceGroup (bool) – Whether the App User Group corresponds to a user group that’s defined in an external service (e.g. Microsoft Active Directory or Okta) or not

  • externalServiceGroupId (str) – The identifier that corresponds to the app user group’s external service group representation

  • users (User) – The users in the user group.

name
user_group_id
external_application_ids
invited_user_emails
public_user_group
has_external_application_reporting
is_external_service_group
external_service_group_id
users
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.ApplicationConnector(client, applicationConnectorId=None, service=None, name=None, createdAt=None, status=None, auth=None)

Bases: abacusai.return_class.AbstractApiClass

A connector to an external service

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • applicationConnectorId (str) – The unique ID for the connection.

  • service (str) – The service this connection connects to

  • name (str) – A user-friendly name for the service

  • createdAt (str) – When the connector was created

  • status (str) – The status of the Application Connector

  • auth (dict) – Non-secret connection information for this connector

application_connector_id
service
name
created_at
status
auth
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

rename(name)

Renames an Application Connector

Parameters:

name (str) – A new name for the application connector.

delete()

Delete an application connector.

Parameters:

application_connector_id (str) – The unique identifier for the application connector.

list_objects()

Lists queryable objects in the application connector.

Parameters:

application_connector_id (str) – Unique string identifier for the application connector.

verify()

Checks if Abacus.AI can access the application using the provided application connector ID.

Parameters:

application_connector_id (str) – Unique string identifier for the application connector.

class abacusai.BatchPrediction(client, batchPredictionId=None, createdAt=None, name=None, deploymentId=None, fileConnectorOutputLocation=None, databaseConnectorId=None, databaseOutputConfiguration=None, fileOutputFormat=None, connectorType=None, legacyInputLocation=None, outputFeatureGroupId=None, featureGroupTableName=None, outputFeatureGroupTableName=None, summaryFeatureGroupTableName=None, csvInputPrefix=None, csvPredictionPrefix=None, csvExplanationsPrefix=None, outputIncludesMetadata=None, resultInputColumns=None, modelMonitorId=None, modelVersion=None, bpAcrossVersionsMonitorId=None, algorithm=None, batchPredictionArgsType=None, batchInputs={}, latestBatchPredictionVersion={}, refreshSchedules={}, inputFeatureGroups={}, globalPredictionArgs={}, batchPredictionArgs={})

Bases: abacusai.return_class.AbstractApiClass

Make batch predictions.

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • batchPredictionId (str) – The unique identifier of the batch prediction request.

  • createdAt (str) – When the batch prediction was created, in ISO-8601 format.

  • name (str) – Name given to the batch prediction object.

  • deploymentId (str) – The deployment used to make the predictions.

  • fileConnectorOutputLocation (str) – Contains information about where the batch predictions are written to.

  • databaseConnectorId (str) – The database connector to write the results to.

  • databaseOutputConfiguration (dict) – Contains information about where the batch predictions are written to.

  • fileOutputFormat (str) – The format of the batch prediction output (CSV or JSON).

  • connectorType (str) – Null if writing to internal console, else FEATURE_GROUP | FILE_CONNECTOR | DATABASE_CONNECTOR.

  • legacyInputLocation (str) – The location of the input data.

  • outputFeatureGroupId (str) – The Batch Prediction output feature group ID if applicable

  • featureGroupTableName (str) – The table name of the Batch Prediction output feature group.

  • outputFeatureGroupTableName (str) – The table name of the Batch Prediction output feature group.

  • summaryFeatureGroupTableName (str) – The table name of the metrics summary feature group output by Batch Prediction.

  • csvInputPrefix (str) – A prefix to prepend to the input columns, only applies when output format is CSV.

  • csvPredictionPrefix (str) – A prefix to prepend to the prediction columns, only applies when output format is CSV.

  • csvExplanationsPrefix (str) – A prefix to prepend to the explanation columns, only applies when output format is CSV.

  • outputIncludesMetadata (bool) – If true, output will contain columns including prediction start time, batch prediction version, and model version.

  • resultInputColumns (list) – If present, will limit result files or feature groups to only include columns present in this list.

  • modelMonitorId (str) – The model monitor for this batch prediction.

  • modelVersion (str) – The model instance used in the deployment for the batch prediction.

  • bpAcrossVersionsMonitorId (str) – The model monitor for this batch prediction across versions.

  • algorithm (str) – The algorithm that is currently deployed.

  • batchPredictionArgsType (str) – The type of batch prediction arguments used for this batch prediction.

  • batchInputs (PredictionInput) – Inputs to the batch prediction.

  • latestBatchPredictionVersion (BatchPredictionVersion) – The latest batch prediction version.

  • refreshSchedules (RefreshSchedule) – List of refresh schedules that dictate the next time the batch prediction will be run.

  • inputFeatureGroups (PredictionFeatureGroup) – List of prediction feature groups.

  • globalPredictionArgs (BatchPredictionArgs)

  • batchPredictionArgs (BatchPredictionArgs) – Argument(s) passed to every prediction call.

batch_prediction_id
created_at
name
deployment_id
file_connector_output_location
database_connector_id
database_output_configuration
file_output_format
connector_type
legacy_input_location
output_feature_group_id
feature_group_table_name
output_feature_group_table_name
summary_feature_group_table_name
csv_input_prefix
csv_prediction_prefix
csv_explanations_prefix
output_includes_metadata
result_input_columns
model_monitor_id
model_version
bp_across_versions_monitor_id
algorithm
batch_prediction_args_type
batch_inputs
latest_batch_prediction_version
refresh_schedules
input_feature_groups
global_prediction_args
batch_prediction_args
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

start()

Creates a new batch prediction version job for a given batch prediction job description.

Parameters:

batch_prediction_id (str) – The unique identifier of the batch prediction to create a new version of.

Returns:

The batch prediction version started by this method call.

Return type:

BatchPredictionVersion

refresh()

Calls describe and refreshes the current object’s fields

Returns:

The current object

Return type:

BatchPrediction

describe()

Describe the batch prediction.

Parameters:

batch_prediction_id (str) – The unique identifier associated with the batch prediction.

Returns:

The batch prediction description.

Return type:

BatchPrediction

list_versions(limit=100, start_after_version=None)

Retrieves a list of versions of a given batch prediction

Parameters:
  • limit (int) – Number of versions to list.

  • start_after_version (str) – Version to start after.

Returns:

List of batch prediction versions.

Return type:

list[BatchPredictionVersion]

update(deployment_id=None, global_prediction_args=None, batch_prediction_args=None, explanations=None, output_format=None, csv_input_prefix=None, csv_prediction_prefix=None, csv_explanations_prefix=None, output_includes_metadata=None, result_input_columns=None, name=None)

Update a batch prediction job description.

Parameters:
  • deployment_id (str) – Unique identifier of the deployment.

  • batch_prediction_args (BatchPredictionArgs) – Batch Prediction args specific to problem type.

  • output_format (str) – If specified, sets the format of the batch prediction output (CSV or JSON).

  • csv_input_prefix (str) – Prefix to prepend to the input columns, only applies when output format is CSV.

  • csv_prediction_prefix (str) – Prefix to prepend to the prediction columns, only applies when output format is CSV.

  • csv_explanations_prefix (str) – Prefix to prepend to the explanation columns, only applies when output format is CSV.

  • output_includes_metadata (bool) – If True, output will contain columns including prediction start time, batch prediction version, and model version.

  • result_input_columns (list) – If present, will limit result files or feature groups to only include columns present in this list.

  • name (str) – If present, will rename the batch prediction.

  • global_prediction_args (Union[dict, abacusai.api_class.BatchPredictionArgs])

  • explanations (bool)

Returns:

The batch prediction.

Return type:

BatchPrediction

set_file_connector_output(output_format=None, output_location=None)

Updates the file connector output configuration of the batch prediction

Parameters:
  • output_format (str) – The format of the batch prediction output (CSV or JSON). If not specified, the default format will be used.

  • output_location (str) – The location to write the prediction results. If not specified, results will be stored in Abacus.AI.

Returns:

The batch prediction description.

Return type:

BatchPrediction

set_database_connector_output(database_connector_id=None, database_output_config=None)

Updates the database connector output configuration of the batch prediction

Parameters:
  • database_connector_id (str) – Unique string identifier of a Database Connector to write predictions to.

  • database_output_config (dict) – Key-value pair of columns/values to write to the database connector.

Returns:

Description of the batch prediction.

Return type:

BatchPrediction

set_feature_group_output(table_name)

Creates a feature group and sets it as the batch prediction output.

Parameters:

table_name (str) – Name of the feature group table to create.

Returns:

Batch prediction after the output has been applied.

Return type:

BatchPrediction

set_output_to_console()

Sets the batch prediction output to the console, clearing both the file connector and database connector configurations.

Parameters:

batch_prediction_id (str) – The unique identifier of the batch prediction.

Returns:

The batch prediction description.

Return type:

BatchPrediction

set_feature_group(feature_group_type, feature_group_id=None)

Sets the batch prediction input feature group.

Parameters:
  • feature_group_type (str) – Enum string representing the feature group type to set. The type is based on the use case under which the feature group is being created (e.g. Catalog Attributes for personalized recommendation use case).

  • feature_group_id (str) – Unique identifier of the feature group to set as input to the batch prediction.

Returns:

Description of the batch prediction.

Return type:

BatchPrediction

set_dataset_remap(dataset_id_remap)

For the purpose of this batch prediction, swaps out datasets in the training feature groups

Parameters:

dataset_id_remap (dict) – Key/value pairs of dataset ids to be replaced during the batch prediction.

Returns:

Batch prediction object.

Return type:

BatchPrediction

delete()

Deletes a batch prediction and associated data, such as associated monitors.

Parameters:

batch_prediction_id (str) – Unique string identifier of the batch prediction.

wait_for_predictions(timeout=86400)

A waiting call until batch predictions are ready.

Parameters:

timeout (int) – The waiting time given to the call to finish; if it doesn’t finish by the allocated time, the call is considered timed out.

wait_for_drift_monitor(timeout=86400)

A waiting call until batch prediction drift monitor calculations are ready.

Parameters:

timeout (int) – The waiting time given to the call to finish; if it doesn’t finish by the allocated time, the call is considered timed out.

get_status()

Gets the status of the latest batch prediction version.

Returns:

A string describing the status of the latest batch prediction version, e.g., pending, complete, etc.

Return type:

str

create_refresh_policy(cron)

To create a refresh policy for a batch prediction.

Parameters:

cron (str) – A cron style string to set the refresh time.

Returns:

The refresh policy object.

Return type:

RefreshPolicy

list_refresh_policies()

Gets the refresh policies in a list.

Returns:

A list of refresh policy objects.

Return type:

List[RefreshPolicy]

describe_output_feature_group()

Gets the results feature group for this batch prediction

Returns:

A feature group object.

Return type:

FeatureGroup

load_results_as_pandas()

Loads the output feature groups into a python pandas dataframe.

Returns:

A pandas dataframe with annotations and text_snippet columns.

Return type:

DataFrame
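
A sketch of the typical run cycle using the methods above, assuming an authenticated ApiClient named client and a placeholder batch prediction ID:

   batch_prediction = client.describe_batch_prediction('bp_placeholder_id')
   version = batch_prediction.start()  # kicks off a new version
   version.wait_for_predictions()      # block until predictions finish
   results_df = batch_prediction.load_results_as_pandas()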

class abacusai.BatchPredictionVersion(client, batchPredictionVersion=None, batchPredictionId=None, status=None, driftMonitorStatus=None, deploymentId=None, modelId=None, modelVersion=None, predictionsStartedAt=None, predictionsCompletedAt=None, databaseOutputError=None, totalPredictions=None, failedPredictions=None, databaseConnectorId=None, databaseOutputConfiguration=None, fileConnectorOutputLocation=None, fileOutputFormat=None, connectorType=None, legacyInputLocation=None, error=None, driftMonitorError=None, monitorWarnings=None, csvInputPrefix=None, csvPredictionPrefix=None, csvExplanationsPrefix=None, databaseOutputTotalWrites=None, databaseOutputFailedWrites=None, outputIncludesMetadata=None, resultInputColumns=None, modelMonitorVersion=None, algoName=None, algorithm=None, outputFeatureGroupId=None, outputFeatureGroupVersion=None, outputFeatureGroupTableName=None, batchPredictionWarnings=None, bpAcrossVersionsMonitorVersion=None, batchPredictionArgsType=None, batchInputs={}, inputFeatureGroups={}, globalPredictionArgs={}, batchPredictionArgs={})

Bases: abacusai.return_class.AbstractApiClass

Batch Prediction Version

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • batchPredictionVersion (str) – The unique identifier of the batch prediction version

  • batchPredictionId (str) – The unique identifier of the batch prediction

  • status (str) – The current status of the batch prediction

  • driftMonitorStatus (str) – The status of the drift monitor for this batch prediction version

  • deploymentId (str) – The deployment used to make the predictions

  • modelId (str) – The model used to make the predictions

  • modelVersion (str) – The model version used to make the predictions

  • predictionsStartedAt (str) – Predictions start date and time

  • predictionsCompletedAt (str) – Predictions completion date and time

  • databaseOutputError (bool) – If true, there were errors reported by the database connector while writing

  • totalPredictions (int) – Number of predictions performed in this batch prediction job

  • failedPredictions (int) – Number of predictions that failed

  • databaseConnectorId (str) – The database connector to write the results to

  • databaseOutputConfiguration (dict) – Contains information about where the batch predictions are written to

  • fileConnectorOutputLocation (str) – Contains information about where the batch predictions are written to

  • fileOutputFormat (str) – The format of the batch prediction output (CSV or JSON)

  • connectorType (str) – Null if writing to internal console, else FEATURE_GROUP | FILE_CONNECTOR | DATABASE_CONNECTOR

  • legacyInputLocation (str) – The location of the input data

  • error (str) – Relevant error if the status is FAILED

  • driftMonitorError (str) – Error message for the drift monitor of this batch prediction

  • monitorWarnings (str) – Relevant warning if there are issues found in drift or data integrity

  • csvInputPrefix (str) – A prefix to prepend to the input columns, only applies when output format is CSV

  • csvPredictionPrefix (str) – A prefix to prepend to the prediction columns, only applies when output format is CSV

  • csvExplanationsPrefix (str) – A prefix to prepend to the explanation columns, only applies when output format is CSV

  • databaseOutputTotalWrites (int) – The total number of rows attempted to write (may be less than total_predictions if write mode is UPSERT and multiple rows share the same ID)

  • databaseOutputFailedWrites (int) – The number of failed writes to the Database Connector

  • outputIncludesMetadata (bool) – If true, output will contain columns including prediction start time, batch prediction version, and model version

  • resultInputColumns (list[str]) – If present, will limit result files or feature groups to only include columns present in this list

  • modelMonitorVersion (str) – The version of the model monitor

  • algoName (str) – The name of the algorithm used to train the model

  • algorithm (str) – The algorithm that is currently deployed.

  • outputFeatureGroupId (str) – The Batch Prediction output feature group ID if applicable

  • outputFeatureGroupVersion (str) – The Batch Prediction output feature group version if applicable

  • outputFeatureGroupTableName (str) – The Batch Prediction output feature group name if applicable

  • batchPredictionWarnings (str) – Relevant warnings if any issues are found

  • bpAcrossVersionsMonitorVersion (str) – The version of the batch prediction across versions monitor

  • batchPredictionArgsType (str) – The type of the batch prediction args

  • batchInputs (PredictionInput) – Inputs to the batch prediction

  • inputFeatureGroups (PredictionFeatureGroup) – List of prediction feature groups

  • globalPredictionArgs (BatchPredictionArgs)

  • batchPredictionArgs (BatchPredictionArgs) – Argument(s) passed to every prediction call

batch_prediction_version
batch_prediction_id
status
drift_monitor_status
deployment_id
model_id
model_version
predictions_started_at
predictions_completed_at
database_output_error
total_predictions
failed_predictions
database_connector_id
database_output_configuration
file_connector_output_location
file_output_format
connector_type
legacy_input_location
error
drift_monitor_error
monitor_warnings
csv_input_prefix
csv_prediction_prefix
csv_explanations_prefix
database_output_total_writes
database_output_failed_writes
output_includes_metadata
result_input_columns
model_monitor_version
algo_name
algorithm
output_feature_group_id
output_feature_group_version
output_feature_group_table_name
batch_prediction_warnings
bp_across_versions_monitor_version
batch_prediction_args_type
batch_inputs
input_feature_groups
global_prediction_args
batch_prediction_args
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

download_batch_prediction_result_chunk(offset=0, chunk_size=10485760)

Returns a stream containing the batch prediction results.

Parameters:
  • offset (int) – The offset to read from.

  • chunk_size (int) – The maximum amount of data to read.

get_batch_prediction_connector_errors()

Returns a stream containing the batch prediction database connection write errors, if any writes failed for the specified batch prediction job.

Parameters:

batch_prediction_version (str) – Unique string identifier of the batch prediction job to get the errors for.

refresh()

Calls describe and refreshes the current object’s fields

Returns:

The current object

Return type:

BatchPredictionVersion

describe()

Describes a Batch Prediction Version.

Parameters:

batch_prediction_version (str) – Unique string identifier of the Batch Prediction Version.

Returns:

The Batch Prediction Version.

Return type:

BatchPredictionVersion

get_logs()

Retrieves the batch prediction logs.

Parameters:

batch_prediction_version (str) – The unique version ID of the batch prediction version.

Returns:

The logs for the specified batch prediction version.

Return type:

BatchPredictionVersionLogs

download_result_to_file(file)

Downloads the batch prediction version in a local file.

Parameters:

file (file object) – A file object opened in a binary mode e.g., file=open(‘/tmp/output’, ‘wb’).

wait_for_predictions(timeout=86400)

A waiting call until batch prediction version is ready.

Parameters:

timeout (int) – The waiting time given to the call to finish; if it doesn’t finish by the allocated time, the call is considered timed out.

wait_for_drift_monitor(timeout=86400)

A waiting call until batch prediction drift monitor calculations are ready.

Parameters:

timeout (int) – The waiting time given to the call to finish; if it doesn’t finish by the allocated time, the call is considered timed out.

get_status(drift_monitor_status=False)

Gets the status of the batch prediction version.

Returns:

A string describing the status of the batch prediction version, e.g., pending, complete, etc.

Return type:

str

Parameters:

drift_monitor_status (bool)

load_results_as_pandas()

Loads the output feature groups into a python pandas dataframe.

Returns:

A pandas dataframe with annotations and text_snippet columns.

Return type:

DataFrame
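
A sketch of streaming a finished version’s results to disk, reusing the batch_prediction object from the earlier sketch; the output path is a placeholder, and the file must be opened in binary mode:

   version = batch_prediction.latest_batch_prediction_version
   version.wait_for_predictions()
   with open('/tmp/output.csv', 'wb') as f:
       version.download_result_to_file(f)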

class abacusai.BatchPredictionVersionLogs(client, logs=None, warnings=None)

Bases: abacusai.return_class.AbstractApiClass

Logs from batch prediction version.

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • logs (list[str]) – List of logs from batch prediction version.

  • warnings (list[str]) – List of warnings from batch prediction version.

logs
warnings
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.BotInfo(client, externalApplicationId=None)

Bases: abacusai.return_class.AbstractApiClass

Information about an external application and LLM.

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • externalApplicationId (str) – The external application ID.

external_application_id
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.CategoricalRangeViolation(client, name=None, mostCommonValues=None, freqOutsideTrainingRange=None)

Bases: abacusai.return_class.AbstractApiClass

Summary of important range mismatches for a categorical feature discovered by a model monitoring instance

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • name (str) – Name of feature.

  • mostCommonValues (list[str]) – List of the most common feature values in the prediction distribution that are not present in the training distribution.

  • freqOutsideTrainingRange (float) – Frequency of prediction rows outside training distribution for the specified feature.

name
most_common_values
freq_outside_training_range
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.ChatMessage(client, role=None, text=None, timestamp=None, isUseful=None, feedback=None, docIds=None, hotkeyTitle=None, tasks=None, keywordArguments=None)

Bases: abacusai.return_class.AbstractApiClass

A single chat message with Abacus Chat.

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • role (str) – The role of the message sender

  • text (list[dict]) – A list of text segments for the message

  • timestamp (str) – The timestamp at which the message was sent

  • isUseful (bool) – Whether this message was marked as useful or not

  • feedback (str) – The feedback provided for the message

  • docIds (list[str]) – A list of IDs of the uploaded documents, if the message has any

  • hotkeyTitle (str) – The title of the hotkey prompt if the message has one

  • tasks (list[str]) – The list of spawned tasks, if the message was broken down into smaller sub-tasks.

  • keywordArguments (dict) – A dict of kwargs used to generate the response.

role
text
timestamp
is_useful
feedback
doc_ids
hotkey_title
tasks
keyword_arguments
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.ChatSession(client, answer=None, chatSessionId=None, projectId=None, name=None, createdAt=None, status=None, aiBuildingInProgress=None, notification=None, whiteboard=None, chatHistory={}, nextAiBuildingTask={})

Bases: abacusai.return_class.AbstractApiClass

A chat session with Abacus Data Science Co-pilot.

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • answer (str) – The response from the chatbot

  • chatSessionId (str) – The chat session id

  • projectId (str) – The project id associated with the chat session

  • name (str) – The name of the chat session

  • createdAt (str) – The timestamp at which the chat session was created

  • status (str) – The status of the chat session

  • aiBuildingInProgress (bool) – Whether the AI building is in progress or not

  • notification (str) – A warn/info message about the chat session. For example, a suggestion to create a new session if the current one is too old

  • whiteboard (str) – A set of whiteboard notes associated with the chat session

  • chatHistory (ChatMessage) – The chat history for the conversation

  • nextAiBuildingTask (AiBuildingTask) – The next AI building task for the chat session

answer
chat_session_id
project_id
name
created_at
status
ai_building_in_progress
notification
whiteboard
chat_history
next_ai_building_task
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

get()

Gets a chat session from Data Science Co-pilot.

Parameters:

chat_session_id (str) – Unique ID of the chat session.

Returns:

The chat session with Data Science Co-pilot

Return type:

ChatSession

delete_chat_message(message_index)

Deletes a message in a chat session and its associated response.

Parameters:

message_index (int) – The index of the chat message within the UI.

export()

Exports a chat session to an HTML file

Parameters:

chat_session_id (str) – Unique ID of the chat session.

rename(name)

Renames a chat session with Data Science Co-pilot.

Parameters:

name (str) – The new name of the chat session.

class abacusai.ChatllmReferralInvite(client, userAlreadyExists=None, successfulInvites=None)

Bases: abacusai.return_class.AbstractApiClass

The response of the Chatllm Referral Invite for different emails

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • userAlreadyExists (list) – List of user emails that were not successfully invited because they are already registered users.

  • successfulInvites (list) – List of users successfully invited.

user_already_exists
successful_invites
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.AgentResponse(*args, **kwargs)

Response object for agent to support attachments, section data and normal data

data_list = []
section_data_list = []
to_dict()

Get a dict representation of the response object

__getattr__(item)
class abacusai.ApiClient(api_key=None, server=None, client_options=None, skip_version_check=False, include_tb=False)

Bases: ReadOnlyClient

Abacus.AI API Client

Parameters:
  • api_key (str) – The api key to use as authentication to the server

  • server (str) – The base server url to use to send API requests to

  • client_options (ClientOptions) – Optional API client configurations

  • skip_version_check (bool) – If true, will skip checking the server’s current API version on initializing the client

  • include_tb (bool)

create_dataset_from_pandas(feature_group_table_name, df, clean_column_names=False)

[Deprecated] Creates a Dataset from a pandas dataframe

Parameters:
  • feature_group_table_name (str) – The table name to assign to the feature group created by this call

  • df (pandas.DataFrame) – The dataframe to upload

  • clean_column_names (bool) – If true, the dataframe’s column names will be automatically cleaned to be compliant with Abacus.AI’s column requirements. Otherwise it will raise a ValueError.

Returns:

The dataset object created

Return type:

Dataset

create_dataset_version_from_pandas(table_name_or_id, df, clean_column_names=False)

[Deprecated] Updates an existing dataset from a pandas dataframe

Parameters:
  • table_name_or_id (str) – The table name of the feature group or the ID of the dataset to update

  • df (pandas.DataFrame) – The dataframe to upload

  • clean_column_names (bool) – If true, the dataframe’s column names will be automatically cleaned to be compliant with Abacus.AI’s column requirements. Otherwise it will raise a ValueError.

Returns:

The dataset updated

Return type:

Dataset

create_feature_group_from_pandas_df(table_name, df, clean_column_names=False)

Create a Feature Group from a local Pandas DataFrame.

Parameters:
  • table_name (str) – The table name to assign to the feature group created by this call

  • df (pandas.DataFrame) – The dataframe to upload and use as the data source for the feature group

  • clean_column_names (bool) – If true, the dataframe’s column names will be automatically cleaned to be compliant with Abacus.AI’s column requirements. Otherwise it will raise a ValueError.

Return type:

abacusai.feature_group.FeatureGroup
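
A sketch of the call, assuming a valid API key; the column name with a space shows why clean_column_names matters:

   import pandas as pd
   from abacusai import ApiClient

   client = ApiClient('YOUR_API_KEY')  # placeholder key
   df = pd.DataFrame({'user id': [1, 2], 'score': [0.4, 0.9]})

   # With clean_column_names=True, 'user id' is rewritten to a compliant
   # name instead of raising a ValueError.
   fg = client.create_feature_group_from_pandas_df(
       table_name='demo_scores',
       df=df,
       clean_column_names=True,
   )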

update_feature_group_from_pandas_df(table_name, df, clean_column_names=False)

Updates a DATASET Feature Group from a local Pandas DataFrame.

Parameters:
  • table_name (str) – The table name to assign to the feature group created by this call

  • df (pandas.DataFrame) – The dataframe to upload

  • clean_column_names (bool) – If true, the dataframe’s column names will be automatically cleaned to be compliant with Abacus.AI’s column requirements. Otherwise it will raise a ValueError.

Return type:

abacusai.feature_group.FeatureGroup

create_feature_group_from_spark_df(table_name, df)

Create a Feature Group from a local Spark DataFrame.

Parameters:
  • df (pyspark.sql.DataFrame) – The dataframe to upload

  • table_name (str) – The table name to assign to the feature group created by this call

Return type:

abacusai.feature_group.FeatureGroup

update_feature_group_from_spark_df(table_name, df)

Updates a Feature Group from a local Spark DataFrame.

Parameters:
  • df (pyspark.sql.DataFrame) – The dataframe to upload

  • table_name (str) – The table name to assign to the feature group created by this call

  • should_wait_for_upload (bool) – Wait for dataframe to upload before returning. Some FeatureGroup methods, like materialization, may not work until upload is complete.

  • timeout (int) – If waiting for upload, time out after this limit.

Return type:

abacusai.feature_group.FeatureGroup

create_spark_df_from_feature_group_version(session, feature_group_version)

Create a Spark Dataframe in the provided Spark Session context, for a materialized Abacus Feature Group Version.

Parameters:
  • session (pyspark.sql.SparkSession) – Spark session

  • feature_group_version (str) – Feature group version to load from

Returns:

pyspark.sql.DataFrame

create_prediction_operator_from_functions(name, project_id, predict_function=None, initialize_function=None, feature_group_ids=None, cpu_size=None, memory=None, included_modules=None, package_requirements=None, use_gpu=False)

Create a new prediction operator.

Parameters:
  • name (str) – Name of the prediction operator.

  • project_id (str) – The unique ID of the associated project.

  • predict_function (callable) – The function that will be executed to run predictions.

  • initialize_function (callable) – The initialization function that can generate anything used by predictions, based on input feature groups.

  • feature_group_ids (list) – List of feature groups that are supplied to the initialize function as parameters. Each of the parameters is a materialized DataFrame.

  • cpu_size (str) – Size of the CPU for the prediction operator.

  • memory (int) – Memory (in GB) for the prediction operator.

  • included_modules (list) – List of names of user-created modules that will be included, which is equivalent to ‘from module import *’

  • package_requirements (list) – List of package requirement strings. For example: [‘numpy==1.2.3’, ‘pandas>=1.4.0’]

  • use_gpu (bool) – Whether this prediction operator needs GPU.

Returns:

The created prediction operator object.

Return type:

PredictionOperator

update_prediction_operator_from_functions(prediction_operator_id, name=None, predict_function=None, initialize_function=None, feature_group_ids=None, cpu_size=None, memory=None, included_modules=None, package_requirements=None, use_gpu=False)

Update an existing prediction operator.

Parameters:
  • prediction_operator_id (str) – The unique ID of the prediction operator.

  • name (str) – The name of the prediction operator

  • predict_function (callable) – The predict function callable to serialize and upload

  • initialize_function (callable) – The initialize function callable to serialize and upload

  • feature_group_ids (list) – List of feature groups that are supplied to the initialize function as parameters. Each of the parameters is a materialized DataFrame. The order should match the initialize function’s parameters.

  • cpu_size (str) – Size of the cpu for the training function

  • memory (int) – Memory (in GB) for the training function

  • included_modules (list) – List of names of user-created modules that will be included, which is equivalent to ‘from module import *’

  • package_requirements (List) – List of package requirement strings. For example: [‘numpy==1.2.3’, ‘pandas>=1.4.0’]

  • use_gpu (bool) – Whether this prediction needs gpu

create_model_from_functions(project_id, train_function, predict_function=None, training_input_tables=None, predict_many_function=None, initialize_function=None, cpu_size=None, memory=None, training_config=None, exclusive_run=False, included_modules=None, package_requirements=None, name=None, use_gpu=False, is_thread_safe=None)

Creates a model from a python function

Parameters:
  • project_id (str) – The project to create the model in

  • train_function (callable) – The training function callable to serialize and upload

  • predict_function (callable) – The predict function callable to serialize and upload

  • predict_many_function (callable) – The predict many function callable to serialize and upload

  • initialize_function (callable) – The initialize function callable to serialize and upload

  • training_input_tables (list) – The input table names of the feature groups to pass to the train function

  • cpu_size (str) – Size of the cpu for the training function

  • memory (int) – Memory (in GB) for the training function

  • training_config (TrainingConfig) – Training configuration

  • exclusive_run (bool) – Decides if this model will be run exclusively or along with other Abacus.AI algorithms

  • package_requirements (List) – List of package requirement strings. For example: [‘numpy==1.2.3’, ‘pandas>=1.4.0’]

  • included_modules (list) – List of names of user-created modules that will be included, which is equivalent to ‘from module import *’

  • name (str) – The name of the model

  • use_gpu (bool) – Whether this model needs gpu

  • is_thread_safe (bool) – Whether the model is thread safe
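
A minimal sketch of the pattern, assuming an authenticated ApiClient named client; the function bodies, project ID, and table name are placeholders:

   def train(training_df):
       # Fit and return any picklable object; it is handed to predict.
       return {'mean': training_df['target'].mean()}

   def predict(model, query):
       return {'prediction': model['mean']}

   model = client.create_model_from_functions(
       project_id='project_placeholder_id',
       train_function=train,
       predict_function=predict,
       training_input_tables=['demo_scores'],
   )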

update_model_from_functions(model_id, train_function, predict_function=None, predict_many_function=None, initialize_function=None, training_input_tables=None, cpu_size=None, memory=None, included_modules=None, package_requirements=None, use_gpu=False, is_thread_safe=None)

Updates a model created from a python function. Please pass in all the functions, even if you don’t update them.

Parameters:
  • model_id (str) – The id of the model to update

  • train_function (callable) – The training function callable to serialize and upload

  • predict_function (callable) – The predict function callable to serialize and upload

  • predict_many_function (callable) – The predict many function callable to serialize and upload

  • initialize_function (callable) – The initialize function callable to serialize and upload

  • training_input_tables (list) – The input table names of the feature groups to pass to the train function

  • cpu_size (str) – Size of the cpu for the training function

  • memory (int) – Memory (in GB) for the training function

  • package_requirements (List) – List of package requirement strings. For example: [‘numpy==1.2.3’, ‘pandas>=1.4.0’]

  • included_modules (list) – List of names of user-created modules that will be included, which is equivalent to ‘from module import *’

  • use_gpu (bool) – Whether this model needs gpu

  • is_thread_safe (bool) – Whether the model is thread safe

create_pipeline_step_from_function(pipeline_id, step_name, function, step_input_mappings=None, output_variable_mappings=None, step_dependencies=None, package_requirements=None, cpu_size=None, memory=None, included_modules=None, timeout=None)

Creates a step in a given pipeline from a python function.

Parameters:
  • pipeline_id (str) – The ID of the pipeline to add the step to.

  • step_name (str) – The name of the step.

  • function (callable) – The python function.

  • step_input_mappings (List[PythonFunctionArguments]) – List of Python function arguments.

  • output_variable_mappings (List[OutputVariableMapping]) – List of Python function outputs.

  • step_dependencies (List[str]) – List of step names this step depends on.

  • package_requirements (list) – List of package requirement strings. For example: [‘numpy==1.2.3’, ‘pandas>=1.4.0’].

  • cpu_size (str) – Size of the CPU for the step function.

  • memory (int) – Memory (in GB) for the step function.

  • included_modules (list) – List of names of user-created modules that will be included, which is equivalent to ‘from module import *’

  • timeout (int) – Timeout for how long the step can run in minutes, default is 300 minutes.
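
A sketch of registering a step, assuming an authenticated ApiClient named client; the pipeline ID, step names, and function body are placeholders:

   def transform(input_fg):
       # Placeholder transformation over a materialized dataframe.
       return input_fg

   client.create_pipeline_step_from_function(
       pipeline_id='pipeline_placeholder_id',
       step_name='transform',
       function=transform,
       step_dependencies=['ingest'],  # runs after the hypothetical 'ingest' step
       package_requirements=['pandas>=1.4.0'],
   )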

update_pipeline_step_from_function(pipeline_step_id, function, step_input_mappings=None, output_variable_mappings=None, step_dependencies=None, package_requirements=None, cpu_size=None, memory=None, included_modules=None, timeout=None)

Updates a pipeline step from a python function.

Parameters:
  • pipeline_step_id (str) – The ID of the pipeline_step to update.

  • function (callable) – The python function.

  • step_input_mappings (List[PythonFunctionArguments]) – List of Python function arguments.

  • output_variable_mappings (List[OutputVariableMapping]) – List of Python function outputs.

  • step_dependencies (List[str]) – List of step names this step depends on.

  • package_requirements (list) – List of package requirement strings. For example: [‘numpy==1.2.3’, ‘pandas>=1.4.0’].

  • cpu_size (str) – Size of the CPU for the step function.

  • memory (int) – Memory (in GB) for the step function.

  • included_modules (list) – List of names of user-created modules that will be included, which is equivalent to ‘from module import *’

  • timeout (int) – Timeout for the step in minutes, default is 300 minutes.

create_python_function_from_function(name, function, function_variable_mappings=None, package_requirements=None, function_type=PythonFunctionType.FEATURE_GROUP.value)

Creates a custom Python function

Parameters:
  • name (str) – The name to identify the Python function.

  • function (callable) – The function callable to serialize and upload.

  • function_variable_mappings (List[PythonFunctionArguments]) – List of Python function arguments.

  • package_requirements (List) – List of package requirement strings. For example: [‘numpy==1.2.3’, ‘pandas>=1.4.0’].

  • function_type (PythonFunctionType) – Type of Python function to create. Default is FEATURE_GROUP, but can also be PLOTLY_FIG.
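
A sketch of registering a FEATURE_GROUP-type function, assuming an authenticated ApiClient named client; the name and body are placeholders:

   def double_scores(input_fg):
       # Feature group functions take and return dataframes.
       input_fg['score'] = input_fg['score'] * 2
       return input_fg

   client.create_python_function_from_function(
       name='double_scores',
       function=double_scores,
   )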

create_feature_group_from_python_function(function, table_name, input_tables=None, python_function_name=None, python_function_bindings=None, cpu_size=None, memory=None, package_requirements=None, included_modules=None)

Creates a feature group from a python function

Parameters:
  • function (callable) – The function callable for the feature group

  • table_name (str) – The table name to give the feature group

  • input_tables (list) – The input table names of the feature groups as input to the feature group function

  • python_function_name (str) – The name of the python function to create a feature group from.

  • python_function_bindings (List[PythonFunctionArguments]) – List of python function arguments

  • cpu_size (str) – Size of the cpu for the feature group function

  • memory (int) – Memory (in GB) for the feature group function

  • package_requirements (List) – List of package requirement strings. For example: [‘numpy==1.2.3’, ‘pandas>=1.4.0’]

  • included_modules (list) – List of names of user-created modules that will be included, which is equivalent to ‘from module import *’

update_python_function_code(name, function=None, function_variable_mappings=None, package_requirements=None, included_modules=None)

Update custom python function with user inputs for the given python function.

Parameters:
  • name (String) – The unique name to identify the python function in an organization.

  • function (callable) – The function callable to serialize and upload.

  • function_variable_mappings (List[PythonFunctionArguments]) – List of python function arguments

  • package_requirements (List) – List of package requirement strings. For example: [‘numpy==1.2.3’, ‘pandas>=1.4.0’]

  • included_modules (list) – List of names of user-created modules that will be included, which is equivalent to ‘from module import *’

Returns:

The python_function object.

Return type:

PythonFunction

create_algorithm_from_function(name, problem_type, training_data_parameter_names_mapping=None, training_config_parameter_name=None, train_function=None, predict_function=None, predict_many_function=None, initialize_function=None, common_functions=None, config_options=None, is_default_enabled=False, project_id=None, use_gpu=False, package_requirements=None, included_modules=None)

Create a new algorithm, or update an existing algorithm if the name already exists

Parameters:
  • name (String) – The name to identify the algorithm, only uppercase letters, numbers and underscore allowed

  • problem_type (str) – The type of the problem this algorithm will work on

  • train_function (callable) – The training function callable to serialize and upload

  • predict_function (callable) – The predict function callable to serialize and upload

  • predict_many_function (callable) – The predict many function callable to serialize and upload

  • initialize_function (callable) – The initialize function callable to serialize and upload

  • common_functions (List of callables) – A list of functions that will be used by both train and predict functions, e.g. some data processing utilities

  • training_data_parameter_names_mapping (Dict) – The mapping from feature group types to training data parameter names in the train function

  • training_config_parameter_name (string) – The train config parameter name in the train function

  • config_options (Dict) – Map dataset types and configs to train function parameter names

  • is_default_enabled (bool) – Whether to train with the algorithm by default

  • project_id (Unique String Identifier) – The unique ID of the project

  • use_gpu (Boolean) – Whether this algorithm needs to run on GPU

  • package_requirements (List) – List of package requirement strings. For example: [‘numpy==1.2.3’, ‘pandas>=1.4.0’]

  • included_modules (list) – List of names of user-created modules that will be included, which is equivalent to ‘from module import *’

update_algorithm_from_function(algorithm, training_data_parameter_names_mapping=None, training_config_parameter_name=None, train_function=None, predict_function=None, predict_many_function=None, initialize_function=None, common_functions=None, config_options=None, is_default_enabled=None, use_gpu=None, package_requirements=None, included_modules=None)

Create a new algorithm, or update an existing algorithm if the name already exists

Parameters:
  • algorithm (String) – The name to identify the algorithm, only uppercase letters, numbers and underscore allowed

  • train_function (callable) – The training function callable to serialize and upload

  • predict_function (callable) – The predict function callable to serialize and upload

  • predict_many_function (callable) – The predict many function callable to serialize and upload

  • initialize_function (callable) – The initialize function callable to serialize and upload

  • common_functions (List of callables) – A list of functions that will be used by both train and predict functions, e.g. some data processing utilities

  • training_data_parameter_names_mapping (Dict) – The mapping from feature group types to training data parameter names in the train function

  • training_config_parameter_name (string) – The train config parameter name in the train function

  • config_options (Dict) – Map dataset types and configs to train function parameter names

  • is_default_enabled (Boolean) – Whether to train with the algorithm by default

  • use_gpu (Boolean) – Whether this algorithm needs to run on GPU

  • package_requirements (List) – List of package requirement strings. For example: [‘numpy==1.2.3’, ‘pandas>=1.4.0’]

  • included_modules (list) – List of names of user-created modules that will be included, which is equivalent to ‘from module import *’

get_train_function_input(project_id, training_table_names=None, training_data_parameter_name_override=None, training_config_parameter_name_override=None, training_config=None, custom_algorithm_config=None)

Get the input data for the train function to test locally.

Parameters:
  • project_id (String) – The id of the project

  • training_table_names (List) – A list of feature group tables used for training

  • training_data_parameter_name_override (Dict) – The mapping from feature group types to training data parameter names in the train function

  • training_config_parameter_name_override (String) – The train config parameter name in the train function

  • training_config (Dict) – A dictionary for Abacus.AI defined training options and values

  • custom_algorithm_config (Any) – User-defined config that can be serialized by JSON

Returns:

A dictionary that maps train function parameter names to their values.
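
A short sketch of testing a train function locally with platform-provided inputs. The project ID, table name, and training option are placeholders; train is the same local function you would register with create_algorithm_from_function.

    train_inputs = client.get_train_function_input(
        project_id='YOUR_PROJECT_ID',                # placeholder project ID
        training_table_names=['my_training_table'],  # hypothetical feature group table
        training_config={'TEST_SPLIT': 0.2},         # assumed training option
    )
    model = train(**train_inputs)  # keys match the train function's parameter names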

get_train_function_input_from_model_version(model_version, algorithm=None, training_config=None, custom_algorithm_config=None)

Get the input data for the train function to test locally, based on a trained model version.

Parameters:
  • model_version (String) – The string identifier of the model version

  • algorithm (String) – The particular algorithm’s name, whose train function to test with

  • training_config (Dict) – A dictionary for Abacus.AI defined training options and values

  • custom_algorithm_config (Any) – User-defined config that can be serialized by JSON

Returns:

A dictionary that maps train function parameter names to their values.

create_custom_loss_function(name, loss_function_type, loss_function)

Registers a new custom loss function which can be used as an objective function during model training.

Parameters:
  • name (String) – A name for the loss. Should be unique per organization. Limit: 50 characters. Only underscores, numbers, and uppercase letters are allowed

  • loss_function_type (String) – The category of problems that this loss would be applicable to. Ex - REGRESSION_DL_TF, CLASSIFICATION_DL_TF, etc.

  • loss_function (Callable) – A Python callable which takes the required arguments (e.g. (y_true, y_pred)) and returns the loss value(s) (e.g. an array of loss values of size batch_size)

Returns:

A description of the registered custom loss function

Return type:

CustomLossFunction

Raises:
  • InvalidParameterError – If the loss function name, type, or the passed function is invalid/incompatible.

  • AlreadyExistsError – If the loss function with the same name already exists in the organization
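
For example, a hedged sketch of registering a TensorFlow-based loss; the weighting scheme and the name are illustrative, and the loss_function_type value comes from the parameter description above.

    import tensorflow as tf

    def weighted_mse(y_true, y_pred):
        # Hypothetical loss: penalize under-prediction twice as heavily.
        err = y_true - y_pred
        return tf.reduce_mean(tf.where(err > 0, 2.0 * tf.square(err), tf.square(err)), axis=-1)

    loss = client.create_custom_loss_function(
        name='WEIGHTED_MSE',                    # underscores, numbers, uppercase letters only
        loss_function_type='REGRESSION_DL_TF',
        loss_function=weighted_mse,
    )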

update_custom_loss_function(name, loss_function)

Updates a previously registered custom loss function with a new function implementation.

Parameters:
  • name (String) – name of the registered custom loss.

  • loss_function (Callable) – A Python callable which takes the required arguments (e.g. (y_true, y_pred)) and returns the loss value(s) (e.g. an array of loss values of size batch_size)

Returns:

A description of the updated custom loss function

Return type:

CustomLossFunction

Raises:
  • InvalidParameterError – If the loss function name, type, or the passed function is invalid/incompatible.

  • DataNotFoundError – If a loss function with given name is not found in the organization

create_custom_metric_from_function(name, problem_type, custom_metric_function)

Registers a new custom metric which can be used as an evaluation metric for the trained model.

Parameters:
  • name (String) – A name for the metric. Should be unique per organization. Limit: 50 characters. Only underscores, numbers, and uppercase letters are allowed.

  • problem_type (String) – The problem type that this metric would be applicable to. e.g. - REGRESSION, FORECASTING, etc.

  • custom_metric_function (Callable) – A Python callable which takes the required arguments (e.g. (y_true, y_pred)) and returns the metric value.

Returns:

The newly created custom metric.

Return type:

CustomMetric

Raises:
  • InvalidParameterError – If the custom metric name, type, or the passed function is invalid/incompatible.

  • AlreadyExistsError – If a custom metric with given name already exists in the organization.
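
For example, a minimal sketch of registering a NumPy-based metric; the metric itself and its name are illustrative, and the problem type value comes from the parameter description above.

    import numpy as np

    def median_abs_error(y_true, y_pred):
        # Hypothetical metric: median absolute error over the evaluation set.
        return float(np.median(np.abs(np.asarray(y_true) - np.asarray(y_pred))))

    metric = client.create_custom_metric_from_function(
        name='MEDIAN_ABS_ERROR',
        problem_type='REGRESSION',
        custom_metric_function=median_abs_error,
    )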

update_custom_metric_from_function(name, custom_metric_function)

Updates a previously registered custom metric.

Parameters:
  • name (String) – A name for the metric. Should be unique per organization. Limit: 50 characters. Only underscores, numbers, and uppercase letters are allowed.

  • custom_metric_function (Callable) – A Python callable which takes the required arguments (e.g. (y_true, y_pred)) and returns the metric value.

Returns:

The updated custom metric.

Return type:

CustomMetric

Raises:
  • InvalidParameterError – If the custom metric name, type, or the passed function is invalid/incompatible.

  • DataNotFoundError – If a custom metric with given name is not found in the organization.

create_module_from_notebook(file_path, name)

Create a module with the code marked in the notebook. Use ‘#module_start#’ to mark the starting code cell and ‘#module_end#’ for the ending code cell.

Parameters:
  • file_path (String) – Notebook’s relative path to the root directory, e.g. ‘n1.ipynb’

  • name (String) – Name of the module to create.

Returns:

The created Abacus.ai module object

Return type:

Module

update_module_from_notebook(file_path, name)

Update the module with the code marked in the notebook. Use ‘#module_start#’ to mark the starting code cell and ‘#module_end#’ for the ending code cell.

Parameters:
  • file_path (String) – Notebook’s relative path to the root directory, e.g. ‘n1.ipynb’

  • name (String) – Name of the module to update.

Returns:

The updated Abacus.ai module object

Return type:

Module

import_module(name)

Import a previously created module. The module will be reloaded if it has been imported before. This is equivalent to importing from that module file.

Parameters:

name (String) – Name of the module to import.

Returns:

The imported Python module

Return type:

module
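
Putting the module workflow together, a sketch assuming a notebook n1.ipynb whose marked cell defines a clean_text helper (all names here are hypothetical):

    # In the notebook, the module cell is delimited like this:
    #   #module_start#
    #   def clean_text(s):
    #       return s.strip().lower()
    #   #module_end#

    client.create_module_from_notebook(file_path='n1.ipynb', name='text_utils')
    text_utils = client.import_module('text_utils')
    print(text_utils.clean_text('  Hello '))  # -> 'hello'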

run_workflow_graph(workflow_graph, sample_user_inputs={}, agent_workflow_node_id=None, agent_interface=None, package_requirements=None)

Validates the workflow graph by running the flow using sample user inputs for an AI Agent.

Parameters:
  • workflow_graph (WorkflowGraph) – The workflow graph to validate.

  • sample_user_inputs (dict) – Contains sample values for variables of type user_input for the starting node

  • agent_workflow_node_id (str) – Node ID from which to run the workflow

  • agent_interface (AgentInterface) – The interface that the agent will be deployed with.

  • package_requirements (list) – A list of package requirement strings. For example: [‘numpy==1.2.3’, ‘pandas>=1.4.0’].

Returns:

The output variables for every node in the workflow that executed.

Return type:

dict

execute_workflow_node(node, inputs)

Execute the workflow node given input arguments.

Parameters:
  • node (WorkflowGraphNode) – The workflow node to be executed.

  • inputs (dict) – The inputs to be passed to the node function.

Returns:

The outputs returned by node execution.

Return type:

dict

create_agent_from_function(project_id, agent_function, name=None, memory=None, package_requirements=None, description=None, evaluation_feature_group_id=None, workflow_graph=None)

[Deprecated] Creates an agent from a Python function

Parameters:
  • project_id (str) – The project to create the model in

  • agent_function (callable) – The agent function callable to serialize and upload

  • name (str) – The name of the agent

  • memory (int) – Memory (in GB) for hosting the agent

  • package_requirements (List) – List of package requirement strings. For example: [‘numpy==1.2.3’, ‘pandas>=1.4.0’]

  • description (str) – A description of the agent.

  • evaluation_feature_group_id (str) – The ID of the feature group to use for evaluation.

  • workflow_graph (WorkflowGraph) – The workflow graph for the agent.

update_agent_with_function(model_id, agent_function, memory=None, package_requirements=None, enable_binary_input=None, description=None, workflow_graph=None)

[Deprecated] Updates the agent with a new agent function.

Parameters:
  • model_id (str) – The unique ID associated with the AI Agent to be changed.

  • agent_function (callable) – The new agent function callable to serialize and upload

  • memory (int) – Memory (in GB) for hosting the agent

  • package_requirements (List) – List of package requirement strings. For example: [‘numpy==1.2.3’, ‘pandas>=1.4.0’]

  • enable_binary_input (bool) – If True, the agent will be able to accept binary data as inputs.

  • description (str) – A description of the agent.

  • workflow_graph (WorkflowGraph) – The workflow graph for the agent.

_attempt_deployment_sql_execution(sql)
execute_feature_group_sql(sql, fix_query_on_error=False, timeout=3600, delay=2, use_latest_version=True)

Execute a SQL query on the feature groups

Parameters:
  • sql (str) – The SQL query to execute.

  • fix_query_on_error (bool) – If enabled, the SQL query is automatically fixed if parsing fails.

  • use_latest_version (bool) – If enabled, executes the query on the latest version of the feature group, and if no version exists, a FailedDependencyError is raised. If disabled, the query is executed against the latest feature group state, irrespective of the latest version of the feature group. Defaults to True

Returns:

The result of the query.

Return type:

pandas.DataFrame
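
For example, a minimal sketch of running ad-hoc SQL against feature groups; the table name is hypothetical:

    df = client.execute_feature_group_sql(
        'SELECT user_id, COUNT(*) AS n FROM my_feature_group GROUP BY user_id',
        fix_query_on_error=True,
    )
    print(df.head())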

get_agent_context_chat_history()

Gets a history of chat messages from the current request context. Applicable within an AIAgent execute function.

Returns:

The chat history for the current request being processed by the Agent.

Return type:

List[AgentChatMessage]

set_agent_context_chat_history(chat_history)

Sets the history of chat messages in the current request context.

Parameters:

chat_history (List[AgentChatMessage]) – The chat history associated with the current request context.
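
A sketch of reading and trimming the chat history from inside an agent execute function; the role and text attribute names on AgentChatMessage are assumptions made for illustration:

    history = client.get_agent_context_chat_history()
    for msg in history:
        print(msg.role, msg.text)  # attribute names assumed
    # Keep only the last 10 messages before continuing:
    client.set_agent_context_chat_history(history[-10:])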

get_agent_context_chat_history_for_llm()

Gets a history of chat messages from the current request context. Applicable within an AIAgent execute function.

Returns:

The messages in a format suitable for the LLM.

Return type:

AgentConversation

get_agent_context_conversation_id()

Gets the deployment conversation ID from the current request context. Applicable within an AIAgent execute function.

Returns:

The deployment conversation ID for the current request being processed by the Agent.

Return type:

str

set_agent_context_conversation_id(conversation_id)

Sets the deployment conversation ID in the current request context.

Parameters:

conversation_id (str) – The deployment conversation ID for the current request being processed by the Agent.

get_agent_context_external_session_id()

Gets the external session ID from the current request context if it has been set with the request. Applicable within an AIAgent execute function.

Returns:

The external session ID for the current request being processed by the Agent.

Return type:

str

set_agent_context_external_session_id(external_session_id)

Sets the external session ID in the current request context.

Parameters:

external_session_id (str) – The external session ID for the current request being processed by the Agent.

get_agent_context_doc_ids()

Gets the document IDs from the current request context if documents have been uploaded with the request. Applicable within an AIAgent execute function.

Returns:

The document IDs for the current request being processed by the Agent.

Return type:

List[str]

set_agent_context_doc_ids(doc_ids)

Sets the doc_ids in the current request context.

Parameters:

doc_ids (List[str]) – The doc_ids associated with the current request context.

get_agent_context_doc_infos()

Gets the document information from the current request context if documents have been uploaded with the request. Applicable within an AIAgent execute function.

Returns:

The document information for the current request being processed by the Agent.

Return type:

List[dict]

set_agent_context_doc_infos(doc_infos)

Sets the doc_infos in the current request context.

Parameters:

doc_infos (List[dict]) – The document information associated with the current request context.

get_agent_context_blob_inputs()

Gets the BlobInputs from the current request context if documents have been uploaded with the request. Applicable within an AIAgent execute function.

Returns:

The BlobInputs for the current request being processed by the Agent.

Return type:

List[BlobInput]

get_agent_context_user_info()

Gets information about the user interacting with the agent. Applicable within an AIAgent execute function.

Returns:

A dict containing the email and name of the end user.

Return type:

dict

clear_agent_context()

Clears the current request context.

streaming_evaluate_prompt(prompt=None, system_message=None, llm_name=None, max_tokens=None, temperature=0.0, messages=None, response_type=None, json_response_schema=None, section_key=None)

Generate a response to the prompt using the specified model. This works the same as evaluate_prompt, but streams the text to the UI section while generating and returns the streamed text as an object of a str subclass.

Parameters:
  • prompt (str) – Prompt to use for generation.

  • system_message (str) – System prompt for models that support it.

  • llm_name (LLMName) – Name of the underlying LLM to be used for generation. Default is auto selection.

  • max_tokens (int) – Maximum number of tokens to generate. If set, the model stops generating once this token limit is reached.

  • temperature (float) – Temperature to use for generation. Higher temperatures produce more non-deterministic responses, while a value of zero produces mostly deterministic responses. Default is 0.0. A range of 0.0 - 2.0 is allowed.

  • messages (list) – A list of messages to use as conversation history. For completion models like OPENAI_GPT3_5_TEXT and PALM_TEXT this should not be set. A message is a dict with attributes: is_user (bool): Whether the message is from the user. text (str): The message’s text.

  • response_type (str) – Specifies the type of response to request from the LLM. One of ‘text’ and ‘json’. If set to ‘json’, the LLM will respond with a json formatted string whose schema can be specified using json_response_schema. Defaults to ‘text’

  • json_response_schema (dict) – A dictionary specifying the keys/schema/parameters that the LLM should adhere to in its response when response_type is ‘json’. Each parameter is mapped to a dict with the following info: type (str, required) – data type of the parameter; description (str, required) – description of the parameter; is_required (bool, optional) – whether the parameter is required. Example: json_response_schema={‘title’: {‘type’: ‘string’, ‘description’: ‘Article title’, ‘is_required’: True}, ‘body’: {‘type’: ‘string’, ‘description’: ‘Article body’}}

  • section_key (str) – Key to identify output schema section.

Returns:

The response from the model.

Return type:

text (str)
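
For example, a sketch of streaming a JSON-formatted response; the schema mirrors the example in the parameter description above, and the prompt text is illustrative:

    article = client.streaming_evaluate_prompt(
        prompt='Write a short article about feature stores.',
        system_message='You are a technical writer.',
        response_type='json',
        json_response_schema={
            'title': {'type': 'string', 'description': 'Article title', 'is_required': True},
            'body': {'type': 'string', 'description': 'Article body'},
        },
    )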

execute_python(source_code)

Executes the given source code.

Parameters:

source_code (str) – The source code to execute.

Returns:

The stdout, stderr, and exception from the source_code execution

_get_agent_app_request_id()

Gets the request ID for the current request context of the async app. Applicable within an AIAgent execute function.

Returns:

The request ID for the current request being processed by the Agent.

Return type:

str

_get_agent_async_app_caller()

Gets the caller for the current request context of the async app. Applicable within an AIAgent execute function.

Returns:

The caller for the current request being processed by the Agent.

Return type:

str

_is_proxy_app_caller()

Checks whether the caller for the current request context of the async app is the proxy app. Applicable within an AIAgent execute function.

Returns:

True if the caller is proxy app.

Return type:

bool

stream_message(message, is_transient=False)

Streams a message to the current request context. Applicable within an AIAgent execute function. If the request is from the abacus.ai app, the response will be streamed to the UI; otherwise it will be logged as info when used from a notebook or Python script.

Parameters:
  • message (str) – The message to be streamed.

  • is_transient (bool) – If True, the message will be marked as transient and will not be persisted on reload in external chatllm UI. Transient messages are useful for streaming interim updates or results.

Return type:

None
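
For example, inside an agent execute function (the message contents are illustrative):

    client.stream_message('Searching documents...', is_transient=True)  # interim update, not persisted
    client.stream_message('Search complete.')                           # persisted in the conversation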

stream_section_output(section_key, value)

Streams the value corresponding to a particular section to the current request context. Applicable within an AIAgent execute function. If the request is from the abacus.ai app, the response will be streamed to the UI; otherwise it will be logged as info when used from a notebook or Python script.

Parameters:
  • section_key (str) – The section key to which the output corresponds.

  • value (Any) – The output contents.

Return type:

None

stream_response_section(response_section)

Streams a response section to the current request context. Applicable within an AIAgent execute function. If the request is from the abacus.ai app, the response will be streamed to the UI; otherwise it will be returned as part of the response when used from a notebook or Python script.

Parameters:

response_section (ResponseSection) – The response section to be streamed.

_stream_llm_call(section_key=None, **kwargs)
_call_aiagent_app_send_message(request_id, caller, message=None, segment=None, llm_args=None, message_args=None, extra_args=None, proxy_caller=False)

Calls the AI Agent app send message endpoint.

Parameters:
  • request_id (str) – The request ID for the current request being processed by the Agent.

  • caller (str) – The caller for the current request being processed by the Agent.

  • message (str) – The message to send to the AsyncApp.

  • llm_args (dict) – The LLM arguments to send to the AsyncApp.

Returns:

The response from the AsyncApp.

Return type:

str

_status_poll(url, wait_states, method, body={}, headers=None, delay=1, timeout=1200)
execute_data_query_using_llm(query, feature_group_ids, prompt_context=None, llm_name=None, temperature=None, preview=False, schema_document_retriever_ids=None, timeout=3600, delay=2, use_latest_version=True)

Execute a data query using a large language model.

Parameters:
  • query (str) – The natural language query to execute. The query is converted to a SQL query using the language model.

  • feature_group_ids (List[str]) – A list of feature group IDs that the query should be executed against.

  • prompt_context (str) – The context message used to construct the prompt for the language model. If not provided, a default context message is used.

  • llm_name (str) – The name of the language model to use. If not provided, the default language model is used.

  • temperature (float) – The temperature to use for the language model if supported. If not provided, the default temperature is used.

  • preview (bool) – If True, a preview of the query execution is returned.

  • schema_document_retriever_ids (List[str]) – A list of document retrievers to retrieve schema information for the data query. If not provided, schema information is retrieved from the feature group metadata.

  • timeout (int) – Time limit for the call.

  • delay (int) – Polling interval for checking timeout.

  • use_latest_version (bool) – If enabled, executes the query on the latest version of the feature group, and if no version exists, a FailedDependencyError is raised. If disabled, the query is executed against the latest feature group state, irrespective of the latest version of the feature group. Defaults to True.

Returns:

The result of the query execution. Execution results can be loaded as pandas using ‘load_as_pandas’, i.e., result.execution.load_as_pandas().

Return type:

LlmExecutionResult
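
A minimal sketch of a natural-language data query; the feature group ID is a placeholder, and loading the results follows the note above:

    result = client.execute_data_query_using_llm(
        query='What were the top 5 products by revenue last month?',
        feature_group_ids=['YOUR_FEATURE_GROUP_ID'],  # placeholder ID
    )
    df = result.execution.load_as_pandas()
    print(df)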

_get_doc_retriever_deployment_info(document_retriever_id)
Parameters:

document_retriever_id (str)

get_matching_documents(document_retriever_id, query, filters=None, limit=None, result_columns=None, max_words=None, num_retrieval_margin_words=None, max_words_per_chunk=None, score_multiplier_column=None, min_score=None, required_phrases=None, filter_clause=None, crowding_limits=None, include_text_search=False)

Look up the deployed document retriever with the given query and return the matching documents.

Original documents are split into chunks and stored in the document retriever. This lookup function returns the relevant chunks from the document retriever. The returned chunks can be expanded to include more words from the original documents and merged if they overlap, where permitted by the provided settings. The returned chunks are sorted by relevance.

Parameters:
  • document_retriever_id (str) – A unique string identifier associated with the document retriever.

  • query (str) – The query to search for.

  • filters (dict) – A dictionary mapping column names to a list of values to restrict the retrieved search results.

  • limit (int) – If provided, will limit the number of results to the value specified.

  • result_columns (list) – If provided, will limit the column properties present in each result to those specified in this list.

  • max_words (int) – If provided, will limit the total number of words in the results to the value specified.

  • num_retrieval_margin_words (int) – If provided, will add this number of words from left and right of the returned chunks.

  • max_words_per_chunk (int) – If provided, will limit the number of words in each chunk to the value specified. If the value provided is smaller than the actual size of a chunk on disk, which is determined during document retriever creation, the actual chunk size will be used. That is, chunks looked up from document retrievers will not be split into smaller chunks during lookup due to this setting.

  • score_multiplier_column (str) – If provided, will use the values in this column to modify the relevance score of the returned chunks. Values in this column must be numeric.

  • min_score (float) – If provided, will filter out the results with score lower than the value specified.

  • required_phrases (list) – If provided, each result will have at least one of the phrases.

  • filter_clause (str) – If provided, filter the results of the query using this sql where clause.

  • crowding_limits (dict) – A dictionary mapping metadata columns to the maximum number of results per unique value of the column. This is used to ensure diversity of metadata attribute values in the results. If a particular attribute value has already reached its maximum count, further results with that same attribute value will be excluded from the final result set.

  • include_text_search (bool) – If true, combine the ranking of results from a BM25 text search over the documents with the vector search using reciprocal rank fusion. It leverages both lexical and semantic matching for better overall results. It’s particularly valuable in professional, technical, or specialized fields where both precision in terminology and understanding of context are important.

Returns:

The relevant documentation results found from the document retriever.

Return type:

list[DocumentRetrieverLookupResult]
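
For example, a hedged lookup sketch; the retriever ID is a placeholder, and the result attribute names printed here are assumptions:

    chunks = client.get_matching_documents(
        document_retriever_id='YOUR_RETRIEVER_ID',  # placeholder ID
        query='How do I rotate my API keys?',
        limit=5,
        include_text_search=True,  # blend BM25 ranking with vector search
    )
    for chunk in chunks:
        print(chunk.score, chunk.document)  # attribute names assumed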

create_model_from_files(project_id, location, name=None, custom_artifact_filenames={}, model_config={})

Creates a new Model and returns Upload IDs for uploading the model artifacts.

Use this in supported use cases to provide a pre-trained model and supporting artifacts to be hosted on our platform.

Parameters:
  • project_id (str) – Unique string identifier associated with the project.

  • location (str) – Cloud location for the model.

  • name (str) – Name you want your model to have. Defaults to “<Project Name> Model”.

  • custom_artifact_filenames (dict) – Optional mapping to specify which filename should be used for a given model artifact type.

  • model_config (dict) – Extra configurations that are specific to the model being created.

Returns:

The new model which is being trained.

Return type:

Model

create_model_from_local_files(project_id, name=None, optional_artifacts=None, model_config={})

Creates a new Model and returns Upload IDs for uploading the model artifacts.

Use this in supported use cases to provide a pre-trained model and supporting artifacts to be hosted on our platform.

Parameters:
  • project_id (str) – The unique ID associated with the project.

  • name (str) – The name you want your model to have. Defaults to “<Project Name> Model”.

  • optional_artifacts (list) – A list of strings describing additional artifacts for the model. An example would be a verification file.

  • model_config (dict) – Extra configurations that are specific to the model being created.

Returns:

Collection of upload IDs to upload the model artifacts.

Return type:

ModelUpload

create_model_version_from_files(model_id)

Creates a new Model Version by re-importing from the paths specified when the model was created.

Parameters:

model_id (str) – Unique string identifier of the model to create a new version of with the new model artifacts.

Returns:

The updated model.

Return type:

ModelVersion

create_model_version_from_local_files(model_id, optional_artifacts=None)

Creates a new Model Version and returns Upload IDs for uploading the associated model artifacts.

Parameters:
  • model_id (str) – Unique string identifier of the model to create a new version of with the new model artifacts.

  • optional_artifacts (list) – List of strings describing additional artifacts for the model, e.g. a verification file.

Returns:

Collection of upload IDs to upload the model artifacts.

Return type:

ModelUpload

get_streaming_chat_response(deployment_token, deployment_id, messages, llm_name=None, num_completion_tokens=None, system_message=None, temperature=0.0, filter_key_values=None, search_score_cutoff=None, chat_config=None, ignore_documents=False, include_search_results=False)

Return an asynchronous generator which continues the conversation based on the input messages and search results.

Parameters:
  • deployment_token (str) – The deployment token to authenticate access to created deployments. This token is only authorized to predict on deployments in this project, so it is safe to embed this model inside of an application or website.

  • deployment_id (str) – The unique identifier to a deployment created under the project.

  • messages (list) – A list of chronologically ordered messages, starting with a user message and alternating sources. A message is a dict with attributes: is_user (bool): Whether the message is from the user. text (str): The message’s text.

  • llm_name (str) – Name of the specific LLM backend to use to power the chat experience

  • num_completion_tokens (int) – Default for maximum number of tokens for chat answers

  • system_message (str) – The generative LLM system message

  • temperature (float) – The generative LLM temperature

  • filter_key_values (dict) – A dictionary mapping column names to a list of values to restrict the retrieved search results.

  • search_score_cutoff (float) – Cutoff for the document retriever score. Matching search results below this score will be ignored.

  • chat_config (dict) – A dictionary specifying the query chat config override.

  • ignore_documents (bool) – If True, will ignore any documents and search results, and only use the messages to generate a response.

  • include_search_results (bool) – If True, will also return search results, if relevant.

get_streaming_conversation_response(deployment_token, deployment_id, message, deployment_conversation_id=None, external_session_id=None, llm_name=None, num_completion_tokens=None, system_message=None, temperature=0.0, filter_key_values=None, search_score_cutoff=None, chat_config=None, ignore_documents=False, include_search_results=False)

Return an asynchronous generator which continues the conversation based on the input messages and search results.

Parameters:
  • deployment_token (str) – The deployment token to authenticate access to created deployments. This token is only authorized to predict on deployments in this project, so it is safe to embed this model inside of an application or website.

  • deployment_id (str) – The unique identifier to a deployment created under the project.

  • message (str) – A message from the user

  • deployment_conversation_id (str) – The unique identifier of a deployment conversation to continue. If not specified, a new one will be created.

  • external_session_id (str) – The user supplied unique identifier of a deployment conversation to continue. If specified, we will use this instead of an internal deployment conversation ID.

  • llm_name (str) – Name of the specific LLM backend to use to power the chat experience

  • num_completion_tokens (int) – Default for maximum number of tokens for chat answers

  • system_message (str) – The generative LLM system message

  • temperature (float) – The generative LLM temperature

  • filter_key_values (dict) – A dictionary mapping column names to a list of values to restrict the retrieved search results.

  • search_score_cutoff (float) – Cutoff for the document retriever score. Matching search results below this score will be ignored.

  • chat_config (dict) – A dictionary specifying the query chat config override.

  • ignore_documents (bool) – If True, will ignore any documents and search results, and only use the messages to generate a response.

  • include_search_results (bool) – If True, will also return search results, if relevant.

execute_conversation_agent_streaming(deployment_token, deployment_id, arguments=None, keyword_arguments=None, deployment_conversation_id=None, external_session_id=None, regenerate=False, doc_infos=None, agent_workflow_node_id=None)

Return an asynchronous generator which gives out the agent response stream.

Parameters:
  • deployment_token (str) – The deployment token to authenticate access to created deployments. This token is only authorized to predict on deployments in this project, so it is safe to embed this model inside of an application or website.

  • deployment_id (str) – The unique identifier to a deployment created under the project.

  • arguments (list) – A list of arguments to pass to the agent.

  • keyword_arguments (dict) – A dictionary of keyword arguments to pass to the agent.

  • deployment_conversation_id (str) – The unique identifier of a deployment conversation to continue. If not specified, a new one will be created.

  • external_session_id (str) – The user supplied unique identifier of a deployment conversation to continue. If specified, we will use this instead of an internal deployment conversation ID.

  • regenerate (bool) – If True, will regenerate the conversation from the start.

  • doc_infos (list) – A list of dictionaries containing information about the documents uploaded with the request.

  • agent_workflow_node_id (str) – The unique identifier of the agent workflow node to trigger. If not specified, the primary node will be used.

set_cache_scope(scope)

Set the scope of the cache, for example, the deployment ID.

Parameters:

scope (String) – The scope to apply to cache entries.

Returns:

None

clear_cache_scope()

Clear the previously set scope and let the system automatically determine the scope to use. If no scope is found, the cache runs locally.

set_scoped_cache_value(key, value, expiration_time=21600)

Set the value for a key in the cache scope. The scope is determined automatically inside a deployment, or can be set with set_cache_scope. If no scope is found, the cache runs locally.

Parameters:
  • key (String) – The key of the cache entry.

  • value (String) – The value of the cache entry. Only strings, integers, and floats are currently supported.

  • expiration_time (int) – How long to keep the cache key before it expires, in seconds. Defaults to 21600 (6 hours).

Returns:

None

Raises:

InvalidParameterError – If key, value or expiration_time is invalid.

get_scoped_cache_value(key)

Get the value of the key in the cache scope. The scope is determined automatically inside a deployment, or can be set with set_cache_scope. If no scope is found, the cache runs locally.

Parameters:

key (String) – The key of the cache entry.

Returns:

The value of the key

Return type:

value (String)

Raises:

Generic404Error – if the key doesn’t exist.

delete_scoped_cache_key(key)

Delete the value of the key in the cache scope. The scope is determined automatically inside a deployment, or can be set with set_cache_scope. If no scope is found, the cache runs locally.

Parameters:

key (String) – The key of the cache entry.

Returns:

None
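
Putting the scoped-cache calls together (the key and value are illustrative):

    client.set_scoped_cache_value('last_refresh', '2024-01-01T00:00:00Z', expiration_time=3600)
    value = client.get_scoped_cache_value('last_refresh')  # raises Generic404Error if missing
    client.delete_scoped_cache_key('last_refresh')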

set_agent_response_document_sources(response_document_sources)

Sets the document sources to be shown with the response on the conversation UI.

Parameters:

response_document_sources (List) – List of document retriever results to be displayed in order.

Returns:

None

get_initialized_data()

Returns the object returned by the initialize_function during agent creation.

Returns:

The object returned by the initialize_function.

add_user_to_organization(email)

Invite a user to your organization. This method will send the specified email address an invitation link to join your organization.

Parameters:

email (str) – The email address to invite to your organization.

create_organization_group(group_name, permissions, default_group=False)

Creates a new Organization Group.

Parameters:
  • group_name (str) – The name of the group.

  • permissions (list) – The list of permissions to initialize the group with.

  • default_group (bool) – If True, this group will replace the current default group.

Returns:

Information about the created Organization Group.

Return type:

OrganizationGroup

add_organization_group_permission(organization_group_id, permission)

Adds a permission to the specified Organization Group.

Parameters:
  • organization_group_id (str) – Unique string identifier of the Organization Group.

  • permission (str) – Permission to add to the Organization Group.

remove_organization_group_permission(organization_group_id, permission)

Removes a permission from the specified Organization Group.

Parameters:
  • organization_group_id (str) – Unique string identifier of the Organization Group.

  • permission (str) – The permission to remove from the Organization Group.

delete_organization_group(organization_group_id)

Deletes the specified Organization Group

Parameters:

organization_group_id (str) – Unique string identifier of the organization group.

add_user_to_organization_group(organization_group_id, email)

Adds a user to the specified Organization Group.

Parameters:
  • organization_group_id (str) – Unique string identifier of the Organization Group.

  • email (str) – Email of the user to be added to the group.

remove_user_from_organization_group(organization_group_id, email)

Removes a user from an Organization Group.

Parameters:
  • organization_group_id (str) – Unique string identifier of the Organization Group.

  • email (str) – Email of the user to remove.

set_default_organization_group(organization_group_id)

Sets the default Organization Group to which all new users joining an organization are automatically added.

Parameters:

organization_group_id (str) – Unique string identifier of the Organization Group.

delete_api_key(api_key_id)

Delete a specified API key.

Parameters:

api_key_id (str) – The ID of the API key to delete.

remove_user_from_organization(email)

Removes the specified user from the organization. You can remove yourself; otherwise, you must be an organization administrator to use this method to remove other users from the organization.

Parameters:

email (str) – The email address of the user to remove from the organization.

send_email(email, subject, body, is_html=False)

Send an email to the specified email address with provided subject and contents.

Parameters:
  • email (str) – The email address to send the email to.

  • subject (str) – The subject of the email.

  • body (str) – The body of the email.

  • is_html (bool) – Whether the body is HTML.

create_deployment_webhook(deployment_id, endpoint, webhook_event_type, payload_template=None)

Create a webhook attached to a given deployment ID.

Parameters:
  • deployment_id (str) – Unique string identifier for the deployment this webhook will attach to.

  • endpoint (str) – URI that the webhook will send HTTP POST requests to.

  • webhook_event_type (str) – One of ‘DEPLOYMENT_START’, ‘DEPLOYMENT_SUCCESS’, or ‘DEPLOYMENT_FAILED’.

  • payload_template (dict) – Template for the body of the HTTP POST requests. Defaults to {}.

Returns:

The webhook attached to the deployment.

Return type:

Webhook
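
For example, a sketch of attaching a webhook to a deployment; the IDs, endpoint, and payload are placeholders:

    webhook = client.create_deployment_webhook(
        deployment_id='YOUR_DEPLOYMENT_ID',
        endpoint='https://example.com/hooks/abacus',
        webhook_event_type='DEPLOYMENT_SUCCESS',
        payload_template={'event': 'deployment_success'},  # simple static payload
    )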

update_webhook(webhook_id, endpoint=None, webhook_event_type=None, payload_template=None)

Update the webhook

Parameters:
  • webhook_id (str) – The ID of the webhook to be updated.

  • endpoint (str) – If provided, changes the webhook’s endpoint.

  • webhook_event_type (str) – If provided, changes the event type.

  • payload_template (dict) – If provided, changes the payload template.

delete_webhook(webhook_id)

Delete the webhook

Parameters:

webhook_id (str) – Unique identifier of the target webhook.

create_project(name, use_case)

Creates a project with the specified project name and use case. Creating a project creates a container for all datasets and models associated with a particular problem/project. For example, if you want to create a model to detect fraud, you need to first create a project, upload datasets, create feature groups, and then create one or more models to get predictions for your use case.

Parameters:
  • name (str) – The project’s name.

  • use_case (str) – The use case that the project solves. Refer to our [guide on use cases](https://api.abacus.ai/app/help/useCases) for further details of each use case. The following enums are currently available for you to choose from: LANGUAGE_DETECTION, NLP_SENTIMENT, NLP_SEARCH, NLP_CHAT, CHAT_LLM, NLP_SENTENCE_BOUNDARY_DETECTION, NLP_CLASSIFICATION, NLP_SUMMARIZATION, NLP_DOCUMENT_VISUALIZATION, AI_AGENT, EMBEDDINGS_ONLY, MODEL_WITH_EMBEDDINGS, TORCH_MODEL, TORCH_MODEL_WITH_EMBEDDINGS, PYTHON_MODEL, NOTEBOOK_PYTHON_MODEL, DOCKER_MODEL, DOCKER_MODEL_WITH_EMBEDDINGS, CUSTOMER_CHURN, ENERGY, EVENT_ANOMALY_DETECTION, FINANCIAL_METRICS, CUMULATIVE_FORECASTING, FRAUD_ACCOUNT, FRAUD_TRANSACTIONS, CLOUD_SPEND, TIMESERIES_ANOMALY, OPERATIONS_MAINTENANCE, PERS_PROMOTIONS, PREDICTING, FEATURE_STORE, RETAIL, SALES_FORECASTING, SALES_SCORING, FEED_RECOMMEND, USER_RANKINGS, NAMED_ENTITY_RECOGNITION, USER_RECOMMENDATIONS, USER_RELATED, VISION, VISION_REGRESSION, VISION_OBJECT_DETECTION, FEATURE_DRIFT, SCHEDULING, GENERIC_FORECASTING, PRETRAINED_IMAGE_TEXT_DESCRIPTION, PRETRAINED_SPEECH_RECOGNITION, PRETRAINED_STYLE_TRANSFER, PRETRAINED_TEXT_TO_IMAGE_GENERATION, PRETRAINED_OCR_DOCUMENT_TO_TEXT, THEME_ANALYSIS, CLUSTERING, CLUSTERING_TIMESERIES, FINETUNED_LLM, PRETRAINED_INSTRUCT_PIX2PIX, PRETRAINED_TEXT_CLASSIFICATION.

Returns:

This object represents the newly created project.

Return type:

Project

rename_project(project_id, name)

This method renames a project after it is created.

Parameters:
  • project_id (str) – The unique identifier for the project.

  • name (str) – The new name for the project.

delete_project(project_id)

Delete a specified project from your organization.

This method deletes the project, its associated trained models, and deployments. The datasets attached to the specified project remain available for use with other projects in the organization.

This method will not delete a project that contains active deployments. Ensure that all active deployments are stopped before using the delete option.

Note: Projects, models, and deployments cannot be recovered once they are deleted.

Parameters:

project_id (str) – The unique ID of the project to delete.

add_project_tags(project_id, tags)

This method adds the given tags to a project.

Parameters:
  • project_id (str) – The unique identifier for the project.

  • tags (list) – The tags to add to the project.

remove_project_tags(project_id, tags)

This method removes the given tags from a project.

Parameters:
  • project_id (str) – The unique identifier for the project.

  • tags (list) – The tags to remove from the project.

add_feature_group_to_project(feature_group_id, project_id, feature_group_type='CUSTOM_TABLE')

Adds a feature group to a project.

Parameters:
  • feature_group_id (str) – The unique ID associated with the feature group.

  • project_id (str) – The unique ID associated with the project.

  • feature_group_type (str) – The feature group type of the feature group, based on the use case under which the feature group is being created.

set_project_feature_group_config(feature_group_id, project_id, project_config=None)

Sets a feature group’s project config

Parameters:
  • feature_group_id (str) – Unique string identifier for the feature group.

  • project_id (str) – Unique string identifier for the project.

  • project_config (ProjectFeatureGroupConfig) – Feature group’s project configuration.

remove_feature_group_from_project(feature_group_id, project_id)

Removes a feature group from a project.

Parameters:
  • feature_group_id (str) – The unique ID associated with the feature group.

  • project_id (str) – The unique ID associated with the project.

set_feature_group_type(feature_group_id, project_id, feature_group_type='CUSTOM_TABLE')

Update the feature group type in a project. The feature group must already be added to the project.

Parameters:
  • feature_group_id (str) – Unique identifier associated with the feature group.

  • project_id (str) – Unique identifier associated with the project.

  • feature_group_type (str) – The feature group type to set the feature group as.

set_feature_mapping(project_id, feature_group_id, feature_name, feature_mapping=None, nested_column_name=None)

Set a column’s feature mapping. If the column mapping is single-use and already set in another column in this feature group, this call will first remove the other column’s mapping and move it to this column.

Parameters:
  • project_id (str) – The unique ID associated with the project.

  • feature_group_id (str) – The unique ID associated with the feature group.

  • feature_name (str) – The name of the feature.

  • feature_mapping (str) – The mapping of the feature in the feature group.

  • nested_column_name (str) – The name of the nested column if the input feature is part of a nested feature group for the given feature_group_id.

Returns:

A list of objects that describes the resulting feature group’s schema after the feature’s featureMapping is set.

Return type:

list[Feature]

add_annotation(annotation, feature_group_id, feature_name, doc_id=None, feature_group_row_identifier=None, annotation_source='ui', status=None, comments=None, project_id=None, save_metadata=False, pages=None)

Add an annotation entry to the database.

Parameters:
  • annotation (dict) – The annotation to add. Format of the annotation is determined by its annotation type.

  • feature_group_id (str) – The ID of the feature group the annotation is on.

  • feature_name (str) – The name of the feature the annotation is on.

  • doc_id (str) – The ID of the primary document the annotation is on. At least one of the doc_id or feature_group_row_identifier must be provided in order to identify the correct annotation.

  • feature_group_row_identifier (str) – The key value of the feature group row the annotation is on (cast to string). Usually the feature group’s primary / identifier key value. At least one of the doc_id or feature_group_row_identifier must be provided in order to identify the correct annotation.

  • annotation_source (str) – Indicator of whether the annotation came from the UI, bulk upload, etc.

  • status (str) – The status of the annotation. Can be one of ‘todo’, ‘in_progress’, ‘done’. This is optional.

  • comments (dict) – Comments for the annotation. This is a dictionary of feature name to the corresponding comment. This is optional.

  • project_id (str) – The ID of the project that the annotation is associated with. This is optional.

  • save_metadata (bool) – Whether to save the metadata for the annotation. This is optional.

  • pages (list) – List of page numbers to consider while processing the annotation. This is optional. doc_id must be provided if pages is provided.

Returns:

The annotation entry that was added.

Return type:

AnnotationEntry

describe_annotation(feature_group_id, feature_name=None, doc_id=None, feature_group_row_identifier=None)

Get the latest annotation entry for a given feature group, feature, and document.

Parameters:
  • feature_group_id (str) – The ID of the feature group the annotation is on.

  • feature_name (str) – The name of the feature the annotation is on.

  • doc_id (str) – The ID of the primary document the annotation is on. At least one of the doc_id or feature_group_row_identifier must be provided in order to identify the correct annotation.

  • feature_group_row_identifier (str) – The key value of the feature group row the annotation is on (cast to string). Usually the feature group’s primary / identifier key value. At least one of the doc_id or feature_group_row_identifier must be provided in order to identify the correct annotation.

Returns:

The latest annotation entry for the given feature group, feature, document, and/or annotation key value.

Return type:

AnnotationEntry

update_annotation_status(feature_group_id, feature_name, status, doc_id=None, feature_group_row_identifier=None, save_metadata=False)

Update the status of an annotation entry.

Parameters:
  • feature_group_id (str) – The ID of the feature group the annotation is on.

  • feature_name (str) – The name of the feature the annotation is on.

  • status (str) – The new status of the annotation. Must be one of the following: ‘TODO’, ‘IN_PROGRESS’, ‘DONE’.

  • doc_id (str) – The ID of the primary document the annotation is on. At least one of the doc_id or feature_group_row_identifier must be provided in order to identify the correct annotation.

  • feature_group_row_identifier (str) – The key value of the feature group row the annotation is on (cast to string). Usually the feature group’s primary / identifier key value. At least one of the doc_id or feature_group_row_identifier must be provided in order to identify the correct annotation.

  • save_metadata (bool) – If True, save the metadata for the annotation entry.

Returns:

The updated annotation entry.

Return type:

AnnotationEntry

get_document_to_annotate(feature_group_id, project_id, feature_name, feature_group_row_identifier=None, get_previous=False)

Get an available document that needs to be annotated for an annotation feature group.

Parameters:
  • feature_group_id (str) – The ID of the feature group the annotation is on.

  • project_id (str) – The ID of the project that the annotation is associated with.

  • feature_name (str) – The name of the feature the annotation is on.

  • feature_group_row_identifier (str) – The key value of the feature group row the annotation is on (cast to string). Usually the primary key value. If provided, fetch the immediate next (or previous) available document.

  • get_previous (bool) – If True, get the previous document instead of the next document. Applicable if feature_group_row_identifier is provided.

Returns:

The document to annotate.

Return type:

AnnotationDocument

import_annotation_labels(feature_group_id, file, annotation_type)

Imports annotation labels from a CSV file. All valid values in the file will be imported as labels (including the header row if present).

Parameters:
  • feature_group_id (str) – The unique string identifier of the feature group.

  • file (io.TextIOBase) – The file to import. Must be a CSV file.

  • annotation_type (str) – The type of the annotation.

Returns:

The annotation config for the feature group.

Return type:

AnnotationConfig

create_feature_group(table_name, sql, description=None, version_limit=30)

Creates a new FeatureGroup from a SQL statement.

Parameters:
  • table_name (str) – The unique name to be given to the FeatureGroup. Can be up to 120 characters long and can only contain alphanumeric characters and underscores.

  • sql (str) – Input SQL statement for forming the FeatureGroup.

  • description (str) – The description about the FeatureGroup.

  • version_limit (int) – The number of versions to preserve for the FeatureGroup (minimum 30).

Returns:

The created FeatureGroup.

Return type:

FeatureGroup
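
For example, a minimal sketch of a SQL-defined feature group; the table and column names are hypothetical:

    fg = client.create_feature_group(
        table_name='user_purchase_stats',  # alphanumeric characters and underscores only
        sql='SELECT user_id, SUM(amount) AS total_spend FROM purchases GROUP BY user_id',
        description='Aggregate spend per user',
    )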

create_feature_group_from_template(table_name, feature_group_template_id, template_bindings=None, should_attach_feature_group_to_template=True, description=None, version_limit=30)

Creates a new feature group from a feature group template.

Parameters:
  • table_name (str) – The unique name to be given to the feature group. Can be up to 120 characters long and can only contain alphanumeric characters and underscores.

  • feature_group_template_id (str) – The unique ID associated with the template that will be used to create this feature group.

  • template_bindings (list) – Variable bindings that override the template’s variable values.

  • should_attach_feature_group_to_template (bool) – Set to False to create a feature group but not leave it attached to the template that created it.

  • description (str) – A user-friendly description of this feature group.

  • version_limit (int) – The number of versions to preserve for the feature group (minimum 30).

Returns:

The created feature group.

Return type:

FeatureGroup

create_feature_group_from_function(table_name, function_source_code=None, function_name=None, input_feature_groups=None, description=None, cpu_size=None, memory=None, package_requirements=None, use_original_csv_names=False, python_function_name=None, python_function_bindings=None, use_gpu=None, version_limit=30)

Creates a new Feature Group from user-provided code. Python is currently the only supported language.

If a list of input feature groups are supplied, we will provide DataFrames (pandas, in the case of Python) with the materialized feature groups for those input feature groups as arguments to the function.

This method expects the source code to be a valid language source file containing a function. This function needs to return a DataFrame when executed; this DataFrame will be used as the materialized version of this feature group table.

Parameters:
  • table_name (str) – The unique name to be given to the feature group. Can be up to 120 characters long and can only contain alphanumeric characters and underscores.

  • function_source_code (str) – Contents of a valid source code file in a supported Feature Group specification language (currently only Python). The source code should contain a function called function_name. A list of allowed import and system libraries for each language is specified in the user functions documentation section.

  • function_name (str) – Name of the function found in the source code that will be executed (on the optional inputs) to materialize this feature group.

  • input_feature_groups (list) – List of feature group names that are supplied to the function as parameters. Each of the parameters is a materialized DataFrame (the same type as the function’s return value).

  • description (str) – The description for this feature group.

  • cpu_size (CPUSize) – Size of the CPU for the feature group function.

  • memory (MemorySize) – Memory (in GB) for the feature group function.

  • package_requirements (list) – List of package requirements for the feature group function. For example: [‘numpy==1.2.3’, ‘pandas>=1.4.0’]

  • use_original_csv_names (bool) – Defaults to False; if set, the original column names are used for input feature groups from CSV datasets.

  • python_function_name (str) – Name of Python Function that contains the source code and function arguments.

  • python_function_bindings (List) – List of python function arguments.

  • use_gpu (bool) – Whether the feature group needs a GPU or not; otherwise defaults to CPU.

  • version_limit (int) – The number of versions to preserve for the feature group (minimum 30).

Returns:

The created feature group

Return type:

FeatureGroup
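
For example, a hedged sketch of a function-backed feature group; the input table and column names are hypothetical:

    import textwrap

    source = textwrap.dedent('''
        def build_features(purchases):
            # 'purchases' arrives as the materialized pandas DataFrame of the input feature group
            out = purchases.groupby('user_id', as_index=False)['amount'].sum()
            return out.rename(columns={'amount': 'total_spend'})
    ''')

    fg = client.create_feature_group_from_function(
        table_name='user_spend_fn',
        function_source_code=source,
        function_name='build_features',
        input_feature_groups=['purchases'],
    )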

create_sampling_feature_group(feature_group_id, table_name, sampling_config, description=None)

Creates a new Feature Group defined as a sample of rows from another Feature Group.

For efficiency, sampling is approximate unless otherwise specified. (e.g. the number of rows may vary slightly from what was requested).

Parameters:
  • feature_group_id (str) – The unique ID associated with the pre-existing Feature Group that will be sampled by this new Feature Group. i.e. the input for sampling.

  • table_name (str) – The unique name to be given to this sampling Feature Group. Can be up to 120 characters long and can only contain alphanumeric characters and underscores.

  • sampling_config (SamplingConfig) – Dictionary defining the sampling method and its parameters.

  • description (str) – A human-readable description of this Feature Group.

Returns:

The created Feature Group.

Return type:

FeatureGroup

create_merge_feature_group(source_feature_group_id, table_name, merge_config, description=None)

Creates a new feature group defined as the union of other feature group versions.

Parameters:
  • source_feature_group_id (str) – Unique string identifier corresponding to the dataset feature group that will have its versions merged into this feature group.

  • table_name (str) – Unique string identifier to be given to this merge feature group. Can be up to 120 characters long and can only contain alphanumeric characters and underscores.

  • merge_config (MergeConfig) – JSON object defining the merging method and its parameters.

  • description (str) – Human-readable description of this feature group.

Returns:

The created feature group.

Return type:

FeatureGroup

create_operator_feature_group(source_feature_group_id, table_name, operator_config, description=None)

Creates a new Feature Group defined by a pre-defined operator applied to another Feature Group.

Parameters:
  • source_feature_group_id (str) – Unique string identifier corresponding to the Feature Group to which the operator will be applied.

  • table_name (str) – Unique string identifier for the operator Feature Group. Can be up to 120 characters long and can only contain alphanumeric characters and underscores.

  • operator_config (OperatorConfig) – The operator config is used to define the operator and its parameters.

  • description (str) – Human-readable description of the Feature Group.

Returns:

The created Feature Group.

Return type:

FeatureGroup

create_snapshot_feature_group(feature_group_version, table_name)

Creates a Snapshot Feature Group corresponding to a specific Feature Group version.

Parameters:
  • feature_group_version (str) – Unique string identifier associated with the Feature Group version being snapshotted.

  • table_name (str) – Name for the newly created Snapshot Feature Group table. Can be up to 120 characters long and can only contain alphanumeric characters and underscores.

Returns:

Feature Group corresponding to the newly created Snapshot.

Return type:

FeatureGroup

create_online_feature_group(table_name, primary_key, description=None)

Creates an Online Feature Group.

Parameters:
  • table_name (str) – Name for the newly created feature group. Can be up to 120 characters long and can only contain alphanumeric characters and underscores.

  • primary_key (str) – The primary key for indexing the online feature group.

  • description (str) – Human-readable description of the Feature Group.

Returns:

The created online feature group.

Return type:

FeatureGroup
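
A short sketch of creating an online feature group for low-latency lookups; the table name and key are placeholders.

    from abacusai import ApiClient

    client = ApiClient('YOUR_API_KEY')

    online_fg = client.create_online_feature_group(
        table_name='user_profiles_online',
        primary_key='user_id',
        description='Online feature group indexed by user_id for fast lookups.',
    )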

set_feature_group_sampling_config(feature_group_id, sampling_config)

Set a FeatureGroup’s sampling to the config values provided, so that the rows the FeatureGroup returns will be a sample of those it would otherwise have returned.

Parameters:
  • feature_group_id (str) – The unique identifier associated with the FeatureGroup.

  • sampling_config (SamplingConfig) – A JSON string object specifying the sampling method and parameters specific to that sampling method. An empty sampling_config indicates no sampling.

Returns:

The updated FeatureGroup.

Return type:

FeatureGroup

set_feature_group_merge_config(feature_group_id, merge_config)

Set a MergeFeatureGroup’s merge config to the values provided, so that the feature group only returns a bounded range of an incremental dataset.

Parameters:
  • feature_group_id (str) – Unique identifier associated with the feature group.

  • merge_config (MergeConfig) – JSON object string specifying the merge rule. An empty merge_config will default to only including the latest dataset version.

Returns:

The updated FeatureGroup.

Return type:

FeatureGroup

set_feature_group_operator_config(feature_group_id, operator_config)

Set an OperatorFeatureGroup’s operator config to the values provided.

Parameters:
  • feature_group_id (str) – A unique string identifier associated with the feature group.

  • operator_config (OperatorConfig) – A dictionary object specifying the pre-defined operations.

Returns:

The updated FeatureGroup.

Return type:

FeatureGroup

set_feature_group_schema(feature_group_id, schema)

Creates a new schema and points the feature group to the new feature group schema ID.

Parameters:
  • feature_group_id (str) – Unique string identifier associated with the feature group.

  • schema (list) – JSON string containing an array of objects with ‘name’ and ‘dataType’ properties.

create_feature(feature_group_id, name, select_expression)

Creates a new feature in a Feature Group from a SQL select statement.

Parameters:
  • feature_group_id (str) – The unique ID associated with the Feature Group.

  • name (str) – The name of the feature to add.

  • select_expression (str) – SQL SELECT expression to create the feature.

Returns:

A Feature Group object with the newly added feature.

Return type:

FeatureGroup
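
As a hedged example of create_feature, the column names and SQL expression below are hypothetical; any SQL SELECT expression over the feature group’s columns works the same way.

    from abacusai import ApiClient

    client = ApiClient('YOUR_API_KEY')

    # Add a derived feature computed from existing columns.
    fg = client.create_feature(
        feature_group_id='FEATURE_GROUP_ID',
        name='full_name',
        select_expression="CONCAT(first_name, ' ', last_name)",
    )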

add_feature_group_tag(feature_group_id, tag)

Adds a tag to the feature group

Parameters:
  • feature_group_id (str) – Unique identifier of the feature group.

  • tag (str) – The tag to add to the feature group.

remove_feature_group_tag(feature_group_id, tag)

Removes a tag from the specified feature group.

Parameters:
  • feature_group_id (str) – Unique string identifier of the feature group.

  • tag (str) – The tag to remove from the feature group.

add_annotatable_feature(feature_group_id, name, annotation_type)

Add an annotatable feature in a Feature Group

Parameters:
  • feature_group_id (str) – The unique string identifier for the feature group.

  • name (str) – The name of the feature to add.

  • annotation_type (str) – The type of annotation to set.

Returns:

The feature group after the feature has been set

Return type:

FeatureGroup

set_feature_as_annotatable_feature(feature_group_id, feature_name, annotation_type, feature_group_row_identifier_feature=None, doc_id_feature=None)

Sets an existing feature as an annotatable feature (Feature that can be annotated).

Parameters:
  • feature_group_id (str) – The unique ID associated with the feature group.

  • feature_name (str) – The name of the feature to set as annotatable.

  • annotation_type (str) – The type of annotation label to add.

  • feature_group_row_identifier_feature (str) – The feature whose value (cast to string) uniquely identifies the feature group row the annotation is on. At least one of the doc_id or this key value must be provided so that the correct annotation can be identified.

  • doc_id_feature (str) – The name of the document ID feature.

Returns:

A feature group object with the newly added annotatable feature.

Return type:

FeatureGroup

set_annotation_status_feature(feature_group_id, feature_name)

Sets a feature as the annotation status feature for a feature group.

Parameters:
  • feature_group_id (str) – The ID of the feature group.

  • feature_name (str) – The name of the feature to set as the annotation status feature.

Returns:

The updated feature group.

Return type:

FeatureGroup

unset_feature_as_annotatable_feature(feature_group_id, feature_name)

Unsets a feature as annotatable

Parameters:
  • feature_group_id (str) – The unique string identifier of the feature group.

  • feature_name (str) – The name of the feature to unset.

Returns:

The feature group after unsetting the feature

Return type:

FeatureGroup

add_feature_group_annotation_label(feature_group_id, label_name, annotation_type, label_definition=None)

Adds an annotation label

Parameters:
  • feature_group_id (str) – The unique string identifier of the feature group.

  • label_name (str) – The name of the label.

  • annotation_type (str) – The type of the annotation to set.

  • label_definition (str) – The definition of the label.

Returns:

The feature group after adding the annotation label

Return type:

FeatureGroup

remove_feature_group_annotation_label(feature_group_id, label_name)

Removes an annotation label

Parameters:
  • feature_group_id (str) – The unique string identifier of the feature group.

  • label_name (str) – The name of the label to remove.

Returns:

The feature group after removing the annotation label

Return type:

FeatureGroup
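
Putting the annotation calls together, a sketch of a typical setup: mark a feature as annotatable, then register a label annotators can apply. The annotation type string and label names are illustrative assumptions; consult the supported annotation types for real values.

    from abacusai import ApiClient

    client = ApiClient('YOUR_API_KEY')

    # Add an annotatable feature, then attach a label for it.
    fg = client.add_annotatable_feature(
        feature_group_id='FEATURE_GROUP_ID',
        name='ticket_category',
        annotation_type='ANNOTATION_TYPE',  # placeholder; use a supported type
    )
    fg = client.add_feature_group_annotation_label(
        feature_group_id='FEATURE_GROUP_ID',
        label_name='billing_issue',
        annotation_type='ANNOTATION_TYPE',  # placeholder
        label_definition='Tickets about invoices, charges, or refunds.',
    )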

add_feature_tag(feature_group_id, feature, tag)

Adds a tag on a feature

Parameters:
  • feature_group_id (str) – The unique string identifier of the feature group.

  • feature (str) – The feature to set the tag on.

  • tag (str) – The tag to set on the feature.

remove_feature_tag(feature_group_id, feature, tag)

Removes a tag from a feature

Parameters:
  • feature_group_id (str) – The unique string identifier of the feature group.

  • feature (str) – The feature to remove the tag from.

  • tag (str) – The tag to remove.

create_nested_feature(feature_group_id, nested_feature_name, table_name, using_clause, where_clause=None, order_clause=None)

Creates a new nested feature in a feature group from a SQL statement.

Parameters:
  • feature_group_id (str) – The unique ID associated with the feature group.

  • nested_feature_name (str) – The name of the feature.

  • table_name (str) – The table name of the feature group to nest. Can be up to 120 characters long and can only contain alphanumeric characters and underscores.

  • using_clause (str) – The SQL join column or logic to join the nested table with the parent.

  • where_clause (str) – A SQL WHERE statement to filter the nested rows.

  • order_clause (str) – A SQL clause to order the nested rows.

Returns:

A feature group object with the newly added nested feature.

Return type:

FeatureGroup
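
A hedged sketch of nesting one feature group under another; the table and column names are hypothetical stand-ins for a parent orders table and a child line-items table.

    from abacusai import ApiClient

    client = ApiClient('YOUR_API_KEY')

    # Nest line items under each order, joined on order_id.
    fg = client.create_nested_feature(
        feature_group_id='ORDERS_FEATURE_GROUP_ID',
        nested_feature_name='order_items',
        table_name='order_items_table',
        using_clause='order_id',
        where_clause='quantity > 0',
        order_clause='line_number ASC',
    )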

update_nested_feature(feature_group_id, nested_feature_name, table_name=None, using_clause=None, where_clause=None, order_clause=None, new_nested_feature_name=None)

Updates a previously existing nested feature in a feature group.

Parameters:
  • feature_group_id (str) – The unique ID associated with the feature group.

  • nested_feature_name (str) – The name of the feature to be updated.

  • table_name (str) – The name of the table. Can be up to 120 characters long and can only contain alphanumeric characters and underscores.

  • using_clause (str) – The SQL join column or logic to join the nested table with the parent.

  • where_clause (str) – An SQL WHERE statement to filter the nested rows.

  • order_clause (str) – An SQL clause to order the nested rows.

  • new_nested_feature_name (str) – New name for the nested feature.

Returns:

A feature group object with the updated nested feature.

Return type:

FeatureGroup

delete_nested_feature(feature_group_id, nested_feature_name)

Delete a nested feature.

Parameters:
  • feature_group_id (str) – The unique ID associated with the feature group.

  • nested_feature_name (str) – The name of the feature to be deleted.

Returns:

A feature group object without the specified nested feature.

Return type:

FeatureGroup

create_point_in_time_feature(feature_group_id, feature_name, history_table_name, aggregation_keys, timestamp_key, historical_timestamp_key, expression, lookback_window_seconds=None, lookback_window_lag_seconds=0, lookback_count=None, lookback_until_position=0)

Creates a new point in time feature in a feature group using another historical feature group, window spec, and aggregate expression.

We use the aggregation keys and either the lookbackWindowSeconds or the lookbackCount values to perform the window aggregation for every row in the current feature group.

If the window is specified in seconds, then all rows in the history table which match the aggregation keys, and whose historicalTimeFeature falls within the lookback window and is less than the value of the current row’s timeFeature, are considered. An optional lookbackWindowLagSeconds (positive or negative) can be used to offset the current value of the timeFeature. If this value is negative, we will look at future rows in the history table, so care must be taken to ensure that these rows are available in the online context when performing a lookup on this feature group.

If the window is specified in counts, then we order the historical table rows, aligning by time, and consider the rows in the window from the row just prior to the current one back to lookbackCount positions. The lag is specified in terms of positions using lookbackUntilPosition.

Parameters:
  • feature_group_id (str) – The unique ID associated with the feature group.

  • feature_name (str) – The name of the feature to create.

  • history_table_name (str) – The table name of the history table.

  • aggregation_keys (list) – List of keys to use for joining the historical table and performing the window aggregation.

  • timestamp_key (str) – Name of feature which contains the timestamp value for the point in time feature.

  • historical_timestamp_key (str) – Name of feature which contains the historical timestamp.

  • expression (str) – SQL aggregate expression which can convert a sequence of rows into a scalar value.

  • lookback_window_seconds (float) – If window is specified in terms of time, number of seconds in the past from the current time for start of the window.

  • lookback_window_lag_seconds (float) – Optional lag to offset the closest point for the window. If it is positive, we delay the start of window. If it is negative, we are looking at the “future” rows in the history table.

  • lookback_count (int) – If window is specified in terms of count, the start position of the window (0 is the current row).

  • lookback_until_position (int) – Optional lag to offset the closest point for the window. If it is positive, we delay the start of window by that many rows. If it is negative, we are looking at those many “future” rows in the history table.

Returns:

A feature group object with the newly added point-in-time feature.

Return type:

FeatureGroup
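
To make the window semantics concrete, a sketch of a time-based point-in-time feature counting a user’s purchases over the preceding seven days; all identifiers are placeholders.

    from abacusai import ApiClient

    client = ApiClient('YOUR_API_KEY')

    # For each row, count matching history rows within the prior 7 days.
    fg = client.create_point_in_time_feature(
        feature_group_id='EVENTS_FEATURE_GROUP_ID',
        feature_name='purchases_last_7d',
        history_table_name='purchase_history',
        aggregation_keys=['user_id'],
        timestamp_key='event_time',
        historical_timestamp_key='purchase_time',
        expression='COUNT(1)',
        lookback_window_seconds=7 * 24 * 3600,  # 7-day window
    )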

update_point_in_time_feature(feature_group_id, feature_name, history_table_name=None, aggregation_keys=None, timestamp_key=None, historical_timestamp_key=None, expression=None, lookback_window_seconds=None, lookback_window_lag_seconds=None, lookback_count=None, lookback_until_position=None, new_feature_name=None)

Updates an existing Point-in-Time (PiT) feature in a feature group. See createPointInTimeFeature for detailed semantics.

Parameters:
  • feature_group_id (str) – The unique ID associated with the feature group.

  • feature_name (str) – The name of the feature.

  • history_table_name (str) – The table name of the history table. If not specified, we use the current table to do a self join.

  • aggregation_keys (list) – List of keys to use for joining the historical table and performing the window aggregation.

  • timestamp_key (str) – Name of the feature which contains the timestamp value for the PiT feature.

  • historical_timestamp_key (str) – Name of the feature which contains the historical timestamp.

  • expression (str) – SQL Aggregate expression which can convert a sequence of rows into a scalar value.

  • lookback_window_seconds (float) – If the window is specified in terms of time, the number of seconds in the past from the current time for the start of the window.

  • lookback_window_lag_seconds (float) – Optional lag to offset the closest point for the window. If it is positive, we delay the start of the window. If it is negative, we are looking at the “future” rows in the history table.

  • lookback_count (int) – If the window is specified in terms of count, the start position of the window (0 is the current row).

  • lookback_until_position (int) – Optional lag to offset the closest point for the window. If it is positive, we delay the start of the window by that many rows. If it is negative, we are looking at those many “future” rows in the history table.

  • new_feature_name (str) – New name for the PiT feature.

Returns:

A feature group object with the updated point-in-time feature.

Return type:

FeatureGroup

create_point_in_time_group(feature_group_id, group_name, window_key, aggregation_keys, history_table_name=None, history_window_key=None, history_aggregation_keys=None, lookback_window=None, lookback_window_lag=0, lookback_count=None, lookback_until_position=0)

Create a Point-in-Time Group

Parameters:
  • feature_group_id (str) – The unique ID associated with the feature group to add the point in time group to.

  • group_name (str) – The name of the point in time group.

  • window_key (str) – Name of feature to use for ordering the rows on the source table.

  • aggregation_keys (list) – List of keys on the source table to use for the window aggregation.

  • history_table_name (str) – The table to use for aggregating; if not provided, the source table will be used.

  • history_window_key (str) – Name of feature to use for ordering the rows on the history table. If not provided, the windowKey from the source table will be used.

  • history_aggregation_keys (list) – List of keys to use for joining the historical table and performing the window aggregation. If not provided, the aggregationKeys from the source table will be used. Must be the same length and order as the source table’s aggregationKeys.

  • lookback_window (float) – Number of seconds in the past from the current time for the start of the window. If 0, the lookback will include all rows.

  • lookback_window_lag (float) – Optional lag to offset the closest point for the window. If it is positive, the start of the window is delayed. If it is negative, “future” rows in the history table are used.

  • lookback_count (int) – If window is specified in terms of count, the start position of the window (0 is the current row).

  • lookback_until_position (int) – Optional lag to offset the closest point for the window. If it is positive, the start of the window is delayed by that many rows. If it is negative, those many “future” rows in the history table are used.

Returns:

The feature group after the point in time group has been created.

Return type:

FeatureGroup

generate_point_in_time_features(feature_group_id, group_name, columns, window_functions, prefix=None)

Generates and adds PIT features given the selected columns to aggregate over, and the operations to include.

Parameters:
  • feature_group_id (str) – Unique string identifier associated with the feature group.

  • group_name (str) – Name of the point-in-time group.

  • columns (list) – List of columns to generate point-in-time features for.

  • window_functions (list) – List of window functions to operate on.

  • prefix (str) – Prefix for generated features; defaults to the group name.

Returns:

Feature group object with newly added point-in-time features.

Return type:

FeatureGroup
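
A combined sketch of the two calls above: define a point-in-time group, then generate aggregate features over it. The window function names and column names are assumptions for illustration.

    from abacusai import ApiClient

    client = ApiClient('YOUR_API_KEY')

    # Define a 7-day window keyed by user_id ...
    fg = client.create_point_in_time_group(
        feature_group_id='EVENTS_FEATURE_GROUP_ID',
        group_name='user_history_7d',
        window_key='event_time',
        aggregation_keys=['user_id'],
        lookback_window=7 * 24 * 3600,
    )
    # ... then generate SUM/AVG features over selected columns in that window.
    fg = client.generate_point_in_time_features(
        feature_group_id='EVENTS_FEATURE_GROUP_ID',
        group_name='user_history_7d',
        columns=['purchase_amount'],
        window_functions=['SUM', 'AVG'],  # assumed function names
        prefix='u7d_',
    )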

update_point_in_time_group(feature_group_id, group_name, window_key=None, aggregation_keys=None, history_table_name=None, history_window_key=None, history_aggregation_keys=None, lookback_window=None, lookback_window_lag=None, lookback_count=None, lookback_until_position=None)

Update Point-in-Time Group

Parameters:
  • feature_group_id (str) – The unique ID associated with the feature group.

  • group_name (str) – The name of the point-in-time group.

  • window_key (str) – Name of feature which contains the timestamp value for the point-in-time feature.

  • aggregation_keys (list) – List of keys to use for joining the historical table and performing the window aggregation.

  • history_table_name (str) – The table to use for aggregating; if not provided, the source table will be used.

  • history_window_key (str) – Name of feature to use for ordering the rows on the history table. If not provided, the windowKey from the source table will be used.

  • history_aggregation_keys (list) – List of keys to use for joining the historical table and performing the window aggregation. If not provided, the aggregationKeys from the source table will be used. Must be the same length and order as the source table’s aggregationKeys.

  • lookback_window (float) – Number of seconds in the past from the current time for the start of the window.

  • lookback_window_lag (float) – Optional lag to offset the closest point for the window. If it is positive, the start of the window is delayed. If it is negative, future rows in the history table are looked at.

  • lookback_count (int) – If window is specified in terms of count, the start position of the window (0 is the current row).

  • lookback_until_position (int) – Optional lag to offset the closest point for the window. If it is positive, the start of the window is delayed by that many rows. If it is negative, those many future rows in the history table are looked at.

Returns:

The feature group after the update has been applied.

Return type:

FeatureGroup

delete_point_in_time_group(feature_group_id, group_name)

Delete point in time group

Parameters:
  • feature_group_id (str) – The unique identifier associated with the feature group.

  • group_name (str) – The name of the point in time group.

Returns:

The feature group after the point in time group has been deleted.

Return type:

FeatureGroup

create_point_in_time_group_feature(feature_group_id, group_name, name, expression)

Create point in time group feature

Parameters:
  • feature_group_id (str) – A unique string identifier associated with the feature group.

  • group_name (str) – The name of the point-in-time group.

  • name (str) – The name of the feature to add to the point-in-time group.

  • expression (str) – A SQL aggregate expression which can convert a sequence of rows into a scalar value.

Returns:

The feature group after the update has been applied.

Return type:

FeatureGroup

update_point_in_time_group_feature(feature_group_id, group_name, name, expression)

Update a feature’s SQL expression in a point in time group

Parameters:
  • feature_group_id (str) – The unique ID associated with the feature group.

  • group_name (str) – The name of the point-in-time group.

  • name (str) – The name of the feature to add to the point-in-time group.

  • expression (str) – SQL aggregate expression which can convert a sequence of rows into a scalar value.

Returns:

The feature group after the update has been applied.

Return type:

FeatureGroup

set_feature_type(feature_group_id, feature, feature_type, project_id=None)

Set the type of a feature in a feature group. Specify the feature group ID, feature name, and feature type, and the method will return the new column with the changes reflected.

Parameters:
  • feature_group_id (str) – The unique ID associated with the feature group.

  • feature (str) – The name of the feature.

  • feature_type (str) – The machine learning type of the data in the feature.

  • project_id (str) – Optional unique ID associated with the project.

Returns:

The feature group schema after the feature type is applied.

Return type:

Schema

concatenate_feature_group_data(feature_group_id, source_feature_group_id, merge_type='UNION', replace_until_timestamp=None, skip_materialize=False)

Concatenates data from one Feature Group to another. Feature Groups can be merged if their schemas are compatible, they have the special updateTimestampKey column, and (if set) the primaryKey column. The second operand in the concatenate operation will be appended to the first operand (merge target).

Parameters:
  • feature_group_id (str) – The destination Feature Group.

  • source_feature_group_id (str) – The Feature Group to concatenate with the destination Feature Group.

  • merge_type (str) – UNION or INTERSECTION.

  • replace_until_timestamp (int) – The UNIX timestamp to specify the point until which we will replace data from the source Feature Group.

  • skip_materialize (bool) – If True, will not materialize the concatenated Feature Group.

remove_concatenation_config(feature_group_id)

Removes the concatenation config on a destination feature group.

Parameters:

feature_group_id (str) – Unique identifier of the destination feature group to remove the concatenation configuration from.

set_feature_group_indexing_config(feature_group_id, primary_key=None, update_timestamp_key=None, lookup_keys=None)

Sets various attributes of the feature group used for primary key, deployment lookups and streaming updates.

Parameters:
  • feature_group_id (str) – Unique string identifier for the feature group.

  • primary_key (str) – Name of the feature which defines the primary key of the feature group.

  • update_timestamp_key (str) – Name of the feature which defines the update timestamp of the feature group. Used in concatenation and primary key deduplication.

  • lookup_keys (list) – List of feature names which can be used in the lookup API to restrict the computation to a set of dataset rows. These feature names have to correspond to underlying dataset columns.
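
A brief sketch of configuring indexing for deployment lookups and streaming updates; the key names are placeholders for columns in your dataset.

    from abacusai import ApiClient

    client = ApiClient('YOUR_API_KEY')

    client.set_feature_group_indexing_config(
        feature_group_id='FEATURE_GROUP_ID',
        primary_key='user_id',
        update_timestamp_key='updated_at',
        lookup_keys=['email'],
    )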

execute_async_feature_group_operation(query=None, fix_query_on_error=False, use_latest_version=True)

Starts the execution of a feature group operation.

Parameters:
  • query (str) – The SQL to be executed.

  • fix_query_on_error (bool) – If enabled, the SQL query is automatically fixed if parsing fails.

  • use_latest_version (bool) – If enabled, executes the query on the latest version of the feature group; if no version exists, a FailedDependencyError is raised. If disabled, the query is executed against the latest feature group state, irrespective of the latest version of the feature group.

Returns:

An object that contains the execution status

Return type:

ExecuteFeatureGroupOperation

describe_async_feature_group_operation(feature_group_operation_run_id)

Gets the status of a feature group operation execution.

Parameters:

feature_group_operation_run_id (str) – The unique ID associated with the execution.

Returns:

An object that contains the execution status

Return type:

ExecuteFeatureGroupOperation
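
A sketch tying the two calls together: start an asynchronous query, then poll its status. The attribute names read off the returned objects are assumptions about the ExecuteFeatureGroupOperation shape.

    from abacusai import ApiClient

    client = ApiClient('YOUR_API_KEY')

    run = client.execute_async_feature_group_operation(
        query='SELECT user_id, COUNT(1) AS n FROM users_fg GROUP BY user_id',
    )
    # Poll the run for completion; the attribute names are illustrative.
    status = client.describe_async_feature_group_operation(
        feature_group_operation_run_id=run.feature_group_operation_run_id,
    )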

update_feature_group(feature_group_id, description=None)

Modify an existing Feature Group.

Parameters:
  • feature_group_id (str) – Unique identifier associated with the Feature Group.

  • description (str) – Description of the Feature Group.

Returns:

Updated Feature Group object.

Return type:

FeatureGroup

detach_feature_group_from_template(feature_group_id)

Update a feature group to detach it from a template.

Parameters:

feature_group_id (str) – Unique string identifier associated with the feature group.

Returns:

The updated feature group.

Return type:

FeatureGroup

update_feature_group_template_bindings(feature_group_id, template_bindings=None)

Update the feature group template bindings for a template feature group.

Parameters:
  • feature_group_id (str) – Unique string identifier associated with the feature group.

  • template_bindings (list) – Values in these bindings override values set in the template.

Returns:

Updated feature group.

Return type:

FeatureGroup

update_feature_group_python_function_bindings(feature_group_id, python_function_bindings)

Updates an existing Feature Group’s Python function bindings from a user-provided Python Function. If a list of feature groups is supplied within the Python function bindings, we will provide DataFrames (Pandas in the case of Python) with the materialized feature groups for those input feature groups as arguments to the function.

Parameters:
  • feature_group_id (str) – The unique ID associated with the feature group.

  • python_function_bindings (List) – List of python function arguments.

update_feature_group_python_function(feature_group_id, python_function_name, python_function_bindings=None, cpu_size=None, memory=None, use_gpu=None, use_original_csv_names=None)

Updates an existing Feature Group’s Python function from a user-provided Python Function. If a list of feature groups is supplied within the Python function bindings, we will provide DataFrames (Pandas in the case of Python) with the materialized feature groups for those input feature groups as arguments to the function.

Parameters:
  • feature_group_id (str) – The unique ID associated with the feature group.

  • python_function_name (str) – The name of the python function to be associated with the feature group.

  • python_function_bindings (List) – List of python function arguments.

  • cpu_size (CPUSize) – Size of the CPU for the feature group python function.

  • memory (MemorySize) – Memory (in GB) for the feature group python function.

  • use_gpu (bool) – Whether the feature group needs a GPU; otherwise defaults to CPU.

  • use_original_csv_names (bool) – If enabled, it uses the original column names for input feature groups from CSV datasets.

update_feature_group_sql_definition(feature_group_id, sql)

Updates the SQL statement for a feature group.

Parameters:
  • feature_group_id (str) – The unique identifier associated with the feature group.

  • sql (str) – The input SQL statement for the feature group.

Returns:

The updated feature group.

Return type:

FeatureGroup

update_dataset_feature_group_feature_expression(feature_group_id, feature_expression)

Updates the SQL feature expression for a Dataset FeatureGroup’s custom features

Parameters:
  • feature_group_id (str) – The unique identifier associated with the feature group.

  • feature_expression (str) – The input SQL statement for the feature group.

Returns:

The updated feature group.

Return type:

FeatureGroup

update_feature(feature_group_id, name, select_expression=None, new_name=None)

Modifies an existing feature in a feature group.

Parameters:
  • feature_group_id (str) – Unique identifier of the feature group.

  • name (str) – Name of the feature to be updated.

  • select_expression (str) – SQL statement for modifying the feature.

  • new_name (str) – New name of the feature.

Returns:

Updated feature group object.

Return type:

FeatureGroup

export_feature_group_version_to_file_connector(feature_group_version, location, export_file_format, overwrite=False)

Export Feature group to File Connector.

Parameters:
  • feature_group_version (str) – Unique string identifier for the feature group instance to export.

  • location (str) – Cloud file location to export to.

  • export_file_format (str) – Enum string specifying the file format to export to.

  • overwrite (bool) – If true and a file exists at this location, this process will overwrite the file.

Returns:

The FeatureGroupExport instance.

Return type:

FeatureGroupExport
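
For example, a hedged sketch of exporting a materialized version to cloud storage; the location and the 'CSV' format string are assumptions about supported values.

    from abacusai import ApiClient

    client = ApiClient('YOUR_API_KEY')

    export = client.export_feature_group_version_to_file_connector(
        feature_group_version='FEATURE_GROUP_VERSION_ID',
        location='s3://my-bucket/exports/users/',
        export_file_format='CSV',  # assumed format string
        overwrite=True,
    )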

export_feature_group_version_to_database_connector(feature_group_version, database_connector_id, object_name, write_mode, database_feature_mapping, id_column=None, additional_id_columns=None)

Export Feature group to Database Connector.

Parameters:
  • feature_group_version (str) – Unique string identifier for the Feature Group instance to export.

  • database_connector_id (str) – Unique string identifier for the Database Connector to export to.

  • object_name (str) – Name of the database object to write to.

  • write_mode (str) – Enum string indicating whether to use INSERT or UPSERT.

  • database_feature_mapping (dict) – Key/value pair JSON object of “database connector column” -> “feature name” pairs.

  • id_column (str) – Required if write_mode is UPSERT. Indicates which database column should be used as the lookup key.

  • additional_id_columns (list) – For database connectors which support it, additional ID columns to use as a complex key for upserting.

Returns:

The FeatureGroupExport instance.

Return type:

FeatureGroupExport

export_feature_group_version_to_console(feature_group_version, export_file_format)

Export Feature group to console.

Parameters:
  • feature_group_version (str) – Unique string identifier of the Feature Group instance to export.

  • export_file_format (str) – File format to export to.

Returns:

The FeatureGroupExport instance.

Return type:

FeatureGroupExport

set_feature_group_modifier_lock(feature_group_id, locked=True)

Lock a feature group to prevent modification.

Parameters:
  • feature_group_id (str) – The unique ID associated with the feature group.

  • locked (bool) – Whether to disable or enable feature group modification (True or False).

add_user_to_feature_group_modifiers(feature_group_id, email)

Adds a user to a feature group.

Parameters:
  • feature_group_id (str) – The unique ID associated with the feature group.

  • email (str) – The email address of the user to be added.

add_organization_group_to_feature_group_modifiers(feature_group_id, organization_group_id)

Add OrganizationGroup to a feature group modifiers list

Parameters:
  • feature_group_id (str) – Unique string identifier of the feature group.

  • organization_group_id (str) – Unique string identifier of the organization group.

remove_user_from_feature_group_modifiers(feature_group_id, email)

Removes a user from a specified feature group.

Parameters:
  • feature_group_id (str) – The unique ID associated with the feature group.

  • email (str) – The email address of the user to be removed.

remove_organization_group_from_feature_group_modifiers(feature_group_id, organization_group_id)

Removes an OrganizationGroup from a feature group modifiers list

Parameters:
  • feature_group_id (str) – The unique ID associated with the feature group.

  • organization_group_id (str) – The unique ID associated with the organization group.

delete_feature(feature_group_id, name)

Removes a feature from the feature group.

Parameters:
  • feature_group_id (str) – Unique string identifier associated with the feature group.

  • name (str) – Name of the feature to be deleted.

Returns:

Updated feature group object.

Return type:

FeatureGroup

delete_feature_group(feature_group_id)

Deletes a Feature Group.

Parameters:

feature_group_id (str) – Unique string identifier for the feature group to be removed.

delete_feature_group_version(feature_group_version)

Deletes a Feature Group Version.

Parameters:

feature_group_version (str) – String identifier for the feature group version to be removed.

create_feature_group_version(feature_group_id, variable_bindings=None)

Creates a snapshot for a specified feature group. Triggers materialization of the feature group. The new version of the feature group is created after it has materialized.

Parameters:
  • feature_group_id (str) – Unique string identifier associated with the feature group.

  • variable_bindings (dict) – Dictionary defining variable bindings that override parent feature group values.

Returns:

A feature group version.

Return type:

FeatureGroupVersion

set_feature_group_export_connector_config(feature_group_id, feature_group_export_config=None)

Sets the feature group export config for the given feature group.

Parameters:
  • feature_group_id (str) – The unique ID associated with the pre-existing Feature Group for which export config is to be set.

  • feature_group_export_config (FeatureGroupExportConfig) – The export config to be set for the given feature group.

set_export_on_materialization(feature_group_id, enable)

Enables or disables exporting feature group data to the export connector associated with the feature group.

Parameters:
  • feature_group_id (str) – The unique ID associated with the pre-existing Feature Group for which export config is to be set.

  • enable (bool) – If true, will enable exporting feature group to the connector. If false, will disable.

create_feature_group_template(feature_group_id, name, template_sql, template_variables, description=None, template_bindings=None, should_attach_feature_group_to_template=False)

Create a feature group template.

Parameters:
  • feature_group_id (str) – Unique identifier of the feature group this template was created from.

  • name (str) – User-friendly name for this feature group template.

  • template_sql (str) – The template SQL that will be resolved by applying values from the template variables to generate SQL for a feature group.

  • template_variables (list) – The template variables for resolving the template.

  • description (str) – Description of this feature group template.

  • template_bindings (list) – If the feature group will be attached to the newly created template, set these variable bindings on that feature group.

  • should_attach_feature_group_to_template (bool) – Set to True to convert the feature group to a template feature group and attach it to the newly created template.

Returns:

The created feature group template.

Return type:

FeatureGroupTemplate
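
A sketch of creating a template; the template SQL placeholder syntax and the template variable dictionary shape shown here are assumptions for illustration, not documented formats.

    from abacusai import ApiClient

    client = ApiClient('YOUR_API_KEY')

    template = client.create_feature_group_template(
        feature_group_id='FEATURE_GROUP_ID',
        name='recent_events_template',
        template_sql='SELECT * FROM {source_table} WHERE event_date >= {start_date}',
        template_variables=[
            {'name': 'source_table', 'value': 'events'},       # shape is an assumption
            {'name': 'start_date', 'value': "'2024-01-01'"},
        ],
        should_attach_feature_group_to_template=True,
    )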

delete_feature_group_template(feature_group_template_id)

Delete an existing feature group template.

Parameters:

feature_group_template_id (str) – Unique string identifier associated with the feature group template.

update_feature_group_template(feature_group_template_id, template_sql=None, template_variables=None, description=None, name=None)

Update a feature group template.

Parameters:
  • feature_group_template_id (str) – Unique identifier of the feature group template to update.

  • template_sql (str) – If provided, the new value to use for the template SQL.

  • template_variables (list) – If provided, the new value to use for the template variables.

  • description (str) – Description of this feature group template.

  • name (str) – User-friendly name for this feature group template.

Returns:

The updated feature group template.

Return type:

FeatureGroupTemplate

preview_feature_group_template_resolution(feature_group_template_id=None, template_bindings=None, template_sql=None, template_variables=None, should_validate=True)

Resolve template sql using template variables and template bindings.

Parameters:
  • feature_group_template_id (str) – Unique string identifier. If specified, use this template, otherwise assume an empty template.

  • template_bindings (list) – Values to override the template variable values specified by the template.

  • template_sql (str) – If specified, use this as the template SQL instead of the feature group template’s SQL.

  • template_variables (list) – Template variables to use. If a template is provided, this overrides the template’s template variables.

  • should_validate (bool) – If true, validates the resolved SQL.

Returns:

The resolved template

Return type:

ResolvedFeatureGroupTemplate

cancel_upload(upload_id)

Cancels an upload.

Parameters:

upload_id (str) – A unique string identifier for the upload.

upload_part(upload_id, part_number, part_data)

Uploads part of a large dataset file from your bucket to our system. Our system currently supports parts of up to 5GB and full files of up to 5TB. Note that each part must be at least 5MB in size, unless it is the last part in the sequence of parts for the full file.

Parameters:
  • upload_id (str) – A unique identifier for this upload.

  • part_number (int) – The 1-indexed number denoting the position of the file part in the sequence of parts for the full file.

  • part_data (io.TextIOBase) – The multipart/form-data for the current part of the full file.

Returns:

The object ‘UploadPart’ which encapsulates the hash and the etag for the part that got uploaded.

Return type:

UploadPart

mark_upload_complete(upload_id)

Marks an upload process as complete.

Parameters:

upload_id (str) – A unique string identifier for the upload process.

Returns:

The upload object associated with the process, containing details of the file.

Return type:

Upload

create_dataset_from_file_connector(table_name, location, file_format=None, refresh_schedule=None, csv_delimiter=None, filename_column=None, start_prefix=None, until_prefix=None, sql_query=None, location_date_format=None, date_format_lookback_days=None, incremental=False, is_documentset=False, extract_bounding_boxes=False, document_processing_config=None, merge_file_schemas=False, reference_only_documentset=False, parsing_config=None, version_limit=30)

Creates a dataset from a file located in cloud storage, such as Amazon S3, using the specified dataset name and location.

Parameters:
  • table_name (str) – Organization-unique table name or the name of the feature group table to create using the source table.

  • location (str) – The URI location of the dataset source. When location_date_format is specified, the URI must match it; for example, s3://bucket1/dir1/dir2/event_date=YYYY-MM-DD/*. When both start_prefix and until_prefix are specified, the URI must include both; for example, s3://bucket1/dir1/* includes both s3://bucket1/dir1/dir2/event_date=2021-08-02/* and s3://bucket1/dir1/dir2/event_date=2021-08-08/*.

  • file_format (str) – The file format of the dataset.

  • refresh_schedule (str) – The Cron time string format that describes a schedule to retrieve the latest version of the imported dataset. The time is specified in UTC.

  • csv_delimiter (str) – If the file format is CSV, use a specific csv delimiter.

  • filename_column (str) – Adds a new column to the dataset with the external URI path.

  • start_prefix (str) – The start prefix (inclusive) for a range based search on a cloud storage location URI.

  • until_prefix (str) – The end prefix (exclusive) for a range based search on a cloud storage location URI.

  • sql_query (str) – The SQL query to use when fetching data from the specified location. Use __TABLE__ as a placeholder for the table name. For example: “SELECT * FROM __TABLE__ WHERE event_date > ‘2021-01-01’”. If not provided, the entire dataset from the specified location will be imported.

  • location_date_format (str) – The date format in which the data is partitioned in the cloud storage location. For example, if the data is partitioned as s3://bucket1/dir1/dir2/event_date=YYYY-MM-DD/dir4/filename.parquet, then the location_date_format is YYYY-MM-DD. This format needs to be consistent across all files within the specified location.

  • date_format_lookback_days (int) – The number of days to look back from the current day for import locations that are date partitioned. For example, import date 2021-06-04 with date_format_lookback_days = 3 will retrieve data for all the dates in the range [2021-06-02, 2021-06-04].

  • incremental (bool) – Signifies if the dataset is an incremental dataset.

  • is_documentset (bool) – Signifies if the dataset is a docstore dataset. A docstore dataset contains documents such as images, PDFs, or audio files, or is tabular data with links to such files.

  • extract_bounding_boxes (bool) – Signifies whether to extract bounding boxes out of the documents. Only valid if is_documentset is True.

  • document_processing_config (DatasetDocumentProcessingConfig) – The document processing configuration. Only valid if is_documentset is True.

  • merge_file_schemas (bool) – Signifies if the merge file schema policy is enabled. If is_documentset is True, this is also set to True by default.

  • reference_only_documentset (bool) – Signifies if the data reference only policy is enabled.

  • parsing_config (ParsingConfig) – Custom config for dataset parsing.

  • version_limit (int) – The number of recent versions to preserve for the dataset (minimum 30).

Returns:

The dataset created.

Return type:

Dataset
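
A minimal sketch of importing from cloud storage on a daily refresh; the bucket path, format string, and cron expression are placeholders.

    from abacusai import ApiClient

    client = ApiClient('YOUR_API_KEY')

    dataset = client.create_dataset_from_file_connector(
        table_name='raw_events',
        location='s3://my-bucket/events/*',
        file_format='PARQUET',          # assumed format string
        refresh_schedule='0 6 * * *',   # daily at 06:00 UTC
    )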

create_dataset_version_from_file_connector(dataset_id, location=None, file_format=None, csv_delimiter=None, merge_file_schemas=None, parsing_config=None, sql_query=None)

Creates a new version of the specified dataset.

Parameters:
  • dataset_id (str) – Unique string identifier associated with the dataset.

  • location (str) – External URI to import the dataset from. If not specified, the last location will be used.

  • file_format (str) – File format to be used. If not specified, the service will try to detect the file format.

  • csv_delimiter (str) – If the file format is CSV, use a specific CSV delimiter.

  • merge_file_schemas (bool) – Signifies if the merge file schema policy is enabled.

  • parsing_config (ParsingConfig) – Custom config for dataset parsing.

  • sql_query (str) – The SQL query to use when fetching data from the specified location. Use __TABLE__ as a placeholder for the table name. For example: “SELECT * FROM __TABLE__ WHERE event_date > ‘2021-01-01’”. If not provided, the entire dataset from the specified location will be imported.

Returns:

The new Dataset Version created.

Return type:

DatasetVersion

create_dataset_from_database_connector(table_name, database_connector_id, object_name=None, columns=None, query_arguments=None, refresh_schedule=None, sql_query=None, incremental=False, attachment_parsing_config=None, incremental_database_connector_config=None, document_processing_config=None, version_limit=30)

Creates a dataset from a Database Connector.

Parameters:
  • table_name (str) – Organization-unique table name.

  • database_connector_id (str) – Unique String Identifier of the Database Connector to import the dataset from.

  • object_name (str) – If applicable, the name/ID of the object in the service to query.

  • columns (str) – The columns to query from the external service object.

  • query_arguments (str) – Additional query arguments to filter the data.

  • refresh_schedule (str) – The Cron time string format that describes a schedule to retrieve the latest version of the imported dataset. The time is specified in UTC.

  • sql_query (str) – The full SQL query to use when fetching data. If present, this parameter will override object_name, columns, timestamp_column, and query_arguments.

  • incremental (bool) – Signifies if the dataset is an incremental dataset.

  • attachment_parsing_config (AttachmentParsingConfig) – The attachment parsing configuration. Only valid when attachments are being imported; either a feature group name and column name are provided, or a list of URLs to import (e.g., importing attachments via Salesforce).

  • incremental_database_connector_config (IncrementalDatabaseConnectorConfig) – The config for incremental datasets. Only valid if incremental is True.

  • document_processing_config (DatasetDocumentProcessingConfig) – The document processing configuration. Only valid when documents are being imported (e.g. importing KnowledgeArticleDescriptions via Salesforce).

  • version_limit (int) – The number of recent versions to preserve for the dataset (minimum 30).

Returns:

The created dataset.

Return type:

Dataset

create_dataset_from_application_connector(table_name, application_connector_id, dataset_config=None, refresh_schedule=None, version_limit=30)

Creates a dataset from an Application Connector.

Parameters:
  • table_name (str) – Organization-unique table name.

  • application_connector_id (str) – Unique string identifier of the application connector to download data from.

  • dataset_config (ApplicationConnectorDatasetConfig) – Dataset config for the application connector.

  • refresh_schedule (str) – Cron time string format that describes a schedule to retrieve the latest version of the imported dataset. The time is specified in UTC.

  • version_limit (int) – The number of recent versions to preserve for the dataset (minimum 30).

Returns:

The created dataset.

Return type:

Dataset

create_dataset_version_from_database_connector(dataset_id, object_name=None, columns=None, query_arguments=None, sql_query=None)

Creates a new version of the specified dataset.

Parameters:
  • dataset_id (str) – The unique ID associated with the dataset.

  • object_name (str) – The name/ID of the object in the service to query. If not specified, the last name will be used.

  • columns (str) – The columns to query from the external service object. If not specified, the last columns will be used.

  • query_arguments (str) – Additional query arguments to filter the data. If not specified, the last arguments will be used.

  • sql_query (str) – The full SQL query to use when fetching data. If present, this parameter will override object_name, columns, and query_arguments.

Returns:

The new Dataset Version created.

Return type:

DatasetVersion

create_dataset_version_from_application_connector(dataset_id, dataset_config=None)

Creates a new version of the specified dataset.

Parameters:
  • dataset_id (str) – The unique ID associated with the dataset.

  • dataset_config (ApplicationConnectorDatasetConfig) – Dataset config for the application connector. If any of the fields are not specified, the last values will be used.

Returns:

The new Dataset Version created.

Return type:

DatasetVersion

create_dataset_from_upload(table_name, file_format=None, csv_delimiter=None, is_documentset=False, extract_bounding_boxes=False, parsing_config=None, merge_file_schemas=False, document_processing_config=None, version_limit=30)

Creates a dataset and returns an upload ID that can be used to upload a file.

Parameters:
  • table_name (str) – Organization-unique table name for this dataset.

  • file_format (str) – The file format of the dataset.

  • csv_delimiter (str) – If the file format is CSV, use a specific CSV delimiter.

  • is_documentset (bool) – Signifies if the dataset is a docstore dataset. A docstore dataset contains documents such as images, PDFs, or audio files, or is tabular data with links to such files.

  • extract_bounding_boxes (bool) – Signifies whether to extract bounding boxes out of the documents. Only valid if is_documentset is True.

  • parsing_config (ParsingConfig) – Custom config for dataset parsing.

  • merge_file_schemas (bool) – Signifies whether to merge the schemas of all files in the dataset. If is_documentset is True, this is also set to True by default.

  • document_processing_config (DatasetDocumentProcessingConfig) – The document processing configuration. Only valid if is_documentset is True.

  • version_limit (int) – The number of recent versions to preserve for the dataset (minimum 30).

Returns:

A reference to be used when uploading file parts.

Return type:

Upload
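
Putting the upload flow together (createDatasetFromUpload, uploadPart, markUploadComplete), a sketch that uploads one local CSV as a single part; the upload_id attribute on the returned Upload object and the binary open mode are assumptions.

    from abacusai import ApiClient

    client = ApiClient('YOUR_API_KEY')

    upload = client.create_dataset_from_upload(
        table_name='uploaded_events',
        file_format='CSV',
    )
    # A single part is also the last part, so the 5MB minimum does not apply.
    with open('events.csv', 'rb') as f:
        client.upload_part(upload_id=upload.upload_id, part_number=1, part_data=f)
    dataset_upload = client.mark_upload_complete(upload_id=upload.upload_id)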

create_dataset_version_from_upload(dataset_id, file_format=None)

Creates a new version of the specified dataset using a local file upload.

Parameters:
  • dataset_id (str) – Unique string identifier associated with the dataset.

  • file_format (str) – File format to be used. If not specified, the service will attempt to detect the file format.

Returns:

Token to be used when uploading file parts.

Return type:

Upload

create_dataset_version_from_document_reprocessing(dataset_id, document_processing_config=None)

Creates a new dataset version for a source docstore dataset with the provided document processing configuration. This does not re-import the data but uses the same data which is imported in the latest dataset version and only performs document processing on it.

Parameters:
  • dataset_id (str) – The unique ID associated with the dataset to use as the source dataset.

  • document_processing_config (DatasetDocumentProcessingConfig) – The document processing configuration to use for the new dataset version. If not specified, the document processing configuration from the source dataset will be used.

Returns:

The new dataset version created.

Return type:

DatasetVersion

create_streaming_dataset(table_name, primary_key=None, update_timestamp_key=None, lookup_keys=None, version_limit=30)

Creates a streaming dataset. Use a streaming dataset if your dataset is receiving information from multiple sources over an extended period of time.

Parameters:
  • table_name (str) – The feature group table name to create for this dataset.

  • primary_key (str) – The optional primary key column name for the dataset.

  • update_timestamp_key (str) – Name of the feature which defines the update timestamp of the feature group. Used in concatenation and primary key deduplication. Only relevant if lookup keys are set.

  • lookup_keys (list) – List of feature names which can be used in the lookup API to restrict the computation to a set of dataset rows. These feature names have to correspond to underlying dataset columns.

  • version_limit (int) – The number of recent versions to preserve for the dataset (minimum 30).

Returns:

The streaming dataset created.

Return type:

Dataset

create_realtime_content_store(table_name, application_connector_id, dataset_config=None)

Creates a real-time content store dataset.

Parameters:
  • table_name (str) – Organization-unique table name.

  • application_connector_id (str) – Unique string identifier of the application connector to download data from.

  • dataset_config (ApplicationConnectorDatasetConfig) – Dataset config for the application connector.

Returns:

The created dataset.

Return type:

Dataset

snapshot_streaming_data(dataset_id)

Snapshots the current data in the streaming dataset.

Parameters:

dataset_id (str) – The unique ID associated with the dataset.

Returns:

The new Dataset Version created by taking a snapshot of the current data in the streaming dataset.

Return type:

DatasetVersion

set_dataset_column_data_type(dataset_id, column, data_type)

Set a Dataset’s column type.

Parameters:
  • dataset_id (str) – The unique ID associated with the dataset.

  • column (str) – The name of the column.

  • data_type (DataType) – The type of the data in the column. Note: Some ColumnMappings may restrict the options or explicitly set the DataType.

Returns:

The dataset and schema after the data type has been set.

Return type:

Dataset

create_dataset_from_streaming_connector(table_name, streaming_connector_id, dataset_config=None, refresh_schedule=None, version_limit=30)

Creates a dataset from a Streaming Connector

Parameters:
  • table_name (str) – Organization-unique table name

  • streaming_connector_id (str) – Unique String Identifier for the Streaming Connector to import the dataset from

  • dataset_config (StreamingConnectorDatasetConfig) – Streaming dataset config

  • refresh_schedule (str) – Cron time string format that describes a schedule to retrieve the latest version of the imported dataset. Time is specified in UTC.

  • version_limit (int) – The number of recent versions to preserve for the dataset (minimum 30).

Returns:

The created dataset.

Return type:

Dataset

set_streaming_retention_policy(dataset_id, retention_hours=None, retention_row_count=None, ignore_records_before_timestamp=None)

Sets the streaming retention policy.

Parameters:
  • dataset_id (str) – Unique string identifier for the streaming dataset.

  • retention_hours (int) – Number of hours to retain streamed data in memory.

  • retention_row_count (int) – Number of rows to retain streamed data in memory.

  • ignore_records_before_timestamp (int) – The Unix timestamp (in seconds) to use as a cutoff to ignore all entries sent before it.

rename_database_connector(database_connector_id, name)

Renames a Database Connector

Parameters:
  • database_connector_id (str) – The unique identifier for the database connector.

  • name (str) – The new name for the Database Connector.

rename_application_connector(application_connector_id, name)

Renames an Application Connector

Parameters:
  • application_connector_id (str) – The unique identifier for the application connector.

  • name (str) – A new name for the application connector.

verify_database_connector(database_connector_id)

Checks if Abacus.AI can access the specified database.

Parameters:

database_connector_id (str) – Unique string identifier for the database connector.

verify_file_connector(bucket)

Checks to see if Abacus.AI can access the given bucket.

Parameters:

bucket (str) – The bucket to test.

Returns:

The result of the verification.

Return type:

FileConnectorVerification

delete_database_connector(database_connector_id)

Delete a database connector.

Parameters:

database_connector_id (str) – The unique identifier for the database connector.

delete_application_connector(application_connector_id)

Delete an application connector.

Parameters:

application_connector_id (str) – The unique identifier for the application connector.

delete_file_connector(bucket)

Deletes a file connector

Parameters:

bucket (str) – The fully qualified URI of the bucket to remove.

verify_application_connector(application_connector_id)

Checks if Abacus.AI can access the application using the provided application connector ID.

Parameters:

application_connector_id (str) – Unique string identifier for the application connector.

set_azure_blob_connection_string(bucket, connection_string)

Authenticates the specified Azure Blob Storage bucket using an authenticated Connection String.

Parameters:
  • bucket (str) – The fully qualified Azure Blob Storage Bucket URI.

  • connection_string (str) – The Connection String Abacus.AI should use to authenticate when accessing this bucket.

Returns:

An object with the roleArn and verification status for the specified bucket.

Return type:

FileConnectorVerification

verify_streaming_connector(streaming_connector_id)

Checks to see if Abacus.AI can access the streaming connector.

Parameters:

streaming_connector_id (str) – Unique string identifier for the streaming connector to be checked for Abacus.AI access.

rename_streaming_connector(streaming_connector_id, name)

Renames a Streaming Connector

Parameters:
  • streaming_connector_id (str) – The unique identifier for the streaming connector.

  • name (str) – A new name for the streaming connector.

delete_streaming_connector(streaming_connector_id)

Delete a streaming connector.

Parameters:

streaming_connector_id (str) – The unique identifier for the streaming connector.

create_streaming_token()

Creates a streaming token. Streaming tokens are used to authenticate requests when appending data to streaming datasets.

Returns:

The generated streaming token.

Return type:

StreamingAuthToken

delete_streaming_token(streaming_token)

Deletes the specified streaming token.

Parameters:

streaming_token (str) – The streaming token to delete.

delete_dataset(dataset_id)

Deletes the specified dataset from the organization.

Parameters:

dataset_id (str) – Unique string identifier of the dataset to delete.

delete_dataset_version(dataset_version)

Deletes the specified dataset version from the organization.

Parameters:

dataset_version (str) – String identifier of the dataset version to delete.

get_docstore_page_data(doc_id, page, document_processing_config=None, document_processing_version=None)

Returns the extracted page data for a document page.

Parameters:
  • doc_id (str) – A unique Docstore string identifier for the document.

  • page (int) – The page number to retrieve. Page numbers start from 0.

  • document_processing_config (DocumentProcessingConfig) – The document processing configuration to use for returning the data when the document is processed via EXTRACT_DOCUMENT_DATA Feature Group Operator. If Feature Group Operator is not used, this parameter should be kept as None. If Feature Group Operator is used but this parameter is not provided, the latest available data or the default configuration will be used.

  • document_processing_version (str) – The document processing version to use for returning the data when the document is processed via EXTRACT_DOCUMENT_DATA Feature Group Operator. If Feature Group Operator is not used, this parameter should be kept as None. If Feature Group Operator is used but this parameter is not provided, the latest version will be used.

Returns:

The extracted page data.

Return type:

PageData

get_docstore_document_data(doc_id, document_processing_config=None, document_processing_version=None, return_extracted_page_text=False)

Returns the extracted data for a document.

Parameters:
  • doc_id (str) – A unique Docstore string identifier for the document.

  • document_processing_config (DocumentProcessingConfig) – The document processing configuration to use for returning the data when the document is processed via EXTRACT_DOCUMENT_DATA Feature Group Operator. If Feature Group Operator is not used, this parameter should be kept as None. If Feature Group Operator is used but this parameter is not provided, the latest available data or the default configuration will be used.

  • document_processing_version (str) – The document processing version to use for returning the data when the document is processed via EXTRACT_DOCUMENT_DATA Feature Group Operator. If Feature Group Operator is not used, this parameter should be kept as None. If Feature Group Operator is used but this parameter is not provided, the latest version will be used.

  • return_extracted_page_text (bool) – Specifies whether to include a list of extracted text for each page in the response. Defaults to false if not provided.

Returns:

The extracted document data.

Return type:

DocumentData

extract_document_data(document=None, doc_id=None, document_processing_config=None, start_page=None, end_page=None, return_extracted_page_text=False)

Extracts data from a document.

Parameters:
  • document (io.TextIOBase) – The document to extract data from. One of document or doc_id must be provided.

  • doc_id (str) – A unique Docstore string identifier for the document. One of document or doc_id must be provided.

  • document_processing_config (DocumentProcessingConfig) – The document processing configuration.

  • start_page (int) – The starting page to extract data from. Pages are indexed starting from 0. If not provided, the first page will be used.

  • end_page (int) – The last page to extract data from. Pages are indexed starting from 0. If not provided, the last page will be used.

  • return_extracted_page_text (bool) – Specifies whether to include a list of extracted text for each page in the response. Defaults to false if not provided.

Returns:

The extracted document data.

Return type:

DocumentData
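
Example: a minimal sketch of extracting the first pages of a local document; the file path is a hypothetical placeholder and the client is assumed to be authenticated.

    from abacusai import ApiClient

    client = ApiClient('YOUR_API_KEY')  # hypothetical API key

    # Open the document in binary mode and extract pages 0 through 2.
    with open('invoice.pdf', 'rb') as f:  # hypothetical local file
        doc_data = client.extract_document_data(
            document=f,
            start_page=0,  # pages are indexed from 0
            end_page=2,
            return_extracted_page_text=True,
        )
    print(doc_data)  # DocumentData, including per-page extracted text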

get_training_config_options(project_id, feature_group_ids=None, for_retrain=False, current_training_config=None)

Retrieves the full initial description of the model training configuration options available for the specified project. The available configuration options are determined by the use case associated with the project. Refer to the Use Case Documentation for more information on use cases and use case-specific configuration options.

Parameters:
  • project_id (str) – The unique ID associated with the project.

  • feature_group_ids (List) – The feature group IDs to be used for training.

  • for_retrain (bool) – Whether the training config options are used for retraining.

  • current_training_config (TrainingConfig) – The current state of the training config, with some options set, which shall be used to get new options after refresh. This is None by default initially.

Returns:

An array of options that can be specified when training a model in this project.

Return type:

list[TrainingConfigOptions]

create_train_test_data_split_feature_group(project_id, training_config, feature_group_ids)

Get the train and test data split without training the model. Only supported for models with custom algorithms.

Parameters:
  • project_id (str) – The unique ID associated with the project.

  • training_config (TrainingConfig) – The training config used to influence how the split is calculated.

  • feature_group_ids (List) – List of feature group IDs provided by the user, including the one required for the data split and any others that influence how the split is calculated.

Returns:

The feature group containing the training data and folds information.

Return type:

FeatureGroup

train_model(project_id, name=None, training_config=None, feature_group_ids=None, refresh_schedule=None, custom_algorithms=None, custom_algorithms_only=False, custom_algorithm_configs=None, builtin_algorithms=None, cpu_size=None, memory=None, algorithm_training_configs=None)

Create a new model and start its training in the given project.

Parameters:
  • project_id (str) – The unique ID associated with the project.

  • name (str) – The name of the model. Defaults to “<Project Name> Model”.

  • training_config (TrainingConfig) – The training config used to train this model.

  • feature_group_ids (List) – List of feature group IDs provided by the user to train the model on.

  • refresh_schedule (str) – A cron-style string that describes a schedule in UTC to automatically retrain the created model.

  • custom_algorithms (list) – List of user-defined algorithms to train. If not set, the default enabled custom algorithms will be used.

  • custom_algorithms_only (bool) – Whether to only run custom algorithms.

  • custom_algorithm_configs (dict) – Configs for each user-defined algorithm; key is the algorithm name, value is the config serialized to JSON.

  • builtin_algorithms (list) – List of algorithm names or algorithm IDs of the builtin algorithms provided by Abacus.AI to train. If not set, all applicable builtin algorithms will be used.

  • cpu_size (str) – Size of the CPU for the user-defined algorithms during training.

  • memory (int) – Memory (in GB) for the user-defined algorithms during training.

  • algorithm_training_configs (list) – List of algorithm-specific training configs that will be part of the model training AutoML run.

Returns:

The new model which is being trained.

Return type:

Model
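
Example: a typical flow is to inspect the use-case-specific options and then start training. A minimal sketch, assuming an authenticated ApiClient; the project and feature group IDs, model name, and cron string are hypothetical.

    from abacusai import ApiClient

    client = ApiClient('YOUR_API_KEY')      # hypothetical API key
    project_id = 'proj_123'                 # hypothetical project ID
    feature_group_ids = ['fg_456']          # hypothetical feature group ID

    # Inspect the training options exposed for this project's use case.
    for option in client.get_training_config_options(project_id, feature_group_ids=feature_group_ids):
        print(option)

    # Start training with platform defaults; returns the Model being trained.
    model = client.train_model(
        project_id,
        name='Churn Model',
        feature_group_ids=feature_group_ids,
        refresh_schedule='0 6 * * 1',       # retrain Mondays at 06:00 UTC
    )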

create_model_from_python(project_id, function_source_code, train_function_name, training_input_tables, predict_function_name=None, predict_many_function_name=None, initialize_function_name=None, name=None, cpu_size=None, memory=None, training_config=None, exclusive_run=False, package_requirements=None, use_gpu=False, is_thread_safe=None)

Initializes a new Model from user-provided Python code. If a list of input feature groups is supplied, the materialized feature groups will be passed as arguments to the train and predict functions.

This method expects functionSourceCode to be a valid Python source file containing the functions named trainFunctionName and predictFunctionName. The train function returns the ModelVersion that results from training the model; the predict function has no well-defined return type, as it returns the prediction made by the model, which can be anything.

Parameters:
  • project_id (str) – The unique ID associated with the project.

  • function_source_code (str) – Contents of a valid Python source code file. The source code should contain the functions named trainFunctionName and predictFunctionName. A list of allowed import and system libraries for each language is specified in the user functions documentation section.

  • train_function_name (str) – Name of the function found in the source code that will be executed to train the model. It is not executed when this function is run.

  • training_input_tables (list) – List of feature groups that are supplied to the train function as parameters. Each of these parameters is a materialized DataFrame (the same type as the function’s return value).

  • predict_function_name (str) – Name of the function found in the source code that will be executed to run predictions through the model. It is not executed when this function is run.

  • predict_many_function_name (str) – Name of the function found in the source code that will be executed for batch prediction of the model. It is not executed when this function is run.

  • initialize_function_name (str) – Name of the function found in the source code that initializes the trained model before it is used to make predictions.

  • name (str) – The name you want your model to have. Defaults to “<Project Name> Model”.

  • cpu_size (str) – Size of the CPU for the model training function.

  • memory (int) – Memory (in GB) for the model training function.

  • training_config (TrainingConfig) – Training configuration.

  • exclusive_run (bool) – Decides if this model will be run exclusively or along with other Abacus.AI algorithms.

  • package_requirements (list) – List of package requirement strings. For example: [‘numpy==1.2.3’, ‘pandas>=1.4.0’].

  • use_gpu (bool) – Whether this model needs a GPU.

  • is_thread_safe (bool) – Whether this model is thread safe.

Returns:

The new model, which has not been trained.

Return type:

Model
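
Example: a minimal sketch of a custom Python model. The function bodies are illustrative only; the exact signatures expected for the train and predict functions depend on the configured inputs, and all identifiers are hypothetical.

    from abacusai import ApiClient

    client = ApiClient('YOUR_API_KEY')  # hypothetical API key

    source = '''
    def train(training_table):
        # training_table arrives as a materialized DataFrame
        return {'mean': training_table['target'].mean()}  # any picklable artifact

    def predict(model, query):
        return {'prediction': model['mean']}
    '''

    model = client.create_model_from_python(
        project_id='proj_123',                     # hypothetical project ID
        function_source_code=source,
        train_function_name='train',
        predict_function_name='predict',
        training_input_tables=['training_table'],  # hypothetical feature group table name
        package_requirements=['pandas>=1.4.0'],
    )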

rename_model(model_id, name)

Renames a model.

Parameters:
  • model_id (str) – Unique identifier of the model to rename.

  • name (str) – The new name to assign to the model.

update_python_model(model_id, function_source_code=None, train_function_name=None, predict_function_name=None, predict_many_function_name=None, initialize_function_name=None, training_input_tables=None, cpu_size=None, memory=None, package_requirements=None, use_gpu=None, is_thread_safe=None, training_config=None)

Updates an existing Python Model using user-provided Python code. If a list of input feature groups is supplied, the materialized feature groups will be passed as arguments to the train and predict functions.

This method expects functionSourceCode to be a valid Python source file containing the functions named trainFunctionName and predictFunctionName. The train function returns the ModelVersion that results from training the model; the predict function has no well-defined return type, as it returns the prediction made by the model, which can be anything.

Parameters:
  • model_id (str) – The unique ID associated with the Python model to be changed.

  • function_source_code (str) – Contents of a valid Python source code file. The source code should contain the functions named trainFunctionName and predictFunctionName. A list of allowed import and system libraries for each language is specified in the user functions documentation section.

  • train_function_name (str) – Name of the function found in the source code that will be executed to train the model. It is not executed when this function is run.

  • predict_function_name (str) – Name of the function found in the source code that will be executed to run predictions through the model. It is not executed when this function is run.

  • predict_many_function_name (str) – Name of the function found in the source code that will be executed to run batch predictions through the model. It is not executed when this function is run.

  • initialize_function_name (str) – Name of the function found in the source code that initializes the trained model before it is used to make predictions.

  • training_input_tables (list) – List of feature groups that are supplied to the train function as parameters. Each of these parameters is a materialized DataFrame (the same type as the function’s return value).

  • cpu_size (str) – Size of the CPU for the model training function.

  • memory (int) – Memory (in GB) for the model training function.

  • package_requirements (list) – List of package requirement strings. For example: [‘numpy==1.2.3’, ‘pandas>=1.4.0’].

  • use_gpu (bool) – Whether this model needs a GPU.

  • is_thread_safe (bool) – Whether this model is thread safe.

  • training_config (TrainingConfig) – The training config used to train this model.

Returns:

The updated model.

Return type:

Model

update_python_model_zip(model_id, train_function_name=None, predict_function_name=None, predict_many_function_name=None, train_module_name=None, predict_module_name=None, training_input_tables=None, cpu_size=None, memory=None, package_requirements=None, use_gpu=None)

Updates an existing Python Model using a provided zip file. If a list of input feature groups is supplied, the materialized feature groups will be passed as arguments to the train and predict functions.

This method expects trainModuleName and predictModuleName to be valid Python source files which contain the functions named trainFunctionName and predictFunctionName, respectively. The train function returns the ModelVersion that results from training the model; the predict function has no well-defined return type, as it returns the prediction made by the model, which can be anything.

Parameters:
  • model_id (str) – The unique ID associated with the Python model to be changed.

  • train_function_name (str) – Name of the function found in the train module that will be executed to train the model. It is not executed when this function is run.

  • predict_function_name (str) – Name of the function found in the predict module that will be executed to run predictions through the model. It is not executed when this function is run.

  • predict_many_function_name (str) – Name of the function found in the predict module that will be executed to run batch predictions through the model. It is not executed when this function is run.

  • train_module_name (str) – Full path of the module that contains the train function from the root of the zip.

  • predict_module_name (str) – Full path of the module that contains the predict function from the root of the zip.

  • training_input_tables (list) – List of feature groups that are supplied to the train function as parameters. Each of these parameters is a materialized DataFrame (the same type as the function’s return value).

  • cpu_size (str) – Size of the CPU for the model training function.

  • memory (int) – Memory (in GB) for the model training function.

  • package_requirements (list) – List of package requirement strings. For example: [‘numpy==1.2.3’, ‘pandas>=1.4.0’].

  • use_gpu (bool) – Whether this model needs a GPU.

Returns:

The updated model.

Return type:

Upload

update_python_model_git(model_id, application_connector_id=None, branch_name=None, python_root=None, train_function_name=None, predict_function_name=None, predict_many_function_name=None, train_module_name=None, predict_module_name=None, training_input_tables=None, cpu_size=None, memory=None, use_gpu=None)

Updates an existing Python model using an existing Git application connector. If a list of input feature groups is supplied, the materialized feature groups will be passed as arguments to the train and predict functions.

This method expects trainModuleName and predictModuleName to be valid Python source files which contain the functions named trainFunctionName and predictFunctionName, respectively. The train function returns the ModelVersion that results from training the model; the predict function has no well-defined return type, as it returns the prediction made by the model, which can be anything.

Parameters:
  • model_id (str) – The unique ID associated with the Python model to be changed.

  • application_connector_id (str) – The unique ID associated with the Git application connector.

  • branch_name (str) – Name of the branch in the Git repository to be used for training.

  • python_root (str) – Path from the top level of the Git repository to the directory containing the Python source code. If not provided, the default is the root of the Git repository.

  • train_function_name (str) – Name of the function found in the train module that will be executed to train the model. It is not executed when this function is run.

  • predict_function_name (str) – Name of the function found in the predict module that will be executed to run predictions through the model. It is not executed when this function is run.

  • predict_many_function_name (str) – Name of the function found in the predict module that will be executed to run batch predictions through the model. It is not executed when this function is run.

  • train_module_name (str) – Full path of the module that contains the train function from the root of the repository.

  • predict_module_name (str) – Full path of the module that contains the predict function from the root of the repository.

  • training_input_tables (list) – List of feature groups that are supplied to the train function as parameters. Each of these parameters is a materialized DataFrame (the same type as the function’s return value).

  • cpu_size (str) – Size of the CPU for the model training function.

  • memory (int) – Memory (in GB) for the model training function.

  • use_gpu (bool) – Whether this model needs a GPU.

Returns:

The updated model.

Return type:

Model

set_model_training_config(model_id, training_config, feature_group_ids=None)

Edits the default model training config.

Parameters:
  • model_id (str) – A unique string identifier of the model to update.

  • training_config (TrainingConfig) – The training config used to train this model.

  • feature_group_ids (List) – The list of feature groups used as input to the model.

Returns:

The model object corresponding to the updated training config.

Return type:

Model

set_model_objective(model_version, metric=None)

Sets the best model for all model instances of the model based on the specified metric, and updates the training configuration to use the specified metric for any future model versions.

If metric is None, the default metric selection is used.

Parameters:
  • model_version (str) – The model version to set as the best model.

  • metric (str) – The metric to use to determine the best model.

set_model_prediction_params(model_id, prediction_config)

Sets the prediction configuration for the model.

Parameters:
  • model_id (str) – Unique string identifier of the model to update.

  • prediction_config (dict) – Prediction configuration for the model.

Returns:

Model object after the prediction configuration is applied.

Return type:

Model

retrain_model(model_id, deployment_ids=None, feature_group_ids=None, custom_algorithms=None, builtin_algorithms=None, custom_algorithm_configs=None, cpu_size=None, memory=None, training_config=None, algorithm_training_configs=None)

Retrains the specified model, with an option to choose the deployments to which the retraining will be deployed.

Parameters:
  • model_id (str) – Unique string identifier of the model to retrain.

  • deployment_ids (List) – List of unique string identifiers of deployments to automatically deploy to.

  • feature_group_ids (List) – List of feature group IDs provided by the user to train the model on.

  • custom_algorithms (list) – List of user-defined algorithms to train. If not set, the algorithms from the previous run will be used, along with any applicable new custom algorithms.

  • builtin_algorithms (list) – List of algorithm names or algorithm IDs of Abacus.AI built-in algorithms to train. If not set, the algorithms from the previous run will be used, along with any applicable new built-in algorithms.

  • custom_algorithm_configs (dict) – User-defined training configs for each custom algorithm.

  • cpu_size (str) – Size of the CPU for the user-defined algorithms during training.

  • memory (int) – Memory (in GB) for the user-defined algorithms during training.

  • training_config (TrainingConfig) – The training config used to train this model.

  • algorithm_training_configs (list) – List of algorithm-specific training configs that will be part of the model training AutoML run.

Returns:

The model that is being retrained.

Return type:

Model
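
Example: a minimal sketch of retraining an existing model and auto-deploying the new version; the model and deployment IDs are hypothetical.

    from abacusai import ApiClient

    client = ApiClient('YOUR_API_KEY')  # hypothetical API key

    # Reuse the previous training setup and promote the new version
    # to the listed deployment once it passes the metrics check.
    model = client.retrain_model(
        model_id='model_789',           # hypothetical model ID
        deployment_ids=['deploy_abc'],  # hypothetical deployment ID
    )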

delete_model(model_id)

Deletes the specified model and all its versions. Models which are currently used in deployments cannot be deleted.

Parameters:

model_id (str) – Unique string identifier of the model to delete.

delete_model_version(model_version)

Deletes the specified model version. Model versions which are currently used in deployments cannot be deleted.

Parameters:

model_version (str) – The unique identifier of the model version to delete.

export_model_artifact_as_feature_group(model_version, table_name, artifact_type=None)

Exports metric artifact data for a model as a feature group.

Parameters:
  • model_version (str) – Unique string identifier for the version of the model.

  • table_name (str) – Name of the feature group table to create.

  • artifact_type (EvalArtifactType) – The evaluation artifact type to export.

Returns:

The created feature group.

Return type:

FeatureGroup

set_default_model_algorithm(model_id, algorithm=None, data_cluster_type=None)

Sets the model’s algorithm as the default for all new deployments.

Parameters:
  • model_id (str) – Unique identifier of the model to set.

  • algorithm (str) – Algorithm to pin in the model.

  • data_cluster_type (str) – Data cluster type to set the lead model for.

get_custom_train_function_info(project_id, feature_group_names_for_training=None, training_data_parameter_name_override=None, training_config=None, custom_algorithm_config=None)

Returns information about how to call the custom train function.

Parameters:
  • project_id (str) – The unique ID associated with the project.

  • feature_group_names_for_training (list) – A list of feature group table names to be used for training.

  • training_data_parameter_name_override (dict) – Override from feature group type to parameter name in the train function.

  • training_config (TrainingConfig) – Training config for the options supported by the Abacus.AI platform.

  • custom_algorithm_config (dict) – User-defined config that can be serialized by JSON.

Returns:

Information about how to call the customer-provided train function.

Return type:

CustomTrainFunctionInfo

export_custom_model_version(model_version, output_location, algorithm=None)

Bundle custom model artifacts to a zip file, and export to the specified location.

Parameters:
  • model_version (str) – A unique string identifier for the model version.

  • output_location (str) – Location to which the model artifacts will be exported. For example: s3://a-bucket/

  • algorithm (str) – The algorithm to be exported. Optional if there’s only one custom algorithm in the model version.

Returns:

Object describing the export and its status.

Return type:

ModelArtifactsExport

create_model_monitor(project_id, prediction_feature_group_id, training_feature_group_id=None, name=None, refresh_schedule=None, target_value=None, target_value_bias=None, target_value_performance=None, feature_mappings=None, model_id=None, training_feature_mappings=None, feature_group_base_monitor_config=None, feature_group_comparison_monitor_config=None, exclude_interactive_performance_analysis=True, exclude_bias_analysis=None, exclude_performance_analysis=None, exclude_feature_drift_analysis=None, exclude_data_integrity_analysis=None)

Runs a model monitor for the specified project.

Parameters:
  • project_id (str) – The unique ID associated with the project.

  • prediction_feature_group_id (str) – The unique ID of the prediction data feature group.

  • training_feature_group_id (str) – The unique ID of the training data feature group.

  • name (str) – The name you want your model monitor to have. Defaults to “<Project Name> Model Monitor”.

  • refresh_schedule (str) – A cron-style string that describes a schedule in UTC to automatically retrain the created model monitor.

  • target_value (str) – A target positive value for the label to compute bias and PR/AUC for performance page.

  • target_value_bias (str) – A target positive value for the label to compute bias.

  • target_value_performance (str) – A target positive value for the label to compute PR curve/AUC for performance page.

  • feature_mappings (dict) – A JSON map to override features for prediction_feature_group, where keys are column names and the values are feature data use types.

  • model_id (str) – The unique ID of the model.

  • training_feature_mappings (dict) – A JSON map to override features for training_feature_group, where keys are column names and the values are feature data use types.

  • feature_group_base_monitor_config (dict) – Selection strategy for feature_group 1, including the feature group version if selected.

  • feature_group_comparison_monitor_config (dict) – Selection strategy for feature_group 2, including the feature group version if selected.

  • exclude_interactive_performance_analysis (bool) – Whether to exclude interactive performance analysis. Defaults to True if not provided.

  • exclude_bias_analysis (bool) – Whether to exclude bias analysis in the model monitor. By default, bias analysis is included.

  • exclude_performance_analysis (bool) – Whether to exclude performance analysis in the model monitor. By default, performance analysis is included.

  • exclude_feature_drift_analysis (bool) – Whether to exclude feature drift analysis in the model monitor. By default, feature drift analysis is included.

  • exclude_data_integrity_analysis (bool) – Whether to exclude data integrity analysis in the model monitor. By default, data integrity analysis is included.

Returns:

The new model monitor that was created.

Return type:

ModelMonitor
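
Example: a minimal sketch of a weekly drift-and-performance monitor; all identifiers, the schedule, and the target value are hypothetical.

    from abacusai import ApiClient

    client = ApiClient('YOUR_API_KEY')  # hypothetical API key

    monitor = client.create_model_monitor(
        project_id='proj_123',                  # hypothetical IDs throughout
        prediction_feature_group_id='fg_pred',
        training_feature_group_id='fg_train',
        name='Weekly Drift Monitor',
        refresh_schedule='0 0 * * 0',           # rerun Sundays at 00:00 UTC
        target_value='1',                       # positive label for bias and PR/AUC
        model_id='model_789',
    )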

rerun_model_monitor(model_monitor_id)

Re-runs the specified model monitor.

Parameters:

model_monitor_id (str) – Unique string identifier of the model monitor to re-run.

Returns:

The model monitor that is being re-run.

Return type:

ModelMonitor

rename_model_monitor(model_monitor_id, name)

Renames a model monitor.

Parameters:
  • model_monitor_id (str) – Unique identifier of the model monitor to rename.

  • name (str) – The new name to apply to the model monitor.

delete_model_monitor(model_monitor_id)

Deletes the specified Model Monitor and all its versions.

Parameters:

model_monitor_id (str) – Unique identifier of the Model Monitor to delete.

delete_model_monitor_version(model_monitor_version)

Deletes the specified model monitor version.

Parameters:

model_monitor_version (str) – Unique identifier of the model monitor version to delete.

create_vision_drift_monitor(project_id, prediction_feature_group_id, training_feature_group_id, name, feature_mappings, training_feature_mappings, target_value_performance=None, refresh_schedule=None)

Runs a vision drift monitor for the specified project.

Parameters:
  • project_id (str) – Unique string identifier of the project.

  • prediction_feature_group_id (str) – Unique string identifier of the prediction data feature group.

  • training_feature_group_id (str) – Unique string identifier of the training data feature group.

  • name (str) – The name you want your model monitor to have. Defaults to “<Project Name> Model Monitor”.

  • feature_mappings (dict) – A JSON map to override features for prediction_feature_group, where keys are column names and the values are feature data use types.

  • training_feature_mappings (dict) – A JSON map to override features for training_feature_group, where keys are column names and the values are feature data use types.

  • target_value_performance (str) – A target positive value for the label to compute precision-recall curve/area under curve for performance page.

  • refresh_schedule (str) – A cron-style string that describes a schedule in UTC to automatically rerun the created vision drift monitor.

Returns:

The new model monitor that was created.

Return type:

ModelMonitor

create_nlp_drift_monitor(project_id, prediction_feature_group_id, training_feature_group_id, name, feature_mappings, training_feature_mappings, target_value_performance=None, refresh_schedule=None)

Runs an NLP drift monitor for the specified project.

Parameters:
  • project_id (str) – Unique string identifier of the project.

  • prediction_feature_group_id (str) – Unique string identifier of the prediction data feature group.

  • training_feature_group_id (str) – Unique string identifier of the training data feature group.

  • name (str) – The name you want your model monitor to have. Defaults to “<Project Name> Model Monitor”.

  • feature_mappings (dict) – A JSON map to override features for prediction_feature_group, where keys are column names and the values are feature data use types.

  • training_feature_mappings (dict) – A JSON map to override features for training_feature_group, where keys are column names and the values are feature data use types.

  • target_value_performance (str) – A target positive value for the label to compute precision-recall curve/area under curve for performance page.

  • refresh_schedule (str) – A cron-style string that describes a schedule in UTC to automatically rerun the created NLP drift monitor.

Returns:

The new model monitor that was created.

Return type:

ModelMonitor

create_forecasting_monitor(project_id, name, prediction_feature_group_id, training_feature_group_id, training_forecast_config, prediction_forecast_config, forecast_frequency, refresh_schedule=None)

Runs a forecasting monitor for the specified project.

Parameters:
  • project_id (str) – Unique string identifier of the project.

  • name (str) – The name you want your model monitor to have. Defaults to “<Project Name> Model Monitor”.

  • prediction_feature_group_id (str) – Unique string identifier of the prediction data feature group.

  • training_feature_group_id (str) – Unique string identifier of the training data feature group.

  • training_forecast_config (ForecastingMonitorConfig) – The configuration for the training data.

  • prediction_forecast_config (ForecastingMonitorConfig) – The configuration for the prediction data.

  • forecast_frequency (str) – The frequency of the forecast. Defaults to the frequency of the prediction data.

  • refresh_schedule (str) – A cron-style string that describes a schedule in UTC to automatically rerun the created forecasting monitor.

Returns:

The new model monitor that was created.

Return type:

ModelMonitor

create_eda(project_id, feature_group_id, name, refresh_schedule=None, include_collinearity=False, include_data_consistency=False, collinearity_keys=None, primary_keys=None, data_consistency_test_config=None, data_consistency_reference_config=None, feature_mappings=None, forecast_frequency=None)

Run an Exploratory Data Analysis (EDA) for the specified project.

Parameters:
  • project_id (str) – The unique ID associated with the project.

  • feature_group_id (str) – The unique ID of the prediction data feature group.

  • name (str) – The name you want your model monitor to have. Defaults to “<Project Name> EDA”.

  • refresh_schedule (str) – A cron-style string that describes a schedule in UTC to automatically retrain the created EDA.

  • include_collinearity (bool) – Set to True if the EDA type is collinearity.

  • include_data_consistency (bool) – Set to True if the EDA type is data consistency.

  • collinearity_keys (list) – List of features to use for collinearity

  • primary_keys (list) – List of features corresponding to the primary keys (for Data Consistency analysis) or item IDs (for Forecasting analysis) of the given feature group.

  • data_consistency_test_config (dict) – Test feature group version selection strategy for Data Consistency EDA type.

  • data_consistency_reference_config (dict) – Reference feature group version selection strategy for Data Consistency EDA type.

  • feature_mappings (dict) – A JSON map to override features for the given feature_group, where keys are column names and the values are feature data use types. (In forecasting, used to set the timestamp column and target value)

  • forecast_frequency (str) – The frequency of the data. It can be one of HOURLY, DAILY, WEEKLY, MONTHLY, QUARTERLY, or YEARLY.

Returns:

The new EDA object that was created.

Return type:

Eda

rerun_eda(eda_id)

Reruns the specified EDA object.

Parameters:

eda_id (str) – Unique string identifier of the EDA object to rerun.

Returns:

The EDA object that is being rerun.

Return type:

Eda

rename_eda(eda_id, name)

Renames an EDA.

Parameters:
  • eda_id (str) – Unique string identifier of the EDA to rename.

  • name (str) – The new name to apply to the EDA.

delete_eda(eda_id)

Deletes the specified EDA and all its versions.

Parameters:

eda_id (str) – Unique string identifier of the EDA to delete.

delete_eda_version(eda_version)

Deletes the specified EDA version.

Parameters:

eda_version (str) – Unique string identifier of the EDA version to delete.

create_holdout_analysis(name, model_id, feature_group_ids, model_version=None, algorithm=None)

Create a holdout analysis for a model.

Parameters:
  • name (str) – Name of the holdout analysis.

  • model_id (str) – ID of the model to create a holdout analysis for.

  • feature_group_ids (List) – List of feature group IDs to use for the holdout analysis.

  • model_version (str) – (optional) Version of the model to use for the holdout analysis.

  • algorithm (str) – (optional) ID of the algorithm to use for the holdout analysis.

Returns:

The created holdout analysis.

Return type:

HoldoutAnalysis
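
Example: a minimal sketch of creating and later rerunning a holdout analysis; the names and IDs are hypothetical, and the holdout_analysis_id attribute is assumed from the HoldoutAnalysis return type.

    from abacusai import ApiClient

    client = ApiClient('YOUR_API_KEY')  # hypothetical API key

    analysis = client.create_holdout_analysis(
        name='Q3 Holdout',                 # hypothetical name and IDs
        model_id='model_789',
        feature_group_ids=['fg_holdout'],
    )

    # Later, re-evaluate a newer version of the same model.
    version = client.rerun_holdout_analysis(analysis.holdout_analysis_id)  # attribute name assumed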

rerun_holdout_analysis(holdout_analysis_id, model_version=None, algorithm=None)

Rerun a holdout analysis. A different model version and algorithm can be specified, provided they belong to the same model.

Parameters:
  • holdout_analysis_id (str) – ID of the holdout analysis to rerun.

  • model_version (str) – (optional) Version of the model to use for the holdout analysis.

  • algorithm (str) – (optional) ID of the algorithm to use for the holdout analysis.

Returns:

The created holdout analysis version.

Return type:

HoldoutAnalysisVersion

create_monitor_alert(project_id, alert_name, condition_config, action_config, model_monitor_id=None, realtime_monitor_id=None)

Create a monitor alert for the given conditions and monitor. A monitor alert can be created for either a model monitor or a real-time monitor.

Parameters:
  • project_id (str) – Unique string identifier for the project.

  • alert_name (str) – Name of the alert.

  • condition_config (AlertConditionConfig) – Condition to run the actions for the alert.

  • action_config (AlertActionConfig) – Configuration for the action of the alert.

  • model_monitor_id (str) – Unique string identifier for the model monitor created under the project.

  • realtime_monitor_id (str) – Unique string identifier for the real-time monitor for the deployment created under the project.

Returns:

Object describing the monitor alert.

Return type:

MonitorAlert
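
Example: a minimal sketch wiring a model monitor alert to email. The condition and action config class names and fields shown here are assumptions; consult abacusai.api_class for the exact classes available.

    from abacusai import ApiClient
    from abacusai.api_class import (            # class names assumed
        AccuracyBelowThresholdConditionConfig,
        EmailActionConfig,
    )

    client = ApiClient('YOUR_API_KEY')  # hypothetical API key

    alert = client.create_monitor_alert(
        project_id='proj_123',                  # hypothetical IDs
        alert_name='Accuracy drop',
        condition_config=AccuracyBelowThresholdConditionConfig(threshold=0.8),
        action_config=EmailActionConfig(
            email_recipients=['ml-team@example.com'],
            email_body='Model accuracy fell below 80%.',
        ),
        model_monitor_id='monitor_456',
    )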

update_monitor_alert(monitor_alert_id, alert_name=None, condition_config=None, action_config=None)

Updates a monitor alert.

Parameters:
  • monitor_alert_id (str) – Unique identifier of the monitor alert.

  • alert_name (str) – Name of the alert.

  • condition_config (AlertConditionConfig) – Condition to run the actions for the alert.

  • action_config (AlertActionConfig) – Configuration for the action of the alert.

Returns:

Object describing the monitor alert.

Return type:

MonitorAlert

run_monitor_alert(monitor_alert_id)

Reruns the given monitor alert using the latest monitor instance.

Parameters:

monitor_alert_id (str) – Unique identifier of a monitor alert.

Returns:

Object describing the monitor alert.

Return type:

MonitorAlert

delete_monitor_alert(monitor_alert_id)

Deletes a monitor alert.

Parameters:

monitor_alert_id (str) – The unique string identifier of the alert to delete.

create_prediction_operator(name, project_id, source_code=None, predict_function_name=None, initialize_function_name=None, feature_group_ids=None, cpu_size=None, memory=None, package_requirements=None, use_gpu=False)

Create a new prediction operator.

Parameters:
  • name (str) – Name of the prediction operator.

  • project_id (str) – The unique ID of the associated project.

  • source_code (str) – Contents of a valid Python source code file. The source code should contain the function named predictFunctionName and, if defined, the function named initializeFunctionName.

  • predict_function_name (str) – Name of the function found in the source code that will be executed to run predictions.

  • initialize_function_name (str) – Name of the optional initialize function found in the source code. This function will generate anything used by predictions, based on input feature groups.

  • feature_group_ids (List) – List of feature groups that are supplied to the initialize function as parameters. Each of these parameters is a materialized DataFrame. The order should match the initialize function’s parameters.

  • cpu_size (str) – Size of the CPU for the prediction operator.

  • memory (int) – Memory (in GB) for the prediction operator.

  • package_requirements (list) – List of package requirement strings. For example: [‘numpy==1.2.3’, ‘pandas>=1.4.0’]

  • use_gpu (bool) – Whether this prediction operator needs a GPU.

Returns:

The created prediction operator object.

Return type:

PredictionOperator
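
Example: a minimal sketch of a prediction operator. The exact signatures expected for the initialize and predict functions are assumptions here; the source code and all identifiers are hypothetical.

    from abacusai import ApiClient

    client = ApiClient('YOUR_API_KEY')  # hypothetical API key

    source = '''
    def init(lookup_table):
        # lookup_table arrives as a materialized DataFrame
        return lookup_table.set_index('user_id')

    def predict(index, query):
        return index.loc[query['user_id']].to_dict()
    '''

    operator = client.create_prediction_operator(
        name='User Lookup',                # hypothetical names and IDs
        project_id='proj_123',
        source_code=source,
        predict_function_name='predict',
        initialize_function_name='init',
        feature_group_ids=['fg_456'],
        package_requirements=['pandas>=1.4.0'],
    )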

update_prediction_operator(prediction_operator_id, name=None, feature_group_ids=None, source_code=None, initialize_function_name=None, predict_function_name=None, cpu_size=None, memory=None, package_requirements=None, use_gpu=None)

Update an existing prediction operator. This does not create a new version.

Parameters:
  • prediction_operator_id (str) – The unique ID of the prediction operator.

  • name (str) – Name of the prediction operator.

  • feature_group_ids (List) – List of feature groups that are supplied to the initialize function as parameters. Each of these parameters is a materialized DataFrame. The order should match the initialize function’s parameters.

  • source_code (str) – Contents of a valid Python source code file. The source code should contain the function named predictFunctionName and, if defined, the function named initializeFunctionName.

  • initialize_function_name (str) – Name of the optional initialize function found in the source code. This function will generate anything used by predictions, based on input feature groups.

  • predict_function_name (str) – Name of the function found in the source code that will be executed to run predictions.

  • cpu_size (str) – Size of the CPU for the prediction operator.

  • memory (int) – Memory (in GB) for the prediction operator.

  • package_requirements (list) – List of package requirement strings. For example: [‘numpy==1.2.3’, ‘pandas>=1.4.0’]

  • use_gpu (bool) – Whether this prediction operator needs a GPU.

Returns:

The updated prediction operator object.

Return type:

PredictionOperator

delete_prediction_operator(prediction_operator_id)

Delete an existing prediction operator.

Parameters:

prediction_operator_id (str) – The unique ID of the prediction operator.

deploy_prediction_operator(prediction_operator_id, auto_deploy=True)

Deploy the prediction operator.

Parameters:
  • prediction_operator_id (str) – The unique ID of the prediction operator.

  • auto_deploy (bool) – Flag to enable the automatic deployment when a new prediction operator version is created.

Returns:

The created deployment object.

Return type:

Deployment

create_prediction_operator_version(prediction_operator_id)

Create a new version of the prediction operator.

Parameters:

prediction_operator_id (str) – The unique ID of the prediction operator.

Returns:

The created prediction operator version object.

Return type:

PredictionOperatorVersion

delete_prediction_operator_version(prediction_operator_version)

Delete a prediction operator version.

Parameters:

prediction_operator_version (str) – The unique ID of the prediction operator version.

create_deployment(name=None, model_id=None, model_version=None, algorithm=None, feature_group_id=None, project_id=None, description=None, calls_per_second=None, auto_deploy=True, start=True, enable_batch_streaming_updates=False, skip_metrics_check=False, model_deployment_config=None)

Creates a deployment with the specified name and description for the specified model or feature group.

A Deployment makes the trained model or feature group available for prediction requests.

Parameters:
  • name (str) – The name of the deployment.

  • model_id (str) – The unique ID associated with the model.

  • model_version (str) – The unique ID associated with the model version to deploy.

  • algorithm (str) – The unique ID associated with the algorithm to deploy.

  • feature_group_id (str) – The unique ID associated with a feature group.

  • project_id (str) – The unique ID associated with a project.

  • description (str) – The description for the deployment.

  • calls_per_second (int) – The number of calls per second the deployment can handle.

  • auto_deploy (bool) – Flag to enable the automatic deployment when a new Model Version finishes training.

  • start (bool) – If true, the deployment will be started; otherwise, it will be created offline.

  • enable_batch_streaming_updates (bool) – Flag to enable marking the feature group deployment to have a background process cache streamed-in rows for quicker lookup.

  • skip_metrics_check (bool) – Flag to skip the metric regression check for this deployment.

  • model_deployment_config (dict) – The deployment configuration for the model to deploy.

Returns:

The new model or feature group deployment.

Return type:

Deployment
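
Example: a minimal sketch of deploying a trained model; the name and IDs are hypothetical.

    from abacusai import ApiClient

    client = ApiClient('YOUR_API_KEY')  # hypothetical API key

    deployment = client.create_deployment(
        name='Churn Model - Production',  # hypothetical name and IDs
        model_id='model_789',
        project_id='proj_123',
        auto_deploy=True,   # promote future versions that pass the metrics check
    )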

create_deployment_token(project_id, name=None)

Creates a deployment token for the specified project.

Deployment tokens are used to authenticate requests to the prediction APIs and are scoped to the project level.

Parameters:
  • project_id (str) – The unique string identifier associated with the project.

  • name (str) – The name of the deployment token.

Returns:

The deployment token.

Return type:

DeploymentAuthToken

update_deployment(deployment_id, description=None, auto_deploy=None, skip_metrics_check=None)

Updates a deployment’s properties.

Parameters:
  • deployment_id (str) – Unique identifier of the deployment to update.

  • description (str) – The new description for the deployment.

  • auto_deploy (bool) – Flag to enable the automatic deployment when a new Model Version finishes training.

  • skip_metrics_check (bool) – Flag to skip the metric regression check for this deployment. This field is only relevant when auto_deploy is enabled.

rename_deployment(deployment_id, name)

Updates a deployment’s name.

Parameters:
  • deployment_id (str) – Unique string identifier for the deployment to update.

  • name (str) – The new deployment name.

set_auto_deployment(deployment_id, enable=None)

Enable or disable auto deployment for the specified deployment.

When a model is scheduled to retrain, deployments with auto deployment enabled will be marked to automatically promote the new model version. After the newly trained model completes, a check on its metrics in comparison to the currently deployed model version will be performed. If the metrics are comparable or better, the newly trained model version is automatically promoted. If not, it will be marked as a failed model version promotion with an error indicating poor metrics performance.

Parameters:
  • deployment_id (str) – The unique ID associated with the deployment.

  • enable (bool) – Enable or disable the autoDeploy property of the deployment.

set_deployment_model_version(deployment_id, model_version, algorithm=None, model_deployment_config=None)

Promotes a model version and/or algorithm to be the active served deployment version.

Parameters:
  • deployment_id (str) – A unique identifier for the deployment.

  • model_version (str) – A unique identifier for the model version.

  • algorithm (str) – The algorithm to use for the model version. If not specified, the algorithm will be inferred from the model version.

  • model_deployment_config (dict) – The deployment configuration for the model to deploy.

set_deployment_feature_group_version(deployment_id, feature_group_version)

Promotes a feature group version to be served in the deployment.

Parameters:
  • deployment_id (str) – Unique string identifier for the deployment.

  • feature_group_version (str) – Unique string identifier for the feature group version.

set_deployment_prediction_operator_version(deployment_id, prediction_operator_version)

Promotes a prediction operator version to be served in the deployment.

Parameters:
  • deployment_id (str) – Unique string identifier for the deployment.

  • prediction_operator_version (str) – Unique string identifier for the prediction operator version.

start_deployment(deployment_id)

Restarts the specified deployment that was previously suspended.

Parameters:

deployment_id (str) – A unique string identifier associated with the deployment.

stop_deployment(deployment_id)

Stops the specified deployment.

Parameters:

deployment_id (str) – Unique string identifier of the deployment to be stopped.

delete_deployment(deployment_id)

Deletes the specified deployment. The deployment’s models will not be affected. Note that the deployments are not recoverable after they are deleted.

Parameters:

deployment_id (str) – Unique string identifier of the deployment to delete.

delete_deployment_token(deployment_token)

Deletes the specified deployment token.

Parameters:

deployment_token (str) – The deployment token to delete.

set_deployment_feature_group_export_file_connector_output(deployment_id, file_format=None, output_location=None)

Sets the export output for the Feature Group Deployment to be a file connector.

Parameters:
  • deployment_id (str) – The ID of the deployment for which the export type is set.

  • file_format (str) – The type of export output, either CSV or JSON.

  • output_location (str) – The file connector (cloud) location where the output should be exported.

set_deployment_feature_group_export_database_connector_output(deployment_id, database_connector_id, object_name, write_mode, database_feature_mapping, id_column=None, additional_id_columns=None)

Sets the export output for the Feature Group Deployment to a Database connector.

Parameters:
  • deployment_id (str) – The ID of the deployment for which the export type is set.

  • database_connector_id (str) – The unique string identifier of the database connector used.

  • object_name (str) – The object of the database connector to write to.

  • write_mode (str) – The write mode to use when writing to the database connector, either UPSERT or INSERT.

  • database_feature_mapping (dict) – The column/feature pairs mapping the features to the database columns.

  • id_column (str) – The id column to use as the upsert key.

  • additional_id_columns (list) – For database connectors which support it, a list of additional ID columns to use as a complex key for upserting.

remove_deployment_feature_group_export_output(deployment_id)

Removes the export type that is set for the Feature Group Deployment.

Parameters:

deployment_id (str) – The ID of the deployment for which the export type is set.

set_default_prediction_arguments(deployment_id, prediction_arguments, set_as_override=False)

Sets the deployment config.

Parameters:
  • deployment_id (str) – The unique identifier for a deployment created under the project.

  • prediction_arguments (PredictionArguments) – The prediction arguments to set.

  • set_as_override (bool) – If True, use these arguments as overrides instead of defaults for predict calls.

Returns:

Description of the updated deployment.

Return type:

Deployment

create_deployment_alert(deployment_id, alert_name, condition_config, action_config)

Create a deployment alert for the given conditions.

Only batch prediction usage is supported at this time.

Parameters:
  • deployment_id (str) – Unique string identifier for the deployment.

  • alert_name (str) – Name of the alert.

  • condition_config (AlertConditionConfig) – Condition to run the actions for the alert.

  • action_config (AlertActionConfig) – Configuration for the action of the alert.

Returns:

Object describing the deployment alert.

Return type:

MonitorAlert

create_realtime_monitor(deployment_id, realtime_monitor_schedule=None, lookback_time=None)

Creates a real-time monitor. Real-time monitors compute and monitor metrics on real-time prediction data.

Parameters:
  • deployment_id (str) – Unique string identifier for the deployment.

  • realtime_monitor_schedule (str) – The cron expression for triggering the monitor.

  • lookback_time (int) – Lookback time (in seconds) for each monitor trigger.

Returns:

Object describing the real-time monitor.

Return type:

RealtimeMonitor

update_realtime_monitor(realtime_monitor_id, realtime_monitor_schedule=None, lookback_time=None)

Update the real-time monitor associated with the real-time monitor id.

Parameters:
  • realtime_monitor_id (str) – Unique string identifier for the real-time monitor.

  • realtime_monitor_schedule (str) – The cron expression for triggering the monitor.

  • lookback_time (float) – Lookback time (in seconds) for each monitor trigger.

Returns:

Object describing the realtime monitor.

Return type:

RealtimeMonitor

delete_realtime_monitor(realtime_monitor_id)

Delete the real-time monitor associated with the real-time monitor id.

Parameters:

realtime_monitor_id (str) – Unique string identifier for the real-time monitor.

create_refresh_policy(name, cron, refresh_type, project_id=None, dataset_ids=[], feature_group_id=None, model_ids=[], deployment_ids=[], batch_prediction_ids=[], model_monitor_ids=[], notebook_id=None, prediction_operator_id=None, feature_group_export_config=None)

Creates a refresh policy with a particular cron pattern and refresh type. The cron is specified in UTC time.

A refresh policy allows for the scheduling of a set of actions at regular intervals. This can be useful for periodically updating data that needs to be re-imported into the project for retraining.

Parameters:
  • name (str) – The name of the refresh policy.

  • cron (str) – A cron-like string specifying the frequency of the refresh policy in UTC time.

  • refresh_type (str) – The refresh type used to determine what is being refreshed, such as a single dataset, dataset and model, or more.

  • project_id (str) – Optionally, a project ID can be specified so that all datasets, models, deployments, batch predictions, prediction metrics, model monitors, and notebooks are captured at the instant the policy is created.

  • dataset_ids (List) – Comma-separated list of dataset IDs.

  • feature_group_id (str) – Feature Group ID associated with refresh policy.

  • model_ids (List) – Comma-separated list of model IDs.

  • deployment_ids (List) – Comma-separated list of deployment IDs.

  • batch_prediction_ids (List) – Comma-separated list of batch prediction IDs.

  • model_monitor_ids (List) – Comma-separated list of model monitor IDs.

  • notebook_id (str) – Notebook ID associated with refresh policy.

  • prediction_operator_id (str) – Prediction Operator ID associated with refresh policy.

  • feature_group_export_config (FeatureGroupExportConfig) – Feature group export configuration.

Returns:

The created refresh policy.

Return type:

RefreshPolicy
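
Example: a minimal sketch of a nightly dataset refresh policy; the name, IDs, and refresh_type value are hypothetical placeholders.

    from abacusai import ApiClient

    client = ApiClient('YOUR_API_KEY')  # hypothetical API key

    policy = client.create_refresh_policy(
        name='Nightly dataset refresh',  # hypothetical name and IDs
        cron='0 2 * * *',                # every day at 02:00 UTC
        refresh_type='DATASET',          # assumed refresh type value
        dataset_ids=['ds_123'],
    )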

delete_refresh_policy(refresh_policy_id)

Delete a refresh policy.

Parameters:

refresh_policy_id (str) – Unique string identifier associated with the refresh policy to delete.

pause_refresh_policy(refresh_policy_id)

Pauses a refresh policy.

Parameters:

refresh_policy_id (str) – Unique identifier associated with the refresh policy to be paused.

resume_refresh_policy(refresh_policy_id)

Resumes a refresh policy.

Parameters:

refresh_policy_id (str) – The unique ID associated with this refresh policy.

run_refresh_policy(refresh_policy_id)

Force a run of the refresh policy.

Parameters:

refresh_policy_id (str) – Unique string identifier associated with the refresh policy to be run.

update_refresh_policy(refresh_policy_id, name=None, cron=None, feature_group_export_config=None)

Update the name or cron string of a refresh policy.

Parameters:
  • refresh_policy_id (str) – Unique string identifier associated with the refresh policy.

  • name (str) – Name of the refresh policy to be updated.

  • cron (str) – Cron string describing the updated schedule for the refresh policy.

  • feature_group_export_config (FeatureGroupExportConfig) – Feature group export configuration to update a feature group refresh policy.

Returns:

Updated refresh policy.

Return type:

RefreshPolicy

lookup_features(deployment_token, deployment_id, query_data, limit_results=None, result_columns=None)

Returns the feature group deployed in the feature store project.

Parameters:
  • deployment_token (str) – A deployment token used to authenticate access to created deployments. This token only authorizes predictions on deployments in this project, so it can be safely embedded inside an application or website.

  • deployment_id (str) – A unique identifier for a deployment created under the project.

  • query_data (dict) – A dictionary where the key is the column name (e.g. a column with name ‘user_id’ in your dataset) mapped to the column mapping USER_ID that uniquely identifies the entity against which a prediction is performed, and the value is the unique value of the same entity.

  • limit_results (int) – If provided, will limit the number of results to the value specified.

  • result_columns (list) – If provided, will limit the columns present in each result to the columns specified in this list.

Return type:

Dict

predict(deployment_token, deployment_id, query_data, **kwargs)

Returns a prediction for Predictive Modeling

Parameters:
  • deployment_token (str) – A deployment token used to authenticate access to created deployments. This token is only authorized to predict on deployments in this project, and is safe to embed in an application or website.

  • deployment_id (str) – A unique identifier for a deployment created under the project.

  • query_data (dict) – A dictionary where the key is the column name (e.g. a column with name ‘user_id’ in the dataset) mapped to the column mapping USER_ID that uniquely identifies the entity against which a prediction is performed, and the value is the unique value of the same entity.

Return type:

Dict
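
Example: end to end, online predictions authenticate with a deployment token rather than an API key. A minimal sketch, with hypothetical identifiers and a query keyed by the column mapped to USER_ID; the deployment_token attribute is assumed from the DeploymentAuthToken return type.

    from abacusai import ApiClient

    client = ApiClient('YOUR_API_KEY')  # hypothetical API key

    token = client.create_deployment_token('proj_123')  # hypothetical project ID

    result = client.predict(
        deployment_token=token.deployment_token,  # attribute name assumed
        deployment_id='deploy_abc',               # hypothetical deployment ID
        query_data={'user_id': 'u-42'},           # column mapped to USER_ID
    )
    print(result)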

predict_multiple(deployment_token, deployment_id, query_data)

Returns a list of predictions for predictive modeling.

Parameters:
  • deployment_token (str) – The deployment token used to authenticate access to created deployments. This token is only authorized to predict on deployments in this project, and is safe to embed in an application or website.

  • deployment_id (str) – The unique identifier for a deployment created under the project.

  • query_data (list) – A list of dictionaries, where the ‘key’ is the column name (e.g. a column with name ‘user_id’ in the dataset) mapped to the column mapping USER_ID that uniquely identifies the entity against which a prediction is performed, and the ‘value’ is the unique value of the same entity.

Return type:

Dict

predict_from_datasets(deployment_token, deployment_id, query_data)

Returns a list of predictions for Predictive Modeling.

Parameters:
  • deployment_token (str) – The deployment token used to authenticate access to created deployments. This token is only authorized to predict on deployments in this project, so it is safe to embed this model inside of an application or website.

  • deployment_id (str) – The unique identifier for a deployment created under the project.

  • query_data (dict) – A dictionary where the ‘key’ is the source dataset name, and the ‘value’ is a list of records corresponding to the dataset rows.

Return type:

Dict

predict_lead(deployment_token, deployment_id, query_data, explain_predictions=False, explainer_type=None)

Returns the probability of a user being a lead based on their interaction with the service/product and their own attributes (e.g. income, assets, credit score, etc.). Note that the inputs to this method, wherever applicable, should be the column names in the dataset mapped to the column mappings in our system (e.g. column ‘user_id’ mapped to mapping ‘LEAD_ID’ in our system).

Parameters:
  • deployment_token (str) – The deployment token to authenticate access to created deployments. This token is only authorized to predict on deployments in this project, so it is safe to embed this model inside of an application or website.

  • deployment_id (str) – The unique identifier to a deployment created under the project.

  • query_data (dict) – A dictionary containing user attributes and/or user’s interaction data with the product/service (e.g. number of clicks, items in cart, etc.).

  • explain_predictions (bool) – Whether to explain predictions for leads.

  • explainer_type (str) – Type of explainer to use for explanations.

Return type:

Dict

predict_churn(deployment_token, deployment_id, query_data, explain_predictions=False, explainer_type=None)

Returns the probability that a user will churn based on their interactions with the item/product/service. Note that the inputs to this method, wherever applicable, will be the column names in your dataset mapped to the column mappings in our system (e.g. column ‘churn_result’ mapped to mapping ‘CHURNED_YN’ in our system).

Parameters:
  • deployment_token (str) – The deployment token to authenticate access to created deployments. This token is only authorized to predict on deployments in this project, so it is safe to embed this model inside of an application or website.

  • deployment_id (str) – The unique identifier to a deployment created under the project.

  • query_data (dict) – This will be a dictionary where the ‘key’ will be the column name (e.g. a column with name ‘user_id’ in your dataset) mapped to the column mapping USER_ID that uniquely identifies the entity against which a prediction is performed and the ‘value’ will be the unique value of the same entity.

  • explain_predictions (bool) – Will explain predictions for churn

  • explainer_type (str) – Type of explainer to use for explanations

Return type:

Dict

predict_takeover(deployment_token, deployment_id, query_data)

Returns a probability for each class label associated with the types of fraud or a ‘yes’ or ‘no’ type label for the possibility of fraud. Note that the inputs to this method, wherever applicable, will be the column names in the dataset mapped to the column mappings in our system (e.g., column ‘account_name’ mapped to mapping ‘ACCOUNT_ID’ in our system).

Parameters:
  • deployment_token (str) – The deployment token to authenticate access to created deployments. This token is only authorized to predict on deployments in this project, so it is safe to embed this model inside an application or website.

  • deployment_id (str) – The unique identifier to a deployment created under the project.

  • query_data (dict) – A dictionary containing account activity characteristics (e.g., login id, login duration, login type, IP address, etc.).

Return type:

Dict

predict_fraud(deployment_token, deployment_id, query_data)

Returns the probability of a transaction performed under a specific account being fraudulent or not. Note that the inputs to this method, wherever applicable, should be the column names in your dataset mapped to the column mappings in our system (e.g. column ‘account_number’ mapped to the mapping ‘ACCOUNT_ID’ in our system).

Parameters:
  • deployment_token (str) – A deployment token to authenticate access to created deployments. This token is only authorized to predict on deployments in this project, so it is safe to embed this model inside of an application or website.

  • deployment_id (str) – A unique identifier to a deployment created under the project.

  • query_data (dict) – A dictionary containing transaction attributes (e.g. credit card type, transaction location, transaction amount, etc.).

Return type:

Dict

predict_class(deployment_token, deployment_id, query_data, threshold=None, threshold_class=None, thresholds=None, explain_predictions=False, fixed_features=None, nested=None, explainer_type=None)

Returns a classification prediction

Parameters:
  • deployment_token (str) – The deployment token to authenticate access to created deployments. This token is only authorized to predict on deployments in this project, so it is safe to embed this model within an application or website.

  • deployment_id (str) – The unique identifier for a deployment created under the project.

  • query_data (dict) – A dictionary where the ‘Key’ is the column name (e.g. a column with the name ‘user_id’ in your dataset) mapped to the column mapping USER_ID that uniquely identifies the entity against which a prediction is performed and the ‘Value’ is the unique value of the same entity.

  • threshold (float) – A float value that is applied on the popular class label.

  • threshold_class (str) – The label upon which the threshold is added (binary labels only).

  • thresholds (Dict) – Maps labels to thresholds (multi-label classification only). Defaults to F1 optimal threshold if computed for the given class, else uses 0.5.

  • explain_predictions (bool) – If True, returns the SHAP explanations for all input features.

  • fixed_features (list) – A set of input features to treat as constant for explanations - only honored when the explainer type is KERNEL_EXPLAINER

  • nested (str) – If specified generates prediction delta for each index of the specified nested feature.

  • explainer_type (str) – The type of explainer to use.

Return type:

Dict
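
A hedged sketch of a binary classification call that applies a custom threshold and requests SHAP explanations; all identifiers and the ‘churned’ label below are hypothetical:

    from abacusai import PredictionClient

    client = PredictionClient()

    result = client.predict_class(
        deployment_token='YOUR_DEPLOYMENT_TOKEN',
        deployment_id='YOUR_DEPLOYMENT_ID',
        query_data={'user_id': 'u_12345'},
        threshold=0.7,               # applied to the popular class label
        threshold_class='churned',   # hypothetical binary label
        explain_predictions=True,    # include SHAP explanations per feature
    )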

predict_target(deployment_token, deployment_id, query_data, explain_predictions=False, fixed_features=None, nested=None, explainer_type=None)

Returns a prediction from a classification or regression model. Optionally, includes explanations.

Parameters:
  • deployment_token (str) – The deployment token to authenticate access to created deployments. This token is only authorized to predict on deployments in this project, so it is safe to embed this model inside of an application or website.

  • deployment_id (str) – The unique identifier of a deployment created under the project.

  • query_data (dict) – A dictionary where the ‘key’ is the column name (e.g. a column with name ‘user_id’ in your dataset) mapped to the column mapping USER_ID that uniquely identifies the entity against which a prediction is performed and the ‘value’ is the unique value of the same entity.

  • explain_predictions (bool) – If true, returns the SHAP explanations for all input features.

  • fixed_features (list) – Set of input features to treat as constant for explanations - only honored when the explainer type is KERNEL_EXPLAINER

  • nested (str) – If specified, generates prediction delta for each index of the specified nested feature.

  • explainer_type (str) – The type of explainer to use.

Return type:

Dict

get_anomalies(deployment_token, deployment_id, threshold=None, histogram=False)

Returns a list of anomalies from the training dataset.

Parameters:
  • deployment_token (str) – The deployment token to authenticate access to created deployments. This token is only authorized to predict on deployments in this project, so it is safe to embed this model inside of an application or website.

  • deployment_id (str) – The unique identifier to a deployment created under the project.

  • threshold (float) – The threshold score of what is an anomaly. Valid values are between 0.8 and 0.99.

  • histogram (bool) – If True, will return a histogram of the distribution of all points.

Return type:

io.BytesIO

get_timeseries_anomalies(deployment_token, deployment_id, start_timestamp=None, end_timestamp=None, query_data=None, get_all_item_data=False, series_ids=None)

Returns a list of anomalous timestamps from the training dataset.

Parameters:
  • deployment_token (str) – The deployment token to authenticate access to created deployments. This token is only authorized to predict on deployments in this project, so it is safe to embed this model inside of an application or website.

  • deployment_id (str) – The unique identifier to a deployment created under the project.

  • start_timestamp (str) – The timestamp from which anomalies are to be detected in the training data.

  • end_timestamp (str) – The timestamp up to which anomalies are to be detected in the training data.

  • query_data (dict) – Additional data on which anomaly detection is to be performed; this can be a single record, a list of records, or a JSON string representing a list of records.

  • get_all_item_data (bool) – Set this to True if anomaly detection is to be performed on all the data related to the input IDs.

  • series_ids (List) – A list of series IDs on which anomaly detection is to be performed.

Return type:

Dict

is_anomaly(deployment_token, deployment_id, query_data=None)

Returns a list of anomaly attributes based on login information for a specified account. Note that the inputs to this method, wherever applicable, should be the column names in the dataset mapped to the column mappings in our system (e.g. column ‘account_name’ mapped to mapping ‘ACCOUNT_ID’ in our system).

Parameters:
  • deployment_token (str) – The deployment token to authenticate access to created deployments. This token is only authorized to predict on deployments in this project, so it is safe to embed this model inside of an application or website.

  • deployment_id (str) – The unique identifier to a deployment created under the project.

  • query_data (dict) – The input data for the prediction.

Return type:

Dict

get_event_anomaly_score(deployment_token, deployment_id, query_data=None)

Returns an anomaly score for an event.

Parameters:
  • deployment_token (str) – The deployment token to authenticate access to created deployments. This token is only authorized to predict on deployments in this project, so it is safe to embed this model inside of an application or website.

  • deployment_id (str) – The unique identifier to a deployment created under the project.

  • query_data (dict) – The input data for the prediction.

Return type:

Dict

get_forecast(deployment_token, deployment_id, query_data, future_data=None, num_predictions=None, prediction_start=None, explain_predictions=False, explainer_type=None, get_item_data=False)

Returns a list of forecasts for a given entity under the specified project deployment. Note that the inputs to the deployed model will be the column names in your dataset mapped to the column mappings in our system (e.g. column ‘holiday_yn’ mapped to mapping ‘FUTURE’ in our system).

Parameters:
  • deployment_token (str) – The deployment token to authenticate access to created deployments. This token is only authorized to predict on deployments in this project, so it is safe to embed this model inside of an application or website.

  • deployment_id (str) – The unique identifier to a deployment created under the project.

  • query_data (dict) – This will be a dictionary where ‘Key’ will be the column name (e.g. a column with name ‘store_id’ in your dataset) mapped to the column mapping ITEM_ID that uniquely identifies the entity against which forecasting is performed and ‘Value’ will be the unique value of the same entity.

  • future_data (list) – This will be a list of values known ahead of time that are relevant for forecasting (e.g. State Holidays, National Holidays, etc.). Each element is a dictionary, where the key and the value both will be of type ‘str’. For example future data entered for a Store may be [{“Holiday”:”No”, “Promo”:”Yes”, “Date”: “2015-07-31 00:00:00”}].

  • num_predictions (int) – The number of timestamps to predict in the future.

  • prediction_start (str) – The start date for predictions (e.g., “2015-08-01T00:00:00” as input for midnight of 2015-08-01).

  • explain_predictions (bool) – Will explain predictions for forecasting

  • explainer_type (str) – Type of explainer to use for explanations

  • get_item_data (bool) – Will return the data corresponding to items in query

Return type:

Dict
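
A minimal sketch of a forecasting call, reusing the future_data format shown above; the store ID, dates, and horizon are placeholders:

    from abacusai import PredictionClient

    client = PredictionClient()

    forecast = client.get_forecast(
        deployment_token='YOUR_DEPLOYMENT_TOKEN',
        deployment_id='YOUR_DEPLOYMENT_ID',
        query_data={'store_id': 'store_42'},  # column mapped to ITEM_ID
        future_data=[
            {'Holiday': 'No', 'Promo': 'Yes', 'Date': '2015-07-31 00:00:00'},
        ],
        num_predictions=14,                    # predict 14 timestamps ahead
        prediction_start='2015-08-01T00:00:00',
    )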

get_k_nearest(deployment_token, deployment_id, vector, k=None, distance=None, include_score=False, catalog_id=None)

Returns the k nearest neighbors for the provided embedding vector.

Parameters:
  • deployment_token (str) – The deployment token to authenticate access to created deployments. This token is only authorized to predict on deployments in this project, so it is safe to embed this model inside of an application or website.

  • deployment_id (str) – The unique identifier to a deployment created under the project.

  • vector (list) – Input vector to perform the k nearest neighbors with.

  • k (int) – The number of items to return; overrides the deployment’s default.

  • distance (str) – Specify the distance function to use. Options include ‘dot’, ‘cosine’, ‘euclidean’, and ‘manhattan’. Default is ‘dot’.

  • include_score (bool) – If True, will return the score alongside the resulting embedding value.

  • catalog_id (str) – An optional parameter honored only for embeddings that provide a catalog id

Return type:

Dict
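
A minimal sketch of a k-nearest-neighbors lookup; the 4-dimensional query vector is purely illustrative and must match the dimensionality of the stored embeddings:

    from abacusai import PredictionClient

    client = PredictionClient()

    neighbors = client.get_k_nearest(
        deployment_token='YOUR_DEPLOYMENT_TOKEN',
        deployment_id='YOUR_DEPLOYMENT_ID',
        vector=[0.12, -0.45, 0.91, 0.03],
        k=10,                  # override the number of items returned
        distance='cosine',     # instead of the default 'dot'
        include_score=True,    # return scores alongside each neighbor
    )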

get_multiple_k_nearest(deployment_token, deployment_id, queries)

Returns the k nearest neighbors for the queries provided.

Parameters:
  • deployment_token (str) – The deployment token to authenticate access to created deployments. This token is only authorized to predict on deployments in this project, so it is safe to embed this model inside of an application or website.

  • deployment_id (str) – The unique identifier to a deployment created under the project.

  • queries (list) – List of mappings of format {“catalogId”: “cat0”, “vectors”: […], “k”: 20, “distance”: “euclidean”}. See get_k_nearest for additional information about the supported parameters.

get_labels(deployment_token, deployment_id, query_data, return_extracted_entities=False)

Returns a list of scored labels for a document.

Parameters:
  • deployment_token (str) – The deployment token to authenticate access to created deployments. This token is only authorized to predict on deployments in this project, so it is safe to embed this model inside of an application or website.

  • deployment_id (str) – The unique identifier to a deployment created under the project.

  • query_data (dict) – Dictionary where key is “Content” and value is the text from which entities are to be extracted.

  • return_extracted_entities (bool) – (Optional) If True, will return the extracted entities in simpler format

Return type:

Dict

get_entities_from_pdf(deployment_token, deployment_id, pdf=None, doc_id=None, return_extracted_features=False, verbose=False, save_extracted_features=None)

Extracts text from the provided PDF and returns a list of recognized labels and their scores.

Parameters:
  • deployment_token (str) – The deployment token to authenticate access to created deployments. This token is only authorized to predict on deployments in this project, so it is safe to embed this model inside of an application or website.

  • deployment_id (str) – The unique identifier to a deployment created under the project.

  • pdf (io.TextIOBase) – (Optional) The pdf to predict on. One of pdf or docId must be specified.

  • doc_id (str) – (Optional) The document ID of the pdf to predict on. One of pdf or docId must be specified.

  • return_extracted_features (bool) – (Optional) If True, will return all extracted features (e.g. all tokens in a page) from the PDF. Default is False.

  • verbose (bool) – (Optional) If True, will return all the extracted tokens probabilities for all the trained labels. Default is False.

  • save_extracted_features (bool) – (Optional) If True, will save extracted features (i.e. page tokens) so that they can be fetched using the prediction docId. Default is False.

Return type:

Dict
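
A minimal sketch of extracting entities from a PDF; ‘invoice.pdf’ is a placeholder path, and the file is opened in binary mode, which is the usual choice for PDFs despite the io.TextIOBase annotation:

    from abacusai import PredictionClient

    client = PredictionClient()

    with open('invoice.pdf', 'rb') as pdf_file:
        entities = client.get_entities_from_pdf(
            deployment_token='YOUR_DEPLOYMENT_TOKEN',
            deployment_id='YOUR_DEPLOYMENT_ID',
            pdf=pdf_file,
            save_extracted_features=True,  # allow later fetches via the prediction docId
        )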

get_recommendations(deployment_token, deployment_id, query_data, num_items=None, page=None, exclude_item_ids=None, score_field=None, scaling_factors=None, restrict_items=None, exclude_items=None, explore_fraction=None, diversity_attribute_name=None, diversity_max_results_per_value=None)

Returns a list of recommendations for a given user under the specified project deployment. Note that the inputs to this method, wherever applicable, will be the column names in your dataset mapped to the column mappings in our system (e.g. column ‘time’ mapped to mapping ‘TIMESTAMP’ in our system).

Parameters:
  • deployment_token (str) – The deployment token to authenticate access to created deployments. This token is only authorized to predict on deployments in this project, so it is safe to embed this model inside of an application or website.

  • deployment_id (str) – The unique identifier to a deployment created under the project.

  • query_data (dict) – This will be a dictionary where ‘Key’ will be the column name (e.g. a column with name ‘user_name’ in your dataset) mapped to the column mapping USER_ID that uniquely identifies the user against which recommendations are made and ‘Value’ will be the unique value of the same item. For example, if you have the column name ‘user_name’ mapped to the column mapping ‘USER_ID’, then the query must have the exact same column name (user_name) as key and the name of the user (John Doe) as value.

  • num_items (int) – The number of items to recommend on one page. By default, it is set to 50 items per page.

  • page (int) – The page number to be displayed. For example, if num_items is set to 10 and the total list contains 50 recommended items, an input value of 2 for ‘page’ will display the items ranked 11th through 20th.

  • score_field (str) – The relative item scores are returned in a separate field named with the same name as the key (score_field) for this argument.

  • scaling_factors (list) – Allows you to bias the model towards certain items. The input is a list of dictionaries, each of the form {“column”: “col0”, “values”: [“value0”, “value1”], “factor”: 1.1}. The key “column” takes the name of the column (“col0”); the key “values” takes the list of items ([“value0”, “value1”]) in reference to which the model recommendations need to be biased; and the key “factor” takes the factor by which the item scores are adjusted. For example, if the input to scaling_factors is [{“column”: “VehicleType”, “values”: [“SUV”, “Sedan”], “factor”: 1.4}], then after we apply the model to get item probabilities, for every SUV and Sedan in the list we will multiply the respective probability by 1.4 before sorting. This is particularly useful if there’s a type of item that might be less popular but you want to promote it, or there’s an item that always comes up and you want to demote it.

  • restrict_items (list) – Allows you to restrict the recommendations to certain items. The input is a list of dictionaries, each of the form {“column”: “col0”, “values”: [“value0”, “value1”, “value3”, …]}. The key “column” takes the name of the column (“col0”); the key “values” takes the list of items ([“value0”, “value1”, “value3”, …]) to which the recommendations are restricted. For example, if the input to restrict_items is [{“column”: “VehicleType”, “values”: [“SUV”, “Sedan”]}], the recommendations will be restricted to SUVs and Sedans. This type of restriction is particularly useful if there’s a list of items that you know is of use in some particular scenario and you want to restrict the recommendations only to that list.

  • exclude_items (list) – Allows you to exclude certain items from the list of recommendations. The input is a list of dictionaries, each of the form {“column”: “col0”, “values”: [“value0”, “value1”, …]}. The key “column” takes the name of the column (“col0”); the key “values” takes the list of items ([“value0”, “value1”]) to exclude from the recommendations. For example, if the input to exclude_items is [{“column”: “VehicleType”, “values”: [“SUV”, “Sedan”]}], the resulting recommendation list will exclude all SUVs and Sedans. This is particularly useful if there’s a list of items that you know is of no use in some particular scenario and you don’t want to show those items in that list.

  • explore_fraction (float) – The fraction of recommendations reserved for exploration.

  • diversity_attribute_name (str) – The item attribute column name used to ensure diversity of prediction results.

  • diversity_max_results_per_value (int) – The maximum number of results per value of diversity_attribute_name.

  • exclude_item_ids (list)

Return type:

Dict
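
A minimal sketch of a recommendations call that combines paging with a scaling-factor bias, mirroring the VehicleType example above; all identifiers are placeholders:

    from abacusai import PredictionClient

    client = PredictionClient()

    recs = client.get_recommendations(
        deployment_token='YOUR_DEPLOYMENT_TOKEN',
        deployment_id='YOUR_DEPLOYMENT_ID',
        query_data={'user_name': 'John Doe'},  # column mapped to USER_ID
        num_items=10,
        page=2,  # items ranked 11th through 20th
        # Bias SUV and Sedan scores upward by a factor of 1.4.
        scaling_factors=[
            {'column': 'VehicleType', 'values': ['SUV', 'Sedan'], 'factor': 1.4},
        ],
    )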

get_personalized_ranking(deployment_token, deployment_id, query_data, preserve_ranks=None, preserve_unknown_items=False, scaling_factors=None)

Returns a list of items with personalized promotions for a given user under the specified project deployment. Note that the inputs to this method, wherever applicable, should be the column names in the dataset mapped to the column mappings in our system (e.g. column ‘item_code’ mapped to mapping ‘ITEM_ID’ in our system).

Parameters:
  • deployment_token (str) – The deployment token to authenticate access to created deployments. This token is only authorized to predict on deployments in this project, so it is safe to embed this model in an application or website.

  • deployment_id (str) – The unique identifier to a deployment created under the project.

  • query_data (dict) – This should be a dictionary with two key-value pairs. The first pair represents a ‘Key’ where the column name (e.g. a column with name ‘user_id’ in the dataset) mapped to the column mapping USER_ID uniquely identifies the user against whom a prediction is made and a ‘Value’ which is the identifier value for that user. The second pair will have a ‘Key’ which will be the name of the column name (e.g. movie_name) mapped to ITEM_ID (unique item identifier) and a ‘Value’ which will be a list of identifiers that uniquely identifies those items.

  • preserve_ranks (list) – List of dictionaries of format {“column”: “col0”, “values”: [“value0”, “value1”]}, where the ranks of items in query_data are preserved for all the items in “col0” with values “value0” and “value1”. This option is useful when the desired items are being recommended in the desired order and the ranks for those items need to be kept unchanged during recommendation generation.

  • preserve_unknown_items (bool) – If True, any items that are unknown to the model will not be reranked, and their original position in the query will be preserved.

  • scaling_factors (list) – Allows you to bias the model towards certain items. The input is a list of dictionaries, each of the form {“column”: “col0”, “values”: [“value0”, “value1”], “factor”: 1.1}. The key “column” takes the name of the column (“col0”); the key “values” takes the list of items ([“value0”, “value1”]) in reference to which the model recommendations need to be biased; and the key “factor” takes the factor by which the item scores are adjusted. For example, if the input to scaling_factors is [{“column”: “VehicleType”, “values”: [“SUV”, “Sedan”], “factor”: 1.4}], then after we apply the model to get item probabilities, for every SUV and Sedan in the list we will multiply the respective probability by 1.4 before sorting. This is particularly useful if there’s a type of item that might be less popular but you want to promote it, or there’s an item that always comes up and you want to demote it.

Return type:

Dict

get_ranked_items(deployment_token, deployment_id, query_data, preserve_ranks=None, preserve_unknown_items=False, score_field=None, scaling_factors=None, diversity_attribute_name=None, diversity_max_results_per_value=None)

Returns a list of re-ranked items for a selected user when a list of items is required to be reranked according to the user’s preferences. Note that the inputs to this method, wherever applicable, will be the column names in your dataset mapped to the column mappings in our system (e.g. column ‘item_code’ mapped to mapping ‘ITEM_ID’ in our system).

Parameters:
  • deployment_token (str) – The deployment token to authenticate access to created deployments. This token is only authorized to predict on deployments in this project, so it is safe to embed this model inside of an application or website.

  • deployment_id (str) – The unique identifier to a deployment created under the project.

  • query_data (dict) – This will be a dictionary with two key-value pairs. The first pair represents a ‘Key’ where the column name (e.g. a column with name ‘user_id’ in your dataset) mapped to the column mapping USER_ID uniquely identifies the user against whom a prediction is made and a ‘Value’ which is the identifier value for that user. The second pair will have a ‘Key’ which will be the name of the column name (e.g. movie_name) mapped to ITEM_ID (unique item identifier) and a ‘Value’ which will be a list of identifiers that uniquely identifies those items.

  • preserve_ranks (list) – List of dictionaries of format {“column”: “col0”, “values”: [“value0”, “value1”]}, where the ranks of items in query_data are preserved for all the items in “col0” with values “value0” and “value1”. This option is useful when the desired items are being recommended in the desired order and the ranks for those items need to be kept unchanged during recommendation generation.

  • preserve_unknown_items (bool) – If True, any items that are unknown to the model will not be reranked, and their original position in the query will be preserved.

  • score_field (str) – The relative item scores are returned in a separate field named with the same name as the key (score_field) for this argument.

  • scaling_factors (list) – Allows you to bias the model towards certain items. The input is a list of dictionaries, each of the form {“column”: “col0”, “values”: [“value0”, “value1”], “factor”: 1.1}. The key “column” takes the name of the column (“col0”); the key “values” takes the list of items ([“value0”, “value1”]) in reference to which the model recommendations need to be biased; and the key “factor” takes the factor by which the item scores are adjusted. For example, if the input to scaling_factors is [{“column”: “VehicleType”, “values”: [“SUV”, “Sedan”], “factor”: 1.4}], then after we apply the model to get item probabilities, for every SUV and Sedan in the list we will multiply the respective probability by 1.4 before sorting. This is particularly useful if there is a type of item that might be less popular but you want to promote it, or there is an item that always comes up and you want to demote it.

  • diversity_attribute_name (str) – The item attribute column name used to ensure diversity of prediction results.

  • diversity_max_results_per_value (int) – The maximum number of results per value of diversity_attribute_name.

Return type:

Dict

get_related_items(deployment_token, deployment_id, query_data, num_items=None, page=None, scaling_factors=None, restrict_items=None, exclude_items=None)

Returns a list of related items for a given item under the specified project deployment. Note that the inputs to this method, wherever applicable, will be the column names in your dataset mapped to the column mappings in our system (e.g. column ‘item_code’ mapped to mapping ‘ITEM_ID’ in our system).

Parameters:
  • deployment_token (str) – The deployment token to authenticate access to created deployments. This token is only authorized to predict on deployments in this project, so it is safe to embed this model inside of an application or website.

  • deployment_id (str) – The unique identifier to a deployment created under the project.

  • query_data (dict) – This will be a dictionary where the ‘key’ will be the column name (e.g. a column with name ‘user_name’ in your dataset) mapped to the column mapping USER_ID that uniquely identifies the user against which related items are determined and the ‘value’ will be the unique value of the same item. For example, if you have the column name ‘user_name’ mapped to the column mapping ‘USER_ID’, then the query must have the exact same column name (user_name) as key and the name of the user (John Doe) as value.

  • num_items (int) – The number of items to recommend on one page. By default, it is set to 50 items per page.

  • page (int) – The page number to be displayed. For example, if num_items is set to 10 and the total list contains 50 recommended items, an input value of 2 for ‘page’ will display the items ranked 11th through 20th.

  • scaling_factors (list) – Allows you to bias the model towards certain items. The input is a list of dictionaries, each of the form {“column”: “col0”, “values”: [“value0”, “value1”], “factor”: 1.1}. The key “column” takes the name of the column (“col0”); the key “values” takes the list of items ([“value0”, “value1”]) in reference to which the model recommendations need to be biased; and the key “factor” takes the factor by which the item scores are adjusted. For example, if the input to scaling_factors is [{“column”: “VehicleType”, “values”: [“SUV”, “Sedan”], “factor”: 1.4}], then after we apply the model to get item probabilities, for every SUV and Sedan in the list we will multiply the respective probability by 1.4 before sorting. This is particularly useful if there’s a type of item that might be less popular but you want to promote it, or there’s an item that always comes up and you want to demote it.

  • restrict_items (list) – Allows you to restrict the recommendations to certain items. The input is a list of dictionaries, each of the form {“column”: “col0”, “values”: [“value0”, “value1”, “value3”, …]}. The key “column” takes the name of the column (“col0”); the key “values” takes the list of items ([“value0”, “value1”, “value3”, …]) to which the recommendations are restricted. For example, if the input to restrict_items is [{“column”: “VehicleType”, “values”: [“SUV”, “Sedan”]}], the recommendations will be restricted to SUVs and Sedans. This type of restriction is particularly useful if there’s a list of items that you know is of use in some particular scenario and you want to restrict the recommendations only to that list.

  • exclude_items (list) – It allows you to exclude certain items from the list of recommendations. The input to this argument is a list of dictionaries where the format of each dictionary is as follows: {“column”: “col0”, “values”: [“value0”, “value1”, …]}. The key, “column” takes the name of the column, “col0”; the key, “values” takes the list of items, “[“value0”, “value1”]” to exclude from the recommendations. Let’s take an example where the input to exclude_items is [{“column”: “VehicleType”, “values”: [“SUV”, “Sedan”]}]. The resulting recommendation list will exclude all SUVs and Sedans. This is particularly useful if there’s a list of items that you know is of no use in some particular scenario and you don’t want to show those items present in that list.

Return type:

Dict

get_chat_response(deployment_token, deployment_id, messages, llm_name=None, num_completion_tokens=None, system_message=None, temperature=0.0, filter_key_values=None, search_score_cutoff=None, chat_config=None)

Return a chat response which continues the conversation based on the input messages and search results.

Parameters:
  • deployment_token (str) – The deployment token to authenticate access to created deployments. This token is only authorized to predict on deployments in this project, so it is safe to embed this model inside of an application or website.

  • deployment_id (str) – The unique identifier to a deployment created under the project.

  • messages (list) – A list of chronologically ordered messages, starting with a user message and alternating sources. A message is a dict with attributes: is_user (bool): Whether the message is from the user. text (str): The message’s text.

  • llm_name (str) – Name of the specific LLM backend to use to power the chat experience

  • num_completion_tokens (int) – Default for maximum number of tokens for chat answers

  • system_message (str) – The generative LLM system message

  • temperature (float) – The generative LLM temperature

  • filter_key_values (dict) – A dictionary mapping column names to a list of values to restrict the retrieved search results.

  • search_score_cutoff (float) – Cutoff for the document retriever score. Matching search results below this score will be ignored.

  • chat_config (dict) – A dictionary specifying the query chat config override.

Return type:

Dict
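
A minimal sketch of a single-turn chat call, showing the expected message format; the question and cutoff value are placeholders:

    from abacusai import PredictionClient

    client = PredictionClient()

    # Messages are chronologically ordered dicts with 'is_user' and 'text'.
    response = client.get_chat_response(
        deployment_token='YOUR_DEPLOYMENT_TOKEN',
        deployment_id='YOUR_DEPLOYMENT_ID',
        messages=[
            {'is_user': True, 'text': 'What is our refund policy?'},
        ],
        temperature=0.0,
        search_score_cutoff=0.5,  # ignore weakly matching search results
    )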

get_chat_response_with_binary_data(deployment_token, deployment_id, messages, llm_name=None, num_completion_tokens=None, system_message=None, temperature=0.0, filter_key_values=None, search_score_cutoff=None, chat_config=None, attachments=None)

Return a chat response which continues the conversation based on the input messages and search results.

Parameters:
  • deployment_token (str) – The deployment token to authenticate access to created deployments. This token is only authorized to predict on deployments in this project, so it is safe to embed this model inside of an application or website.

  • deployment_id (str) – The unique identifier to a deployment created under the project.

  • messages (list) – A list of chronologically ordered messages, starting with a user message and alternating sources. A message is a dict with attributes: is_user (bool): Whether the message is from the user. text (str): The message’s text.

  • llm_name (str) – Name of the specific LLM backend to use to power the chat experience

  • num_completion_tokens (int) – Default for maximum number of tokens for chat answers

  • system_message (str) – The generative LLM system message

  • temperature (float) – The generative LLM temperature

  • filter_key_values (dict) – A dictionary mapping column names to a list of values to restrict the retrieved search results.

  • search_score_cutoff (float) – Cutoff for the document retriever score. Matching search results below this score will be ignored.

  • chat_config (dict) – A dictionary specifying the query chat config override.

  • attachments (None) – A dictionary of binary data to use to answer the queries.

Return type:

Dict

get_conversation_response(deployment_id, message, deployment_token, deployment_conversation_id=None, external_session_id=None, llm_name=None, num_completion_tokens=None, system_message=None, temperature=0.0, filter_key_values=None, search_score_cutoff=None, chat_config=None, doc_infos=None)

Return a conversation response which continues the conversation based on the input message and deployment conversation id (if exists).

Parameters:
  • deployment_id (str) – The unique identifier to a deployment created under the project.

  • message (str) – A message from the user

  • deployment_token (str) – A token used to authenticate access to deployments created in this project. This token is only authorized to predict on deployments in this project, so it is safe to embed this model inside of an application or website.

  • deployment_conversation_id (str) – The unique identifier of a deployment conversation to continue. If not specified, a new one will be created.

  • external_session_id (str) – The user-supplied unique identifier of a deployment conversation to continue. If specified, we will use this instead of an internal deployment conversation id.

  • llm_name (str) – Name of the specific LLM backend to use to power the chat experience

  • num_completion_tokens (int) – Default for maximum number of tokens for chat answers

  • system_message (str) – The generative LLM system message

  • temperature (float) – The generative LLM temperature

  • filter_key_values (dict) – A dictionary mapping column names to a list of values to restrict the retrieved search results.

  • search_score_cutoff (float) – Cutoff for the document retriever score. Matching search results below this score will be ignored.

  • chat_config (dict) – A dictionary specifying the query chat config override.

  • doc_infos (list) – An optional list of documents to use for the conversation. A keyword ‘doc_id’ is expected to be present in each document for retrieving contents from the docstore.

Return type:

Dict
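
A minimal sketch of a two-turn conversation; the first call omits the conversation id, so a new conversation is created, and the follow-up reuses it. The response key name used below is an assumption; inspect the returned Dict in your environment:

    from abacusai import PredictionClient

    client = PredictionClient()

    first = client.get_conversation_response(
        deployment_id='YOUR_DEPLOYMENT_ID',
        message='What is our refund policy?',
        deployment_token='YOUR_DEPLOYMENT_TOKEN',
    )

    follow_up = client.get_conversation_response(
        deployment_id='YOUR_DEPLOYMENT_ID',
        message='Does it apply to sale items?',
        deployment_token='YOUR_DEPLOYMENT_TOKEN',
        # Assumed key name for the conversation id in the response.
        deployment_conversation_id=first.get('deploymentConversationId'),
    )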

get_conversation_response_with_binary_data(deployment_id, deployment_token, message, deployment_conversation_id=None, external_session_id=None, llm_name=None, num_completion_tokens=None, system_message=None, temperature=0.0, filter_key_values=None, search_score_cutoff=None, chat_config=None, attachments=None)

Return a conversation response which continues the conversation based on the input message and deployment conversation id (if exists).

Parameters:
  • deployment_id (str) – The unique identifier to a deployment created under the project.

  • deployment_token (str) – A token used to authenticate access to deployments created in this project. This token is only authorized to predict on deployments in this project, so it is safe to embed this model inside of an application or website.

  • message (str) – A message from the user

  • deployment_conversation_id (str) – The unique identifier of a deployment conversation to continue. If not specified, a new one will be created.

  • external_session_id (str) – The user-supplied unique identifier of a deployment conversation to continue. If specified, we will use this instead of an internal deployment conversation id.

  • llm_name (str) – Name of the specific LLM backend to use to power the chat experience

  • num_completion_tokens (int) – Default for maximum number of tokens for chat answers

  • system_message (str) – The generative LLM system message

  • temperature (float) – The generative LLM temperature

  • filter_key_values (dict) – A dictionary mapping column names to a list of values to restrict the retrieved search results.

  • search_score_cutoff (float) – Cutoff for the document retriever score. Matching search results below this score will be ignored.

  • chat_config (dict) – A dictionary specifying the query chat config override.

  • attachments (None) – A dictionary of binary data to use to answer the queries.

Return type:

Dict

get_search_results(deployment_token, deployment_id, query_data, num=15)

Return the most relevant search results to the search query from the uploaded documents.

Parameters:
  • deployment_token (str) – A token used to authenticate access to created deployments. This token is only authorized to predict on deployments in this project, so it can be securely embedded in an application or website.

  • deployment_id (str) – A unique identifier of a deployment created under the project.

  • query_data (dict) – A dictionary where the key is “Content” and the value is the text from which entities are to be extracted.

  • num (int) – Number of search results to return.

Return type:

Dict

get_sentiment(deployment_token, deployment_id, document)

Predicts sentiment on a document

Parameters:
  • deployment_token (str) – A token used to authenticate access to deployments created in this project. This token is only authorized to predict on deployments in this project, so it is safe to embed this model inside of an application or website.

  • deployment_id (str) – A unique string identifier for a deployment created under this project.

  • document (str) – The document to be analyzed for sentiment.

Return type:

Dict

get_entailment(deployment_token, deployment_id, document)

Predicts the classification of the document

Parameters:
  • deployment_token (str) – The deployment token used to authenticate access to created deployments. This token is only authorized to predict on deployments in this project, so it is safe to embed this model inside of an application or website.

  • deployment_id (str) – A unique string identifier for the deployment created under the project.

  • document (str) – The document to be classified.

Return type:

Dict

get_classification(deployment_token, deployment_id, document)

Predicts the classification of the document

Parameters:
  • deployment_token (str) – The deployment token used to authenticate access to created deployments. This token is only authorized to predict on deployments in this project, so it is safe to embed this model inside of an application or website.

  • deployment_id (str) – A unique string identifier for the deployment created under the project.

  • document (str) – The document to be classified.

Return type:

Dict

get_summary(deployment_token, deployment_id, query_data)

Returns a JSON of the predicted summary for the given document. Note that the inputs to this method, wherever applicable, will be the column names in your dataset mapped to the column mappings in our system (e.g. column ‘text’ mapped to mapping ‘DOCUMENT’ in our system).

Parameters:
  • deployment_token (str) – The deployment token to authenticate access to created deployments. This token is only authorized to predict on deployments in this project, so it is safe to embed this model inside of an application or website.

  • deployment_id (str) – The unique identifier to a deployment created under the project.

  • query_data (dict) – Raw data dictionary containing the required document data - must have a key ‘document’ corresponding to a DOCUMENT type text as value.

Return type:

Dict

predict_language(deployment_token, deployment_id, query_data)

Predicts the language of the text

Parameters:
  • deployment_token (str) – The deployment token used to authenticate access to created deployments. This token is only authorized to predict on deployments within this project, making it safe to embed this model in an application or website.

  • deployment_id (str) – A unique string identifier for a deployment created under the project.

  • query_data (str) – The input string to detect.

Return type:

Dict

get_assignments(deployment_token, deployment_id, query_data, forced_assignments=None, solve_time_limit_seconds=None, include_all_assignments=False)

Get all positive assignments that match a query.

Parameters:
  • deployment_token (str) – The deployment token used to authenticate access to created deployments. This token is only authorized to predict on deployments in this project, so it can be safely embedded in an application or website.

  • deployment_id (str) – The unique identifier of a deployment created under the project.

  • query_data (dict) – Specifies the set of assignments being requested. The value for each key can be: 1. A simple scalar value, which is matched exactly; 2. A list of values, which matches any element in the list; 3. A dictionary with keys lower_in/lower_ex and upper_in/upper_ex, which matches values in an inclusive/exclusive range.

  • forced_assignments (dict) – Set of assignments to force and resolve before returning query results.

  • solve_time_limit_seconds (float) – Maximum time in seconds to spend solving the query.

  • include_all_assignments (bool) – If True, will return all assignments, including assignments with value 0. Default is False.

Return type:

Dict
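
A minimal sketch of an assignments query demonstrating the three value forms described above (exact scalar, list, and range); the column names ‘driver_id’ and ‘shift_date’ are hypothetical:

    from abacusai import PredictionClient

    client = PredictionClient()

    assignments = client.get_assignments(
        deployment_token='YOUR_DEPLOYMENT_TOKEN',
        deployment_id='YOUR_DEPLOYMENT_ID',
        query_data={
            'driver_id': ['d1', 'd2'],                 # match any of these
            'shift_date': {'lower_in': '2024-01-01',   # inclusive lower bound
                           'upper_ex': '2024-02-01'},  # exclusive upper bound
        },
        solve_time_limit_seconds=30.0,
    )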

get_alternative_assignments(deployment_token, deployment_id, query_data, add_constraints=None, solve_time_limit_seconds=None, best_alternate_only=False)

Get alternative positive assignments for given query. Optimal assignments are ignored and the alternative assignments are returned instead.

Parameters:
  • deployment_token (str) – The deployment token used to authenticate access to created deployments. This token is only authorized to predict on deployments in this project, so it can be safely embedded in an application or website.

  • deployment_id (str) – The unique identifier of a deployment created under the project.

  • query_data (dict) – Specifies the set of assignments being requested. The value for each key can be: 1. A simple scalar value, which is matched exactly; 2. A list of values, which matches any element in the list; 3. A dictionary with keys lower_in/lower_ex and upper_in/upper_ex, which matches values in an inclusive/exclusive range.

  • add_constraints (list) – List of constraints dict to apply to the query. The constraint dict should have the following keys: 1. query (dict): Specifies the set of assignment variables involved in the constraint. The format is same as query_data. 2. operator (str): Constraint operator ‘=’ or ‘<=’ or ‘>=’. 3. constant (int): Constraint RHS constant value. 4. coefficient_column (str): Column in Assignment feature group to be used as coefficient for the assignment variables, optional and defaults to 1

  • solve_time_limit_seconds (float) – Maximum time in seconds to spend solving the query.

  • best_alternate_only (bool) – When True, only the best alternate will be returned; when False, multiple alternates are returned.

Return type:

Dict

check_constraints(deployment_token, deployment_id, query_data)

Check for any constraints violated by the overrides.

Parameters:
  • deployment_token (str) – The deployment token used to authenticate access to created deployments. This token is only authorized to predict on deployments in this project, so it is safe to embed this model within an application or website.

  • deployment_id (str) – The unique identifier for a deployment created under the project.

  • query_data (dict) – Assignment overrides to the solution.

Return type:

Dict

predict_with_binary_data(deployment_token, deployment_id, blob)

Make predictions for a given blob, e.g. image, audio

Parameters:
  • deployment_token (str) – A token used to authenticate access to created deployments. This token is only authorized to predict on deployments in this project, so it is safe to embed this model in an application or website.

  • deployment_id (str) – A unique identifier to a deployment created under the project.

  • blob (io.TextIOBase) – The multipart/form-data of the data.

Return type:

Dict
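
A minimal sketch of a binary prediction call; ‘photo.png’ is a placeholder path, and the file is opened in binary mode so that it can be sent as multipart/form-data:

    from abacusai import PredictionClient

    client = PredictionClient()

    with open('photo.png', 'rb') as image_file:
        result = client.predict_with_binary_data(
            deployment_token='YOUR_DEPLOYMENT_TOKEN',
            deployment_id='YOUR_DEPLOYMENT_ID',
            blob=image_file,
        )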

describe_image(deployment_token, deployment_id, image, categories, top_n=None)

Describe the similarity between an image and a list of categories.

Parameters:
  • deployment_token (str) – Authentication token to access created deployments. This token is only authorized to predict on deployments in the current project, and can be safely embedded in an application or website.

  • deployment_id (str) – Unique identifier of a deployment created under the project.

  • image (io.TextIOBase) – Image to describe.

  • categories (list) – List of candidate categories to compare with the image.

  • top_n (int) – Return the N most similar categories.

Return type:

Dict

get_text_from_document(deployment_token, deployment_id, document=None, adjust_doc_orientation=False, save_predicted_pdf=False, save_extracted_features=False)

Generate text from a document

Parameters:
  • deployment_token (str) – Authentication token to access created deployments. This token is only authorized to predict on deployments in the current project, and can be safely embedded in an application or website.

  • deployment_id (str) – Unique identifier of a deployment created under the project.

  • document (io.TextIOBase) – Input document which can be an image, pdf, or word document (Some formats might not be supported yet)

  • adjust_doc_orientation (bool) – (Optional) whether to detect the document page orientation and rotate it if needed.

  • save_predicted_pdf (bool) – (Optional) If True, will save the predicted pdf bytes so that they can be fetched using the prediction docId. Default is False.

  • save_extracted_features (bool) – (Optional) If True, will save extracted features (i.e. page tokens) so that they can be fetched using the prediction docId. Default is False.

Return type:

Dict

transcribe_audio(deployment_token, deployment_id, audio)

Transcribe the audio

Parameters:
  • deployment_token (str) – The deployment token to authenticate access to created deployments. This token is only authorized to make predictions on deployments in this project, so it can be safely embedded in an application or website.

  • deployment_id (str) – The unique identifier of a deployment created under the project.

  • audio (io.TextIOBase) – The audio to transcribe.

Return type:

Dict

classify_image(deployment_token, deployment_id, image=None, doc_id=None)

Classify an image.

Parameters:
  • deployment_token (str) – A deployment token to authenticate access to created deployments. This token is only authorized to predict on deployments in this project, so it is safe to embed this model inside of an application or website.

  • deployment_id (str) – A unique string identifier to a deployment created under the project.

  • image (io.TextIOBase) – The binary data of the image to classify. One of image or doc_id must be specified.

  • doc_id (str) – The document ID of the image. One of image or doc_id must be specified.

Return type:

Dict

classify_pdf(deployment_token, deployment_id, pdf=None)

Returns a classification prediction from a PDF

Parameters:
  • deployment_token (str) – The deployment token to authenticate access to created deployments. This token is only authorized to predict on deployments in this project, so it is safe to embed this model within an application or website.

  • deployment_id (str) – The unique identifier for a deployment created under the project.

  • pdf (io.TextIOBase) – (Optional) The pdf to predict on.

Return type:

Dict

get_cluster(deployment_token, deployment_id, query_data)

Predicts the cluster for given data.

Parameters:
  • deployment_token (str) – The deployment token used to authenticate access to created deployments. This token is only authorized to predict on deployments in this project, so it is safe to embed this model inside of an application or website.

  • deployment_id (str) – A unique string identifier for the deployment created under the project.

  • query_data (dict) – A dictionary where each ‘key’ represents a column name and its corresponding ‘value’ represents the value of that column. For Timeseries Clustering, the ‘key’ should be ITEM_ID, and its value should represent a unique item ID that needs clustering.

Return type:

Dict

get_objects_from_image(deployment_token, deployment_id, image)

Detects objects in an image.

Parameters:
  • deployment_token (str) – A deployment token to authenticate access to created deployments. This token is only authorized to predict on deployments in this project, so it is safe to embed this model inside of an application or website.

  • deployment_id (str) – A unique string identifier to a deployment created under the project.

  • image (io.TextIOBase) – The binary data of the image to detect objects from.

Return type:

Dict

score_image(deployment_token, deployment_id, image)

Score on image.

Parameters:
  • deployment_token (str) – A deployment token to authenticate access to created deployments. This token is only authorized to predict on deployments in this project, so it is safe to embed this model inside of an application or website.

  • deployment_id (str) – A unique string identifier to a deployment created under the project.

  • image (io.TextIOBase) – The binary data of the image to get the score.

Return type:

Dict

transfer_style(deployment_token, deployment_id, source_image, style_image)

Change the source image to adopt the visual style from the style image.

Parameters:
  • deployment_token (str) – A token used to authenticate access to created deployments. This token is only authorized to predict on deployments in this project, so it is safe to embed this model in an application or website.

  • deployment_id (str) – A unique identifier to a deployment created under the project.

  • source_image (io.TextIOBase) – The source image to which the style will be applied.

  • style_image (io.TextIOBase) – The image that has the style as a reference.

Return type:

io.BytesIO

generate_image(deployment_token, deployment_id, query_data)

Generate an image from text prompt.

Parameters:
  • deployment_token (str) – The deployment token used to authenticate access to created deployments. This token is only authorized to predict on deployments in this project, so it is safe to embed this model within an application or website.

  • deployment_id (str) – A unique identifier to a deployment created under the project.

  • query_data (dict) – Specifies the text prompt. For example, {‘prompt’: ‘a cat’}

Return type:

io.BytesIO

execute_agent(deployment_token, deployment_id, arguments=None, keyword_arguments=None)

Executes a deployed AI agent function using the arguments as keyword arguments to the agent execute function.

Parameters:
  • deployment_token (str) – The deployment token used to authenticate access to created deployments. This token is only authorized to predict on deployments in this project, so it is safe to embed this model inside of an application or website.

  • deployment_id (str) – A unique string identifier for the deployment created under the project.

  • arguments (list) – Positional arguments to the agent execute function.

  • keyword_arguments (dict) – A dictionary where each ‘key’ represents the parameter name and its corresponding ‘value’ represents the value of that parameter for the agent execute function.

Return type:

Dict
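
A minimal sketch of invoking a deployed agent; the keyword argument names ‘city’ and ‘days’ are hypothetical and must match the parameters of your agent’s execute function:

    from abacusai import PredictionClient

    client = PredictionClient()

    result = client.execute_agent(
        deployment_token='YOUR_DEPLOYMENT_TOKEN',
        deployment_id='YOUR_DEPLOYMENT_ID',
        keyword_arguments={'city': 'Berlin', 'days': 3},
    )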

get_matrix_agent_schema(deployment_token, deployment_id, query, doc_infos=None, deployment_conversation_id=None, external_session_id=None)

Returns the schema for a matrix computation initialized by the deployed AI agent from the given user query.

Parameters:
  • deployment_token (str) – The deployment token used to authenticate access to created deployments. This token is only authorized to predict on deployments in this project, so it is safe to embed this model inside of an application or website.

  • deployment_id (str) – A unique string identifier for the deployment created under the project.

  • query (str) – User input query to initialize the matrix computation.

  • doc_infos (list) – An optional list of documents to use for constructing the matrix. A keyword ‘doc_id’ is expected to be present in each document for retrieving contents from the docstore.

  • deployment_conversation_id (str) – A unique string identifier for the deployment conversation used for the conversation.

  • external_session_id (str) – A unique string identifier for the session used for the conversation. If both deployment_conversation_id and external_session_id are not provided, a new session will be created.

Return type:

Dict

execute_conversation_agent(deployment_token, deployment_id, arguments=None, keyword_arguments=None, deployment_conversation_id=None, external_session_id=None, regenerate=False, doc_infos=None, agent_workflow_node_id=None)

Executes a deployed AI agent function using the arguments as keyword arguments to the agent execute function.

Parameters:
  • deployment_token (str) – The deployment token used to authenticate access to created deployments. This token is only authorized to predict on deployments in this project, so it is safe to embed this model inside of an application or website.

  • deployment_id (str) – A unique string identifier for the deployment created under the project.

  • arguments (list) – Positional arguments to the agent execute function.

  • keyword_arguments (dict) – A dictionary where each ‘key’ represents the parameter name and its corresponding ‘value’ represents the value of that parameter for the agent execute function.

  • deployment_conversation_id (str) – A unique string identifier for the deployment conversation used for the conversation.

  • external_session_id (str) – A unique string identifier for the session used for the conversation. If both deployment_conversation_id and external_session_id are not provided, a new session will be created.

  • regenerate (bool) – If True, will regenerate the response from the last query.

  • doc_infos (list) – An optional list of documents to use for the conversation. A keyword ‘doc_id’ is expected to be present in each document for retrieving contents from the docstore.

  • agent_workflow_node_id (str) – An optional agent workflow node id to trigger agent execution from an intermediate node.

Return type:

Dict

lookup_matches(deployment_token, deployment_id, data=None, filters=None, num=None, result_columns=None, max_words=None, num_retrieval_margin_words=None, max_words_per_chunk=None, score_multiplier_column=None, min_score=None, required_phrases=None, filter_clause=None, crowding_limits=None, include_text_search=False)

Lookup document retrievers and return the matching documents from the document retriever deployed with given query.

Original documents are split into chunks and stored in the document retriever. This lookup function returns the relevant chunks from the document retriever. The returned chunks may be expanded to include more words from the original documents, and merged if they overlap, where permitted by the provided settings. The returned chunks are sorted by relevance.

Parameters:
  • deployment_token (str) – The deployment token used to authenticate access to created deployments. This token is only authorized to predict on deployments within this project, making it safe to embed this model in an application or website.

  • deployment_id (str) – A unique string identifier for the deployment created under the project.

  • data (str) – The query to search for.

  • filters (dict) – A dictionary mapping column names to a list of values to restrict the retrieved search results.

  • num (int) – If provided, will limit the number of results to the value specified.

  • result_columns (list) – If provided, will limit the column properties present in each result to those specified in this list.

  • max_words (int) – If provided, will limit the total number of words in the results to the value specified.

  • num_retrieval_margin_words (int) – If provided, will add this number of words from left and right of the returned chunks.

  • max_words_per_chunk (int) – If provided, will limit the number of words in each chunk to the value specified. If the value provided is smaller than the actual size of a chunk on disk, which is determined during document retriever creation, the actual chunk size will be used. That is, chunks looked up from document retrievers will not be split into smaller chunks during lookup because of this setting.

  • score_multiplier_column (str) – If provided, will use the values in this column to modify the relevance score of the returned chunks. Values in this column must be numeric.

  • min_score (float) – If provided, will filter out the results with score less than the value specified.

  • required_phrases (list) – If provided, each result will contain at least one of the phrases in the given list. The matching is whitespace and case insensitive.

  • filter_clause (str) – If provided, filter the results of the query using this sql where clause.

  • crowding_limits (dict) – A dictionary mapping metadata columns to the maximum number of results per unique value of the column, used to ensure diversity of metadata attribute values in the results. Once a particular attribute value reaches its maximum count, further results with that same attribute value are excluded from the final result set. An entry in the map can also itself be a map, specifying a limit per attribute value rather than a single limit for all values. If an attribute value is not present in such a map, its limit defaults to zero.

  • include_text_search (bool) – If true, combine the ranking of results from a BM25 text search over the documents with the vector search using reciprocal rank fusion. It leverages both lexical and semantic matching for better overall results. It’s particularly valuable in professional, technical, or specialized fields where both precision in terminology and understanding of context are important.

Returns:

The relevant documentation results found from the document retriever.

Return type:

list[DocumentRetrieverLookupResult]
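
A minimal lookup sketch, assuming an authenticated abacusai.ApiClient; the token, IDs, and the ‘language’ metadata column are placeholders:

    from abacusai import ApiClient

    client = ApiClient('YOUR_API_KEY')  # placeholder API key

    results = client.lookup_matches(
        deployment_token='DEPLOYMENT_TOKEN',  # placeholder deployment token
        deployment_id='DEPLOYMENT_ID',        # placeholder deployment ID
        data='how do I rotate my access keys?',
        num=5,                                # return at most five chunks
        max_words=1000,                       # cap total words across results
        filters={'language': ['en']},         # hypothetical metadata column
        include_text_search=True,             # fuse BM25 with vector search
    )
    for match in results:                     # list[DocumentRetrieverLookupResult]
        print(match)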

get_completion(deployment_token, deployment_id, prompt)

Returns the finetuned LLM generated completion of the prompt.

Parameters:
  • deployment_token (str) – The deployment token to authenticate access to created deployments. This token is only authorized to predict on deployments in this project, so it is safe to embed this model inside of an application or website.

  • deployment_id (str) – The unique identifier to a deployment created under the project.

  • prompt (str) – The prompt given to the finetuned LLM to generate the completion.

Return type:

Dict
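
A minimal sketch, assuming an authenticated abacusai.ApiClient with placeholder credentials:

    from abacusai import ApiClient

    client = ApiClient('YOUR_API_KEY')  # placeholder API key

    completion = client.get_completion(
        deployment_token='DEPLOYMENT_TOKEN',  # placeholder deployment token
        deployment_id='DEPLOYMENT_ID',        # placeholder deployment ID
        prompt='Write a one-line summary of last quarter in our house style.',
    )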

execute_agent_with_binary_data(deployment_token, deployment_id, arguments=None, keyword_arguments=None, deployment_conversation_id=None, external_session_id=None, blobs=None)

Executes a deployed AI agent function with binary data as inputs.

Parameters:
  • deployment_token (str) – The deployment token used to authenticate access to created deployments. This token is only authorized to predict on deployments in this project, so it is safe to embed this model inside of an application or website.

  • deployment_id (str) – A unique string identifier for the deployment created under the project.

  • arguments (list) – Positional arguments to the agent execute function.

  • keyword_arguments (dict) – A dictionary where each ‘key’ represents the parameter name and its corresponding ‘value’ represents the value of that parameter for the agent execute function.

  • deployment_conversation_id (str) – A unique string identifier for the deployment conversation used for the conversation.

  • external_session_id (str) – A unique string identifier for the session used for the conversation. If both deployment_conversation_id and external_session_id are not provided, a new session will be created.

  • blobs (dict) – A dictionary of binary data to use as inputs to the agent execute function.

Returns:

The result of the agent execution

Return type:

AgentDataExecutionResult
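
A minimal sketch showing how binary inputs are passed via blobs; the client setup, IDs, file name, and the ‘task’ argument are placeholders:

    from abacusai import ApiClient

    client = ApiClient('YOUR_API_KEY')  # placeholder API key

    with open('invoice.pdf', 'rb') as f:  # hypothetical local file
        result = client.execute_agent_with_binary_data(
            deployment_token='DEPLOYMENT_TOKEN',  # placeholder deployment token
            deployment_id='DEPLOYMENT_ID',        # placeholder deployment ID
            keyword_arguments={'task': 'extract line items'},  # agent-specific
            blobs={'invoice': f.read()},          # binary inputs keyed by name
        )
    # result is an AgentDataExecutionResult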

start_autonomous_agent(deployment_token, deployment_id, deployment_conversation_id=None, arguments=None, keyword_arguments=None, save_conversations=True)

Starts a deployed Autonomous agent associated with the given deployment_conversation_id, using the arguments and keyword arguments as inputs to the execute function of the trigger node.

Parameters:
  • deployment_token (str) – The deployment token used to authenticate access to created deployments. This token is only authorized to predict on deployments in this project, making it safe to embed this model in an application or website.

  • deployment_id (str) – A unique string identifier for the deployment created under the project.

  • deployment_conversation_id (str) – A unique string identifier for the deployment conversation used for the conversation.

  • arguments (list) – Positional arguments to the agent execute function.

  • keyword_arguments (dict) – A dictionary where each ‘key’ represents the parameter name and its corresponding ‘value’ represents the value of that parameter for the agent execute function.

  • save_conversations (bool) – If true, a new conversation will be created for every run of the workflow associated with the agent.

Return type:

Dict
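
A minimal start/pause sketch, assuming an authenticated abacusai.ApiClient; the IDs, token, and trigger-node argument names are placeholders:

    from abacusai import ApiClient

    client = ApiClient('YOUR_API_KEY')  # placeholder API key

    run = client.start_autonomous_agent(
        deployment_token='DEPLOYMENT_TOKEN',  # placeholder deployment token
        deployment_id='DEPLOYMENT_ID',        # placeholder deployment ID
        keyword_arguments={'topic': 'daily digest'},  # hypothetical trigger-node input
        save_conversations=True,  # create a new conversation per workflow run
    )
    # later, pause the agent for a specific conversation:
    client.pause_autonomous_agent(
        deployment_token='DEPLOYMENT_TOKEN',
        deployment_id='DEPLOYMENT_ID',
        deployment_conversation_id='DEPLOYMENT_CONVERSATION_ID',  # placeholder
    )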

pause_autonomous_agent(deployment_token, deployment_id, deployment_conversation_id)

Pauses a deployed Autonomous agent associated with the given deployment_conversation_id.

Parameters:
  • deployment_token (str) – The deployment token used to authenticate access to created deployments. This token is only authorized to predict on deployments in this project, making it safe to embed this model in an application or website.

  • deployment_id (str) – A unique string identifier for the deployment created under the project.

  • deployment_conversation_id (str) – A unique string identifier for the deployment conversation used for the conversation.

Return type:

Dict

create_batch_prediction(deployment_id, table_name=None, name=None, global_prediction_args=None, batch_prediction_args=None, explanations=False, output_format=None, output_location=None, database_connector_id=None, database_output_config=None, refresh_schedule=None, csv_input_prefix=None, csv_prediction_prefix=None, csv_explanations_prefix=None, output_includes_metadata=None, result_input_columns=None, input_feature_groups=None)

Creates a batch prediction job description for the given deployment.

Parameters:
  • deployment_id (str) – Unique string identifier for the deployment.

  • table_name (str) – Name of the feature group table to write the results of the batch prediction. Can only be specified if outputLocation and databaseConnectorId are not specified. If tableName is specified, the outputType will be enforced as CSV.

  • name (str) – Name of the batch prediction job.

  • batch_prediction_args (BatchPredictionArgs) – Batch Prediction args specific to problem type.

  • output_format (str) – Format of the batch prediction output (CSV or JSON).

  • output_location (str) – Location to write the prediction results. If not specified, results will be stored in Abacus.AI.

  • database_connector_id (str) – Unique identifier of a Database Connection to write predictions to. Cannot be specified in conjunction with outputLocation.

  • database_output_config (dict) – Key-value pair of columns/values to write to the database connector. Only available if databaseConnectorId is specified.

  • refresh_schedule (str) – Cron-style string that describes a schedule in UTC to automatically run the batch prediction.

  • csv_input_prefix (str) – Prefix to prepend to the input columns, only applies when output format is CSV.

  • csv_prediction_prefix (str) – Prefix to prepend to the prediction columns, only applies when output format is CSV.

  • csv_explanations_prefix (str) – Prefix to prepend to the explanation columns, only applies when output format is CSV.

  • output_includes_metadata (bool) – If true, output will contain columns including prediction start time, batch prediction version, and model version.

  • result_input_columns (list) – If present, will limit result files or feature groups to only include columns present in this list.

  • input_feature_groups (dict) – A dict of {‘<feature_group_type>’: ‘<feature_group_id>’} which overrides the default input data of that type for the Batch Prediction. Default input data is the training data that was used for training the deployed model.

  • global_prediction_args (Union[dict, abacusai.api_class.BatchPredictionArgs])

  • explanations (bool)

Returns:

The batch prediction description.

Return type:

BatchPrediction
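
A minimal sketch that creates a nightly batch prediction job and starts a version; assumes an authenticated abacusai.ApiClient and that the returned BatchPrediction exposes a batch_prediction_id attribute:

    from abacusai import ApiClient

    client = ApiClient('YOUR_API_KEY')  # placeholder API key

    batch = client.create_batch_prediction(
        deployment_id='DEPLOYMENT_ID',   # placeholder deployment ID
        name='nightly-scoring',          # hypothetical job name
        output_format='CSV',
        csv_prediction_prefix='pred_',
        refresh_schedule='0 8 * * *',    # cron: daily at 08:00 UTC
    )
    version = client.start_batch_prediction(batch.batch_prediction_id)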

start_batch_prediction(batch_prediction_id)

Creates a new batch prediction version job for a given batch prediction job description.

Parameters:

batch_prediction_id (str) – The unique identifier of the batch prediction to create a new version of.

Returns:

The batch prediction version started by this method call.

Return type:

BatchPredictionVersion

update_batch_prediction(batch_prediction_id, deployment_id=None, global_prediction_args=None, batch_prediction_args=None, explanations=None, output_format=None, csv_input_prefix=None, csv_prediction_prefix=None, csv_explanations_prefix=None, output_includes_metadata=None, result_input_columns=None, name=None)

Update a batch prediction job description.

Parameters:
  • batch_prediction_id (str) – Unique identifier of the batch prediction.

  • deployment_id (str) – Unique identifier of the deployment.

  • batch_prediction_args (BatchPredictionArgs) – Batch Prediction args specific to problem type.

  • output_format (str) – If specified, sets the format of the batch prediction output (CSV or JSON).

  • csv_input_prefix (str) – Prefix to prepend to the input columns, only applies when output format is CSV.

  • csv_prediction_prefix (str) – Prefix to prepend to the prediction columns, only applies when output format is CSV.

  • csv_explanations_prefix (str) – Prefix to prepend to the explanation columns, only applies when output format is CSV.

  • output_includes_metadata (bool) – If True, output will contain columns including prediction start time, batch prediction version, and model version.

  • result_input_columns (list) – If present, will limit result files or feature groups to only include columns present in this list.

  • name (str) – If present, will rename the batch prediction.

  • global_prediction_args (Union[dict, abacusai.api_class.BatchPredictionArgs])

  • explanations (bool)

Returns:

The batch prediction.

Return type:

BatchPrediction

set_batch_prediction_file_connector_output(batch_prediction_id, output_format=None, output_location=None)

Updates the file connector output configuration of the batch prediction

Parameters:
  • batch_prediction_id (str) – The unique identifier of the batch prediction.

  • output_format (str) – The format of the batch prediction output (CSV or JSON). If not specified, the default format will be used.

  • output_location (str) – The location to write the prediction results. If not specified, results will be stored in Abacus.AI.

Returns:

The batch prediction description.

Return type:

BatchPrediction

set_batch_prediction_database_connector_output(batch_prediction_id, database_connector_id=None, database_output_config=None)

Updates the database connector output configuration of the batch prediction

Parameters:
  • batch_prediction_id (str) – Unique string identifier of the batch prediction.

  • database_connector_id (str) – Unique string identifier of a Database Connection to write predictions to.

  • database_output_config (dict) – Key-value pair of columns/values to write to the database connector.

Returns:

Description of the batch prediction.

Return type:

BatchPrediction

set_batch_prediction_feature_group_output(batch_prediction_id, table_name)

Creates a feature group and sets it as the batch prediction output.

Parameters:
  • batch_prediction_id (str) – Unique string identifier of the batch prediction.

  • table_name (str) – Name of the feature group table to create.

Returns:

Batch prediction after the output has been applied.

Return type:

BatchPrediction

set_batch_prediction_output_to_console(batch_prediction_id)

Sets the batch prediction output to the console, clearing both the file connector and database connector configurations.

Parameters:

batch_prediction_id (str) – The unique identifier of the batch prediction.

Returns:

The batch prediction description.

Return type:

BatchPrediction

set_batch_prediction_feature_group(batch_prediction_id, feature_group_type, feature_group_id=None)

Sets the batch prediction input feature group.

Parameters:
  • batch_prediction_id (str) – Unique identifier of the batch prediction.

  • feature_group_type (str) – Enum string representing the feature group type to set. The type is based on the use case under which the feature group is being created (e.g. Catalog Attributes for personalized recommendation use case).

  • feature_group_id (str) – Unique identifier of the feature group to set as input to the batch prediction.

Returns:

Description of the batch prediction.

Return type:

BatchPrediction

set_batch_prediction_dataset_remap(batch_prediction_id, dataset_id_remap)

Swaps out datasets in the training feature groups for the purposes of this batch prediction.

Parameters:
  • batch_prediction_id (str) – Unique string identifier of the batch prediction.

  • dataset_id_remap (dict) – Key/value pairs of dataset ids to be replaced during the batch prediction.

Returns:

Batch prediction object.

Return type:

BatchPrediction

delete_batch_prediction(batch_prediction_id)

Deletes a batch prediction and associated data, such as associated monitors.

Parameters:

batch_prediction_id (str) – Unique string identifier of the batch prediction.

upsert_item_embeddings(streaming_token, model_id, item_id, vector, catalog_id=None)

Upserts an embedding vector for an item ID in the given model.

Parameters:
  • streaming_token (str) – The streaming token for authenticating requests to the model.

  • model_id (str) – A unique string identifier for the model to upsert item embeddings to.

  • item_id (str) – The item id for which its embeddings will be upserted.

  • vector (list) – The embedding vector.

  • catalog_id (str) – The name of the catalog in the model to update.
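
A minimal sketch, assuming an authenticated abacusai.ApiClient; the streaming token, model ID, item ID, and vector values are placeholders, and the vector must match the embedding dimension the model expects:

    from abacusai import ApiClient

    client = ApiClient('YOUR_API_KEY')  # placeholder API key

    client.upsert_item_embeddings(
        streaming_token='STREAMING_TOKEN',  # placeholder streaming token
        model_id='MODEL_ID',                # placeholder model ID
        item_id='sku-123',                  # hypothetical item ID
        vector=[0.12, -0.08, 0.33],         # illustrative embedding values
    )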

delete_item_embeddings(streaming_token, model_id, item_ids, catalog_id=None)

Deletes KNN embeddings for a list of item IDs for a given model ID.

Parameters:
  • streaming_token (str) – The streaming token for authenticating requests to the model.

  • model_id (str) – A unique string identifier for the model from which to delete item embeddings.

  • item_ids (list) – A list of item IDs whose embeddings will be deleted.

  • catalog_id (str) – An optional name to specify which catalog in a model to update.

upsert_multiple_item_embeddings(streaming_token, model_id, upserts, catalog_id=None)

Upserts KNN embeddings for multiple item IDs in the given model.

Parameters:
  • streaming_token (str) – The streaming token for authenticating requests to the model.

  • model_id (str) – The unique string identifier of the model to upsert item embeddings to.

  • upserts (list) – A list of dictionaries of the form {‘itemId’: …, ‘vector’: […]} for each upsert.

  • catalog_id (str) – Name of the catalog in the model to update.

append_data(feature_group_id, streaming_token, data)

Appends new data into the feature group for a given lookup key recordId.

Parameters:
  • feature_group_id (str) – Unique string identifier for the streaming feature group to record data to.

  • streaming_token (str) – The streaming token for authenticating requests.

  • data (dict) – The data to record as a JSON object.

append_multiple_data(feature_group_id, streaming_token, data)

Appends new data into the feature group for a given lookup key recordId.

Parameters:
  • feature_group_id (str) – Unique string identifier of the streaming feature group to record data to.

  • streaming_token (str) – Streaming token for authenticating requests.

  • data (list) – Data to record, as a list of JSON objects.

upsert_data(feature_group_id, data, streaming_token=None, blobs=None)

Updates data in the feature group for the given lookup key record ID if the record ID is found; otherwise, inserts the data as a new row.

Parameters:
  • feature_group_id (str) – A unique string identifier of the online feature group to record data to.

  • data (dict) – The data to record, in JSON format.

  • streaming_token (str) – Optional streaming token for authenticating requests if upserting to streaming FG.

  • blobs (dict) – A dictionary of binary data used to populate file fields in the data being upserted to the streaming feature group.

Returns:

The feature group row that was upserted.

Return type:

FeatureGroupRow
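
A minimal sketch, assuming an authenticated abacusai.ApiClient; the feature group ID, token, and column names are placeholders, and data must include the feature group’s lookup key:

    from abacusai import ApiClient

    client = ApiClient('YOUR_API_KEY')  # placeholder API key

    row = client.upsert_data(
        feature_group_id='FEATURE_GROUP_ID',  # placeholder feature group ID
        data={'record_id': 'user-42', 'score': 0.87},  # hypothetical columns
        streaming_token='STREAMING_TOKEN',    # only needed for streaming feature groups
    )
    # row is the upserted FeatureGroupRow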

delete_data(feature_group_id, primary_key)

Deletes a row from the feature group given the primary key

Parameters:
  • feature_group_id (str) – The unique ID associated with the feature group.

  • primary_key (str) – The primary key value for which to delete the feature group row

describe_feature_group_row_process_by_key(deployment_id, primary_key_value)

Gets the feature group row process.

Parameters:
  • deployment_id (str) – The deployment id

  • primary_key_value (str) – The primary key value

Returns:

An object representing the feature group row process

Return type:

FeatureGroupRowProcess

list_feature_group_row_processes(deployment_id, limit=None, status=None)

Gets a list of feature group row processes.

Parameters:
  • deployment_id (str) – The deployment id for the process

  • limit (int) – The maximum number of processes to return. Defaults to None.

  • status (str) – The status of the processes to return. Defaults to None.

Returns:

A list of objects representing the feature group row processes

Return type:

list[FeatureGroupRowProcess]

get_feature_group_row_process_summary(deployment_id)

Gets a summary of the statuses of the individual feature group processes.

Parameters:

deployment_id (str) – The deployment id for the process

Returns:

An object representing the summary of the statuses of the individual feature group processes

Return type:

FeatureGroupRowProcessSummary

reset_feature_group_row_process_by_key(deployment_id, primary_key_value)

Resets a feature group row process so that it can be reprocessed

Parameters:
  • deployment_id (str) – The deployment id

  • primary_key_value (str) – The primary key value

Returns:

An object representing the feature group row process.

Return type:

FeatureGroupRowProcess

get_feature_group_row_process_logs_by_key(deployment_id, primary_key_value)

Gets the logs for a feature group row process

Parameters:
  • deployment_id (str) – The deployment id

  • primary_key_value (str) – The primary key value

Returns:

An object representing the logs for the feature group row process

Return type:

FeatureGroupRowProcessLogs

create_python_function(name, source_code=None, function_name=None, function_variable_mappings=None, package_requirements=None, function_type='FEATURE_GROUP')

Creates a custom Python function that is reusable.

Parameters:
  • name (str) – The name to identify the Python function. Must be a valid Python identifier.

  • source_code (str) – Contents of a valid Python source code file. The source code should contain the transform feature group functions. A list of allowed imports and system libraries for each language is specified in the user functions documentation section.

  • function_name (str) – The name of the Python function.

  • function_variable_mappings (List) – List of Python function arguments.

  • package_requirements (list) – List of package requirement strings. For example: [‘numpy==1.2.3’, ‘pandas>=1.4.0’].

  • function_type (str) – Type of Python function to create. Default is FEATURE_GROUP, but can also be PLOTLY_FIG.

Returns:

The Python function that can be used (e.g. for feature group transform).

Return type:

PythonFunction
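
A minimal sketch that registers a feature group transform; assumes an authenticated abacusai.ApiClient, and the function body and column names are hypothetical:

    from abacusai import ApiClient

    client = ApiClient('YOUR_API_KEY')  # placeholder API key

    SOURCE = '''
    import pandas as pd

    def add_total(df):
        # hypothetical transform: derive a total column
        df['total'] = df['price'] * df['quantity']
        return df
    '''

    fn = client.create_python_function(
        name='add_total',        # must be a valid Python identifier
        source_code=SOURCE,
        function_name='add_total',
        package_requirements=['pandas>=1.4.0'],
    )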

update_python_function(name, source_code=None, function_name=None, function_variable_mappings=None, package_requirements=None)

Update custom python function with user inputs for the given python function.

Parameters:
  • name (str) – The name to identify the Python function. Must be a valid Python identifier.

  • source_code (str) – Contents of a valid Python source code file. The source code should contain the transform feature group functions. A list of allowed imports and system libraries for each language is specified in the user functions documentation section.

  • function_name (str) – The name of the Python function within source_code.

  • function_variable_mappings (List) – List of arguments required by function_name.

  • package_requirements (list) – List of package requirement strings. For example: [‘numpy==1.2.3’, ‘pandas>=1.4.0’].

Returns:

The Python function object.

Return type:

PythonFunction

delete_python_function(name)

Removes an existing Python function.

Parameters:

name (str) – The name to identify the Python function. Must be a valid Python identifier.

create_pipeline(pipeline_name, project_id=None, cron=None, is_prod=None)

Creates a pipeline for executing multiple steps.

Parameters:
  • pipeline_name (str) – The name of the pipeline, which should be unique to the organization.

  • project_id (str) – A unique string identifier for the project the pipeline belongs to.

  • cron (str) – A cron-like string specifying the frequency of pipeline reruns.

  • is_prod (bool) – Whether the pipeline is a production pipeline or not.

Returns:

An object that describes a Pipeline.

Return type:

Pipeline

describe_pipeline(pipeline_id)

Describes a given pipeline.

Parameters:

pipeline_id (str) – The ID of the pipeline to describe.

Returns:

An object describing a Pipeline

Return type:

Pipeline

describe_pipeline_by_name(pipeline_name)

Describes a given pipeline.

Parameters:

pipeline_name (str) – The name of the pipeline to describe.

Returns:

An object describing a Pipeline

Return type:

Pipeline

update_pipeline(pipeline_id, project_id=None, pipeline_variable_mappings=None, cron=None, is_prod=None)

Updates a pipeline for executing multiple steps.

Parameters:
  • pipeline_id (str) – The ID of the pipeline to update.

  • project_id (str) – A unique string identifier for the project the pipeline belongs to.

  • pipeline_variable_mappings (List) – List of Python function arguments for the pipeline.

  • cron (str) – A cron-like string specifying the frequency of the scheduled pipeline runs.

  • is_prod (bool) – Whether the pipeline is a production pipeline or not.

Returns:

An object that describes a Pipeline.

Return type:

Pipeline

rename_pipeline(pipeline_id, pipeline_name)

Renames a pipeline.

Parameters:
  • pipeline_id (str) – The ID of the pipeline to rename.

  • pipeline_name (str) – The new name of the pipeline.

Returns:

An object that describes a Pipeline.

Return type:

Pipeline

delete_pipeline(pipeline_id)

Deletes a pipeline.

Parameters:

pipeline_id (str) – The ID of the pipeline to delete.

list_pipeline_versions(pipeline_id, limit=200)

Lists the pipeline versions for a specified pipeline

Parameters:
  • pipeline_id (str) – The ID of the pipeline to list versions for.

  • limit (int) – The maximum number of pipeline versions to return.

Returns:

A list of pipeline versions.

Return type:

list[PipelineVersion]

run_pipeline(pipeline_id, pipeline_variable_mappings=None)

Runs a specified pipeline with the arguments provided.

Parameters:
  • pipeline_id (str) – The ID of the pipeline to run.

  • pipeline_variable_mappings (List) – List of Python function arguments for the pipeline.

Returns:

The object describing the pipeline

Return type:

PipelineVersion

reset_pipeline_version(pipeline_version, steps=None, include_downstream_steps=True)

Reruns a pipeline version for the given steps and downstream steps if specified.

Parameters:
  • pipeline_version (str) – The id of the pipeline version.

  • steps (list) – List of pipeline step names to rerun.

  • include_downstream_steps (bool) – Whether to rerun downstream steps from the steps you have passed

Returns:

Object describing the pipeline version

Return type:

PipelineVersion

create_pipeline_step(pipeline_id, step_name, function_name=None, source_code=None, step_input_mappings=None, output_variable_mappings=None, step_dependencies=None, package_requirements=None, cpu_size=None, memory=None, timeout=None)

Creates a step in a given pipeline.

Parameters:
  • pipeline_id (str) – The ID of the pipeline to run.

  • step_name (str) – The name of the step.

  • function_name (str) – The name of the Python function.

  • source_code (str) – Contents of a valid Python source code file. The source code should contain the transform feature group functions. A list of allowed imports and system libraries for each language is specified in the user functions documentation section.

  • step_input_mappings (List) – List of Python function arguments.

  • output_variable_mappings (List) – List of Python function outputs.

  • step_dependencies (list) – List of step names this step depends on.

  • package_requirements (list) – List of package requirement strings. For example: [‘numpy==1.2.3’, ‘pandas>=1.4.0’].

  • cpu_size (str) – Size of the CPU for the step function.

  • memory (int) – Memory (in GB) for the step function.

  • timeout (int) – Timeout for the step in minutes, default is 300 minutes.

Returns:

Object describing the pipeline.

Return type:

Pipeline
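
A minimal sketch that creates a pipeline and adds one step; assumes an authenticated abacusai.ApiClient and that the returned Pipeline exposes a pipeline_id attribute. The names, cron schedule, and step body are hypothetical:

    from abacusai import ApiClient

    client = ApiClient('YOUR_API_KEY')  # placeholder API key

    pipeline = client.create_pipeline(
        pipeline_name='nightly_etl',   # hypothetical, unique to the organization
        project_id='PROJECT_ID',       # placeholder project ID
        cron='0 6 * * *',              # rerun daily at 06:00 UTC
    )

    STEP_SOURCE = '''
    def refresh_features():
        # hypothetical step body
        print('refreshing features')
    '''

    client.create_pipeline_step(
        pipeline_id=pipeline.pipeline_id,
        step_name='refresh_features',
        function_name='refresh_features',
        source_code=STEP_SOURCE,
        memory=16,    # GB
        timeout=60,   # minutes
    )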

delete_pipeline_step(pipeline_step_id)

Deletes a step from a pipeline.

Parameters:

pipeline_step_id (str) – The ID of the pipeline step.

update_pipeline_step(pipeline_step_id, function_name=None, source_code=None, step_input_mappings=None, output_variable_mappings=None, step_dependencies=None, package_requirements=None, cpu_size=None, memory=None, timeout=None)

Updates a step in a given pipeline.

Parameters:
  • pipeline_step_id (str) – The ID of the pipeline_step to update.

  • function_name (str) – The name of the Python function.

  • source_code (str) – Contents of a valid Python source code file. The source code should contain the transform feature group functions. A list of allowed imports and system libraries for each language is specified in the user functions documentation section.

  • step_input_mappings (List) – List of Python function arguments.

  • output_variable_mappings (List) – List of Python function outputs.

  • step_dependencies (list) – List of step names this step depends on.

  • package_requirements (list) – List of package requirement strings. For example: [‘numpy==1.2.3’, ‘pandas>=1.4.0’].

  • cpu_size (str) – Size of the CPU for the step function.

  • memory (int) – Memory (in GB) for the step function.

  • timeout (int) – Timeout for the pipeline step, default is 300 minutes.

Returns:

Object describing the pipeline.

Return type:

PipelineStep

rename_pipeline_step(pipeline_step_id, step_name)

Renames a step in a given pipeline.

Parameters:
  • pipeline_step_id (str) – The ID of the pipeline_step to update.

  • step_name (str) – The name of the step.

Returns:

Object describing the pipeline.

Return type:

PipelineStep

unset_pipeline_refresh_schedule(pipeline_id)

Deletes the refresh schedule for a given pipeline.

Parameters:

pipeline_id (str) – The id of the pipeline.

Returns:

Object describing the pipeline.

Return type:

Pipeline

pause_pipeline_refresh_schedule(pipeline_id)

Pauses the refresh schedule for a given pipeline.

Parameters:

pipeline_id (str) – The id of the pipeline.

Returns:

Object describing the pipeline.

Return type:

Pipeline

resume_pipeline_refresh_schedule(pipeline_id)

Resumes the refresh schedule for a given pipeline.

Parameters:

pipeline_id (str) – The id of the pipeline.

Returns:

Object describing the pipeline.

Return type:

Pipeline

skip_pending_pipeline_version_steps(pipeline_version)

Skips pending steps in a pipeline version.

Parameters:

pipeline_version (str) – The id of the pipeline version.

Returns:

Object describing the pipeline version

Return type:

PipelineVersion

create_graph_dashboard(project_id, name, python_function_ids=None)

Create a plot dashboard given selected python plots

Parameters:
  • project_id (str) – A unique string identifier for the project the plot dashboard belongs to.

  • name (str) – The name of the dashboard.

  • python_function_ids (List) – A list of unique string identifiers for the python functions to be used in the graph dashboard.

Returns:

An object describing the graph dashboard.

Return type:

GraphDashboard

delete_graph_dashboard(graph_dashboard_id)

Deletes a graph dashboard

Parameters:

graph_dashboard_id (str) – Unique string identifier for the graph dashboard to be deleted.

update_graph_dashboard(graph_dashboard_id, name=None, python_function_ids=None)

Updates a graph dashboard

Parameters:
  • graph_dashboard_id (str) – Unique string identifier for the graph dashboard to update.

  • name (str) – Name of the dashboard.

  • python_function_ids (List) – List of unique string identifiers for the Python functions to be used in the graph dashboard.

Returns:

An object describing the graph dashboard.

Return type:

GraphDashboard

add_graph_to_dashboard(python_function_id, graph_dashboard_id, function_variable_mappings=None, name=None)

Add a python plot function to a dashboard

Parameters:
  • python_function_id (str) – Unique string identifier for the Python function.

  • graph_dashboard_id (str) – Unique string identifier for the graph dashboard to update.

  • function_variable_mappings (List) – List of arguments to be supplied to the function as parameters, in the format [{‘name’: ‘function_argument’, ‘variable_type’: ‘FEATURE_GROUP’, ‘value’: ‘name_of_feature_group’}].

  • name (str) – Name of the added python plot

Returns:

An object describing the graph dashboard.

Return type:

GraphDashboard
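
A minimal sketch that creates a dashboard and attaches a plot; assumes an authenticated abacusai.ApiClient and that the returned GraphDashboard exposes a graph_dashboard_id attribute. The IDs and names are placeholders; the mapping format follows the parameter description above:

    from abacusai import ApiClient

    client = ApiClient('YOUR_API_KEY')  # placeholder API key

    dashboard = client.create_graph_dashboard(
        project_id='PROJECT_ID',        # placeholder project ID
        name='Sales Plots',             # hypothetical dashboard name
        python_function_ids=['PYTHON_FUNCTION_ID'],  # placeholder plot function ID
    )
    client.add_graph_to_dashboard(
        python_function_id='PYTHON_FUNCTION_ID',     # placeholder
        graph_dashboard_id=dashboard.graph_dashboard_id,
        function_variable_mappings=[{'name': 'function_argument',
                                     'variable_type': 'FEATURE_GROUP',
                                     'value': 'name_of_feature_group'}],
        name='Revenue by region',       # hypothetical plot name
    )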

update_graph_to_dashboard(graph_reference_id, function_variable_mappings=None, name=None)

Update a python plot function to a dashboard

Parameters:
  • graph_reference_id (str) – A unique string identifier for the graph reference.

  • function_variable_mappings (List) – A list of arguments to be supplied to the Python function as parameters in the format [{‘name’: ‘function_argument’, ‘variable_type’: ‘FEATURE_GROUP’, ‘value’: ‘name_of_feature_group’}].

  • name (str) – The updated name for the graph

Returns:

An object describing the graph dashboard.

Return type:

GraphDashboard

delete_graph_from_dashboard(graph_reference_id)

Deletes a python plot function from a dashboard

Parameters:

graph_reference_id (str) – Unique String Identifier for the graph

create_algorithm(name, problem_type, source_code=None, training_data_parameter_names_mapping=None, training_config_parameter_name=None, train_function_name=None, predict_function_name=None, predict_many_function_name=None, initialize_function_name=None, config_options=None, is_default_enabled=False, project_id=None, use_gpu=False, package_requirements=None)

Creates a custom algorithm that is re-usable for model training.

Parameters:
  • name (str) – The name to identify the algorithm; only uppercase letters, numbers, and underscores are allowed.

  • problem_type (str) – The type of problem this algorithm will work on.

  • source_code (str) – Contents of a valid Python source code file. The source code should contain the train/predict/predict_many/initialize functions. A list of allowed import and system libraries for each language is specified in the user functions documentation section.

  • training_data_parameter_names_mapping (dict) – The mapping from feature group types to training data parameter names in the train function.

  • training_config_parameter_name (str) – The train config parameter name in the train function.

  • train_function_name (str) – Name of the function found in the source code that will be executed to train the model. It is not executed when this API is called.

  • predict_function_name (str) – Name of the function found in the source code that will be executed to run predictions through the model. It is not executed when this API is called.

  • predict_many_function_name (str) – Name of the function found in the source code that will be executed for batch prediction of the model. It is not executed when this API is called.

  • initialize_function_name (str) – Name of the function found in the source code to initialize the trained model before using it to make predictions using the model.

  • config_options (dict) – Map dataset types and configs to train function parameter names.

  • is_default_enabled (bool) – Whether to train with the algorithm by default.

  • project_id (str) – The unique ID of the project.

  • use_gpu (bool) – Whether this algorithm needs to run on GPU.

  • package_requirements (list) – List of package requirement strings. For example: [‘numpy==1.2.3’, ‘pandas>=1.4.0’].

Returns:

The new custom algorithm that can be used for training.

Return type:

Algorithm

delete_algorithm(algorithm)

Deletes the specified customer algorithm.

Parameters:

algorithm (str) – The name of the algorithm to delete.

update_algorithm(algorithm, source_code=None, training_data_parameter_names_mapping=None, training_config_parameter_name=None, train_function_name=None, predict_function_name=None, predict_many_function_name=None, initialize_function_name=None, config_options=None, is_default_enabled=None, use_gpu=None, package_requirements=None)

Update a custom algorithm for the given algorithm name. If source code is provided, all function names for the source code must also be provided.

Parameters:
  • algorithm (str) – The name to identify the algorithm. Only uppercase letters, numbers, and underscores are allowed.

  • source_code (str) – Contents of a valid Python source code file. The source code should contain the train/predict/predict_many/initialize functions. A list of allowed imports and system libraries for each language is specified in the user functions documentation section.

  • training_data_parameter_names_mapping (dict) – The mapping from feature group types to training data parameter names in the train function.

  • training_config_parameter_name (str) – The train config parameter name in the train function.

  • train_function_name (str) – Name of the function found in the source code that will be executed to train the model. It is not executed when this API is called.

  • predict_function_name (str) – Name of the function found in the source code that will be executed to run predictions through the model. It is not executed when this API is called.

  • predict_many_function_name (str) – Name of the function found in the source code that will be executed for batch prediction of the model. It is not executed when this API is called.

  • initialize_function_name (str) – Name of the function found in the source code to initialize the trained model before using it to make predictions using the model.

  • config_options (dict) – Map dataset types and configs to train function parameter names.

  • is_default_enabled (bool) – Whether to train with the algorithm by default.

  • use_gpu (bool) – Whether this algorithm needs to run on GPU.

  • package_requirements (list) – List of package requirement strings. For example: [‘numpy==1.2.3’, ‘pandas>=1.4.0’].

Returns:

The updated custom algorithm, which can be used for training.

Return type:

Algorithm

list_builtin_algorithms(project_id, feature_group_ids, training_config=None)

Return list of built-in algorithms based on given input data and training config.

Parameters:
  • project_id (str) – Unique string identifier associated with the project.

  • feature_group_ids (List) – List of feature group IDs specifying input data.

  • training_config (TrainingConfig) – The training config to be used for model training.

Returns:

List of applicable builtin algorithms.

Return type:

list[Algorithm]

create_custom_loss_function_with_source_code(name, loss_function_type, loss_function_name, loss_function_source_code)

Registers a new custom loss function which can be used as an objective function during model training.

Parameters:
  • name (str) – A name for the loss, unique per organization. Must be 50 characters or fewer, and can contain only underscores, numbers, and uppercase letters.

  • loss_function_type (str) – The category of problems that this loss would be applicable to, e.g. REGRESSION_DL_TF, CLASSIFICATION_DL_TF, etc.

  • loss_function_name (str) – The name of the function whose full source code is passed in loss_function_source_code.

  • loss_function_source_code (str) – Python source code string of the function.

Returns:

A description of the registered custom loss function.

Return type:

CustomLossFunction
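
A minimal sketch registering a TensorFlow regression loss; assumes an authenticated abacusai.ApiClient, and the loss itself is only illustrative:

    from abacusai import ApiClient

    client = ApiClient('YOUR_API_KEY')  # placeholder API key

    LOSS_SOURCE = '''
    import tensorflow as tf

    def weighted_mse(y_true, y_pred):
        # hypothetical loss: penalize under-predictions twice as much
        err = y_true - y_pred
        return tf.reduce_mean(tf.where(err > 0, 2.0 * err ** 2, err ** 2))
    '''

    loss = client.create_custom_loss_function_with_source_code(
        name='WEIGHTED_MSE_1',          # uppercase letters, numbers, underscores only
        loss_function_type='REGRESSION_DL_TF',
        loss_function_name='weighted_mse',
        loss_function_source_code=LOSS_SOURCE,
    )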

update_custom_loss_function_with_source_code(name, loss_function_name, loss_function_source_code)

Updates a previously registered custom loss function with a new function implementation.

Parameters:
  • name (str) – Name of the registered custom loss.

  • loss_function_name (str) – Name of the function whose full source code is passed in loss_function_source_code.

  • loss_function_source_code (str) – Python source code string of the function.

Returns:

A description of the updated custom loss function.

Return type:

CustomLossFunction

delete_custom_loss_function(name)

Deletes a previously registered custom loss function.

Parameters:

name (str) – The name of the custom loss function to be deleted.

create_custom_metric(name, problem_type, custom_metric_function_name=None, source_code=None)

Registers a new custom metric which can be used as an evaluation metric for the trained model.

Parameters:
  • name (str) – A unique name for the metric, with a limit of 50 characters. Only underscores, numbers, and uppercase letters are allowed.

  • problem_type (str) – The problem type that this metric would be applicable to, e.g. REGRESSION, FORECASTING, etc.

  • custom_metric_function_name (str) – The name of the function whose full source code is passed in source_code.

  • source_code (str) – The full source code of the custom metric function. This is required if custom_metric_function_name is passed.

Returns:

The newly created custom metric.

Return type:

CustomMetric

update_custom_metric(name, custom_metric_function_name, source_code)

Updates a previously registered custom metric with a new function implementation.

Parameters:
  • name (str) – Name of the registered custom metric.

  • custom_metric_function_name (str) – Name of the function whose full source code is passed in source_code.

  • source_code (str) – Python source code string of the function.

Returns:

A description of the updated custom metric.

Return type:

CustomMetric

delete_custom_metric(name)

Deletes a previously registered custom metric.

Parameters:

name (str) – The name of the custom metric to be deleted.

create_module(name, source_code=None)

Creates a module that is reusable in customer code, e.g. in Python functions, bring-your-own algorithms, etc.

Parameters:
  • name (str) – The name to identify the module; only lowercase letters and underscores are allowed.

  • source_code (str) – Contents of a valid python source code file.

Returns:

The new module

Return type:

Module

delete_module(name)

Deletes the specified customer module.

Parameters:

name (str) – The name of the custom module to delete.

update_module(name, source_code=None)

Update the module.

Parameters:
  • name (str) – The name to identify the module.

  • source_code (str) – Contents of a valid python source code file.

Returns:

The updated module.

Return type:

Module

create_organization_secret(secret_key, value)

Creates a secret which can be accessed in functions and notebooks.

Parameters:
  • secret_key (str) – The secret key.

  • value (str) – The secret value.

Returns:

The created secret.

Return type:

OrganizationSecret

delete_organization_secret(secret_key)

Deletes a secret.

Parameters:

secret_key (str) – The secret key.

update_organization_secret(secret_key, value)

Updates a secret.

Parameters:
  • secret_key (str) – The secret key.

  • value (str) – The secret value.

Returns:

The updated secret.

Return type:

OrganizationSecret

set_natural_language_explanation(short_explanation, long_explanation, feature_group_id=None, feature_group_version=None, model_id=None)

Saves the natural language explanation of an artifact with the given ID. The artifact can be a Feature Group or a Feature Group Version.

Parameters:
  • short_explanation (str) – succinct explanation of the artifact with given ID

  • long_explanation (str) – verbose explanation of the artifact with given ID

  • feature_group_id (str) – A unique string identifier associated with the Feature Group.

  • feature_group_version (str) – A unique string identifier associated with the Feature Group Version.

  • model_id (str) – A unique string identifier associated with the Model.

create_chat_session(project_id=None, name=None)

Creates a chat session with Data Science Co-pilot.

Parameters:
  • project_id (str) – The unique project identifier this chat session belongs to

  • name (str) – The name of the chat session. Defaults to the project name.

Returns:

The chat session with Data Science Co-pilot

Return type:

ChatSession

delete_chat_message(chat_session_id, message_index)

Deletes a message in a chat session and its associated response.

Parameters:
  • chat_session_id (str) – Unique ID of the chat session.

  • message_index (int) – The index of the chat message within the UI.

export_chat_session(chat_session_id)

Exports a chat session to an HTML file

Parameters:

chat_session_id (str) – Unique ID of the chat session.

rename_chat_session(chat_session_id, name)

Renames a chat session with Data Science Co-pilot.

Parameters:
  • chat_session_id (str) – Unique ID of the chat session.

  • name (str) – The new name of the chat session.

suggest_abacus_apis(query, verbosity=1, limit=5, include_scores=False)

Suggests several Abacus APIs that are most relevant to the supplied natural language query.

Parameters:
  • query (str) – The natural language query to find Abacus APIs for

  • verbosity (int) – The verbosity level of the suggested Abacus APIs. Ranges from 0 to 2, with 0 being the least verbose and 2 being the most verbose.

  • limit (int) – The maximum number of APIs to return

  • include_scores (bool) – Whether to include the relevance scores of the suggested APIs

Returns:

A list of suggested Abacus APIs

Return type:

list[AbacusApi]

create_deployment_conversation(deployment_id=None, name=None, external_application_id=None)

Creates a deployment conversation.

Parameters:
  • deployment_id (str) – The deployment this conversation belongs to.

  • name (str) – The name of the conversation.

  • external_application_id (str) – The external application id associated with the deployment conversation.

Returns:

The deployment conversation.

Return type:

DeploymentConversation

delete_deployment_conversation(deployment_conversation_id, deployment_id=None)

Delete a Deployment Conversation.

Parameters:
  • deployment_conversation_id (str) – A unique string identifier associated with the deployment conversation.

  • deployment_id (str) – The deployment this conversation belongs to. This is required if not logged in.

clear_deployment_conversation(deployment_conversation_id=None, external_session_id=None, deployment_id=None, user_message_indices=None)

Clear the message history of a Deployment Conversation.

Parameters:
  • deployment_conversation_id (str) – A unique string identifier associated with the deployment conversation.

  • external_session_id (str) – The external session id associated with the deployment conversation.

  • deployment_id (str) – The deployment this conversation belongs to. This is required if not logged in.

  • user_message_indices (list) – Optional list of user message indices to clear. The associated bot response will also be cleared. If not provided, all messages will be cleared.

set_deployment_conversation_feedback(deployment_conversation_id, message_index, is_useful=None, is_not_useful=None, feedback=None, feedback_type=None, deployment_id=None)

Sets a deployment conversation message as useful or not useful

Parameters:
  • deployment_conversation_id (str) – A unique string identifier associated with the deployment conversation.

  • message_index (int) – The index of the deployment conversation message

  • is_useful (bool) – If true, mark the message as useful. If false, clear the useful flag.

  • is_not_useful (bool) – If true, mark the message as not useful. If false, clear the not-useful flag.

  • feedback (str) – Optional feedback on why the message is useful or not useful

  • feedback_type (str) – Optional feedback type

  • deployment_id (str) – The deployment this conversation belongs to. This is required if not logged in.

rename_deployment_conversation(deployment_conversation_id, name, deployment_id=None)

Rename a Deployment Conversation.

Parameters:
  • deployment_conversation_id (str) – A unique string identifier associated with the deployment conversation.

  • name (str) – The new name of the conversation.

  • deployment_id (str) – The deployment this conversation belongs to. This is required if not logged in.

create_app_user_group(name)

Creates a new App User Group. This User Group is used to have permissions to access the external chatbots.

Parameters:

name (str) – The name of the App User Group.

Returns:

The App User Group.

Return type:

AppUserGroup

delete_app_user_group(user_group_id)

Deletes an App User Group.

Parameters:

user_group_id (str) – The ID of the App User Group.

invite_users_to_app_user_group(user_group_id, emails)

Invite users to an App User Group. This method will send the specified email addresses an invitation link to join a specific user group.

This will allow them to use any chatbots that this user group has access to.

Parameters:
  • user_group_id (str) – The ID of the App User Group to invite the user to.

  • emails (List) – The email addresses to invite to your user group.

Returns:

The response of the invitation. This will contain the emails that were successfully invited and the emails that were not.

Return type:

ExternalInvite

add_users_to_app_user_group(user_group_id, user_emails)

Adds users to an App User Group.

Parameters:
  • user_group_id (str) – The ID of the App User Group.

  • user_emails (list) – The emails of the users to add to the App User Group.

remove_users_from_app_user_group(user_group_id, user_emails)

Removes users from an App User Group.

Parameters:
  • user_group_id (str) – The ID of the App User Group.

  • user_emails (list) – The emails of the users to remove from the App User Group.

add_app_user_group_report_permission(user_group_id)

Give the App User Group the permission to view all reports in the corresponding organization.

Parameters:

user_group_id (str) – The ID of the App User Group.

remove_app_user_group_report_permission(user_group_id)

Remove the App User Group’s permission to view all reports in the corresponding organization.

Parameters:

user_group_id (str) – The ID of the App User Group.

add_app_user_group_to_external_application(user_group_id, external_application_id)

Adds a permission for an App User Group to access an External Application.

Parameters:
  • user_group_id (str) – The ID of the App User Group.

  • external_application_id (str) – The ID of the External Application.

remove_app_user_group_from_external_application(user_group_id, external_application_id)

Removes a permission for an App User Group to access an External Application.

Parameters:
  • user_group_id (str) – The ID of the App User Group.

  • external_application_id (str) – The ID of the External Application.

create_external_application(deployment_id, name=None, description=None, logo=None, theme=None)

Creates a new External Application from an existing ChatLLM Deployment.

Parameters:
  • deployment_id (str) – The ID of the deployment to use.

  • name (str) – The name of the External Application. If not provided, the name of the deployment will be used.

  • description (str) – The description of the External Application. This will be shown to users when they access the External Application. If not provided, the description of the deployment will be used.

  • logo (str) – The logo to be displayed.

  • theme (dict) – The visual theme of the External Application.

Returns:

The newly created External Application.

Return type:

ExternalApplication

update_external_application(external_application_id, name=None, description=None, theme=None, deployment_id=None, deployment_conversation_retention_hours=None, reset_retention_policy=False)

Updates an External Application.

Parameters:
  • external_application_id (str) – The ID of the External Application.

  • name (str) – The name of the External Application.

  • description (str) – The description of the External Application. This will be shown to users when they access the External Application.

  • theme (dict) – The visual theme of the External Application.

  • deployment_id (str) – The ID of the deployment to use.

  • deployment_conversation_retention_hours (int) – The number of hours to retain the conversations for.

  • reset_retention_policy (bool) – If true, the retention policy will be removed.

Returns:

The updated External Application.

Return type:

ExternalApplication

delete_external_application(external_application_id)

Deletes an External Application.

Parameters:

external_application_id (str) – The ID of the External Application.

create_agent(project_id, function_source_code=None, agent_function_name=None, name=None, memory=None, package_requirements=[], description=None, enable_binary_input=False, evaluation_feature_group_id=None, agent_input_schema=None, agent_output_schema=None, workflow_graph=None, agent_interface=AgentInterface.DEFAULT, included_modules=None, org_level_connectors=None, user_level_connectors=None, initialize_function_name=None, initialize_function_code=None)

Creates a new AI agent using the given agent workflow graph definition.

Parameters:
  • project_id (str) – The unique ID associated with the project.

  • name (str) – The name you want your agent to have, defaults to “<Project Name> Agent”.

  • memory (int) – Overrides the default memory allocation (in GB) for the agent.

  • package_requirements (list) – A list of package requirement strings. For example: [‘numpy==1.2.3’, ‘pandas>=1.4.0’].

  • description (str) – A description of the agent, including its purpose and instructions.

  • evaluation_feature_group_id (str) – The ID of the feature group to use for evaluation.

  • workflow_graph (WorkflowGraph) – The workflow graph for the agent.

  • agent_interface (AgentInterface) – The interface that the agent will be deployed with.

  • included_modules (List) – A list of user created custom modules to include in the agent’s environment.

  • org_level_connectors (List) – A list of org level connector ids to be used by the agent.

  • user_level_connectors (Dict) – A dictionary mapping ApplicationConnectorType keys to lists of OAuth scopes. Each key represents a specific user level application connector, while the value is a list of scopes that define the permissions granted to the application.

  • initialize_function_name (str) – The name of the function to be used for initialization.

  • initialize_function_code (str) – The function code to be used for initialization.

  • function_source_code (str)

  • agent_function_name (str)

  • enable_binary_input (bool)

  • agent_input_schema (dict)

  • agent_output_schema (dict)

Returns:

The new agent.

Return type:

Agent

update_agent(model_id, function_source_code=None, agent_function_name=None, memory=None, package_requirements=None, description=None, enable_binary_input=None, agent_input_schema=None, agent_output_schema=None, workflow_graph=None, agent_interface=None, included_modules=None, org_level_connectors=None, user_level_connectors=None, initialize_function_name=None, initialize_function_code=None)

Updates an existing AI Agent. A new version of the agent will be created and published.

Parameters:
  • model_id (str) – The unique ID associated with the AI Agent to be changed.

  • memory (int) – Memory (in GB) for the agent.

  • package_requirements (list) – A list of package requirement strings. For example: [‘numpy==1.2.3’, ‘pandas>=1.4.0’].

  • description (str) – A description of the agent, including its purpose and instructions.

  • workflow_graph (WorkflowGraph) – The workflow graph for the agent.

  • agent_interface (AgentInterface) – The interface that the agent will be deployed with.

  • included_modules (List) – A list of user created custom modules to include in the agent’s environment.

  • org_level_connectors (List) – A list of org level connector ids to be used by the agent.

  • user_level_connectors (Dict) – A dictionary mapping ApplicationConnectorType keys to lists of OAuth scopes. Each key represents a specific user level application connector, while the value is a list of scopes that define the permissions granted to the application.

  • initialize_function_name (str) – The name of the function to be used for initialization.

  • initialize_function_code (str) – The function code to be used for initialization.

  • function_source_code (str)

  • agent_function_name (str)

  • enable_binary_input (bool)

  • agent_input_schema (dict)

  • agent_output_schema (dict)

Returns:

The updated agent.

Return type:

Agent

generate_agent_code(project_id, prompt, fast_mode=None)

Generates the code for defining an AI Agent

Parameters:
  • project_id (str) – The unique ID associated with the project.

  • prompt (str) – A natural language prompt which describes the agent specification. Describe what the agent will do, what inputs it will expect, and what outputs it will produce.

  • fast_mode (bool) – If True, runs a faster but slightly less accurate code generation pipeline

Return type:

list

evaluate_prompt(prompt=None, system_message=None, llm_name=None, max_tokens=None, temperature=0.0, messages=None, response_type=None, json_response_schema=None, stop_sequences=None, top_p=None)

Generate response to the prompt using the specified model.

Parameters:
  • prompt (str) – Prompt to use for generation.

  • system_message (str) – System prompt for models that support it.

  • llm_name (LLMName) – Name of the underlying LLM to be used for generation. Default is auto selection.

  • max_tokens (int) – Maximum number of tokens to generate. If set, the model will just stop generating after this token limit is reached.

  • temperature (float) – Temperature to use for generation. A higher temperature produces more non-deterministic responses, while a value of zero yields mostly deterministic responses. Default is 0.0. A range of 0.0 - 2.0 is allowed.

  • messages (list) – A list of messages to use as conversation history. For completion models like OPENAI_GPT3_5_TEXT and PALM_TEXT this should not be set. A message is a dict with the following attributes: is_user (bool): whether the message is from the user; text (str): the message’s text; attachments (list): the files attached to the message, represented as a list of dictionaries [{“doc_id”: <doc_id1>}, {“doc_id”: <doc_id2>}].

  • response_type (str) – Specifies the type of response to request from the LLM. One of ‘text’ or ‘json’. If set to ‘json’, the LLM will respond with a JSON-formatted string whose schema can be specified via json_response_schema. Defaults to ‘text’.

  • json_response_schema (dict) – A dictionary specifying the keys/schema/parameters which the LLM should adhere to in its response when response_type is ‘json’. Each parameter is mapped to a dict with the following info: type (str) (required): data type of the parameter; description (str) (required): description of the parameter; is_required (bool) (optional): whether the parameter is required. Example: json_response_schema = {‘title’: {‘type’: ‘string’, ‘description’: ‘Article title’, ‘is_required’: True}, ‘body’: {‘type’: ‘string’, ‘description’: ‘Article body’}}

  • stop_sequences (list) – Specifies the strings on which the LLM will stop generation.

  • top_p (float) – The nucleus sampling value used for this run. If set, the model will sample from the smallest set of tokens whose cumulative probability exceeds the probability top_p. Default is 1.0. A range of 0.0 - 1.0 is allowed. It is generally recommended to use either temperature sampling or nucleus sampling, but not both.

Returns:

The response from the model, raw text and parsed components.

Return type:

LlmResponse
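
For illustration, a minimal sketch of a JSON-mode call. It assumes an authenticated client (abacusai.ApiClient); the API key, prompt, and schema below are placeholders, and the attribute read at the end is an assumption about the LlmResponse object:

    from abacusai import ApiClient

    client = ApiClient('YOUR_API_KEY')  # placeholder API key
    schema = {
        'title': {'type': 'string', 'description': 'Article title', 'is_required': True},
        'body': {'type': 'string', 'description': 'Article body'},
    }
    response = client.evaluate_prompt(
        prompt='Write a short article about data pipelines.',
        system_message='You are a concise technical writer.',
        temperature=0.0,
        response_type='json',
        json_response_schema=schema,
    )
    print(response.content)  # 'content' attribute assumed; LlmResponse holds raw text and parsed components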

render_feature_groups_for_llm(feature_group_ids, token_budget=None, include_definition=True)

Encode feature groups as language model inputs.

Parameters:
  • feature_group_ids (List) – List of feature groups to be encoded.

  • token_budget (int) – Enforce a given budget for each encoded feature group.

  • include_definition (bool) – Include the definition of the feature group in the encoding.

Returns:

LLM input object comprising of information about the feature groups with given IDs.

Return type:

list[LlmInput]

generate_code_for_data_query_using_llm(query, feature_group_ids=None, external_database_schemas=None, prompt_context=None, llm_name=None, temperature=None, sql_dialect='Spark')

Generates SQL code for a natural language data query using a large language model, in an async fashion.

Parameters:
  • query (str) – The natural language query to execute. The query is converted to a SQL query using the language model.

  • feature_group_ids (List) – A list of feature group IDs that the query should be executed against.

  • external_database_schemas (List) – A list of schemas from external databases that the query should be executed against.

  • prompt_context (str) – The context message used to construct the prompt for the language model. If not provided, a default context message is used.

  • llm_name (LLMName) – The name of the language model to use. If not provided, the default language model is used.

  • temperature (float) – The temperature to use for the language model if supported. If not provided, the default temperature is used.

  • sql_dialect (str) – The SQL dialect to generate code for. The default is Spark.

Returns:

The generated SQL code.

Return type:

LlmGeneratedCode
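
A minimal sketch, assuming an authenticated client and a placeholder feature group ID; the attribute holding the SQL on the returned LlmGeneratedCode object is an assumption:

    from abacusai import ApiClient

    client = ApiClient('YOUR_API_KEY')  # placeholder API key
    generated = client.generate_code_for_data_query_using_llm(
        query='Total revenue by region for the last quarter',
        feature_group_ids=['fg_sales_123'],  # placeholder feature group ID
        sql_dialect='Spark',
    )
    print(generated.sql)  # 'sql' attribute assumed; inspect the LlmGeneratedCode object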

extract_data_using_llm(field_descriptors, document_id=None, document_text=None, llm_name=None)

Extract fields from a document using a large language model.

Parameters:
  • field_descriptors (List) – A list of fields to extract from the document.

  • document_id (str) – The ID of the document to query.

  • document_text (str) – The text of the document to query. Only used if document_id is not provided.

  • llm_name (LLMName) – The name of the language model to use. If not provided, the default language model is used.

Returns:

The response from the document query.

Return type:

ExtractedFields
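
A minimal sketch using FieldDescriptor objects (listed among this package’s classes); the field names, document text, and the data attribute on ExtractedFields are illustrative assumptions:

    from abacusai import ApiClient, FieldDescriptor

    client = ApiClient('YOUR_API_KEY')  # placeholder API key
    fields = [
        FieldDescriptor(field='invoice_number', description='The invoice number near the top'),
        FieldDescriptor(field='total_amount', description='The final total, including tax'),
    ]
    extracted = client.extract_data_using_llm(
        field_descriptors=fields,
        document_text='Invoice #12345 ... Total due: $99.50',  # placeholder document text
    )
    print(extracted.data)  # 'data' attribute assumed; inspect the ExtractedFields object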

search_web_for_llm(queries, search_providers=None, max_results=1, safe=True, fetch_content=False, max_page_tokens=8192, convert_to_markdown=True)

Access web search providers to fetch content related to the queries for use in large language model inputs.

This method can access multiple search providers and return information from them. If the provider supplies URLs for the results then this method also supports fetching the contents of those URLs, optionally converting them to markdown format, and returning them as part of the response. Set a token budget to limit the amount of content returned in the response.

Parameters:
  • queries (List) – List of queries to send to the search providers. At most 10 queries each less than 512 characters.

  • search_providers (List) – Search providers to use for the search. If not provided, a default provider is used. Supported providers: BING, GOOGLE.

  • max_results (int) – Maximum number of results to fetch per provider. Must be in [1, 100]. Defaults to 1 (I’m feeling lucky).

  • safe (bool) – Whether content safety is enabled for these search requests. Defaults to True.

  • fetch_content (bool) – If True, fetches the content from the URLs in the search results. Defaults to False.

  • max_page_tokens (int) – Maximum number of tokens to accumulate if fetching search result contents.

  • convert_to_markdown (bool) – Whether content should be converted to markdown. Defaults to True.

Returns:

Results of running the search queries.

Return type:

WebSearchResponse
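
A minimal sketch, assuming an authenticated client; the query is a placeholder:

    from abacusai import ApiClient

    client = ApiClient('YOUR_API_KEY')  # placeholder API key
    results = client.search_web_for_llm(
        queries=['current best practices for retrieval augmented generation'],
        max_results=3,
        fetch_content=True,    # also fetch page contents from result URLs
        max_page_tokens=4096,  # cap the accumulated page content
        convert_to_markdown=True,
    )
    print(results)  # inspect the WebSearchResponse for per-query results and content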

construct_agent_conversation_messages_for_llm(deployment_conversation_id=None, external_session_id=None, include_document_contents=True)

Returns conversation history in a format for LLM calls.

Parameters:
  • deployment_conversation_id (str) – Unique ID of the conversation. One of deployment_conversation_id or external_session_id must be provided.

  • external_session_id (str) – External session ID of the conversation.

  • include_document_contents (bool) – If true, include contents from uploaded documents in the generated messages.

Returns:

Contains a list of AgentConversationMessage that represents the conversation.

Return type:

AgentConversation

validate_workflow_graph(workflow_graph, agent_interface=AgentInterface.DEFAULT, package_requirements=[])

Validates the workflow graph for an AI Agent.

Parameters:
  • workflow_graph (WorkflowGraph) – The workflow graph to validate.

  • agent_interface (AgentInterface) – The interface that the agent will be deployed with.

  • package_requirements (list) – A list of package requirement strings. For example: [‘numpy==1.2.3’, ‘pandas>=1.4.0’].

Return type:

dict

extract_agent_workflow_information(workflow_graph, agent_interface=AgentInterface.DEFAULT, package_requirements=[])

Extracts the source code, ancestors, in_edges, and traversal orders of the workflow graph from the agent workflow.

Parameters:
  • workflow_graph (WorkflowGraph) – The workflow graph to extract information from.

  • agent_interface (AgentInterface) – The interface that the agent will be deployed with.

  • package_requirements (list) – A list of package requirement strings. For example: [‘numpy==1.2.3’, ‘pandas>=1.4.0’].

Return type:

dict

get_llm_app_response(llm_app_name, prompt)

Queries the specified LLM App to generate a response to the prompt. LLM Apps are LLMs tailored to achieve a specific task like code generation for a specific service’s API.

Parameters:
  • llm_app_name (str) – The name of the LLM App to use for generation.

  • prompt (str) – The prompt to use for generation.

Returns:

The response from the LLM App.

Return type:

LlmResponse

create_document_retriever(project_id, name, feature_group_id, document_retriever_config=None)

Returns a document retriever that stores embeddings for document chunks in a feature group.

Document columns in the feature group are broken into chunks. For cases with multiple document columns, chunks from all columns are combined together to form a single chunk.

Parameters:
  • project_id (str) – The ID of project that the Document Retriever is created in.

  • name (str) – The name of the Document Retriever. Can be up to 120 characters long and can only contain alphanumeric characters and underscores.

  • feature_group_id (str) – The ID of the feature group that the Document Retriever is associated with.

  • document_retriever_config (VectorStoreConfig) – The configuration, including chunk_size and chunk_overlap_fraction, for document retrieval.

Returns:

The newly created document retriever.

Return type:

DocumentRetriever
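
A minimal creation sketch, assuming an authenticated client and that VectorStoreConfig is importable from the top-level package; the IDs are placeholders:

    from abacusai import ApiClient, VectorStoreConfig

    client = ApiClient('YOUR_API_KEY')  # placeholder API key
    retriever = client.create_document_retriever(
        project_id='proj_abc',           # placeholder project ID
        name='support_docs_retriever',
        feature_group_id='fg_docs_123',  # placeholder feature group ID
        document_retriever_config=VectorStoreConfig(
            chunk_size=512,
            chunk_overlap_fraction=0.1,
        ),
    )
    print(retriever.name)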

rename_document_retriever(document_retriever_id, name)

Updates an existing document retriever.

Parameters:
  • document_retriever_id (str) – The unique ID associated with the document retriever.

  • name (str) – The name to update the document retriever with.

Returns:

The updated document retriever.

Return type:

DocumentRetriever

create_document_retriever_version(document_retriever_id, feature_group_id=None, document_retriever_config=None)

Creates a document retriever version from the latest version of the feature group that the document retriever is associated with.

Parameters:
  • document_retriever_id (str) – The unique ID associated with the document retriever to create version with.

  • feature_group_id (str) – The ID of the feature group to update the document retriever with.

  • document_retriever_config (VectorStoreConfig) – The configuration, including chunk_size and chunk_overlap_fraction, for document retrieval.

Returns:

The newly created document retriever version.

Return type:

DocumentRetrieverVersion

delete_document_retriever(vector_store_id)

Delete a Document Retriever.

Parameters:

vector_store_id (str) – A unique string identifier associated with the document retriever.

get_document_snippet(document_retriever_id, document_id, start_word_index=None, end_word_index=None)

Get a snippet from documents in the document retriever.

Parameters:
  • document_retriever_id (str) – A unique string identifier associated with the document retriever.

  • document_id (str) – The ID of the document to retrieve the snippet from.

  • start_word_index (int) – If provided, will start the snippet at the index (of words in the document) specified.

  • end_word_index (int) – If provided, will end the snippet at the index (of words in the document) specified.

Returns:

The documentation snippet found from the document retriever.

Return type:

DocumentRetrieverLookupResult

restart_document_retriever(document_retriever_id)

Restart the document retriever if it is stopped or has failed. This will start the deployment of the document retriever, but will not wait for it to be ready. You need to call wait_until_ready to wait until the deployment is ready.

Parameters:

document_retriever_id (str) – A unique string identifier associated with the document retriever.

get_relevant_snippets(doc_ids=None, blobs=None, query=None, document_retriever_config=None, honor_sentence_boundary=True, num_retrieval_margin_words=None, max_words_per_snippet=None, max_snippets_per_document=None, start_word_index=None, end_word_index=None, including_bounding_boxes=False, text=None)

Retrieves snippets relevant to a given query from specified documents. This function supports flexible input options, allowing for retrieval from a variety of data sources including document IDs, blob data, and plain text. When multiple data sources are provided, all are considered in the retrieval process. Document retrievers may be created on-the-fly to perform lookup.

Parameters:
  • doc_ids (List) – A list of document store IDs to retrieve the snippets from.

  • blobs (io.TextIOBase) – A dictionary mapping document names to the blob data.

  • query (str) – Query string to find relevant snippets in the documents.

  • document_retriever_config (VectorStoreConfig) – If provided, used to configure the retrieval steps like chunking for embeddings.

  • num_retrieval_margin_words (int) – If provided, will add this number of words from left and right of the returned snippets.

  • max_words_per_snippet (int) – If provided, will limit the number of words in each snippet to the value specified.

  • max_snippets_per_document (int) – If provided, will limit the number of snippets retrieved from each document to the value specified.

  • start_word_index (int) – If provided, will start the snippet at the index (of words in the document) specified.

  • end_word_index (int) – If provided, will end the snippet at the index (of words in the document) specified.

  • including_bounding_boxes (bool) – If true, will include the bounding boxes of the snippets if they are available.

  • text (str) – Plain text from which to retrieve snippets.

  • honor_sentence_boundary (bool)

Returns:

The snippets found from the documents.

Return type:

list[DocumentRetrieverLookupResult]
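
A minimal sketch of on-the-fly retrieval from plain text, assuming an authenticated client; the text and query are placeholders:

    from abacusai import ApiClient

    client = ApiClient('YOUR_API_KEY')  # placeholder API key
    snippets = client.get_relevant_snippets(
        text='...full plain-text policy document...',  # placeholder document text
        query='What is the refund policy?',
        max_words_per_snippet=100,
        max_snippets_per_document=3,
    )
    for snippet in snippets:
        print(snippet)  # each item is a DocumentRetrieverLookupResult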

exception abacusai.ApiException(message, http_status, exception=None, request_id=None)

Bases: Exception

Default ApiException raised by APIs

Parameters:
  • message (str) – The error message

  • http_status (int) – The HTTP status code returned by the server

  • exception (str) – The exception class raised by the server

  • request_id (str) – The request id

message
http_status
exception
request_id
__str__()

Return str(self).
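
A typical handling sketch, assuming an authenticated client; the project ID is a deliberately invalid placeholder:

    from abacusai import ApiClient, ApiException

    client = ApiClient('YOUR_API_KEY')  # placeholder API key
    try:
        client.describe_project('proj_does_not_exist')  # placeholder ID
    except ApiException as e:
        # The attributes mirror the parameters documented above.
        print(e.http_status, e.exception, e.message)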

class abacusai.ClientOptions(exception_on_404=True, server=DEFAULT_SERVER)

Options for configuring the ApiClient

Parameters:
  • exception_on_404 (bool) – If true, will raise an exception on a 404 from the server, else will return None.

  • server (str) – The default server endpoint to use for API requests

exception_on_404
server
class abacusai.ReadOnlyClient(api_key=None, server=None, client_options=None, skip_version_check=False, include_tb=False)

Bases: BaseApiClient

Abacus.AI Read Only API Client. Only contains GET methods

Parameters:
  • api_key (str) – The api key to use as authentication to the server

  • server (str) – The base server URL to send API requests to

  • client_options (ClientOptions) – Optional API client configurations

  • skip_version_check (bool) – If true, will skip checking the server’s current API version on initializing the client

  • include_tb (bool)
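
A construction sketch combining ClientOptions with the read-only client; the API key is a placeholder:

    from abacusai import ClientOptions, ReadOnlyClient

    options = ClientOptions(exception_on_404=False)  # return None instead of raising on 404
    ro_client = ReadOnlyClient(api_key='YOUR_API_KEY', client_options=options)
    for api_key in ro_client.list_api_keys():
        print(api_key)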

list_api_keys()

Lists all of the user’s API keys

Returns:

List of API Keys for the current user’s organization.

Return type:

list[ApiKey]

list_organization_users()

Retrieves a list of all platform users in the organization, including pending users who have been invited.

Returns:

An array of all the users in the organization.

Return type:

list[User]

describe_user()

Retrieve the current user’s information, such as their name, email address, and admin status.

Returns:

An object containing information about the current user.

Return type:

User

list_organization_groups()

Lists all Organization Groups

Returns:

A list of all the organization groups within this organization.

Return type:

list[OrganizationGroup]

describe_organization_group(organization_group_id)

Returns the specific organization group passed in by the user.

Parameters:

organization_group_id (str) – The unique identifier of the organization group to be described.

Returns:

Information about a specific organization group.

Return type:

OrganizationGroup

describe_webhook(webhook_id)

Describe the webhook with a given ID.

Parameters:

webhook_id (str) – Unique string identifier of the target webhook.

Returns:

The webhook with the given ID.

Return type:

Webhook

list_deployment_webhooks(deployment_id)

List all the webhooks attached to a given deployment.

Parameters:

deployment_id (str) – Unique identifier of the target deployment.

Returns:

List of the webhooks attached to the given deployment ID.

Return type:

list[Webhook]

list_use_cases()

Retrieves a list of all use cases with descriptions. Use the given mappings to specify a use case when needed.

Returns:

A list of UseCase objects describing all the use cases addressed by the platform.

Return type:

list[UseCase]

describe_problem_type(problem_type)

Describes a problem type

Parameters:

problem_type (str) – The problem type to get details on

Returns:

The problem type requirements

Return type:

ProblemType

describe_use_case_requirements(use_case)

This API call returns the feature requirements for a specified use case.

Parameters:

use_case (str) – This contains the Enum String for the use case whose dataset requirements are needed.

Returns:

The feature requirements of the use case are returned, including all the feature groups required for the use case along with their descriptions and feature mapping details.

Return type:

list[UseCaseRequirements]

describe_project(project_id)

Returns a description of a project.

Parameters:

project_id (str) – A unique string identifier for the project.

Returns:

The description of the project.

Return type:

Project

list_projects(limit=100, start_after_id=None)

Retrieves a list of all projects in the current organization.

Parameters:
  • limit (int) – The maximum length of the list of projects.

  • start_after_id (str) – The ID of the project after which the list starts.

Returns:

A list of all projects in the Organization the user is currently logged in to.

Return type:

list[Project]
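
A pagination sketch, assuming an authenticated client; the project_id attribute used as the cursor is an assumption about the Project object:

    from abacusai import ApiClient

    client = ApiClient('YOUR_API_KEY')  # placeholder API key
    projects, start_after = [], None
    while True:
        page = client.list_projects(limit=100, start_after_id=start_after)
        projects.extend(page)
        if len(page) < 100:  # a short page means no further results
            break
        start_after = page[-1].project_id  # 'project_id' attribute assumed on Project
    print(f'{len(projects)} projects found')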

get_project_feature_group_config(feature_group_id, project_id)

Gets a feature group’s project config

Parameters:
  • feature_group_id (str) – Unique string identifier for the feature group.

  • project_id (str) – Unique string identifier for the project.

Returns:

The feature group’s project configuration.

Return type:

ProjectConfig

validate_project(project_id, feature_group_ids=None)

Validates that the specified project has all required feature group types for its use case and that all required feature columns are set.

Parameters:
  • project_id (str) – The unique ID associated with the project.

  • feature_group_ids (List) – The list of feature group IDs to validate.

Returns:

The project validation. If the specified project is missing required columns or feature groups, the response includes an array of objects for each missing required feature group and the missing required features in each feature group.

Return type:

ProjectValidation

infer_feature_mappings(project_id, feature_group_id)

Infer the feature mappings for the feature group in the project based on the problem type.

Parameters:
  • project_id (str) – The unique ID associated with the project.

  • feature_group_id (str) – The unique ID associated with the feature group.

Returns:

A dict that contains the inferred feature mappings.

Return type:

InferredFeatureMappings

verify_and_describe_annotation(feature_group_id, feature_name=None, doc_id=None, feature_group_row_identifier=None)

Get the latest annotation entry for a given feature group, feature, and document along with verification information.

Parameters:
  • feature_group_id (str) – The ID of the feature group the annotation is on.

  • feature_name (str) – The name of the feature the annotation is on.

  • doc_id (str) – The ID of the primary document the annotation is on. At least one of the doc_id or feature_group_row_identifier must be provided in order to identify the correct annotation.

  • feature_group_row_identifier (str) – The key value of the feature group row the annotation is on (cast to string). Usually the feature group’s primary / identifier key value. At least one of the doc_id or feature_group_row_identifier must be provided in order to identify the correct annotation.

Returns:

The latest annotation entry for the given feature group, feature, document, and/or annotation key value. Includes the verification information.

Return type:

AnnotationEntry

get_annotations_status(feature_group_id, feature_name=None, check_for_materialization=False)

Get the status of the annotations for a given feature group and feature.

Parameters:
  • feature_group_id (str) – The ID of the feature group the annotation is on.

  • feature_name (str) – The name of the feature the annotation is on.

  • check_for_materialization (bool) – If True, check if the feature group needs to be materialized before using for annotations.

Returns:

The status of the annotations for the given feature group and feature.

Return type:

AnnotationsStatus

get_feature_group_schema(feature_group_id, project_id=None)

Returns a schema for a given FeatureGroup in a project.

Parameters:
  • feature_group_id (str) – The unique ID associated with the feature group.

  • project_id (str) – The unique ID associated with the project.

Returns:

A list of objects for each column in the specified feature group.

Return type:

list[Feature]

get_point_in_time_feature_group_creation_options()

Returns the options that can be used to generate point-in-time (PIT) features.

Returns:

List of possible generated aggregation function options.

Return type:

list[GeneratedPitFeatureConfigOption]

describe_feature_group(feature_group_id)

Describe a Feature Group.

Parameters:

feature_group_id (str) – A unique string identifier associated with the feature group.

Returns:

The feature group object.

Return type:

FeatureGroup

describe_feature_group_by_table_name(table_name)

Describe a Feature Group by its table name.

Parameters:

table_name (str) – The unique table name of the Feature Group to look up.

Returns:

The Feature Group.

Return type:

FeatureGroup

list_feature_groups(limit=100, start_after_id=None, feature_group_template_id=None, is_including_detached_from_template=False)

List all the feature groups

Parameters:
  • limit (int) – The number of feature groups to retrieve.

  • start_after_id (str) – An offset parameter to exclude all feature groups up to a specified ID.

  • feature_group_template_id (str) – If specified, limit the results to feature groups attached to this template ID.

  • is_including_detached_from_template (bool) – When feature_group_template_id is specified, include feature groups that have been detached from that template ID.

Returns:

All the feature groups in the organization, subject to the specified filters.

Return type:

list[FeatureGroup]

describe_project_feature_group(project_id, feature_group_id)

Describe a feature group associated with a project

Parameters:
  • project_id (str) – The unique ID associated with the project.

  • feature_group_id (str) – The unique ID associated with the feature group.

Returns:

The project feature group object.

Return type:

ProjectFeatureGroup

list_project_feature_groups(project_id, filter_feature_group_use=None, limit=100, start_after_id=None)

List all the feature groups associated with a project

Parameters:
  • project_id (str) – The unique ID associated with the project.

  • filter_feature_group_use (str) – If specified, only feature groups in this project with the given use are returned. Possible values are: ‘USER_CREATED’, ‘BATCH_PREDICTION_OUTPUT’.

  • limit (int) – The maximum number of feature groups to be retrieved.

  • start_after_id (str) – An offset parameter to exclude all feature groups up to a specified ID.

Returns:

All the Feature Groups in a project.

Return type:

list[ProjectFeatureGroup]

list_python_function_feature_groups(name, limit=100)

List all the feature groups associated with a python function.

Parameters:
  • name (str) – The name used to identify the Python function.

  • limit (int) – The maximum number of feature groups to be retrieved.

Returns:

All the feature groups associated with the specified Python function ID.

Return type:

list[FeatureGroup]

get_execute_feature_group_operation_result_part_count(feature_group_operation_run_id)

Gets the number of parts in the result of a feature group operation execution

Parameters:

feature_group_operation_run_id (str) – The unique ID associated with the execution.

Return type:

int

download_execute_feature_group_operation_result_part_chunk(feature_group_operation_run_id, part, offset=0, chunk_size=10485760)

Downloads a chunk of the result of a feature group operation execution

Parameters:
  • feature_group_operation_run_id (str) – The unique ID associated with the execution.

  • part (int) – The part number of the result

  • offset (int) – The offset in the part

  • chunk_size (int) – The size of the chunk

Return type:

io.BytesIO
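
A download sketch combining the part count above with chunked reads; it assumes an authenticated client, that part numbering starts at 0, and that an empty chunk marks the end of a part:

    from abacusai import ApiClient

    client = ApiClient('YOUR_API_KEY')  # placeholder API key
    run_id = 'fg_op_run_123'            # placeholder operation run ID
    with open('result.bin', 'wb') as out:
        part_count = client.get_execute_feature_group_operation_result_part_count(run_id)
        for part in range(part_count):  # 0-based part numbering assumed
            offset = 0
            while True:
                chunk = client.download_execute_feature_group_operation_result_part_chunk(
                    run_id, part, offset=offset, chunk_size=10485760)
                data = chunk.read()
                if not data:  # assumed end-of-part convention
                    break
                out.write(data)
                offset += len(data)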

update_feature_group_version_limit(feature_group_id, version_limit)

Updates the version limit for the feature group.

Parameters:
  • feature_group_id (str) – The unique ID associated with the feature group.

  • version_limit (int) – The maximum number of versions permitted for the feature group. Once this limit is exceeded, the oldest versions will be purged in a First-In-First-Out (FIFO) order.

Returns:

The updated feature group.

Return type:

FeatureGroup

get_feature_group_version_export_download_url(feature_group_export_id)

Get a link to download the feature group version.

Parameters:

feature_group_export_id (str) – Unique identifier of the Feature Group Export to get a signed URL for.

Returns:

Instance containing the download URL and expiration time for the Feature Group Export.

Return type:

FeatureGroupExportDownloadUrl

describe_feature_group_export(feature_group_export_id)

Describes a feature group export

Parameters:

feature_group_export_id (str) – Unique identifier of the feature group export.

Returns:

The feature group export object.

Return type:

FeatureGroupExport

list_feature_group_exports(feature_group_id)

Lists all of the feature group exports for the feature group

Parameters:

feature_group_id (str) – Unique identifier of the feature group

Returns:

List of feature group exports

Return type:

list[FeatureGroupExport]

get_feature_group_export_connector_errors(feature_group_export_id)

Returns a stream containing the write errors of the feature group export database connection, if any writes failed to the database connector.

Parameters:

feature_group_export_id (str) – Unique identifier of the feature group export to get the errors for.

Return type:

io.BytesIO

list_feature_group_modifiers(feature_group_id)

List the users who can modify a given feature group.

Parameters:

feature_group_id (str) – Unique string identifier of the feature group.

Returns:

Information about the modification lock status and groups/organizations added to the feature group.

Return type:

ModificationLockInfo

get_materialization_logs(feature_group_version, stdout=False, stderr=False)

Returns logs for a materialized feature group version.

Parameters:
  • feature_group_version (str) – Unique string identifier for the feature group instance to export.

  • stdout (bool) – Set to True to get info logs.

  • stderr (bool) – Set to True to get error logs.

Returns:

A function logs object.

Return type:

FunctionLogs

list_feature_group_versions(feature_group_id, limit=100, start_after_version=None)

Retrieves a list of all feature group versions for the specified feature group.

Parameters:
  • feature_group_id (str) – The unique ID associated with the feature group.

  • limit (int) – The maximum length of the returned versions.

  • start_after_version (str) – Results will start after this version.

Returns:

A list of feature group versions.

Return type:

list[FeatureGroupVersion]

describe_feature_group_version(feature_group_version)

Describe a feature group version.

Parameters:

feature_group_version (str) – The unique identifier associated with the feature group version.

Returns:

The feature group version.

Return type:

FeatureGroupVersion

get_feature_group_version_metrics(feature_group_version, selected_columns=None, include_charts=False, include_statistics=True)

Get metrics for a specific feature group version.

Parameters:
  • feature_group_version (str) – A unique string identifier associated with the feature group version.

  • selected_columns (List) – A list of columns to order first.

  • include_charts (bool) – A flag indicating whether charts should be included in the response. Default is false.

  • include_statistics (bool) – A flag indicating whether statistics should be included in the response. Default is true.

Returns:

The metrics for the specified feature group version.

Return type:

DataMetrics

get_feature_group_version_logs(feature_group_version)

Retrieves the feature group materialization logs.

Parameters:

feature_group_version (str) – The unique version ID of the feature group version.

Returns:

The logs for the specified feature group version.

Return type:

FeatureGroupVersionLogs

describe_feature_group_template(feature_group_template_id)

Describe a Feature Group Template.

Parameters:

feature_group_template_id (str) – The unique identifier of a feature group template.

Returns:

The feature group template object.

Return type:

FeatureGroupTemplate

list_feature_group_templates(limit=100, start_after_id=None, feature_group_id=None, should_include_system_templates=False)

List feature group templates, optionally scoped by the feature group that created the templates.

Parameters:
  • limit (int) – Maximum number of templates to be retrieved.

  • start_after_id (str) – Offset parameter to exclude all templates up to the specified feature group template ID.

  • feature_group_id (str) – If specified, limit to templates created from this feature group.

  • should_include_system_templates (bool) – If True, will include built-in templates.

Returns:

All the feature group templates in the organization, optionally limited to templates created from the specified feature group.

Return type:

list[FeatureGroupTemplate]

list_project_feature_group_templates(project_id, limit=100, start_after_id=None, should_include_all_system_templates=False)

List feature group templates for feature groups associated with the project.

Parameters:
  • project_id (str) – Unique string identifier to limit to templates associated with this project, e.g. templates associated with feature groups in this project.

  • limit (int) – Maximum number of templates to be retrieved.

  • start_after_id (str) – Offset parameter to exclude all templates up to the specified feature group template ID.

  • should_include_all_system_templates (bool) – If True, will include built-in templates.

Returns:

All the feature group templates associated with feature groups in the specified project.

Return type:

list[FeatureGroupTemplate]

suggest_feature_group_template_for_feature_group(feature_group_id)

Suggest values for a feature group template, based on a feature group.

Parameters:

feature_group_id (str) – Unique identifier associated with the feature group to use for suggesting values to use in the template.

Returns:

The suggested feature group template.

Return type:

FeatureGroupTemplate

get_dataset_schema(dataset_id)

Retrieves the column schema of a dataset.

Parameters:

dataset_id (str) – Unique string identifier of the dataset schema to look up.

Returns:

List of column schema definitions.

Return type:

list[DatasetColumn]

set_dataset_database_connector_config(dataset_id, database_connector_id, object_name=None, columns=None, query_arguments=None, sql_query=None)

Sets database connector config for a dataset. This method is currently only supported for streaming datasets.

Parameters:
  • dataset_id (str) – Unique String Identifier of the dataset_id.

  • database_connector_id (str) – Unique String Identifier of the Database Connector to import the dataset from.

  • object_name (str) – If applicable, the name/ID of the object in the service to query.

  • columns (str) – The columns to query from the external service object.

  • query_arguments (str) – Additional query arguments to filter the data.

  • sql_query (str) – The full SQL query to use when fetching data. If present, this parameter will override object_name, columns and query_arguments.

get_dataset_version_metrics(dataset_version, selected_columns=None, include_charts=False, include_statistics=True)

Get metrics for a specific dataset version.

Parameters:
  • dataset_version (str) – A unique string identifier associated with the dataset version.

  • selected_columns (List) – A list of columns to order first.

  • include_charts (bool) – A flag indicating whether charts should be included in the response. Default is false.

  • include_statistics (bool) – A flag indicating whether statistics should be included in the response. Default is true.

Returns:

The metrics for the specified Dataset version.

Return type:

DataMetrics

update_dataset_version_limit(dataset_id, version_limit)

Updates the version limit for the specified dataset.

Parameters:
  • dataset_id (str) – The unique ID associated with the dataset.

  • version_limit (int) – The maximum number of versions permitted for the dataset. Once this limit is exceeded, the oldest versions will be purged in a First-In-First-Out (FIFO) order.

Returns:

The updated dataset.

Return type:

Dataset

get_file_connector_instructions(bucket, write_permission=False)

Retrieves verification information to create a data connector to a cloud storage bucket.

Parameters:
  • bucket (str) – The fully-qualified URI of the storage bucket to verify.

  • write_permission (bool) – If True, instructions will include steps for allowing Abacus.AI to write to this service.

Returns:

An object with a full description of the cloud storage bucket authentication options and bucket policy. Returns an error message if the parameters are invalid.

Return type:

FileConnectorInstructions

list_database_connectors()

Retrieves a list of all database connectors along with their associated attributes.

Returns:

An object containing the database connector and its attributes.

Return type:

list[DatabaseConnector]

list_file_connectors()

Retrieves a list of all connected services in the organization and their current verification status.

Returns:

A list of cloud storage buckets connected to the organization.

Return type:

list[FileConnector]

list_database_connector_objects(database_connector_id, fetch_raw_data=False)

Lists queryable objects in the database connector.

Parameters:
  • database_connector_id (str) – Unique string identifier for the database connector.

  • fetch_raw_data (bool) – If true, return unfiltered objects.

Return type:

list

get_database_connector_object_schema(database_connector_id, object_name=None, fetch_raw_data=False)

Get the schema of an object in a database connector.

Parameters:
  • database_connector_id (str) – Unique string identifier for the database connector.

  • object_name (str) – Unique identifier for the object in the external system.

  • fetch_raw_data (bool) – If true, return unfiltered list of columns.

Returns:

The schema of the object.

Return type:

DatabaseConnectorSchema

query_database_connector(database_connector_id, query)

Runs a query in the specified database connector.

Parameters:
  • database_connector_id (str) – A unique string identifier for the database connector.

  • query (str) – The query to be run in the database connector.

Return type:

list
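
A minimal query sketch, assuming an authenticated client and a configured connector; the ID and SQL are placeholders:

    from abacusai import ApiClient

    client = ApiClient('YOUR_API_KEY')  # placeholder API key
    rows = client.query_database_connector(
        database_connector_id='dbconn_123',  # placeholder connector ID
        query='SELECT customer_id, total FROM orders LIMIT 10',
    )
    for row in rows:
        print(row)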

list_application_connectors()

Retrieves a list of all application connectors along with their associated attributes.

Returns:

A list of application connectors.

Return type:

list[ApplicationConnector]

list_application_connector_objects(application_connector_id)

Lists queryable objects in the application connector.

Parameters:

application_connector_id (str) – Unique string identifier for the application connector.

Return type:

list

get_connector_auth(service=None, application_connector_id=None, scopes=None)

Get the authentication details for a given connector. For user level connectors, the service is required. For org level connectors, the application_connector_id is required.

Parameters:
  • service (ApplicationConnectorType) – The service name.

  • application_connector_id (str) – The unique ID associated with the connector.

  • scopes (List) – The scopes to request for the connector.

Returns:

The application connector with the authentication details.

Return type:

ApplicationConnector

list_streaming_connectors()

Retrieves a list of all streaming connectors along with their corresponding attributes.

Returns:

A list of StreamingConnector objects.

Return type:

list[StreamingConnector]

list_streaming_tokens()

Retrieves a list of all streaming tokens.

Returns:

A list of streaming tokens and their associated attributes.

Return type:

list[StreamingAuthToken]

get_recent_feature_group_streamed_data(feature_group_id)

Returns recently streamed data to a streaming feature group.

Parameters:

feature_group_id (str) – Unique string identifier associated with the feature group.

list_uploads()

Lists all pending uploads

Returns:

A list of ongoing uploads in the organization.

Return type:

list[Upload]

describe_upload(upload_id)

Retrieves the current upload status (complete or inspecting) and the list of file parts uploaded for a specified dataset upload.

Parameters:

upload_id (str) – The unique ID associated with the file uploaded or being uploaded in parts.

Returns:

Details associated with the large dataset file uploaded in parts.

Return type:

Upload

list_datasets(limit=100, start_after_id=None, exclude_streaming=False)

Retrieves a list of all datasets in the organization.

Parameters:
  • limit (int) – Maximum length of the list of datasets.

  • start_after_id (str) – ID of the dataset after which the list starts.

  • exclude_streaming (bool) – Exclude streaming datasets from the result.

Returns:

List of datasets.

Return type:

list[Dataset]

describe_dataset(dataset_id)

Retrieves a full description of the specified dataset, with attributes such as its ID, name, source type, etc.

Parameters:

dataset_id (str) – The unique ID associated with the dataset.

Returns:

The dataset.

Return type:

Dataset

describe_dataset_version(dataset_version)

Retrieves a full description of the specified dataset version, including its ID, name, source type, and other attributes.

Parameters:

dataset_version (str) – Unique string identifier associated with the dataset version.

Returns:

The dataset version.

Return type:

DatasetVersion

list_dataset_versions(dataset_id, limit=100, start_after_version=None)

Retrieves a list of all dataset versions for the specified dataset.

Parameters:
  • dataset_id (str) – The unique ID associated with the dataset.

  • limit (int) – The maximum length of the list of all dataset versions.

  • start_after_version (str) – The ID of the version after which the list starts.

Returns:

A list of dataset versions.

Return type:

list[DatasetVersion]

get_dataset_version_logs(dataset_version)

Retrieves the dataset import logs.

Parameters:

dataset_version (str) – The unique version ID of the dataset version.

Returns:

The logs for the specified dataset version.

Return type:

DatasetVersionLogs

get_docstore_document(doc_id)

Return a document store document by id.

Parameters:

doc_id (str) – Unique Docstore string identifier for the document.

Return type:

io.BytesIO

get_docstore_image(doc_id, max_width=None, max_height=None)

Return a document store image by id.

Parameters:
  • doc_id (str) – A unique Docstore string identifier for the image.

  • max_width (int) – Rescales the returned image so the width is less than or equal to the given maximum width, while preserving the aspect ratio.

  • max_height (int) – Rescales the returned image so the height is less than or equal to the given maximum height, while preserving the aspect ratio.

Return type:

io.BytesIO
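
A fetch-and-save sketch, assuming an authenticated client; the document ID and the output file name (including its extension) are placeholders:

    from abacusai import ApiClient

    client = ApiClient('YOUR_API_KEY')  # placeholder API key
    image_io = client.get_docstore_image('doc_abc123', max_width=512, max_height=512)
    with open('page_image.png', 'wb') as f:  # extension assumed; match the actual image format
        f.write(image_io.read())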

describe_train_test_data_split_feature_group(model_id)

Get the train and test data split for a trained model by its unique identifier. This is only supported for models with custom algorithms.

Parameters:

model_id (str) – The unique ID of the model. By default, the latest model version will be returned if no version is specified.

Returns:

The feature group containing the training data and fold information.

Return type:

FeatureGroup

describe_train_test_data_split_feature_group_version(model_version)

Get the train and test data split for a trained model by model version. This is only supported for models with custom algorithms.

Parameters:

model_version (str) – The unique version ID of the model version.

Returns:

The feature group version containing the training data and folds information.

Return type:

FeatureGroupVersion

list_models(project_id)

Retrieves the list of models in the specified project.

Parameters:

project_id (str) – Unique string identifier associated with the project.

Returns:

A list of models.

Return type:

list[Model]

describe_model(model_id)

Retrieves a full description of the specified model.

Parameters:

model_id (str) – Unique string identifier associated with the model.

Returns:

Description of the model.

Return type:

Model

get_model_metrics(model_id, model_version=None, return_graphs=False, validation=False)

Retrieves metrics for all the algorithms trained in this model version.

If only the model’s unique identifier (model_id) is specified, the latest trained version of the model (model_version) is used.

Parameters:
  • model_id (str) – Unique string identifier for the model.

  • model_version (str) – Version of the model.

  • return_graphs (bool) – If true, will return the information used for the graphs on the model metrics page such as PR Curve per label.

  • validation (bool) – If true, will return the validation metrics instead of the test metrics.

Returns:

An object containing the model metrics and explanations for what each metric means.

Return type:

ModelMetrics

get_feature_group_schemas_for_model_version(model_version)

Gets the schema (including feature mappings) for all feature groups used in the model version.

Parameters:

model_version (str) – Unique string identifier for the version of the model.

Returns:

List of schema for all feature groups used in the model version.

Return type:

list[ModelVersionFeatureGroupSchema]

list_model_versions(model_id, limit=100, start_after_version=None)

Retrieves a list of versions for a given model.

Parameters:
  • model_id (str) – Unique string identifier associated with the model.

  • limit (int) – Maximum length of the list of all model versions.

  • start_after_version (str) – Unique string identifier of the version after which the list starts.

Returns:

An array of model versions.

Return type:

list[ModelVersion]

describe_model_version(model_version)

Retrieves a full description of the specified model version.

Parameters:

model_version (str) – Unique string identifier of the model version.

Returns:

A model version.

Return type:

ModelVersion

get_feature_importance_by_model_version(model_version)

Gets the feature importance calculated by various methods for the model.

Parameters:

model_version (str) – Unique string identifier for the model version.

Returns:

Feature importances for the model.

Return type:

FeatureImportance

get_training_data_logs(model_version)

Retrieves the data preparation logs during model training.

Parameters:

model_version (str) – The unique version ID of the model version.

Returns:

A list of logs.

Return type:

list[DataPrepLogs]

get_training_logs(model_version, stdout=False, stderr=False)

Returns training logs for the model.

Parameters:
  • model_version (str) – The unique version ID of the model version.

  • stdout (bool) – Set True to get info logs.

  • stderr (bool) – Set True to get error logs.

Returns:

A function logs object.

Return type:

FunctionLogs

describe_model_artifacts_export(model_artifacts_export_id)

Get the description and status of the model artifacts export.

Parameters:

model_artifacts_export_id (str) – A unique string identifier for the export.

Returns:

Object describing the export and its status.

Return type:

ModelArtifactsExport

list_model_artifacts_exports(model_id, limit=25)

List all the model artifacts exports.

Parameters:
  • model_id (str) – A unique string identifier for the model.

  • limit (int) – Maximum length of the list of all exports.

Returns:

List of model artifacts exports.

Return type:

list[ModelArtifactsExport]

list_model_monitors(project_id)

Retrieves the list of model monitors in the specified project.

Parameters:

project_id (str) – Unique string identifier associated with the project.

Returns:

A list of model monitors.

Return type:

list[ModelMonitor]

describe_model_monitor(model_monitor_id)

Retrieves a full description of the specified model monitor.

Parameters:

model_monitor_id (str) – Unique string identifier associated with the model monitor.

Returns:

Description of the model monitor.

Return type:

ModelMonitor

get_prediction_drift(model_monitor_version)

Gets the label and prediction drifts for a model monitor.

Parameters:

model_monitor_version (str) – Unique string identifier for a model monitor version created under the project.

Returns:

Object describing training and prediction output label and prediction distributions.

Return type:

DriftDistributions

get_model_monitor_summary(model_monitor_id)

Gets the summary of a model monitor across versions.

Parameters:

model_monitor_id (str) – A unique string identifier associated with the model monitor.

Returns:

An object describing integrity, bias violations, model accuracy and drift for the model monitor.

Return type:

ModelMonitorSummary

list_model_monitor_versions(model_monitor_id, limit=100, start_after_version=None)

Retrieves a list of versions for a given model monitor.

Parameters:
  • model_monitor_id (str) – The unique ID associated with the model monitor.

  • limit (int) – The maximum length of the list of all model monitor versions.

  • start_after_version (str) – The ID of the version after which the list starts.

Returns:

A list of model monitor versions.

Return type:

list[ModelMonitorVersion]

describe_model_monitor_version(model_monitor_version)

Retrieves a full description of the specified model monitor version.

Parameters:

model_monitor_version (str) – The unique version ID of the model monitor version.

Returns:

A model monitor version.

Return type:

ModelMonitorVersion

model_monitor_version_metric_data(model_monitor_version, metric_type, actual_values_to_detail=None)

Provides the data needed for decile metrics associated with the model monitor.

Parameters:
  • model_monitor_version (str) – Unique string identifier for the model monitor version.

  • metric_type (str) – The type of metric to get data for.

  • actual_values_to_detail (list) – The actual values to detail.

Returns:

Data associated with the metric.

Return type:

ModelMonitorVersionMetricData

list_organization_model_monitors(only_starred=False)

Gets a list of Model Monitors for an organization.

Parameters:

only_starred (bool) – Whether to return only starred Model Monitors. Defaults to False.

Returns:

A list of Model Monitors.

Return type:

list[ModelMonitor]

get_model_monitor_chart_from_organization(chart_type, limit=15)

Gets a list of model monitor summaries across monitors for an organization.

Parameters:
  • chart_type (str) – Type of chart (model_accuracy, bias_violations, data_integrity, or model_drift) to return.

  • limit (int) – Maximum length of the model monitors.

Returns:

List of ModelMonitorSummaryForOrganization objects describing accuracy, bias, drift, or integrity for all model monitors in an organization.

Return type:

list[ModelMonitorSummaryFromOrg]

get_model_monitor_summary_from_organization()

Gets a consolidated summary of model monitors for an organization.

Returns:

A list of ModelMonitorSummaryForOrganization objects describing accuracy, bias, drift, and integrity for all model monitors in an organization.

Return type:

list[ModelMonitorOrgSummary]

list_eda(project_id)

Retrieves the list of Exploratory Data Analysis (EDA) in the specified project.

Parameters:

project_id (str) – Unique string identifier associated with the project.

Returns:

List of EDA objects.

Return type:

list[Eda]

describe_eda(eda_id)

Retrieves a full description of the specified EDA object.

Parameters:

eda_id (str) – Unique string identifier associated with the EDA object.

Returns:

Description of the EDA object.

Return type:

Eda

list_eda_versions(eda_id, limit=100, start_after_version=None)

Retrieves a list of versions for a given EDA object.

Parameters:
  • eda_id (str) – The unique ID associated with the EDA object.

  • limit (int) – The maximum length of the list of all EDA versions.

  • start_after_version (str) – The ID of the version after which the list starts.

Returns:

A list of EDA versions.

Return type:

list[EdaVersion]

describe_eda_version(eda_version)

Retrieves a full description of the specified EDA version.

Parameters:

eda_version (str) – Unique string identifier of the EDA version.

Returns:

An EDA version.

Return type:

EdaVersion

get_eda_collinearity(eda_version)

Gets the Collinearity between all features for the Exploratory Data Analysis.

Parameters:

eda_version (str) – Unique string identifier associated with the EDA instance.

Returns:

An object with a record of correlations between each feature for the EDA.

Return type:

EdaCollinearity

get_eda_data_consistency(eda_version, transformation_feature=None)

Gets the data consistency for the Exploratory Data Analysis.

Parameters:
  • eda_version (str) – Unique string identifier associated with the EDA instance.

  • transformation_feature (str) – The transformation feature to get consistency for.

Returns:

Object with duplication, deletion, and transformation data for data consistency analysis for an EDA.

Return type:

EdaDataConsistency

get_collinearity_for_feature(eda_version, feature_name=None)

Gets the Collinearity for the given feature from the Exploratory Data Analysis.

Parameters:
  • eda_version (str) – Unique string identifier associated with the EDA instance.

  • feature_name (str) – Name of the feature for which correlation is shown.

Returns:

Object with a record of correlations for the provided feature for an EDA.

Return type:

EdaFeatureCollinearity

get_feature_association(eda_version, reference_feature_name, test_feature_name)

Gets the Feature Association for the given features from the feature group version within the eda_version.

Parameters:
  • eda_version (str) – Unique string identifier associated with the EDA instance.

  • reference_feature_name (str) – Name of the feature for feature association (on x-axis for the plots generated for the Feature association in the product).

  • test_feature_name (str) – Name of the feature for feature association (on y-axis for the plots generated for the Feature association in the product).

Returns:

An object with a record of data for the feature association between the two given features for an EDA version.

Return type:

EdaFeatureAssociation

get_eda_forecasting_analysis(eda_version)

Gets the Forecasting analysis for the Exploratory Data Analysis.

Parameters:

eda_version (str) – Unique string identifier associated with the EDA version.

Returns:

Object with forecasting analysis that includes sales_across_time, cummulative_contribution, missing_value_distribution, history_length, num_rows_histogram, product_maturity data.

Return type:

EdaForecastingAnalysis

list_holdout_analysis(project_id, model_id=None)

List holdout analyses for a project. Optionally, filter by model.

Parameters:
  • project_id (str) – ID of the project to list holdout analyses for

  • model_id (str) – (optional) ID of the model to filter by

Returns:

The holdout analyses

Return type:

list[HoldoutAnalysis]

describe_holdout_analysis(holdout_analysis_id)

Get a holdout analysis.

Parameters:

holdout_analysis_id (str) – ID of the holdout analysis to get

Returns:

The holdout analysis

Return type:

HoldoutAnalysis

list_holdout_analysis_versions(holdout_analysis_id)

List holdout analysis versions for a holdout analysis.

Parameters:

holdout_analysis_id (str) – ID of the holdout analysis to list holdout analysis versions for

Returns:

The holdout analysis versions

Return type:

list[HoldoutAnalysisVersion]

describe_holdout_analysis_version(holdout_analysis_version, get_metrics=False)

Get a holdout analysis version.

Parameters:
  • holdout_analysis_version (str) – ID of the holdout analysis version to get

  • get_metrics (bool) – (optional) Whether to get the metrics for the holdout analysis version

Returns:

The holdout analysis version

Return type:

HoldoutAnalysisVersion

describe_monitor_alert(monitor_alert_id)

Describes a given monitor alert id

Parameters:

monitor_alert_id (str) – Unique identifier of the monitor alert.

Returns:

Object containing information about the monitor alert.

Return type:

MonitorAlert

describe_monitor_alert_version(monitor_alert_version)

Describes a given monitor alert version id

Parameters:

monitor_alert_version (str) – Unique string identifier for the monitor alert.

Returns:

An object describing the monitor alert version.

Return type:

MonitorAlertVersion

list_monitor_alerts_for_monitor(model_monitor_id=None, realtime_monitor_id=None)

Retrieves the list of monitor alerts for a specified monitor. Exactly one of model_monitor_id or realtime_monitor_id must be provided.

Parameters:
  • model_monitor_id (str) – The unique ID associated with the model monitor.

  • realtime_monitor_id (str) – The unique ID associated with the real-time monitor.

Returns:

A list of monitor alerts.

Return type:

list[MonitorAlert]

list_monitor_alert_versions_for_monitor_version(model_monitor_version)

Retrieves the list of monitor alert versions for a specified monitor instance.

Parameters:

model_monitor_version (str) – The unique ID associated with the model monitor.

Returns:

A list of monitor alert versions.

Return type:

list[MonitorAlertVersion]

get_drift_for_feature(model_monitor_version, feature_name, nested_feature_name=None)

Gets the feature drift associated with a single feature in an output feature group from a prediction.

Parameters:
  • model_monitor_version (str) – Unique string identifier of a model monitor version created under the project.

  • feature_name (str) – Name of the feature to view the distribution of.

  • nested_feature_name (str) – Optionally, the name of the nested feature that the feature is in.

Returns:

An object describing the training and prediction output feature distributions.

Return type:

FeatureDistribution

get_outliers_for_feature(model_monitor_version, feature_name=None, nested_feature_name=None)

Gets a list of outliers measured by a single feature (or overall) in an output feature group from a prediction.

Parameters:
  • model_monitor_version (str) – Unique string identifier for a model monitor version created under the project.

  • feature_name (str) – Name of the feature to view the distribution of.

  • nested_feature_name (str) – Optionally, the name of the nested feature that the feature is in.

Return type:

Dict

describe_prediction_operator(prediction_operator_id)

Describe an existing prediction operator.

Parameters:

prediction_operator_id (str) – The unique ID of the prediction operator.

Returns:

The requested prediction operator object.

Return type:

PredictionOperator

list_prediction_operators(project_id)

List all the prediction operators inside a project.

Parameters:

project_id (str) – The unique ID of the project.

Returns:

A list of prediction operator objects.

Return type:

list[PredictionOperator]

list_prediction_operator_versions(prediction_operator_id)

List all the prediction operator versions for a prediction operator.

Parameters:

prediction_operator_id (str) – The unique ID of the prediction operator.

Returns:

A list of prediction operator version objects.

Return type:

list[PredictionOperatorVersion]

describe_deployment(deployment_id)

Retrieves a full description of the specified deployment.

Parameters:

deployment_id (str) – Unique string identifier associated with the deployment.

Returns:

Description of the deployment.

Return type:

Deployment

list_deployments(project_id)

Retrieves a list of all deployments in the specified project.

Parameters:

project_id (str) – The unique identifier associated with the project.

Returns:

An array of deployments.

Return type:

list[Deployment]

list_deployment_tokens(project_id)

Retrieves a list of all deployment tokens associated with the specified project.

Parameters:

project_id (str) – The unique ID associated with the project.

Returns:

A list of deployment tokens.

Return type:

list[DeploymentAuthToken]

get_api_endpoint(deployment_token=None, deployment_id=None, streaming_token=None, feature_group_id=None, model_id=None)

Returns the API endpoint specific to an organization. This function can be called using either an API key, or a deployment ID and token, for authentication.

Parameters:
  • deployment_token (str) – Token used for authenticating access to deployed models.

  • deployment_id (str) – Unique identifier assigned to a deployment created under the specified project.

  • streaming_token (str) – Token used for authenticating access to streaming data.

  • feature_group_id (str) – Unique identifier assigned to a feature group.

  • model_id (str) – Unique identifier assigned to a model.

Returns:

The API endpoint specific to the organization.

Return type:

ApiEndpoint

get_model_training_types_for_deployment(model_id, model_version=None, algorithm=None)

Returns types of models that can be deployed for a given model instance ID.

Parameters:
  • model_id (str) – The unique ID associated with the model.

  • model_version (str) – The unique ID associated with the model version to deploy.

  • algorithm (str) – The unique ID associated with the algorithm to deploy.

Returns:

Model training types for deployment.

Return type:

ModelTrainingTypeForDeployment

get_prediction_logs_records(deployment_id, limit=10, last_log_request_id='', last_log_timestamp=None)

Retrieves the prediction request IDs for the most recent predictions made to the deployment.

Parameters:
  • deployment_id (str) – The unique identifier of a deployment created under the project.

  • limit (int) – The number of prediction log entries to retrieve up to the specified limit.

  • last_log_request_id (str) – The request ID of the last log entry to retrieve.

  • last_log_timestamp (int) – A Unix timestamp in milliseconds specifying the timestamp for the last log entry.

Returns:

A list of prediction log records.

Return type:

list[PredictionLogRecord]
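
A minimal sketch that fetches the most recent prediction log records for a deployment, assuming an authenticated ApiClient; the deployment ID is a placeholder, and the paging scheme in the comment is an assumption based on the parameter descriptions above:

from abacusai import ApiClient

client = ApiClient('YOUR_API_KEY')  # placeholder API key

records = client.get_prediction_logs_records('my_deployment_id', limit=10)  # placeholder ID
for record in records:
    print(record)

# To page further back, pass the request ID and timestamp of the oldest record
# returned above as last_log_request_id and last_log_timestamp in the next call;
# the exact attribute names on PredictionLogRecord are not shown in this reference.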

list_deployment_alerts(deployment_id)

List the monitor alerts associated with the given deployment.

Parameters:

deployment_id (str) – Unique string identifier for the deployment.

Returns:

An array of deployment alerts.

Return type:

list[MonitorAlert]

list_realtime_monitors(project_id)

List the real-time monitors associated with the given project.

Parameters:

project_id (str) – Unique string identifier for the project.

Returns:

An array of real-time monitors.

Return type:

list[RealtimeMonitor]

describe_realtime_monitor(realtime_monitor_id)

Get the real-time monitor associated with the real-time monitor id.

Parameters:

realtime_monitor_id (str) – Unique string identifier for the real-time monitor.

Returns:

Object describing the real-time monitor.

Return type:

RealtimeMonitor

describe_refresh_policy(refresh_policy_id)

Retrieve a single refresh policy

Parameters:

refresh_policy_id (str) – The unique ID associated with this refresh policy.

Returns:

An object representing the refresh policy.

Return type:

RefreshPolicy

describe_refresh_pipeline_run(refresh_pipeline_run_id)

Retrieve a single refresh pipeline run

Parameters:

refresh_pipeline_run_id (str) – Unique string identifier associated with the refresh pipeline run.

Returns:

A refresh pipeline run object.

Return type:

RefreshPipelineRun

list_refresh_policies(project_id=None, dataset_ids=[], feature_group_id=None, model_ids=[], deployment_ids=[], batch_prediction_ids=[], model_monitor_ids=[], notebook_ids=[])

List the refresh policies for the organization. If no filters are specified, all refresh policies are returned.

Parameters:
  • project_id (str) – Project ID for which we wish to see the refresh policies attached.

  • dataset_ids (List) – Comma-separated list of Dataset IDs.

  • feature_group_id (str) – Feature Group ID for which we wish to see the refresh policies attached.

  • model_ids (List) – Comma-separated list of Model IDs.

  • deployment_ids (List) – Comma-separated list of Deployment IDs.

  • batch_prediction_ids (List) – Comma-separated list of Batch Prediction IDs.

  • model_monitor_ids (List) – Comma-separated list of Model Monitor IDs.

  • notebook_ids (List) – Comma-separated list of Notebook IDs.

Returns:

List of all refresh policies in the organization.

Return type:

list[RefreshPolicy]

list_refresh_pipeline_runs(refresh_policy_id)

List the times that the refresh policy has been run.

Parameters:

refresh_policy_id (str) – Unique identifier associated with the refresh policy.

Returns:

List of refresh pipeline runs for the given refresh policy ID.

Return type:

list[RefreshPipelineRun]
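
A minimal sketch that lists a project’s refresh policies and the runs for each, assuming an authenticated ApiClient; the project ID is a placeholder and the ID attribute name on RefreshPolicy is assumed:

from abacusai import ApiClient

client = ApiClient('YOUR_API_KEY')  # placeholder API key

for policy in client.list_refresh_policies(project_id='my_project_id'):  # placeholder ID
    # Attribute name assumed from the SDK's snake_case naming convention.
    runs = client.list_refresh_pipeline_runs(policy.refresh_policy_id)
    print(policy, len(runs))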

download_batch_prediction_result_chunk(batch_prediction_version, offset=0, chunk_size=10485760)

Returns a stream containing the batch prediction results.

Parameters:
  • batch_prediction_version (str) – Unique string identifier of the batch prediction version to get the results from.

  • offset (int) – The offset to read from.

  • chunk_size (int) – The maximum amount of data to read.

Return type:

io.BytesIO
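
A minimal sketch of downloading a full batch prediction result in chunks, assuming an authenticated ApiClient; the version ID and output filename are placeholders, and treating an empty chunk as end-of-stream is an assumption:

from abacusai import ApiClient

client = ApiClient('YOUR_API_KEY')  # placeholder API key
version = 'my_batch_prediction_version'  # placeholder version ID
chunk_size = 10485760  # the documented default of 10 MiB
offset = 0

with open('results.csv', 'wb') as out:
    while True:
        stream = client.download_batch_prediction_result_chunk(
            version, offset=offset, chunk_size=chunk_size)
        chunk = stream.read()
        if not chunk:  # assumption: an empty chunk marks the end of the results
            break
        out.write(chunk)
        offset += len(chunk)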

get_batch_prediction_connector_errors(batch_prediction_version)

Returns a stream containing the batch prediction database connection write errors, if any writes failed for the specified batch prediction job.

Parameters:

batch_prediction_version (str) – Unique string identifier of the batch prediction job to get the errors for.

Return type:

io.BytesIO

list_batch_predictions(project_id)

Retrieves a list of batch predictions in the project.

Parameters:

project_id (str) – Unique string identifier of the project.

Returns:

List of batch prediction jobs.

Return type:

list[BatchPrediction]

describe_batch_prediction(batch_prediction_id)

Describe the batch prediction.

Parameters:

batch_prediction_id (str) – The unique identifier associated with the batch prediction.

Returns:

The batch prediction description.

Return type:

BatchPrediction

list_batch_prediction_versions(batch_prediction_id, limit=100, start_after_version=None)

Retrieves a list of versions of a given batch prediction

Parameters:
  • batch_prediction_id (str) – Unique identifier of the batch prediction.

  • limit (int) – Number of versions to list.

  • start_after_version (str) – Version to start after.

Returns:

List of batch prediction versions.

Return type:

list[BatchPredictionVersion]
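
A minimal sketch of paging through all versions of a batch prediction using start_after_version, assuming an authenticated ApiClient; the IDs are placeholders and the version attribute name is assumed:

from abacusai import ApiClient

client = ApiClient('YOUR_API_KEY')  # placeholder API key
batch_prediction_id = 'my_batch_prediction_id'  # placeholder ID
start_after = None

while True:
    page = client.list_batch_prediction_versions(
        batch_prediction_id, limit=100, start_after_version=start_after)
    for version in page:
        print(version)
    if len(page) < 100:  # assumption: a short page means no further versions
        break
    # Attribute name assumed from the SDK's snake_case naming convention.
    start_after = page[-1].batch_prediction_version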

describe_batch_prediction_version(batch_prediction_version)

Describes a Batch Prediction Version.

Parameters:

batch_prediction_version (str) – Unique string identifier of the Batch Prediction Version.

Returns:

The Batch Prediction Version.

Return type:

BatchPredictionVersion

get_batch_prediction_version_logs(batch_prediction_version)

Retrieves the batch prediction logs.

Parameters:

batch_prediction_version (str) – The unique version ID of the batch prediction version.

Returns:

The logs for the specified batch prediction version.

Return type:

BatchPredictionVersionLogs

get_deployment_statistics_over_time(deployment_id, start_date, end_date)

Return basic access statistics for the given window

Parameters:
  • deployment_id (str) – Unique string identifier of the deployment created under the project.

  • start_date (str) – Timeline start date in ISO format.

  • end_date (str) – Timeline end date in ISO format. The date range must be 7 days or less.

Returns:

Object describing time-series data for the number of requests and latency over the specified time period.

Return type:

DeploymentStatistics
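
A short sketch that pulls a week of request and latency statistics for a deployment, assuming an authenticated ApiClient; the deployment ID and dates are placeholders, and the window respects the documented 7-day limit:

from abacusai import ApiClient

client = ApiClient('YOUR_API_KEY')  # placeholder API key

stats = client.get_deployment_statistics_over_time(
    'my_deployment_id',               # placeholder deployment ID
    start_date='2024-01-01T00:00:00',
    end_date='2024-01-07T00:00:00',   # must be within 7 days of start_date
)
print(stats)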

get_data(feature_group_id, primary_key=None, num_rows=None)

Gets the feature group rows for online updatable feature groups.

If primary_key is set, the row corresponding to that key is returned. If num_rows is set, at most num_rows of the most recently updated rows are returned.

Parameters:
  • feature_group_id (str) – The unique ID associated with the feature group.

  • primary_key (str) – The primary key value for which to retrieve the feature group row (only for online feature groups).

  • num_rows (int) – Maximum number of rows to return from the feature group

Returns:

A list of feature group rows.

Return type:

list[FeatureGroupRow]
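
A minimal sketch of reading rows from an online updatable feature group, assuming an authenticated ApiClient; the feature group ID and primary key value are placeholders:

from abacusai import ApiClient

client = ApiClient('YOUR_API_KEY')  # placeholder API key
feature_group_id = 'my_feature_group_id'  # placeholder ID

# The row matching the primary key (the call returns a list of rows).
row = client.get_data(feature_group_id, primary_key='user_123')

# Up to 50 of the most recently updated rows.
latest = client.get_data(feature_group_id, num_rows=50)
print(row, len(latest))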

describe_python_function(name)

Describe a Python Function.

Parameters:

name (str) – The name to identify the Python function. Must be a valid Python identifier.

Returns:

The Python function object.

Return type:

PythonFunction

list_python_functions(function_type='FEATURE_GROUP')

List all python functions within the organization.

Parameters:

function_type (str) – Optional argument to specify the type of function to list Python functions for. Default is FEATURE_GROUP, but can also be PLOTLY_FIG.

Returns:

A list of PythonFunction objects.

Return type:

list[PythonFunction]

list_pipelines(project_id=None)

Lists the pipelines for an organization or a project

Parameters:

project_id (str) – Unique string identifier for the project to list pipelines from. If not specified, pipelines for the organization are listed.

Returns:

A list of pipelines.

Return type:

list[Pipeline]

describe_pipeline_version(pipeline_version)

Describes a specified pipeline version

Parameters:

pipeline_version (str) – Unique string identifier for the pipeline version

Returns:

Object describing the pipeline version

Return type:

PipelineVersion

describe_pipeline_step(pipeline_step_id)

Describes a pipeline step.

Parameters:

pipeline_step_id (str) – The ID of the pipeline step.

Returns:

An object describing the pipeline step.

Return type:

PipelineStep

describe_pipeline_step_by_name(pipeline_id, step_name)

Describes a pipeline step by the step name.

Parameters:
  • pipeline_id (str) – The ID of the pipeline.

  • step_name (str) – The name of the step.

Returns:

An object describing the pipeline step.

Return type:

PipelineStep

describe_pipeline_step_version(pipeline_step_version)

Describes a pipeline step version.

Parameters:

pipeline_step_version (str) – The ID of the pipeline step version.

Returns:

An object describing the pipeline step version.

Return type:

PipelineStepVersion

list_pipeline_version_logs(pipeline_version)

Gets the logs for the steps in a given pipeline version.

Parameters:

pipeline_version (str) – The id of the pipeline version.

Returns:

Object describing the logs for the steps in the pipeline.

Return type:

PipelineVersionLogs

get_step_version_logs(pipeline_step_version)

Gets the logs for a given step version.

Parameters:

pipeline_step_version (str) – The id of the pipeline step version.

Returns:

Object describing the pipeline step logs.

Return type:

PipelineStepVersionLogs
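
A short sketch that retrieves logs at both the pipeline-version and step-version level, assuming an authenticated ApiClient; the IDs are placeholders:

from abacusai import ApiClient

client = ApiClient('YOUR_API_KEY')  # placeholder API key

# Logs for every step in a pipeline version.
print(client.list_pipeline_version_logs('my_pipeline_version'))  # placeholder ID

# Logs for one specific step version.
print(client.get_step_version_logs('my_pipeline_step_version'))  # placeholder ID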

describe_graph_dashboard(graph_dashboard_id)

Describes a given graph dashboard.

Parameters:

graph_dashboard_id (str) – Unique identifier for the graph dashboard.

Returns:

An object containing information about the graph dashboard.

Return type:

GraphDashboard

list_graph_dashboards(project_id=None)

Lists the graph dashboards for a project

Parameters:

project_id (str) – Unique string identifier for the project to list graph dashboards from.

Returns:

A list of graph dashboards.

Return type:

list[GraphDashboard]

describe_graph_for_dashboard(graph_reference_id)

Describes a Python plot in a graph dashboard.

Parameters:

graph_reference_id (str) – Unique string identifier of the Python function for the graph.

Returns:

An object describing the graph dashboard.

Return type:

PythonPlotFunction

describe_algorithm(algorithm)

Retrieves a full description of the specified algorithm.

Parameters:

algorithm (str) – The name of the algorithm.

Returns:

The description of the algorithm.

Return type:

Algorithm

list_algorithms(problem_type=None, project_id=None)

List all custom algorithms, with optional filtering on Problem Type and Project ID

Parameters:
  • problem_type (ProblemType) – The problem type to query. If None, return all algorithms in the organization.

  • project_id (str) – The ID of the project.

Returns:

A list of algorithms.

Return type:

list[Algorithm]

describe_custom_loss_function(name)

Retrieve a full description of a previously registered custom loss function.

Parameters:

name (str) – Registered name of the custom loss function.

Returns:

The description of the custom loss function with the given name.

Return type:

CustomLossFunction

list_custom_loss_functions(name_prefix=None, loss_function_type=None)

Retrieves a list of registered custom loss functions and their descriptions.

Parameters:
  • name_prefix (str) – The prefix of the names of the loss functions to list.

  • loss_function_type (str) – The category of loss functions to search in.

Returns:

A list of custom loss functions and their descriptions, matching the given filters.

Return type:

list[CustomLossFunction]

describe_custom_metric(name)

Retrieves a full description of a previously registered custom metric function.

Parameters:

name (str) – Registered name of the custom metric.

Returns:

The description of the custom metric with the given name.

Return type:

CustomMetric

describe_custom_metric_version(custom_metric_version)

Describes a given custom metric version

Parameters:

custom_metric_version (str) – A unique string identifier for the custom metric version.

Returns:

An object describing the custom metric version.

Return type:

CustomMetricVersion

list_custom_metrics(name_prefix=None, problem_type=None)

Retrieves a list of registered custom metrics.

Parameters:
  • name_prefix (str) – The prefix of the names of the custom metrics.

  • problem_type (str) – The associated problem type of the custom metrics.

Returns:

A list of custom metrics.

Return type:

list[CustomMetric]

describe_module(name)

Retrieves a full description of the specified module.

Parameters:

name (str) – The name of the module.

Returns:

The description of the module.

Return type:

Module

list_modules()

List all the modules

Returns:

A list of modules

Return type:

list[Module]

get_organization_secret(secret_key)

Gets a secret.

Parameters:

secret_key (str) – The secret key.

Returns:

The secret.

Return type:

OrganizationSecret

list_organization_secrets()

Lists all secrets for an organization.

Returns:

list of secrets belonging to the organization.

Return type:

list[OrganizationSecret]

query_feature_group_code_generator(query, language, project_id=None)

Send a query to the feature group code generator tool, which generates code that satisfies the query.

Parameters:
  • query (str) – A natural language query which specifies what the user wants out of the feature group or its code.

  • language (str) – The language in which code is to be generated. One of ‘sql’ or ‘python’.

  • project_id (str) – A unique string identifier of the project in whose context the query is made.

Returns:

The response from the model, raw text and parsed components.

Return type:

LlmResponse
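
A minimal sketch of asking the feature group code generator for SQL, assuming an authenticated ApiClient; the query text and project ID are placeholders:

from abacusai import ApiClient

client = ApiClient('YOUR_API_KEY')  # placeholder API key

response = client.query_feature_group_code_generator(
    query='Count the number of orders per user over the last 30 days',
    language='sql',
    project_id='my_project_id',  # placeholder project ID
)
print(response)  # an LlmResponse with the raw text and parsed components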

get_natural_language_explanation(feature_group_id=None, feature_group_version=None, model_id=None)

Returns the saved natural language explanation of the artifact with the given ID. The artifact can be a Feature Group, a Feature Group Version, or a Model.

Parameters:
  • feature_group_id (str) – A unique string identifier associated with the Feature Group.

  • feature_group_version (str) – A unique string identifier associated with the Feature Group Version.

  • model_id (str) – A unique string identifier associated with the Model.

Returns:

The object containing natural language explanation(s) as field(s).

Return type:

NaturalLanguageExplanation

generate_natural_language_explanation(feature_group_id=None, feature_group_version=None, model_id=None)

Generates a natural language explanation of the artifact with the given ID. The artifact can be a Feature Group, a Feature Group Version, or a Model.

Parameters:
  • feature_group_id (str) – A unique string identifier associated with the Feature Group.

  • feature_group_version (str) – A unique string identifier associated with the Feature Group Version.

  • model_id (str) – A unique string identifier associated with the Model.

Returns:

The object containing natural language explanation(s) as field(s).

Return type:

NaturalLanguageExplanation
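
A short sketch that generates a natural language explanation for a feature group and then reads back the saved copy, assuming an authenticated ApiClient; the feature group ID is a placeholder:

from abacusai import ApiClient

client = ApiClient('YOUR_API_KEY')  # placeholder API key
feature_group_id = 'my_feature_group_id'  # placeholder ID

client.generate_natural_language_explanation(feature_group_id=feature_group_id)
explanation = client.get_natural_language_explanation(feature_group_id=feature_group_id)
print(explanation)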

get_chat_session(chat_session_id)

Gets a chat session from Data Science Co-pilot.

Parameters:

chat_session_id (str) – Unique ID of the chat session.

Returns:

The chat session with Data Science Co-pilot

Return type:

ChatSession

list_chat_sessions(most_recent_per_project=False)

Lists all chat sessions for the current user

Parameters:

most_recent_per_project (bool) – If True, only the most recent chat session per project is returned. Defaults to False.

Returns:

The chat sessions with Data Science Co-pilot

Return type:

list[ChatSession]

get_deployment_conversation(deployment_conversation_id=None, external_session_id=None, deployment_id=None, filter_intermediate_conversation_events=True, get_unused_document_uploads=False)

Gets a deployment conversation.

Parameters:
  • deployment_conversation_id (str) – Unique ID of the conversation. One of deployment_conversation_id or external_session_id must be provided.

  • external_session_id (str) – External session ID of the conversation.

  • deployment_id (str) – The deployment this conversation belongs to. This is required if not logged in.

  • filter_intermediate_conversation_events (bool) – If true, intermediate conversation events will be filtered out. Default is true.

  • get_unused_document_uploads (bool) – If true, unused document uploads will be returned. Default is false.

Returns:

The deployment conversation.

Return type:

DeploymentConversation

list_deployment_conversations(deployment_id=None, external_application_id=None, conversation_type=None, fetch_last_llm_info=False)

Lists all conversations for the given deployment and current user.

Parameters:
  • deployment_id (str) – The deployment to get conversations for.

  • external_application_id (str) – The external application id associated with the deployment conversation. If specified, only conversations created on that application will be listed.

  • conversation_type (DeploymentConversationType) – The type of the conversation indicating its origin.

  • fetch_last_llm_info (bool) – If true, the LLM info for the most recent conversation will be fetched. Only applicable for system-created bots.

Returns:

The deployment conversations.

Return type:

list[DeploymentConversation]

export_deployment_conversation(deployment_conversation_id=None, external_session_id=None)

Export a Deployment Conversation.

Parameters:
  • deployment_conversation_id (str) – A unique string identifier associated with the deployment conversation.

  • external_session_id (str) – The external session id associated with the deployment conversation. One of deployment_conversation_id or external_session_id must be provided.

Returns:

The deployment conversation html export.

Return type:

DeploymentConversationExport

get_app_user_group(user_group_id)

Gets an App User Group.

Parameters:

user_group_id (str) – The ID of the App User Group.

Returns:

The App User Group.

Return type:

AppUserGroup

describe_external_application(external_application_id)

Describes an External Application.

Parameters:

external_application_id (str) – The ID of the External Application.

Returns:

The External Application.

Return type:

ExternalApplication

list_external_applications()

Lists External Applications in an organization.

Returns:

List of External Applications.

Return type:

list[ExternalApplication]

download_agent_attachment(deployment_id, attachment_id)

Return an agent attachment.

Parameters:
  • deployment_id (str) – The deployment ID.

  • attachment_id (str) – The attachment ID.

Return type:

io.BytesIO

describe_agent(agent_id)

Retrieves a full description of the specified agent.

Parameters:

agent_id (str) – Unique string identifier associated with the agent.

Returns:

Description of the agent.

Return type:

Agent

describe_agent_version(agent_version)

Retrieves a full description of the specified agent version.

Parameters:

agent_version (str) – Unique string identifier of the agent version.

Returns:

An agent version.

Return type:

AgentVersion

search_feature_groups(text, num_results=10, project_id=None, feature_group_ids=None)

Search feature groups based on text and filters.

Parameters:
  • text (str) – Text to use for approximately matching feature groups.

  • num_results (int) – The maximum number of search results to retrieve. The length of the returned list is less than or equal to num_results.

  • project_id (str) – The ID of the project in which to restrict the search, if specified.

  • feature_group_ids (List) – A list of feature group IDs to restrict the search to.

Returns:

A list of search results, each containing the retrieved object and its relevance score

Return type:

list[OrganizationSearchResult]
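
A minimal sketch of a text search over feature groups scoped to one project, assuming an authenticated ApiClient; the search text and project ID are placeholders:

from abacusai import ApiClient

client = ApiClient('YOUR_API_KEY')  # placeholder API key

results = client.search_feature_groups(
    'customer churn features',   # placeholder search text
    num_results=5,
    project_id='my_project_id',  # placeholder project ID
)
for result in results:
    print(result)  # each result pairs the retrieved object with its relevance score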

list_agents(project_id)

Retrieves the list of agents in the specified project.

Parameters:

project_id (str) – The unique identifier associated with the project.

Returns:

A list of agents in the project.

Return type:

list[Agent]

list_agent_versions(agent_id, limit=100, start_after_version=None)

List all versions of an agent.

Parameters:
  • agent_id (str) – The unique identifier associated with the agent.

  • limit (int) – If provided, limits the number of agent versions returned.

  • start_after_version (str) – Unique string identifier of the version after which the list starts.

Returns:

An array of Agent versions.

Return type:

list[AgentVersion]

list_llm_apps()

Lists all available LLM Apps, which are LLMs tailored to a specific task, such as code generation for a particular service’s API.

Returns:

A list of LLM Apps.

Return type:

list[LlmApp]

list_document_retrievers(project_id, limit=100, start_after_id=None)

List all the document retrievers.

Parameters:
  • project_id (str) – The ID of the project that the document retrievers were created in.

  • limit (int) – The number of document retrievers to return.

  • start_after_id (str) – An offset parameter to exclude all document retrievers up to this specified ID.

Returns:

All the document retrievers in the organization associated with the specified project.

Return type:

list[DocumentRetriever]

describe_document_retriever(document_retriever_id)

Describe a Document Retriever.

Parameters:

document_retriever_id (str) – A unique string identifier associated with the document retriever.

Returns:

The document retriever object.

Return type:

DocumentRetriever

describe_document_retriever_by_name(name)

Describe a document retriever by its name.

Parameters:

name (str) – The unique name of the document retriever to look up.

Returns:

The Document Retriever.

Return type:

DocumentRetriever

list_document_retriever_versions(document_retriever_id, limit=100, start_after_version=None)

List all the document retriever versions with a given ID.

Parameters:
  • document_retriever_id (str) – A unique string identifier associated with the document retriever.

  • limit (int) – The number of document retriever versions to retrieve. The maximum value is 100.

  • start_after_version (str) – An offset parameter to exclude all document retriever versions up to this specified one.

Returns:

All the document retriever versions associated with the document retriever.

Return type:

list[DocumentRetrieverVersion]

describe_document_retriever_version(document_retriever_version)

Describe a document retriever version.

Parameters:

document_retriever_version (str) – A unique string identifier associated with the document retriever version.

Returns:

The document retriever version object.

Return type:

DocumentRetrieverVersion
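
A short sketch that lists a project’s document retrievers and looks one up by name, assuming an authenticated ApiClient; the project ID and retriever name are placeholders:

from abacusai import ApiClient

client = ApiClient('YOUR_API_KEY')  # placeholder API key

for retriever in client.list_document_retrievers('my_project_id', limit=100):  # placeholder ID
    print(retriever)

# A retriever can also be fetched directly by its unique name.
retriever = client.describe_document_retriever_by_name('my_retriever_name')  # placeholder name
print(retriever)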

abacusai._request_context
class abacusai.CodeAutocompleteResponse(client, autocompleteResponse=None)

Bases: abacusai.return_class.AbstractApiClass

An autocomplete response from an LLM

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • autocompleteResponse (str) – autocomplete code

autocomplete_response
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.CodeEditResponse(client, codeChanges=None, deploymentConversationId=None)

Bases: abacusai.return_class.AbstractApiClass

A code edit response from an LLM

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • codeChanges (list) – The code changes to be applied.

  • deploymentConversationId (str) – The unique identifier of the deployment conversation.

code_changes
deployment_conversation_id
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.CodeSource(client, sourceType=None, sourceCode=None, applicationConnectorId=None, applicationConnectorInfo=None, packageRequirements=None, status=None, error=None, publishingMsg=None, moduleDependencies=None)

Bases: abacusai.return_class.AbstractApiClass

Code source for python-based custom feature groups and models

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • sourceType (str) – The type of the source, one of TEXT, PYTHON, FILE_UPLOAD, or APPLICATION_CONNECTOR

  • sourceCode (str) – If the type of the source is TEXT, the raw text of the function

  • applicationConnectorId (str) – The Application Connector to fetch the code from

  • applicationConnectorInfo (str) – Args passed to the application connector to fetch the code

  • packageRequirements (list) – The pip package dependencies required to run the code

  • status (str) – The status of the code and validations

  • error (str) – If the status is failed, an error message describing what went wrong

  • publishingMsg (dict) – Warnings in the source code

  • moduleDependencies (list) – The list of internal modules dependencies required to run the code

source_type
source_code
application_connector_id
application_connector_info
package_requirements
status
error
publishing_msg
module_dependencies
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

import_as_cell()

Adds the source code as an unexecuted cell in the notebook.

class abacusai.ComputePointInfo(client, updatedAt=None, last24HoursUsage=None, last7DaysUsage=None, currMonthAvailPoints=None, currMonthUsage=None, lastThrottlePopUp=None)

Bases: abacusai.return_class.AbstractApiClass

The compute point info of the organization

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • updatedAt (str) – The last time the compute point info was updated

  • last24HoursUsage (int) – The 24 hours usage of the organization

  • last7DaysUsage (int) – The 7 days usage of the organization

  • currMonthAvailPoints (int) – The current month’s available compute points

  • currMonthUsage (int) – The current month’s usage compute points

  • lastThrottlePopUp (str) – The last time the organization was throttled

updated_at
last_24_hours_usage
last_7_days_usage
curr_month_avail_points
curr_month_usage
last_throttle_pop_up
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.ConcatenationConfig(client, concatenatedTable=None, mergeType=None, replaceUntilTimestamp=None, skipMaterialize=None)

Bases: abacusai.return_class.AbstractApiClass

Feature Group Concatenation Config

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • concatenatedTable (str) – The feature group to concatenate with the destination feature group.

  • mergeType (str) – The type of merge to perform, either UNION or INTERSECTION.

  • replaceUntilTimestamp (int) – The Unix timestamp to specify the point up to which data from the source feature group will be replaced.

  • skipMaterialize (bool) – If True, the concatenated feature group will not be materialized.

concatenated_table
merge_type
replace_until_timestamp
skip_materialize
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.CpuGpuMemorySpecs(client, default=None, data=None)

Bases: abacusai.return_class.AbstractApiClass

Includes the memory specs of the CPU/GPU

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • default (int) – the default memory size for the processing unit

  • data (list) – the list of memory sizes for the processing unit

default
data
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.CustomChatInstructions(client, userInformationInstructions=None, responseInstructions=None, enableCodeExecution=None, enableImageGeneration=None, enableWebSearch=None, enablePlayground=None)

Bases: abacusai.return_class.AbstractApiClass

Custom Chat Instructions

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • userInformationInstructions (str) – The behavior instructions for the chat.

  • responseInstructions (str) – The response instructions for the chat.

  • enableCodeExecution (bool) – Whether or not code execution is enabled.

  • enableImageGeneration (bool) – Whether or not image generation is enabled.

  • enableWebSearch (bool) – Whether or not web search is enabled.

  • enablePlayground (bool) – Whether or not playground is enabled.

user_information_instructions
response_instructions
enable_code_execution
enable_image_generation
enable_web_search
enable_playground
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.CustomLossFunction(client, notebookId=None, name=None, createdAt=None, lossFunctionName=None, lossFunctionType=None, codeSource={})

Bases: abacusai.return_class.AbstractApiClass

Custom Loss Function

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • notebookId (str) – The unique identifier of the notebook used to create/edit the loss function.

  • name (str) – Name assigned to the custom loss function.

  • createdAt (str) – When the loss function was created.

  • lossFunctionName (str) – The name of the function defined in the source code.

  • lossFunctionType (str) – The category of problems that this loss would be applicable to, e.g. regression, multi-label classification, etc.

  • codeSource (CodeSource) – Information about the source code of the loss function.

notebook_id
name
created_at
loss_function_name
loss_function_type
code_source
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.CustomMetric(client, customMetricId=None, name=None, createdAt=None, problemType=None, notebookId=None, latestCustomMetricVersion={})

Bases: abacusai.return_class.AbstractApiClass

Custom metric.

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • customMetricId (str) – Unique string identifier of the custom metric.

  • name (str) – Name assigned to the custom metric.

  • createdAt (str) – Date and time when the custom metric was created (ISO 8601 format).

  • problemType (str) – Problem type that this custom metric is applicable to (e.g. regression).

  • notebookId (str) – Unique string identifier of the notebook used to create/edit the custom metric.

  • latestCustomMetricVersion (CustomMetricVersion) – Latest version of the custom metric.

custom_metric_id
name
created_at
problem_type
notebook_id
latest_custom_metric_version
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.CustomMetricVersion(client, customMetricVersion=None, name=None, createdAt=None, customMetricFunctionName=None, codeSource={})

Bases: abacusai.return_class.AbstractApiClass

Custom metric version

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • customMetricVersion (str) – Unique string identifier for the custom metric version.

  • name (str) – Name assigned to the custom metric.

  • createdAt (str) – ISO-8601 string indicating when the custom metric was created.

  • customMetricFunctionName (str) – The name of the function defined in the source code.

  • codeSource (CodeSource) – Information about the source code of the custom metric.

custom_metric_version
name
created_at
custom_metric_function_name
code_source
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

refresh()

Calls describe and refreshes the current object’s fields

Returns:

The current object

Return type:

CustomMetricVersion

describe()

Describes a given custom metric version

Parameters:

custom_metric_version (str) – A unique string identifier for the custom metric version.

Returns:

An object describing the custom metric version.

Return type:

CustomMetricVersion

class abacusai.CustomTrainFunctionInfo(client, trainingDataParameterNameMapping=None, schemaMappings=None, trainDataParameterToFeatureGroupIds=None, trainingConfig=None)

Bases: abacusai.return_class.AbstractApiClass

Information about how to call the customer provided train function.

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • trainingDataParameterNameMapping (dict) – The mapping from feature group type to the dataframe parameter name

  • schemaMappings (dict) – The feature type to feature name mapping for each dataframe

  • trainDataParameterToFeatureGroupIds (dict) – The mapping from the dataframe parameter name to the feature group id backing the data

  • trainingConfig (dict) – The configs for training

training_data_parameter_name_mapping
schema_mappings
train_data_parameter_to_feature_group_ids
training_config
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.DataConsistencyDuplication(client, totalCount=None, numDuplicates=None, sample={})

Bases: abacusai.return_class.AbstractApiClass

Data Consistency for duplication within data

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • totalCount (int) – Total count of rows in data.

  • numDuplicates (int) – Number of Duplicates based on primary keys in data.

  • sample (FeatureRecord) – A list of dicts enumerating the rows that contained duplicate primary keys.

total_count
num_duplicates
sample
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.DataMetrics(client, metrics=None, schema=None, numRows=None, numCols=None, numDuplicateRows=None)

Bases: abacusai.return_class.AbstractApiClass

Processed Metrics and Schema for a dataset version or feature group version

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • metrics (list[dict]) – A list of dicts with metrics for each column

  • schema (list[dict]) – A list of dicts with the schema for each metric

  • numRows (int) – The number of rows

  • numCols (int) – The number of columns

  • numDuplicateRows (int) – The number of duplicate rows

metrics
schema
num_rows
num_cols
num_duplicate_rows
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.DataPrepLogs(client, logs=None)

Bases: abacusai.return_class.AbstractApiClass

Logs from data preparation.

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • logs (list[str]) – List of logs from data preparation during model training.

logs
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.DataQualityResults(client, results=None)

Bases: abacusai.return_class.AbstractApiClass

Data Quality results from normalization stage

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • results (dict) – A list with different pairs of quality parameters and their values

results
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.DataUploadResult(client, docInfos=None, maxCount=None)

Bases: abacusai.return_class.AbstractApiClass

Results of uploading data to agent.

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • docInfos (list[AgentDataDocumentInfo]) – A list of dicts with information on the documents uploaded to the agent.

  • maxCount (int) – The maximum number of documents

doc_infos
max_count
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.DatabaseColumnFeatureMapping(client, databaseColumn=None, feature=None)

Bases: abacusai.return_class.AbstractApiClass

Mapping for export of feature group version to database column

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • databaseColumn (str) – database column name

  • feature (str) – feature group column it has been matched to

database_column
feature
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.DatabaseConnector(client, databaseConnectorId=None, service=None, name=None, status=None, auth=None, createdAt=None)

Bases: abacusai.return_class.AbstractApiClass

A connector to an external service

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • databaseConnectorId (str) – A unique string identifier for the connection.

  • service (str) – An enum string indicating the service this connection connects to.

  • name (str) – A user-friendly name for the service.

  • status (str) – The status of the database connector.

  • auth (dict) – Non-secret connection information for this connector.

  • createdAt (str) – The ISO-8601 string indicating when the API key was created.

database_connector_id
service
name
status
auth
created_at
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

list_objects(fetch_raw_data=False)

Lists queryable objects in the database connector.

Parameters:

fetch_raw_data (bool) – If true, return unfiltered objects.

get_object_schema(object_name=None, fetch_raw_data=False)

Get the schema of an object in a database connector.

Parameters:
  • object_name (str) – Unique identifier for the object in the external system.

  • fetch_raw_data (bool) – If true, return unfiltered list of columns.

Returns:

The schema of the object.

Return type:

DatabaseConnectorSchema

rename(name)

Renames a Database Connector

Parameters:

name (str) – The new name for the Database Connector.

verify()

Checks if Abacus.AI can access the specified database.

Parameters:

database_connector_id (str) – Unique string identifier for the database connector.

delete()

Delete a database connector.

Parameters:

database_connector_id (str) – The unique identifier for the database connector.

query(query)

Runs a query in the specified database connector.

Parameters:

query (str) – The query to be run in the database connector.
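
A minimal sketch of using these DatabaseConnector methods together; obtaining the connector instance is outside the scope of this excerpt, and the table name and query are placeholders:

from abacusai import DatabaseConnector

def inspect_connector(connector: DatabaseConnector) -> None:
    # Enumerate the queryable objects exposed by the connector
    # (the return shape of list_objects is not documented in this excerpt).
    print(connector.list_objects())
    # Fetch the column schema of one object (placeholder name).
    print(connector.get_object_schema(object_name='my_table'))
    # Run an ad-hoc query through the connector (placeholder SQL).
    print(connector.query('SELECT COUNT(*) FROM my_table'))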

class abacusai.DatabaseConnectorColumn(client, name=None, externalDataType=None)

Bases: abacusai.return_class.AbstractApiClass

A schema description for a column from a database connector

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • name (str) – The unique name of the column.

  • externalDataType (str) – The data type of column in the external database system.

name
external_data_type
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.DatabaseConnectorSchema(client, tableName=None, columns={})

Bases: abacusai.return_class.AbstractApiClass

A schema description for a table from a database connector

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • tableName (str) – The unique name of the table.

  • columns (DatabaseConnectorColumn) – List of columns in the table.

table_name
columns
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.Dataset(client, datasetId=None, sourceType=None, dataSource=None, createdAt=None, ignoreBefore=None, ephemeral=None, lookbackDays=None, databaseConnectorId=None, databaseConnectorConfig=None, connectorType=None, featureGroupTableName=None, applicationConnectorId=None, applicationConnectorConfig=None, incremental=None, isDocumentset=None, extractBoundingBoxes=None, mergeFileSchemas=None, referenceOnlyDocumentset=None, versionLimit=None, schema={}, refreshSchedules={}, latestDatasetVersion={}, parsingConfig={}, documentProcessingConfig={}, attachmentParsingConfig={})

Bases: abacusai.return_class.AbstractApiClass

A dataset reference

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • datasetId (str) – The unique identifier of the dataset.

  • sourceType (str) – The source of the Dataset. EXTERNAL_SERVICE, UPLOAD, or STREAMING.

  • dataSource (str) – Location of data. It may be a URI such as an s3 bucket or the database table.

  • createdAt (str) – The timestamp at which this dataset was created.

  • ignoreBefore (str) – The timestamp at which all previous events are ignored when training.

  • ephemeral (bool) – The dataset is ephemeral and not used for training.

  • lookbackDays (int) – Specific to streaming datasets, this specifies how many days’ worth of data to include when generating a snapshot. A value of 0 leaves this selection to the system.

  • databaseConnectorId (str) – The Database Connector used.

  • databaseConnectorConfig (dict) – The database connector query used to retrieve data.

  • connectorType (str) – The type of connector used to get this dataset FILE or DATABASE.

  • featureGroupTableName (str) – The table name of the dataset’s feature group

  • applicationConnectorId (str) – The Application Connector used.

  • applicationConnectorConfig (dict) – The application connector query used to retrieve data.

  • incremental (bool) – If dataset is an incremental dataset.

  • isDocumentset (bool) – If dataset is a documentset.

  • extractBoundingBoxes (bool) – Signifies whether to extract bounding boxes out of the documents. Only valid if is_documentset is True.

  • mergeFileSchemas (bool) – If the merge file schemas policy is enabled.

  • referenceOnlyDocumentset (bool) – Signifies whether to save the data reference only. Only valid if is_documentset is True.

  • versionLimit (int) – Version limit for the dataset.

  • latestDatasetVersion (DatasetVersion) – The latest version of this dataset.

  • schema (DatasetColumn) – List of resolved columns.

  • refreshSchedules (RefreshSchedule) – List of schedules that determines when the next version of the dataset will be created.

  • parsingConfig (ParsingConfig) – The parsing config used for dataset.

  • documentProcessingConfig (DocumentProcessingConfig) – The document processing config used for dataset (when is_documentset is True).

  • attachmentParsingConfig (AttachmentParsingConfig) – The attachment parsing config used for dataset (eg. for salesforce attachment parsing)

dataset_id
source_type
data_source
created_at
ignore_before
ephemeral
lookback_days
database_connector_id
database_connector_config
connector_type
feature_group_table_name
application_connector_id
application_connector_config
incremental
is_documentset
extract_bounding_boxes
merge_file_schemas
reference_only_documentset
version_limit
schema
refresh_schedules
latest_dataset_version
parsing_config
document_processing_config
attachment_parsing_config
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

create_version_from_file_connector(location=None, file_format=None, csv_delimiter=None, merge_file_schemas=None, parsing_config=None, sql_query=None)

Creates a new version of the specified dataset.

Parameters:
  • location (str) – External URI to import the dataset from. If not specified, the last location will be used.

  • file_format (str) – File format to be used. If not specified, the service will try to detect the file format.

  • csv_delimiter (str) – If the file format is CSV, use a specific CSV delimiter.

  • merge_file_schemas (bool) – Signifies if the merge file schema policy is enabled.

  • parsing_config (ParsingConfig) – Custom config for dataset parsing.

  • sql_query (str) – The SQL query to use when fetching data from the specified location. Use __TABLE__ as a placeholder for the table name. For example: “SELECT * FROM __TABLE__ WHERE event_date > ‘2021-01-01’”. If not provided, the entire dataset from the specified location will be imported.

Returns:

The new Dataset Version created.

Return type:

DatasetVersion

create_version_from_database_connector(object_name=None, columns=None, query_arguments=None, sql_query=None)

Creates a new version of the specified dataset.

Parameters:
  • object_name (str) – The name/ID of the object in the service to query. If not specified, the last name will be used.

  • columns (str) – The columns to query from the external service object. If not specified, the last columns will be used.

  • query_arguments (str) – Additional query arguments to filter the data. If not specified, the last arguments will be used.

  • sql_query (str) – The full SQL query to use when fetching data. If present, this parameter will override object_name, columns, and query_arguments.

Returns:

The new Dataset Version created.

Return type:

DatasetVersion

create_version_from_application_connector(dataset_config=None)

Creates a new version of the specified dataset.

Parameters:

dataset_config (ApplicationConnectorDatasetConfig) – Dataset config for the application connector. If any of the fields are not specified, the last values will be used.

Returns:

The new Dataset Version created.

Return type:

DatasetVersion

create_version_from_upload(file_format=None)

Creates a new version of the specified dataset using a local file upload.

Parameters:

file_format (str) – File format to be used. If not specified, the service will attempt to detect the file format.

Returns:

Token to be used when uploading file parts.

Return type:

Upload

create_version_from_document_reprocessing(document_processing_config=None)

Creates a new dataset version for a source docstore dataset with the provided document processing configuration. This does not re-import the data but uses the same data which is imported in the latest dataset version and only performs document processing on it.

Parameters:

document_processing_config (DatasetDocumentProcessingConfig) – The document processing configuration to use for the new dataset version. If not specified, the document processing configuration from the source dataset will be used.

Returns:

The new dataset version created.

Return type:

DatasetVersion

snapshot_streaming_data()

Snapshots the current data in the streaming dataset.

Parameters:

dataset_id (str) – The unique ID associated with the dataset.

Returns:

The new Dataset Version created by taking a snapshot of the current data in the streaming dataset.

Return type:

DatasetVersion

set_column_data_type(column, data_type)

Set a Dataset’s column type.

Parameters:
  • column (str) – The name of the column.

  • data_type (DataType) – The type of the data in the column. Note: Some ColumnMappings may restrict the options or explicitly set the DataType.

Returns:

The dataset and schema after the data type has been set.

Return type:

Dataset

set_streaming_retention_policy(retention_hours=None, retention_row_count=None, ignore_records_before_timestamp=None)

Sets the streaming retention policy.

Parameters:
  • retention_hours (int) – Number of hours to retain streamed data in memory.

  • retention_row_count (int) – Number of rows to retain streamed data in memory.

  • ignore_records_before_timestamp (int) – The Unix timestamp (in seconds) to use as a cutoff to ignore all entries sent before it

get_schema()

Retrieves the column schema of a dataset.

Parameters:

dataset_id (str) – Unique string identifier of the dataset schema to look up.

Returns:

List of column schema definitions.

Return type:

list[DatasetColumn]

set_database_connector_config(database_connector_id, object_name=None, columns=None, query_arguments=None, sql_query=None)

Sets database connector config for a dataset. This method is currently only supported for streaming datasets.

Parameters:
  • database_connector_id (str) – Unique String Identifier of the Database Connector to import the dataset from.

  • object_name (str) – If applicable, the name/ID of the object in the service to query.

  • columns (str) – The columns to query from the external service object.

  • query_arguments (str) – Additional query arguments to filter the data.

  • sql_query (str) – The full SQL query to use when fetching data. If present, this parameter will override object_name, columns and query_arguments.

update_version_limit(version_limit)

Updates the version limit for the specified dataset.

Parameters:

version_limit (int) – The maximum number of versions permitted for the dataset. Once this limit is exceeded, the oldest versions will be purged in a First-In-First-Out (FIFO) order.

Returns:

The updated dataset.

Return type:

Dataset

refresh()

Calls describe and refreshes the current object’s fields

Returns:

The current object

Return type:

Dataset

describe()

Retrieves a full description of the specified dataset, with attributes such as its ID, name, source type, etc.

Parameters:

dataset_id (str) – The unique ID associated with the dataset.

Returns:

The dataset.

Return type:

Dataset

list_versions(limit=100, start_after_version=None)

Retrieves a list of all dataset versions for the specified dataset.

Parameters:
  • limit (int) – The maximum length of the list of all dataset versions.

  • start_after_version (str) – The ID of the version after which the list starts.

Returns:

A list of dataset versions.

Return type:

list[DatasetVersion]

delete()

Deletes the specified dataset from the organization.

Parameters:

dataset_id (str) – Unique string identifier of the dataset to delete.

wait_for_import(timeout=900)

A waiting call until dataset is imported.

Parameters:

timeout (int) – The maximum time to wait for the call to finish; if it does not finish within the allocated time, the call is considered timed out.

wait_for_inspection(timeout=None)

A waiting call until dataset is completely inspected.

Parameters:

timeout (int) – The maximum time to wait for the call to finish; if it does not finish within the allocated time, the call is considered timed out.

get_status()

Gets the status of the latest dataset version.

Returns:

A string describing the status of a dataset (importing, inspecting, complete, etc.).

Return type:

str
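
A short sketch of a typical wait-then-inspect flow on a Dataset object; how the Dataset instance is obtained is outside this excerpt:

from abacusai import Dataset

def wait_and_report(dataset: Dataset) -> None:
    # Block until the latest version has imported and been inspected.
    dataset.wait_for_import(timeout=900)
    dataset.wait_for_inspection()
    # Status of the latest dataset version (importing, inspecting, complete, etc.).
    print(dataset.get_status())
    # The feature group attached to this dataset.
    print(dataset.describe_feature_group())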

describe_feature_group()

Gets the feature group attached to the dataset.

Returns:

A feature group object.

Return type:

FeatureGroup

create_refresh_policy(cron)

To create a refresh policy for a dataset.

Parameters:

cron (str) – A cron style string to set the refresh time.

Returns:

The refresh policy object.

Return type:

RefreshPolicy

list_refresh_policies()

Gets the refresh policies in a list.

Returns:

A list of refresh policy objects.

Return type:

list[RefreshPolicy]

class abacusai.DatasetColumn(client, name=None, dataType=None, detectedDataType=None, featureType=None, detectedFeatureType=None, originalName=None, validDataTypes=None, timeFormat=None, timestampFrequency=None)

Bases: abacusai.return_class.AbstractApiClass

A schema description for a column

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • name (str) – The unique name of the column.

  • dataType (str) – The underlying data type of each column.

  • detectedDataType (str) – The detected data type of the column.

  • featureType (str) – Feature type of the column.

  • detectedFeatureType (str) – The detected feature type of the column.

  • originalName (str) – The original name of the column.

  • validDataTypes (list[str]) – The valid data type options for this column.

  • timeFormat (str) – The detected time format of the column.

  • timestampFrequency (str) – The detected frequency of the timestamps in the dataset.

name
data_type
detected_data_type
feature_type
detected_feature_type
original_name
valid_data_types
time_format
timestamp_frequency
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.DatasetVersion(client, datasetVersion=None, status=None, datasetId=None, size=None, rowCount=None, fileInspectMetadata=None, createdAt=None, error=None, incrementalQueriedAt=None, uploadId=None, mergeFileSchemas=None, databaseConnectorConfig=None, applicationConnectorConfig=None, invalidRecords=None)

Bases: abacusai.return_class.AbstractApiClass

A specific version of a dataset

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • datasetVersion (str) – The unique identifier of the dataset version.

  • status (str) – The current status of the dataset version

  • datasetId (str) – A reference to the Dataset this dataset version belongs to.

  • size (int) – The size in bytes of the file.

  • rowCount (int) – Number of rows in the dataset version.

  • fileInspectMetadata (dict) – Metadata information about file’s inspection. For example - the detected delimiter for CSV files.

  • createdAt (str) – The timestamp this dataset version was created.

  • error (str) – If status is FAILED, this field will be populated with an error.

  • incrementalQueriedAt (str) – If the dataset version is from an incremental dataset, this is the last entry of timestamp column when the dataset version was created.

  • uploadId (str) – If the dataset version is being uploaded, this is the reference to the Upload.

  • mergeFileSchemas (bool) – If the merge file schemas policy is enabled.

  • databaseConnectorConfig (dict) – The database connector query used to retrieve data for this version.

  • applicationConnectorConfig (dict) – The application connector used to retrieve data for this version.

  • invalidRecords (str) – Invalid records in the dataset version

dataset_version
status
dataset_id
size
row_count
file_inspect_metadata
created_at
error
incremental_queried_at
upload_id
merge_file_schemas
database_connector_config
application_connector_config
invalid_records
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

get_metrics(selected_columns=None, include_charts=False, include_statistics=True)

Get metrics for a specific dataset version.

Parameters:
  • selected_columns (List) – A list of columns to order first.

  • include_charts (bool) – A flag indicating whether charts should be included in the response. Default is false.

  • include_statistics (bool) – A flag indicating whether statistics should be included in the response. Default is true.

Returns:

The metrics for the specified Dataset version.

Return type:

DataMetrics

refresh()

Calls describe and refreshes the current object’s fields

Returns:

The current object

Return type:

DatasetVersion

describe()

Retrieves a full description of the specified dataset version, including its ID, name, source type, and other attributes.

Parameters:

dataset_version (str) – Unique string identifier associated with the dataset version.

Returns:

The dataset version.

Return type:

DatasetVersion

delete()

Deletes the specified dataset version from the organization.

Parameters:

dataset_version (str) – String identifier of the dataset version to delete.

get_logs()

Retrieves the dataset import logs.

Parameters:

dataset_version (str) – The unique version ID of the dataset version.

Returns:

The logs for the specified dataset version.

Return type:

DatasetVersionLogs

wait_for_import(timeout=900)

A waiting call until dataset version is imported.

Parameters:

timeout (int) – The maximum time to wait for the call to finish; if it does not finish within the allocated time, the call is considered timed out.

wait_for_inspection(timeout=None)

A waiting call until dataset version is completely inspected.

Parameters:

timeout (int) – The maximum time to wait for the call to finish; if it does not finish within the allocated time, the call is considered timed out.

get_status()

Gets the status of the dataset version.

Returns:

A string describing the status of a dataset version (importing, inspecting, complete, etc.).

Return type:

str
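
A common pattern is to block until the version has been imported and inspected before acting on its status. A sketch, continuing from the example above:

    # Wait up to 15 minutes for the import, then for inspection to complete.
    version.wait_for_import(timeout=900)
    version.wait_for_inspection()

    version.refresh()  # re-fetch the object's fields
    if version.get_status() == 'FAILED':
        print('Import failed:', version.error)
    else:
        print('Rows imported:', version.row_count)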

class abacusai.DatasetVersionLogs(client, logs=None)

Bases: abacusai.return_class.AbstractApiClass

Logs from dataset version.

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • logs (list[str]) – List of logs from dataset version.

logs
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.Deployment(client, deploymentId=None, name=None, status=None, description=None, deployedAt=None, createdAt=None, projectId=None, modelId=None, modelVersion=None, featureGroupId=None, featureGroupVersion=None, callsPerSecond=None, autoDeploy=None, skipMetricsCheck=None, algoName=None, regions=None, error=None, batchStreamingUpdates=None, algorithm=None, pendingModelVersion=None, modelDeploymentConfig=None, predictionOperatorId=None, predictionOperatorVersion=None, pendingPredictionOperatorVersion=None, onlineFeatureGroupId=None, outputOnlineFeatureGroupId=None, realtimeMonitorId=None, refreshSchedules={}, featureGroupExportConfig={}, defaultPredictionArguments={})

Bases: abacusai.return_class.AbstractApiClass

A model deployment

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • deploymentId (str) – A unique identifier for the deployment.

  • name (str) – A user-friendly name for the deployment.

  • status (str) – The status of the deployment.

  • description (str) – A description of the deployment.

  • deployedAt (str) – The date and time when the deployment became active, in ISO-8601 format.

  • createdAt (str) – The date and time when the deployment was created, in ISO-8601 format.

  • projectId (str) – A unique identifier for the project this deployment belongs to.

  • modelId (str) – The model that is currently deployed.

  • modelVersion (str) – The model version ID that is currently deployed.

  • featureGroupId (str) – The feature group that is currently deployed.

  • featureGroupVersion (str) – The feature group version ID that is currently deployed.

  • callsPerSecond (int) – The number of calls per second the deployment can handle.

  • autoDeploy (bool) – A flag marking the deployment as eligible for auto deployments whenever any model in the project finishes training.

  • skipMetricsCheck (bool) – A flag to skip the metrics regression check for this deployment. This field is only relevant when auto_deploy is on.

  • algoName (str) – The name of the algorithm that is currently deployed.

  • regions (list) – A list of regions that the deployment has been deployed to.

  • error (str) – The relevant error, if the status is FAILED.

  • batchStreamingUpdates (bool) – A flag marking the feature group deployment as having enabled a background process which caches streamed-in rows for quicker lookup.

  • algorithm (str) – The algorithm that is currently deployed.

  • pendingModelVersion (dict) – The model version that the deployment is switching to, or that is being stopped.

  • modelDeploymentConfig (dict) – The config for which model to be deployed.

  • predictionOperatorId (str) – The prediction operator ID that is currently deployed.

  • predictionOperatorVersion (str) – The prediction operator version ID that is currently deployed.

  • pendingPredictionOperatorVersion (str) – The prediction operator version ID that the deployment is switching to, or being stopped.

  • onlineFeatureGroupId (id) – The online feature group ID that the deployment is running on

  • outputOnlineFeatureGroupId (id) – The online feature group ID that the deployment is outputting results to

  • realtimeMonitorId (id) – The realtime monitor ID of the realtime-monitor that is associated with the deployment

  • refreshSchedules (RefreshSchedule) – A list of refresh schedules that indicate when the deployment will be updated to the latest model version.

  • featureGroupExportConfig (FeatureGroupExportConfig) – The export config (file connector or database connector information) for feature group deployment exports.

  • defaultPredictionArguments (PredictionArguments) – The default prediction arguments for prediction APIs

deployment_id
name
status
description
deployed_at
created_at
project_id
model_id
model_version
feature_group_id
feature_group_version
calls_per_second
auto_deploy
skip_metrics_check
algo_name
regions
error
batch_streaming_updates
algorithm
pending_model_version
model_deployment_config
prediction_operator_id
prediction_operator_version
pending_prediction_operator_version
online_feature_group_id
output_online_feature_group_id
realtime_monitor_id
refresh_schedules
feature_group_export_config
default_prediction_arguments
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

create_webhook(endpoint, webhook_event_type, payload_template=None)

Create a webhook attached to a given deployment ID.

Parameters:
  • endpoint (str) – URI that the webhook will send HTTP POST requests to.

  • webhook_event_type (str) – One of ‘DEPLOYMENT_START’, ‘DEPLOYMENT_SUCCESS’, or ‘DEPLOYMENT_FAILED’.

  • payload_template (dict) – Template for the body of the HTTP POST requests. Defaults to {}.

Returns:

The webhook attached to the deployment.

Return type:

Webhook
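
A sketch of attaching a webhook to a deployment; the deployment ID, the endpoint URL, and the client-side describe_deployment lookup are placeholders assumed for this example:

    deployment = client.describe_deployment("your_deployment_id")  # hypothetical ID

    # POST a static payload to an external service whenever deployment succeeds.
    webhook = deployment.create_webhook(
        endpoint="https://example.com/hooks/abacus",  # placeholder endpoint
        webhook_event_type="DEPLOYMENT_SUCCESS",
        payload_template={"source": "abacusai"},
    )
    print(deployment.list_webhooks())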

list_webhooks()

List all the webhooks attached to a given deployment.

Parameters:

deployment_id (str) – Unique identifier of the target deployment.

Returns:

List of the webhooks attached to the given deployment ID.

Return type:

list[Webhook]

refresh()

Calls describe and refreshes the current object’s fields

Returns:

The current object

Return type:

Deployment

describe()

Retrieves a full description of the specified deployment.

Parameters:

deployment_id (str) – Unique string identifier associated with the deployment.

Returns:

Description of the deployment.

Return type:

Deployment

update(description=None, auto_deploy=None, skip_metrics_check=None)

Updates a deployment’s properties.

Parameters:
  • description (str) – The new description for the deployment.

  • auto_deploy (bool) – Flag to enable the automatic deployment when a new Model Version finishes training.

  • skip_metrics_check (bool) – Flag to skip the metrics regression check for this deployment. This field is only relevant when auto_deploy is on.

rename(name)

Updates a deployment’s name

Parameters:

name (str) – The new deployment name.

set_auto(enable=None)

Enable or disable auto deployment for the specified deployment.

When a model is scheduled to retrain, deployments with auto deployment enabled will be marked to automatically promote the new model version. After the newly trained model completes, a check on its metrics in comparison to the currently deployed model version will be performed. If the metrics are comparable or better, the newly trained model version is automatically promoted. If not, it will be marked as a failed model version promotion with an error indicating poor metrics performance.

Parameters:

enable (bool) – Enable or disable the autoDeploy property of the deployment.

set_model_version(model_version, algorithm=None, model_deployment_config=None)

Promotes a model version and/or algorithm to be the active served deployment version

Parameters:
  • model_version (str) – A unique identifier for the model version.

  • algorithm (str) – The algorithm to use for the model version. If not specified, the algorithm will be inferred from the model version.

  • model_deployment_config (dict) – The deployment configuration for the model to deploy.
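
A sketch of promoting a model version and then waiting for the switch to settle (using wait_for_pending_deployment_update, documented below); the version ID is a placeholder:

    deployment.set_model_version("your_model_version_id")  # hypothetical version ID

    # Block until the pending switch completes and the previous model stops.
    deployment = deployment.wait_for_pending_deployment_update(timeout=900)
    print(deployment.get_status())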

set_feature_group_version(feature_group_version)

Promotes a feature group version to be served in the deployment.

Parameters:

feature_group_version (str) – Unique string identifier for the feature group version.

set_prediction_operator_version(prediction_operator_version)

Promotes a prediction operator version to be served in the deployment.

Parameters:

prediction_operator_version (str) – Unique string identifier for the prediction operator version.

start()

Restarts the specified deployment that was previously suspended.

Parameters:

deployment_id (str) – A unique string identifier associated with the deployment.

stop()

Stops the specified deployment.

Parameters:

deployment_id (str) – Unique string identifier of the deployment to be stopped.

delete()

Deletes the specified deployment. The deployment’s models will not be affected. Note that the deployments are not recoverable after they are deleted.

Parameters:

deployment_id (str) – Unique string identifier of the deployment to delete.

set_feature_group_export_file_connector_output(file_format=None, output_location=None)

Sets the export output for the Feature Group Deployment to be a file connector.

Parameters:
  • file_format (str) – The type of export output, either CSV or JSON.

  • output_location (str) – The file connector (cloud) location where the output should be exported.

set_feature_group_export_database_connector_output(database_connector_id, object_name, write_mode, database_feature_mapping, id_column=None, additional_id_columns=None)

Sets the export output for the Feature Group Deployment to a Database connector.

Parameters:
  • database_connector_id (str) – The unique string identifier of the database connector used.

  • object_name (str) – The object of the database connector to write to.

  • write_mode (str) – The write mode to use when writing to the database connector, either UPSERT or INSERT.

  • database_feature_mapping (dict) – The column/feature pairs mapping the features to the database columns.

  • id_column (str) – The id column to use as the upsert key.

  • additional_id_columns (list) – For database connectors which support it, a list of additional ID columns to use as a complex key for upserting.

remove_feature_group_export_output()

Removes the export type that is set for the Feature Group Deployment

Parameters:

deployment_id (str) – The ID of the deployment for which the export type is set.

set_default_prediction_arguments(prediction_arguments, set_as_override=False)

Sets the default prediction arguments for the deployment.

Parameters:
  • prediction_arguments (PredictionArguments) – The prediction arguments to set.

  • set_as_override (bool) – If True, use these arguments as overrides instead of defaults for predict calls

Returns:

Description of the updated deployment.

Return type:

Deployment

get_prediction_logs_records(limit=10, last_log_request_id='', last_log_timestamp=None)

Retrieves the prediction request IDs for the most recent predictions made to the deployment.

Parameters:
  • limit (int) – The number of prediction log entries to retrieve up to the specified limit.

  • last_log_request_id (str) – The request ID of the last log entry to retrieve.

  • last_log_timestamp (int) – A Unix timestamp in milliseconds specifying the timestamp for the last log entry.

Returns:

A list of prediction log records.

Return type:

list[PredictionLogRecord]

create_alert(alert_name, condition_config, action_config)

Create a deployment alert for the given conditions.

Only batch prediction usage is supported at this time.

Parameters:
  • alert_name (str) – Name of the alert.

  • condition_config (AlertConditionConfig) – Condition to run the actions for the alert.

  • action_config (AlertActionConfig) – Configuration for the action of the alert.

Returns:

Object describing the deployment alert.

Return type:

MonitorAlert

list_alerts()

List the monitor alerts associated with the deployment id.

Parameters:

deployment_id (str) – Unique string identifier for the deployment.

Returns:

An array of deployment alerts.

Return type:

list[MonitorAlert]

create_realtime_monitor(realtime_monitor_schedule=None, lookback_time=None)

Creates a real-time monitor, which computes and monitors metrics of real-time prediction data.

Parameters:
  • realtime_monitor_schedule (str) – The cron expression for triggering monitor.

  • lookback_time (int) – Lookback time (in seconds) for each monitor trigger

Returns:

Object describing the real-time monitor.

Return type:

RealtimeMonitor
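
A sketch of attaching an hourly real-time monitor to the deployment from the examples above; the cron expression and lookback window are example values:

    monitor = deployment.create_realtime_monitor(
        realtime_monitor_schedule="0 * * * *",  # trigger at the top of every hour
        lookback_time=3600,                     # monitor the trailing hour, in seconds
    )
    print(monitor)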

get_conversation_response(message, deployment_token, deployment_conversation_id=None, external_session_id=None, llm_name=None, num_completion_tokens=None, system_message=None, temperature=0.0, filter_key_values=None, search_score_cutoff=None, chat_config=None, doc_infos=None)

Return a conversation response which continues the conversation based on the input message and deployment conversation id (if exists).

Parameters:
  • message (str) – A message from the user

  • deployment_token (str) – A token used to authenticate access to deployments created in this project. This token is only authorized to predict on deployments in this project, so it is safe to embed this model inside of an application or website.

  • deployment_conversation_id (str) – The unique identifier of a deployment conversation to continue. If not specified, a new one will be created.

  • external_session_id (str) – The user-supplied unique identifier of a deployment conversation to continue. If specified, we will use this instead of an internal deployment conversation id.

  • llm_name (str) – Name of the specific LLM backend to use to power the chat experience

  • num_completion_tokens (int) – Default for maximum number of tokens for chat answers

  • system_message (str) – The generative LLM system message

  • temperature (float) – The generative LLM temperature

  • filter_key_values (dict) – A dictionary mapping column names to a list of values to restrict the retrieved search results.

  • search_score_cutoff (float) – Cutoff for the document retriever score. Matching search results below this score will be ignored.

  • chat_config (dict) – A dictionary specifying the query chat config override.

  • doc_infos (list) – An optional list of documents used for the conversation. A keyword 'doc_id' is expected to be present in each document for retrieving contents from the docstore.
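
A sketch of a single conversational turn against a ChatLLM deployment; the message and the deployment token are placeholders:

    response = deployment.get_conversation_response(
        message="Summarize last quarter's sales.",
        deployment_token="YOUR_DEPLOYMENT_TOKEN",  # placeholder token
        temperature=0.0,
        search_score_cutoff=0.5,  # drop weak retrieval matches
    )
    print(response)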

get_conversation_response_with_binary_data(deployment_token, message, deployment_conversation_id=None, external_session_id=None, llm_name=None, num_completion_tokens=None, system_message=None, temperature=0.0, filter_key_values=None, search_score_cutoff=None, chat_config=None, attachments=None)

Return a conversation response which continues the conversation based on the input message and deployment conversation id (if exists).

Parameters:
  • deployment_token (str) – A token used to authenticate access to deployments created in this project. This token is only authorized to predict on deployments in this project, so it is safe to embed this model inside of an application or website.

  • message (str) – A message from the user

  • deployment_conversation_id (str) – The unique identifier of a deployment conversation to continue. If not specified, a new one will be created.

  • external_session_id (str) – The user-supplied unique identifier of a deployment conversation to continue. If specified, we will use this instead of an internal deployment conversation id.

  • llm_name (str) – Name of the specific LLM backend to use to power the chat experience

  • num_completion_tokens (int) – Default for maximum number of tokens for chat answers

  • system_message (str) – The generative LLM system message

  • temperature (float) – The generative LLM temperature

  • filter_key_values (dict) – A dictionary mapping column names to a list of values to restrict the retrieved search results.

  • search_score_cutoff (float) – Cutoff for the document retriever score. Matching search results below this score will be ignored.

  • chat_config (dict) – A dictionary specifying the query chat config override.

  • attachments (None) – A dictionary of binary data to use to answer the queries.

create_batch_prediction(table_name=None, name=None, global_prediction_args=None, batch_prediction_args=None, explanations=False, output_format=None, output_location=None, database_connector_id=None, database_output_config=None, refresh_schedule=None, csv_input_prefix=None, csv_prediction_prefix=None, csv_explanations_prefix=None, output_includes_metadata=None, result_input_columns=None, input_feature_groups=None)

Creates a batch prediction job description for the given deployment.

Parameters:
  • table_name (str) – Name of the feature group table to write the results of the batch prediction. Can only be specified if outputLocation and databaseConnectorId are not specified. If tableName is specified, the outputType will be enforced as CSV.

  • name (str) – Name of the batch prediction job.

  • batch_prediction_args (BatchPredictionArgs) – Batch Prediction args specific to problem type.

  • output_format (str) – Format of the batch prediction output (CSV or JSON).

  • output_location (str) – Location to write the prediction results. Otherwise, results will be stored in Abacus.AI.

  • database_connector_id (str) – Unique identifier of a Database Connection to write predictions to. Cannot be specified in conjunction with outputLocation.

  • database_output_config (dict) – Key-value pair of columns/values to write to the database connector. Only available if databaseConnectorId is specified.

  • refresh_schedule (str) – Cron-style string that describes a schedule in UTC to automatically run the batch prediction.

  • csv_input_prefix (str) – Prefix to prepend to the input columns, only applies when output format is CSV.

  • csv_prediction_prefix (str) – Prefix to prepend to the prediction columns, only applies when output format is CSV.

  • csv_explanations_prefix (str) – Prefix to prepend to the explanation columns, only applies when output format is CSV.

  • output_includes_metadata (bool) – If true, output will contain columns including prediction start time, batch prediction version, and model version.

  • result_input_columns (list) – If present, will limit result files or feature groups to only include columns present in this list.

  • input_feature_groups (dict) – A dict of {‘<feature_group_type>’: ‘<feature_group_id>’} which overrides the default input data of that type for the Batch Prediction. Default input data is the training data that was used for training the deployed model.

  • global_prediction_args (Union[dict, abacusai.api_class.BatchPredictionArgs])

  • explanations (bool)

Returns:

The batch prediction description.

Return type:

BatchPrediction
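
A sketch of scheduling a nightly batch prediction that writes CSV output to cloud storage; the job name, bucket path, and cron schedule are example values:

    batch_prediction = deployment.create_batch_prediction(
        name="nightly-scoring",                          # hypothetical job name
        output_format="CSV",
        output_location="s3://your-bucket/predictions",  # placeholder location
        refresh_schedule="0 5 * * *",                    # run daily at 05:00 UTC
        output_includes_metadata=True,
    )
    print(batch_prediction)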

get_statistics_over_time(start_date, end_date)

Return basic access statistics for the given window

Parameters:
  • start_date (str) – Timeline start date in ISO format.

  • end_date (str) – Timeline end date in ISO format. The date range must be 7 days or less.

Returns:

Object describing time-series data of the number of requests and latency over the specified time period.

Return type:

DeploymentStatistics
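
A sketch of reading the request and latency series for a one-week window (the maximum allowed range); the dates are example values:

    stats = deployment.get_statistics_over_time(
        start_date="2024-01-01T00:00:00Z",
        end_date="2024-01-07T00:00:00Z",
    )
    # DeploymentStatistics exposes parallel series with shared date labels.
    for label, count in zip(stats.date_labels, stats.request_series):
        print(label, count)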

describe_feature_group_row_process_by_key(primary_key_value)

Gets the feature group row process.

Parameters:

primary_key_value (str) – The primary key value

Returns:

An object representing the feature group row process

Return type:

FeatureGroupRowProcess

list_feature_group_row_processes(limit=None, status=None)

Gets a list of feature group row processes.

Parameters:
  • limit (int) – The maximum number of processes to return. Defaults to None.

  • status (str) – The status of the processes to return. Defaults to None.

Returns:

A list of object representing the feature group row process

Return type:

list[FeatureGroupRowProcess]

get_feature_group_row_process_summary()

Gets a summary of the statuses of the individual feature group processes.

Parameters:

deployment_id (str) – The deployment id for the process

Returns:

An object representing the summary of the statuses of the individual feature group processes

Return type:

FeatureGroupRowProcessSummary

reset_feature_group_row_process_by_key(primary_key_value)

Resets a feature group row process so that it can be reprocessed

Parameters:

primary_key_value (str) – The primary key value

Returns:

An object representing the feature group row process.

Return type:

FeatureGroupRowProcess

get_feature_group_row_process_logs_by_key(primary_key_value)

Gets the logs for a feature group row process

Parameters:

primary_key_value (str) – The primary key value

Returns:

An object representing the logs for the feature group row process

Return type:

FeatureGroupRowProcessLogs

create_conversation(name=None, external_application_id=None)

Creates a deployment conversation.

Parameters:
  • name (str) – The name of the conversation.

  • external_application_id (str) – The external application id associated with the deployment conversation.

Returns:

The deployment conversation.

Return type:

DeploymentConversation
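
A sketch of starting a named conversation on the deployment; the name is a placeholder:

    conversation = deployment.create_conversation(name="support-session")  # hypothetical name
    print(conversation.deployment_conversation_id)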

list_conversations(external_application_id=None, conversation_type=None, fetch_last_llm_info=False)

Lists all conversations for the given deployment and current user.

Parameters:
  • external_application_id (str) – The external application id associated with the deployment conversation. If specified, only conversations created on that application will be listed.

  • conversation_type (DeploymentConversationType) – The type of the conversation indicating its origin.

  • fetch_last_llm_info (bool) – If true, the LLM info for the most recent conversation will be fetched. Only applicable for system-created bots.

Returns:

The deployment conversations.

Return type:

list[DeploymentConversation]

create_external_application(name=None, description=None, logo=None, theme=None)

Creates a new External Application from an existing ChatLLM Deployment.

Parameters:
  • name (str) – The name of the External Application. If not provided, the name of the deployment will be used.

  • description (str) – The description of the External Application. This will be shown to users when they access the External Application. If not provided, the description of the deployment will be used.

  • logo (str) – The logo to be displayed.

  • theme (dict) – The visual theme of the External Application.

Returns:

The newly created External Application.

Return type:

ExternalApplication

download_agent_attachment(attachment_id)

Return an agent attachment.

Parameters:

attachment_id (str) – The attachment ID.

wait_for_deployment(wait_states={'PENDING', 'DEPLOYING'}, timeout=900)

A waiting call until the deployment is completed.

Parameters:

timeout (int) – The time, in seconds, given to the call to finish; if it doesn't finish within the allocated time, the call is considered timed out.

wait_for_pending_deployment_update(timeout=900)

A waiting call until the deployment is in a stable state, i.e., the pending model switch has completed and the previous model has stopped.

Parameters:

timeout (int) – The time, in seconds, given to the call to finish; if it doesn't finish within the allocated time, the call is considered timed out.

Returns:

the latest deployment object.

Return type:

Deployment

get_status()

Gets the status of the deployment.

Returns:

A string describing the status of a deployment (pending, deploying, active, etc.).

Return type:

str

create_refresh_policy(cron)

Creates a refresh policy for a deployment.

Parameters:

cron (str) – A cron style string to set the refresh time.

Returns:

The refresh policy object.

Return type:

RefreshPolicy

list_refresh_policies()

Gets the refresh policies as a list.

Returns:

A list of refresh policy objects.

Return type:

List[RefreshPolicy]
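
A sketch of attaching a weekly refresh policy to the deployment; the cron string is an example value:

    # Update to the latest model version every Monday at 06:00 UTC.
    policy = deployment.create_refresh_policy("0 6 * * 1")
    print(deployment.list_refresh_policies())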

class abacusai.DeploymentAuthToken(client, deploymentToken=None, createdAt=None, name=None)

Bases: abacusai.return_class.AbstractApiClass

A deployment authentication token that is used to authenticate prediction requests

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • deploymentToken (str) – The unique token used to authenticate requests.

  • createdAt (str) – The date and time when the token was created, in ISO-8601 format.

  • name (str) – The name associated with the authentication token.

deployment_token
created_at
name
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.DeploymentConversation(client, deploymentConversationId=None, name=None, deploymentId=None, createdAt=None, lastEventCreatedAt=None, externalSessionId=None, regenerateAttempt=None, externalApplicationId=None, unusedDocumentUploadIds=None, humanizeInstructions=None, conversationWarning=None, conversationType=None, metadata=None, llmDisplayName=None, llmBotIcon=None, searchSuggestions=None, history={})

Bases: abacusai.return_class.AbstractApiClass

A deployment conversation.

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • deploymentConversationId (str) – The unique identifier of the deployment conversation.

  • name (str) – The name of the deployment conversation.

  • deploymentId (str) – The deployment id associated with the deployment conversation.

  • createdAt (str) – The timestamp at which the deployment conversation was created.

  • lastEventCreatedAt (str) – The timestamp at which the most recent deployment conversation event was created.

  • externalSessionId (str) – The external session id associated with the deployment conversation.

  • regenerateAttempt (int) – The sequence number of regeneration. Not regenerated if 0.

  • externalApplicationId (str) – The external application id associated with the deployment conversation.

  • unusedDocumentUploadIds (list[str]) – The list of unused document upload ids associated with the deployment conversation.

  • humanizeInstructions (dict) – Instructions for humanizing the conversation.

  • conversationWarning (str) – Extra text associated with the deployment conversation (shown at the bottom of the chatbot).

  • conversationType (str) – The type of the conversation, which depicts the application it caters to.

  • metadata (dict) – Additional backend information about the conversation.

  • llmDisplayName (str) – The display name of the LLM model used to generate the most recent response. Only used for system-created bots.

  • llmBotIcon (str) – The icon location of the LLM model used to generate the most recent response. Only used for system-created bots.

  • searchSuggestions (list) – The list of search suggestions for the conversation.

  • history (DeploymentConversationEvent) – The history of the deployment conversation.

deployment_conversation_id
name
deployment_id
created_at
last_event_created_at
external_session_id
regenerate_attempt
external_application_id
unused_document_upload_ids
humanize_instructions
conversation_warning
conversation_type
metadata
llm_display_name
llm_bot_icon
search_suggestions
history
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

get(external_session_id=None, deployment_id=None, filter_intermediate_conversation_events=True, get_unused_document_uploads=False)

Gets a deployment conversation.

Parameters:
  • external_session_id (str) – External session ID of the conversation.

  • deployment_id (str) – The deployment this conversation belongs to. This is required if not logged in.

  • filter_intermediate_conversation_events (bool) – If true, intermediate conversation events will be filtered out. Default is true.

  • get_unused_document_uploads (bool) – If true, unused document uploads will be returned. Default is false.

Returns:

The deployment conversation.

Return type:

DeploymentConversation

delete(deployment_id=None)

Delete a Deployment Conversation.

Parameters:

deployment_id (str) – The deployment this conversation belongs to. This is required if not logged in.

clear(external_session_id=None, deployment_id=None, user_message_indices=None)

Clear the message history of a Deployment Conversation.

Parameters:
  • external_session_id (str) – The external session id associated with the deployment conversation.

  • deployment_id (str) – The deployment this conversation belongs to. This is required if not logged in.

  • user_message_indices (list) – Optional list of user message indices to clear. The associated bot response will also be cleared. If not provided, all messages will be cleared.

set_feedback(message_index, is_useful=None, is_not_useful=None, feedback=None, feedback_type=None, deployment_id=None)

Sets a deployment conversation message as useful or not useful

Parameters:
  • message_index (int) – The index of the deployment conversation message

  • is_useful (bool) – If true, the message is marked as useful; if false, the useful flag is cleared.

  • is_not_useful (bool) – If true, the message is marked as not useful; if set to false, the flag is cleared.

  • feedback (str) – Optional feedback on why the message is useful or not useful

  • feedback_type (str) – Optional feedback type

  • deployment_id (str) – The deployment this conversation belongs to. This is required if not logged in.

rename(name, deployment_id=None)

Rename a Deployment Conversation.

Parameters:
  • name (str) – The new name of the conversation.

  • deployment_id (str) – The deployment this conversation belongs to. This is required if not logged in.

export(external_session_id=None)

Export a Deployment Conversation.

Parameters:

external_session_id (str) – The external session id associated with the deployment conversation. One of deployment_conversation_id or external_session_id must be provided.

Returns:

The deployment conversation html export.

Return type:

DeploymentConversationExport
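
A sketch of exporting a conversation (continuing from the conversation created under Deployment.create_conversation above) and saving the HTML transcript locally:

    export = conversation.export()
    with open("conversation.html", "w") as f:
        f.write(export.conversation_export_html)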

construct_agent_conversation_messages_for_llm(external_session_id=None, include_document_contents=True)

Returns conversation history in a format for LLM calls.

Parameters:
  • external_session_id (str) – External session ID of the conversation.

  • include_document_contents (bool) – If true, include contents from uploaded documents in the generated messages.

Returns:

Contains a list of AgentConversationMessage that represents the conversation.

Return type:

AgentConversation

class abacusai.DeploymentConversationEvent(client, role=None, text=None, timestamp=None, messageIndex=None, regenerateAttempt=None, modelVersion=None, searchResults=None, isUseful=None, feedback=None, feedbackType=None, docInfos=None, keywordArguments=None, inputParams=None, attachments=None, responseVersion=None, agentWorkflowNodeId=None, nextAgentWorkflowNodeId=None, chatType=None, agentResponse=None, error=None, segments=None, streamedData=None, streamedSectionData=None, highlights=None, llmDisplayName=None, llmBotIcon=None, formResponse=None, routedLlm=None)

Bases: abacusai.return_class.AbstractApiClass

A single deployment conversation message.

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • role (str) – The role of the message sender

  • text (str) – The text of the message

  • timestamp (str) – The timestamp at which the message was sent

  • messageIndex (int) – The index of the message in the conversation

  • regenerateAttempt (int) – The sequence number of regeneration. Not regenerated if 0.

  • modelVersion (str) – The model instance id associated with the message.

  • searchResults (dict) – The search results for the message.

  • isUseful (bool) – Whether this message was marked as useful or not

  • feedback (str) – The feedback provided for the message

  • feedbackType (str) – The type of feedback provided for the message

  • docInfos (list) – A list of information on the documents associated with the message

  • keywordArguments (dict) – User message only. A dictionary of keyword arguments used to generate response.

  • inputParams (dict) – User message only. A dictionary of input parameters used to generate response.

  • attachments (list) – A list of attachments associated with the message.

  • responseVersion (str) – The version of the response, used to differentiate it from the legacy agent response.

  • agentWorkflowNodeId (str) – The workflow node id associated with the agent response.

  • nextAgentWorkflowNodeId (str) – The id of the workflow node to be executed next.

  • chatType (str) – The type of chat llm that was run for the message.

  • agentResponse (dict) – Response from the agent. Only for conversation with agents.

  • error (str) – The error message in case of an error.

  • segments (list) – The segments of the message.

  • streamedData (str) – Aggregated streamed messages from the agent.

  • streamedSectionData (str) – Aggregated streamed section outputs from the agent in a list.

  • highlights (dict) – Chunks with bounding boxes for highlighting the result sources.

  • llmDisplayName (str) – The display name of the LLM model used to generate the response. Only used for system-created bots.

  • llmBotIcon (str) – The icon location of the LLM model used to generate the response. Only used for system-created bots.

  • formResponse (dict) – Contains form data response from the user when a Form Segment is given out by the bot.

  • routedLlm (str) – The LLM that was chosen by RouteLLM to generate the response.

role
text
timestamp
message_index
regenerate_attempt
model_version
search_results
is_useful
feedback
feedback_type
doc_infos
keyword_arguments
input_params
attachments
response_version
agent_workflow_node_id
next_agent_workflow_node_id
chat_type
agent_response
error
segments
streamed_data
streamed_section_data
highlights
llm_display_name
llm_bot_icon
form_response
routed_llm
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.DeploymentConversationExport(client, deploymentConversationId=None, conversationExportHtml=None)

Bases: abacusai.return_class.AbstractApiClass

A deployment conversation html export, to be used for downloading the conversation.

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • deploymentConversationId (str) – The unique identifier of the deployment conversation.

  • conversationExportHtml (str) – The html string of the deployment conversation.

deployment_conversation_id
conversation_export_html
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.DeploymentStatistics(client, requestSeries=None, latencySeries=None, dateLabels=None, httpStatusSeries=None)

Bases: abacusai.return_class.AbstractApiClass

A set of statistics for a realtime deployment.

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • requestSeries (list) – A list of the number of requests per second.

  • latencySeries (list) – A list of the latency in milliseconds for each request.

  • dateLabels (list) – A list of date labels for each point in the series.

  • httpStatusSeries (list) – A list of the HTTP status codes for each request.

request_series
latency_series
date_labels
http_status_series
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.DocumentData(client, docId=None, mimeType=None, pageCount=None, totalPageCount=None, extractedText=None, embeddedText=None, pages=None, tokens=None, metadata=None, pageMarkdown=None, extractedPageText=None, augmentedPageText=None)

Bases: abacusai.return_class.AbstractApiClass

Data extracted from a docstore document.

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • docId (str) – Unique Docstore string identifier for the document.

  • mimeType (str) – The mime type of the document.

  • pageCount (int) – The number of pages for which the data is available. This is generally the same as the total number of pages, but may be less than the total number of pages in the document if processing is done only for selected pages.

  • totalPageCount (int) – The total number of pages in the document.

  • extractedText (str) – The extracted text in the document obtained from OCR.

  • embeddedText (str) – The embedded text in the document. Only available for digital documents.

  • pages (list) – List of embedded text for each page in the document. Only available for digital documents.

  • tokens (list) – List of extracted tokens in the document obtained from OCR.

  • metadata (list) – List of metadata for each page in the document.

  • pageMarkdown (list) – The markdown text for the page.

  • extractedPageText (list) – List of extracted text for each page in the document obtained from OCR. Available when return_extracted_page_text parameter is set to True in the document data retrieval API.

  • augmentedPageText (list) – List of extracted text for each page in the document obtained from OCR augmented with embedded links in the document.

doc_id
mime_type
page_count
total_page_count
extracted_text
embedded_text
pages
tokens
metadata
page_markdown
extracted_page_text
augmented_page_text
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.DocumentRetriever(client, name=None, documentRetrieverId=None, createdAt=None, featureGroupId=None, featureGroupName=None, indexingRequired=None, latestDocumentRetrieverVersion={}, documentRetrieverConfig={})

Bases: abacusai.return_class.AbstractApiClass

A vector store that stores embeddings for a list of document chunks.

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • name (str) – The name of the document retriever.

  • documentRetrieverId (str) – The unique identifier of the vector store.

  • createdAt (str) – When the vector store was created.

  • featureGroupId (str) – The feature group id associated with the document retriever.

  • featureGroupName (str) – The feature group name associated with the document retriever.

  • indexingRequired (bool) – Whether the document retriever is required to be indexed due to changes in underlying data.

  • latestDocumentRetrieverVersion (DocumentRetrieverVersion) – The latest version of vector store.

  • documentRetrieverConfig (DocumentRetrieverConfig) – The config for vector store creation.

name
document_retriever_id
created_at
feature_group_id
feature_group_name
indexing_required
latest_document_retriever_version
document_retriever_config
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

rename(name)

Updates an existing document retriever.

Parameters:

name (str) – The name to update the document retriever with.

Returns:

The updated document retriever.

Return type:

DocumentRetriever

create_version(feature_group_id=None, document_retriever_config=None)

Creates a document retriever version from the latest version of the feature group that the document retriever is associated with.

Parameters:
  • feature_group_id (str) – The ID of the feature group to update the document retriever with.

  • document_retriever_config (VectorStoreConfig) – The configuration, including chunk_size and chunk_overlap_fraction, for document retrieval.

Returns:

The newly created document retriever version.

Return type:

DocumentRetrieverVersion
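
A sketch of re-indexing a document retriever whose underlying feature group has changed; the retriever ID and the client-side describe_document_retriever lookup are placeholders assumed for this example:

    retriever = client.describe_document_retriever("your_document_retriever_id")  # hypothetical ID

    if retriever.indexing_required:
        new_version = retriever.create_version()
        new_version.wait_until_ready()
        print(new_version.get_status(), new_version.number_of_chunks)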

refresh()

Calls describe and refreshes the current object’s fields

Returns:

The current object

Return type:

DocumentRetriever

describe()

Describe a Document Retriever.

Parameters:

document_retriever_id (str) – A unique string identifier associated with the document retriever.

Returns:

The document retriever object.

Return type:

DocumentRetriever

list_versions(limit=100, start_after_version=None)

List all the document retriever versions with a given ID.

Parameters:
  • limit (int) – The number of vector store versions to retrieve. The maximum value is 100.

  • start_after_version (str) – An offset parameter to exclude all document retriever versions up to this specified one.

Returns:

All the document retriever versions associated with the document retriever.

Return type:

list[DocumentRetrieverVersion]

get_document_snippet(document_id, start_word_index=None, end_word_index=None)

Get a snippet from documents in the document retriever.

Parameters:
  • document_id (str) – The ID of the document to retrieve the snippet from.

  • start_word_index (int) – If provided, will start the snippet at the index (of words in the document) specified.

  • end_word_index (int) – If provided, will end the snippet at the index (of words in the document) specified.

Returns:

The documentation snippet found from the document retriever.

Return type:

DocumentRetrieverLookupResult
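
A sketch of pulling a bounded snippet from a stored document, continuing from the retriever above; the document ID and word indices are placeholders:

    snippet = retriever.get_document_snippet(
        document_id="your_document_id",  # hypothetical docstore ID
        start_word_index=0,
        end_word_index=200,
    )
    print(snippet.document)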

restart()

Restart the document retriever if it is stopped or has failed. This will start the deployment of the document retriever, but will not wait for it to be ready. You need to call wait_until_ready to wait until the deployment is ready.

Parameters:

document_retriever_id (str) – A unique string identifier associated with the document retriever.

wait_until_ready(timeout=3600)

A waiting call until the document retriever is ready. It restarts the document retriever if it is stopped.

Parameters:

timeout (int) – The time, in seconds, given to the call to finish; if it doesn't finish within the allocated time, the call is considered timed out. Defaults to 3600 seconds.

wait_until_deployment_ready(timeout=3600)

A waiting call until the document retriever deployment is ready to serve.

Parameters:

timeout (int) – The time, in seconds, given to the call to finish; if it doesn't finish within the allocated time, the call is considered timed out. Defaults to 3600 seconds.

get_status()

Gets the status of the document retriever. It reflects the indexing status while indexing is in progress, and the deployment status once indexing is complete.

Returns:

A string describing the status of a document retriever (pending, indexing, complete, active, etc.).

Return type:

str

get_deployment_status()

Gets the deployment status of the document retriever.

Returns:

A string describing the deployment status of document retriever (pending, deploying, active, etc.).

Return type:

str

get_matching_documents(query, filters=None, limit=None, result_columns=None, max_words=None, num_retrieval_margin_words=None, max_words_per_chunk=None, score_multiplier_column=None, min_score=None, required_phrases=None, filter_clause=None, crowding_limits=None, include_text_search=False)

Look up the deployed document retriever with the given query and return the matching documents.

Original documents are split into chunks and stored in the document retriever. This lookup function returns the relevant chunks from the document retriever. The returned chunks may be expanded to include more words from the original documents and merged if they overlap, where permitted by the settings provided. The returned chunks are sorted by relevance.

Parameters:
  • query (str) – The query to search for.

  • filters (dict) – A dictionary mapping column names to a list of values to restrict the retrieved search results.

  • limit (int) – If provided, will limit the number of results to the value specified.

  • result_columns (list) – If provided, will limit the column properties present in each result to those specified in this list.

  • max_words (int) – If provided, will limit the total number of words in the results to the value specified.

  • num_retrieval_margin_words (int) – If provided, will add this number of words from left and right of the returned chunks.

  • max_words_per_chunk (int) – If provided, will limit the number of words in each chunk to the value specified. If the value provided is smaller than the actual size of chunk on disk, which is determined during document retriever creation, the actual size of chunk will be used. I.e, chunks looked up from document retrievers will not be split into smaller chunks during lookup due to this setting.

  • score_multiplier_column (str) – If provided, will use the values in this column to modify the relevance score of the returned chunks. Values in this column must be numeric.

  • min_score (float) – If provided, will filter out the results with score lower than the value specified.

  • required_phrases (list) – If provided, each result will have at least one of the phrases.

  • filter_clause (str) – If provided, filter the results of the query using this sql where clause.

  • crowding_limits (dict) – A dictionary mapping metadata columns to the maximum number of results per unique value of the column. This is used to ensure diversity of metadata attribute values in the results. If a particular attribute value has already reached its maximum count, further results with that same attribute value will be excluded from the final result set.

  • include_text_search (bool) – If true, combine the ranking of results from a BM25 text search over the documents with the vector search using reciprocal rank fusion. It leverages both lexical and semantic matching for better overall results. It’s particularly valuable in professional, technical, or specialized fields where both precision in terminology and understanding of context are important.

Returns:

The relevant documentation results found from the document retriever.

Return type:

list[DocumentRetrieverLookupResult]
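
A sketch of a hybrid lookup over the same retriever, combining vector search with BM25 text search; the query is a placeholder:

    retriever.wait_until_ready()  # ensure the retriever is deployed

    results = retriever.get_matching_documents(
        query="termination clauses in vendor contracts",
        limit=5,
        include_text_search=True,  # fuse BM25 and vector rankings
    )
    for result in results:
        print(round(result.score, 3), result.document_source)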

class abacusai.DocumentRetrieverConfig(client, chunkSize=None, chunkOverlapFraction=None, textEncoder=None, scoreMultiplierColumn=None, pruneVectors=None, indexMetadataColumns=None, useDocumentSummary=None, summaryInstructions=None)

Bases: abacusai.return_class.AbstractApiClass

A config for document retriever creation.

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • chunkSize (int) – The size of chunks for vector store, i.e., maximum number of words in the chunk.

  • chunkOverlapFraction (float) – The fraction of overlap between two consecutive chunks.

  • textEncoder (str) – The text encoder used to encode texts in the vector store.

  • scoreMultiplierColumn (str) – The values in this metadata column are used to modify the relevance scores of returned chunks.

  • pruneVectors (bool) – Corpus-specific transformation of vectors that applies dimensionality-reduction techniques to strip common components from the vectors.

  • indexMetadataColumns (bool) – If True, metadata columns of the FG will also be used for indexing and querying.

  • useDocumentSummary (bool) – If True, uses the summary of the document in addition to chunks of the document for indexing and querying.

  • summaryInstructions (str) – Instructions for the LLM to generate the document summary.

chunk_size
chunk_overlap_fraction
text_encoder
score_multiplier_column
prune_vectors
index_metadata_columns
use_document_summary
summary_instructions
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.DocumentRetrieverLookupResult(client, document=None, score=None, properties=None, pages=None, boundingBoxes=None, documentSource=None, imageIds=None)

Bases: abacusai.return_class.AbstractApiClass

Result of a document retriever lookup.

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • document (str) – The document that was looked up.

  • score (float) – Score of the document with respect to the query.

  • properties (dict) – Properties of the retrieved documents.

  • pages (list) – Pages of the retrieved text from the original document.

  • boundingBoxes (list) – Bounding boxes of the retrieved text from the original document.

  • documentSource (str) – Document source name.

  • imageIds (list) – List of Image IDs for all the pages.

document
score
properties
pages
bounding_boxes
document_source
image_ids
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.DocumentRetrieverVersion(client, documentRetrieverId=None, documentRetrieverVersion=None, createdAt=None, status=None, deploymentStatus=None, featureGroupId=None, featureGroupVersion=None, error=None, numberOfChunks=None, embeddingFileSize=None, warnings=None, resolvedConfig={})

Bases: abacusai.return_class.AbstractApiClass

A version of a document retriever.

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • documentRetrieverId (str) – The unique identifier of the Document Retriever.

  • documentRetrieverVersion (str) – The unique identifier of the Document Retriever version.

  • createdAt (str) – When the Document Retriever was created.

  • status (str) – The status of the Document Retriever version. It reflects the indexing status while indexing is in progress, and the deployment status once indexing is complete.

  • deploymentStatus (str) – The status of deploying the Document Retriever version.

  • featureGroupId (str) – The feature group id associated with the document retriever.

  • featureGroupVersion (str) – The unique identifier of the feature group version at which the Document Retriever version is created.

  • error (str) – The error message when it failed to create the document retriever version.

  • numberOfChunks (int) – The number of chunks for the document retriever.

  • embeddingFileSize (int) – The size of embedding file for the document retriever.

  • warnings (list) – The warning messages when creating the document retriever.

  • resolvedConfig (DocumentRetrieverConfig) – The resolved configurations, such as default settings, for indexing documents.

document_retriever_id
document_retriever_version
created_at
status
deployment_status
feature_group_id
feature_group_version
error
number_of_chunks
embedding_file_size
warnings
resolved_config
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

refresh()

Calls describe and refreshes the current object’s fields

Returns:

The current object

Return type:

DocumentRetrieverVersion

describe()

Describe a document retriever version.

Parameters:

document_retriever_version (str) – A unique string identifier associated with the document retriever version.

Returns:

The document retriever version object.

Return type:

DocumentRetrieverVersion

wait_for_results(timeout=3600)

A waiting call until the document retriever version is complete.

Parameters:

timeout (int) – The time, in seconds, given to the call to finish; if it doesn't finish within the allocated time, the call is considered timed out.

wait_until_ready(timeout=3600)

A waiting call until the document retriever version is ready. It restarts the document retriever if it is stopped.

Parameters:

timeout (int) – The time, in seconds, given to the call to finish; if it doesn't finish within the allocated time, the call is considered timed out.

wait_until_deployment_ready(timeout=3600)

A waiting call until the document retriever deployment is ready to serve.

Parameters:

timeout (int) – The time, in seconds, given to the call to finish; if it doesn't finish within the allocated time, the call is considered timed out. Defaults to 3600 seconds.

get_status()

Gets the status of the document retriever version.

Returns:

A string describing the status of a document retriever version (pending, complete, etc.).

Return type:

str

get_deployment_status()

Gets the deployment status of the document retriever version.

Returns:

A string describing the deployment status of a document retriever version (pending, deploying, etc.).

Return type:

str

class abacusai.DriftDistribution(client, trainColumn=None, predictedColumn=None, metrics=None, distribution={})

Bases: abacusai.return_class.AbstractApiClass

How actuals or predicted values have changed in the training data versus the predicted data

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • trainColumn (str) – The feature name in the train table.

  • predictedColumn (str) – The feature name in the prediction table.

  • metrics (dict) – Drift measures.

  • distribution (FeatureDistribution) – A FeatureDistribution, how the training data compares to the predicted data.

train_column
predicted_column
metrics
distribution
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.DriftDistributions(client, labelDrift={}, predictionDrift={}, bpPredictionDrift={})

Bases: abacusai.return_class.AbstractApiClass

For either actuals or predicted values, how they have changed in the training data versus some specified window

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • labelDrift (DriftDistribution) – A DriftDistribution describing column names and the range of values for label drift.

  • predictionDrift (DriftDistribution) – A DriftDistribution describing column names and the range of values for prediction drift.

  • bpPredictionDrift (DriftDistribution) – A DriftDistribution describing column names and the range of values for prediction drift, when the predictions come from BP.

label_drift
prediction_drift
bp_prediction_drift
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.Eda(client, edaId=None, name=None, createdAt=None, projectId=None, featureGroupId=None, referenceFeatureGroupVersion=None, testFeatureGroupVersion=None, edaConfigs=None, latestEdaVersion={}, refreshSchedules={})

Bases: abacusai.return_class.AbstractApiClass

An exploratory data analysis object

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • edaId (str) – The unique identifier of the eda object.

  • name (str) – The user-friendly name for the eda object.

  • createdAt (str) – Date and time at which the eda object was created.

  • projectId (str) – The project this eda object belongs to.

  • featureGroupId (str) – Feature group ID for which eda analysis is being done.

  • referenceFeatureGroupVersion (str) – The reference feature group version for data consistency analysis; this will be the latest feature group version for collinearity analysis.

  • testFeatureGroupVersion (str) – The test feature group version for data consistency analysis; this will be the latest feature group version for collinearity analysis.

  • edaConfigs (dict) – Configurations for eda object.

  • latestEdaVersion (EdaVersion) – The latest eda object version.

  • refreshSchedules (RefreshSchedule) – List of refresh schedules that indicate when the next model version will be trained.

eda_id
name
created_at
project_id
feature_group_id
reference_feature_group_version
test_feature_group_version
eda_configs
latest_eda_version
refresh_schedules
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

rerun()

Reruns the specified EDA object.

Parameters:

eda_id (str) – Unique string identifier of the EDA object to rerun.

Returns:

The EDA object that is being rerun.

Return type:

Eda

refresh()

Calls describe and refreshes the current object’s fields

Returns:

The current object

Return type:

Eda

describe()

Retrieves a full description of the specified EDA object.

Parameters:

eda_id (str) – Unique string identifier associated with the EDA object.

Returns:

Description of the EDA object.

Return type:

Eda

list_versions(limit=100, start_after_version=None)

Retrieves a list of versions for a given EDA object.

Parameters:
  • limit (int) – The maximum length of the list of all EDA versions.

  • start_after_version (str) – The ID of the version after which the list starts.

Returns:

A list of EDA versions.

Return type:

list[EdaVersion]
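
A sketch of re-running an EDA object and listing its recent versions; the EDA ID and the client-side describe_eda lookup are placeholders assumed for this example:

    eda = client.describe_eda("your_eda_id")  # hypothetical ID

    eda.rerun()
    for version in eda.list_versions(limit=5):
        print(version)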

rename(name)

Renames an EDA

Parameters:

name (str) – The new name to apply to the EDA object.

delete()

Deletes the specified EDA and all its versions.

Parameters:

eda_id (str) – Unique string identifier of the EDA to delete.

class abacusai.EdaChartDescription(client, chartType=None, description=None)

Bases: abacusai.return_class.AbstractApiClass

Eda Chart Description.

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • chartType (str) – Name of chart.

  • description (str) – Description of the eda chart.

chart_type
description
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.EdaCollinearity(client, columnNames=None, collinearityMatrix=None, groupFeatureDict=None, collinearityGroups=None, columnNamesX=None)

Bases: abacusai.return_class.AbstractApiClass

Eda Collinearity between all the features, computed on the latest version of the data.

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • columnNames (list) – Names of all the features on the y-axis of the collinearity matrix

  • collinearityMatrix (dict) – A dict describing the collinearity between all the features

  • groupFeatureDict (dict) – A dict describing the index of the group from collinearity_groups a feature exists in

  • collinearityGroups (list) – Groups created based on a collinearity threshold of 0.7

  • columnNamesX (list) – Names of all the features on the x-axis of the collinearity matrix

column_names
collinearity_matrix
group_feature_dict
collinearity_groups
column_names_x
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.EdaDataConsistency(client, columnNames=None, primaryKeys=None, transformationColumnNames=None, baseDuplicates={}, compareDuplicates={}, deletions={}, transformations={})

Bases: abacusai.return_class.AbstractApiClass

Eda Data Consistency, containing the duplicates in the base version and the comparison version, the deletions between the base and comparison versions, and the feature transformations between the base and comparison data.

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • columnNames (list) – Names of all the features in the data

  • primaryKeys (list) – Names of the primary keys in the data

  • transformationColumnNames (list) – Names of all the features that are not primary keys

  • baseDuplicates (DataConsistencyDuplication) – A DataConsistencyDuplication describing the number of duplicates within the data

  • compareDuplicates (DataConsistencyDuplication) – A DataConsistencyDuplication describing the number of duplicates within the data

  • deletions (DataConsistencyDeletion) – A DataConsistencyDeletion describing the number of deletions between two versions of the data

  • transformations (DataConsistencyTransformation) – A DataConsistencyTransformation describing the number of changes that occurred per feature in the data

column_names
primary_keys
transformation_column_names
base_duplicates
compare_duplicates
deletions
transformations
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.EdaFeatureAssociation(client, data=None, isScatter=None, isBoxWhisker=None, xAxis=None, yAxis=None, xAxisColumnValues=None, yAxisColumnValues=None, dataColumns=None)

Bases: abacusai.return_class.AbstractApiClass

Eda Feature Association between two features in the data.

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • data (dict) – The data used to display the feature association between the two features

  • isScatter (bool) – A Boolean that represents if the data creates a scatter plot (for cases of numerical data vs numerical data)

  • isBoxWhisker (bool) – A Boolean that represents if the data creates a box whisker plot (For cases of categorical data vs numerical data and vice versa)

  • xAxis (str) – Name of the feature selected for feature association (reference_feature_name) for x axis on the plot

  • yAxis (str) – Name of the feature selected for feature association (test_feature_name) for y axis on the plot

  • xAxisColumnValues (list) – Names of all the categories within the x_axis feature (if it is of a categorical data type)

  • yAxisColumnValues (list) – Names of all the categories within the y_axis feature (if it is of a categorical data type)

  • dataColumns (list) – A list of columns listed in the data as keys

data
is_scatter
is_box_whisker
x_axis
y_axis
x_axis_column_values
y_axis_column_values
data_columns
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.EdaFeatureCollinearity(client, selectedFeature=None, sortedColumnNames=None, featureCollinearity=None)

Bases: abacusai.return_class.AbstractApiClass

Eda Collinearity of the latest version of the data for a given feature.

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • selectedFeature (str) – Selected feature to show the collinearity

  • sortedColumnNames (list) – Names of all the features in the data, sorted in descending order of collinearity value

  • featureCollinearity (dict) – A dict describing the collinearity between a given feature and all the features in the data

selected_feature
sorted_column_names
feature_collinearity
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.EdaForecastingAnalysis(client, primaryKeys=None, forecastingTargetFeature=None, timestampFeature=None, forecastFrequency=None, salesAcrossTime={}, cummulativeContribution={}, missingValueDistribution={}, historyLength={}, numRowsHistogram={}, productMaturity={}, seasonalityYear={}, seasonalityMonth={}, seasonalityWeekOfYear={}, seasonalityDayOfYear={}, seasonalityDayOfMonth={}, seasonalityDayOfWeek={}, seasonalityQuarter={}, seasonalityHour={}, seasonalityMinute={}, seasonalitySecond={}, autocorrelation={}, partialAutocorrelation={})

Bases: abacusai.return_class.AbstractApiClass

Eda Forecasting Analysis of the latest version of the data.

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • primaryKeys (list) – Names of the primary keys in the data

  • forecastingTargetFeature (str) – Feature in the data that represents the target.

  • timestampFeature (str) – Feature in the data that represents the timestamp column.

  • forecastFrequency (str) – Frequency of the data: hourly, daily, weekly, monthly, quarterly, or yearly.

  • salesAcrossTime (ForecastingAnalysisGraphData) – Data showing average, p10, p90, median sales across time

  • cummulativeContribution (ForecastingAnalysisGraphData) – Data showing what percent of items contribute to what amount of sales.

  • missingValueDistribution (ForecastingAnalysisGraphData) – Data showing missing or null value distribution

  • historyLength (ForecastingAnalysisGraphData) – Data showing length of history distribution

  • numRowsHistogram (ForecastingAnalysisGraphData) – Data showing number of rows for an item distribution

  • productMaturity (ForecastingAnalysisGraphData) – Data showing how long a product has been alive, with average, p10, p90, and median

  • seasonalityYear (ForecastingAnalysisGraphData) – Data showing average, p10, p90, median sales across grouped years

  • seasonalityMonth (ForecastingAnalysisGraphData) – Data showing average, p10, p90, median sales across grouped months

  • seasonalityWeekOfYear (ForecastingAnalysisGraphData) – Data showing average, p10, p90, median sales across week of year seasonality

  • seasonalityDayOfYear (ForecastingAnalysisGraphData) – Data showing average, p10, p90, median sales across day of year seasonality

  • seasonalityDayOfMonth (ForecastingAnalysisGraphData) – Data showing average, p10, p90, median sales across day of month seasonality

  • seasonalityDayOfWeek (ForecastingAnalysisGraphData) – Data showing average, p10, p90, median sales across day of week seasonality

  • seasonalityQuarter (ForecastingAnalysisGraphData) – Data showing average, p10, p90, median sales across grouped quarters

  • seasonalityHour (ForecastingAnalysisGraphData) – Data showing average, p10, p90, median sales across grouped hours

  • seasonalityMinute (ForecastingAnalysisGraphData) – Data showing average, p10, p90, median sales across grouped minutes

  • seasonalitySecond (ForecastingAnalysisGraphData) – Data showing average, p10, p90, median sales across grouped seconds

  • autocorrelation (ForecastingAnalysisGraphData) – Data showing the correlation of the forecasting target and its lagged values at different time lags.

  • partialAutocorrelation (ForecastingAnalysisGraphData) – Data showing the correlation of the forecasting target and its lagged values, controlling for the effects of intervening lags.

primary_keys
forecasting_target_feature
timestamp_feature
forecast_frequency
sales_across_time
cummulative_contribution
missing_value_distribution
history_length
num_rows_histogram
product_maturity
seasonality_year
seasonality_month
seasonality_week_of_year
seasonality_day_of_year
seasonality_day_of_month
seasonality_day_of_week
seasonality_quarter
seasonality_hour
seasonality_minute
seasonality_second
autocorrelation
partial_autocorrelation
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.EdaVersion(client, edaVersion=None, status=None, edaId=None, edaStartedAt=None, edaCompletedAt=None, referenceFeatureGroupVersion=None, testFeatureGroupVersion=None, error=None)

Bases: abacusai.return_class.AbstractApiClass

A version of an eda object

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • edaVersion (str) – The unique identifier of an eda version.

  • status (str) – The current status of the eda object.

  • edaId (str) – A reference to the eda this version belongs to.

  • edaStartedAt (str) – The start time and date of the eda process.

  • edaCompletedAt (str) – The end time and date of the eda process.

  • referenceFeatureGroupVersion (list[str]) – Feature group version IDs that this refresh pipeline run is analyzing.

  • testFeatureGroupVersion (list[str]) – Feature group version IDs that this refresh pipeline run is analyzing.

  • error (str) – Relevant error if the status is FAILED.

eda_version
status
eda_id
eda_started_at
eda_completed_at
reference_feature_group_version
test_feature_group_version
error
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

refresh()

Calls describe and refreshes the current object’s fields

Returns:

The current object

Return type:

EdaVersion

describe()

Retrieves a full description of the specified EDA version.

Parameters:

eda_version (str) – Unique string identifier of the EDA version.

Returns:

An EDA version.

Return type:

EdaVersion

delete()

Deletes the specified EDA version.

Parameters:

eda_version (str) – Unique string identifier of the EDA version to delete.

get_eda_collinearity()

Gets the Collinearity between all features for the Exploratory Data Analysis.

Parameters:

eda_version (str) – Unique string identifier associated with the EDA instance.

Returns:

An object with a record of correlations between each pair of features for the EDA.

Return type:

EdaCollinearity

get_eda_data_consistency(transformation_feature=None)

Gets the data consistency for the Exploratory Data Analysis.

Parameters:

transformation_feature (str) – The transformation feature to get consistency for.

Returns:

Object with duplication, deletion, and transformation data for data consistency analysis for an EDA.

Return type:

EdaDataConsistency

get_collinearity_for_feature(feature_name=None)

Gets the Collinearity for the given feature from the Exploratory Data Analysis.

Parameters:

feature_name (str) – Name of the feature for which correlation is shown.

Returns:

Object with a record of correlations for the provided feature for an EDA.

Return type:

EdaFeatureCollinearity

get_feature_association(reference_feature_name, test_feature_name)

Gets the Feature Association for the given features from the feature group version within the eda_version.

Parameters:
  • reference_feature_name (str) – Name of the feature for feature association (on x-axis for the plots generated for the Feature association in the product).

  • test_feature_name (str) – Name of the feature for feature association (on y-axis for the plots generated for the Feature association in the product).

Returns:

An object with a record of data for the feature association between the two given features for an EDA version.

Return type:

EdaFeatureAssociation

get_eda_forecasting_analysis()

Gets the Forecasting analysis for the Exploratory Data Analysis.

Parameters:

eda_version (str) – Unique string identifier associated with the EDA version.

Returns:

Object with forecasting analysis that includes sales_across_time, cummulative_contribution, missing_value_distribution, history_length, num_rows_histogram, product_maturity data.

Return type:

EdaForecastingAnalysis

wait_for_eda(timeout=1200)

A waiting call until the EDA version is ready.

Parameters:

timeout (int) – The maximum time, in seconds, to wait for the call to finish; if it does not finish within the allocated time, the call is timed out.

get_status()

Gets the status of the eda version.

Returns:

A string describing the status of the EDA version, e.g., pending, complete, etc.

Return type:

str
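
Example (illustrative): waiting on an EDA version and then pulling its analyses. This sketch assumes eda_version is an EdaVersion (e.g., taken from Eda.list_versions() above); the status string and the feature names passed to get_feature_association are assumptions.

eda_version.wait_for_eda(timeout=1200)           # block until the run finishes or times out

if eda_version.get_status() == 'COMPLETE':       # assumed status string; see get_status() above
    collinearity = eda_version.get_eda_collinearity()
    consistency = eda_version.get_eda_data_consistency()
    association = eda_version.get_feature_association(
        reference_feature_name='price',          # hypothetical feature
        test_feature_name='quantity')            # hypothetical feature
else:
    print(eda_version.error)                     # relevant error if the status is FAILED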

class abacusai.EmbeddingFeatureDriftDistribution(client, distance=None, jsDistance=None, wsDistance=None, ksStatistic=None, psi=None, csi=None, chiSquare=None, averageDrift={})

Bases: abacusai.return_class.AbstractApiClass

Feature distribution for embeddings

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • distance (list) – Histogram data of KL divergences between the training distribution and the range of values in the specified window.

  • jsDistance (list) – Histogram data of JS divergence between the training distribution and the range of values in the specified window.

  • wsDistance (list) – Histogram data of Wasserstein distance between the training distribution and the range of values in the specified window.

  • ksStatistic (list) – Histogram data of Kolmogorov-Smirnov statistic computed between the training distribution and the range of values in the specified window.

  • psi (list) – Histogram data of Population stability index computed between the training distribution and the range of values in the specified window.

  • csi (list) – Histogram data of Characteristic Stability Index computed between the training distribution and the range of values in the specified window.

  • chiSquare (list) – Histogram data of Chi-square statistic computed between the training distribution and the range of values in the specified window.

  • averageDrift (DriftTypesValue) – Average drift embedding for each type of drift

distance
js_distance
ws_distance
ks_statistic
psi
csi
chi_square
average_drift
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.ExecuteFeatureGroupOperation(client, featureGroupOperationRunId=None, status=None, error=None, query=None)

Bases: abacusai.return_class.AbstractApiClass

The result of executing a SQL query

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • featureGroupOperationRunId (str) – The run id of the operation

  • status (str) – The status of the operation

  • error (str) – The error message if the operation failed

  • query (str) – The SQL query of the operation

feature_group_operation_run_id
status
error
query
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

wait_for_results(timeout=3600, delay=2)

A waiting call until the query has executed.

Parameters:
  • timeout (int) – The maximum time, in seconds, to wait for the call to finish; if it does not finish within the allocated time, the call is timed out.

  • delay (int) – Polling interval, in seconds, between status checks.

wait_for_execution(timeout=3600, delay=2)

A waiting call until the query has executed.

Parameters:
  • timeout (int) – The maximum time, in seconds, to wait for the call to finish; if it does not finish within the allocated time, the call is timed out.

  • delay (int) – Polling interval, in seconds, between status checks.

get_status()

Gets the status of the query execution

Returns:

A string describing the status of a query execution (pending, complete, etc.).

Return type:

str

refresh()

Calls describe and refreshes the current object’s fields

Returns:

The current object

Return type:

ExecuteFeatureGroupOperation

describe()

Gets the description of the query execution

Returns:

An ExecuteFeatureGroupOperation object describing the query execution.

Return type:

ExecuteFeatureGroupOperation

_download_avro_file(file_part, tmp_dir, part_index)

load_as_pandas(max_workers=10)

Loads the result data into a pandas dataframe

Parameters:

max_workers (int) – The number of threads.

Returns:

A pandas DataFrame containing the data from the execution.

Return type:

DataFrame
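
Example (illustrative): the intended flow is to start a SQL operation, wait for it, then load the result. This sketch assumes operation is an ExecuteFeatureGroupOperation returned by a client-level call that executes feature group SQL (not shown here), and the status string is an assumption.

operation.wait_for_execution(timeout=3600, delay=2)

if operation.get_status() == 'COMPLETE':          # assumed status string; see get_status() above
    df = operation.load_as_pandas(max_workers=10)
    print(df.head())
else:
    print(operation.error)                        # error message if the operation failed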

class abacusai.ExternalApplication(client, name=None, externalApplicationId=None, deploymentId=None, description=None, logo=None, theme=None, userGroupIds=None, useCase=None, isAgent=None, status=None, deploymentConversationRetentionHours=None, managedUserService=None, predictionOverrides=None, isSystemCreated=None, isCustomizable=None, isDeprecated=None)

Bases: abacusai.return_class.AbstractApiClass

An external application.

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • name (str) – The name of the external application.

  • externalApplicationId (str) – The unique identifier of the external application.

  • deploymentId (str) – The deployment id associated with the external application.

  • description (str) – The description of the external application.

  • logo (str) – The logo.

  • theme (dict) – The theme used for the External Application.

  • userGroupIds (list) – A list of App User Groups with access to this external application.

  • useCase (str) – Use case of the project of this deployment.

  • isAgent (bool) – Whether the external application is an agent.

  • status (str) – The status of the deployment.

  • deploymentConversationRetentionHours (int) – The retention policy for the external application.

  • managedUserService (str) – The external service that is managing the user accounts.

  • predictionOverrides (dict) – The prediction overrides for the external application.

  • isSystemCreated (bool) – Whether the external application is system created.

  • isCustomizable (bool) – Whether the external application is customizable.

  • isDeprecated (bool) – Whether the external application is deprecated. Only applicable for system created bots. Deprecated external applications will not show in the UI.

name
external_application_id
deployment_id
description
theme
user_group_ids
use_case
is_agent
status
deployment_conversation_retention_hours
managed_user_service
prediction_overrides
is_system_created
is_customizable
is_deprecated
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

update(name=None, description=None, theme=None, deployment_id=None, deployment_conversation_retention_hours=None, reset_retention_policy=False)

Updates an External Application.

Parameters:
  • name (str) – The name of the External Application.

  • description (str) – The description of the External Application. This will be shown to users when they access the External Application.

  • theme (dict) – The visual theme of the External Application.

  • deployment_id (str) – The ID of the deployment to use.

  • deployment_conversation_retention_hours (int) – The number of hours to retain the conversations for.

  • reset_retention_policy (bool) – If true, the retention policy will be removed.

Returns:

The updated External Application.

Return type:

ExternalApplication

refresh()

Calls describe and refreshes the current object’s fields

Returns:

The current object

Return type:

ExternalApplication

describe()

Describes an External Application.

Parameters:

external_application_id (str) – The ID of the External Application.

Returns:

The External Application.

Return type:

ExternalApplication

delete()

Deletes an External Application.

Parameters:

external_application_id (str) – The ID of the External Application.
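
Example (illustrative): refreshing and updating an External Application. This sketch assumes app is an ExternalApplication that has already been obtained; the name, description, and retention value are hypothetical.

app.refresh()                                     # sync local fields with the server
app = app.update(
    name='Support Assistant',                     # hypothetical
    description='Internal support chatbot.',      # hypothetical
    deployment_conversation_retention_hours=72)
print(app.to_dict())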

class abacusai.ExternalInvite(client, userAlreadyInOrg=None, userAlreadyInAppGroup=None, userExistsAsInternal=None, successfulInvites=None)

Bases: abacusai.return_class.AbstractApiClass

The response to the invites for different emails

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • userAlreadyInOrg (list) – List of user emails not successfully invited, because they are already in the organization.

  • userAlreadyInAppGroup (list) – List of user emails not successfully invited, because they are already in the application group.

  • userExistsAsInternal (list) – List of user emails not successfully invited, because they are already internal users.

  • successfulInvites (list) – List of users successfully invited.

user_already_in_org
user_already_in_app_group
user_exists_as_internal
successful_invites
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.ExtractedFields(client, data=None, rawLlmResponse=None)

Bases: abacusai.return_class.AbstractApiClass

The fields extracted from a document.

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • data (dict) – The fields/data extracted from the document.

  • rawLlmResponse (str) – The raw LLM response. Only returned if it could not be parsed into a JSON dict.

data
raw_llm_response
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.Feature(client, name=None, selectClause=None, featureMapping=None, sourceTable=None, originalName=None, usingClause=None, orderClause=None, whereClause=None, featureType=None, dataType=None, detectedFeatureType=None, detectedDataType=None, columns={}, pointInTimeInfo={})

Bases: abacusai.return_class.AbstractApiClass

A feature in a feature group

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • name (str) – The unique name of the column

  • selectClause (str) – The SQL logic for creating this feature’s data

  • featureMapping (str) – The Feature Mapping of the feature

  • sourceTable (str) – The source table of the column

  • originalName (str) – The original name of the column

  • usingClause (str) – Nested Column Using Clause

  • orderClause (str) – Nested Column Ordering Clause

  • whereClause (str) – Nested Column Where Clause

  • featureType (str) – Feature Type of the Feature

  • dataType (str) – Data Type of the Feature

  • detectedFeatureType (str) – The detected feature type of the column

  • detectedDataType (str) – The detected data type of the column

  • columns (NestedFeature) – Nested Features

  • pointInTimeInfo (PointInTimeFeature) – Point in time column information

name
select_clause
feature_mapping
source_table
original_name
using_clause
order_clause
where_clause
feature_type
data_type
detected_feature_type
detected_data_type
columns
point_in_time_info
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.FeatureDistribution(client, type=None, trainingDistribution=None, predictionDistribution=None, numericalTrainingDistribution=None, numericalPredictionDistribution=None, trainingStatistics=None, predictionStatistics=None)

Bases: abacusai.return_class.AbstractApiClass

For a single feature, how it has changed in the training data versus some specified window

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • type (str) – Data type of values in each distribution, typically ‘categorical’ or ‘numerical’.

  • trainingDistribution (dict) – A dict describing the range of values in the training distribution.

  • predictionDistribution (dict) – A dict describing the range of values in the specified window.

  • numericalTrainingDistribution (dict) – A dict describing the summary statistics of the numerical training distribution.

  • numericalPredictionDistribution (dict) – A dict describing the summary statistics of the numerical prediction distribution.

  • trainingStatistics (dict) – A dict describing summary statistics of values in the training distribution.

  • predictionStatistics (dict) – A dict describing summary statistics of values in the specified window.

type
training_distribution
prediction_distribution
numerical_training_distribution
numerical_prediction_distribution
training_statistics
prediction_statistics
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.FeatureDriftRecord(client, name=None, distance=None, jsDistance=None, wsDistance=None, ksStatistic=None, psi=None, csi=None, chiSquare=None)

Bases: abacusai.return_class.AbstractApiClass

Value of each type of drift

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • name (str) – Name of feature.

  • distance (float) – Symmetric sum of KL divergences between the training distribution and the range of values in the specified window.

  • jsDistance (float) – JS divergence between the training distribution and the range of values in the specified window.

  • wsDistance (float) – Wasserstein distance between the training distribution and the range of values in the specified window.

  • ksStatistic (float) – Kolmogorov-Smirnov statistic computed between the training distribution and the range of values in the specified window.

  • psi (float) – Population stability index computed between the training distribution and the range of values in the specified window.

  • csi (float) – Characteristic Stability Index computed between the training distribution and the range of values in the specified window.

  • chiSquare (float) – Chi-square statistic computed between the training distribution and the range of values in the specified window.

name
distance
js_distance
ws_distance
ks_statistic
psi
csi
chi_square
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.FeatureDriftSummary(client, featureIndex=None, name=None, distance=None, jsDistance=None, wsDistance=None, ksStatistic=None, predictionDrift=None, targetColumn=None, dataIntegrityTimeseries=None, nestedSummary=None, psi=None, csi=None, chiSquare=None, nullViolations={}, rangeViolations={}, catViolations={})

Bases: abacusai.return_class.AbstractApiClass

Summary of important model monitoring statistics for features available in a model monitoring instance

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • featureIndex (list[dict]) – A list of dicts of eligible feature names and corresponding overall feature drift measures.

  • name (str) – Name of feature.

  • distance (float) – Symmetric sum of KL divergences between the training distribution and the range of values in the specified window.

  • jsDistance (float) – JS divergence between the training distribution and the range of values in the specified window.

  • wsDistance (float) – Wasserstein distance between the training distribution and the range of values in the specified window.

  • ksStatistic (float) – Kolmogorov-Smirnov statistic computed between the training distribution and the range of values in the specified window.

  • predictionDrift (float) – Drift for the target column.

  • targetColumn (str) – Target column name.

  • dataIntegrityTimeseries (dict) – Frequency vs Data Integrity Violation Charts.

  • nestedSummary (list[dict]) – Summary of model monitoring statistics for nested features.

  • psi (float) – Population stability index computed between the training distribution and the range of values in the specified window.

  • csi (float) – Characteristic Stability Index computed between the training distribution and the range of values in the specified window.

  • chiSquare (float) – Chi-square statistic computed between the training distribution and the range of values in the specified window.

  • nullViolations (NullViolation) – A list of dicts of feature names and a description of corresponding null violations.

  • rangeViolations (RangeViolation) – A list of dicts of numerical feature names and corresponding prediction range discrepancies.

  • catViolations (CategoricalRangeViolation) – A list of dicts of categorical feature names and corresponding prediction range discrepancies.

feature_index
name
distance
js_distance
ws_distance
ks_statistic
prediction_drift
target_column
data_integrity_timeseries
nested_summary
psi
csi
chi_square
null_violations
range_violations
cat_violations
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict
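
Example (illustrative): because featureIndex is a plain list of dicts, surfacing the most drifted features is a short loop. This sketch assumes summary is a FeatureDriftSummary and that each feature_index entry carries 'name' and 'distance' keys — an assumption based on the parameter description above.

ranked = sorted(summary.feature_index,
                key=lambda f: f.get('distance', 0),  # key names assumed from the docs
                reverse=True)
for entry in ranked[:5]:                             # five most drifted features
    print(entry.get('name'), entry.get('distance'))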

class abacusai.FeatureGroup(client, featureGroupId=None, modificationLock=None, name=None, featureGroupSourceType=None, tableName=None, sql=None, datasetId=None, functionSourceCode=None, functionName=None, sourceTables=None, createdAt=None, description=None, sqlError=None, latestVersionOutdated=None, referencedFeatureGroups=None, tags=None, primaryKey=None, updateTimestampKey=None, lookupKeys=None, streamingEnabled=None, incremental=None, mergeConfig=None, samplingConfig=None, cpuSize=None, memory=None, streamingReady=None, featureTags=None, moduleName=None, templateBindings=None, featureExpression=None, useOriginalCsvNames=None, pythonFunctionBindings=None, pythonFunctionName=None, useGpu=None, versionLimit=None, exportOnMaterialization=None, features={}, duplicateFeatures={}, pointInTimeGroups={}, annotationConfig={}, concatenationConfig={}, indexingConfig={}, codeSource={}, featureGroupTemplate={}, explanation={}, refreshSchedules={}, exportConnectorConfig={}, latestFeatureGroupVersion={}, operatorConfig={})

Bases: abacusai.return_class.AbstractApiClass

A feature group.

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • featureGroupId (str) – Unique identifier for this feature group.

  • modificationLock (bool) – If feature group is locked against a change or not.

  • name (str) – The name of the feature group.

  • featureGroupSourceType (str) – The source type of the feature group

  • tableName (str) – Unique table name of this feature group.

  • sql (str) – SQL definition creating this feature group.

  • datasetId (str) – Dataset ID the feature group is sourced from.

  • functionSourceCode (str) – Source definition creating this feature group.

  • functionName (str) – Function name to execute from the source code.

  • sourceTables (list[str]) – Source tables for this feature group.

  • createdAt (str) – Timestamp at which the feature group was created.

  • description (str) – Description of the feature group.

  • sqlError (str) – Error message, if any, associated with this feature group.

  • latestVersionOutdated (bool) – Whether the latest materialized feature group version is outdated.

  • referencedFeatureGroups (list[str]) – Feature groups this feature group is used in.

  • tags (list[str]) – Tags added to this feature group.

  • primaryKey (str) – Primary index feature.

  • updateTimestampKey (str) – Primary timestamp feature.

  • lookupKeys (list[str]) – Additional indexed features for this feature group.

  • streamingEnabled (bool) – If true, the feature group can have data streamed to it.

  • incremental (bool) – If feature group corresponds to an incremental dataset.

  • mergeConfig (dict) – Merge configuration settings for the feature group.

  • samplingConfig (dict) – Sampling configuration for the feature group.

  • cpuSize (str) – CPU size specified for the Python feature group.

  • memory (int) – Memory in GB specified for the Python feature group.

  • streamingReady (bool) – If true, the feature group is ready to receive streaming data.

  • featureTags (dict) – Tags for features in this feature group

  • moduleName (str) – Path to the file with the feature group function.

  • templateBindings (dict) – Config specifying variable names and values to use when resolving a feature group template.

  • featureExpression (str) – If the dataset feature group has custom features, the SQL select expression creating those features.

  • useOriginalCsvNames (bool) – If true, the feature group will use the original column names in the source dataset.

  • pythonFunctionBindings (dict) – Config specifying variable names, types, and values to use when resolving a Python feature group.

  • pythonFunctionName (str) – Name of the Python function the feature group was built from.

  • useGpu (bool) – Whether this feature group is using GPU

  • versionLimit (int) – Version limit for the feature group.

  • exportOnMaterialization (bool) – Whether to export the feature group on materialization.

  • features (Feature) – List of resolved features.

  • duplicateFeatures (Feature) – List of duplicate features.

  • pointInTimeGroups (PointInTimeGroup) – List of Point In Time Groups.

  • annotationConfig (AnnotationConfig) – Annotation config for this feature group

  • latestFeatureGroupVersion (FeatureGroupVersion) – Latest feature group version.

  • concatenationConfig (ConcatenationConfig) – Feature group ID whose data will be concatenated into this feature group.

  • indexingConfig (IndexingConfig) – Indexing config for the feature group for feature store

  • codeSource (CodeSource) – If a Python feature group, information on the source code.

  • featureGroupTemplate (FeatureGroupTemplate) – FeatureGroupTemplate to use when this feature group is attached to a template.

  • explanation (NaturalLanguageExplanation) – Natural language explanation of the feature group

  • refreshSchedules (RefreshSchedule) – List of schedules that determines when the next version of the feature group will be created.

  • exportConnectorConfig (FeatureGroupRefreshExportConfig) – The export config (file connector or database connector information) for feature group exports.

  • operatorConfig (OperatorConfig) – Operator configuration settings for the feature group.

feature_group_id
modification_lock
name
feature_group_source_type
table_name
sql
dataset_id
function_source_code
function_name
source_tables
created_at
description
sql_error
latest_version_outdated
referenced_feature_groups
tags
primary_key
update_timestamp_key
lookup_keys
streaming_enabled
incremental
merge_config
sampling_config
cpu_size
memory
streaming_ready
feature_tags
module_name
template_bindings
feature_expression
use_original_csv_names
python_function_bindings
python_function_name
use_gpu
version_limit
export_on_materialization
features
duplicate_features
point_in_time_groups
annotation_config
concatenation_config
indexing_config
code_source
feature_group_template
explanation
refresh_schedules
export_connector_config
latest_feature_group_version
operator_config
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

add_to_project(project_id, feature_group_type='CUSTOM_TABLE')

Adds a feature group to a project.

Parameters:
  • project_id (str) – The unique ID associated with the project.

  • feature_group_type (str) – The feature group type of the feature group, based on the use case under which the feature group is being created.

set_project_config(project_id, project_config=None)

Sets a feature group’s project config

Parameters:
  • project_id (str) – Unique string identifier for the project.

  • project_config (ProjectFeatureGroupConfig) – Feature group’s project configuration.

get_project_config(project_id)

Gets a feature group’s project config

Parameters:

project_id (str) – Unique string identifier for the project.

Returns:

The feature group’s project configuration.

Return type:

ProjectConfig

remove_from_project(project_id)

Removes a feature group from a project.

Parameters:

project_id (str) – The unique ID associated with the project.

set_type(project_id, feature_group_type='CUSTOM_TABLE')

Update the feature group type in a project. The feature group must already be added to the project.

Parameters:
  • project_id (str) – Unique identifier associated with the project.

  • feature_group_type (str) – The feature group type to set the feature group as.
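
Example (illustrative): attaching a feature group to a project and reading back its project config. This sketch assumes fg is a FeatureGroup and project_id refers to an existing project.

fg.add_to_project(project_id, feature_group_type='CUSTOM_TABLE')
fg.set_type(project_id, feature_group_type='CUSTOM_TABLE')
config = fg.get_project_config(project_id)   # returns a ProjectConfig

# ...later, detach it again:
fg.remove_from_project(project_id)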

describe_annotation(feature_name=None, doc_id=None, feature_group_row_identifier=None)

Get the latest annotation entry for a given feature group, feature, and document.

Parameters:
  • feature_name (str) – The name of the feature the annotation is on.

  • doc_id (str) – The ID of the primary document the annotation is on. At least one of the doc_id or feature_group_row_identifier must be provided in order to identify the correct annotation.

  • feature_group_row_identifier (str) – The key value of the feature group row the annotation is on (cast to string). Usually the feature group’s primary / identifier key value. At least one of the doc_id or feature_group_row_identifier must be provided in order to identify the correct annotation.

Returns:

The latest annotation entry for the given feature group, feature, document, and/or annotation key value.

Return type:

AnnotationEntry

verify_and_describe_annotation(feature_name=None, doc_id=None, feature_group_row_identifier=None)

Get the latest annotation entry for a given feature group, feature, and document along with verification information.

Parameters:
  • feature_name (str) – The name of the feature the annotation is on.

  • doc_id (str) – The ID of the primary document the annotation is on. At least one of the doc_id or feature_group_row_identifier must be provided in order to identify the correct annotation.

  • feature_group_row_identifier (str) – The key value of the feature group row the annotation is on (cast to string). Usually the feature group’s primary / identifier key value. At least one of the doc_id or feature_group_row_identifier must be provided in order to identify the correct annotation.

Returns:

The latest annotation entry for the given feature group, feature, document, and/or annotation key value. Includes the verification information.

Return type:

AnnotationEntry

update_annotation_status(feature_name, status, doc_id=None, feature_group_row_identifier=None, save_metadata=False)

Update the status of an annotation entry.

Parameters:
  • feature_name (str) – The name of the feature the annotation is on.

  • status (str) – The new status of the annotation. Must be one of the following: ‘TODO’, ‘IN_PROGRESS’, ‘DONE’.

  • doc_id (str) – The ID of the primary document the annotation is on. At least one of the doc_id or feature_group_row_identifier must be provided in order to identify the correct annotation.

  • feature_group_row_identifier (str) – The key value of the feature group row the annotation is on (cast to string). Usually the feature group’s primary / identifier key value. At least one of the doc_id or feature_group_row_identifier must be provided in order to identify the correct annotation.

  • save_metadata (bool) – If True, save the metadata for the annotation entry.

Returns:

The updated annotation entry.

Return type:

AnnotationEntry

get_document_to_annotate(project_id, feature_name, feature_group_row_identifier=None, get_previous=False)

Get an available document that needs to be annotated for an annotation feature group.

Parameters:
  • project_id (str) – The ID of the project that the annotation is associated with.

  • feature_name (str) – The name of the feature the annotation is on.

  • feature_group_row_identifier (str) – The key value of the feature group row the annotation is on (cast to string). Usually the primary key value. If provided, fetch the immediate next (or previous) available document.

  • get_previous (bool) – If True, get the previous document instead of the next document. Applicable if feature_group_row_identifier is provided.

Returns:

The document to annotate.

Return type:

AnnotationDocument

get_annotations_status(feature_name=None, check_for_materialization=False)

Get the status of the annotations for a given feature group and feature.

Parameters:
  • feature_name (str) – The name of the feature the annotation is on.

  • check_for_materialization (bool) – If True, check if the feature group needs to be materialized before using for annotations.

Returns:

The status of the annotations for the given feature group and feature.

Return type:

AnnotationsStatus

import_annotation_labels(file, annotation_type)

Imports annotation labels from a CSV file. All valid values in the file will be imported as labels (including the header row, if present).

Parameters:
  • file (io.TextIOBase) – The file to import. Must be a CSV file.

  • annotation_type (str) – The type of the annotation.

Returns:

The annotation config for the feature group.

Return type:

AnnotationConfig
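
Example (illustrative): one pass through the annotation loop. This sketch assumes fg is an annotation feature group in project project_id; the feature name 'label' is hypothetical, and the doc_id attribute on the returned AnnotationDocument is an assumption.

status = fg.get_annotations_status(feature_name='label',
                                   check_for_materialization=True)
doc = fg.get_document_to_annotate(project_id, feature_name='label')
entry = fg.update_annotation_status(
    feature_name='label',
    status='IN_PROGRESS',                  # must be 'TODO', 'IN_PROGRESS', or 'DONE'
    doc_id=doc.doc_id)                     # doc_id attribute assumed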

create_sampling(table_name, sampling_config, description=None)

Creates a new Feature Group defined as a sample of rows from another Feature Group.

For efficiency, sampling is approximate unless otherwise specified (e.g., the number of rows may vary slightly from what was requested).

Parameters:
  • table_name (str) – The unique name to be given to this sampling Feature Group. Can be up to 120 characters long and can only contain alphanumeric characters and underscores.

  • sampling_config (SamplingConfig) – Dictionary defining the sampling method and its parameters.

  • description (str) – A human-readable description of this Feature Group.

Returns:

The created Feature Group.

Return type:

FeatureGroup

set_sampling_config(sampling_config)

Set a FeatureGroup’s sampling to the config values provided, so that the rows the FeatureGroup returns will be a sample of those it would otherwise have returned.

Parameters:

sampling_config (SamplingConfig) – A JSON string object specifying the sampling method and parameters specific to that sampling method. An empty sampling_config indicates no sampling.

Returns:

The updated FeatureGroup.

Return type:

FeatureGroup
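
Example (illustrative): creating a sampled copy of a feature group. This sketch assumes fg is a FeatureGroup; the sampling_config shape shown here (a fixed-count sample) is one plausible form of SamplingConfig, not the only supported method, and the table name is hypothetical.

sample_fg = fg.create_sampling(
    table_name='events_sample',            # hypothetical table name
    sampling_config={'sampling_method': 'N_SAMPLING',   # assumed config shape
                     'sample_count': 10000},
    description='Approximate 10k-row sample of events')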

set_merge_config(merge_config)

Set a MergeFeatureGroup’s merge config to the values provided, so that the feature group only returns a bounded range of an incremental dataset.

Parameters:

merge_config (MergeConfig) – JSON object string specifying the merge rule. An empty merge_config will default to only including the latest dataset version.

Returns:

The updated FeatureGroup.

Return type:

FeatureGroup

set_operator_config(operator_config)

Set a OperatorFeatureGroup’s operator config to the values provided.

Parameters:

operator_config (OperatorConfig) – A dictionary object specifying the pre-defined operations.

Returns:

The updated FeatureGroup.

Return type:

FeatureGroup

set_schema(schema)

Creates a new schema and points the feature group to the new feature group schema ID.

Parameters:

schema (list) – JSON string containing an array of objects with ‘name’ and ‘dataType’ properties.

get_schema(project_id=None)

Returns a schema for a given FeatureGroup in a project.

Parameters:

project_id (str) – The unique ID associated with the project.

Returns:

A list of objects for each column in the specified feature group.

Return type:

list[Feature]

create_feature(name, select_expression)

Creates a new feature in a Feature Group from a SQL select statement.

Parameters:
  • name (str) – The name of the feature to add.

  • select_expression (str) – SQL SELECT expression to create the feature.

Returns:

A Feature Group object with the newly added feature.

Return type:

FeatureGroup
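
Example (illustrative): deriving a new column with a SQL expression. This sketch assumes fg is a FeatureGroup; the feature and column names are hypothetical.

fg = fg.create_feature(
    name='total_price',                          # hypothetical feature name
    select_expression='unit_price * quantity')   # hypothetical source columns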

add_tag(tag)

Adds a tag to the feature group

Parameters:

tag (str) – The tag to add to the feature group.

remove_tag(tag)

Removes a tag from the specified feature group.

Parameters:

tag (str) – The tag to remove from the feature group.

add_annotatable_feature(name, annotation_type)

Add an annotatable feature in a Feature Group

Parameters:
  • name (str) – The name of the feature to add.

  • annotation_type (str) – The type of annotation to set.

Returns:

The feature group after the feature has been set

Return type:

FeatureGroup

set_feature_as_annotatable_feature(feature_name, annotation_type, feature_group_row_identifier_feature=None, doc_id_feature=None)

Sets an existing feature as an annotatable feature (Feature that can be annotated).

Parameters:
  • feature_name (str) – The name of the feature to set as annotatable.

  • annotation_type (str) – The type of annotation label to add.

  • feature_group_row_identifier_feature (str) – The key value of the feature group row the annotation is on (cast to string) and uniquely identifies the feature group row. At least one of the doc_id or key value must be provided so that the correct annotation can be identified.

  • doc_id_feature (str) – The name of the document ID feature.

Returns:

A feature group object with the newly added annotatable feature.

Return type:

FeatureGroup

set_annotation_status_feature(feature_name)

Sets a feature as the annotation status feature for a feature group.

Parameters:

feature_name (str) – The name of the feature to set as the annotation status feature.

Returns:

The updated feature group.

Return type:

FeatureGroup

unset_feature_as_annotatable_feature(feature_name)

Unsets a feature as annotatable

Parameters:

feature_name (str) – The name of the feature to unset.

Returns:

The feature group after unsetting the feature

Return type:

FeatureGroup

add_annotation_label(label_name, annotation_type, label_definition=None)

Adds an annotation label

Parameters:
  • label_name (str) – The name of the label.

  • annotation_type (str) – The type of the annotation to set.

  • label_definition (str) – the definition of the label.

Returns:

The feature group after adding the annotation label

Return type:

FeatureGroup

remove_annotation_label(label_name)

Removes an annotation label

Parameters:

label_name (str) – The name of the label to remove.

Returns:

The feature group after adding the annotation label

Return type:

FeatureGroup

add_feature_tag(feature, tag)

Adds a tag on a feature

Parameters:
  • feature (str) – The feature to set the tag on.

  • tag (str) – The tag to set on the feature.

remove_feature_tag(feature, tag)

Removes a tag from a feature

Parameters:
  • feature (str) – The feature to remove the tag from.

  • tag (str) – The tag to remove.

create_nested_feature(nested_feature_name, table_name, using_clause, where_clause=None, order_clause=None)

Creates a new nested feature in a feature group from a SQL statement.

Parameters:
  • nested_feature_name (str) – The name of the feature.

  • table_name (str) – The table name of the feature group to nest. Can be up to 120 characters long and can only contain alphanumeric characters and underscores.

  • using_clause (str) – The SQL join column or logic to join the nested table with the parent.

  • where_clause (str) – A SQL WHERE statement to filter the nested rows.

  • order_clause (str) – A SQL clause to order the nested rows.

Returns:

A feature group object with the newly added nested feature.

Return type:

FeatureGroup
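
Example (illustrative): nesting recent order rows under each parent row. This sketch assumes fg is a FeatureGroup; the nested table, join column, and filter are hypothetical.

fg = fg.create_nested_feature(
    nested_feature_name='recent_orders',
    table_name='orders',                   # hypothetical feature group to nest
    using_clause='customer_id',            # join column between parent and nested table
    where_clause="status = 'COMPLETE'",    # hypothetical row filter
    order_clause='order_date DESC')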

update_nested_feature(nested_feature_name, table_name=None, using_clause=None, where_clause=None, order_clause=None, new_nested_feature_name=None)

Updates a previously existing nested feature in a feature group.

Parameters:
  • nested_feature_name (str) – The name of the feature to be updated.

  • table_name (str) – The name of the table. Can be up to 120 characters long and can only contain alphanumeric characters and underscores.

  • using_clause (str) – The SQL join column or logic to join the nested table with the parent.

  • where_clause (str) – An SQL WHERE statement to filter the nested rows.

  • order_clause (str) – An SQL clause to order the nested rows.

  • new_nested_feature_name (str) – New name for the nested feature.

Returns:

A feature group object with the updated nested feature.

Return type:

FeatureGroup

delete_nested_feature(nested_feature_name)

Delete a nested feature.

Parameters:

nested_feature_name (str) – The name of the feature to be deleted.

Returns:

A feature group object without the specified nested feature.

Return type:

FeatureGroup

create_point_in_time_feature(feature_name, history_table_name, aggregation_keys, timestamp_key, historical_timestamp_key, expression, lookback_window_seconds=None, lookback_window_lag_seconds=0, lookback_count=None, lookback_until_position=0)

Creates a new point in time feature in a feature group using another historical feature group, window spec, and aggregate expression.

We use the aggregation keys and either the lookbackWindowSeconds or the lookbackCount values to perform the window aggregation for every row in the current feature group.

If the window is specified in seconds, then all rows in the history table which match the aggregation keys and whose historical timestamp is within the lookback window — greater than or equal to the current row’s timeFeature minus lookbackWindowSeconds, and less than the current row’s timeFeature — are considered. An optional lookbackWindowLagSeconds (positive or negative) can be used to offset the current value of the timeFeature; if this value is negative, we will look at future rows in the history table, so care must be taken to ensure that these rows are available in the online context when a lookup is performed on this feature group.

If the window is specified in counts, then we order the historical table rows by time and consider the window of rows starting lookbackCount positions before the current row and ending at the row just prior to the current one. The lag is specified in terms of positions using lookbackUntilPosition.

Parameters:
  • feature_name (str) – The name of the feature to create.

  • history_table_name (str) – The table name of the history table.

  • aggregation_keys (list) – List of keys to use for joining the historical table and performing the window aggregation.

  • timestamp_key (str) – Name of feature which contains the timestamp value for the point in time feature.

  • historical_timestamp_key (str) – Name of feature which contains the historical timestamp.

  • expression (str) – SQL aggregate expression which can convert a sequence of rows into a scalar value.

  • lookback_window_seconds (float) – If window is specified in terms of time, number of seconds in the past from the current time for start of the window.

  • lookback_window_lag_seconds (float) – Optional lag to offset the closest point for the window. If it is positive, we delay the start of window. If it is negative, we are looking at the “future” rows in the history table.

  • lookback_count (int) – If window is specified in terms of count, the start position of the window (0 is the current row).

  • lookback_until_position (int) – Optional lag to offset the closest point for the window. If it is positive, we delay the start of window by that many rows. If it is negative, we are looking at those many “future” rows in the history table.

Returns:

A feature group object with the newly added nested feature.

Return type:

FeatureGroup
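
Example (illustrative): a trailing seven-day sum per user. This sketch assumes fg is a FeatureGroup; the history table and column names are hypothetical.

fg = fg.create_point_in_time_feature(
    feature_name='purchases_last_7d',
    history_table_name='purchase_events',   # hypothetical history table
    aggregation_keys=['user_id'],
    timestamp_key='event_time',
    historical_timestamp_key='event_time',
    expression='SUM(amount)',               # aggregate over the window
    lookback_window_seconds=7 * 24 * 3600)  # seven days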

update_point_in_time_feature(feature_name, history_table_name=None, aggregation_keys=None, timestamp_key=None, historical_timestamp_key=None, expression=None, lookback_window_seconds=None, lookback_window_lag_seconds=None, lookback_count=None, lookback_until_position=None, new_feature_name=None)

Updates an existing Point-in-Time (PiT) feature in a feature group. See createPointInTimeFeature for detailed semantics.

Parameters:
  • feature_name (str) – The name of the feature.

  • history_table_name (str) – The table name of the history table. If not specified, we use the current table to do a self join.

  • aggregation_keys (list) – List of keys to use for joining the historical table and performing the window aggregation.

  • timestamp_key (str) – Name of the feature which contains the timestamp value for the PiT feature.

  • historical_timestamp_key (str) – Name of the feature which contains the historical timestamp.

  • expression (str) – SQL Aggregate expression which can convert a sequence of rows into a scalar value.

  • lookback_window_seconds (float) – If the window is specified in terms of time, the number of seconds in the past from the current time for the start of the window.

  • lookback_window_lag_seconds (float) – Optional lag to offset the closest point for the window. If it is positive, we delay the start of the window. If it is negative, we are looking at the “future” rows in the history table.

  • lookback_count (int) – If the window is specified in terms of count, the start position of the window (0 is the current row).

  • lookback_until_position (int) – Optional lag to offset the closest point for the window. If it is positive, we delay the start of the window by that many rows. If it is negative, we are looking at those many “future” rows in the history table.

  • new_feature_name (str) – New name for the PiT feature.

Returns:

A feature group object with the newly added nested feature.

Return type:

FeatureGroup

create_point_in_time_group(group_name, window_key, aggregation_keys, history_table_name=None, history_window_key=None, history_aggregation_keys=None, lookback_window=None, lookback_window_lag=0, lookback_count=None, lookback_until_position=0)

Create a Point-in-Time Group

Parameters:
  • group_name (str) – The name of the point in time group.

  • window_key (str) – Name of feature to use for ordering the rows on the source table.

  • aggregation_keys (list) – List of keys to use on the source table for the window aggregation.

  • history_table_name (str) – The table to use for aggregating, if not provided, the source table will be used.

  • history_window_key (str) – Name of feature to use for ordering the rows on the history table. If not provided, the windowKey from the source table will be used.

  • history_aggregation_keys (list) – List of keys to use for joining the historical table and performing the window aggregation. If not provided, the aggregationKeys from the source table will be used. Must be the same length and order as the source table’s aggregationKeys.

  • lookback_window (float) – Number of seconds in the past from the current time for the start of the window. If 0, the lookback will include all rows.

  • lookback_window_lag (float) – Optional lag to offset the closest point for the window. If it is positive, the start of the window is delayed. If it is negative, “future” rows in the history table are used.

  • lookback_count (int) – If window is specified in terms of count, the start position of the window (0 is the current row).

  • lookback_until_position (int) – Optional lag to offset the closest point for the window. If it is positive, the start of the window is delayed by that many rows. If it is negative, those many “future” rows in the history table are used.

Returns:

The feature group after the point in time group has been created.

Return type:

FeatureGroup

generate_point_in_time_features(group_name, columns, window_functions, prefix=None)

Generates and adds PIT features given the selected columns to aggregate over, and the operations to include.

Parameters:
  • group_name (str) – Name of the point-in-time group.

  • columns (list) – List of columns to generate point-in-time features for.

  • window_functions (list) – List of window functions to operate on.

  • prefix (str) – Prefix for the generated features; defaults to the group name

Returns:

Feature group object with newly added point-in-time features.

Return type:

FeatureGroup
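
Continuing the sketch above, generated features can then be added for selected columns; the column name and window function names here are assumptions:

    # Assumes `fg` and the 'recent_user_events' group from the previous sketch.
    fg = fg.generate_point_in_time_features(
        group_name='recent_user_events',
        columns=['purchase_amount'],
        window_functions=['SUM', 'AVG'],  # illustrative window function names
        prefix='recent',
    )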

update_point_in_time_group(group_name, window_key=None, aggregation_keys=None, history_table_name=None, history_window_key=None, history_aggregation_keys=None, lookback_window=None, lookback_window_lag=None, lookback_count=None, lookback_until_position=None)

Update Point-in-Time Group

Parameters:
  • group_name (str) – The name of the point-in-time group.

  • window_key (str) – Name of feature which contains the timestamp value for the point-in-time feature.

  • aggregation_keys (list) – List of keys to use for joining the historical table and performing the window aggregation.

  • history_table_name (str) – The table to use for aggregating; if not provided, the source table will be used.

  • history_window_key (str) – Name of feature to use for ordering the rows on the history table. If not provided, the windowKey from the source table will be used.

  • history_aggregation_keys (list) – List of keys to use for joining the historical table and performing the window aggregation. If not provided, the aggregationKeys from the source table will be used. Must be the same length and order as the source table’s aggregationKeys.

  • lookback_window (float) – Number of seconds in the past from the current time for the start of the window.

  • lookback_window_lag (float) – Optional lag to offset the closest point for the window. If it is positive, the start of the window is delayed. If it is negative, future rows in the history table are looked at.

  • lookback_count (int) – If window is specified in terms of count, the start position of the window (0 is the current row).

  • lookback_until_position (int) – Optional lag to offset the closest point for the window. If it is positive, the start of the window is delayed by that many rows. If it is negative, that many future rows in the history table are used.

Returns:

The feature group after the update has been applied.

Return type:

FeatureGroup

delete_point_in_time_group(group_name)

Delete a point-in-time group

Parameters:

group_name (str) – The name of the point-in-time group.

Returns:

The feature group after the point-in-time group has been deleted.

Return type:

FeatureGroup

create_point_in_time_group_feature(group_name, name, expression)

Create a point-in-time group feature

Parameters:
  • group_name (str) – The name of the point-in-time group.

  • name (str) – The name of the feature to add to the point-in-time group.

  • expression (str) – A SQL aggregate expression which can convert a sequence of rows into a scalar value.

Returns:

The feature group after the update has been applied.

Return type:

FeatureGroup
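
A single aggregate feature can also be added directly with a SQL expression; the group, feature name, and expression below are illustrative:

    # Assumes the point-in-time group from the earlier sketches.
    fg = fg.create_point_in_time_group_feature(
        group_name='recent_user_events',
        name='total_recent_spend',
        expression='SUM(purchase_amount)',  # hypothetical aggregate expression
    )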

update_point_in_time_group_feature(group_name, name, expression)

Update a feature’s SQL expression in a point-in-time group

Parameters:
  • group_name (str) – The name of the point-in-time group.

  • name (str) – The name of the feature to update in the point-in-time group.

  • expression (str) – SQL aggregate expression which can convert a sequence of rows into a scalar value.

Returns:

The feature group after the update has been applied.

Return type:

FeatureGroup

set_feature_type(feature, feature_type, project_id=None)

Set the type of a feature in a feature group. Specify the feature group ID, feature name, and feature type, and the method will return the new column with the changes reflected.

Parameters:
  • feature (str) – The name of the feature.

  • feature_type (str) – The machine learning type of the data in the feature.

  • project_id (str) – Optional unique ID associated with the project.

Returns:

The feature group after the feature type is applied.

Return type:

Schema

concatenate_data(source_feature_group_id, merge_type='UNION', replace_until_timestamp=None, skip_materialize=False)

Concatenates data from one Feature Group to another. Feature Groups can be merged if their schemas are compatible, they have the special updateTimestampKey column, and (if set) the primaryKey column. The second operand in the concatenate operation will be appended to the first operand (merge target).

Parameters:
  • source_feature_group_id (str) – The Feature Group to concatenate with the destination Feature Group.

  • merge_type (str) – UNION or INTERSECTION.

  • replace_until_timestamp (int) – The UNIX timestamp to specify the point until which we will replace data from the source Feature Group.

  • skip_materialize (bool) – If True, will not materialize the concatenated Feature Group.
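
A minimal sketch of concatenating one feature group into another, as described above; the source feature group ID is a placeholder:

    # Assumes `fg` is the destination (merge target) feature group.
    fg.concatenate_data(
        source_feature_group_id='SOURCE_FG_ID',  # placeholder ID
        merge_type='UNION',
        skip_materialize=False,
    )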

remove_concatenation_config()

Removes the concatenation config on a destination feature group.

Parameters:

feature_group_id (str) – Unique identifier of the destination feature group to remove the concatenation configuration from.

refresh()

Calls describe and refreshes the current object’s fields

Returns:

The current object

Return type:

FeatureGroup

describe()

Describe a Feature Group.

Parameters:

feature_group_id (str) – A unique string identifier associated with the feature group.

Returns:

The feature group object.

Return type:

FeatureGroup

set_indexing_config(primary_key=None, update_timestamp_key=None, lookup_keys=None)

Sets various attributes of the feature group used for primary key, deployment lookups and streaming updates.

Parameters:
  • primary_key (str) – Name of the feature which defines the primary key of the feature group.

  • update_timestamp_key (str) – Name of the feature which defines the update timestamp of the feature group. Used in concatenation and primary key deduplication.

  • lookup_keys (list) – List of feature names which can be used in the lookup API to restrict the computation to a set of dataset rows. These feature names have to correspond to underlying dataset columns.
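
A minimal sketch of setting the indexing config; the column names come from a hypothetical user-events table:

    # Assumes `fg` is a FeatureGroup handle.
    fg.set_indexing_config(
        primary_key='event_id',
        update_timestamp_key='event_time',
        lookup_keys=['user_id'],
    )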

update(description=None)

Modify an existing Feature Group.

Parameters:

description (str) – Description of the Feature Group.

Returns:

Updated Feature Group object.

Return type:

FeatureGroup

detach_from_template()

Update a feature group to detach it from a template.

Parameters:

feature_group_id (str) – Unique string identifier associated with the feature group.

Returns:

The updated feature group.

Return type:

FeatureGroup

update_template_bindings(template_bindings=None)

Update the feature group template bindings for a template feature group.

Parameters:

template_bindings (list) – Values in these bindings override values set in the template.

Returns:

Updated feature group.

Return type:

FeatureGroup

update_python_function_bindings(python_function_bindings)

Updates an existing Feature Group’s Python function bindings from a user-provided Python function. If a list of feature groups is supplied within the Python function bindings, DataFrames (pandas in the case of Python) with the materialized feature groups for those input feature groups will be provided as arguments to the function.

Parameters:

python_function_bindings (List) – List of python function arguments.

update_python_function(python_function_name, python_function_bindings=None, cpu_size=None, memory=None, use_gpu=None, use_original_csv_names=None)

Updates an existing Feature Group’s Python function from a user-provided Python function. If a list of feature groups is supplied within the Python function bindings, DataFrames (pandas in the case of Python) with the materialized feature groups for those input feature groups will be provided as arguments to the function.

Parameters:
  • python_function_name (str) – The name of the python function to be associated with the feature group.

  • python_function_bindings (List) – List of python function arguments.

  • cpu_size (CPUSize) – Size of the CPU for the feature group python function.

  • memory (MemorySize) – Memory (in GB) for the feature group python function.

  • use_gpu (bool) – Whether the feature group needs a GPU; defaults to CPU if not set.

  • use_original_csv_names (bool) – If enabled, it uses the original column names for input feature groups from CSV datasets.

update_sql_definition(sql)

Updates the SQL statement for a feature group.

Parameters:

sql (str) – The input SQL statement for the feature group.

Returns:

The updated feature group.

Return type:

FeatureGroup

update_dataset_feature_expression(feature_expression)

Updates the SQL feature expression for a Dataset FeatureGroup’s custom features

Parameters:

feature_expression (str) – The input SQL statement for the feature group.

Returns:

The updated feature group.

Return type:

FeatureGroup

update_version_limit(version_limit)

Updates the version limit for the feature group.

Parameters:

version_limit (int) – The maximum number of versions permitted for the feature group. Once this limit is exceeded, the oldest versions will be purged in a First-In-First-Out (FIFO) order.

Returns:

The updated feature group.

Return type:

FeatureGroup

update_feature(name, select_expression=None, new_name=None)

Modifies an existing feature in a feature group.

Parameters:
  • name (str) – Name of the feature to be updated.

  • select_expression (str) – SQL statement for modifying the feature.

  • new_name (str) – New name of the feature.

Returns:

Updated feature group object.

Return type:

FeatureGroup

list_exports()

Lists all of the feature group exports for the feature group

Parameters:

feature_group_id (str) – Unique identifier of the feature group

Returns:

List of feature group exports

Return type:

list[FeatureGroupExport]

set_modifier_lock(locked=True)

Lock a feature group to prevent modification.

Parameters:

locked (bool) – Whether to disable or enable feature group modification (True or False).

list_modifiers()

List the users who can modify a given feature group.

Parameters:

feature_group_id (str) – Unique string identifier of the feature group.

Returns:

Information about the modification lock status and groups/organizations added to the feature group.

Return type:

ModificationLockInfo

add_user_to_modifiers(email)

Adds a user to a feature group.

Parameters:

email (str) – The email address of the user to be added.

add_organization_group_to_modifiers(organization_group_id)

Adds an OrganizationGroup to a feature group’s modifiers list

Parameters:

organization_group_id (str) – Unique string identifier of the organization group.

remove_user_from_modifiers(email)

Removes a user from a specified feature group.

Parameters:

email (str) – The email address of the user to be removed.

remove_organization_group_from_modifiers(organization_group_id)

Removes an OrganizationGroup from a feature group’s modifiers list

Parameters:

organization_group_id (str) – The unique ID associated with the organization group.

delete_feature(name)

Removes a feature from the feature group.

Parameters:

name (str) – Name of the feature to be deleted.

Returns:

Updated feature group object.

Return type:

FeatureGroup

delete()

Deletes a Feature Group.

Parameters:

feature_group_id (str) – Unique string identifier for the feature group to be removed.

create_version(variable_bindings=None)

Creates a snapshot for a specified feature group. Triggers materialization of the feature group. The new version of the feature group is created after it has materialized.

Parameters:

variable_bindings (dict) – Dictionary defining variable bindings that override parent feature group values.

Returns:

A feature group version.

Return type:

FeatureGroupVersion
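
A new version is typically created and then awaited before use; a sketch under the same assumptions as the earlier examples:

    # Trigger materialization and block until the new version is ready.
    version = fg.create_version()
    version.wait_for_materialization(timeout=3600)
    print(version.get_status())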

list_versions(limit=100, start_after_version=None)

Retrieves a list of all feature group versions for the specified feature group.

Parameters:
  • limit (int) – The maximum number of versions returned.

  • start_after_version (str) – Results will start after this version.

Returns:

A list of feature group versions.

Return type:

list[FeatureGroupVersion]

set_export_connector_config(feature_group_export_config=None)

Sets FG export config for the given feature group.

Parameters:

feature_group_export_config (FeatureGroupExportConfig) – The export config to be set for the given feature group.

set_export_on_materialization(enable)

Enables or disables exporting feature group data to the export connector associated with the feature group.

Parameters:

enable (bool) – If True, exporting the feature group to the connector is enabled; if False, it is disabled.

create_template(name, template_sql, template_variables, description=None, template_bindings=None, should_attach_feature_group_to_template=False)

Create a feature group template.

Parameters:
  • name (str) – User-friendly name for this feature group template.

  • template_sql (str) – The template SQL that will be resolved by applying values from the template variables to generate SQL for a feature group.

  • template_variables (list) – The template variables for resolving the template.

  • description (str) – Description of this feature group template.

  • template_bindings (list) – If the feature group will be attached to the newly created template, set these variable bindings on that feature group.

  • should_attach_feature_group_to_template (bool) – Set to True to convert the feature group to a template feature group and attach it to the newly created template.

Returns:

The created feature group template.

Return type:

FeatureGroupTemplate
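
A sketch of creating a template from an existing feature group; the template SQL and the shape of the template variable entries are illustrative assumptions:

    # Assumes `fg` is a FeatureGroup handle.
    template = fg.create_template(
        name='daily_rollup_template',
        template_sql='SELECT * FROM {input_table}',      # hypothetical template SQL
        template_variables=[
            {'name': 'input_table', 'value': 'events'},  # hypothetical variable shape
        ],
    )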

suggest_template_for()

Suggest values for a feature group template, based on a feature group.

Parameters:

feature_group_id (str) – Unique identifier associated with the feature group used to suggest values for the template.

Returns:

The suggested feature group template.

Return type:

FeatureGroupTemplate

get_recent_streamed_data()

Returns data recently streamed to a streaming feature group.

Parameters:

feature_group_id (str) – Unique string identifier associated with the feature group.

append_data(streaming_token, data)

Appends new data into the feature group for a given lookup key recordId.

Parameters:
  • streaming_token (str) – The streaming token for authenticating requests.

  • data (dict) – The data to record as a JSON object.

append_multiple_data(streaming_token, data)

Appends new data into the feature group for a given lookup key recordId.

Parameters:
  • streaming_token (str) – Streaming token for authenticating requests.

  • data (list) – Data to record, as a list of JSON objects.

upsert_data(data, streaming_token=None, blobs=None)

Updates data in the feature group for a given lookup key record ID if the record ID is found; otherwise, inserts new data into the feature group.

Parameters:
  • data (dict) – The data to record, in JSON format.

  • streaming_token (str) – Optional streaming token for authenticating requests if upserting to streaming FG.

  • blobs (dict) – A dictionary of binary data used to populate file fields in the data to upsert to the streaming feature group.

Returns:

The feature group row that was upserted.

Return type:

FeatureGroupRow
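
A sketch of upserting a single record into an online/streaming feature group; the record contents and token are placeholders:

    row = fg.upsert_data(
        data={'event_id': 'e123', 'user_id': 'u1', 'purchase_amount': 9.99},
        streaming_token='STREAMING_TOKEN',  # only needed for a streaming FG
    )
    print(row.contents)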

delete_data(primary_key)

Deletes a row from the feature group given the primary key

Parameters:

primary_key (str) – The primary key value for which to delete the feature group row

get_data(primary_key=None, num_rows=None)

Gets the feature group rows for online updatable feature groups.

If primary_key is set, the row corresponding to that key is returned. If num_rows is set, at most num_rows of the most recently updated rows are returned.

Parameters:
  • primary_key (str) – The primary key value for which to retrieve the feature group row (only for online feature groups).

  • num_rows (int) – Maximum number of rows to return from the feature group

Returns:

A list of feature group rows.

Return type:

list[FeatureGroupRow]
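
A short sketch of both retrieval modes; the primary key value is a placeholder:

    # Fetch one row by primary key, or the most recently updated rows by count.
    rows = fg.get_data(primary_key='e123')
    latest = fg.get_data(num_rows=100)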

get_natural_language_explanation(feature_group_version=None, model_id=None)

Returns the saved natural language explanation of an artifact with the given ID. The artifact can be a Feature Group, Feature Group Version, or Model.

Parameters:
  • feature_group_version (str) – A unique string identifier associated with the Feature Group Version.

  • model_id (str) – A unique string identifier associated with the Model.

Returns:

The object containing natural language explanation(s) as field(s).

Return type:

NaturalLanguageExplanation

generate_natural_language_explanation(feature_group_version=None, model_id=None)

Generates a natural language explanation of an artifact with the given ID. The artifact can be a Feature Group, Feature Group Version, or Model.

Parameters:
  • feature_group_version (str) – A unique string identifier associated with the Feature Group Version.

  • model_id (str) – A unique string identifier associated with the Model.

Returns:

The object containing natural language explanation(s) as field(s).

Return type:

NaturalLanguageExplanation

wait_for_dataset(timeout=7200)

A waiting call until the feature group’s dataset, if any, is ready for use.

Parameters:

timeout (int) – The waiting time given to the call to finish; if it doesn’t finish within the allocated time, the call is considered timed out. Defaults to 7200 seconds.

wait_for_upload(timeout=7200)

Waits for a feature group created from a dataframe to be ready for materialization and version creation.

Parameters:

timeout (int) – The waiting time given to the call to finish; if it doesn’t finish within the allocated time, the call is considered timed out. Defaults to 7200 seconds.

wait_for_materialization(timeout=7200)

A waiting call until feature group is materialized.

Parameters:

timeout (int) – The waiting time given to the call to finish; if it doesn’t finish within the allocated time, the call is considered timed out. Defaults to 7200 seconds.

wait_for_streaming_ready(timeout=600)

Waits for the feature group indexing config to be applied for streaming

Parameters:

timeout (int) – The waiting time given to the call to finish; if it doesn’t finish within the allocated time, the call is considered timed out. Defaults to 600 seconds.

get_status(streaming_status=False)

Gets the status of the feature group.

Returns:

A string describing the status of a feature group (pending, complete, etc.).

Return type:

str

Parameters:

streaming_status (bool)

load_as_pandas()

Loads the feature group into a pandas DataFrame.

Returns:

A pandas DataFrame with annotations and text_snippet columns.

Return type:

DataFrame

load_as_pandas_documents(doc_id_column, document_column)

Loads a feature group with documents data into a pandas dataframe.

Parameters:
  • doc_id_column (str) – The name of the feature / column containing the document ID.

  • document_column (str) – The name of the feature / column which either contains the document data itself or page infos with paths to remotely stored documents. This column will be replaced with the extracted document data.

Returns:

A pandas dataframe containing the extracted document data.

Return type:

DataFrame
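
A sketch of loading document data, assuming hypothetical column names:

    df = fg.load_as_pandas_documents(
        doc_id_column='doc_id',
        document_column='page_infos',  # hypothetical column names
    )
    print(df.head())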

describe_dataset()

Displays the dataset attached to a feature group.

Returns:

A dataset object with all the relevant information about the dataset.

Return type:

Dataset

materialize()

Materializes the feature group’s latest changes at the time of the API call. Materialization is skipped if nothing has changed since the current latest version.

Returns:

A feature group object with the latest changes materialized.

Return type:

FeatureGroup

class abacusai.FeatureGroupDocument(client, featureGroupId=None, docId=None, status=None)

Bases: abacusai.return_class.AbstractApiClass

A document of a feature group.

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • featureGroupId (str) – The ID of the feature group this document belongs to.

  • docId (str) – Unique document id

  • status (str) – The status of the document processing

feature_group_id
doc_id
status
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.FeatureGroupExport(client, featureGroupExportId=None, failedWrites=None, totalWrites=None, featureGroupVersion=None, connectorType=None, outputLocation=None, fileFormat=None, databaseConnectorId=None, objectName=None, writeMode=None, databaseFeatureMapping=None, idColumn=None, status=None, createdAt=None, exportCompletedAt=None, additionalIdColumns=None, error=None, databaseOutputError=None, projectConfig={})

Bases: abacusai.return_class.AbstractApiClass

A Feature Group Export job

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • featureGroupExportId (str) – Unique identifier for this export.

  • failedWrites (int) – Number of failed writes.

  • totalWrites (int) – Total number of writes.

  • featureGroupVersion (str) – Version of the feature group being exported.

  • connectorType (str) – The type of connector

  • outputLocation (str) – File Connector location the feature group is being written to.

  • fileFormat (str) – File format being written to output_location.

  • databaseConnectorId (str) – Database connector ID used.

  • objectName (str) – Database connector’s object to write to.

  • writeMode (str) – UPSERT or INSERT for writing to the database connector.

  • databaseFeatureMapping (dict) – Column/feature pairs mapping the features to the database columns.

  • idColumn (str) – ID column to use as the upsert key.

  • status (str) – Current status of the export.

  • createdAt (str) – Timestamp at which the export was created (ISO-8601 format).

  • exportCompletedAt (str) – Timestamp at which the export completed (ISO-8601 format).

  • additionalIdColumns (list[str]) – For database connectors which support it, additional ID columns to use as a complex key for upserting.

  • error (str) – If status is FAILED, this field will be populated with an error.

  • databaseOutputError (bool) – If True, there were errors reported by the database connector while writing.

  • projectConfig (ProjectConfig) – Project config for this feature group.

feature_group_export_id
failed_writes
total_writes
feature_group_version
connector_type
output_location
file_format
database_connector_id
object_name
write_mode
database_feature_mapping
id_column
status
created_at
export_completed_at
additional_id_columns
error
database_output_error
project_config
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

get_feature_group_version_export_download_url()

Get a link to download the feature group version.

Parameters:

feature_group_export_id (str) – Unique identifier of the Feature Group Export to get a signed URL for.

Returns:

Instance containing the download URL and expiration time for the Feature Group Export.

Return type:

FeatureGroupExportDownloadUrl
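
A sketch of fetching the signed URL from an existing export handle; obtaining the export via list_exports() is one option documented above:

    # Assumes `export` is a FeatureGroupExport, e.g. from fg.list_exports().
    url_info = export.get_feature_group_version_export_download_url()
    print(url_info.download_url, url_info.expires_at)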

refresh()

Calls describe and refreshes the current object’s fields

Returns:

The current object

Return type:

FeatureGroupExport

describe()

Describe a Feature Group Export.

Parameters:

feature_group_export_id (str) – Unique identifier of the feature group export.

Returns:

The feature group export object.

Return type:

FeatureGroupExport

get_connector_errors()

Returns a stream containing the write errors of the feature group export database connection, if any writes to the database connector failed.

Parameters:

feature_group_export_id (str) – Unique identifier of the feature group export to get the errors for.

wait_for_results(timeout=3600)

A waiting call until feature group export is created.

Parameters:

timeout (int) – The waiting time given to the call to finish; if it doesn’t finish within the allocated time, the call is considered timed out.

wait_for_export(timeout=3600)

A waiting call until feature group export is created.

Parameters:

timeout (int) – The waiting time given to the call to finish; if it doesn’t finish within the allocated time, the call is considered timed out.

get_status()

Gets the status of the feature group export.

Returns:

A string describing the status of a feature group export (pending, complete, etc.).

Return type:

str

class abacusai.FeatureGroupExportConfig(client, outputLocation=None, fileFormat=None, databaseConnectorId=None, objectName=None, writeMode=None, databaseFeatureMapping=None, idColumn=None, additionalIdColumns=None)

Bases: abacusai.return_class.AbstractApiClass

Export configuration (file connector or database connector information) for feature group exports.

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • outputLocation (str) – The File Connector location to which the feature group is being written.

  • fileFormat (str) – The file format being written to output_location.

  • databaseConnectorId (str) – The unique string identifier of the database connector used.

  • objectName (str) – The object in the database connector to which the feature group is being written.

  • writeMode (str) – UPSERT or INSERT for writing to the database connector.

  • databaseFeatureMapping (dict) – The column/feature pairs mapping the features to the database columns.

  • idColumn (str) – The id column to use as the upsert key.

  • additionalIdColumns (list) – For database connectors which support it, additional ID columns to use as a complex key for upserting.

output_location
file_format
database_connector_id
object_name
write_mode
database_feature_mapping
id_column
additional_id_columns
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.FeatureGroupExportDownloadUrl(client, downloadUrl=None, expiresAt=None)

Bases: abacusai.return_class.AbstractApiClass

A Feature Group Export Download Url, which is used to download the feature group version

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • downloadUrl (str) – The URL of the download location.

  • expiresAt (str) – String representation of the ISO-8601 datetime when the URL expires.

download_url
expires_at
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.FeatureGroupLineage(client, nodes=None, connections=None)

Bases: abacusai.return_class.AbstractApiClass

Directed acyclic graph of feature group lineage for all feature groups in a project

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • nodes (list<dict>) – A list of nodes in the graph containing feature groups and datasets

  • connections (list<dict>) – A list of connections in the graph between nodes

nodes
connections
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.FeatureGroupRefreshExportConfig(client, connectorType=None, location=None, exportFileFormat=None, additionalIdColumns=None, databaseFeatureMapping=None, externalConnectionId=None, idColumn=None, objectName=None, writeMode=None)

Bases: abacusai.return_class.AbstractApiClass

A Feature Group Connector Export Config outlines the export configuration for a feature group.

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • connectorType (str) – The type of connector used for the feature group export

  • location (str) – The file connector location of the feature group export

  • exportFileFormat (str) – The file format of the feature group export

  • additionalIdColumns (list) – Additional id columns to use for upsert operations

  • databaseFeatureMapping (dict) – The mapping of feature names to database columns

  • externalConnectionId (str) – The unique identifier of the external connection to write to

  • idColumn (str) – The column to use as the id column for upsert operations

  • objectName (str) – The name of the object to write to

  • writeMode (str) – The write mode to use for the export

connector_type
location
export_file_format
additional_id_columns
database_feature_mapping
external_connection_id
id_column
object_name
write_mode
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.FeatureGroupRow(client, featureGroupId=None, primaryKey=None, createdAt=None, updatedAt=None, contents=None)

Bases: abacusai.return_class.AbstractApiClass

A row of a feature group.

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • featureGroupId (str) – The ID of the feature group this row belongs to.

  • primaryKey (str) – Value of the primary key for this row.

  • createdAt (str) – The timestamp this feature group row was created in ISO-8601 format.

  • updatedAt (str) – The timestamp when this feature group row was last updated in ISO-8601 format.

  • contents (dict) – A dictionary of feature names and values for this row.

feature_group_id
primary_key
created_at
updated_at
contents
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.FeatureGroupRowProcess(client, featureGroupId=None, deploymentId=None, primaryKeyValue=None, featureGroupRowProcessId=None, createdAt=None, updatedAt=None, startedAt=None, completedAt=None, timeoutAt=None, retriesRemaining=None, totalAttemptsAllowed=None, status=None, error=None)

Bases: abacusai.return_class.AbstractApiClass

A feature group row process

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • featureGroupId (str) – The ID of the feature group to which the processed row belongs.

  • deploymentId (str) – The ID of the deployment that processed this row.

  • primaryKeyValue (str) – Value of the primary key for this row.

  • featureGroupRowProcessId (str) – The ID of the feature group row process.

  • createdAt (str) – The timestamp this feature group row was created in ISO-8601 format.

  • updatedAt (str) – The timestamp when this feature group row was last updated in ISO-8601 format.

  • startedAt (str) – The timestamp when this feature group row process was started in ISO-8601 format.

  • completedAt (str) – The timestamp when this feature group row process was completed.

  • timeoutAt (str) – The time at which the feature group row process will time out.

  • retriesRemaining (int) – The number of retries remaining for this feature group row process.

  • totalAttemptsAllowed (int) – The total number of attempts allowed for this feature group row process.

  • status (str) – The status of the feature group row process.

  • error (str) – The error message if the status is FAILED.

feature_group_id
deployment_id
primary_key_value
feature_group_row_process_id
created_at
updated_at
started_at
completed_at
timeout_at
retries_remaining
total_attempts_allowed
status
error
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

wait_for_process(timeout=1200)

A waiting call until the feature group row process is complete.

Parameters:

timeout (int) – The waiting time given to the call to finish; if it doesn’t finish within the allocated time, the call is considered timed out.

get_status()

Gets the status of the feature group row process.

Returns:

A string describing the status of the feature group row process

Return type:

str

class abacusai.FeatureGroupRowProcessLogs(client, logs=None, featureGroupId=None, deploymentId=None, primaryKeyValue=None, featureGroupRowProcessId=None)

Bases: abacusai.return_class.AbstractApiClass

Logs for the feature group row process.

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • logs (str) – The logs for both stdout and stderr of the step

  • featureGroupId (str) – The ID of the feature group to which the processed row belongs.

  • deploymentId (str) – The ID of the deployment that processed this row.

  • primaryKeyValue (str) – Value of the primary key for this row.

  • featureGroupRowProcessId (str) – The ID of the feature group row process.

logs
feature_group_id
deployment_id
primary_key_value
feature_group_row_process_id
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.FeatureGroupRowProcessSummary(client, totalProcesses=None, pendingProcesses=None, processingProcesses=None, completeProcesses=None, failedProcesses=None)

Bases: abacusai.return_class.AbstractApiClass

A summary of the feature group processes for a deployment.

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • totalProcesses (int) – The total number of processes

  • pendingProcesses (int) – The number of pending processes

  • processingProcesses (int) – The number of processes currently processing

  • completeProcesses (int) – The number of complete processes

  • failedProcesses (int) – The number of failed processes

total_processes
pending_processes
processing_processes
complete_processes
failed_processes
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.FeatureGroupTemplate(client, featureGroupTemplateId=None, description=None, featureGroupId=None, isSystemTemplate=None, name=None, templateSql=None, templateVariables=None, createdAt=None, updatedAt=None)

Bases: abacusai.return_class.AbstractApiClass

A template for creating feature groups.

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • featureGroupTemplateId (str) – The unique identifier for this feature group template.

  • description (str) – A user-friendly text description of this feature group template.

  • featureGroupId (str) – The unique identifier for the feature group used to create this template.

  • isSystemTemplate (bool) – True if this is a system template returned from a user organization.

  • name (str) – The user-friendly name of this feature group template.

  • templateSql (str) – SQL that can include variables which will be replaced by values from the template config to resolve this template SQL into a valid SQL query for a feature group.

  • templateVariables (dict) – A map from template variable names to parameters for replacing those template variables with values (e.g., values and metadata on how to resolve those values).

  • createdAt (str) – When the feature group template was created.

  • updatedAt (str) – When the feature group template was updated.

feature_group_template_id
description
feature_group_id
is_system_template
name
template_sql
template_variables
created_at
updated_at
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

delete()

Delete an existing feature group template.

Parameters:

feature_group_template_id (str) – Unique string identifier associated with the feature group template.

refresh()

Calls describe and refreshes the current object’s fields

Returns:

The current object

Return type:

FeatureGroupTemplate

describe()

Describe a Feature Group Template.

Parameters:

feature_group_template_id (str) – The unique identifier of a feature group template.

Returns:

The feature group template object.

Return type:

FeatureGroupTemplate

update(template_sql=None, template_variables=None, description=None, name=None)

Update a feature group template.

Parameters:
  • template_sql (str) – If provided, the new value to use for the template SQL.

  • template_variables (list) – If provided, the new value to use for the template variables.

  • description (str) – Description of this feature group template.

  • name (str) – User-friendly name for this feature group template.

Returns:

The updated feature group template.

Return type:

FeatureGroupTemplate

preview_resolution(template_bindings=None, template_sql=None, template_variables=None, should_validate=True)

Resolve template sql using template variables and template bindings.

Parameters:
  • template_bindings (list) – Values to override the template variable values specified by the template.

  • template_sql (str) – If specified, use this as the template SQL instead of the feature group template’s SQL.

  • template_variables (list) – Template variables to use. If a template is provided, this overrides the template’s template variables.

  • should_validate (bool) – If true, validates the resolved SQL.

Returns:

The resolved template

Return type:

ResolvedFeatureGroupTemplate
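
A sketch of previewing how a template resolves with overridden bindings; the binding shape is an assumption:

    # Assumes `template` is a FeatureGroupTemplate handle.
    resolved = template.preview_resolution(
        template_bindings=[
            {'name': 'input_table', 'value': 'events_v2'},  # hypothetical shape
        ],
        should_validate=True,
    )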

class abacusai.FeatureGroupTemplateVariableOptions(client, templateVariableOptions=None, userFeedback=None)

Bases: abacusai.return_class.AbstractApiClass

Feature Group Template Variable Options

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • templateVariableOptions (list[dict]) – List of values we can select for different template variables.

  • userFeedback (list[str]) – List of additional information regarding variable options for the user.

template_variable_options
user_feedback
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.FeatureGroupVersion(client, featureGroupVersion=None, featureGroupId=None, sql=None, sourceTables=None, sourceDatasetVersions=None, createdAt=None, status=None, error=None, deployable=None, cpuSize=None, memory=None, useOriginalCsvNames=None, pythonFunctionBindings=None, indexingConfigWarningMsg=None, materializationStartedAt=None, materializationCompletedAt=None, columns=None, templateBindings=None, features={}, pointInTimeGroups={}, codeSource={}, annotationConfig={}, indexingConfig={})

Bases: abacusai.return_class.AbstractApiClass

A materialized version of a feature group

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • featureGroupVersion (str) – The unique identifier for this materialized version of feature group.

  • featureGroupId (str) – The unique identifier of the feature group this version belongs to.

  • sql (str) – The SQL definition creating this feature group.

  • sourceTables (list[str]) – The source tables for this feature group.

  • sourceDatasetVersions (list[str]) – The dataset version ids for this feature group version.

  • createdAt (str) – The timestamp at which the feature group version was created.

  • status (str) – The current status of the feature group version.

  • error (str) – Relevant error if the status is FAILED.

  • deployable (bool) – Whether the feature group is deployable.

  • cpuSize (str) – CPU size specified for the Python feature group.

  • memory (int) – Memory in GB specified for the Python feature group.

  • useOriginalCsvNames (bool) – If true, the feature group will use the original column names in the source dataset.

  • pythonFunctionBindings (list) – Config specifying variable names, types, and values to use when resolving a Python feature group.

  • indexingConfigWarningMsg (str) – The warning message related to indexing keys.

  • materializationStartedAt (str) – The timestamp at which the feature group materialization started.

  • materializationCompletedAt (str) – The timestamp at which the feature group materialization completed.

  • columns (list[feature]) – List of resolved columns.

  • templateBindings (list) – Template variable bindings used for resolving the template.

  • features (Feature) – List of features.

  • pointInTimeGroups (PointInTimeGroup) – List of Point In Time Groups

  • codeSource (CodeSource) – If a python feature group, information on the source code

  • annotationConfig (AnnotationConfig) – The annotations config for the feature group.

  • indexingConfig (IndexingConfig) – The indexing config for the feature group.

feature_group_version
feature_group_id
sql
source_tables
source_dataset_versions
created_at
status
error
deployable
cpu_size
memory
use_original_csv_names
python_function_bindings
indexing_config_warning_msg
materialization_started_at
materialization_completed_at
columns
template_bindings
features
point_in_time_groups
code_source
annotation_config
indexing_config
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

create_snapshot_feature_group(table_name)

Creates a Snapshot Feature Group corresponding to a specific Feature Group version.

Parameters:

table_name (str) – Name for the newly created Snapshot Feature Group table. Can be up to 120 characters long and can only contain alphanumeric characters and underscores.

Returns:

Feature Group corresponding to the newly created Snapshot.

Return type:

FeatureGroup

export_to_file_connector(location, export_file_format, overwrite=False)

Export Feature group to File Connector.

Parameters:
  • location (str) – Cloud file location to export to.

  • export_file_format (str) – Enum string specifying the file format to export to.

  • overwrite (bool) – If true and a file exists at this location, this process will overwrite the file.

Returns:

The FeatureGroupExport instance.

Return type:

FeatureGroupExport

export_to_database_connector(database_connector_id, object_name, write_mode, database_feature_mapping, id_column=None, additional_id_columns=None)

Export Feature group to Database Connector.

Parameters:
  • database_connector_id (str) – Unique string identifier for the Database Connector to export to.

  • object_name (str) – Name of the database object to write to.

  • write_mode (str) – Enum string indicating whether to use INSERT or UPSERT.

  • database_feature_mapping (dict) – Key/value pair JSON object of “database connector column” -> “feature name” pairs.

  • id_column (str) – Required if write_mode is UPSERT. Indicates which database column should be used as the lookup key.

  • additional_id_columns (list) – For database connectors which support it, additional ID columns to use as a complex key for upserting.

Returns:

The FeatureGroupExport instance.

Return type:

FeatureGroupExport
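
A sketch of exporting a materialized version through a database connector and waiting for completion; the connector ID, object name, and mapping are placeholders:

    # Assumes `version` is a FeatureGroupVersion handle.
    export = version.export_to_database_connector(
        database_connector_id='DB_CONNECTOR_ID',            # placeholder ID
        object_name='events_export',
        write_mode='UPSERT',
        database_feature_mapping={'event_id': 'event_id'},  # column -> feature
        id_column='event_id',                               # required for UPSERT
    )
    export.wait_for_export()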

export_to_console(export_file_format)

Export Feature group to console.

Parameters:

export_file_format (str) – File format to export to.

Returns:

The FeatureGroupExport instance.

Return type:

FeatureGroupExport

delete()

Deletes a Feature Group Version.

Parameters:

feature_group_version (str) – String identifier for the feature group version to be removed.

get_materialization_logs(stdout=False, stderr=False)

Returns logs for a materialized feature group version.

Parameters:
  • stdout (bool) – Set to True to get info logs.

  • stderr (bool) – Set to True to get error logs.

Returns:

A function logs object.

Return type:

FunctionLogs

refresh()

Calls describe and refreshes the current object’s fields

Returns:

The current object

Return type:

FeatureGroupVersion

describe()

Describe a feature group version.

Parameters:

feature_group_version (str) – The unique identifier associated with the feature group version.

Returns:

The feature group version.

Return type:

FeatureGroupVersion

get_metrics(selected_columns=None, include_charts=False, include_statistics=True)

Get metrics for a specific feature group version.

Parameters:
  • selected_columns (List) – A list of columns to order first.

  • include_charts (bool) – A flag indicating whether charts should be included in the response. Default is false.

  • include_statistics (bool) – A flag indicating whether statistics should be included in the response. Default is true.

Returns:

The metrics for the specified feature group version.

Return type:

DataMetrics

get_logs()

Retrieves the feature group materialization logs.

Parameters:

feature_group_version (str) – The unique version ID of the feature group version.

Returns:

The logs for the specified feature group version.

Return type:

FeatureGroupVersionLogs

wait_for_results(timeout=3600)

A waiting call until the feature group version is materialized.

Parameters:

timeout (int) – The waiting time given to the call to finish; if it doesn’t finish within the allocated time, the call is considered timed out.

wait_for_materialization(timeout=3600)

A waiting call until feature group version is materialized.

Parameters:

timeout (int) – The waiting time given to the call to finish; if it doesn’t finish within the allocated time, the call is considered timed out.

get_status()

Gets the status of the feature group version.

Returns:

A string describing the status of a feature group version (pending, complete, etc.).

Return type:

str

_download_avro_file(file_part, tmp_dir, part_index)
load_as_pandas(max_workers=10)

Loads the feature group version into a pandas dataframe.

Parameters:

max_workers (int) – The number of threads.

Returns:

A pandas dataframe displaying the data in the feature group version.

Return type:

DataFrame
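
A one-line sketch, assuming `version` is a FeatureGroupVersion handle as above:

    df = version.load_as_pandas(max_workers=10)
    print(df.shape)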

load_as_pandas_documents(doc_id_column, document_column, max_workers=10)

Loads a feature group with documents data into a pandas dataframe.

Parameters:
  • doc_id_column (str) – The name of the feature / column containing the document ID.

  • document_column (str) – The name of the feature / column which either contains the document data itself or page infos with paths to remotely stored documents. This column will be replaced with the extracted document data.

  • max_workers (int) – The number of threads.

Returns:

A pandas dataframe containing the extracted document data.

Return type:

DataFrame

class abacusai.FeatureGroupVersionLogs(client, logs=None)

Bases: abacusai.return_class.AbstractApiClass

Logs from feature group version.

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • logs (list[str]) – List of logs from feature group version.

logs
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.FeatureImportance(client, shapFeatureImportance=None, limeFeatureImportance=None, permutationFeatureImportance=None, nullFeatureImportance=None, lofoFeatureImportance=None, ebmFeatureImportance=None)

Bases: abacusai.return_class.AbstractApiClass

Feature importance for a specified model monitor

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • shapFeatureImportance (dict) – A map of feature name to feature importance, determined by Shap values on a sample dataset.

  • limeFeatureImportance (dict) – A map of feature name to feature importance, determined by Lime contribution values on a sample dataset.

  • permutationFeatureImportance (dict) – A map of feature name to feature importance, determined by permutation importance.

  • nullFeatureImportance (dict) – A map of feature name to feature importance, determined by null feature importance.

  • lofoFeatureImportance (dict) – A map of feature name to feature importance, determined by the Leave One Feature Out method.

  • ebmFeatureImportance (dict) – A map of feature name to feature importance, determined by an Explainable Boosting Machine.

shap_feature_importance
lime_feature_importance
permutation_feature_importance
null_feature_importance
lofo_feature_importance
ebm_feature_importance
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.FeatureMapping(client, featureMapping=None, featureName=None)

Bases: abacusai.return_class.AbstractApiClass

A description of the data use for a feature

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • featureMapping (str) – The mapping of the feature. The possible values will be based on the project’s use-case. See the Use Case Documentation (https://api.abacus.ai/app/help/useCases) for more details.

  • featureName (str) – The unique name of the feature.

feature_mapping
feature_name
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.FeaturePerformanceAnalysis(client, features=None, featureMetrics=None, metricsKeys=None)

Bases: abacusai.return_class.AbstractApiClass

A feature performance analysis for Monitor

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • features (list) – A list of the features that are being analyzed.

  • featureMetrics (list) – A list of dictionaries, one per feature, containing that feature’s metrics

  • metricsKeys (list) – A list of the keys for the metrics.

features
feature_metrics
metrics_keys
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.FeatureRecord(client, data=None)

Bases: abacusai.return_class.AbstractApiClass

A feature record

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • data (dict) – the record’s current data

data
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.FileConnector(client, bucket=None, verified=None, writePermission=None, authExpiresAt=None)

Bases: abacusai.return_class.AbstractApiClass

Verification result for an external storage service

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • bucket (str) – The address of the bucket, e.g., s3://your-bucket

  • verified (bool) – true if the bucket has passed verification

  • writePermission (bool) – true if Abacus.AI has permission to write to this bucket

  • authExpiresAt (str) – The time when the file connector’s auth expires, if applicable

bucket
verified
write_permission
auth_expires_at
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.FileConnectorInstructions(client, verified=None, writePermission=None, authOptions=None)

Bases: abacusai.return_class.AbstractApiClass

An object with a full description of the cloud storage bucket authentication options and bucket policy. Returns an error message if the parameters are invalid.

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • verified (bool) – True if the bucket has passed verification

  • writePermission (bool) – True if Abacus.AI has permission to write to this bucket

  • authOptions (list[dict]) – A list of options for giving Abacus.AI access to this bucket

verified
write_permission
auth_options
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.FileConnectorVerification(client, verified=None, writePermission=None)

Bases: abacusai.return_class.AbstractApiClass

The verification status of a file connector

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • verified (bool) – true if the bucket has passed verification

  • writePermission (bool) – true if Abacus.AI has permission to write to this bucket

verified
write_permission
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.FinetunedPretrainedModel(client, name=None, finetunedPretrainedModelId=None, finetunedPretrainedModelVersion=None, createdAt=None, updatedAt=None, config=None, baseModel=None, finetuningDatasetVersion=None, status=None, error=None)

Bases: abacusai.return_class.AbstractApiClass

A finetuned pretrained model

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • name (str) – The user-friendly name for the model.

  • finetunedPretrainedModelId (str) – The unique identifier of the model.

  • finetunedPretrainedModelVersion (str) – The unique identifier of the model version.

  • createdAt (str) – When the finetuned pretrained model was created.

  • updatedAt (str) – When the finetuned pretrained model was last updated.

  • config (dict) – The finetuned pretrained model configuration

  • baseModel (str) – The pretrained base model for fine tuning

  • finetuningDatasetVersion (str) – The finetuned dataset instance id of the model.

  • status (str) – The current status of the finetuned pretrained model.

  • error (str) – Relevant error if the status is FAILED.

name
finetuned_pretrained_model_id
finetuned_pretrained_model_version
created_at
updated_at
config
base_model
finetuning_dataset_version
status
error
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.ForecastingAnalysisGraphData(client, data=None, xAxis=None, yAxis=None, dataColumns=None, chartName=None, chartTypes=None, itemStatistics={}, chartDescriptions={})

Bases: abacusai.return_class.AbstractApiClass

Forecasting Analysis Graph Data representation.

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • data (list) – List of graph data

  • xAxis (str) – Feature that represents the x axis

  • yAxis (str) – Feature that represents the y axis

  • dataColumns (list) – Ordered name of the column for each rowwise data

  • chartName (str) – Name of the chart represented by the data

  • chartTypes (list) – Types of charts that can exist in the current data.

  • itemStatistics (ItemStatistics) – In item-wise charts, gives the mean, median, count, missing_percent, p10, p90, standard_deviation, min, and max

  • chartDescriptions (EdaChartDescription) – List of descriptions of what the chart contains

data
x_axis
y_axis
data_columns
chart_name
chart_types
item_statistics
chart_descriptions
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.ForecastingMonitorItemAnalysis(client, predictionItemAnalysis={}, trainingItemAnalysis={})

Bases: abacusai.return_class.AbstractApiClass

Forecasting Monitor Item Analysis of the latest version of the data.

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • predictionItemAnalysis (ForecastingAnalysisGraphData) – Data showing average, p10, p90, median sales across time for prediction data

  • trainingItemAnalysis (ForecastingAnalysisGraphData) – Data showing average, p10, p90, median sales across time for training data

prediction_item_analysis
training_item_analysis
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.ForecastingMonitorSummary(client, predictionTimestampCol=None, predictionTargetCol=None, trainingTimestampCol=None, trainingTargetCol=None, predictionItemId=None, trainingItemId=None, forecastFrequency=None, trainingTargetAcrossTime={}, predictionTargetAcrossTime={}, actualsHistogram={}, predictionsHistogram={}, trainHistoryData={}, predictHistoryData={}, targetDrift={}, historyDrift={})

Bases: abacusai.return_class.AbstractApiClass

Forecasting Monitor Summary of the latest version of the data.

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • predictionTimestampCol (str) – Feature in the data that represents the timestamp column.

  • predictionTargetCol (str) – Feature in the data that represents the target.

  • trainingTimestampCol (str) – Feature in the data that represents the timestamp column.

  • trainingTargetCol (str) – Feature in the data that represents the target.

  • predictionItemId (str) – Feature in the data that represents the item id.

  • trainingItemId (str) – Feature in the data that represents the item id.

  • forecastFrequency (str) – Frequency of data, could be hourly, daily, weekly, monthly, quarterly or yearly.

  • trainingTargetAcrossTime (ForecastingAnalysisGraphData) – Data showing average, p10, p90, median sales across time

  • predictionTargetAcrossTime (ForecastingAnalysisGraphData) – Data showing average, p10, p90, median sales across time

  • actualsHistogram (ForecastingAnalysisGraphData) – Data showing actuals histogram

  • predictionsHistogram (ForecastingAnalysisGraphData) – Data showing predictions histogram

  • trainHistoryData (ForecastingAnalysisGraphData) – Data showing length of history distribution

  • predictHistoryData (ForecastingAnalysisGraphData) – Data showing length of history distribution

  • targetDrift (FeatureDriftRecord) – Data showing drift of the target for all drift types: distance (KL divergence), js_distance, ws_distance, ks_statistic, psi, csi, chi_square

  • historyDrift (FeatureDriftRecord) – Data showing drift of the history for all drift types: distance (KL divergence), js_distance, ws_distance, ks_statistic, psi, csi, chi_square

prediction_timestamp_col
prediction_target_col
training_timestamp_col
training_target_col
prediction_item_id
training_item_id
forecast_frequency
training_target_across_time
prediction_target_across_time
actuals_histogram
predictions_histogram
train_history_data
predict_history_data
target_drift
history_drift
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict
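
A short, hedged sketch of inspecting a ForecastingMonitorSummary. It assumes summary is an instance already retrieved from a model monitor (the retrieval call itself is not shown here):

    # summary: a ForecastingMonitorSummary instance (assumed obtained elsewhere)
    print(summary.forecast_frequency)                    # e.g. "daily"
    print(summary.training_target_col, summary.prediction_target_col)
    # Drift records cover distance (KL divergence), js_distance, ws_distance,
    # ks_statistic, psi, csi, and chi_square
    print(summary.target_drift)
    summary_dict = summary.to_dict()                     # plain-dict view of all fields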

class abacusai.FunctionLogs(client, function=None, stats=None, stdout=None, stderr=None, algorithm=None, exception={})

Bases: abacusai.return_class.AbstractApiClass

Logs from an invocation of a function.

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • function (str) – The function this is logging

  • stats (dict) – Statistics for the start and end time execution for this function

  • stdout (str) – Standard out logs

  • stderr (str) – Standard error logs

  • algorithm (str) – Algorithm name for this function

  • exception (UserException) – The exception stacktrace

function
stats
stdout
stderr
algorithm
exception
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.GeneratedPitFeatureConfigOption(client, name=None, displayName=None, default=None, description=None)

Bases: abacusai.return_class.AbstractApiClass

The options to display for possible generated PIT aggregation functions

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • name (str) – The short name of the aggregation type.

  • displayName (str) – The display name of the aggregation type.

  • default (bool) – The default value for the option.

  • description (str) – The description of the aggregation type.

name
display_name
default
description
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.GraphDashboard(client, name=None, graphDashboardId=None, createdAt=None, projectId=None, pythonFunctionIds=None, plotReferenceIds=None, pythonFunctionNames=None, projectName=None, description=None)

Bases: abacusai.return_class.AbstractApiClass

A Graph Dashboard

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • name (str) – The user-friendly name for the graph dashboard.

  • graphDashboardId (str) – The unique identifier of the graph dashboard.

  • createdAt (str) – Date and time at which the graph dashboard was created, in ISO-8601 format.

  • projectId (str) – The unique identifier of the project this graph dashboard belongs to.

  • pythonFunctionIds (list[str]) – List of Python function IDs included in the dashboard.

  • plotReferenceIds (list[str]) – List of the graph reference IDs for the plots in the dashboard.

  • pythonFunctionNames (list[str]) – List of the names of the plots in the dashboard.

  • projectName (str) – The name of the project the graph dashboard belongs to.

  • description (str) – The description of the graph dashboard.

name
graph_dashboard_id
created_at
project_id
python_function_ids
plot_reference_ids
python_function_names
project_name
description
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

refresh()

Calls describe and refreshes the current object’s fields

Returns:

The current object

Return type:

GraphDashboard

describe()

Describes a given graph dashboard.

Parameters:

graph_dashboard_id (str) – Unique identifier for the graph dashboard.

Returns:

An object containing information about the graph dashboard.

Return type:

GraphDashboard

delete()

Deletes a graph dashboard

Parameters:

graph_dashboard_id (str) – Unique string identifier for the graph dashboard to be deleted.

update(name=None, python_function_ids=None)

Updates a graph dashboard

Parameters:
  • name (str) – Name of the dashboard.

  • python_function_ids (List) – List of unique string identifiers for the Python functions to be used in the graph dashboard.

Returns:

An object describing the graph dashboard.

Return type:

GraphDashboard
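
A hedged sketch of fetching and updating a graph dashboard. It assumes an authenticated ApiClient and that describe_graph_dashboard is the client-level counterpart of describe() above; the API key and IDs are placeholders:

    from abacusai import ApiClient

    client = ApiClient(api_key="YOUR_API_KEY")                         # placeholder key
    dashboard = client.describe_graph_dashboard("GRAPH_DASHBOARD_ID")  # placeholder ID
    print(dashboard.name, dashboard.python_function_ids)

    # Rename the dashboard and change which Python functions it plots
    dashboard = dashboard.update(
        name="Weekly KPI dashboard",
        python_function_ids=["PYTHON_FUNCTION_ID"],                    # placeholder ID
    )
    dashboard.refresh()  # re-sync local fields with the server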

class abacusai.HoldoutAnalysis(client, holdoutAnalysisId=None, name=None, featureGroupIds=None, modelId=None, modelName=None)

Bases: abacusai.return_class.AbstractApiClass

A holdout analysis object.

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • holdoutAnalysisId (str) – The unique identifier of the holdout analysis.

  • name (str) – The name of the holdout analysis.

  • featureGroupIds (list[str]) – The feature group ids associated with the holdout analysis.

  • modelId (str) – The model id associated with the holdout analysis.

  • modelName (str) – The model name associated with the holdout analysis.

holdout_analysis_id
name
feature_group_ids
model_id
model_name
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

rerun(model_version=None, algorithm=None)

Reruns a holdout analysis. A different model version and algorithm can be specified, provided they belong to the same model.

Parameters:
  • model_version (str) – (optional) Version of the model to use for the holdout analysis

  • algorithm (str) – (optional) ID of algorithm to use for the holdout analysis

Returns:

The created holdout analysis version

Return type:

HoldoutAnalysisVersion

refresh()

Calls describe and refreshes the current object’s fields

Returns:

The current object

Return type:

HoldoutAnalysis

describe()

Get a holdout analysis.

Parameters:

holdout_analysis_id (str) – ID of the holdout analysis to get

Returns:

The holdout analysis

Return type:

HoldoutAnalysis

list_versions()

List holdout analysis versions for a holdout analysis.

Parameters:

holdout_analysis_id (str) – ID of the holdout analysis to list holdout analysis versions for

Returns:

The holdout analysis versions

Return type:

list[HoldoutAnalysisVersion]
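
A minimal sketch of rerunning a holdout analysis and listing its versions, assuming analysis is a HoldoutAnalysis instance already retrieved from the client:

    # analysis: a HoldoutAnalysis instance (assumed obtained elsewhere)
    version = analysis.rerun()    # defaults to the latest model version and algorithm
    # Or pin a specific version/algorithm (both must belong to the same model):
    # version = analysis.rerun(model_version="MODEL_VERSION_ID", algorithm="ALGORITHM_ID")
    for v in analysis.list_versions():
        print(v.holdout_analysis_version, v.status)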

class abacusai.HoldoutAnalysisVersion(client, holdoutAnalysisVersion=None, holdoutAnalysisId=None, createdAt=None, status=None, error=None, modelId=None, modelVersion=None, algorithm=None, algoName=None, metrics=None, metricInfos=None)

Bases: abacusai.return_class.AbstractApiClass

A holdout analysis version object.

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • holdoutAnalysisVersion (str) – The unique identifier of the holdout analysis version.

  • holdoutAnalysisId (str) – The unique identifier of the holdout analysis.

  • createdAt (str) – The timestamp at which the holdout analysis version was created.

  • status (str) – The status of the holdout analysis version.

  • error (str) – The error message if the status is FAILED.

  • modelId (str) – The model id associated with the holdout analysis.

  • modelVersion (str) – The model version associated with the holdout analysis.

  • algorithm (str) – The algorithm used to train the model.

  • algoName (str) – The name of the algorithm used to train the model.

  • metrics (dict) – The metrics of the holdout analysis version.

  • metricInfos (dict) – The metric infos of the holdout analysis version.

holdout_analysis_version
holdout_analysis_id
created_at
status
error
model_id
model_version
algorithm
algo_name
metrics
metric_infos
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

refresh()

Calls describe and refreshes the current object’s fields

Returns:

The current object

Return type:

HoldoutAnalysisVersion

describe(get_metrics=False)

Get a holdout analysis version.

Parameters:

get_metrics (bool) – (optional) Whether to get the metrics for the holdout analysis version

Returns:

The holdout analysis version

Return type:

HoldoutAnalysisVersion

wait_for_results(timeout=3600)

A waiting call until holdout analysis for the version is complete

Parameters:

timeout (int) – The waiting time given to the call to finish, if it doesn’t finish by the allocated time, the call is said to be timed out.

get_status()

Gets the status of the holdout analysis version.

Returns:

A string describing the status of a holdout analysis version (pending, complete, etc.).

Return type:

str
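
Continuing the sketch above, a hedged example of waiting on a holdout analysis version and then pulling its metrics:

    # version: a HoldoutAnalysisVersion, e.g. returned by HoldoutAnalysis.rerun()
    version.wait_for_results(timeout=3600)        # blocks until complete or timed out
    print(version.get_status())
    version = version.describe(get_metrics=True)  # re-fetch with metrics populated
    print(version.metrics)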

class abacusai.HostedModelToken(client, createdAt=None, tag=None, trailingAuthToken=None, hostedModelTokenId=None)

Bases: abacusai.return_class.AbstractApiClass

A hosted model authentication token that is used to authenticate requests to an Abacus.AI hosted model

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • createdAt (str) – When the token was created

  • tag (str) – A user-friendly tag for the API key.

  • trailingAuthToken (str) – The last four characters of the unencrypted auth token

  • hostedModelTokenId (str) – The unique identifier attached to this authentication token

created_at
tag
trailing_auth_token
hosted_model_token_id
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.ImageGenSettings(client, model=None, settings=None)

Bases: abacusai.return_class.AbstractApiClass

Image generation settings

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • model (dict) – Dropdown for models available for image generation.

  • settings (dict) – The settings for each model.

model
settings
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.IndexingConfig(client, primaryKey=None, updateTimestampKey=None, lookupKeys=None)

Bases: abacusai.return_class.AbstractApiClass

The indexing config for a Feature Group

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • primaryKey (str) – A single key index

  • updateTimestampKey (str) – The primary timestamp feature

  • lookupKeys (list[str]) – A multi-key index. Cannot be used in conjunction with a primary key.

primary_key
update_timestamp_key
lookup_keys
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict
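
Since primaryKey and lookupKeys are mutually exclusive, a consumer of this object typically branches on which one is set. A minimal sketch, assuming config is an IndexingConfig instance obtained from a feature group:

    # config: an IndexingConfig instance (assumed obtained elsewhere)
    if config.primary_key:
        print("single-key index on", config.primary_key)
    elif config.lookup_keys:    # mutually exclusive with primary_key
        print("multi-key index on", config.lookup_keys)
    print("update timestamp feature:", config.update_timestamp_key)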

class abacusai.InferredDatabaseColumnToFeatureMappings(client, databaseColumnToFeatureMappings={})

Bases: abacusai.return_class.AbstractApiClass

Autocomplete mappings for database to connector columns

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • databaseColumnToFeatureMappings (DatabaseColumnFeatureMapping) – The inferred mappings of database columns to features

database_column_to_feature_mappings
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.InferredFeatureMappings(client, error=None, featureMappings={})

Bases: abacusai.return_class.AbstractApiClass

A description of the data use for a feature

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • error (str) – Error message if there was an error inferring the feature mappings

  • featureMappings (FeatureMapping) – The inferred feature mappings

error
feature_mappings
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.ItemStatistics(client, missingPercent=None, count=None, median=None, mean=None, p10=None, p90=None, stddev=None, min=None, max=None, lowerBound=None, upperBound=None)

Bases: abacusai.return_class.AbstractApiClass

ItemStatistics representation.

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • missingPercent (float) – percentage of missing values in data

  • count (int) – count of data

  • median (float) – median of the data

  • mean (float) – mean value of the data

  • p10 (float) – 10th percentile of the data

  • p90 (float) – 90th percentile of the data

  • stddev (float) – standard deviation of the data

  • min (int) – min value in the data

  • max (int) – max value in the data

  • lowerBound (float) – lower bound threshold of the data

  • upperBound (float) – upper bound threshold of the data

missing_percent
count
median
mean
p10
p90
stddev
min
max
lower_bound
upper_bound
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.LlmApp(client, llmAppId=None, name=None, description=None, projectId=None, deploymentId=None, createdAt=None, updatedAt=None, status=None)

Bases: abacusai.return_class.AbstractApiClass

An LLM App that can be used for generation. LLM Apps are specifically crafted to help with certain tasks like code generation or question answering.

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • llmAppId (str) – The unique identifier of the LLM App.

  • name (str) – The name of the LLM App.

  • description (str) – The description of the LLM App.

  • projectId (str) – The project ID of the deployment associated with the LLM App.

  • deploymentId (str) – The deployment ID associated with the LLM App.

  • createdAt (str) – The timestamp at which the LLM App was created.

  • updatedAt (str) – The timestamp at which the LLM App was updated.

  • status (str) – The status of the LLM App’s deployment.

llm_app_id
name
description
project_id
deployment_id
created_at
updated_at
status
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.LlmCodeBlock(client, language=None, code=None, start=None, end=None, valid=None)

Bases: abacusai.return_class.AbstractApiClass

Parsed code block from an LLM response

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • language (str) – The language of the code block, e.g. python, sql, etc.

  • code (str) – source code string

  • start (int) – index of the starting character of the code block in the original response

  • end (int) – index of the last character of the code block in the original response

  • valid (bool) – flag denoting whether the source code string is syntactically valid

language
code
start
end
valid
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.LlmExecutionPreview(client, error=None, sql=None)

Bases: abacusai.return_class.AbstractApiClass

Preview of executing queries using LLM.

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • error (str) – The error message if the preview failed.

  • sql (str) – Preview of SQL query generated by LLM.

error
sql
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.LlmExecutionResult(client, status=None, error=None, execution={}, preview={})

Bases: abacusai.return_class.AbstractApiClass

Results of executing queries using LLM.

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • status (str) – The status of the execution.

  • error (str) – The error message if the execution failed.

  • execution (ExecuteFeatureGroupOperation) – Information on execution of the query.

  • preview (LlmExecutionPreview) – Preview of executing queries using LLM.

status
error
execution
preview
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict
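
A hedged sketch of handling an LlmExecutionResult, falling back to the generated SQL preview when execution fails. It assumes result is an instance returned by an execute call (not shown):

    # result: an LlmExecutionResult instance (assumed obtained elsewhere)
    if result.error:
        print("execution failed:", result.error)
        if result.preview:                       # LlmExecutionPreview
            print("generated SQL was:", result.preview.sql)
    else:
        print("status:", result.status)
        print(result.execution)                  # ExecuteFeatureGroupOperation details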

class abacusai.LlmGeneratedCode(client, sql=None)

Bases: abacusai.return_class.AbstractApiClass

Code generated by LLM.

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • sql (str) – SQL query generated by LLM.

sql
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.LlmInput(client, content=None)

Bases: abacusai.return_class.AbstractApiClass

The result of encoding an object as input for a language model.

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • content (str) – Content of the response

content
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.LlmParameters(client, parameters=None)

Bases: abacusai.return_class.AbstractApiClass

The parameters of LLM for given inputs.

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • parameters (dict) – The parameters of LLM for given inputs.

parameters
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.LlmResponse(client, content=None, tokens=None, stopReason=None, llmName=None, inputTokens=None, outputTokens=None, totalTokens=None, codeBlocks={})

Bases: abacusai.return_class.AbstractApiClass

The response returned by LLM

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • content (str) – Full response from LLM.

  • tokens (int) – The number of tokens in the response.

  • stopReason (str) – The reason the response generation stopped.

  • llmName (str) – The name of the LLM model used to generate the response.

  • inputTokens (int) – The number of input tokens used in the LLM call.

  • outputTokens (int) – The number of output tokens generated in the LLM response.

  • totalTokens (int) – The total number of tokens (input + output) used in the LLM interaction.

  • codeBlocks (LlmCodeBlock) – A list of parsed code blocks from the raw LLM response

content
tokens
stop_reason
llm_name
input_tokens
output_tokens
total_tokens
code_blocks
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict
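
A minimal sketch of consuming an LlmResponse, assuming response was returned by a prompt-evaluation call on the client (the retrieval call is not shown). The filter on language and valid uses only the fields documented above:

    # response: an LlmResponse instance (assumed obtained elsewhere)
    print(response.llm_name, response.stop_reason)
    print("tokens:", response.input_tokens, "+", response.output_tokens,
          "=", response.total_tokens)
    # Keep only syntactically valid Python blocks parsed from the raw content
    python_blocks = [b.code for b in (response.code_blocks or [])
                     if b.language == "python" and b.valid]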

class abacusai.MemoryOptions(client, cpu={}, gpu={})

Bases: abacusai.return_class.AbstractApiClass

The overall memory options for executing a job

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • cpu (CpuGpuMemorySpecs) – Contains information about the default CPU and list of CPU memory & size options

  • gpu (CpuGpuMemorySpecs) – Contains information about the default GPU and list of GPU memory & size options

cpu
gpu
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.MessagingConnectorResponse(client, welcomeMessage=None, defaultMessage=None, disclaimer=None, messagingBotName=None, useDefaultLabel=None, initAckReq=None, defaultLabels=None, enabledExternalLinks=None)

Bases: abacusai.return_class.AbstractApiClass

The response to view label data for Teams

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • welcomeMessage (str) – The message the user receives on the first installation of the app

  • defaultMessage (str) – The message the user receives when they send "hi", "hello", or "help"

  • disclaimer (str) – Disclaimer given along with every bot response

  • messagingBotName (str) – The name to display in various places instead of Abacus.AI

  • useDefaultLabel (bool) – Whether to use the default Abacus.AI label

  • initAckReq (bool) – Set to true if an initial acknowledgment of the query is required by the user

  • defaultLabels (dict) – Dictionary of default labels, used if the user-specified labels aren't set

  • enabledExternalLinks (list) – List of external applications for which external links are applicable

welcome_message
default_message
disclaimer
messaging_bot_name
use_default_label
init_ack_req
default_labels
enabled_external_links
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.Model(client, name=None, modelId=None, modelConfigType=None, modelPredictionConfig=None, createdAt=None, projectId=None, shared=None, sharedAt=None, trainFunctionName=None, predictFunctionName=None, predictManyFunctionName=None, initializeFunctionName=None, trainingInputTables=None, sourceCode=None, cpuSize=None, memory=None, trainingFeatureGroupIds=None, algorithmModelConfigs=None, trainingVectorStoreVersions=None, documentRetrievers=None, documentRetrieverIds=None, isPythonModel=None, defaultAlgorithm=None, customAlgorithmConfigs=None, restrictedAlgorithms=None, useGpu=None, notebookId=None, trainingRequired=None, location={}, refreshSchedules={}, codeSource={}, databaseConnector={}, dataLlmFeatureGroups={}, latestModelVersion={}, modelConfig={})

Bases: abacusai.return_class.AbstractApiClass

A model

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • name (str) – The user-friendly name for the model.

  • modelId (str) – The unique identifier of the model.

  • modelConfigType (str) – Name of the TrainingConfig class of the model_config.

  • modelPredictionConfig (dict) – The prediction config options for the model.

  • createdAt (str) – Date and time at which the model was created.

  • projectId (str) – The project this model belongs to.

  • shared (bool) – Whether the model is shared in the Abacus.AI model showcase.

  • sharedAt (str) – The date and time at which the model was shared to the model showcase

  • trainFunctionName (str) – Name of the function found in the source code that will be executed to train the model. It is not executed when this function is run.

  • predictFunctionName (str) – Name of the function found in the source code that will be executed to run predictions through the model. It is not executed when this function is run.

  • predictManyFunctionName (str) – Name of the function found in the source code that will be executed to run batch predictions through the model.

  • initializeFunctionName (str) – Name of the function found in the source code to initialize the trained model before it is used to make predictions

  • trainingInputTables (list) – List of feature groups that are supplied to the train function as parameters. Each of the parameters is a materialized DataFrame (same type as the function's return value).

  • sourceCode (str) – Python code used to make the model.

  • cpuSize (str) – CPU size specified for the Python model training.

  • memory (int) – Memory in GB specified for the python model training.

  • trainingFeatureGroupIds (list of unique string identifiers) – The unique identifiers of the feature groups used as inputs to train this model.

  • algorithmModelConfigs (list[dict]) – List of algorithm specific training configs.

  • trainingVectorStoreVersions (list) – The vector store version IDs used as inputs during training to create this ModelVersion.

  • documentRetrievers (list) – List of document retrievers used to create this model.

  • documentRetrieverIds (list) – List of document retriever IDs used to create this model.

  • isPythonModel (bool) – Whether this model is handled as a Python model

  • defaultAlgorithm (str) – If set, this algorithm will always be used when deploying the model regardless of the model metrics

  • customAlgorithmConfigs (dict) – User-defined configs for each of the user-defined custom algorithms

  • restrictedAlgorithms (dict) – User-selected algorithms to train.

  • useGpu (bool) – Whether this model uses a GPU.

  • notebookId (str) – The notebook associated with this model.

  • trainingRequired (bool) – If training is required to keep the model up-to-date.

  • latestModelVersion (ModelVersion) – The latest model version.

  • location (ModelLocation) – Location information for models that are imported.

  • refreshSchedules (RefreshSchedule) – List of refresh schedules that indicate when the next model version will be trained

  • codeSource (CodeSource) – If a python model, information on the source code

  • databaseConnector (DatabaseConnector) – Database connector used by the model.

  • dataLlmFeatureGroups (FeatureGroup) – List of feature groups used by the model for queries

  • modelConfig (TrainingConfig) – The training config options used to train this model.

name
model_id
model_config_type
model_prediction_config
created_at
project_id
shared
shared_at
train_function_name
predict_function_name
predict_many_function_name
initialize_function_name
training_input_tables
source_code
cpu_size
memory
training_feature_group_ids
algorithm_model_configs
training_vector_store_versions
document_retrievers
document_retriever_ids
is_python_model
default_algorithm
custom_algorithm_configs
restricted_algorithms
use_gpu
notebook_id
training_required
location
refresh_schedules
code_source
database_connector
data_llm_feature_groups
latest_model_version
model_config
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

describe_train_test_data_split_feature_group()

Get the train and test data split for a trained model by its unique identifier. This is only supported for models with custom algorithms.

Parameters:

model_id (str) – The unique ID of the model. By default, the latest model version will be returned if no version is specified.

Returns:

The feature group containing the training data and fold information.

Return type:

FeatureGroup

refresh()

Calls describe and refreshes the current object’s fields

Returns:

The current object

Return type:

Model

describe()

Retrieves a full description of the specified model.

Parameters:

model_id (str) – Unique string identifier associated with the model.

Returns:

Description of the model.

Return type:

Model

rename(name)

Renames a model

Parameters:

name (str) – The new name to assign to the model.

update_python(function_source_code=None, train_function_name=None, predict_function_name=None, predict_many_function_name=None, initialize_function_name=None, training_input_tables=None, cpu_size=None, memory=None, package_requirements=None, use_gpu=None, is_thread_safe=None, training_config=None)

Updates an existing Python Model using user-provided Python code. If a list of input feature groups is supplied, they will be provided as arguments to the train and predict functions with the materialized feature groups for those input feature groups.

This method expects functionSourceCode to be a valid language source file which contains the functions named trainFunctionName and predictFunctionName. The function named trainFunctionName returns the model object that results from training, and the function named predictFunctionName has no well-defined return type, as it returns the prediction made by the model, which can be anything.

Parameters:
  • function_source_code (str) – Contents of a valid Python source code file. The source code should contain the functions named trainFunctionName and predictFunctionName. A list of allowed import and system libraries for each language is specified in the user functions documentation section.

  • train_function_name (str) – Name of the function found in the source code that will be executed to train the model. It is not executed when this function is run.

  • predict_function_name (str) – Name of the function found in the source code that will be executed to run predictions through the model. It is not executed when this function is run.

  • predict_many_function_name (str) – Name of the function found in the source code that will be executed to run batch predictions through the model. It is not executed when this function is run.

  • initialize_function_name (str) – Name of the function found in the source code to initialize the trained model before using it to make predictions using the model.

  • training_input_tables (list) – List of feature groups that are supplied to the train function as parameters. Each of the parameters is a materialized DataFrame (same type as the function's return value).

  • cpu_size (str) – Size of the CPU for the model training function.

  • memory (int) – Memory (in GB) for the model training function.

  • package_requirements (list) – List of package requirement strings. For example: ['numpy==1.2.3', 'pandas>=1.4.0'].

  • use_gpu (bool) – Whether this model needs a GPU

  • is_thread_safe (bool) – Whether this model is thread safe

  • training_config (TrainingConfig) – The training config used to train this model.

Returns:

The updated model.

Return type:

Model
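
A hedged sketch of what a call to update_python might look like. The function bodies, table name, and sizes below are illustrative only; the exact signatures your train/predict functions must accept depend on your training input tables:

    import textwrap

    # model: an existing Python Model instance (assumed obtained elsewhere)
    source = textwrap.dedent("""
        def train(training_table):
            # training_table is the materialized DataFrame of the input feature group
            return training_table.mean()    # return any picklable model artifact

        def predict(model_artifact, query):
            return model_artifact           # the prediction can be anything
    """)
    model = model.update_python(
        function_source_code=source,
        train_function_name="train",
        predict_function_name="predict",
        training_input_tables=["TRAINING_TABLE_NAME"],  # placeholder table name
        memory=16,
    )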

update_python_zip(train_function_name=None, predict_function_name=None, predict_many_function_name=None, train_module_name=None, predict_module_name=None, training_input_tables=None, cpu_size=None, memory=None, package_requirements=None, use_gpu=None)

Updates an existing Python Model using a provided zip file. If a list of input feature groups is supplied, they will be provided as arguments to the train and predict functions with the materialized feature groups for those input feature groups.

This method expects trainModuleName and predictModuleName to be valid language source files which contain the functions named trainFunctionName and predictFunctionName, respectively. The function named trainFunctionName returns the model object that results from training, and the function named predictFunctionName has no well-defined return type, as it returns the prediction made by the model, which can be anything.

Parameters:
  • train_function_name (str) – Name of the function found in the train module that will be executed to train the model. It is not executed when this function is run.

  • predict_function_name (str) – Name of the function found in the predict module that will be executed to run predictions through the model. It is not executed when this function is run.

  • predict_many_function_name (str) – Name of the function found in the predict module that will be executed to run batch predictions through the model. It is not executed when this function is run.

  • train_module_name (str) – Full path of the module that contains the train function from the root of the zip.

  • predict_module_name (str) – Full path of the module that contains the predict function from the root of the zip.

  • training_input_tables (list) – List of feature groups that are supplied to the train function as parameters. Each of the parameters is a materialized DataFrame (same type as the function's return value).

  • cpu_size (str) – Size of the CPU for the model training function.

  • memory (int) – Memory (in GB) for the model training function.

  • package_requirements (list) – List of package requirement strings. For example: ['numpy==1.2.3', 'pandas>=1.4.0'].

  • use_gpu (bool) – Whether this model needs a GPU

Returns:

The updated model.

Return type:

Upload

update_python_git(application_connector_id=None, branch_name=None, python_root=None, train_function_name=None, predict_function_name=None, predict_many_function_name=None, train_module_name=None, predict_module_name=None, training_input_tables=None, cpu_size=None, memory=None, use_gpu=None)

Updates an existing Python model using an existing Git application connector. If a list of input feature groups is supplied, they will be provided as arguments to the train and predict functions with the materialized feature groups for those input feature groups.

This method expects trainModuleName and predictModuleName to be valid language source files which contain the functions named trainFunctionName and predictFunctionName, respectively. The function named trainFunctionName returns the model object that results from training, and the function named predictFunctionName has no well-defined return type, as it returns the prediction made by the model, which can be anything.

Parameters:
  • application_connector_id (str) – The unique ID associated with the Git application connector.

  • branch_name (str) – Name of the branch in the Git repository to be used for training.

  • python_root (str) – Path from the top level of the Git repository to the directory containing the Python source code. If not provided, the default is the root of the Git repository.

  • train_function_name (str) – Name of the function found in the train module that will be executed to train the model. It is not executed when this function is run.

  • predict_function_name (str) – Name of the function found in the predict module that will be executed to run predictions through the model. It is not executed when this function is run.

  • predict_many_function_name (str) – Name of the function found in the predict module that will be executed to run batch predictions through the model. It is not executed when this function is run.

  • train_module_name (str) – Full path of the module that contains the train function from the root of the Git repository.

  • predict_module_name (str) – Full path of the module that contains the predict function from the root of the Git repository.

  • training_input_tables (list) – List of feature groups that are supplied to the train function as parameters. Each of the parameters is a materialized DataFrame (same type as the function's return value).

  • cpu_size (str) – Size of the CPU for the model training function.

  • memory (int) – Memory (in GB) for the model training function.

  • use_gpu (bool) – Whether this model needs a GPU

Returns:

The updated model.

Return type:

Model

set_training_config(training_config, feature_group_ids=None)

Edits the default model training config

Parameters:
  • training_config (TrainingConfig) – The training config used to train this model.

  • feature_group_ids (List) – The list of feature groups used as input to the model.

Returns:

The model object corresponding to the updated training config.

Return type:

Model

set_prediction_params(prediction_config)

Sets the model prediction config for the model

Parameters:

prediction_config (dict) – Prediction configuration for the model.

Returns:

Model object after the prediction configuration is applied.

Return type:

Model

get_metrics(model_version=None, return_graphs=False, validation=False)

Retrieves metrics for all the algorithms trained in this model version.

If only the model’s unique identifier (model_id) is specified, the latest trained version of the model (model_version) is used.

Parameters:
  • model_version (str) – Version of the model.

  • return_graphs (bool) – If true, will return the information used for the graphs on the model metrics page such as PR Curve per label.

  • validation (bool) – If true, will return the validation metrics instead of the test metrics.

Returns:

An object containing the model metrics and explanations for what each metric means.

Return type:

ModelMetrics

list_versions(limit=100, start_after_version=None)

Retrieves a list of versions for a given model.

Parameters:
  • limit (int) – Maximum length of the list of all model versions.

  • start_after_version (str) – Unique string identifier of the version after which the list starts.

Returns:

An array of model versions.

Return type:

list[ModelVersion]

retrain(deployment_ids=None, feature_group_ids=None, custom_algorithms=None, builtin_algorithms=None, custom_algorithm_configs=None, cpu_size=None, memory=None, training_config=None, algorithm_training_configs=None)

Retrains the specified model, with an option to choose the deployments to which the retraining will be deployed.

Parameters:
  • deployment_ids (List) – List of unique string identifiers of deployments to automatically deploy to.

  • feature_group_ids (List) – List of feature group IDs provided by the user to train the model on.

  • custom_algorithms (list) – List of user-defined algorithms to train. If not set, the algorithms from the previous run, plus any applicable new custom algorithms, will be used.

  • builtin_algorithms (list) – List of algorithm names or algorithm IDs of Abacus.AI built-in algorithms to train. If not set, the algorithms from the previous run, plus any applicable new built-in algorithms, will be used.

  • custom_algorithm_configs (dict) – User-defined training configs for each custom algorithm.

  • cpu_size (str) – Size of the CPU for the user-defined algorithms during training.

  • memory (int) – Memory (in GB) for the user-defined algorithms during training.

  • training_config (TrainingConfig) – The training config used to train this model.

  • algorithm_training_configs (list) – List of algorithm-specific training configs that will be part of the model training AutoML run.

Returns:

The model that is being retrained.

Return type:

Model

delete()

Deletes the specified model and all its versions. Models which are currently used in deployments cannot be deleted.

Parameters:

model_id (str) – Unique string identifier of the model to delete.

set_default_algorithm(algorithm=None, data_cluster_type=None)

Sets the model’s algorithm to default for all new deployments

Parameters:
  • algorithm (str) – Algorithm to pin in the model.

  • data_cluster_type (str) – Data cluster type to set the lead model for.

list_artifacts_exports(limit=25)

List all the model artifacts exports.

Parameters:

limit (int) – Maximum length of the list of all exports.

Returns:

List of model artifacts exports.

Return type:

list[ModelArtifactsExport]

get_training_types_for_deployment(model_version=None, algorithm=None)

Returns types of models that can be deployed for a given model instance ID.

Parameters:
  • model_version (str) – The unique ID associated with the model version to deploy.

  • algorithm (str) – The unique ID associated with the algorithm to deploy.

Returns:

Model training types for deployment.

Return type:

ModelTrainingTypeForDeployment

update_agent(function_source_code=None, agent_function_name=None, memory=None, package_requirements=None, description=None, enable_binary_input=None, agent_input_schema=None, agent_output_schema=None, workflow_graph=None, agent_interface=None, included_modules=None, org_level_connectors=None, user_level_connectors=None, initialize_function_name=None, initialize_function_code=None)

Updates an existing AI Agent. A new version of the agent will be created and published.

Parameters:
  • memory (int) – Memory (in GB) for the agent.

  • package_requirements (list) – A list of package requirement strings. For example: ['numpy==1.2.3', 'pandas>=1.4.0'].

  • description (str) – A description of the agent, including its purpose and instructions.

  • workflow_graph (WorkflowGraph) – The workflow graph for the agent.

  • agent_interface (AgentInterface) – The interface that the agent will be deployed with.

  • included_modules (List) – A list of user created custom modules to include in the agent’s environment.

  • org_level_connectors (List) – A list of org level connector ids to be used by the agent.

  • user_level_connectors (Dict) – A dictionary mapping ApplicationConnectorType keys to lists of OAuth scopes. Each key represents a specific user level application connector, while the value is a list of scopes that define the permissions granted to the application.

  • initialize_function_name (str) – The name of the function to be used for initialization.

  • initialize_function_code (str) – The function code to be used for initialization.

  • function_source_code (str)

  • agent_function_name (str)

  • enable_binary_input (bool)

  • agent_input_schema (dict)

  • agent_output_schema (dict)

Returns:

The updated agent.

Return type:

Agent

wait_for_training(timeout=None)

A waiting call until model is trained.

Parameters:

timeout (int) – The waiting time given to the call to finish, if it doesn’t finish by the allocated time, the call is said to be timed out.

wait_for_evaluation(timeout=None)

A waiting call until model is evaluated completely.

Parameters:

timeout (int) – The waiting time given to the call to finish, if it doesn’t finish by the allocated time, the call is said to be timed out.

wait_for_publish(timeout=None)

A waiting call until agent is published.

Parameters:

timeout (int) – The waiting time given to the call to finish, if it doesn’t finish by the allocated time, the call is said to be timed out.

wait_for_full_automl(timeout=None)

A waiting call until full AutoML cycle is completed.

Parameters:

timeout (int) – The waiting time given to the call to finish, if it doesn’t finish by the allocated time, the call is said to be timed out.

get_status(get_automl_status=False)

Gets the status of the model training.

Returns:

A string describing the status of a model training (pending, complete, etc.).

Return type:

str

Parameters:

get_automl_status (bool)

create_refresh_policy(cron)

Creates a refresh policy for a model.

Parameters:

cron (str) – A cron-style string to set the refresh time.

Returns:

The refresh policy object.

Return type:

RefreshPolicy

list_refresh_policies()

Gets the refresh policies in a list.

Returns:

A list of refresh policy objects.

Return type:

List[RefreshPolicy]

get_train_test_feature_group_as_pandas()

Get the model train test data split feature group as pandas.

Returns:

A pandas DataFrame of the training data, including the fold column.

Return type:

pandas.DataFrame
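
A hedged, end-to-end sketch tying several of the methods above together: retrain the model, wait, and inspect metrics. It assumes model is a Model instance already retrieved from the client:

    # model: a Model instance (assumed obtained elsewhere)
    model = model.retrain()                  # reuses prior settings by default
    model.wait_for_training(timeout=7200)    # block until trained or timed out
    print(model.get_status())

    metrics = model.get_metrics()            # latest trained version by default
    print(metrics.selected_algorithm_name)
    for version in model.list_versions(limit=5):
        print(version)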

class abacusai.ModelArtifactsExport(client, modelArtifactsExportId=None, modelVersion=None, outputLocation=None, status=None, createdAt=None, exportCompletedAt=None, error=None)

Bases: abacusai.return_class.AbstractApiClass

A Model Artifacts Export Job

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • modelArtifactsExportId (str) – Unique identifier for this export.

  • modelVersion (str) – Version of the model being exported.

  • outputLocation (str) – File Connector location the model artifacts are being written to.

  • status (str) – Current status of the export.

  • createdAt (str) – Timestamp at which the export was created (ISO-8601 format).

  • exportCompletedAt (str) – Timestamp at which the export completed (ISO-8601 format).

  • error (str) – If status is FAILED, this field will be populated with an error.

model_artifacts_export_id
model_version
output_location
status
created_at
export_completed_at
error
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

refresh()

Calls describe and refreshes the current object’s fields

Returns:

The current object

Return type:

ModelArtifactsExport

describe()

Get the description and status of the model artifacts export.

Parameters:

model_artifacts_export_id (str) – A unique string identifier for the export.

Returns:

Object describing the export and its status.

Return type:

ModelArtifactsExport

wait_for_results(timeout=3600)

A waiting call until model artifacts export is created.

Parameters:

timeout (int) – The waiting time given to the call to finish, if it doesn’t finish by the allocated time, the call is said to be timed out.

get_status()

Gets the status of the model artifacts export.

Returns:

A string describing the status of a model artifacts export (pending, complete, etc.).

Return type:

str
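
A minimal sketch of polling a model artifacts export, assuming export is a ModelArtifactsExport instance returned by an export call (not shown); the status string compared against is illustrative:

    # export: a ModelArtifactsExport instance (assumed obtained elsewhere)
    export.wait_for_results(timeout=3600)
    export.refresh()                          # re-sync fields after waiting
    if export.status == "FAILED":             # status value is illustrative
        print(export.error)
    else:
        print("artifacts written to:", export.output_location)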

class abacusai.ModelBlueprintExport(client, modelVersion=None, currentTrainingConfig=None, modelBlueprintStages={})

Bases: abacusai.return_class.AbstractApiClass

Model Blueprint

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • modelVersion (str) – Version of the model that the blueprint is for.

  • currentTrainingConfig (dict) – The current training configuration for the model. It can be used to get training configs and train a new model

  • modelBlueprintStages (ModelBlueprintStage) – The stages of the model blueprint. Each one includes the stage name, display name, description, parameters, and predecessors.

model_version
current_training_config
model_blueprint_stages
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.ModelBlueprintStage(client, stageName=None, displayName=None, description=None, params=None, predecessors=None)

Bases: abacusai.return_class.AbstractApiClass

A stage in the model blueprint export process.

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • stageName (str) – The name of the stage.

  • displayName (str) – The display name of the stage.

  • description (str) – The description of the stage.

  • params (dict) – The parameters for the stage.

  • predecessors (list) – A list of stages that occur directly before this stage.

stage_name
display_name
description
params
predecessors
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.ModelLocation(client, location=None, artifactNames=None)

Bases: abacusai.return_class.AbstractApiClass

Provides location information for the plug-and-play model.

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • location (str) – Location of the plug-and-play model.

  • artifactNames (dict) – Representations of the names of the artifacts used to create the model.

location
artifact_names
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.ModelMetrics(client, algoMetrics=None, selectedAlgorithm=None, selectedAlgorithmName=None, modelId=None, modelVersion=None, metricNames=None, targetColumn=None, trainValTestSplit=None, trainingCompletedAt=None)

Bases: abacusai.return_class.AbstractApiClass

Metrics of the trained model.

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • algoMetrics (dict) – Dictionary mapping algorithm ID to algorithm name and algorithm metrics dictionary

  • selectedAlgorithm (str) – The algorithm ID of the selected (default) algorithm that will be used in deployments of this Model Version

  • selectedAlgorithmName (str) – The algorithm name of the selected (default) algorithm that will be used in deployments of this Model Version

  • modelId (str) – The Model ID

  • modelVersion (str) – The Model Version

  • metricNames (dict) – Maps shorthand names of the metrics to their verbose names

  • targetColumn (str) – The target feature that the model was trained to predict

  • trainValTestSplit (dict) – Info on train, val and test split

  • trainingCompletedAt (datetime) – Timestamp when training was completed

algo_metrics
selected_algorithm
selected_algorithm_name
model_id
model_version
metric_names
target_column
train_val_test_split
training_completed_at
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.ModelMonitor(client, modelMonitorId=None, name=None, createdAt=None, projectId=None, trainingFeatureGroupId=None, predictionFeatureGroupId=None, predictionFeatureGroupVersion=None, trainingFeatureGroupVersion=None, alertConfig=None, biasMetricId=None, metricConfigs=None, featureGroupMonitorConfigs=None, metricTypes=None, modelId=None, starred=None, batchPredictionId=None, monitorType=None, edaConfigs=None, trainingForecastConfig=None, predictionForecastConfig=None, forecastFrequency=None, trainingFeatureGroupSampling=None, predictionFeatureGroupSampling=None, monitorDriftConfig=None, predictionDataUseMappings=None, trainingDataUseMappings=None, refreshSchedules={}, latestMonitorModelVersion={})

Bases: abacusai.return_class.AbstractApiClass

A model monitor

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • modelMonitorId (str) – The unique identifier of the model monitor.

  • name (str) – The user-friendly name for the model monitor.

  • createdAt (str) – Date and time at which the model was created.

  • projectId (str) – The project this model belongs to.

  • trainingFeatureGroupId (list[str]) – Feature group IDs that this model monitor is monitoring.

  • predictionFeatureGroupId (list[str]) – Feature group IDs that this model monitor is monitoring.

  • predictionFeatureGroupVersion (list[str]) – Feature group versions that this model monitor is monitoring.

  • trainingFeatureGroupVersion (list[str]) – Feature group versions that this model monitor is monitoring.

  • alertConfig (dict) – Alerting configuration for this model monitor.

  • biasMetricId (str) – The bias metric ID

  • metricConfigs (dict) – Configurations for model monitor

  • featureGroupMonitorConfigs (dict) – Configurations for feature group monitor

  • metricTypes (dict) – List of metric types

  • modelId (str) – Model ID that this model monitor is monitoring.

  • starred (bool) – Whether this model monitor is starred.

  • batchPredictionId (str) – The batch prediction ID this model monitor monitors

  • monitorType (str) – The type of the monitor, one of MODEL_MONITOR, or FEATURE_GROUP_MONITOR

  • edaConfigs (dict) – The configs for EDA

  • trainingForecastConfig (dict) – The training config for forecast monitors

  • predictionForecastConfig (dict) – The prediction config for forecast monitors

  • forecastFrequency (str) – The frequency of the forecast

  • trainingFeatureGroupSampling (bool) – Whether or not we sample from training feature group

  • predictionFeatureGroupSampling (bool) – Whether or not we sample from prediction feature group

  • monitorDriftConfig (dict) – The monitor drift config for the monitor

  • predictionDataUseMappings (dict) – The data_use mapping of the prediction features

  • trainingDataUseMappings (dict) – The data_use mapping of the training features

  • latestMonitorModelVersion (ModelMonitorVersion) – The latest model monitor version.

  • refreshSchedules (RefreshSchedule) – List of refresh schedules that indicate when the next model version will be trained.

model_monitor_id
name
created_at
project_id
training_feature_group_id
prediction_feature_group_id
prediction_feature_group_version
training_feature_group_version
alert_config
bias_metric_id
metric_configs
feature_group_monitor_configs
metric_types
model_id
starred
batch_prediction_id
monitor_type
eda_configs
training_forecast_config
prediction_forecast_config
forecast_frequency
training_feature_group_sampling
prediction_feature_group_sampling
monitor_drift_config
prediction_data_use_mappings
training_data_use_mappings
refresh_schedules
latest_monitor_model_version
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

rerun()

Re-runs the specified model monitor.

Parameters:

model_monitor_id (str) – Unique string identifier of the model monitor to re-run.

Returns:

The model monitor that is being re-run.

Return type:

ModelMonitor

refresh()

Calls describe and refreshes the current object’s fields

Returns:

The current object

Return type:

ModelMonitor

describe()

Retrieves a full description of the specified model monitor.

Parameters:

model_monitor_id (str) – Unique string identifier associated with the model monitor.

Returns:

Description of the model monitor.

Return type:

ModelMonitor

get_summary()

Gets the summary of a model monitor across versions.

Parameters:

model_monitor_id (str) – A unique string identifier associated with the model monitor.

Returns:

An object describing integrity, bias violations, model accuracy and drift for the model monitor.

Return type:

ModelMonitorSummary

list_versions(limit=100, start_after_version=None)

Retrieves a list of versions for a given model monitor.

Parameters:
  • limit (int) – The maximum length of the list of all model monitor versions.

  • start_after_version (str) – The ID of the version after which the list starts.

Returns:

A list of model monitor versions.

Return type:

list[ModelMonitorVersion]

rename(name)

Renames a model monitor

Parameters:

name (str) – The new name to apply to the model monitor.

delete()

Deletes the specified Model Monitor and all its versions.

Parameters:

model_monitor_id (str) – Unique identifier of the Model Monitor to delete.

list_monitor_alerts_for_monitor(realtime_monitor_id=None)

Retrieves the list of monitor alerts for a specified monitor. One of model_monitor_id or realtime_monitor_id is required, but not both.

Parameters:

realtime_monitor_id (str) – The unique ID associated with the real-time monitor.

Returns:

A list of monitor alerts.

Return type:

list[MonitorAlert]
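
A hedged sketch of inspecting a model monitor with the methods above, assuming monitor is a ModelMonitor instance already retrieved from the client:

    # monitor: a ModelMonitor instance (assumed obtained elsewhere)
    summary = monitor.get_summary()           # integrity, bias, accuracy, drift
    print(summary.model_accuracy)
    for version in monitor.list_versions(limit=10):
        print(version.model_monitor_version, version.status)
    monitor = monitor.rerun()                 # recompute on the latest data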

class abacusai.ModelMonitorOrgSummary(client, summary=None, featureDrift=None, labelDrift=None, dataIntegrity=None, performance=None, alerts=None, monitorData=None, totalStarredMonitors=None)

Bases: abacusai.return_class.AbstractApiClass

A summary of an organization’s model monitors

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • summary (dict) – Count of monitors, count of versions, count of total rows of prediction data, count of failed versions.

  • featureDrift (dict) – Percentage of monitors with and without KL divergence > 2.

  • labelDrift (dict) – Histogram of label drift across versions.

  • dataIntegrity (dict) – Counts of violations.

  • performance (dict) – Model accuracy information.

  • alerts (dict) – Count of alerts that are raised.

  • monitorData (dict) – Information about monitors used in the summary for each time period.

  • totalStarredMonitors (int) – Total number of starred monitors.

summary
feature_drift
label_drift
data_integrity
performance
alerts
monitor_data
total_starred_monitors
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.ModelMonitorSummary(client, modelAccuracy=None, modelDrift=None, dataIntegrity=None, biasViolations=None, alerts=None)

Bases: abacusai.return_class.AbstractApiClass

A summary of model monitor

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • modelAccuracy (list) – A list of model accuracy objects including accuracy and monitor version information.

  • modelDrift (list) – A list of model drift objects including label and prediction drifts and monitor version information.

  • dataIntegrity (list) – A list of data integrity objects including counts of violations and monitor version information.

  • biasViolations (list) – A list of bias objects including bias counts and monitor version information.

  • alerts (list) – A list of alerts by type for each model monitor instance

model_accuracy
model_drift
data_integrity
bias_violations
alerts
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.ModelMonitorSummaryFromOrg(client, data=None, infos=None)

Bases: abacusai.return_class.AbstractApiClass

A summary of a model monitor for a given organization

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • data (list) – A list of either model accuracy, drift, data integrity, or bias chart objects and their monitor version information.

  • infos (dict) – A dictionary of model monitor information.

data
infos
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.ModelMonitorVersion(client, modelMonitorVersion=None, status=None, modelMonitorId=None, monitoringStartedAt=None, monitoringCompletedAt=None, trainingFeatureGroupVersion=None, predictionFeatureGroupVersion=None, error=None, pendingDeploymentIds=None, failedDeploymentIds=None, metricConfigs=None, featureGroupMonitorConfigs=None, metricTypes=None, modelVersion=None, batchPredictionVersion=None, edaConfigs=None, trainingForecastConfig=None, predictionForecastConfig=None, forecastFrequency=None, monitorDriftConfig=None, predictionDataUseMappings=None, trainingDataUseMappings=None)

Bases: abacusai.return_class.AbstractApiClass

A version of a model monitor

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • modelMonitorVersion (str) – The unique identifier of a model monitor version.

  • status (str) – The current status of the model.

  • modelMonitorId (str) – A reference to the model monitor this version belongs to.

  • monitoringStartedAt (str) – The start time and date of the monitoring process.

  • monitoringCompletedAt (str) – The end time and date of the monitoring process.

  • trainingFeatureGroupVersion (list[str]) – Training feature group version IDs that this refresh pipeline run is monitoring.

  • predictionFeatureGroupVersion (list[str]) – Prediction feature group version IDs that this refresh pipeline run is monitoring.

  • error (str) – Relevant error if the status is FAILED.

  • pendingDeploymentIds (list) – List of deployment IDs where deployment is pending.

  • failedDeploymentIds (list) – List of failed deployment IDs.

  • metricConfigs (list[dict]) – List of metric configs for the model monitor instance.

  • featureGroupMonitorConfigs (dict) – Configurations for feature group monitor

  • metricTypes (list) – List of metric types.

  • modelVersion (list[str]) – Model version IDs that this refresh pipeline run is monitoring.

  • batchPredictionVersion (str) – The batch prediction version this model monitor is monitoring

  • edaConfigs (list) – The list of eda configs for the version

  • trainingForecastConfig (dict) – The training forecast config for the monitor version

  • predictionForecastConfig (dict) – The prediction forecast config for the monitor version

  • forecastFrequency (str) – The forecast frequency for the monitor version

  • monitorDriftConfig (dict) – The monitor drift config for the monitor version

  • predictionDataUseMappings (dict) – The mapping of prediction data use to feature group version

  • trainingDataUseMappings (dict) – The mapping of training data use to feature group version

model_monitor_version
status
model_monitor_id
monitoring_started_at
monitoring_completed_at
training_feature_group_version
prediction_feature_group_version
error
pending_deployment_ids
failed_deployment_ids
metric_configs
feature_group_monitor_configs
metric_types
model_version
batch_prediction_version
eda_configs
training_forecast_config
prediction_forecast_config
forecast_frequency
monitor_drift_config
prediction_data_use_mappings
training_data_use_mappings
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

get_prediction_drift()

Gets the label and prediction drifts for a model monitor.

Parameters:

model_monitor_version (str) – Unique string identifier for a model monitor version created under the project.

Returns:

Object describing training and prediction output label and prediction distributions.

Return type:

DriftDistributions

refresh()

Calls describe and refreshes the current object’s fields

Returns:

The current object

Return type:

ModelMonitorVersion

describe()

Retrieves a full description of the specified model monitor version.

Parameters:

model_monitor_version (str) – The unique version ID of the model monitor version.

Returns:

A model monitor version.

Return type:

ModelMonitorVersion

delete()

Deletes the specified model monitor version.

Parameters:

model_monitor_version (str) – Unique identifier of the model monitor version to delete.

metric_data(metric_type, actual_values_to_detail=None)

Provides the data needed for decile metrics associated with the model monitor.

Parameters:
  • metric_type (str) – The type of metric to get data for.

  • actual_values_to_detail (list) – The actual values to detail.

Returns:

Data associated with the metric.

Return type:

ModelMonitorVersionMetricData
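
A usage sketch (version is assumed to be a ModelMonitorVersion, e.g. from monitor.list_versions(); the metric type string is illustrative, not a verified value):

    data = version.metric_data(metric_type='decile_metrics')  # illustrative metric type
    print(data.name)
    print(list((data.metrics or {}).keys()))  # metric name -> metric data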

list_monitor_alert_versions_for_monitor_version()

Retrieves the list of monitor alert versions for a specified monitor instance.

Parameters:

model_monitor_version (str) – The unique ID associated with the model monitor.

Returns:

A list of monitor alert versions.

Return type:

list[MonitorAlertVersion]

get_drift_for_feature(feature_name, nested_feature_name=None)

Gets the feature drift associated with a single feature in an output feature group from a prediction.

Parameters:
  • feature_name (str) – Name of the feature to view the distribution of.

  • nested_feature_name (str) – Optionally, the name of the nested feature that the feature is in.

Returns:

An object describing the training and prediction output feature distributions.

Return type:

FeatureDistribution
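
For example (feature names are placeholders):

    dist = version.get_drift_for_feature('purchase_amount')  # placeholder feature name
    nested = version.get_drift_for_feature(
        'price', nested_feature_name='items')                # placeholder nested feature
    # Each FeatureDistribution compares the training distribution with the
    # prediction output distribution for that feature.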

get_outliers_for_feature(feature_name=None, nested_feature_name=None)

Gets a list of outliers measured by a single feature (or overall) in an output feature group from a prediction.

Parameters:
  • feature_name (str) – Name of the feature to view the distribution of.

  • nested_feature_name (str) – Optionally, the name of the nested feature that the feature is in.

wait_for_monitor(timeout=1200)

A waiting call until the model monitor version is ready.

Parameters:

timeout (int) – The waiting time given to the call to finish; if it doesn’t finish within the allocated time, the call is considered timed out.

get_status()

Gets the status of the model monitor version.

Returns:

A string describing the status of the model monitor version, e.g., pending, complete, etc.

Return type:

str
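
A wait-then-branch sketch (the exact status string returned is an assumption; the documentation above only lists values like pending and complete):

    version.wait_for_monitor(timeout=1200)
    if version.get_status() == 'COMPLETE':    # exact casing/value is an assumption
        drift = version.get_prediction_drift()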

class abacusai.ModelMonitorVersionMetricData(client, name=None, algoName=None, featureGroupVersion=None, modelMonitor=None, modelMonitorVersion=None, metricInfos=None, metricNames=None, metrics=None, metricCharts=None, otherMetrics=None, actualValuesSupportedForDrilldown=None)

Bases: abacusai.return_class.AbstractApiClass

Data for displaying model monitor version metric data

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • name (str) – The name of the metric type

  • algoName (str) – The name of the algo used for the prediction metric

  • featureGroupVersion (str) – The prediction feature group used for analysis

  • modelMonitor (str) – The id of the model monitor

  • modelMonitorVersion (str) – The id of the model monitor version

  • metricInfos (dict) – Name and description for metrics

  • metricNames (dict) – Internal name to external name mapping

  • metrics (dict) – Metric name to metric data

  • metricCharts (list) – List of different metric charts

  • otherMetrics (list) – List of other metrics to optionally plot

  • actualValuesSupportedForDrilldown (list) – List of values supported for drilldown

name
algo_name
feature_group_version
model_monitor
model_monitor_version
metric_infos
metric_names
metrics
metric_charts
other_metrics
actual_values_supported_for_drilldown
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.ModelTrainingTypeForDeployment(client, label=None, value=None)

Bases: abacusai.return_class.AbstractApiClass

Model training types for deployment.

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • label (str) – Labels to show to users in deployment UI

  • value (str) – Value to use on backend for deployment API call

label
value
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.ModelUpload(client, modelId=None, modelVersion=None, status=None, createdAt=None, modelUploadId=None, embeddingsUploadId=None, artifactsUploadId=None, verificationsUploadId=None, defaultItemsUploadId=None, modelFileUploadId=None, modelStateUploadId=None, inputPreprocessorUploadId=None, requirementsUploadId=None, resourcesUploadId=None, multiCatalogEmbeddingsUploadId=None)

Bases: abacusai.return_class.AbstractApiClass

A model version that includes the upload identifiers for the various required files.

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • modelId (str) – A reference to the model this version belongs to.

  • modelVersion (str) – A unique identifier for the model version.

  • status (str) – The current status of the model.

  • createdAt (str) – The timestamp at which the model version was created, in ISO-8601 format.

  • modelUploadId (str) – An upload identifier to be used when uploading the TensorFlow Saved Model.

  • embeddingsUploadId (str) – An upload identifier to be used when uploading the embeddings CSV.

  • artifactsUploadId (str) – An upload identifier to be used when uploading the artifacts JSON file.

  • verificationsUploadId (str) – An upload identifier to be used when uploading the verifications JSON file.

  • defaultItemsUploadId (str) – An upload identifier to be used when uploading the default items JSON file.

  • modelFileUploadId (str) – An upload identifier to be used when uploading the model JSON file.

  • modelStateUploadId (str) – An upload identifier to be used when uploading the model state JSON file.

  • inputPreprocessorUploadId (str) – An upload identifier to be used when uploading the input preprocessor JSON file.

  • requirementsUploadId (str) – An upload identifier to be used when uploading the requirements JSON file.

  • resourcesUploadId (str) – An upload identifier to be used when uploading the resources JSON file.

  • multiCatalogEmbeddingsUploadId (str) – An upload identifier to be used when uploading the multi-catalog embeddings CSV file.

model_id
model_version
status
created_at
model_upload_id
embeddings_upload_id
artifacts_upload_id
verifications_upload_id
default_items_upload_id
model_file_upload_id
model_state_upload_id
input_preprocessor_upload_id
requirements_upload_id
resources_upload_id
multi_catalog_embeddings_upload_id
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.ModelVersion(client, modelVersion=None, modelConfigType=None, status=None, modelId=None, modelPredictionConfig=None, trainingStartedAt=None, trainingCompletedAt=None, featureGroupVersions=None, customAlgorithms=None, builtinAlgorithms=None, error=None, pendingDeploymentIds=None, failedDeploymentIds=None, cpuSize=None, memory=None, automlComplete=None, trainingFeatureGroupIds=None, trainingDocumentRetrieverVersions=None, documentRetrieverMappings=None, bestAlgorithm=None, defaultAlgorithm=None, featureAnalysisStatus=None, dataClusterInfo=None, customAlgorithmConfigs=None, trainedModelTypes=None, useGpu=None, partialComplete=None, modelFeatureGroupSchemaMappings=None, trainingConfigUpdated=None, codeSource={}, modelConfig={}, deployableAlgorithms={})

Bases: abacusai.return_class.AbstractApiClass

A version of a model

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • modelVersion (str) – The unique identifier of a model version.

  • modelConfigType (str) – Name of the TrainingConfig class of the model_config.

  • status (str) – The current status of the model.

  • modelId (str) – A reference to the model this version belongs to.

  • modelPredictionConfig (dict) – The prediction config options for the model.

  • trainingStartedAt (str) – The start time and date of the training process in ISO-8601 format.

  • trainingCompletedAt (str) – The end time and date of the training process in ISO-8601 format.

  • featureGroupVersions (list) – A list of Feature Group version IDs used for model training.

  • customAlgorithms (list) – List of user-defined algorithms used for model training.

  • builtinAlgorithms (list) – List of names of builtin algorithms provided by Abacus.AI used for model training.

  • error (str) – Relevant error if the status is FAILED.

  • pendingDeploymentIds (list) – List of deployment IDs where deployment is pending.

  • failedDeploymentIds (list) – List of failed deployment IDs.

  • cpuSize (str) – CPU size specified for the python model training.

  • memory (int) – Memory in GB specified for the python model training.

  • automlComplete (bool) – If true, all algorithms have completed training.

  • trainingFeatureGroupIds (list) – The unique identifiers of the feature groups used as inputs during training to create this ModelVersion.

  • trainingDocumentRetrieverVersions (list) – The document retriever version IDs used as inputs during training to create this ModelVersion.

  • documentRetrieverMappings (dict) – Mapping of document retriever versions to their respective information.

  • bestAlgorithm (dict) – Best performing algorithm.

  • defaultAlgorithm (dict) – Default algorithm that the user has selected.

  • featureAnalysisStatus (str) – Lifecycle of the feature analysis stage.

  • dataClusterInfo (dict) – Information about the models for different data clusters.

  • customAlgorithmConfigs (dict) – User-defined configs for each of the user-defined custom algorithms.

  • trainedModelTypes (list) – List of trained model types.

  • useGpu (bool) – Whether this model version is using gpu

  • partialComplete (bool) – If true, all required algorithms have completed training.

  • modelFeatureGroupSchemaMappings (dict) – Mapping of feature group to schema version.

  • trainingConfigUpdated (bool) – If the training config has been updated since the instance was created.

  • codeSource (CodeSource) – If a python model, information on where the source code is located.

  • modelConfig (TrainingConfig) – The training config options used to train this model.

  • deployableAlgorithms (DeployableAlgorithm) – List of deployable algorithms.

model_version
model_config_type
status
model_id
model_prediction_config
training_started_at
training_completed_at
feature_group_versions
custom_algorithms
builtin_algorithms
error
pending_deployment_ids
failed_deployment_ids
cpu_size
memory
automl_complete
training_feature_group_ids
training_document_retriever_versions
document_retriever_mappings
best_algorithm
default_algorithm
feature_analysis_status
data_cluster_info
custom_algorithm_configs
trained_model_types
use_gpu
partial_complete
model_feature_group_schema_mappings
training_config_updated
code_source
model_config
deployable_algorithms
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

describe_train_test_data_split_feature_group_version()

Get the train and test data split for a trained model by model version. This is only supported for models with custom algorithms.

Parameters:

model_version (str) – The unique version ID of the model version.

Returns:

The feature group version containing the training data and folds information.

Return type:

FeatureGroupVersion

set_model_objective(metric=None)

Sets the best model for all model instances of the model based on the specified metric, and updates the training configuration to use the specified metric for any future model versions.

If metric is set to None, the default selection is used.

Parameters:

metric (str) – The metric to use to determine the best model.
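
A short sketch (model_version is assumed to be a ModelVersion; the metric name 'accuracy' is illustrative and depends on the problem type):

    model_version.set_model_objective(metric='accuracy')  # illustrative metric name
    model_version.set_model_objective(metric=None)        # fall back to the default selection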

get_feature_group_schemas_for()

Gets the schema (including feature mappings) for all feature groups used in the model version.

Parameters:

model_version (str) – Unique string identifier for the version of the model.

Returns:

List of schema for all feature groups used in the model version.

Return type:

list[ModelVersionFeatureGroupSchema]

delete()

Deletes the specified model version. Model versions which are currently used in deployments cannot be deleted.

Parameters:

model_version (str) – The unique identifier of the model version to delete.

export_model_artifact_as_feature_group(table_name, artifact_type=None)

Exports metric artifact data for a model as a feature group.

Parameters:
  • table_name (str) – Name of the feature group table to create.

  • artifact_type (EvalArtifactType) – eval artifact type to export.

Returns:

The created feature group.

Return type:

FeatureGroup

refresh()

Calls describe and refreshes the current object’s fields

Returns:

The current object

Return type:

ModelVersion

describe()

Retrieves a full description of the specified model version.

Parameters:

model_version (str) – Unique string identifier of the model version.

Returns:

A model version.

Return type:

ModelVersion

get_feature_importance_by()

Gets the feature importance calculated by various methods for the model.

Parameters:

model_version (str) – Unique string identifier for the model version.

Returns:

Feature importances for the model.

Return type:

FeatureImportance

get_training_data_logs()

Retrieves the data preparation logs during model training.

Parameters:

model_version (str) – The unique version ID of the model version.

Returns:

A list of logs.

Return type:

list[DataPrepLogs]

get_training_logs(stdout=False, stderr=False)

Returns training logs for the model.

Parameters:
  • stdout (bool) – Set True to get info logs.

  • stderr (bool) – Set True to get error logs.

Returns:

A function logs object.

Return type:

FunctionLogs

export_custom(output_location, algorithm=None)

Bundles custom model artifacts into a zip file and exports it to the specified location.

Parameters:
  • output_location (str) – Location to export the model artifacts results. For example, s3://a-bucket/

  • algorithm (str) – The algorithm to be exported. Optional if there’s only one custom algorithm in the model version.

Returns:

Object describing the export and its status.

Return type:

ModelArtifactsExport
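
For example (the output location follows the documented s3://a-bucket/ pattern; algorithm is omitted since the model version is assumed to have a single custom algorithm):

    export = model_version.export_custom('s3://a-bucket/exports/')
    print(export)  # ModelArtifactsExport describing the export and its status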

wait_for_training(timeout=None)

A waiting call until the model is trained.

Parameters:

timeout (int) – The waiting time given to the call to finish; if it doesn’t finish within the allocated time, the call is considered timed out.

wait_for_full_automl(timeout=None)

A waiting call until the full AutoML cycle is completed.

Parameters:

timeout (int) – The waiting time given to the call to finish; if it doesn’t finish within the allocated time, the call is considered timed out.

get_status()

Gets the status of the model version under training.

Returns:

A string describing the status of a model training (pending, complete, etc.).

Return type:

str
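
Putting the waiting calls together (a sketch; the timeout value and status string are assumptions):

    model_version.wait_for_training(timeout=3600)
    if model_version.get_status() == 'COMPLETE':  # exact status string is an assumption
        model_version.wait_for_full_automl()      # optionally wait out the full AutoML cycle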

get_train_test_feature_group_as_pandas()

Gets the train/test data split feature group of the model version as a pandas DataFrame.

Returns:

A pandas DataFrame of the training data, including a fold column.

Return type:

pandas.DataFrame
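
For example (only supported for models with custom algorithms; the exact name of the fold column is not specified above, so inspect the columns first):

    df = model_version.get_train_test_feature_group_as_pandas()
    print(df.columns)  # includes a column marking the train/test fold
    print(df.head())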

class abacusai.ModelVersionFeatureGroupSchema(client, featureGroupId=None, featureGroupName=None, schema={})

Bases: abacusai.return_class.AbstractApiClass

Schema for a feature group used in model version

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • featureGroupId (str) – The ID of the feature group.

  • featureGroupName (str) – The name of the feature group.

  • schema (Schema) – List of feature schemas of a feature group.

feature_group_id
feature_group_name
schema
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.ModificationLockInfo(client, modificationLock=None, userEmails=None, organizationGroups=None)

Bases: abacusai.return_class.AbstractApiClass

Information about a modification lock for a certain object

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • modificationLock (bool) – Whether or not the object has its modification lock activated.

  • userEmails (list of strings) – The list of user emails allowed to modify the object if the object’s modification lock is activated.

  • organizationGroups (list of unique string identifiers) – The list of organization groups allowed to modify the object if the object’s modification lock is activated.

modification_lock
user_emails
organization_groups
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.Module(client, name=None, createdAt=None, notebookId=None, hideModuleCode=None, codeSource={})

Bases: abacusai.return_class.AbstractApiClass

Customer created python module

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • name (str) – The name to identify the module. Only uppercase letters, numbers, and underscores are allowed.

  • createdAt (str) – The date and time when the Python function was created, in ISO-8601 format.

  • notebookId (str) – The unique string identifier of the notebook used to create or edit the module.

  • hideModuleCode (bool) – Whether the module code is hidden from external users

  • codeSource (CodeSource) – Information about the source code of the Python function.

name
created_at
notebook_id
hide_module_code
code_source
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.MonitorAlert(client, name=None, monitorAlertId=None, createdAt=None, projectId=None, modelMonitorId=None, realtimeMonitorId=None, conditionConfig=None, actionConfig=None, conditionDescription=None, actionDescription=None, alertType=None, deploymentId=None, latestMonitorAlertVersion={})

Bases: abacusai.return_class.AbstractApiClass

A Monitor Alert

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • name (str) – The user-friendly name for the alert.

  • monitorAlertId (str) – The unique identifier of the monitor alert.

  • createdAt (str) – Date and time at which the monitor alert was created.

  • projectId (str) – The project this alert belongs to.

  • modelMonitorId (str) – The monitor id that this alert is associated with

  • realtimeMonitorId (str) – The realtime monitor id that this alert is associated with

  • conditionConfig (dict) – The condition configuration for this alert.

  • actionConfig (dict) – The action configuration for this alert.

  • conditionDescription (str) – User friendly description of the condition

  • actionDescription (str) – User friendly description of the action

  • alertType (str) – The type of the alert

  • deploymentId (str) – The deployment ID this alert is associated with

  • latestMonitorAlertVersion (MonitorAlertVersion) – The latest monitor alert version.

name
monitor_alert_id
created_at
project_id
model_monitor_id
realtime_monitor_id
condition_config
action_config
condition_description
action_description
alert_type
deployment_id
latest_monitor_alert_version
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

update(alert_name=None, condition_config=None, action_config=None)

Updates a monitor alert.

Parameters:
  • alert_name (str) – Name of the alert.

  • condition_config (AlertConditionConfig) – Condition to run the actions for the alert.

  • action_config (AlertActionConfig) – Configuration for the action of the alert.

Returns:

Object describing the monitor alert.

Return type:

MonitorAlert
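
A minimal sketch that renames and re-runs an alert (alert is assumed to be a MonitorAlert; constructing AlertConditionConfig / AlertActionConfig instances is out of scope here):

    alert = alert.update(alert_name='Drift alert (renamed)')
    alert = alert.run()  # re-evaluates the alert against the latest monitor instance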

refresh()

Calls describe and refreshes the current object’s fields

Returns:

The current object

Return type:

MonitorAlert

describe()

Describes a given monitor alert id

Parameters:

monitor_alert_id (str) – Unique identifier of the monitor alert.

Returns:

Object containing information about the monitor alert.

Return type:

MonitorAlert

run()

Reruns a given monitor alert from the latest monitor instance.

Parameters:

monitor_alert_id (str) – Unique identifier of a monitor alert.

Returns:

Object describing the monitor alert.

Return type:

MonitorAlert

delete()

Deletes a monitor alert.

Parameters:

monitor_alert_id (str) – The unique string identifier of the alert to delete.

class abacusai.MonitorAlertVersion(client, name=None, monitorAlertVersion=None, monitorAlertId=None, status=None, createdAt=None, alertingStartedAt=None, alertingCompletedAt=None, error=None, modelMonitorVersion=None, conditionConfig=None, actionConfig=None, alertResult=None, actionStatus=None, actionError=None, actionStartedAt=None, actionCompletedAt=None, conditionDescription=None, actionDescription=None, alertType=None)

Bases: abacusai.return_class.AbstractApiClass

A monitor alert version

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • name (str) – The user-friendly name for the monitor alert.

  • monitorAlertVersion (str) – The identifier for the alert version.

  • monitorAlertId (str) – The identifier for the alert.

  • status (str) – The current status of the monitor alert.

  • createdAt (str) – Date and time at which the monitor alert was created.

  • alertingStartedAt (str) – The start time and date of the monitor alerting process.

  • alertingCompletedAt (str) – The end time and date of the monitor alerting process.

  • error (str) – Relevant error if the status is FAILED.

  • modelMonitorVersion (str) – The model monitor version associated with the monitor alert version.

  • conditionConfig (dict) – The condition configuration for this alert.

  • actionConfig (dict) – The action configuration for this alert.

  • alertResult (str) – The current result of the alert

  • actionStatus (str) – The current status of the action as a result of the monitor alert.

  • actionError (str) – Relevant error if the action status is FAILED.

  • actionStartedAt (str) – The start time and date of the action for the alerting process.

  • actionCompletedAt (str) – The end time and date of the action for the alerting process.

  • conditionDescription (str) – User friendly description of the condition

  • actionDescription (str) – User friendly description of the action

  • alertType (str) – The type of the alert

name
monitor_alert_version
monitor_alert_id
status
created_at
alerting_started_at
alerting_completed_at
error
model_monitor_version
condition_config
action_config
alert_result
action_status
action_error
action_started_at
action_completed_at
condition_description
action_description
alert_type
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

refresh()

Calls describe and refreshes the current object’s fields

Returns:

The current object

Return type:

MonitorAlertVersion

describe()

Describes a given monitor alert version id

Parameters:

monitor_alert_version (str) – Unique string identifier for the monitor alert.

Returns:

An object describing the monitor alert version.

Return type:

MonitorAlertVersion

wait_for_monitor_alert(timeout=1200)

A waiting call until the monitor alert version is ready.

Parameters:

timeout (int) – The waiting time given to the call to finish; if it doesn’t finish within the allocated time, the call is considered timed out.

get_status()

Gets the status of the monitor alert version.

Returns:

A string describing the status of a monitor alert version (pending, running, complete, etc.).

Return type:

str

class abacusai.MonitorDriftAndDistributions(client, featureDrifts=None, featureDistributions=None, nestedDrifts=None, forecastingMonitorSummary={}, embeddingsDistribution={})

Bases: abacusai.return_class.AbstractApiClass

Summary of important model monitoring statistics for features available in a model monitoring instance

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • featureDrifts (list[dict]) – A list of dicts of eligible feature names and corresponding overall feature drift measures.

  • featureDistributions (list[dict]) – A list of dicts of feature names and corresponding feature distributions.

  • nestedDrifts (list[dict]) – A list of dicts of nested feature names and corresponding overall feature drift measures.

  • forecastingMonitorSummary (ForecastingMonitorSummary) – Summary of important model monitoring statistics for features available in a model monitoring instance

  • embeddingsDistribution (EmbeddingFeatureDriftDistribution) – Summary of important model monitoring statistics for features available in a model monitoring instance

feature_drifts
feature_distributions
nested_drifts
forecasting_monitor_summary
embeddings_distribution
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.NaturalLanguageExplanation(client, shortExplanation=None, longExplanation=None, isOutdated=None, htmlExplanation=None)

Bases: abacusai.return_class.AbstractApiClass

Natural language explanation of an artifact/object

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • shortExplanation (str) – succinct explanation of the artifact

  • longExplanation (str) – Longer and verbose explanation of the artifact

  • isOutdated (bool) – Flag indicating whether the explanation is outdated due to a change in the underlying artifact

  • htmlExplanation (str) – HTML formatted explanation of the artifact

short_explanation
long_explanation
is_outdated
html_explanation
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.NestedFeature(client, name=None, selectClause=None, featureType=None, featureMapping=None, dataType=None, sourceTable=None, originalName=None)

Bases: abacusai.return_class.AbstractApiClass

A nested feature in a feature group

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • name (str) – The unique name of the column

  • selectClause (str) – The sql logic for creating this feature’s data

  • featureType (str) – Feature Type of the Feature

  • featureMapping (str) – The Feature Mapping of the feature

  • dataType (str) – Data Type of the Feature

  • sourceTable (str) – The source table of the column

  • originalName (str) – The original name of the column

name
select_clause
feature_type
feature_mapping
data_type
source_table
original_name
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.NestedFeatureSchema(client, name=None, featureType=None, featureMapping=None, dataType=None, detectedFeatureType=None, sourceTable=None, pointInTimeInfo={})

Bases: abacusai.return_class.AbstractApiClass

A schema description for a nested feature

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • name (str) – The unique name of the column

  • featureType (str) – Feature Type of the Feature

  • featureMapping (str) – The Feature Mapping of the feature

  • dataType (str) – Data Type of the Feature

  • detectedFeatureType (str) – The detected feature type for this feature

  • sourceTable (str) – The source table of the column

  • pointInTimeInfo (PointInTimeFeatureInfo) – Point in time information for this feature

name
feature_type
feature_mapping
data_type
detected_feature_type
source_table
point_in_time_info
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.NewsSearchResult(client, title=None, url=None, description=None, thumbnailUrl=None, thumbnailWidth=None, thumbnailHeight=None, faviconUrl=None, datePublished=None)

Bases: abacusai.return_class.AbstractApiClass

A single news search result.

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • title (str) – The title of the news.

  • url (str) – The URL of the news.

  • description (str) – The description of the news.

  • thumbnailUrl (str) – The URL of the image of the news.

  • thumbnailWidth (int) – The width of the image of the news.

  • thumbnailHeight (int) – The height of the image of the news.

  • faviconUrl (str) – The URL of the favicon of the news.

  • datePublished (str) – The date the news was published.

title
url
description
thumbnail_url
thumbnail_width
thumbnail_height
favicon_url
date_published
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.NlpChatResponse(client, deploymentConversationId=None, messages=None)

Bases: abacusai.return_class.AbstractApiClass

A chat response from an LLM

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • deploymentConversationId (str) – The unique identifier of the deployment conversation.

  • messages (list) – The conversation messages in the chat.

deployment_conversation_id
messages
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.NullViolation(client, name=None, violation=None, trainingNullFreq=None, predictionNullFreq=None)

Bases: abacusai.return_class.AbstractApiClass

Summary of anomalous null frequencies for a feature discovered by a model monitoring instance

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • name (str) – Name of feature.

  • violation (str) – Description of null violation for a prediction feature.

  • trainingNullFreq (float) – Proportion of null entries in training feature.

  • predictionNullFreq (float) – Proportion of null entries in prediction feature.

name
violation
training_null_freq
prediction_null_freq
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.OrganizationExternalApplicationSettings(client, logo=None, theme=None, managedUserService=None, passwordsDisabled=None)

Bases: abacusai.return_class.AbstractApiClass

The External Application Settings for an Organization.

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • logo (str) – The logo.

  • theme (dict) – The theme used for External Applications in this org.

  • managedUserService (str) – The external service that is managing the user accounts.

  • passwordsDisabled (bool) – Whether or not passwords are disabled for this organization’s domain.

logo
theme
managed_user_service
passwords_disabled
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.OrganizationGroup(client, organizationGroupId=None, permissions=None, groupName=None, defaultGroup=None, admin=None, createdAt=None)

Bases: abacusai.return_class.AbstractApiClass

An Organization Group. Defines the permissions available to the users who are members of the group.

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • organizationGroupId (str) – The unique identifier of the Organization Group.

  • permissions (list of enum string) – The list of permissions (VIEW, MODIFY, ADMIN, BILLING, API_KEY, INVITE_USER) the group has.

  • groupName (str) – The name of the Organization Group.

  • defaultGroup (bool) – If true, all new users will be added to this group automatically.

  • admin (bool) – If true, this group contains all permissions available to the organization and cannot be modified or deleted.

  • createdAt (str) – When the Organization Group was created.

organization_group_id
permissions
group_name
default_group
admin
created_at
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

refresh()

Calls describe and refreshes the current object’s fields

Returns:

The current object

Return type:

OrganizationGroup

describe()

Returns the specific organization group passed in by the user.

Parameters:

organization_group_id (str) – The unique identifier of the organization group to be described.

Returns:

Information about a specific organization group.

Return type:

OrganizationGroup

add_permission(permission)

Adds a permission to the specified Organization Group.

Parameters:

permission (str) – Permission to add to the Organization Group.

remove_permission(permission)

Removes a permission from the specified Organization Group.

Parameters:

permission (str) – The permission to remove from the Organization Group.

delete()

Deletes the specified Organization Group

Parameters:

organization_group_id (str) – Unique string identifier of the organization group.

add_user_to(email)

Adds a user to the specified Organization Group.

Parameters:

email (str) – Email of the user to be added to the group.

remove_user_from(email)

Removes a user from an Organization Group.

Parameters:

email (str) – Email of the user to remove.

set_default()

Sets the default Organization Group to which all new users joining an organization are automatically added.

Parameters:

organization_group_id (str) – Unique string identifier of the Organization Group.
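
A group-management sketch (assumes an authenticated client and the client.describe_organization_group helper; the group ID, permission, and email are placeholders):

    group = client.describe_organization_group('your_org_group_id')  # placeholder ID
    group.add_permission('VIEW')            # one of the documented permission enums
    group.add_user_to('user@example.com')   # placeholder email
    group.set_default()                     # new users now join this group automatically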

class abacusai.OrganizationSearchResult(client, score=None, featureGroupContext=None, featureGroup={}, featureGroupVersion={})

Bases: abacusai.return_class.AbstractApiClass

A search result object which contains the retrieved artifact and its relevance score

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • score (float) – The relevance score of the search result.

  • featureGroupContext (str) – The rendered context for the feature group that can be used in prompts

  • featureGroup (FeatureGroup) – The feature group object retrieved through search.

  • featureGroupVersion (FeatureGroupVersion) – The feature group version object retrieved through search.

score
feature_group_context
feature_group
feature_group_version
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.OrganizationSecret(client, secretKey=None, value=None, createdAt=None)

Bases: abacusai.return_class.AbstractApiClass

Organization secret

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • secretKey (str) – The key of the secret

  • value (str) – The value of the secret

  • createdAt (str) – The date and time when the secret was created, in ISO-8601 format.

secret_key
value
created_at
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.PageData(client, docId=None, page=None, height=None, width=None, pageCount=None, pageText=None, pageTokenStartOffset=None, tokenCount=None, tokens=None, extractedText=None, rotationAngle=None, pageMarkdown=None, embeddedText=None)

Bases: abacusai.return_class.AbstractApiClass

Data extracted from a docstore page.

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • docId (str) – Unique Docstore string identifier for the document.

  • page (int) – The page number. Starts from 0.

  • height (int) – The height of the page in pixels.

  • width (int) – The width of the page in pixels.

  • pageCount (int) – The total number of pages in the document.

  • pageText (str) – The text extracted from the page.

  • pageTokenStartOffset (int) – The offset of the first token in the page.

  • tokenCount (int) – The number of tokens in the page.

  • tokens (list) – The tokens in the page.

  • extractedText (str) – The extracted text in the page obtained from OCR.

  • rotationAngle (float) – The detected rotation angle of the page in degrees. Positive values indicate clockwise and negative values indicate anti-clockwise rotation from the original orientation.

  • pageMarkdown (str) – The markdown text for the page.

  • embeddedText (str) – The embedded text in the page. Only available for digital documents.

doc_id
page
height
width
page_count
page_text
page_token_start_offset
token_count
tokens
extracted_text
rotation_angle
page_markdown
embedded_text
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.Pipeline(client, pipelineName=None, pipelineId=None, createdAt=None, notebookId=None, cron=None, nextRunTime=None, isProd=None, warning=None, createdBy=None, steps={}, pipelineReferences={}, latestPipelineVersion={}, codeSource={}, pipelineVariableMappings={})

Bases: abacusai.return_class.AbstractApiClass

A Pipeline For Steps.

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • pipelineName (str) – The name of the pipeline this step is a part of.

  • pipelineId (str) – The reference to the pipeline this step belongs to.

  • createdAt (str) – The date and time at which the pipeline was created.

  • notebookId (str) – The reference to the notebook this pipeline belongs to.

  • cron (str) – A cron-style string that describes when this refresh policy is to be executed in UTC

  • nextRunTime (str) – The next time this pipeline will be run.

  • isProd (bool) – Whether this pipeline is a production pipeline.

  • warning (str) – Warning message for possible errors that might occur if the pipeline is run.

  • createdBy (str) – The email of the user who created the pipeline

  • steps (PipelineStep) – A list of the pipeline steps attached to the pipeline.

  • pipelineReferences (PipelineReference) – A list of references from the pipeline to other objects.

  • latestPipelineVersion (PipelineVersion) – The latest version of the pipeline.

  • codeSource (CodeSource) – Information on the source code.

  • pipelineVariableMappings (PythonFunctionArgument) – A description of the function variables passed into the pipeline.

pipeline_name
pipeline_id
created_at
notebook_id
cron
next_run_time
is_prod
warning
created_by
steps
pipeline_references
latest_pipeline_version
code_source
pipeline_variable_mappings
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

refresh()

Calls describe and refreshes the current object’s fields

Returns:

The current object

Return type:

Pipeline

describe()

Describes a given pipeline.

Parameters:

pipeline_id (str) – The ID of the pipeline to describe.

Returns:

An object describing a Pipeline

Return type:

Pipeline

update(project_id=None, pipeline_variable_mappings=None, cron=None, is_prod=None)

Updates a pipeline for executing multiple steps.

Parameters:
  • project_id (str) – A unique string identifier for the project.

  • pipeline_variable_mappings (List) – List of Python function arguments for the pipeline.

  • cron (str) – A cron-like string specifying the frequency of the scheduled pipeline runs.

  • is_prod (bool) – Whether the pipeline is a production pipeline or not.

Returns:

An object that describes a Pipeline.

Return type:

Pipeline

rename(pipeline_name)

Renames a pipeline.

Parameters:

pipeline_name (str) – The new name of the pipeline.

Returns:

An object that describes a Pipeline.

Return type:

Pipeline

delete()

Deletes a pipeline.

Parameters:

pipeline_id (str) – The ID of the pipeline to delete.

list_versions(limit=200)

Lists the pipeline versions for a specified pipeline.

Parameters:

limit (int) – The maximum number of pipeline versions to return.

Returns:

A list of pipeline versions.

Return type:

list[PipelineVersion]

run(pipeline_variable_mappings=None)

Runs a specified pipeline with the arguments provided.

Parameters:

pipeline_variable_mappings (List) – List of Python function arguments for the pipeline.

Returns:

The object describing the pipeline

Return type:

PipelineVersion

create_step(step_name, function_name=None, source_code=None, step_input_mappings=None, output_variable_mappings=None, step_dependencies=None, package_requirements=None, cpu_size=None, memory=None, timeout=None)

Creates a step in a given pipeline.

Parameters:
  • step_name (str) – The name of the step.

  • function_name (str) – The name of the Python function.

  • source_code (str) – Contents of a valid Python source code file. The source code should contain the transform feature group functions. A list of allowed imports and system libraries for each language is specified in the user functions documentation section.

  • step_input_mappings (List) – List of Python function arguments.

  • output_variable_mappings (List) – List of Python function outputs.

  • step_dependencies (list) – List of step names this step depends on.

  • package_requirements (list) – List of package requirement strings. For example: [‘numpy==1.2.3’, ‘pandas>=1.4.0’].

  • cpu_size (str) – Size of the CPU for the step function.

  • memory (int) – Memory (in GB) for the step function.

  • timeout (int) – Timeout for the step in minutes, default is 300 minutes.

Returns:

Object describing the pipeline.

Return type:

Pipeline
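
A sketch of adding a step from a source string (pipeline is assumed to be a Pipeline instance; the function body and requirement pin are illustrative):

    source = '''
    def transform_data():
        # illustrative step body
        return 42
    '''

    pipeline = pipeline.create_step(
        step_name='transform_data',
        function_name='transform_data',
        source_code=source,
        package_requirements=['pandas>=1.4.0'],
        memory=16,
        timeout=60,
    )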

describe_step_by_name(step_name)

Describes a pipeline step by the step name.

Parameters:

step_name (str) – The name of the step.

Returns:

An object describing the pipeline step.

Return type:

PipelineStep

unset_refresh_schedule()

Deletes the refresh schedule for a given pipeline.

Parameters:

pipeline_id (str) – The id of the pipeline.

Returns:

Object describing the pipeline.

Return type:

Pipeline

pause_refresh_schedule()

Pauses the refresh schedule for a given pipeline.

Parameters:

pipeline_id (str) – The id of the pipeline.

Returns:

Object describing the pipeline.

Return type:

Pipeline

resume_refresh_schedule()

Resumes the refresh schedule for a given pipeline.

Parameters:

pipeline_id (str) – The id of the pipeline.

Returns:

Object describing the pipeline.

Return type:

Pipeline

create_step_from_function(step_name, function, step_input_mappings=None, output_variable_mappings=None, step_dependencies=None, package_requirements=None, cpu_size=None, memory=None)

Creates a step in the pipeline from a python function.

Parameters:
  • step_name (str) – The name of the step.

  • function (callable) – The python function.

  • step_input_mappings (List[PythonFunctionArguments]) – List of Python function arguments.

  • output_variable_mappings (List[OutputVariableMapping]) – List of Python function outputs.

  • step_dependencies (List[str]) – List of step names this step depends on.

  • package_requirements (list) – List of package requirement strings. For example: [‘numpy==1.2.3’, ‘pandas>=1.4.0’].

  • cpu_size (str) – Size of the CPU for the step function.

  • memory (int) – Memory (in GB) for the step function.
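
The same step can instead be registered directly from a callable (a sketch; the function body is illustrative):

    def score_rows():
        # illustrative step body
        return 'done'

    pipeline.create_step_from_function(
        'score_rows',
        score_rows,
        step_dependencies=['transform_data'],
    )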

wait_for_pipeline(timeout=1200)

A waiting call until all the stages of the latest pipeline version are completed.

Parameters:

timeout (int) – The waiting time given to the call to finish; if it doesn’t finish within the allocated time, the call is considered timed out.

get_status()

Gets the status of the pipeline version.

Returns:

A string describing the status of a pipeline version (pending, running, complete, etc.).

Return type:

str
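
Running a pipeline end to end (a sketch; the status string is an assumption):

    pipeline_version = pipeline.run()
    pipeline.wait_for_pipeline(timeout=1200)
    if pipeline.get_status() == 'COMPLETE':  # exact status string is an assumption
        print(pipeline_version)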

class abacusai.PipelineReference(client, pipelineReferenceId=None, pipelineId=None, objectType=None, datasetId=None, modelId=None, deploymentId=None, batchPredictionDescriptionId=None, modelMonitorId=None, notebookId=None, featureGroupId=None)

Bases: abacusai.return_class.AbstractApiClass

A reference from a pipeline to the objects it is run on.

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • pipelineReferenceId (str) – The id of the reference.

  • pipelineId (str) – The id of the pipeline for the reference.

  • objectType (str) – The object type of the reference.

  • datasetId (str) – The dataset id of the reference.

  • modelId (str) – The model id of the reference.

  • deploymentId (str) – The deployment id of the reference.

  • batchPredictionDescriptionId (str) – The batch prediction description id of the reference.

  • modelMonitorId (str) – The model monitor id of the reference.

  • notebookId (str) – The notebook id of the reference.

  • featureGroupId (str) – The feature group id of the reference.

pipeline_reference_id
pipeline_id
object_type
dataset_id
model_id
deployment_id
batch_prediction_description_id
model_monitor_id
notebook_id
feature_group_id
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.PipelineStep(client, pipelineStepId=None, pipelineId=None, stepName=None, pipelineName=None, createdAt=None, updatedAt=None, pythonFunctionId=None, stepDependencies=None, cpuSize=None, memory=None, timeout=None, pythonFunction={}, codeSource={})

Bases: abacusai.return_class.AbstractApiClass

A step in a pipeline.

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • pipelineStepId (str) – The reference to this step.

  • pipelineId (str) – The reference to the pipeline this step belongs to.

  • stepName (str) – The name of the step.

  • pipelineName (str) – The name of the pipeline this step is a part of.

  • createdAt (str) – The date and time at which this step was created.

  • updatedAt (str) – The date and time when this step was last updated.

  • pythonFunctionId (str) – The python function_id.

  • stepDependencies (list[str]) – List of steps this step depends on.

  • cpuSize (str) – CPU size specified for the step function.

  • memory (int) – Memory in GB specified for the step function.

  • timeout (int) – Timeout for the step in minutes, default is 300 minutes.

  • pythonFunction (PythonFunction) – Information about the python function for the step.

  • codeSource (CodeSource) – Information about the source code of the step function.

pipeline_step_id
pipeline_id
step_name
pipeline_name
created_at
updated_at
python_function_id
step_dependencies
cpu_size
memory
timeout
python_function
code_source
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

delete()

Deletes a step from a pipeline.

Parameters:

pipeline_step_id (str) – The ID of the pipeline step.

update(function_name=None, source_code=None, step_input_mappings=None, output_variable_mappings=None, step_dependencies=None, package_requirements=None, cpu_size=None, memory=None, timeout=None)

Updates a step in a given pipeline.

Parameters:
  • function_name (str) – The name of the Python function.

  • source_code (str) – Contents of a valid Python source code file. The source code should contain the transform feature group functions. A list of allowed imports and system libraries for each language is specified in the user functions documentation section.

  • step_input_mappings (List) – List of Python function arguments.

  • output_variable_mappings (List) – List of Python function outputs.

  • step_dependencies (list) – List of step names this step depends on.

  • package_requirements (list) – List of package requirement strings. For example: [‘numpy==1.2.3’, ‘pandas>=1.4.0’].

  • cpu_size (str) – Size of the CPU for the step function.

  • memory (int) – Memory (in GB) for the step function.

  • timeout (int) – Timeout for the pipeline step, default is 300 minutes.

Returns:

Object describing the pipeline.

Return type:

PipelineStep

rename(step_name)

Renames a step in a given pipeline.

Parameters:

step_name (str) – The name of the step.

Returns:

Object describing the pipeline.

Return type:

PipelineStep

refresh()

Calls describe and refreshes the current object’s fields

Returns:

The current object

Return type:

PipelineStep

describe()

Describes a pipeline step.

Parameters:

pipeline_step_id (str) – The ID of the pipeline step.

Returns:

An object describing the pipeline step.

Return type:

PipelineStep

class abacusai.PipelineStepVersion(client, stepName=None, pipelineStepVersion=None, pipelineStepId=None, pipelineId=None, pipelineVersion=None, createdAt=None, updatedAt=None, status=None, error=None, outputErrors=None, pythonFunctionId=None, functionVariableMappings=None, stepDependencies=None, outputVariableMappings=None, cpuSize=None, memory=None, timeout=None, pipelineStepVersionReferences={}, codeSource={})

Bases: abacusai.return_class.AbstractApiClass

A version of a pipeline step.

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • stepName (str) – The name of the step.

  • pipelineStepVersion (str) – The reference to the pipeline step version.

  • pipelineStepId (str) – The reference to this step.

  • pipelineId (str) – The reference to the pipeline this step belongs to.

  • pipelineVersion (str) – The reference to the pipeline version.

  • createdAt (str) – The date and time at which this step was created.

  • updatedAt (str) – The date and time when this step was last updated.

  • status (str) – The status of the pipeline version.

  • error (str) – The error message if the pipeline step failed.

  • outputErrors (str) – The error message of a pipeline step’s output.

  • pythonFunctionId (str) – The reference to the python function

  • functionVariableMappings (dict) – The mappings for function parameters’ names.

  • stepDependencies (list[str]) – List of steps this step depends on.

  • outputVariableMappings (dict) – The mappings for the output variables to the step.

  • cpuSize (str) – CPU size specified for the step function.

  • memory (int) – Memory in GB specified for the step function.

  • timeout (int) – The timeout in minutes for the pipeline step.

  • pipelineStepVersionReferences (PipelineStepVersionReference) – A list of references to the output instances of the pipeline step version.

  • codeSource (CodeSource) – Information about the source code of the pipeline step version.

step_name
pipeline_step_version
pipeline_step_id
pipeline_id
pipeline_version
created_at
updated_at
status
error
output_errors
python_function_id
function_variable_mappings
step_dependencies
output_variable_mappings
cpu_size
memory
timeout
pipeline_step_version_references
code_source
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

refresh()

Calls describe and refreshes the current object’s fields

Returns:

The current object

Return type:

PipelineStepVersion

describe()

Describes a pipeline step version.

Parameters:

pipeline_step_version (str) – The ID of the pipeline step version.

Returns:

An object describing the pipeline step version.

Return type:

PipelineStepVersion

get_step_version_logs()

Gets the logs for a given step version.

Parameters:

pipeline_step_version (str) – The id of the pipeline step version.

Returns:

Object describing the pipeline step logs.

Return type:

PipelineStepVersionLogs
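
The methods above compose into a simple poll-and-inspect pattern. A minimal sketch, assuming an authenticated client and a client-level describe_pipeline_step_version() helper (the helper name, API key, and step version ID here are illustrative assumptions):

    from abacusai import ApiClient

    client = ApiClient('MY_API_KEY')  # placeholder API key

    # Any call that yields a PipelineStepVersion instance works the same way.
    step_version = client.describe_pipeline_step_version('my_step_version_id')

    step_version.refresh()               # re-sync fields with the API
    print(step_version.status)
    if step_version.status == 'FAILED':  # status literal assumed
        print(step_version.error)
        logs = step_version.get_step_version_logs()
        print(logs.logs)                 # combined stdout/stderr for the step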

class abacusai.PipelineStepVersionLogs(client, stepName=None, pipelineStepId=None, pipelineStepVersion=None, logs=None)

Bases: abacusai.return_class.AbstractApiClass

Logs for a given pipeline step version.

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • stepName (str) – The name of the step

  • pipelineStepId (str) – The ID of the step

  • pipelineStepVersion (str) – The version of the step

  • logs (str) – The logs for both stdout and stderr of the step

step_name
pipeline_step_id
pipeline_step_version
logs
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.PipelineStepVersionReference(client, pipelineStepVersionReferenceId=None, pipelineStepVersion=None, objectType=None, datasetVersion=None, modelVersion=None, deploymentVersion=None, batchPredictionId=None, modelMonitorVersion=None, notebookVersion=None, featureGroupVersion=None, status=None, error=None)

Bases: abacusai.return_class.AbstractApiClass

A reference from a pipeline step version to the versions that were output from the pipeline step.

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • pipelineStepVersionReferenceId (str) – The id of the reference.

  • pipelineStepVersion (str) – The pipeline step version the reference is connected to.

  • objectType (str) – The object type of the reference.

  • datasetVersion (str) – The dataset version the reference is connected to.

  • modelVersion (str) – The model version the reference is connected to.

  • deploymentVersion (str) – The deployment version the reference is connected to.

  • batchPredictionId (str) – The batch prediction id the reference is connected to.

  • modelMonitorVersion (str) – The model monitor version the reference is connected to.

  • notebookVersion (str) – The notebook version the reference is connected to.

  • featureGroupVersion (str) – The feature group version the reference is connected to.

  • status (str) – The status of the reference

  • error (str) – The error message if the reference is in an error state.

pipeline_step_version_reference_id
pipeline_step_version
object_type
dataset_version
model_version
deployment_version
batch_prediction_id
model_monitor_version
notebook_version
feature_group_version
status
error
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.PipelineVersion(client, pipelineName=None, pipelineId=None, pipelineVersion=None, createdAt=None, updatedAt=None, completedAt=None, status=None, error=None, stepVersions={}, codeSource={}, pipelineVariableMappings={})

Bases: abacusai.return_class.AbstractApiClass

A version of a pipeline.

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • pipelineName (str) – The name of the pipeline this version belongs to.

  • pipelineId (str) – The reference to the pipeline this version belongs to.

  • pipelineVersion (str) – The reference to this pipeline version.

  • createdAt (str) – The date and time at which this pipeline version was created.

  • updatedAt (str) – The date and time at which this pipeline version was last updated.

  • completedAt (str) – The date and time at which this pipeline version completed.

  • status (str) – The status of the pipeline version.

  • error (str) – The relevant error, if the status is FAILED.

  • stepVersions (PipelineStepVersion) – A list of the pipeline step versions.

  • codeSource (CodeSource) – Information on the source code.

  • pipelineVariableMappings (PythonFunctionArgument) – A description of the pipeline variables passed into this version.

pipeline_name
pipeline_id
pipeline_version
created_at
updated_at
completed_at
status
error
step_versions
code_source
pipeline_variable_mappings
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

refresh()

Calls describe and refreshes the current object’s fields

Returns:

The current object

Return type:

PipelineVersion

describe()

Describes a specified pipeline version

Parameters:

pipeline_version (str) – Unique string identifier for the pipeline version

Returns:

Object describing the pipeline version

Return type:

PipelineVersion

reset(steps=None, include_downstream_steps=True)

Reruns a pipeline version for the given steps and downstream steps if specified.

Parameters:
  • steps (list) – List of pipeline step names to rerun.

  • include_downstream_steps (bool) – Whether to rerun downstream steps from the steps you have passed

Returns:

Object describing the pipeline version

Return type:

PipelineVersion

list_logs()

Gets the logs for the steps in a given pipeline version.

Parameters:

pipeline_version (str) – The id of the pipeline version.

Returns:

Object describing the logs for the steps in the pipeline.

Return type:

PipelineVersionLogs

skip_pending_steps()

Skips pending steps in a pipeline version.

Parameters:

pipeline_version (str) – The id of the pipeline version.

Returns:

Object describing the pipeline version

Return type:

PipelineVersion

wait_for_pipeline(timeout=1200)

Waits until all the steps in the pipeline version have completed.

Parameters:

timeout (int) – The maximum time, in seconds, to wait for the pipeline version to finish; if it has not completed by then, the call times out.

get_status()

Gets the status of the pipeline version.

Returns:

A string describing the status of a pipeline version (pending, running, complete, etc.).

Return type:

str
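
Together, wait_for_pipeline(), get_status(), list_logs() and reset() support a simple monitoring loop. A minimal sketch, assuming pipeline_version is a PipelineVersion instance obtained from an authenticated client (the step name and status literal are illustrative):

    # Block until the pipeline version finishes, for up to 30 minutes.
    pipeline_version.wait_for_pipeline(timeout=1800)

    if pipeline_version.get_status() == 'FAILED':      # status literal assumed
        logs = pipeline_version.list_logs()
        for step_log in logs.step_logs:                # per-step logs
            print(step_log.step_name, step_log.logs)
        # Rerun one step (name illustrative) plus everything downstream of it.
        pipeline_version.reset(steps=['transform_data'],
                               include_downstream_steps=True)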

class abacusai.PipelineVersionLogs(client, stepLogs={})

Bases: abacusai.return_class.AbstractApiClass

Logs for a given pipeline version.

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • stepLogs (PipelineStepVersionLogs) – The logs for each step in the pipeline version.

step_logs
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.PlaygroundText(client, playgroundText=None, renderingCode=None)

Bases: abacusai.return_class.AbstractApiClass

The text content inside of a playground segment.

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • playgroundText (str) – The text of the playground segment.

  • renderingCode (str) – The rendering code of the playground segment.

playground_text
rendering_code
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.PointInTimeFeature(client, historyTableName=None, aggregationKeys=None, timestampKey=None, historicalTimestampKey=None, lookbackWindowSeconds=None, lookbackWindowLagSeconds=None, lookbackCount=None, lookbackUntilPosition=None, expression=None, groupName=None)

Bases: abacusai.return_class.AbstractApiClass

A point-in-time feature description

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • historyTableName (str) – The name of the history table. If not specified, the current table is used for a self-join.

  • aggregationKeys (list[str]) – List of keys to use for joining the historical table and performing the window aggregation.

  • timestampKey (str) – Name of feature which contains the timestamp value for the point-in-time feature.

  • historicalTimestampKey (str) – Name of feature which contains the historical timestamp.

  • lookbackWindowSeconds (float) – If window is specified in terms of time, the number of seconds in the past from the current time for the start of the window.

  • lookbackWindowLagSeconds (float) – Optional lag to offset the closest point for the window. If it is positive, the start of the window is delayed. If it is negative, we are looking at the “future” rows in the history table.

  • lookbackCount (int) – If window is specified in terms of count, the start position of the window (0 is the current row).

  • lookbackUntilPosition (int) – Optional lag to offset the closest point for the window. If it is positive, the start of the window is delayed by that many rows. If it is negative, we are looking at those many “future” rows in the history table.

  • expression (str) – SQL aggregate expression which can convert a sequence of rows into a scalar value.

  • groupName (str) – The group name this point-in-time feature belongs to.

history_table_name
aggregation_keys
timestamp_key
historical_timestamp_key
lookback_window_seconds
lookback_window_lag_seconds
lookback_count
lookback_until_position
expression
group_name
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict
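
Operationally, the window parameters select, for each row, the history rows whose historical timestamp falls in a window that ends lookbackWindowLagSeconds before the row’s own timestamp and extends lookbackWindowSeconds further back; the SQL expression then reduces those rows to a scalar. A rough pandas analogue of that semantics (illustrative only, not how the platform evaluates it):

    import pandas as pd

    def point_in_time_sum(df, key, ts, value, window_s, lag_s=0.0):
        """For each row, sum `value` over rows sharing `key` whose
        timestamp lies in (t - lag_s - window_s, t - lag_s]."""
        out = []
        for _, row in df.iterrows():
            t = row[ts]  # `ts` must be a datetime64 column
            hist = df[(df[key] == row[key])
                      & (df[ts] <= t - pd.Timedelta(seconds=lag_s))
                      & (df[ts] > t - pd.Timedelta(seconds=lag_s + window_s))]
            out.append(hist[value].sum())
        return pd.Series(out, index=df.index)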

class abacusai.PointInTimeFeatureInfo(client, expression=None, groupName=None)

Bases: abacusai.return_class.AbstractApiClass

Point-in-time information for a feature

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • expression (str) – SQL aggregate expression which can convert a sequence of rows into a scalar value.

  • groupName (str) – The group name this point-in-time feature belongs to.

expression
group_name
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.PointInTimeGroup(client, groupName=None, windowKey=None, aggregationKeys=None, lookbackWindow=None, lookbackWindowLag=None, lookbackCount=None, lookbackUntilPosition=None, historyTableName=None, historyWindowKey=None, historyAggregationKeys=None, features={})

Bases: abacusai.return_class.AbstractApiClass

A point in time group containing point in time features

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • groupName (str) – The name of the point in time group

  • windowKey (str) – Name of feature which contains the timestamp value for the point in time feature

  • aggregationKeys (list) – List of keys to use for joining the historical table and performing the window aggregation.

  • lookbackWindow (float) – Number of seconds in the past from the current time for start of the window.

  • lookbackWindowLag (float) – Optional lag to offset the closest point for the window. If it is positive, we delay the start of window. If it is negative, we are looking at the “future” rows in the history table.

  • lookbackCount (int) – If window is specified in terms of count, the start position of the window (0 is the current row)

  • lookbackUntilPosition (int) – Optional lag to offset the closest point for the window. If it is positive, we delay the start of window by that many rows. If it is negative, we are looking at those many “future” rows in the history table.

  • historyTableName (str) – The table to use for aggregating; if not provided, the source table will be used.

  • historyWindowKey (str) – Name of the feature to use for ordering the rows on the history table. If not provided, the windowKey from the source table will be used.

  • historyAggregationKeys (list) – List of keys to use for joining the historical table and performing the window aggregation. If not provided, the aggregationKeys from the source table will be used. Must be the same length and order as the source table’s aggregationKeys.

  • features (PointInTimeGroupFeature) – List of features in the Point in Time group

group_name
window_key
aggregation_keys
lookback_window
lookback_window_lag
lookback_count
lookback_until_position
history_table_name
history_window_key
history_aggregation_keys
features
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.PointInTimeGroupFeature(client, name=None, expression=None, pitOperationType=None, pitOperationConfig=None)

Bases: abacusai.return_class.AbstractApiClass

A point in time group feature

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • name (str) – The name of the feature

  • expression (str) – SQL Aggregate expression which can convert a sequence of rows into a scalar value.

  • pitOperationType (str) – The operation used in point in time feature generation

  • pitOperationConfig (dict) – The configuration used as input to the operation type

name
expression
pit_operation_type
pit_operation_config
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.PredictionClient(client_options=None)

Bases: abacusai.client.BaseApiClient

Abacus.AI Prediction API Client. Does not utilize authentication and only contains public prediction methods.

Parameters:

client_options (ClientOptions) – Optional API client configurations

predict_raw(deployment_token, deployment_id, **kwargs)

Raw interface for returning predictions from Plug and Play deployments.

Parameters:
  • deployment_token (str) – The deployment token to authenticate access to created deployments. This token is only authorized to predict on deployments in this project, so it is safe to embed this model inside of an application or website.

  • deployment_id (str) – The unique identifier to a deployment created under the project.

  • **kwargs (dict) – Arbitrary key/value pairs that are sent as part of the request body.

lookup_features(deployment_token, deployment_id, query_data, limit_results=None, result_columns=None)

Returns the feature group deployed in the feature store project.

Parameters:
  • deployment_token (str) – A deployment token used to authenticate access to created deployments. This token only authorizes predictions on deployments in this project, so it can be safely embedded inside an application or website.

  • deployment_id (str) – A unique identifier for a deployment created under the project.

  • query_data (dict) – A dictionary where the key is the column name (e.g. a column with name ‘user_id’ in your dataset) mapped to the column mapping USER_ID that uniquely identifies the entity against which a prediction is performed and the value is the unique value of the same entity.

  • limit_results (int) – If provided, will limit the number of results to the value specified.

  • result_columns (list) – If provided, will limit the columns present in each result to the columns specified in this list.

Return type:

Dict

predict(deployment_token, deployment_id, query_data, **kwargs)

Returns a prediction for Predictive Modeling

Parameters:
  • deployment_token (str) – A deployment token used to authenticate access to created deployments. This token is only authorized to predict on deployments in this project, and is safe to embed in an application or website.

  • deployment_id (str) – A unique identifier for a deployment created under the project.

  • query_data (dict) – A dictionary where the key is the column name (e.g. a column with name ‘user_id’ in the dataset) mapped to the column mapping USER_ID that uniquely identifies the entity against which a prediction is performed, and the value is the unique value of the same entity.

Return type:

Dict
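
Because PredictionClient does not authenticate with an API key, a deployment token and deployment ID are all that is needed. A minimal sketch (the token, deployment ID, column name, and value below are placeholders):

    from abacusai import PredictionClient

    client = PredictionClient()

    TOKEN = 'my_deployment_token'
    DEPLOYMENT_ID = 'my_deployment_id'

    # The key is the dataset column mapped to USER_ID; the value
    # identifies the entity to predict for.
    result = client.predict(TOKEN, DEPLOYMENT_ID, {'user_id': '42'})
    print(result)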

predict_multiple(deployment_token, deployment_id, query_data)

Returns a list of predictions for predictive modeling.

Parameters:
  • deployment_token (str) – The deployment token used to authenticate access to created deployments. This token is only authorized to predict on deployments in this project, and is safe to embed in an application or website.

  • deployment_id (str) – The unique identifier for a deployment created under the project.

  • query_data (list) – A list of dictionaries, where the ‘key’ is the column name (e.g. a column with name ‘user_id’ in the dataset) mapped to the column mapping USER_ID that uniquely identifies the entity against which a prediction is performed, and the ‘value’ is the unique value of the same entity.

Return type:

Dict

predict_from_datasets(deployment_token, deployment_id, query_data)

Returns a list of predictions for Predictive Modeling.

Parameters:
  • deployment_token (str) – The deployment token used to authenticate access to created deployments. This token is only authorized to predict on deployments in this project, so it is safe to embed this model inside of an application or website.

  • deployment_id (str) – The unique identifier for a deployment created under the project.

  • query_data (dict) – A dictionary where the ‘key’ is the source dataset name, and the ‘value’ is a list of records corresponding to the dataset rows.

Return type:

Dict

predict_lead(deployment_token, deployment_id, query_data, explain_predictions=False, explainer_type=None)

Returns the probability of a user being a lead based on their interaction with the service/product and their own attributes (e.g. income, assets, credit score, etc.). Note that the inputs to this method, wherever applicable, should be the column names in the dataset mapped to the column mappings in our system (e.g. column ‘user_id’ mapped to mapping ‘LEAD_ID’ in our system).

Parameters:
  • deployment_token (str) – The deployment token to authenticate access to created deployments. This token is only authorized to predict on deployments in this project, so it is safe to embed this model inside of an application or website.

  • deployment_id (str) – The unique identifier to a deployment created under the project.

  • query_data (dict) – A dictionary containing user attributes and/or user’s interaction data with the product/service (e.g. number of clicks, items in cart, etc.).

  • explain_predictions (bool) – Will explain predictions for leads

  • explainer_type (str) – Type of explainer to use for explanations

Return type:

Dict

predict_churn(deployment_token, deployment_id, query_data, explain_predictions=False, explainer_type=None)

Returns the probability that a user will churn based on their interactions with the item/product/service. Note that the inputs to this method, wherever applicable, will be the column names in your dataset mapped to the column mappings in our system (e.g. column ‘churn_result’ mapped to mapping ‘CHURNED_YN’ in our system).

Parameters:
  • deployment_token (str) – The deployment token to authenticate access to created deployments. This token is only authorized to predict on deployments in this project, so it is safe to embed this model inside of an application or website.

  • deployment_id (str) – The unique identifier to a deployment created under the project.

  • query_data (dict) – This will be a dictionary where the ‘key’ will be the column name (e.g. a column with name ‘user_id’ in your dataset) mapped to the column mapping USER_ID that uniquely identifies the entity against which a prediction is performed and the ‘value’ will be the unique value of the same entity.

  • explain_predictions (bool) – Will explain predictions for churn

  • explainer_type (str) – Type of explainer to use for explanations

Return type:

Dict

predict_takeover(deployment_token, deployment_id, query_data)

Returns a probability for each class label associated with the types of fraud or a ‘yes’ or ‘no’ type label for the possibility of fraud. Note that the inputs to this method, wherever applicable, will be the column names in the dataset mapped to the column mappings in our system (e.g., column ‘account_name’ mapped to mapping ‘ACCOUNT_ID’ in our system).

Parameters:
  • deployment_token (str) – The deployment token to authenticate access to created deployments. This token is only authorized to predict on deployments in this project, so it is safe to embed this model inside an application or website.

  • deployment_id (str) – The unique identifier to a deployment created under the project.

  • query_data (dict) – A dictionary containing account activity characteristics (e.g., login id, login duration, login type, IP address, etc.).

Return type:

Dict

predict_fraud(deployment_token, deployment_id, query_data)

Returns the probability of a transaction performed under a specific account being fraudulent or not. Note that the inputs to this method, wherever applicable, should be the column names in your dataset mapped to the column mappings in our system (e.g. column ‘account_number’ mapped to the mapping ‘ACCOUNT_ID’ in our system).

Parameters:
  • deployment_token (str) – A deployment token to authenticate access to created deployments. This token is only authorized to predict on deployments in this project, so it is safe to embed this model inside of an application or website.

  • deployment_id (str) – A unique identifier to a deployment created under the project.

  • query_data (dict) – A dictionary containing transaction attributes (e.g. credit card type, transaction location, transaction amount, etc.).

Return type:

Dict

predict_class(deployment_token, deployment_id, query_data, threshold=None, threshold_class=None, thresholds=None, explain_predictions=False, fixed_features=None, nested=None, explainer_type=None)

Returns a classification prediction

Parameters:
  • deployment_token (str) – The deployment token to authenticate access to created deployments. This token is only authorized to predict on deployments in this project, so it is safe to embed this model within an application or website.

  • deployment_id (str) – The unique identifier for a deployment created under the project.

  • query_data (dict) – A dictionary where the ‘Key’ is the column name (e.g. a column with the name ‘user_id’ in your dataset) mapped to the column mapping USER_ID that uniquely identifies the entity against which a prediction is performed and the ‘Value’ is the unique value of the same entity.

  • threshold (float) – A float value that is applied on the popular class label.

  • threshold_class (str) – The label upon which the threshold is added (binary labels only).

  • thresholds (Dict) – Maps labels to thresholds (multi-label classification only). Defaults to F1 optimal threshold if computed for the given class, else uses 0.5.

  • explain_predictions (bool) – If True, returns the SHAP explanations for all input features.

  • fixed_features (list) – A set of input features to treat as constant for explanations - only honored when the explainer type is KERNEL_EXPLAINER

  • nested (str) – If specified generates prediction delta for each index of the specified nested feature.

  • explainer_type (str) – The type of explainer to use.

Return type:

Dict
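
For example, a binary classifier can be queried with a custom decision threshold on a single label. A sketch reusing the client from the predict() example above (the column name, ID, and label are illustrative):

    result = client.predict_class(
        TOKEN, DEPLOYMENT_ID,
        {'transaction_id': 'txn_0001'},
        threshold=0.7,
        threshold_class='fraud',     # binary labels only
        explain_predictions=True,    # include SHAP explanations
    )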

predict_target(deployment_token, deployment_id, query_data, explain_predictions=False, fixed_features=None, nested=None, explainer_type=None)

Returns a prediction from a classification or regression model. Optionally, includes explanations.

Parameters:
  • deployment_token (str) – The deployment token to authenticate access to created deployments. This token is only authorized to predict on deployments in this project, so it is safe to embed this model inside of an application or website.

  • deployment_id (str) – The unique identifier of a deployment created under the project.

  • query_data (dict) – A dictionary where the ‘key’ is the column name (e.g. a column with name ‘user_id’ in your dataset) mapped to the column mapping USER_ID that uniquely identifies the entity against which a prediction is performed and the ‘value’ is the unique value of the same entity.

  • explain_predictions (bool) – If true, returns the SHAP explanations for all input features.

  • fixed_features (list) – Set of input features to treat as constant for explanations - only honored when the explainer type is KERNEL_EXPLAINER

  • nested (str) – If specified, generates prediction delta for each index of the specified nested feature.

  • explainer_type (str) – The type of explainer to use.

Return type:

Dict

get_anomalies(deployment_token, deployment_id, threshold=None, histogram=False)

Returns a list of anomalies from the training dataset.

Parameters:
  • deployment_token (str) – The deployment token to authenticate access to created deployments. This token is only authorized to predict on deployments in this project, so it is safe to embed this model inside of an application or website.

  • deployment_id (str) – The unique identifier to a deployment created under the project.

  • threshold (float) – The threshold score of what is an anomaly. Valid values are between 0.8 and 0.99.

  • histogram (bool) – If True, will return a histogram of the distribution of all points.

Return type:

io.BytesIO

get_timeseries_anomalies(deployment_token, deployment_id, start_timestamp=None, end_timestamp=None, query_data=None, get_all_item_data=False, series_ids=None)

Returns a list of anomalous timestamps from the training dataset.

Parameters:
  • deployment_token (str) – The deployment token to authenticate access to created deployments. This token is only authorized to predict on deployments in this project, so it is safe to embed this model inside of an application or website.

  • deployment_id (str) – The unique identifier to a deployment created under the project.

  • start_timestamp (str) – The timestamp from which to start detecting anomalies in the training data.

  • end_timestamp (str) – The timestamp at which to stop detecting anomalies in the training data.

  • query_data (dict) – Additional data on which to perform anomaly detection; this can be a single record, a list of records, or a JSON string representing a list of records.

  • get_all_item_data (bool) – Set this to True to perform anomaly detection on all the data related to the input ids.

  • series_ids (List) – A list of series ids on which to perform anomaly detection.

Return type:

Dict

is_anomaly(deployment_token, deployment_id, query_data=None)

Returns a list of anomaly attributes based on login information for a specified account. Note that the inputs to this method, wherever applicable, should be the column names in the dataset mapped to the column mappings in our system (e.g. column ‘account_name’ mapped to mapping ‘ACCOUNT_ID’ in our system).

Parameters:
  • deployment_token (str) – The deployment token to authenticate access to created deployments. This token is only authorized to predict on deployments in this project, so it is safe to embed this model inside of an application or website.

  • deployment_id (str) – The unique identifier to a deployment created under the project.

  • query_data (dict) – The input data for the prediction.

Return type:

Dict

get_event_anomaly_score(deployment_token, deployment_id, query_data=None)

Returns an anomaly score for an event.

Parameters:
  • deployment_token (str) – The deployment token to authenticate access to created deployments. This token is only authorized to predict on deployments in this project, so it is safe to embed this model inside of an application or website.

  • deployment_id (str) – The unique identifier to a deployment created under the project.

  • query_data (dict) – The input data for the prediction.

Return type:

Dict

get_forecast(deployment_token, deployment_id, query_data, future_data=None, num_predictions=None, prediction_start=None, explain_predictions=False, explainer_type=None, get_item_data=False)

Returns a list of forecasts for a given entity under the specified project deployment. Note that the inputs to the deployed model will be the column names in your dataset mapped to the column mappings in our system (e.g. column ‘holiday_yn’ mapped to mapping ‘FUTURE’ in our system).

Parameters:
  • deployment_token (str) – The deployment token to authenticate access to created deployments. This token is only authorized to predict on deployments in this project, so it is safe to embed this model inside of an application or website.

  • deployment_id (str) – The unique identifier to a deployment created under the project.

  • query_data (dict) – This will be a dictionary where ‘Key’ will be the column name (e.g. a column with name ‘store_id’ in your dataset) mapped to the column mapping ITEM_ID that uniquely identifies the entity against which forecasting is performed and ‘Value’ will be the unique value of the same entity.

  • future_data (list) – This will be a list of values known ahead of time that are relevant for forecasting (e.g. State Holidays, National Holidays, etc.). Each element is a dictionary, where the key and the value both will be of type ‘str’. For example future data entered for a Store may be [{“Holiday”:”No”, “Promo”:”Yes”, “Date”: “2015-07-31 00:00:00”}].

  • num_predictions (int) – The number of timestamps to predict in the future.

  • prediction_start (str) – The start date for predictions (e.g., “2015-08-01T00:00:00” as input for mid-night of 2015-08-01).

  • explain_predictions (bool) – Will explain predictions for forecasting

  • explainer_type (str) – Type of explainer to use for explanations

  • get_item_data (bool) – Will return the data corresponding to items in query

Return type:

Dict
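
A sketch of a forecasting call that supplies known future covariates, reusing the client from the predict() example above (the ITEM_ID column and covariate names follow the example in the parameter descriptions):

    forecast = client.get_forecast(
        TOKEN, DEPLOYMENT_ID,
        {'store_id': 'S001'},
        future_data=[{'Holiday': 'No', 'Promo': 'Yes',
                      'Date': '2015-07-31 00:00:00'}],
        num_predictions=14,
        prediction_start='2015-08-01T00:00:00',
    )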

get_k_nearest(deployment_token, deployment_id, vector, k=None, distance=None, include_score=False, catalog_id=None)

Returns the k nearest neighbors for the provided embedding vector.

Parameters:
  • deployment_token (str) – The deployment token to authenticate access to created deployments. This token is only authorized to predict on deployments in this project, so it is safe to embed this model inside of an application or website.

  • deployment_id (str) – The unique identifier to a deployment created under the project.

  • vector (list) – Input vector to perform the k nearest neighbors with.

  • k (int) – The number of items to return (overrides the deployment default).

  • distance (str) – The distance function to use. Options include ‘dot’, ‘cosine’, ‘euclidean’, and ‘manhattan’. Default is ‘dot’.

  • include_score (bool) – If True, will return the score alongside the resulting embedding value.

  • catalog_id (str) – An optional parameter honored only for embeddings that provide a catalog id

Return type:

Dict
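
A sketch of a nearest-neighbour lookup, reusing the client from the predict() example above (the vector values are toy data):

    neighbors = client.get_k_nearest(
        TOKEN, DEPLOYMENT_ID,
        vector=[0.12, -0.40, 0.88],
        k=10,
        distance='cosine',
        include_score=True,
    )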

get_multiple_k_nearest(deployment_token, deployment_id, queries)

Returns the k nearest neighbors for the queries provided.

Parameters:
  • deployment_token (str) – The deployment token to authenticate access to created deployments. This token is only authorized to predict on deployments in this project, so it is safe to embed this model inside of an application or website.

  • deployment_id (str) – The unique identifier to a deployment created under the project.

  • queries (list) – List of mappings of format {“catalogId”: “cat0”, “vectors”: […], “k”: 20, “distance”: “euclidean”}. See getKNearest for additional information about the supported parameters.

get_labels(deployment_token, deployment_id, query_data, return_extracted_entities=False)

Returns a list of scored labels for a document.

Parameters:
  • deployment_token (str) – The deployment token to authenticate access to created deployments. This token is only authorized to predict on deployments in this project, so it is safe to embed this model inside of an application or website.

  • deployment_id (str) – The unique identifier to a deployment created under the project.

  • query_data (dict) – Dictionary where key is “Content” and value is the text from which entities are to be extracted.

  • return_extracted_entities (bool) – (Optional) If True, will return the extracted entities in simpler format

Return type:

Dict

get_entities_from_pdf(deployment_token, deployment_id, pdf=None, doc_id=None, return_extracted_features=False, verbose=False, save_extracted_features=None)

Extracts text from the provided PDF and returns a list of recognized labels and their scores.

Parameters:
  • deployment_token (str) – The deployment token to authenticate access to created deployments. This token is only authorized to predict on deployments in this project, so it is safe to embed this model inside of an application or website.

  • deployment_id (str) – The unique identifier to a deployment created under the project.

  • pdf (io.TextIOBase) – (Optional) The PDF file to predict on. One of pdf or docId must be specified.

  • doc_id (str) – (Optional) The document ID of a previously processed PDF to predict on. One of pdf or docId must be specified.

  • return_extracted_features (bool) – (Optional) If True, will return all extracted features (e.g. all tokens in a page) from the PDF. Default is False.

  • verbose (bool) – (Optional) If True, will return all the extracted tokens probabilities for all the trained labels. Default is False.

  • save_extracted_features (bool) – (Optional) If True, will save extracted features (i.e. page tokens) so that they can be fetched using the prediction docId. Default is False.

Return type:

Dict
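
A sketch of extracting entities from a local PDF, reusing the client from the predict() example above (the file path is illustrative; opening the file in binary mode is assumed here):

    with open('invoice.pdf', 'rb') as f:
        entities = client.get_entities_from_pdf(
            TOKEN, DEPLOYMENT_ID,
            pdf=f,                          # exactly one of pdf / doc_id
            return_extracted_features=False,
        )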

get_recommendations(deployment_token, deployment_id, query_data, num_items=None, page=None, exclude_item_ids=None, score_field=None, scaling_factors=None, restrict_items=None, exclude_items=None, explore_fraction=None, diversity_attribute_name=None, diversity_max_results_per_value=None)

Returns a list of recommendations for a given user under the specified project deployment. Note that the inputs to this method, wherever applicable, will be the column names in your dataset mapped to the column mappings in our system (e.g. column ‘time’ mapped to mapping ‘TIMESTAMP’ in our system).

Parameters:
  • deployment_token (str) – The deployment token to authenticate access to created deployments. This token is only authorized to predict on deployments in this project, so it is safe to embed this model inside of an application or website.

  • deployment_id (str) – The unique identifier to a deployment created under the project.

  • query_data (dict) – This will be a dictionary where ‘Key’ will be the column name (e.g. a column with name ‘user_name’ in your dataset) mapped to the column mapping USER_ID that uniquely identifies the user against which recommendations are made and ‘Value’ will be the unique value of the same user. For example, if you have the column name ‘user_name’ mapped to the column mapping ‘USER_ID’, then the query must have the exact same column name (user_name) as key and the name of the user (John Doe) as value.

  • num_items (int) – The number of items to recommend on one page. By default, it is set to 50 items per page.

  • page (int) – The page number to be displayed. For example, let’s say that the num_items is set to 10 with the total recommendations list size of 50 recommended items, then an input value of 2 in the ‘page’ variable will display a list of items that rank from 11th to 20th.

  • score_field (str) – If provided, the relative item scores are returned in a separate field whose name is the value passed for this argument.

  • scaling_factors (list) – It allows you to bias the model towards certain items. The input to this argument is a list of dictionaries where the format of each dictionary is as follows: {“column”: “col0”, “values”: [“value0”, “value1”], “factor”: 1.1}. The key, “column” takes the name of the column, “col0”; the key, “values” takes the list of items, “[“value0”, “value1”]” in reference to which the model recommendations need to be biased; and the key, “factor” takes the factor by which the item scores are adjusted. Let’s take an example where the input to scaling_factors is [{“column”: “VehicleType”, “values”: [“SUV”, “Sedan”], “factor”: 1.4}]. After we apply the model to get item probabilities, for every SUV and Sedan in the list, we will multiply the respective probability by 1.4 before sorting. This is particularly useful if there’s a type of item that might be less popular but you want to promote it or there’s an item that always comes up and you want to demote it.

  • restrict_items (list) – It allows you to restrict the recommendations to certain items. The input to this argument is a list of dictionaries where the format of each dictionary is as follows: {“column”: “col0”, “values”: [“value0”, “value1”, “value3”, …]}. The key, “column” takes the name of the column, “col0”; the key, “values” takes the list of items, “[“value0”, “value1”, “value3”, …]” to which to restrict the recommendations to. Let’s take an example where the input to restrict_items is [{“column”: “VehicleType”, “values”: [“SUV”, “Sedan”]}]. This input will restrict the recommendations to SUVs and Sedans. This type of restriction is particularly useful if there’s a list of items that you know is of use in some particular scenario and you want to restrict the recommendations only to that list.

  • exclude_items (list) – It allows you to exclude certain items from the list of recommendations. The input to this argument is a list of dictionaries where the format of each dictionary is as follows: {“column”: “col0”, “values”: [“value0”, “value1”, …]}. The key, “column” takes the name of the column, “col0”; the key, “values” takes the list of items, “[“value0”, “value1”]” to exclude from the recommendations. Let’s take an example where the input to exclude_items is [{“column”: “VehicleType”, “values”: [“SUV”, “Sedan”]}]. The resulting recommendation list will exclude all SUVs and Sedans. This is particularly useful if there’s a list of items that you know is of no use in some particular scenario and you don’t want to show those items present in that list.

  • explore_fraction (float) – Explore fraction.

  • diversity_attribute_name (str) – Item attribute column name used to ensure diversity of prediction results.

  • diversity_max_results_per_value (int) – Maximum number of results per value of diversity_attribute_name.

  • exclude_item_ids (list)

Return type:

Dict
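
A sketch combining the biasing, restriction, and exclusion dictionaries described above, reusing the client from the predict() example (column and item values follow the examples in the parameter descriptions; ‘Truck’ is illustrative):

    recs = client.get_recommendations(
        TOKEN, DEPLOYMENT_ID,
        {'user_name': 'John Doe'},   # column mapped to USER_ID
        num_items=10,
        page=1,
        scaling_factors=[{'column': 'VehicleType',
                          'values': ['SUV', 'Sedan'], 'factor': 1.4}],
        restrict_items=[{'column': 'VehicleType',
                         'values': ['SUV', 'Sedan']}],
        exclude_items=[{'column': 'VehicleType', 'values': ['Truck']}],
    )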

get_personalized_ranking(deployment_token, deployment_id, query_data, preserve_ranks=None, preserve_unknown_items=False, scaling_factors=None)

Returns a list of items with personalized promotions for a given user under the specified project deployment. Note that the inputs to this method, wherever applicable, should be the column names in the dataset mapped to the column mappings in our system (e.g. column ‘item_code’ mapped to mapping ‘ITEM_ID’ in our system).

Parameters:
  • deployment_token (str) – The deployment token to authenticate access to created deployments. This token is only authorized to predict on deployments in this project, so it is safe to embed this model in an application or website.

  • deployment_id (str) – The unique identifier to a deployment created under the project.

  • query_data (dict) – This should be a dictionary with two key-value pairs. The first pair represents a ‘Key’ where the column name (e.g. a column with name ‘user_id’ in the dataset) mapped to the column mapping USER_ID uniquely identifies the user against whom a prediction is made and a ‘Value’ which is the identifier value for that user. The second pair will have a ‘Key’ which will be the name of the column name (e.g. movie_name) mapped to ITEM_ID (unique item identifier) and a ‘Value’ which will be a list of identifiers that uniquely identifies those items.

  • preserve_ranks (list) – List of dictionaries of format {“column”: “col0”, “values”: [“value0, value1”]}, where the ranks of items in query_data is preserved for all the items in “col0” with values, “value0” and “value1”. This option is useful when the desired items are being recommended in the desired order and the ranks for those items need to be kept unchanged during recommendation generation.

  • preserve_unknown_items (bool) – If true, any items unknown to the model will not be reranked, and their original position in the query will be preserved.

  • scaling_factors (list) – It allows you to bias the model towards certain items. The input to this argument is a list of dictionaries where the format of each dictionary is as follows: {“column”: “col0”, “values”: [“value0”, “value1”], “factor”: 1.1}. The key, “column” takes the name of the column, “col0”; the key, “values” takes the list of items, “[“value0”, “value1”]” in reference to which the model recommendations need to be biased; and the key, “factor” takes the factor by which the item scores are adjusted. Let’s take an example where the input to scaling_factors is [{“column”: “VehicleType”, “values”: [“SUV”, “Sedan”], “factor”: 1.4}]. After we apply the model to get item probabilities, for every SUV and Sedan in the list, we will multiply the respective probability by 1.4 before sorting. This is particularly useful if there’s a type of item that might be less popular but you want to promote it or there’s an item that always comes up and you want to demote it.

Return type:

Dict

get_ranked_items(deployment_token, deployment_id, query_data, preserve_ranks=None, preserve_unknown_items=False, score_field=None, scaling_factors=None, diversity_attribute_name=None, diversity_max_results_per_value=None)

Returns a list of re-ranked items for a selected user when a list of items is required to be reranked according to the user’s preferences. Note that the inputs to this method, wherever applicable, will be the column names in your dataset mapped to the column mappings in our system (e.g. column ‘item_code’ mapped to mapping ‘ITEM_ID’ in our system).

Parameters:
  • deployment_token (str) – The deployment token to authenticate access to created deployments. This token is only authorized to predict on deployments in this project, so it is safe to embed this model inside of an application or website.

  • deployment_id (str) – The unique identifier to a deployment created under the project.

  • query_data (dict) – This will be a dictionary with two key-value pairs. The first pair represents a ‘Key’ where the column name (e.g. a column with name ‘user_id’ in your dataset) mapped to the column mapping USER_ID uniquely identifies the user against whom a prediction is made and a ‘Value’ which is the identifier value for that user. The second pair will have a ‘Key’ which will be the name of the column name (e.g. movie_name) mapped to ITEM_ID (unique item identifier) and a ‘Value’ which will be a list of identifiers that uniquely identifies those items.

  • preserve_ranks (list) – List of dictionaries of format {“column”: “col0”, “values”: [“value0, value1”]}, where the ranks of items in query_data is preserved for all the items in “col0” with values, “value0” and “value1”. This option is useful when the desired items are being recommended in the desired order and the ranks for those items need to be kept unchanged during recommendation generation.

  • preserve_unknown_items (bool) – If true, any items unknown to the model will not be reranked, and their original position in the query will be preserved.

  • score_field (str) – If provided, the relative item scores are returned in a separate field whose name is the value passed for this argument.

  • scaling_factors (list) – It allows you to bias the model towards certain items. The input to this argument is a list of dictionaries where the format of each dictionary is as follows: {“column”: “col0”, “values”: [“value0”, “value1”], “factor”: 1.1}. The key, “column” takes the name of the column, “col0”; the key, “values” takes the list of items, “[“value0”, “value1”]” in reference to which the model recommendations need to be biased; and the key, “factor” takes the factor by which the item scores are adjusted. Let’s take an example where the input to scaling_factors is [{“column”: “VehicleType”, “values”: [“SUV”, “Sedan”], “factor”: 1.4}]. After we apply the model to get item probabilities, for every SUV and Sedan in the list, we will multiply the respective probability by 1.4 before sorting. This is particularly useful if there is a type of item that might be less popular but you want to promote it or there is an item that always comes up and you want to demote it.

  • diversity_attribute_name (str) – Item attribute column name used to ensure diversity of prediction results.

  • diversity_max_results_per_value (int) – Maximum number of results per value of diversity_attribute_name.

Return type:

Dict

get_related_items(deployment_token, deployment_id, query_data, num_items=None, page=None, scaling_factors=None, restrict_items=None, exclude_items=None)

Returns a list of related items for a given item under the specified project deployment. Note that the inputs to this method, wherever applicable, will be the column names in your dataset mapped to the column mappings in our system (e.g. column ‘item_code’ mapped to mapping ‘ITEM_ID’ in our system).

Parameters:
  • deployment_token (str) – The deployment token to authenticate access to created deployments. This token is only authorized to predict on deployments in this project, so it is safe to embed this model inside of an application or website.

  • deployment_id (str) – The unique identifier to a deployment created under the project.

  • query_data (dict) – This will be a dictionary where the ‘key’ will be the column name (e.g. a column with name ‘user_name’ in your dataset) mapped to the column mapping USER_ID that uniquely identifies the user against which related items are determined and the ‘value’ will be the unique value of the same user. For example, if you have the column name ‘user_name’ mapped to the column mapping ‘USER_ID’, then the query must have the exact same column name (user_name) as key and the name of the user (John Doe) as value.

  • num_items (int) – The number of items to recommend on one page. By default, it is set to 50 items per page.

  • page (int) – The page number to be displayed. For example, let’s say that the num_items is set to 10 with the total recommendations list size of 50 recommended items, then an input value of 2 in the ‘page’ variable will display a list of items that rank from 11th to 20th.

  • scaling_factors (list) – It allows you to bias the model towards certain items. The input to this argument is a list of dictionaries where the format of each dictionary is as follows: {“column”: “col0”, “values”: [“value0”, “value1”], “factor”: 1.1}. The key, “column” takes the name of the column, “col0”; the key, “values” takes the list of items, “[“value0”, “value1”]” in reference to which the model recommendations need to be biased; and the key, “factor” takes the factor by which the item scores are adjusted. Let’s take an example where the input to scaling_factors is [{“column”: “VehicleType”, “values”: [“SUV”, “Sedan”], “factor”: 1.4}]. After we apply the model to get item probabilities, for every SUV and Sedan in the list, we will multiply the respective probability by 1.4 before sorting. This is particularly useful if there’s a type of item that might be less popular but you want to promote it or there’s an item that always comes up and you want to demote it.

  • restrict_items (list) – It allows you to restrict the recommendations to certain items. The input to this argument is a list of dictionaries where the format of each dictionary is as follows: {“column”: “col0”, “values”: [“value0”, “value1”, “value3”, …]}. The key, “column” takes the name of the column, “col0”; the key, “values” takes the list of items, “[“value0”, “value1”, “value3”, …]” to which to restrict the recommendations to. Let’s take an example where the input to restrict_items is [{“column”: “VehicleType”, “values”: [“SUV”, “Sedan”]}]. This input will restrict the recommendations to SUVs and Sedans. This type of restriction is particularly useful if there’s a list of items that you know is of use in some particular scenario and you want to restrict the recommendations only to that list.

  • exclude_items (list) – It allows you to exclude certain items from the list of recommendations. The input to this argument is a list of dictionaries where the format of each dictionary is as follows: {“column”: “col0”, “values”: [“value0”, “value1”, …]}. The key, “column” takes the name of the column, “col0”; the key, “values” takes the list of items, “[“value0”, “value1”]” to exclude from the recommendations. Let’s take an example where the input to exclude_items is [{“column”: “VehicleType”, “values”: [“SUV”, “Sedan”]}]. The resulting recommendation list will exclude all SUVs and Sedans. This is particularly useful if there’s a list of items that you know is of no use in some particular scenario and you don’t want to show those items present in that list.

Return type:

Dict

get_chat_response(deployment_token, deployment_id, messages, llm_name=None, num_completion_tokens=None, system_message=None, temperature=0.0, filter_key_values=None, search_score_cutoff=None, chat_config=None)

Return a chat response which continues the conversation based on the input messages and search results.

Parameters:
  • deployment_token (str) – The deployment token to authenticate access to created deployments. This token is only authorized to predict on deployments in this project, so it is safe to embed this model inside of an application or website.

  • deployment_id (str) – The unique identifier to a deployment created under the project.

  • messages (list) – A list of chronologically ordered messages, starting with a user message and alternating sources. A message is a dict with attributes: is_user (bool): Whether the message is from the user. text (str): The message’s text.

  • llm_name (str) – Name of the specific LLM backend to use to power the chat experience

  • num_completion_tokens (int) – Default for maximum number of tokens for chat answers

  • system_message (str) – The generative LLM system message

  • temperature (float) – The generative LLM temperature

  • filter_key_values (dict) – A dictionary mapping column names to a list of values to restrict the retrieved search results.

  • search_score_cutoff (float) – Cutoff for the document retriever score. Matching search results below this score will be ignored.

  • chat_config (dict) – A dictionary specifying the query chat config override.

Return type:

Dict
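
A sketch of a single-turn chat call, reusing the client from the predict() example above (the message text and cutoff value are illustrative):

    response = client.get_chat_response(
        TOKEN, DEPLOYMENT_ID,
        messages=[{'is_user': True,
                   'text': 'What does the refund policy say?'}],
        temperature=0.0,
        search_score_cutoff=0.3,   # ignore weak retrieval matches
    )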

get_chat_response_with_binary_data(deployment_token, deployment_id, messages, llm_name=None, num_completion_tokens=None, system_message=None, temperature=0.0, filter_key_values=None, search_score_cutoff=None, chat_config=None, attachments=None)

Return a chat response which continues the conversation based on the input messages and search results.

Parameters:
  • deployment_token (str) – The deployment token to authenticate access to created deployments. This token is only authorized to predict on deployments in this project, so it is safe to embed this model inside of an application or website.

  • deployment_id (str) – The unique identifier to a deployment created under the project.

  • messages (list) – A list of chronologically ordered messages, starting with a user message and alternating sources. A message is a dict with attributes: is_user (bool): Whether the message is from the user. text (str): The message’s text.

  • llm_name (str) – Name of the specific LLM backend to use to power the chat experience

  • num_completion_tokens (int) – Default for maximum number of tokens for chat answers

  • system_message (str) – The generative LLM system message

  • temperature (float) – The generative LLM temperature

  • filter_key_values (dict) – A dictionary mapping column names to a list of values to restrict the retrieved search results.

  • search_score_cutoff (float) – Cutoff for the document retriever score. Matching search results below this score will be ignored.

  • chat_config (dict) – A dictionary specifying the query chat config override.

  • attachments (dict) – A dictionary of binary data to use to answer the queries.

Return type:

Dict

get_conversation_response(deployment_id, message, deployment_token, deployment_conversation_id=None, external_session_id=None, llm_name=None, num_completion_tokens=None, system_message=None, temperature=0.0, filter_key_values=None, search_score_cutoff=None, chat_config=None, doc_infos=None)

Return a conversation response which continues the conversation based on the input message and the deployment conversation id (if it exists).

Parameters:
  • deployment_id (str) – The unique identifier to a deployment created under the project.

  • message (str) – A message from the user

  • deployment_token (str) – A token used to authenticate access to deployments created in this project. This token is only authorized to predict on deployments in this project, so it is safe to embed this model inside of an application or website.

  • deployment_conversation_id (str) – The unique identifier of a deployment conversation to continue. If not specified, a new one will be created.

  • external_session_id (str) – The user-supplied unique identifier of a deployment conversation to continue. If specified, we will use this instead of an internal deployment conversation id.

  • llm_name (str) – Name of the specific LLM backend to use to power the chat experience

  • num_completion_tokens (int) – Default for maximum number of tokens for chat answers

  • system_message (str) – The generative LLM system message

  • temperature (float) – The generative LLM temperature

  • filter_key_values (dict) – A dictionary mapping column names to a list of values to restrict the retrieved search results.

  • search_score_cutoff (float) – Cutoff for the document retriever score. Matching search results below this score will be ignored.

  • chat_config (dict) – A dictionary specifying the query chat config override.

  • doc_infos (list) – An optional list of documents to use for the conversation. A keyword ‘doc_id’ is expected to be present in each document for retrieving contents from the docstore.

Return type:

Dict
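
A sketch of a two-turn conversation, reusing the client from the predict() example above. The key under which the conversation ID is returned is assumed here; inspect the first response for the actual shape:

    first = client.get_conversation_response(
        DEPLOYMENT_ID, 'Summarize the Q3 report.', TOKEN)

    conv_id = first.get('deployment_conversation_id')  # key name assumed
    follow_up = client.get_conversation_response(
        DEPLOYMENT_ID, 'Now list the top three risks.', TOKEN,
        deployment_conversation_id=conv_id)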

get_conversation_response_with_binary_data(deployment_id, deployment_token, message, deployment_conversation_id=None, external_session_id=None, llm_name=None, num_completion_tokens=None, system_message=None, temperature=0.0, filter_key_values=None, search_score_cutoff=None, chat_config=None, attachments=None)

Return a conversation response which continues the conversation based on the input message and the deployment conversation id (if it exists).

Parameters:
  • deployment_id (str) – The unique identifier to a deployment created under the project.

  • deployment_token (str) – A token used to authenticate access to deployments created in this project. This token is only authorized to predict on deployments in this project, so it is safe to embed this model inside of an application or website.

  • message (str) – A message from the user

  • deployment_conversation_id (str) – The unique identifier of a deployment conversation to continue. If not specified, a new one will be created.

  • external_session_id (str) – The user-supplied unique identifier of a deployment conversation to continue. If specified, we will use this instead of an internal deployment conversation id.

  • llm_name (str) – Name of the specific LLM backend to use to power the chat experience

  • num_completion_tokens (int) – Default for maximum number of tokens for chat answers

  • system_message (str) – The generative LLM system message

  • temperature (float) – The generative LLM temperature

  • filter_key_values (dict) – A dictionary mapping column names to a list of values to restrict the retrieved search results.

  • search_score_cutoff (float) – Cutoff for the document retriever score. Matching search results below this score will be ignored.

  • chat_config (dict) – A dictionary specifying the query chat config override.

  • attachments (dict) – A dictionary of binary data to use to answer the queries.

Return type:

Dict

get_search_results(deployment_token, deployment_id, query_data, num=15)

Return the most relevant search results to the search query from the uploaded documents.

Parameters:
  • deployment_token (str) – A token used to authenticate access to created deployments. This token is only authorized to predict on deployments in this project, so it can be securely embedded in an application or website.

  • deployment_id (str) – A unique identifier of a deployment created under the project.

  • query_data (dict) – A dictionary where the key is “Content” and the value is the search query text.

  • num (int) – Number of search results to return.

Return type:

Dict

get_sentiment(deployment_token, deployment_id, document)

Predicts sentiment on a document

Parameters:
  • deployment_token (str) – A token used to authenticate access to deployments created in this project. This token is only authorized to predict on deployments in this project, so it is safe to embed this model inside of an application or website.

  • deployment_id (str) – A unique string identifier for a deployment created under this project.

  • document (str) – The document to be analyzed for sentiment.

Return type:

Dict
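
A minimal sketch, with placeholder credentials and an illustrative document string:

    from abacusai import ApiClient

    client = ApiClient('YOUR_API_KEY')  # placeholder API key
    sentiment = client.get_sentiment(
        deployment_token='DEPLOYMENT_TOKEN',
        deployment_id='DEPLOYMENT_ID',
        document='The onboarding flow was quick and painless.',
    )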

get_entailment(deployment_token, deployment_id, document)

Predicts the entailment of the document

Parameters:
  • deployment_token (str) – The deployment token used to authenticate access to created deployments. This token is only authorized to predict on deployments in this project, so it is safe to embed this model inside of an application or website.

  • deployment_id (str) – A unique string identifier for the deployment created under the project.

  • document (str) – The document to be classified.

Return type:

Dict

get_classification(deployment_token, deployment_id, document)

Predicts the classification of the document

Parameters:
  • deployment_token (str) – The deployment token used to authenticate access to created deployments. This token is only authorized to predict on deployments in this project, so it is safe to embed this model inside of an application or website.

  • deployment_id (str) – A unique string identifier for the deployment created under the project.

  • document (str) – The document to be classified.

Return type:

Dict

get_summary(deployment_token, deployment_id, query_data)

Returns a JSON of the predicted summary for the given document. Note that the inputs to this method, wherever applicable, are the column names in your dataset mapped to the column mappings in our system (e.g. the column ‘text’ mapped to the mapping ‘DOCUMENT’).

Parameters:
  • deployment_token (str) – The deployment token to authenticate access to created deployments. This token is only authorized to predict on deployments in this project, so it is safe to embed this model inside of an application or website.

  • deployment_id (str) – The unique identifier to a deployment created under the project.

  • query_data (dict) – Raw data dictionary containing the required document data; it must have a key ‘document’ whose value is DOCUMENT-type text.

Return type:

Dict

predict_language(deployment_token, deployment_id, query_data)

Predicts the language of the text

Parameters:
  • deployment_token (str) – The deployment token used to authenticate access to created deployments. This token is only authorized to predict on deployments within this project, making it safe to embed this model in an application or website.

  • deployment_id (str) – A unique string identifier for a deployment created under the project.

  • query_data (str) – The input string whose language should be detected.

Return type:

Dict
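
Note that query_data here is the raw input string itself. A minimal sketch with placeholder credentials:

    from abacusai import ApiClient

    client = ApiClient('YOUR_API_KEY')  # placeholder API key
    result = client.predict_language(
        deployment_token='DEPLOYMENT_TOKEN',
        deployment_id='DEPLOYMENT_ID',
        query_data='Le chat dort sur le canapé.',  # text whose language is detected
    )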

get_assignments(deployment_token, deployment_id, query_data, forced_assignments=None, solve_time_limit_seconds=None, include_all_assignments=False)

Get all positive assignments that match a query.

Parameters:
  • deployment_token (str) – The deployment token used to authenticate access to created deployments. This token is only authorized to predict on deployments in this project, so it can be safely embedded in an application or website.

  • deployment_id (str) – The unique identifier of a deployment created under the project.

  • query_data (dict) – Specifies the set of assignments being requested. The value for each key can be: (1) a simple scalar value, which is matched exactly; (2) a list of values, which matches any element in the list; or (3) a dictionary with keys lower_in/lower_ex and upper_in/upper_ex, which matches values in an inclusive/exclusive range.

  • forced_assignments (dict) – Set of assignments to force and resolve before returning query results.

  • solve_time_limit_seconds (float) – Maximum time in seconds to spend solving the query.

  • include_all_assignments (bool) – If True, will return all assignments, including assignments with value 0. Default is False.

Return type:

Dict
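
The sketch below illustrates the three query_data forms described above; all column names and values are hypothetical, and the credentials are placeholders:

    from abacusai import ApiClient

    client = ApiClient('YOUR_API_KEY')  # placeholder API key
    assignments = client.get_assignments(
        deployment_token='DEPLOYMENT_TOKEN',
        deployment_id='DEPLOYMENT_ID',
        query_data={
            'driver_id': 'D-102',                            # (1) exact scalar match
            'region': ['north', 'east'],                     # (2) matches any listed value
            'shift_start': {'lower_in': 8, 'upper_ex': 12},  # (3) range: 8 <= x < 12
        },
        solve_time_limit_seconds=30.0,
    )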

get_alternative_assignments(deployment_token, deployment_id, query_data, add_constraints=None, solve_time_limit_seconds=None, best_alternate_only=False)

Get alternative positive assignments for given query. Optimal assignments are ignored and the alternative assignments are returned instead.

Parameters:
  • deployment_token (str) – The deployment token used to authenticate access to created deployments. This token is only authorized to predict on deployments in this project, so it can be safely embedded in an application or website.

  • deployment_id (str) – The unique identifier of a deployment created under the project.

  • query_data (dict) – Specifies the set of assignments being requested. The value for each key can be: (1) a simple scalar value, which is matched exactly; (2) a list of values, which matches any element in the list; or (3) a dictionary with keys lower_in/lower_ex and upper_in/upper_ex, which matches values in an inclusive/exclusive range.

  • add_constraints (list) – List of constraint dicts to apply to the query. Each constraint dict should have the following keys: (1) query (dict): specifies the set of assignment variables involved in the constraint, in the same format as query_data; (2) operator (str): the constraint operator, ‘=’, ‘<=’, or ‘>=’; (3) constant (int): the constraint RHS constant value; (4) coefficient_column (str): the column in the Assignment feature group to be used as the coefficient for the assignment variables; optional, defaults to 1.

  • solve_time_limit_seconds (float) – Maximum time in seconds to spend solving the query.

  • best_alternate_only (bool) – When True, only the best alternate will be returned; when False, multiple alternates are returned.

Return type:

Dict

check_constraints(deployment_token, deployment_id, query_data)

Check for any constraints violated by the overrides.

Parameters:
  • deployment_token (str) – The deployment token used to authenticate access to created deployments. This token is only authorized to predict on deployments in this project, so it is safe to embed this model within an application or website.

  • deployment_id (str) – The unique identifier for a deployment created under the project.

  • query_data (dict) – Assignment overrides to the solution.

Return type:

Dict

predict_with_binary_data(deployment_token, deployment_id, blob)

Make predictions for a given blob, e.g. an image or audio file.

Parameters:
  • deployment_token (str) – A token used to authenticate access to created deployments. This token is only authorized to predict on deployments in this project, so it is safe to embed this model in an application or website.

  • deployment_id (str) – A unique identifier to a deployment created under the project.

  • blob (io.TextIOBase) – The binary data to predict on, uploaded as multipart/form-data.

Return type:

Dict
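
A minimal sketch; the file path and credentials are placeholders, and the blob is passed as an open binary file handle:

    from abacusai import ApiClient

    client = ApiClient('YOUR_API_KEY')  # placeholder API key
    with open('sample.png', 'rb') as blob:  # placeholder local image, opened in binary mode
        prediction = client.predict_with_binary_data(
            deployment_token='DEPLOYMENT_TOKEN',
            deployment_id='DEPLOYMENT_ID',
            blob=blob,
        )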

describe_image(deployment_token, deployment_id, image, categories, top_n=None)

Describe the similarity between an image and a list of categories.

Parameters:
  • deployment_token (str) – Authentication token to access created deployments. This token is only authorized to predict on deployments in the current project, and can be safely embedded in an application or website.

  • deployment_id (str) – Unique identifier of a deployment created under the project.

  • image (io.TextIOBase) – Image to describe.

  • categories (list) – List of candidate categories to compare with the image.

  • top_n (int) – Return the N most similar categories.

Return type:

Dict

get_text_from_document(deployment_token, deployment_id, document=None, adjust_doc_orientation=False, save_predicted_pdf=False, save_extracted_features=False)

Generate text from a document

Parameters:
  • deployment_token (str) – Authentication token to access created deployments. This token is only authorized to predict on deployments in the current project, and can be safely embedded in an application or website.

  • deployment_id (str) – Unique identifier of a deployment created under the project.

  • document (io.TextIOBase) – The input document, which can be an image, PDF, or Word document (some formats might not be supported yet).

  • adjust_doc_orientation (bool) – (Optional) Whether to detect the document page orientation and rotate it if needed.

  • save_predicted_pdf (bool) – (Optional) If True, will save the predicted pdf bytes so that they can be fetched using the prediction docId. Default is False.

  • save_extracted_features (bool) – (Optional) If True, will save extracted features (i.e. page tokens) so that they can be fetched using the prediction docId. Default is False.

Return type:

Dict
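
A minimal sketch with placeholder credentials and a placeholder local PDF:

    from abacusai import ApiClient

    client = ApiClient('YOUR_API_KEY')  # placeholder API key
    with open('invoice.pdf', 'rb') as document:  # placeholder local file
        result = client.get_text_from_document(
            deployment_token='DEPLOYMENT_TOKEN',
            deployment_id='DEPLOYMENT_ID',
            document=document,
            adjust_doc_orientation=True,  # auto-rotate pages if needed
            save_predicted_pdf=True,      # keep the predicted PDF for later fetching by docId
        )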

transcribe_audio(deployment_token, deployment_id, audio)

Transcribe the audio

Parameters:
  • deployment_token (str) – The deployment token to authenticate access to created deployments. This token is only authorized to make predictions on deployments in this project, so it can be safely embedded in an application or website.

  • deployment_id (str) – The unique identifier of a deployment created under the project.

  • audio (io.TextIOBase) – The audio to transcribe.

Return type:

Dict

classify_image(deployment_token, deployment_id, image=None, doc_id=None)

Classify an image.

Parameters:
  • deployment_token (str) – A deployment token to authenticate access to created deployments. This token is only authorized to predict on deployments in this project, so it is safe to embed this model inside of an application or website.

  • deployment_id (str) – A unique string identifier to a deployment created under the project.

  • image (io.TextIOBase) – The binary data of the image to classify. One of image or doc_id must be specified.

  • doc_id (str) – The document ID of the image. One of image or doc_id must be specified.

Return type:

Dict
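
A minimal sketch, passing the image as binary data (credentials and file path are placeholders):

    from abacusai import ApiClient

    client = ApiClient('YOUR_API_KEY')  # placeholder API key
    with open('photo.jpg', 'rb') as image:  # placeholder local image
        result = client.classify_image(
            deployment_token='DEPLOYMENT_TOKEN',
            deployment_id='DEPLOYMENT_ID',
            image=image,  # or omit image and pass doc_id='...' instead
        )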

classify_pdf(deployment_token, deployment_id, pdf=None)

Returns a classification prediction from a PDF

Parameters:
  • deployment_token (str) – The deployment token to authenticate access to created deployments. This token is only authorized to predict on deployments in this project, so it is safe to embed this model within an application or website.

  • deployment_id (str) – The unique identifier for a deployment created under the project.

  • pdf (io.TextIOBase) – (Optional) The PDF to predict on. One of pdf or docId must be specified.

Return type:

Dict

get_cluster(deployment_token, deployment_id, query_data)

Predicts the cluster for given data.

Parameters:
  • deployment_token (str) – The deployment token used to authenticate access to created deployments. This token is only authorized to predict on deployments in this project, so it is safe to embed this model inside of an application or website.

  • deployment_id (str) – A unique string identifier for the deployment created under the project.

  • query_data (dict) – A dictionary where each ‘key’ represents a column name and its corresponding ‘value’ represents the value of that column. For Timeseries Clustering, the ‘key’ should be ITEM_ID, and its value should represent a unique item ID that needs clustering.

Return type:

Dict

get_objects_from_image(deployment_token, deployment_id, image)

Detect objects in an image.

Parameters:
  • deployment_token (str) – A deployment token to authenticate access to created deployments. This token is only authorized to predict on deployments in this project, so it is safe to embed this model inside of an application or website.

  • deployment_id (str) – A unique string identifier to a deployment created under the project.

  • image (io.TextIOBase) – The binary data of the image to detect objects from.

Return type:

Dict

score_image(deployment_token, deployment_id, image)

Score an image.

Parameters:
  • deployment_token (str) – A deployment token to authenticate access to created deployments. This token is only authorized to predict on deployments in this project, so it is safe to embed this model inside of an application or website.

  • deployment_id (str) – A unique string identifier to a deployment created under the project.

  • image (io.TextIOBase) – The binary data of the image to score.

Return type:

Dict

transfer_style(deployment_token, deployment_id, source_image, style_image)

Change the source image to adopt the visual style from the style image.

Parameters:
  • deployment_token (str) – A token used to authenticate access to created deployments. This token is only authorized to predict on deployments in this project, so it is safe to embed this model in an application or website.

  • deployment_id (str) – A unique identifier to a deployment created under the project.

  • source_image (io.TextIOBase) – The source image to apply the style to.

  • style_image (io.TextIOBase) – The image that has the style as a reference.

Return type:

io.BytesIO

generate_image(deployment_token, deployment_id, query_data)

Generate an image from text prompt.

Parameters:
  • deployment_token (str) – The deployment token used to authenticate access to created deployments. This token is only authorized to predict on deployments in this project, so it is safe to embed this model within an application or website.

  • deployment_id (str) – A unique identifier to a deployment created under the project.

  • query_data (dict) – Specifies the text prompt. For example, {‘prompt’: ‘a cat’}

Return type:

io.BytesIO
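
A minimal sketch, reusing the {‘prompt’: ‘a cat’} example from above and writing the returned bytes to disk (credentials and output path are placeholders):

    from abacusai import ApiClient

    client = ApiClient('YOUR_API_KEY')  # placeholder API key
    image_bytes = client.generate_image(
        deployment_token='DEPLOYMENT_TOKEN',
        deployment_id='DEPLOYMENT_ID',
        query_data={'prompt': 'a cat'},
    )
    with open('generated.png', 'wb') as f:
        f.write(image_bytes.getvalue())  # the return value is an io.BytesIO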

execute_agent(deployment_token, deployment_id, arguments=None, keyword_arguments=None)

Executes a deployed AI agent function using the arguments as keyword arguments to the agent execute function.

Parameters:
  • deployment_token (str) – The deployment token used to authenticate access to created deployments. This token is only authorized to predict on deployments in this project, so it is safe to embed this model inside of an application or website.

  • deployment_id (str) – A unique string identifier for the deployment created under the project.

  • arguments (list) – Positional arguments to the agent execute function.

  • keyword_arguments (dict) – A dictionary where each ‘key’ represents the parameter name and its corresponding ‘value’ represents the value of that parameter for the agent execute function.

Return type:

Dict
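
A minimal sketch; the agent parameter name is hypothetical and depends on the deployed agent's execute function signature:

    from abacusai import ApiClient

    client = ApiClient('YOUR_API_KEY')  # placeholder API key
    result = client.execute_agent(
        deployment_token='DEPLOYMENT_TOKEN',
        deployment_id='DEPLOYMENT_ID',
        keyword_arguments={'ticket_text': 'My order arrived damaged.'},  # hypothetical agent parameter
    )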

get_matrix_agent_schema(deployment_token, deployment_id, query, doc_infos=None, deployment_conversation_id=None, external_session_id=None)

Retrieves the matrix agent schema for a deployed AI agent, initialized with the given user query.

Parameters:
  • deployment_token (str) – The deployment token used to authenticate access to created deployments. This token is only authorized to predict on deployments in this project, so it is safe to embed this model inside of an application or website.

  • deployment_id (str) – A unique string identifier for the deployment created under the project.

  • query (str) – User input query to initialize the matrix computation.

  • doc_infos (list) – An optional list of documents to use for constructing the matrix. A key ‘doc_id’ is expected to be present in each document for retrieving contents from the docstore.

  • deployment_conversation_id (str) – A unique string identifier for the deployment conversation used for the conversation.

  • external_session_id (str) – A unique string identifier for the session used for the conversation. If both deployment_conversation_id and external_session_id are not provided, a new session will be created.

Return type:

Dict

execute_conversation_agent(deployment_token, deployment_id, arguments=None, keyword_arguments=None, deployment_conversation_id=None, external_session_id=None, regenerate=False, doc_infos=None, agent_workflow_node_id=None)

Executes a deployed AI agent function using the arguments as keyword arguments to the agent execute function.

Parameters:
  • deployment_token (str) – The deployment token used to authenticate access to created deployments. This token is only authorized to predict on deployments in this project, so it is safe to embed this model inside of an application or website.

  • deployment_id (str) – A unique string identifier for the deployment created under the project.

  • arguments (list) – Positional arguments to the agent execute function.

  • keyword_arguments (dict) – A dictionary where each ‘key’ represents the parameter name and its corresponding ‘value’ represents the value of that parameter for the agent execute function.

  • deployment_conversation_id (str) – A unique string identifier for the deployment conversation used for the conversation.

  • external_session_id (str) – A unique string identifier for the session used for the conversation. If both deployment_conversation_id and external_session_id are not provided, a new session will be created.

  • regenerate (bool) – If True, will regenerate the response from the last query.

  • doc_infos (list) – An optional list of documents to use for the conversation. A key ‘doc_id’ is expected to be present in each document for retrieving contents from the docstore.

  • agent_workflow_node_id (str) – An optional agent workflow node id to trigger agent execution from an intermediate node.

Return type:

Dict

lookup_matches(deployment_token, deployment_id, data=None, filters=None, num=None, result_columns=None, max_words=None, num_retrieval_margin_words=None, max_words_per_chunk=None, score_multiplier_column=None, min_score=None, required_phrases=None, filter_clause=None, crowding_limits=None, include_text_search=False)

Look up the document retriever deployed under the given deployment and return the documents matching the query.

Original documents are split into chunks and stored in the document retriever. This lookup function returns the relevant chunks from the document retriever. The returned chunks may be expanded to include more words from the original documents and merged if they overlap, where permitted by the provided settings. The returned chunks are sorted by relevance.

Parameters:
  • deployment_token (str) – The deployment token used to authenticate access to created deployments. This token is only authorized to predict on deployments within this project, making it safe to embed this model in an application or website.

  • deployment_id (str) – A unique string identifier for the deployment created under the project.

  • data (str) – The query to search for.

  • filters (dict) – A dictionary mapping column names to a list of values to restrict the retrieved search results.

  • num (int) – If provided, will limit the number of results to the value specified.

  • result_columns (list) – If provided, will limit the column properties present in each result to those specified in this list.

  • max_words (int) – If provided, will limit the total number of words in the results to the value specified.

  • num_retrieval_margin_words (int) – If provided, will add this number of words from left and right of the returned chunks.

  • max_words_per_chunk (int) – If provided, will limit the number of words in each chunk to the value specified. If the value provided is smaller than the actual chunk size on disk, which is determined during document retriever creation, the actual chunk size will be used; that is, chunks looked up from document retrievers will not be split into smaller chunks during lookup due to this setting.

  • score_multiplier_column (str) – If provided, will use the values in this column to modify the relevance score of the returned chunks. Values in this column must be numeric.

  • min_score (float) – If provided, will filter out the results with score less than the value specified.

  • required_phrases (list) – If provided, each result will contain at least one of the phrases in the given list. The matching is whitespace and case insensitive.

  • filter_clause (str) – If provided, filter the results of the query using this SQL WHERE clause.

  • crowding_limits (dict) – A dictionary mapping metadata columns to the maximum number of results per unique value of the column. This is used to ensure diversity of metadata attribute values in the results. If a particular attribute value has already reached its maximum count, further results with that same attribute value will be excluded from the final result set. An entry in the map can also be a map specifying the limit per attribute value rather than a single limit for all values. This allows a per value limit for attributes. If an attribute value is not present in the map its limit defaults to zero.

  • include_text_search (bool) – If true, combine the ranking of results from a BM25 text search over the documents with the vector search using reciprocal rank fusion. It leverages both lexical and semantic matching for better overall results. It’s particularly valuable in professional, technical, or specialized fields where both precision in terminology and understanding of context are important.

Returns:

The relevant documentation results found from the document retriever.

Return type:

list[DocumentRetrieverLookupResult]
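
A minimal sketch exercising a few of the options above; the metadata column name, credentials, and thresholds are all placeholders:

    from abacusai import ApiClient

    client = ApiClient('YOUR_API_KEY')  # placeholder API key
    matches = client.lookup_matches(
        deployment_token='DEPLOYMENT_TOKEN',
        deployment_id='DEPLOYMENT_ID',
        data='how do I rotate my API keys?',
        num=10,                          # cap the number of returned chunks
        filters={'product': ['cloud']},  # hypothetical metadata column
        min_score=0.2,                   # drop low-relevance chunks
        include_text_search=True,        # fuse BM25 ranking with the vector search
    )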

get_completion(deployment_token, deployment_id, prompt)

Returns the completion of the prompt, as generated by the fine-tuned LLM.

Parameters:
  • deployment_token (str) – The deployment token to authenticate access to created deployments. This token is only authorized to predict on deployments in this project, so it is safe to embed this model inside of an application or website.

  • deployment_id (str) – The unique identifier to a deployment created under the project.

  • prompt (str) – The prompt given to the finetuned LLM to generate the completion.

Return type:

Dict
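
A minimal sketch with placeholder credentials and an illustrative prompt:

    from abacusai import ApiClient

    client = ApiClient('YOUR_API_KEY')  # placeholder API key
    completion = client.get_completion(
        deployment_token='DEPLOYMENT_TOKEN',
        deployment_id='DEPLOYMENT_ID',
        prompt='Summarize the key clauses of this contract:',
    )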

execute_agent_with_binary_data(deployment_token, deployment_id, arguments=None, keyword_arguments=None, deployment_conversation_id=None, external_session_id=None, blobs=None)

Executes a deployed AI agent function with binary data as inputs.

Parameters:
  • deployment_token (str) – The deployment token used to authenticate access to created deployments. This token is only authorized to predict on deployments in this project, so it is safe to embed this model inside of an application or website.

  • deployment_id (str) – A unique string identifier for the deployment created under the project.

  • arguments (list) – Positional arguments to the agent execute function.

  • keyword_arguments (dict) – A dictionary where each ‘key’ represents the parameter name and its corresponding ‘value’ represents the value of that parameter for the agent execute function.

  • deployment_conversation_id (str) – A unique string identifier for the deployment conversation used for the conversation.

  • external_session_id (str) – A unique string identifier for the session used for the conversation. If both deployment_conversation_id and external_session_id are not provided, a new session will be created.

  • blobs (dict) – A dictionary of binary data to use as inputs to the agent execute function.

Returns:

The result of the agent execution

Return type:

AgentDataExecutionResult

start_autonomous_agent(deployment_token, deployment_id, deployment_conversation_id=None, arguments=None, keyword_arguments=None, save_conversations=True)

Starts a deployed Autonomous agent associated with the given deployment_conversation_id, using the arguments and keyword arguments as inputs for the execute function of the trigger node.

Parameters:
  • deployment_token (str) – The deployment token used to authenticate access to created deployments. This token is only authorized to predict on deployments in this project, making it safe to embed this model in an application or website.

  • deployment_id (str) – A unique string identifier for the deployment created under the project.

  • deployment_conversation_id (str) – A unique string identifier for the deployment conversation used for the conversation.

  • arguments (list) – Positional arguments to the agent execute function.

  • keyword_arguments (dict) – A dictionary where each ‘key’ represents the parameter name and its corresponding ‘value’ represents the value of that parameter for the agent execute function.

  • save_conversations (bool) – If True, a new conversation will be created for every run of the workflow associated with the agent.

Return type:

Dict

pause_autonomous_agent(deployment_token, deployment_id, deployment_conversation_id)

Pauses a deployed Autonomous agent associated with the given deployment_conversation_id.

Parameters:
  • deployment_token (str) – The deployment token used to authenticate access to created deployments. This token is only authorized to predict on deployments in this project, making it safe to embed this model in an application or website.

  • deployment_id (str) – A unique string identifier for the deployment created under the project.

  • deployment_conversation_id (str) – A unique string identifier for the deployment conversation used for the conversation.

Return type:

Dict

class abacusai.PredictionDataset(client, datasetId=None, datasetType=None, datasetVersion=None, default=None, required=None)

Bases: abacusai.return_class.AbstractApiClass

Batch Input Datasets

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • datasetId (str) – The unique identifier of the dataset

  • datasetType (str) – dataset type

  • datasetVersion (str) – The unique identifier of the dataset version used for predictions

  • default (bool) – If true, this dataset is the default dataset in the model

  • required (bool) – If true, this dataset is required for the batch prediction

dataset_id
dataset_type
dataset_version
default
required
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.PredictionFeatureGroup(client, featureGroupId=None, featureGroupVersion=None, datasetType=None, default=None, required=None)

Bases: abacusai.return_class.AbstractApiClass

Batch Input Feature Group

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • featureGroupId (str) – The unique identifier of the feature group

  • featureGroupVersion (str) – The unique identifier of the feature group version used for predictions

  • datasetType (str) – dataset type

  • default (bool) – If true, this feature group is the default feature group in the model

  • required (bool) – If true, this feature group is required for the batch prediction

feature_group_id
feature_group_version
dataset_type
default
required
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.PredictionInput(client, featureGroupDatasetIds=None, datasetIdRemap=None, featureGroups={}, datasets={})

Bases: abacusai.return_class.AbstractApiClass

Batch inputs

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • featureGroupDatasetIds (list) – The list of dataset IDs to use as input

  • datasetIdRemap (dict) – Replacement datasets to swap as prediction input

  • featureGroups (PredictionFeatureGroup) – List of prediction feature groups

  • datasets (PredictionDataset) – List of prediction datasets

feature_group_dataset_ids
dataset_id_remap
feature_groups
datasets
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.PredictionLogRecord(client, requestId=None, query=None, queryTimeMs=None, timestampMs=None, response=None)

Bases: abacusai.return_class.AbstractApiClass

A Record for a prediction request log.

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • requestId (str) – The unique identifier of the prediction request.

  • query (dict) – The query used to make the prediction.

  • queryTimeMs (int) – The time taken to make the prediction, in milliseconds.

  • timestampMs (str) – The timestamp of the prediction request, in milliseconds.

  • response (dict) – The prediction response.

request_id
query
query_time_ms
timestamp_ms
response
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.PredictionOperator(client, name=None, predictionOperatorId=None, createdAt=None, updatedAt=None, projectId=None, predictFunctionName=None, sourceCode=None, initializeFunctionName=None, notebookId=None, memory=None, useGpu=None, featureGroupIds=None, featureGroupTableNames=None, codeSource={}, refreshSchedules={}, latestPredictionOperatorVersion={})

Bases: abacusai.return_class.AbstractApiClass

A prediction operator.

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • name (str) – The name for the prediction operator.

  • predictionOperatorId (str) – The unique identifier of the prediction operator.

  • createdAt (str) – Date and time at which the prediction operator was created.

  • updatedAt (str) – Date and time at which the prediction operator was updated.

  • projectId (str) – The project this prediction operator belongs to.

  • predictFunctionName (str) – Name of the function found in the source code that will be executed to run predictions.

  • sourceCode (str) – Python code used to make the prediction operator.

  • initializeFunctionName (str) – Name of the optional initialize function found in the source code. This function will generate anything used by predictions, based on input feature groups.

  • notebookId (str) – The unique string identifier of the notebook used to create or edit the prediction operator.

  • memory (int) – Memory in GB specified for the prediction operator.

  • useGpu (bool) – Whether this prediction operator is using gpu.

  • featureGroupIds (list) – A list of Feature Group IDs used for initializing.

  • featureGroupTableNames (list) – A list of Feature Group table names used for initializing.

  • codeSource (CodeSource) – If a python model, information on the source code.

  • latestPredictionOperatorVersion (PredictionOperatorVersion) – The unique string identifier of the latest version.

  • refreshSchedules (RefreshSchedule) – List of refresh schedules that indicate when the next prediction operator version will be processed

name
prediction_operator_id
created_at
updated_at
project_id
predict_function_name
source_code
initialize_function_name
notebook_id
memory
use_gpu
feature_group_ids
feature_group_table_names
code_source
refresh_schedules
latest_prediction_operator_version
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

refresh()

Calls describe and refreshes the current object’s fields

Returns:

The current object

Return type:

PredictionOperator

describe()

Describe an existing prediction operator.

Parameters:

prediction_operator_id (str) – The unique ID of the prediction operator.

Returns:

The requested prediction operator object.

Return type:

PredictionOperator

update(name=None, feature_group_ids=None, source_code=None, initialize_function_name=None, predict_function_name=None, cpu_size=None, memory=None, package_requirements=None, use_gpu=None)

Update an existing prediction operator. This does not create a new version.

Parameters:
  • name (str) – Name of the prediction operator.

  • feature_group_ids (List) – List of feature groups that are supplied to the initialize function as parameters. Each of the parameters are materialized Dataframes. The order should match the initialize function’s parameters.

  • source_code (str) – Contents of a valid Python source code file. The source code should contain the function predictFunctionName, and the function ‘initializeFunctionName’ if defined.

  • initialize_function_name (str) – Name of the optional initialize function found in the source code. This function will generate anything used by predictions, based on input feature groups.

  • predict_function_name (str) – Name of the function found in the source code that will be executed to run predictions.

  • cpu_size (str) – Size of the CPU for the prediction operator.

  • memory (int) – Memory (in GB) for the prediction operator.

  • package_requirements (list) – List of package requirement strings. For example: [‘numpy==1.2.3’, ‘pandas>=1.4.0’]

  • use_gpu (bool) – Whether this prediction operator needs gpu.

Returns:

The updated prediction operator object.

Return type:

PredictionOperator

delete()

Delete an existing prediction operator.

Parameters:

prediction_operator_id (str) – The unique ID of the prediction operator.

deploy(auto_deploy=True)

Deploy the prediction operator.

Parameters:

auto_deploy (bool) – Flag to enable the automatic deployment when a new prediction operator version is created.

Returns:

The created deployment object.

Return type:

Deployment

create_version()

Create a new version of the prediction operator.

Parameters:

prediction_operator_id (str) – The unique ID of the prediction operator.

Returns:

The created prediction operator version object.

Return type:

PredictionOperatorVersion

list_versions()

List all the prediction operator versions for a prediction operator.

Parameters:

prediction_operator_id (str) – The unique ID of the prediction operator.

Returns:

A list of prediction operator version objects.

Return type:

list[PredictionOperatorVersion]
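
A sketch of a typical lifecycle for the methods above. describe_prediction_operator is assumed to be the client-level counterpart of describe(); the key and ID are placeholders:

    from abacusai import ApiClient

    client = ApiClient('YOUR_API_KEY')  # placeholder API key
    operator = client.describe_prediction_operator('PREDICTION_OPERATOR_ID')  # assumed client helper
    operator.update(memory=16)                      # adjust resources; does not create a new version
    version = operator.create_version()             # snapshot the current code as a new version
    deployment = operator.deploy(auto_deploy=True)  # returns a Deployment object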

class abacusai.PredictionOperatorVersion(client, predictionOperatorId=None, predictionOperatorVersion=None, createdAt=None, updatedAt=None, sourceCode=None, memory=None, useGpu=None, featureGroupIds=None, featureGroupVersions=None, status=None, error=None, codeSource={})

Bases: abacusai.return_class.AbstractApiClass

A prediction operator version.

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • predictionOperatorId (str) – The unique identifier of the prediction operator.

  • predictionOperatorVersion (str) – The unique identifier of the prediction operator version.

  • createdAt (str) – Date and time at which the prediction operator was created.

  • updatedAt (str) – Date and time at which the prediction operator was updated.

  • sourceCode (str) – Python code used to make the prediction operator.

  • memory (int) – Memory in GB specified for the prediction operator version.

  • useGpu (bool) – Whether this prediction operator version is using gpu.

  • featureGroupIds (list) – A list of Feature Group IDs used for initializing.

  • featureGroupVersions (list) – A list of Feature Group version IDs used for initializing.

  • status (str) – The current status of the prediction operator version.

  • error (str) – The error message if the status failed.

  • codeSource (CodeSource) – If a python model, information on the source code.

prediction_operator_id
prediction_operator_version
created_at
updated_at
source_code
memory
use_gpu
feature_group_ids
feature_group_versions
status
error
code_source
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

delete()

Delete a prediction operator version.

Parameters:

prediction_operator_version (str) – The unique ID of the prediction operator version.

class abacusai.ProblemType(client, problemType=None, requiredFeatureGroupType=None, optionalFeatureGroupTypes=None, useCasesSupportCustomAlgorithm=None)

Bases: abacusai.return_class.AbstractApiClass

Description of a problem type which is the common underlying problem for different use cases.

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • problemType (str) – Name of the problem type

  • requiredFeatureGroupType (str) – The required feature group types to train for this problem type

  • optionalFeatureGroupTypes (list[str]) – The optional feature group types can be used to train for this problem type

  • useCasesSupportCustomAlgorithm (list) – A list of use cases that support custom algorithms

problem_type
required_feature_group_type
optional_feature_group_types
use_cases_support_custom_algorithm
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.Project(client, projectId=None, name=None, useCase=None, problemType=None, createdAt=None, tags=None)

Bases: abacusai.return_class.AbstractApiClass

A project is a container which holds datasets, models and deployments

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • projectId (str) – The ID of the project.

  • name (str) – The name of the project.

  • useCase (str) – The use case associated with the project.

  • problemType (str) – The problem type associated with the project.

  • createdAt (str) – The date and time when the project was created.

  • tags (list[str]) – List of tags associated with the project.

project_id
name
use_case
problem_type
created_at
tags
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

refresh()

Calls describe and refreshes the current object’s fields

Returns:

The current object

Return type:

Project

describe()

Returns a description of a project.

Parameters:

project_id (str) – A unique string identifier for the project.

Returns:

The description of the project.

Return type:

Project

rename(name)

This method renames a project after it is created.

Parameters:

name (str) – The new name for the project.

delete()

Delete a specified project from your organization.

This method deletes the project, its associated trained models, and deployments. The datasets attached to the specified project remain available for use with other projects in the organization.

This method will not delete a project that contains active deployments. Ensure that all active deployments are stopped before using the delete option.

Note: Projects, models, and deployments cannot be recovered once they are deleted.

Parameters:

project_id (str) – The unique ID of the project to delete.

add_tags(tags)

This method adds tags to a project.

Parameters:

tags (list) – The tags to add to the project.

remove_tags(tags)

This method removes tags from a project.

Parameters:

tags (list) – The tags to remove from the project.

set_feature_mapping(feature_group_id, feature_name, feature_mapping=None, nested_column_name=None)

Set a column’s feature mapping. If the column mapping is single-use and already set in another column in this feature group, this call will first remove the other column’s mapping and move it to this column.

Parameters:
  • feature_group_id (str) – The unique ID associated with the feature group.

  • feature_name (str) – The name of the feature.

  • feature_mapping (str) – The mapping of the feature in the feature group.

  • nested_column_name (str) – The name of the nested column if the input feature is part of a nested feature group for the given feature_group_id.

Returns:

A list of objects that describes the resulting feature group’s schema after the feature’s featureMapping is set.

Return type:

list[Feature]
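
A minimal sketch, reusing the ‘text’ → ‘DOCUMENT’ mapping example from earlier in this reference; describe_project is assumed to be the client-level lookup, and the key and IDs are placeholders:

    from abacusai import ApiClient

    client = ApiClient('YOUR_API_KEY')  # placeholder API key
    project = client.describe_project('PROJECT_ID')  # assumed client helper
    schema = project.set_feature_mapping(
        feature_group_id='FEATURE_GROUP_ID',
        feature_name='text',         # column in the feature group
        feature_mapping='DOCUMENT',  # feature mapping in the system
    )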

validate(feature_group_ids=None)

Validates that the specified project has all required feature group types for its use case and that all required feature columns are set.

Parameters:

feature_group_ids (List) – The list of feature group IDs to validate.

Returns:

The project validation. If the specified project is missing required columns or feature groups, the response includes an array of objects for each missing required feature group and the missing required features in each feature group.

Return type:

ProjectValidation

infer_feature_mappings(feature_group_id)

Infer the feature mappings for the feature group in the project based on the problem type.

Parameters:

feature_group_id (str) – The unique ID associated with the feature group.

Returns:

A dict that contains the inferred feature mappings.

Return type:

InferredFeatureMappings

describe_feature_group(feature_group_id)

Describe a feature group associated with a project

Parameters:

feature_group_id (str) – The unique ID associated with the feature group.

Returns:

The project feature group object.

Return type:

ProjectFeatureGroup

list_feature_groups(filter_feature_group_use=None, limit=100, start_after_id=None)

List all the feature groups associated with a project

Parameters:
  • filter_feature_group_use (str) – The feature group use filter, when given as an argument only allows feature groups present in this project to be returned if they are of the given use. Possible values are: ‘USER_CREATED’, ‘BATCH_PREDICTION_OUTPUT’.

  • limit (int) – The maximum number of feature groups to be retrieved.

  • start_after_id (str) – An offset parameter to exclude all feature groups up to a specified ID.

Returns:

All the Feature Groups in a project.

Return type:

list[ProjectFeatureGroup]

list_feature_group_templates(limit=100, start_after_id=None, should_include_all_system_templates=False)

List feature group templates for feature groups associated with the project.

Parameters:
  • limit (int) – Maximum number of templates to be retrieved.

  • start_after_id (str) – Offset parameter to exclude all templates up to the specified feature group template ID.

  • should_include_all_system_templates (bool) – If True, will include built-in templates.

Returns:

All the feature group templates in the organization, optionally limited by the feature group that created the template(s).

Return type:

list[FeatureGroupTemplate]

get_training_config_options(feature_group_ids=None, for_retrain=False, current_training_config=None)

Retrieves the full initial description of the model training configuration options available for the specified project. The configuration options available are determined by the use case associated with the specified project. Refer to the [Use Case Documentation]({USE_CASES_URL}) for more information on use cases and use case-specific configuration options.

Parameters:
  • feature_group_ids (List) – The feature group IDs to be used for training.

  • for_retrain (bool) – Whether the training config options are used for retraining.

  • current_training_config (TrainingConfig) – The current state of the training config, with some options set, which shall be used to get new options after refresh. This is None by default initially.

Returns:

An array of options that can be specified when training a model in this project.

Return type:

list[TrainingConfigOptions]

create_train_test_data_split_feature_group(training_config, feature_group_ids)

Get the train and test data split without training the model. Only supported for models with custom algorithms.

Parameters:
  • training_config (TrainingConfig) – The training config used to influence how the split is calculated.

  • feature_group_ids (List) – List of feature group IDs provided by the user, including the required one for data split and others to influence how to split.

Returns:

The feature group containing the training data and folds information.

Return type:

FeatureGroup

train_model(name=None, training_config=None, feature_group_ids=None, refresh_schedule=None, custom_algorithms=None, custom_algorithms_only=False, custom_algorithm_configs=None, builtin_algorithms=None, cpu_size=None, memory=None, algorithm_training_configs=None)

Create a new model and start its training in the given project.

Parameters:
  • name (str) – The name of the model. Defaults to “<Project Name> Model”.

  • training_config (TrainingConfig) – The training config used to train this model.

  • feature_group_ids (List) – List of feature group IDs provided by the user to train the model on.

  • refresh_schedule (str) – A cron-style string that describes a schedule in UTC to automatically retrain the created model.

  • custom_algorithms (list) – List of user-defined algorithms to train. If not set, the default enabled custom algorithms will be used.

  • custom_algorithms_only (bool) – Whether to only run custom algorithms.

  • custom_algorithm_configs (dict) – Configs for each user-defined algorithm; key is the algorithm name, value is the config serialized to JSON.

  • builtin_algorithms (list) – List of algorithm names or algorithm IDs of the builtin algorithms provided by Abacus.AI to train. If not set, all applicable builtin algorithms will be used.

  • cpu_size (str) – Size of the CPU for the user-defined algorithms during training.

  • memory (int) – Memory (in GB) for the user-defined algorithms during training.

  • algorithm_training_configs (list) – List of algorithm-specific training configs that will be part of the model training AutoML run.

Returns:

The new model which is being trained.

Return type:

Model
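
A minimal sketch; describe_project is assumed to be the client-level lookup, and the key, IDs, model name, and cron string are placeholders:

    from abacusai import ApiClient

    client = ApiClient('YOUR_API_KEY')  # placeholder API key
    project = client.describe_project('PROJECT_ID')  # assumed client helper
    model = project.train_model(
        name='Demand Forecast Model',            # hypothetical model name
        feature_group_ids=['FEATURE_GROUP_ID'],  # placeholder feature group
        refresh_schedule='0 4 * * 1',            # retrain at 04:00 UTC every Monday
    )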

create_model_from_python(function_source_code, train_function_name, training_input_tables, predict_function_name=None, predict_many_function_name=None, initialize_function_name=None, name=None, cpu_size=None, memory=None, training_config=None, exclusive_run=False, package_requirements=None, use_gpu=False, is_thread_safe=None)

Initializes a new Model from user-provided Python code. If a list of input feature groups is supplied, they will be provided as arguments to the train and predict functions with the materialized feature groups for those input feature groups.

This method expects functionSourceCode to be a valid Python source file which contains the functions named trainFunctionName and predictFunctionName. trainFunctionName returns the ModelVersion that results from training the model, while predictFunctionName has no well-defined return type, as it returns the prediction it makes, which can be anything.

Parameters:
  • function_source_code (str) – Contents of a valid Python source code file. The source code should contain the functions named trainFunctionName and predictFunctionName. A list of allowed import and system libraries for each language is specified in the user functions documentation section.

  • train_function_name (str) – Name of the function found in the source code that will be executed to train the model. It is not executed when this function is run.

  • training_input_tables (list) – List of feature groups that are supplied to the train function as parameters. Each of the parameters is a materialized DataFrame (the same type as the function’s return value).

  • predict_function_name (str) – Name of the function found in the source code that will be executed to run predictions through the model. It is not executed when this function is run.

  • predict_many_function_name (str) – Name of the function found in the source code that will be executed for batch prediction of the model. It is not executed when this function is run.

  • initialize_function_name (str) – Name of the function found in the source code to initialize the trained model before using it to make predictions using the model

  • name (str) – The name you want your model to have. Defaults to “<Project Name> Model”

  • cpu_size (str) – Size of the CPU for the model training function

  • memory (int) – Memory (in GB) for the model training function

  • training_config (TrainingConfig) – Training configuration

  • exclusive_run (bool) – Decides if this model will be run exclusively or along with other Abacus.AI algorithms

  • package_requirements (list) – List of package requirement strings. For example: [‘numpy==1.2.3’, ‘pandas>=1.4.0’]

  • use_gpu (bool) – Whether this model needs gpu

  • is_thread_safe (bool) – Whether this model is thread safe

Returns:

The new model, which has not been trained.

Return type:

Model

list_models()

Retrieves the list of models in the specified project.

Parameters:

project_id (str) – Unique string identifier associated with the project.

Returns:

A list of models.

Return type:

list[Model]

get_custom_train_function_info(feature_group_names_for_training=None, training_data_parameter_name_override=None, training_config=None, custom_algorithm_config=None)

Returns information about how to call the custom train function.

Parameters:
  • feature_group_names_for_training (list) – A list of feature group table names to be used for training.

  • training_data_parameter_name_override (dict) – Override from feature group type to parameter name in the train function.

  • training_config (TrainingConfig) – Training config for the options supported by the Abacus.AI platform.

  • custom_algorithm_config (dict) – User-defined config that can be serialized by JSON.

Returns:

Information about how to call the customer-provided train function.

Return type:

CustomTrainFunctionInfo

create_model_monitor(prediction_feature_group_id, training_feature_group_id=None, name=None, refresh_schedule=None, target_value=None, target_value_bias=None, target_value_performance=None, feature_mappings=None, model_id=None, training_feature_mappings=None, feature_group_base_monitor_config=None, feature_group_comparison_monitor_config=None, exclude_interactive_performance_analysis=True, exclude_bias_analysis=None, exclude_performance_analysis=None, exclude_feature_drift_analysis=None, exclude_data_integrity_analysis=None)

Runs a model monitor for the specified project.

Parameters:
  • prediction_feature_group_id (str) – The unique ID of the prediction data feature group.

  • training_feature_group_id (str) – The unique ID of the training data feature group.

  • name (str) – The name you want your model monitor to have. Defaults to “<Project Name> Model Monitor”.

  • refresh_schedule (str) – A cron-style string that describes a schedule in UTC to automatically retrain the created model monitor.

  • target_value (str) – A target positive value for the label to compute bias and PR/AUC for performance page.

  • target_value_bias (str) – A target positive value for the label to compute bias.

  • target_value_performance (str) – A target positive value for the label to compute PR curve/AUC for performance page.

  • feature_mappings (dict) – A JSON map to override features for prediction_feature_group, where keys are column names and the values are feature data use types.

  • model_id (str) – The unique ID of the model.

  • training_feature_mappings (dict) – A JSON map to override features for training_feature_group, where keys are column names and the values are feature data use types.

  • feature_group_base_monitor_config (dict) – Selection strategy for the base feature group, with the feature group version if selected.

  • feature_group_comparison_monitor_config (dict) – Selection strategy for the comparison feature group, with the feature group version if selected.

  • exclude_interactive_performance_analysis (bool) – Whether to exclude interactive performance analysis. Defaults to True if not provided.

  • exclude_bias_analysis (bool) – Whether to exclude bias analysis in the model monitor. By default, bias analysis is included.

  • exclude_performance_analysis (bool) – Whether to exclude performance analysis in the model monitor. By default, performance analysis is included.

  • exclude_feature_drift_analysis (bool) – Whether to exclude feature drift analysis in the model monitor. By default, feature drift analysis is included.

  • exclude_data_integrity_analysis (bool) – Whether to exclude data integrity analysis in the model monitor. By default, data integrity analysis is included.

Returns:

The new model monitor that was created.

Return type:

ModelMonitor

list_model_monitors()

Retrieves the list of model monitors in the specified project.

Parameters:

project_id (str) – Unique string identifier associated with the project.

Returns:

A list of model monitors.

Return type:

list[ModelMonitor]

create_vision_drift_monitor(prediction_feature_group_id, training_feature_group_id, name, feature_mappings, training_feature_mappings, target_value_performance=None, refresh_schedule=None)

Runs a vision drift monitor for the specified project.

Parameters:
  • prediction_feature_group_id (str) – Unique string identifier of the prediction data feature group.

  • training_feature_group_id (str) – Unique string identifier of the training data feature group.

  • name (str) – The name you want your model monitor to have. Defaults to “<Project Name> Model Monitor”.

  • feature_mappings (dict) – A JSON map to override features for prediction_feature_group, where keys are column names and the values are feature data use types.

  • training_feature_mappings (dict) – A JSON map to override features for training_feature_group, where keys are column names and the values are feature data use types.

  • target_value_performance (str) – A target positive value for the label to compute precision-recall curve/area under curve for performance page.

  • refresh_schedule (str) – A cron-style string that describes a schedule in UTC to automatically rerun the created vision drift monitor.

Returns:

The new model monitor that was created.

Return type:

ModelMonitor

create_nlp_drift_monitor(prediction_feature_group_id, training_feature_group_id, name, feature_mappings, training_feature_mappings, target_value_performance=None, refresh_schedule=None)

Runs an NLP drift monitor for the specified project.

Parameters:
  • prediction_feature_group_id (str) – Unique string identifier of the prediction data feature group.

  • training_feature_group_id (str) – Unique string identifier of the training data feature group.

  • name (str) – The name you want your model monitor to have. Defaults to “<Project Name> Model Monitor”.

  • feature_mappings (dict) – A JSON map to override features for prediction_feature_group, where keys are column names and the values are feature data use types.

  • training_feature_mappings (dict) – A JSON map to override features for training_feature_group, where keys are column names and the values are feature data use types.

  • target_value_performance (str) – A target positive value for the label to compute precision-recall curve/area under curve for performance page.

  • refresh_schedule (str) – A cron-style string that describes a schedule in UTC to automatically rerun the created nlp drift monitor.

Returns:

The new model monitor that was created.

Return type:

ModelMonitor

create_forecasting_monitor(name, prediction_feature_group_id, training_feature_group_id, training_forecast_config, prediction_forecast_config, forecast_frequency, refresh_schedule=None)

Runs a forecasting monitor for the specified project.

Parameters:
  • name (str) – The name you want your model monitor to have. Defaults to “<Project Name> Model Monitor”.

  • prediction_feature_group_id (str) – Unique string identifier of the prediction data feature group.

  • training_feature_group_id (str) – Unique string identifier of the training data feature group.

  • training_forecast_config (ForecastingMonitorConfig) – The configuration for the training data.

  • prediction_forecast_config (ForecastingMonitorConfig) – The configuration for the prediction data.

  • forecast_frequency (str) – The frequency of the forecast. Defaults to the frequency of the prediction data.

  • refresh_schedule (str) – A cron-style string that describes a schedule in UTC to automatically rerun the created forecasting monitor.

Returns:

The new model monitor that was created.

Return type:

ModelMonitor

create_eda(feature_group_id, name, refresh_schedule=None, include_collinearity=False, include_data_consistency=False, collinearity_keys=None, primary_keys=None, data_consistency_test_config=None, data_consistency_reference_config=None, feature_mappings=None, forecast_frequency=None)

Run an Exploratory Data Analysis (EDA) for the specified project.

Parameters:
  • feature_group_id (str) – The unique ID of the prediction data feature group.

  • name (str) – The name you want your model monitor to have. Defaults to “<Project Name> EDA”.

  • refresh_schedule (str) – A cron-style string that describes a schedule in UTC to automatically retrain the created EDA.

  • include_collinearity (bool) – Set to True if the EDA type is collinearity.

  • include_data_consistency (bool) – Set to True if the EDA type is data consistency.

  • collinearity_keys (list) – List of features to use for collinearity

  • primary_keys (list) – List of features that correspond to the primary keys or item IDs of the given feature group, for Data Consistency analysis or Forecasting analysis respectively.

  • data_consistency_test_config (dict) – Test feature group version selection strategy for Data Consistency EDA type.

  • data_consistency_reference_config (dict) – Reference feature group version selection strategy for Data Consistency EDA type.

  • feature_mappings (dict) – A JSON map to override features for the given feature_group, where keys are column names and the values are feature data use types. (In forecasting, used to set the timestamp column and target value)

  • forecast_frequency (str) – The frequency of the data. It can be either HOURLY, DAILY, WEEKLY, MONTHLY, QUARTERLY, YEARLY.

Returns:

The new EDA object that was created.

Return type:

Eda
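
A minimal sketch of starting a collinearity and data-consistency EDA, reusing the project handle from the sketch above; the IDs and column names are placeholders.

   eda = project.create_eda(
       feature_group_id="fg_id",        # hypothetical feature group ID
       name="My Project EDA",
       include_collinearity=True,
       include_data_consistency=True,
       primary_keys=["user_id"],        # hypothetical primary key column
       refresh_schedule="0 6 * * 1",    # rerun every Monday at 06:00 UTC
   )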

list_eda()

Retrieves the list of Exploratory Data Analysis (EDA) in the specified project.

Parameters:

project_id (str) – Unique string identifier associated with the project.

Returns:

List of EDA objects.

Return type:

list[Eda]

list_holdout_analysis(model_id=None)

List holdout analyses for a project. Optionally, filter by model.

Parameters:

model_id (str) – (optional) ID of the model to filter by

Returns:

The holdout analyses

Return type:

list[HoldoutAnalysis]

create_monitor_alert(alert_name, condition_config, action_config, model_monitor_id=None, realtime_monitor_id=None)

Create a monitor alert for the given conditions and monitor. An alert can be created for either a model monitor or a real-time monitor.

Parameters:
  • alert_name (str) – Name of the alert.

  • condition_config (AlertConditionConfig) – Condition to run the actions for the alert.

  • action_config (AlertActionConfig) – Configuration for the action of the alert.

  • model_monitor_id (str) – Unique string identifier for the model monitor created under the project.

  • realtime_monitor_id (str) – Unique string identifier for the real-time monitor for the deployment created under the project.

Returns:

Object describing the monitor alert.

Return type:

MonitorAlert
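
For example, an accuracy alert on a model monitor might look like the sketch below. The condition and action classes named here (AccuracyBelowThresholdConditionConfig, EmailActionConfig) are assumed to come from abacusai.api_class; confirm their names and fields against the installed SDK.

   from abacusai.api_class import (
       AccuracyBelowThresholdConditionConfig,  # assumed class name
       EmailActionConfig,                      # assumed class name
   )

   alert = project.create_monitor_alert(
       alert_name="Accuracy drop",
       condition_config=AccuracyBelowThresholdConditionConfig(threshold=0.8),
       action_config=EmailActionConfig(
           email_recipients=["ml-team@example.com"],
           email_body="Model accuracy fell below 80%.",
       ),
       model_monitor_id="model_monitor_id",    # hypothetical
   )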

list_prediction_operators()

List all the prediction operators inside a project.

Parameters:

project_id (str) – The unique ID of the project.

Returns:

A list of prediction operator objects.

Return type:

list[PredictionOperator]

create_deployment_token(name=None)

Creates a deployment token for the specified project.

Deployment tokens are used to authenticate requests to the prediction APIs and are scoped to the project level.

Parameters:

name (str) – The name of the deployment token.

Returns:

The deployment token.

Return type:

DeploymentAuthToken
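
For example (a minimal sketch; the deployment_token attribute name on the returned object is an assumption):

   token = project.create_deployment_token(name="prod-predict-token")
   # The token value is passed to the prediction APIs together with a
   # deployment ID; the attribute name below is assumed.
   print(token.deployment_token)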

list_deployments()

Retrieves a list of all deployments in the specified project.

Parameters:

project_id (str) – The unique identifier associated with the project.

Returns:

An array of deployments.

Return type:

list[Deployment]

list_deployment_tokens()

Retrieves a list of all deployment tokens associated with the specified project.

Parameters:

project_id (str) – The unique ID associated with the project.

Returns:

A list of deployment tokens.

Return type:

list[DeploymentAuthToken]

list_realtime_monitors()

List the real-time monitors associated with the specified project.

Parameters:

project_id (str) – Unique string identifier for the project.

Returns:

An array of real-time monitors.

Return type:

list[RealtimeMonitor]

list_refresh_policies(dataset_ids=[], feature_group_id=None, model_ids=[], deployment_ids=[], batch_prediction_ids=[], model_monitor_ids=[], notebook_ids=[])

List the refresh policies for the organization. If no filters are specified, all refresh policies are returned.

Parameters:
  • dataset_ids (List) – Comma-separated list of Dataset IDs.

  • feature_group_id (str) – Feature Group ID for which we wish to see the refresh policies attached.

  • model_ids (List) – Comma-separated list of Model IDs.

  • deployment_ids (List) – Comma-separated list of Deployment IDs.

  • batch_prediction_ids (List) – Comma-separated list of Batch Prediction IDs.

  • model_monitor_ids (List) – Comma-separated list of Model Monitor IDs.

  • notebook_ids (List) – Comma-separated list of Notebook IDs.

Returns:

List of all refresh policies in the organization.

Return type:

list[RefreshPolicy]
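
For example, to see only the refresh policies attached to particular datasets and models (the IDs are placeholders; the attributes printed are documented on RefreshPolicy below):

   policies = project.list_refresh_policies(
       dataset_ids=["dataset_id_1"],   # hypothetical
       model_ids=["model_id_1"],       # hypothetical
   )
   for policy in policies:
       print(policy.name, policy.cron, policy.next_run_time)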

list_batch_predictions()

Retrieves a list of batch predictions in the project.

Parameters:

project_id (str) – Unique string identifier of the project.

Returns:

List of batch prediction jobs.

Return type:

list[BatchPrediction]

list_pipelines()

Lists the pipelines for an organization or a project

Parameters:

project_id (str) – Unique string identifier for the project to list pipelines from.

Returns:

A list of pipelines.

Return type:

list[Pipeline]

create_graph_dashboard(name, python_function_ids=None)

Create a plot dashboard from selected Python plots

Parameters:
  • name (str) – The name of the dashboard.

  • python_function_ids (List) – A list of unique string identifiers for the python functions to be used in the graph dashboard.

Returns:

An object describing the graph dashboard.

Return type:

GraphDashboard
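
For example (a minimal sketch; the Python function ID is a placeholder):

   dashboard = project.create_graph_dashboard(
       name="Training Metrics",
       python_function_ids=["python_function_id_1"],  # hypothetical
   )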

list_graph_dashboards()

Lists the graph dashboards for a project

Parameters:

project_id (str) – Unique string identifier for the project to list graph dashboards from.

Returns:

A list of graph dashboards.

Return type:

list[GraphDashboard]

list_builtin_algorithms(feature_group_ids, training_config=None)

Return list of built-in algorithms based on given input data and training config.

Parameters:
  • feature_group_ids (List) – List of feature group IDs specifying input data.

  • training_config (TrainingConfig) – The training config to be used for model training.

Returns:

List of applicable built-in algorithms.

Return type:

list[Algorithm]

create_chat_session(name=None)

Creates a chat session with Data Science Co-pilot.

Parameters:

name (str) – The name of the chat session. Defaults to the project name.

Returns:

The chat session with Data Science Co-pilot

Return type:

ChatSession

create_agent(function_source_code=None, agent_function_name=None, name=None, memory=None, package_requirements=[], description=None, enable_binary_input=False, evaluation_feature_group_id=None, agent_input_schema=None, agent_output_schema=None, workflow_graph=None, agent_interface=AgentInterface.DEFAULT, included_modules=None, org_level_connectors=None, user_level_connectors=None, initialize_function_name=None, initialize_function_code=None)

Creates a new AI agent using the given agent workflow graph definition.

Parameters:
  • name (str) – The name you want your agent to have, defaults to “<Project Name> Agent”.

  • memory (int) – Overrides the default memory allocation (in GB) for the agent.

  • package_requirements (list) – A list of package requirement strings. For example: [‘numpy==1.2.3’, ‘pandas>=1.4.0’].

  • description (str) – A description of the agent, including its purpose and instructions.

  • evaluation_feature_group_id (str) – The ID of the feature group to use for evaluation.

  • workflow_graph (WorkflowGraph) – The workflow graph for the agent.

  • agent_interface (AgentInterface) – The interface that the agent will be deployed with.

  • included_modules (List) – A list of user created custom modules to include in the agent’s environment.

  • org_level_connectors (List) – A list of org level connector ids to be used by the agent.

  • user_level_connectors (Dict) – A dictionary mapping ApplicationConnectorType keys to lists of OAuth scopes. Each key represents a specific user level application connector, while the value is a list of scopes that define the permissions granted to the application.

  • initialize_function_name (str) – The name of the function to be used for initialization.

  • initialize_function_code (str) – The function code to be used for initialization.

  • function_source_code (str)

  • agent_function_name (str)

  • enable_binary_input (bool)

  • agent_input_schema (dict)

  • agent_output_schema (dict)

Returns:

The new agent.

Return type:

Agent
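
A minimal sketch of creating an agent from a workflow graph. Building the graph itself is out of scope here, so workflow_graph is assumed to be a WorkflowGraph constructed elsewhere (see the WorkflowGraph and WorkflowGraphNode classes).

   agent = project.create_agent(
       name="Support Agent",
       description="Answers customer support questions.",
       workflow_graph=workflow_graph,           # a WorkflowGraph built elsewhere
       package_requirements=["pandas>=1.4.0"],
       memory=16,                               # GB
   )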

generate_agent_code(prompt, fast_mode=None)

Generates the code for defining an AI Agent

Parameters:
  • prompt (str) – A natural language prompt describing the agent specification: what the agent will do, what inputs it expects, and what outputs it produces.

  • fast_mode (bool) – If True, runs a faster but slightly less accurate code generation pipeline
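
For example (the prompt text is illustrative, and the return shape is not documented above, so inspect the result before relying on it):

   generated = project.generate_agent_code(
       prompt=(
           "An agent that takes a CSV of support tickets as input and "
           "returns a summary of the top recurring issues."
       ),
       fast_mode=True,
   )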

list_agents()

Retrieves the list of agents in the specified project.

Parameters:

project_id (str) – The unique identifier associated with the project.

Returns:

A list of agents in the project.

Return type:

list[Agent]

create_document_retriever(name, feature_group_id, document_retriever_config=None)

Returns a document retriever that stores embeddings for document chunks in a feature group.

Document columns in the feature group are broken into chunks. For cases with multiple document columns, chunks from all columns are combined together to form a single chunk.

Parameters:
  • name (str) – The name of the Document Retriever. Can be up to 120 characters long and can only contain alphanumeric characters and underscores.

  • feature_group_id (str) – The ID of the feature group that the Document Retriever is associated with.

  • document_retriever_config (VectorStoreConfig) – The configuration, including chunk_size and chunk_overlap_fraction, for document retrieval.

Returns:

The newly created document retriever.

Return type:

DocumentRetriever
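
For example, with an explicit chunking config. VectorStoreConfig is the config type named above, and chunk_size and chunk_overlap_fraction come from the parameter description, but the exact values here are arbitrary.

   from abacusai.api_class import VectorStoreConfig

   retriever = project.create_document_retriever(
       name="docs_retriever",
       feature_group_id="fg_id",  # hypothetical
       document_retriever_config=VectorStoreConfig(
           chunk_size=512,
           chunk_overlap_fraction=0.1,
       ),
   )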

list_document_retrievers(limit=100, start_after_id=None)

List all the document retrievers.

Parameters:
  • limit (int) – The number of document retrievers to return.

  • start_after_id (str) – An offset parameter to exclude all document retrievers up to this specified ID.

Returns:

All the document retrievers in the organization associated with the specified project.

Return type:

list[DocumentRetriever]

create_model_from_functions(train_function, predict_function=None, training_input_tables=None, predict_many_function=None, initialize_function=None, cpu_size=None, memory=None, training_config=None, exclusive_run=False)

Creates a model using Python functions.

Parameters:
  • train_function (callable) – The function invoked to train the model.

  • predict_function (callable) – The function invoked to generate a single prediction from the trained model.

  • training_input_tables (list) – The input tables to be used for training the model. Defaults to None.

  • predict_many_function (callable) – Prediction function for batch input

  • cpu_size (str) – Size of the CPU for the model functions.

  • memory (int) – Memory (in GB) for the model functions.

  • initialize_function (callable)

  • training_config (dict)

  • exclusive_run (bool)

Returns:

The model object.

Return type:

Model
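
A minimal sketch of a Python model. The exact signatures the platform invokes train_function and predict_function with (input tables as pandas DataFrames, then the trained object plus a query) are assumptions here, as is the "amount" column.

   import pandas as pd

   def train(transactions: pd.DataFrame):
       # Fit something simple and return any picklable object.
       return {"mean_amount": transactions["amount"].mean()}

   def predict(model, query):
       # Score a single request using the object returned by train().
       return {"predicted_amount": model["mean_amount"]}

   model = project.create_model_from_functions(
       train_function=train,
       predict_function=predict,
       training_input_tables=["transactions"],  # hypothetical table name
   )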

class abacusai.ProjectConfig(client, type=None, config={})

Bases: abacusai.return_class.AbstractApiClass

Project-specific config for a feature group

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • type (str) – Type of project config

  • config (ProjectFeatureGroupConfig) – Project-specific config for this feature group

type
config
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.ProjectFeatureGroup(client, featureGroupId=None, modificationLock=None, name=None, featureGroupSourceType=None, tableName=None, sql=None, datasetId=None, functionSourceCode=None, functionName=None, sourceTables=None, createdAt=None, description=None, sqlError=None, latestVersionOutdated=None, referencedFeatureGroups=None, tags=None, primaryKey=None, updateTimestampKey=None, lookupKeys=None, streamingEnabled=None, incremental=None, mergeConfig=None, samplingConfig=None, cpuSize=None, memory=None, streamingReady=None, featureTags=None, moduleName=None, templateBindings=None, featureExpression=None, useOriginalCsvNames=None, pythonFunctionBindings=None, pythonFunctionName=None, useGpu=None, versionLimit=None, exportOnMaterialization=None, featureGroupType=None, features={}, duplicateFeatures={}, pointInTimeGroups={}, annotationConfig={}, concatenationConfig={}, indexingConfig={}, codeSource={}, featureGroupTemplate={}, explanation={}, refreshSchedules={}, exportConnectorConfig={}, projectFeatureGroupSchema={}, projectConfig={}, latestFeatureGroupVersion={}, operatorConfig={})

Bases: abacusai.feature_group.FeatureGroup

A feature group along with project-specific mappings

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • featureGroupId (str) – Unique identifier for this feature group.

  • modificationLock (bool) – Whether the feature group is locked against changes.

  • name (str)

  • featureGroupSourceType (str) – The source type of the feature group

  • tableName (str) – Unique table name of this feature group.

  • sql (str) – SQL definition creating this feature group.

  • datasetId (str) – Dataset ID the feature group is sourced from.

  • functionSourceCode (str) – Source definition creating this feature group.

  • functionName (str) – Function name to execute from the source code.

  • sourceTables (list[str]) – Source tables for this feature group.

  • createdAt (str) – Timestamp at which the feature group was created.

  • description (str) – Description of the feature group.

  • sqlError (str) – Error message with this feature group.

  • latestVersionOutdated (bool) – Whether the latest materialized feature group version is outdated.

  • referencedFeatureGroups (list[str]) – Feature groups this feature group is used in.

  • tags (list[str]) – Tags added to this feature group.

  • primaryKey (str) – Primary index feature.

  • updateTimestampKey (str) – Primary timestamp feature.

  • lookupKeys (list[str]) – Additional indexed features for this feature group.

  • streamingEnabled (bool) – If true, the feature group can have data streamed to it.

  • incremental (bool) – Whether the feature group corresponds to an incremental dataset.

  • mergeConfig (dict) – Merge configuration settings for the feature group.

  • samplingConfig (dict) – Sampling configuration for the feature group.

  • cpuSize (str) – CPU size specified for the Python feature group.

  • memory (int) – Memory in GB specified for the Python feature group.

  • streamingReady (bool) – If true, the feature group is ready to receive streaming data.

  • featureTags (dict) – Tags for features in this feature group

  • moduleName (str) – Path to the file with the feature group function.

  • templateBindings (dict) – Config specifying variable names and values to use when resolving a feature group template.

  • featureExpression (str) – If the dataset feature group has custom features, the SQL select expression creating those features.

  • useOriginalCsvNames (bool) – If true, the feature group will use the original column names in the source dataset.

  • pythonFunctionBindings (dict) – Config specifying variable names, types, and values to use when resolving a Python feature group.

  • pythonFunctionName (str) – Name of the Python function the feature group was built from.

  • useGpu (bool) – Whether this feature group uses a GPU

  • versionLimit (int) – Version limit for the feature group.

  • exportOnMaterialization (bool) – Whether to export the feature group on materialization.

  • featureGroupType (str) – Project type when the feature group is used in the context of a project.

  • features (Feature) – List of resolved features.

  • duplicateFeatures (Feature) – List of duplicate features.

  • pointInTimeGroups (PointInTimeGroup) – List of Point In Time Groups.

  • annotationConfig (AnnotationConfig) – Annotation config for this feature group

  • latestFeatureGroupVersion (FeatureGroupVersion) – Latest feature group version.

  • concatenationConfig (ConcatenationConfig) – Feature group ID whose data will be concatenated into this feature group.

  • indexingConfig (IndexingConfig) – Indexing config for the feature group for feature store

  • codeSource (CodeSource) – If a Python feature group, information on the source code.

  • featureGroupTemplate (FeatureGroupTemplate) – FeatureGroupTemplate to use when this feature group is attached to a template.

  • explanation (NaturalLanguageExplanation) – Natural language explanation of the feature group

  • refreshSchedules (RefreshSchedule) – List of schedules that determines when the next version of the feature group will be created.

  • exportConnectorConfig (FeatureGroupRefreshExportConfig) – The export config (file connector or database connector information) for feature group exports.

  • projectFeatureGroupSchema (ProjectFeatureGroupSchema) – Project-specific schema for this feature group.

  • projectConfig (ProjectConfig) – Project-specific config for this feature group.

  • operatorConfig (OperatorConfig) – Operator configuration settings for the feature group.

feature_group_id
modification_lock
name
feature_group_source_type
table_name
sql
dataset_id
function_source_code
function_name
source_tables
created_at
description
sql_error
latest_version_outdated
referenced_feature_groups
tags
primary_key
update_timestamp_key
lookup_keys
streaming_enabled
incremental
merge_config
sampling_config
cpu_size
memory
streaming_ready
feature_tags
module_name
template_bindings
feature_expression
use_original_csv_names
python_function_bindings
python_function_name
use_gpu
version_limit
export_on_materialization
feature_group_type
features
duplicate_features
point_in_time_groups
annotation_config
concatenation_config
indexing_config
code_source
feature_group_template
explanation
refresh_schedules
export_connector_config
project_feature_group_schema
project_config
latest_feature_group_version
operator_config
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.ProjectFeatureGroupSchema(client, nestedSchema=None, schema={}, duplicateFeatures={}, projectConfig={})

Bases: abacusai.return_class.AbstractApiClass

A schema description for a project feature group

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • nestedSchema (list) – List of schema of nested features

  • schema (Schema) – List of schema description for the feature

  • duplicateFeatures (Schema) – List of duplicate feature schemas

  • projectConfig (ProjectConfig) – Project-specific config for this feature group.

nested_schema
schema
duplicate_features
project_config
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.ProjectFeatureGroupSchemaVersion(client, schemaVersion=None)

Bases: abacusai.return_class.AbstractApiClass

A version of a schema

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • schemaVersion (id) – The unique identifier of a schema version.

schema_version
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.ProjectValidation(client, valid=None, datasetErrors=None, columnHints=None)

Bases: abacusai.return_class.AbstractApiClass

A validation result for a project

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • valid (bool) – true if the project is valid and ready to be trained, otherwise false.

  • datasetErrors (list[dict]) – A list of errors keeping the dataset from being valid

  • columnHints (dict) – Hints for what to set on the columns

valid
dataset_errors
column_hints
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.PythonFunction(client, notebookId=None, name=None, createdAt=None, functionVariableMappings=None, outputVariableMappings=None, functionName=None, pythonFunctionId=None, functionType=None, packageRequirements=None, codeSource={})

Bases: abacusai.return_class.AbstractApiClass

Customer created python function

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • notebookId (str) – The unique identifier of the notebook used to spin up the notebook upon creation.

  • name (str) – The name to identify the Python function; only uppercase letters, numbers, and underscores are allowed (i.e., it must be a valid Python identifier)

  • createdAt (str) – The ISO-8601 string representing when the Python function was created.

  • functionVariableMappings (dict) – A description of the function variables.

  • outputVariableMappings (dict) – A description of the variables returned by the function

  • functionName (str) – The name of the Python function to be used.

  • pythonFunctionId (str) – The unique identifier of the Python function.

  • functionType (str) – The type of the Python function.

  • packageRequirements (list) – The pip package dependencies required to run the code

  • codeSource (CodeSource) – Information about the source code of the Python function.

notebook_id
name
created_at
function_variable_mappings
output_variable_mappings
function_name
python_function_id
function_type
package_requirements
code_source
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

add_graph_to_dashboard(graph_dashboard_id, function_variable_mappings=None, name=None)

Add a Python plot function to a dashboard

Parameters:
  • graph_dashboard_id (str) – Unique string identifier for the graph dashboard to update.

  • function_variable_mappings (List) – List of arguments to be supplied to the function as parameters, in the format [{‘name’: ‘function_argument’, ‘variable_type’: ‘FEATURE_GROUP’, ‘value’: ‘name_of_feature_group’}].

  • name (str) – Name of the added python plot

Returns:

An object describing the graph dashboard.

Return type:

GraphDashboard

validate_locally(kwargs=None)

Validates a Python function by running it with the given input values in a local environment. Input feature groups may be passed in kwargs either by name (string) or as a pandas DataFrame.

Parameters:

kwargs (dict) – A dictionary mapping function arguments to values to pass to the function. Feature group names will automatically be converted into pandas dataframes.

Returns:

The result of executing the python function

Return type:

any

Raises:
  • TypeError – If an Input Feature Group argument has an invalid type or argument is missing.

  • Exception – If an error occurs while validating the Python function.
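
For example, given a PythonFunction handle python_function whose function takes a single input_table argument (the argument name is hypothetical):

   import pandas as pd

   df = pd.DataFrame({"a": [1, 2, 3]})
   # Keys map to the function's argument names; DataFrames are passed through,
   # while feature group names (strings) are converted to DataFrames.
   result = python_function.validate_locally(kwargs={"input_table": df})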

class abacusai.PythonPlotFunction(client, notebookId=None, name=None, createdAt=None, functionVariableMappings=None, functionName=None, pythonFunctionId=None, functionType=None, plotName=None, graphReferenceId=None, codeSource={})

Bases: abacusai.return_class.AbstractApiClass

Create a Plot for a Dashboard

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • notebookId (str) – Unique string identifier of the notebook used to spin up the notebook upon creation.

  • name (str) – The name used to identify the Python function. Only uppercase letters, numbers, and underscores are allowed.

  • createdAt (str) – Date and time when the Python function was created, in ISO-8601 format.

  • functionVariableMappings (dict) – The mappings for function parameters’ names.

  • functionName (str) – The name of the Python function to be used.

  • pythonFunctionId (str) – Unique string identifier of the Python function.

  • functionType (str) – The type of the Python function.

  • plotName (str) – Name of the plot.

  • graphReferenceId (str) – Reference ID of the dashboard that this plot belongs to.

  • codeSource (CodeSource) – Info about the source code of the Python function.

notebook_id
name
created_at
function_variable_mappings
function_name
python_function_id
function_type
plot_name
graph_reference_id
code_source
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.RangeViolation(client, name=None, trainingMin=None, trainingMax=None, predictionMin=None, predictionMax=None, freqAboveTrainingRange=None, freqBelowTrainingRange=None)

Bases: abacusai.return_class.AbstractApiClass

Summary of important range mismatches for a numerical feature discovered by a model monitoring instance

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • name (str) – Name of feature.

  • trainingMin (float) – Minimum value of training distribution for the specified feature.

  • trainingMax (float) – Maximum value of training distribution for the specified feature.

  • predictionMin (float) – Minimum value of prediction distribution for the specified feature.

  • predictionMax (float) – Maximum value of prediction distribution for the specified feature.

  • freqAboveTrainingRange (float) – Frequency of prediction rows above the training maximum for the specified feature.

  • freqBelowTrainingRange (float) – Frequency of prediction rows below the training minimum for the specified feature.

name
training_min
training_max
prediction_min
prediction_max
freq_above_training_range
freq_below_training_range
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.RealtimeMonitor(client, realtimeMonitorId=None, name=None, createdAt=None, deploymentId=None, lookbackTime=None, realtimeMonitorSchedule=None)

Bases: abacusai.return_class.AbstractApiClass

A real-time monitor

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • realtimeMonitorId (str) – The unique identifier of the real-time monitor.

  • name (str) – The user-friendly name for the real-time monitor.

  • createdAt (str) – Date and time at which the real-time monitor was created.

  • deploymentId (str) – Deployment ID that this real-time monitor is monitoring.

  • lookbackTime (int) – The lookback time for the real-time monitor.

  • realtimeMonitorSchedule (str) – The drift computation schedule for the real-time monitor.

realtime_monitor_id
name
created_at
deployment_id
lookback_time
realtime_monitor_schedule
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

update(realtime_monitor_schedule=None, lookback_time=None)

Update the real-time monitor associated with the real-time monitor id.

Parameters:
  • realtime_monitor_schedule (str) – The cron expression for triggering the monitor

  • lookback_time (float) – Lookback time (in seconds) for each monitor trigger

Returns:

Object describing the realtime monitor.

Return type:

RealtimeMonitor
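
For example, given a RealtimeMonitor handle:

   monitor = monitor.update(
       realtime_monitor_schedule="0 * * * *",  # compute drift hourly
       lookback_time=3600,                     # look back one hour (seconds)
   )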

refresh()

Calls describe and refreshes the current object’s fields

Returns:

The current object

Return type:

RealtimeMonitor

describe()

Get the real-time monitor associated with the real-time monitor id.

Parameters:

realtime_monitor_id (str) – Unique string identifier for the real-time monitor.

Returns:

Object describing the real-time monitor.

Return type:

RealtimeMonitor

delete()

Delete the real-time monitor associated with the real-time monitor id.

Parameters:

realtime_monitor_id (str) – Unique string identifier for the real-time monitor.

class abacusai.RefreshPipelineRun(client, refreshPipelineRunId=None, refreshPolicyId=None, createdAt=None, startedAt=None, completedAt=None, status=None, refreshType=None, datasetVersions=None, featureGroupVersion=None, modelVersions=None, deploymentVersions=None, batchPredictions=None, refreshPolicy={})

Bases: abacusai.return_class.AbstractApiClass

This keeps track of the overall status of a refresh. A refresh can span multiple resources such as the creation of new dataset versions and the training of a new model version based on them.

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • refreshPipelineRunId (str) – The unique identifier for the refresh pipeline run.

  • refreshPolicyId (str) – Populated when the run was triggered by a refresh policy.

  • createdAt (str) – The time when this refresh pipeline run was created, in ISO-8601 format.

  • startedAt (str) – The time when the refresh pipeline run was started, in ISO-8601 format.

  • completedAt (str) – The time when the refresh pipeline run was completed, in ISO-8601 format.

  • status (str) – The status of the refresh pipeline run.

  • refreshType (str) – The type of refresh policy to be run.

  • datasetVersions (list[str]) – A list of dataset version IDs that this refresh pipeline run is monitoring.

  • featureGroupVersion (str) – The feature group version ID that this refresh pipeline run is monitoring.

  • modelVersions (list[str]) – A list of model version IDs that this refresh pipeline run is monitoring.

  • deploymentVersions (list[str]) – A list of deployment version IDs that this refresh pipeline run is monitoring.

  • batchPredictions (list[str]) – A list of batch prediction IDs that this refresh pipeline run is monitoring.

  • refreshPolicy (RefreshPolicy) – The refresh policy for this refresh policy run.

refresh_pipeline_run_id
refresh_policy_id
created_at
started_at
completed_at
status
refresh_type
dataset_versions
feature_group_version
model_versions
deployment_versions
batch_predictions
refresh_policy
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

refresh()

Calls describe and refreshes the current object’s fields

Returns:

The current object

Return type:

RefreshPipelineRun

describe()

Retrieve a single refresh pipeline run

Parameters:

refresh_pipeline_run_id (str) – Unique string identifier associated with the refresh pipeline run.

Returns:

A refresh pipeline run object.

Return type:

RefreshPipelineRun

wait_for_complete(timeout=None)

A waiting call until the refresh pipeline run has completed.

Parameters:

timeout (int) – The waiting time given to the call to finish; if it doesn’t finish by the allocated time, the call is said to have timed out.

get_status()

Gets the status of the refresh pipeline run.

Returns:

A string describing the status of a refresh pipeline run (pending, complete, etc.).

Return type:

str
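
A typical polling pattern for a RefreshPipelineRun handle:

   run.wait_for_complete(timeout=3600)  # block for up to an hour
   print(run.get_status())              # e.g. pending or complete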

class abacusai.RefreshPolicy(client, refreshPolicyId=None, name=None, cron=None, nextRunTime=None, createdAt=None, refreshType=None, projectId=None, datasetIds=None, featureGroupId=None, modelIds=None, deploymentIds=None, batchPredictionIds=None, modelMonitorIds=None, notebookId=None, paused=None, predictionOperatorId=None, pipelineId=None, featureGroupExportConfig={})

Bases: abacusai.return_class.AbstractApiClass

A Refresh Policy describes the frequency at which one or more datasets/models/deployments/batch_predictions can be updated.

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • refreshPolicyId (str) – The unique identifier for the refresh policy

  • name (str) – The user-friendly name for the refresh policy

  • cron (str) – A cron-style string that describes when this refresh policy is to be executed in UTC

  • nextRunTime (str) – The next UTC time that this refresh policy will be executed

  • createdAt (str) – The time when the refresh policy was created

  • refreshType (str) – The type of refresh policy to be run

  • projectId (str) – The unique identifier of a project that this refresh policy applies to

  • datasetIds (list[str]) – Comma-separated list of Dataset IDs that this refresh policy applies to

  • featureGroupId (str) – Feature Group ID that this refresh policy applies to

  • modelIds (list[str]) – Comma-separated list of Model IDs that this refresh policy applies to

  • deploymentIds (list[str]) – Comma-separated list of Deployment IDs that this refresh policy applies to

  • batchPredictionIds (list[str]) – Comma-separated list of Batch Prediction IDs that this refresh policy applies to

  • modelMonitorIds (list[str]) – Comma-separated list of Model Monitor IDs that this refresh policy applies to

  • notebookId (str) – Notebook ID that this refresh policy applies to

  • paused (bool) – True if the refresh policy is paused

  • predictionOperatorId (str) – Prediction Operator ID that this refresh policy applies to

  • pipelineId (str) – The pipeline ID with the cron schedule

  • featureGroupExportConfig (FeatureGroupRefreshExportConfig) – The export configuration for the feature group. Only applicable if refresh_type is FEATUREGROUP.

refresh_policy_id
name
cron
next_run_time
created_at
refresh_type
project_id
dataset_ids
feature_group_id
model_ids
deployment_ids
batch_prediction_ids
model_monitor_ids
notebook_id
paused
prediction_operator_id
pipeline_id
feature_group_export_config
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

delete()

Delete a refresh policy.

Parameters:

refresh_policy_id (str) – Unique string identifier associated with the refresh policy to delete.

refresh()

Calls describe and refreshes the current object’s fields

Returns:

The current object

Return type:

RefreshPolicy

describe()

Retrieve a single refresh policy

Parameters:

refresh_policy_id (str) – The unique ID associated with this refresh policy.

Returns:

An object representing the refresh policy.

Return type:

RefreshPolicy

list_refresh_pipeline_runs()

List the times that the refresh policy has been run

Parameters:

refresh_policy_id (str) – Unique identifier associated with the refresh policy.

Returns:

List of refresh pipeline runs for the given refresh policy ID.

Return type:

list[RefreshPipelineRun]

pause()

Pauses a refresh policy

Parameters:

refresh_policy_id (str) – Unique identifier associated with the refresh policy to be paused.

resume()

Resumes a refresh policy

Parameters:

refresh_policy_id (str) – The unique ID associated with this refresh policy.

run()

Force a run of the refresh policy.

Parameters:

refresh_policy_id (str) – Unique string identifier associated with the refresh policy to be run.

update(name=None, cron=None, feature_group_export_config=None)

Update the name or cron string of a refresh policy

Parameters:
  • name (str) – Name of the refresh policy to be updated.

  • cron (str) – Cron string describing the schedule from the refresh policy to be updated.

  • feature_group_export_config (FeatureGroupExportConfig) – Feature group export configuration to update a feature group refresh policy.

Returns:

Updated refresh policy.

Return type:

RefreshPolicy
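
For example, a RefreshPolicy handle supports the full lifecycle described above:

   policy.pause()                             # stop scheduled runs
   policy = policy.update(cron="0 3 * * *")   # move the schedule to 03:00 UTC
   policy.resume()                            # re-enable scheduled runs
   policy.run()                               # or force an immediate run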

class abacusai.RefreshSchedule(client, refreshPolicyId=None, nextRunTime=None, cron=None, refreshType=None, error=None)

Bases: abacusai.return_class.AbstractApiClass

A refresh schedule for an object. Defines when the next version of the object will be created

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • refreshPolicyId (str) – The unique identifier of the refresh policy

  • nextRunTime (str) – The next run time of the refresh policy. If null, the policy is paused.

  • cron (str) – A cron-style string that describes when this refresh policy is to be executed in UTC

  • refreshType (str) – The type of refresh that will be run

  • error (str) – An error message for the last pipeline run of a policy

refresh_policy_id
next_run_time
cron
refresh_type
error
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.RegenerateLlmExternalApplication(client, name=None, externalApplicationId=None)

Bases: abacusai.return_class.AbstractApiClass

An external application that specifies an LLM the user can regenerate with in RouteLLM.

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • name (str) – The external name of the LLM.

  • externalApplicationId (str) – The unique identifier of the external application.

name
external_application_id
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.ResolvedFeatureGroupTemplate(client, featureGroupTemplateId=None, resolvedVariables=None, resolvedSql=None, templateSql=None, sqlError=None)

Bases: abacusai.return_class.AbstractApiClass

Final SQL from resolving a feature group template.

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • featureGroupTemplateId (str) – Unique identifier for this feature group template.

  • resolvedVariables (dict) – Map from template variable names to parameters available during template resolution.

  • resolvedSql (str) – SQL resulting from resolving the SQL template by applying the resolved bindings.

  • templateSql (str) – SQL that can include variables to be replaced by values from the template config to resolve this template SQL into a valid SQL query for a feature group.

  • sqlError (str) – If the resolved SQL is invalid, the SQL error message

feature_group_template_id
resolved_variables
resolved_sql
template_sql
sql_error
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.RoutingAction(client, id=None, title=None, prompt=None, placeholder=None, value=None, displayName=None, isLarge=None, isMedium=None)

Bases: abacusai.return_class.AbstractApiClass

Routing action

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • id (str) – The id of the routing action.

  • title (str) – The title of the routing action.

  • prompt (str) – The prompt of the routing action.

  • placeholder (str) – The placeholder of the routing action.

  • value (str) – The value of the routing action.

  • displayName (str) – The display name of the routing action.

  • isLarge (bool) – UI placement

  • isMedium (bool) – UI placement

id
title
prompt
placeholder
value
display_name
is_large
is_medium
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.Schema(client, name=None, featureMapping=None, detectedFeatureMapping=None, featureType=None, detectedFeatureType=None, dataType=None, detectedDataType=None, nestedFeatures={}, pointInTimeInfo={})

Bases: abacusai.return_class.AbstractApiClass

A schema description for a feature

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • name (str) – The unique name of the feature.

  • featureMapping (str) – The mapping of the feature. The possible values will be based on the project’s use-case. See the Use Case Documentation (https://api.abacus.ai/app/help/useCases) for more details.

  • detectedFeatureMapping (str) – Detected feature mapping for this feature

  • featureType (str) – The underlying data type of each feature: CATEGORICAL, CATEGORICAL_LIST, NUMERICAL, TIMESTAMP, TEXT, EMAIL, LABEL_LIST, ENTITY_LABEL_LIST, PAGE_LABEL_LIST, JSON, OBJECT_REFERENCE, MULTICATEGORICAL_LIST, COORDINATE_LIST, NUMERICAL_LIST, TIMESTAMP_LIST, ZIPCODE, URL, PAGE_INFOS, PAGES_DOCUMENT, TOKENS_DOCUMENT, MESSAGE_LIST.

  • detectedFeatureType (str) – The detected feature type for this feature

  • dataType (str) – The underlying data type of each feature: INTEGER, FLOAT, STRING, DATE, DATETIME, BOOLEAN, LIST, STRUCT, NULL, BINARY.

  • detectedDataType (str) – The detected data type for this feature

  • nestedFeatures (NestedFeatureSchema) – List of features of nested feature

  • pointInTimeInfo (PointInTimeFeatureInfo) – Point in time information for this feature

name
feature_mapping
detected_feature_mapping
feature_type
detected_feature_type
data_type
detected_data_type
nested_features
point_in_time_info
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.StreamingAuthToken(client, streamingToken=None, createdAt=None)

Bases: abacusai.return_class.AbstractApiClass

A streaming authentication token that is used to authenticate requests to append data to streaming datasets

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • streamingToken (str) – The unique token used to authenticate requests

  • createdAt (str) – When the token was created

streaming_token
created_at
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.StreamingClient(client_options=None)

Bases: abacusai.client.BaseApiClient

Abacus.AI Streaming API Client. Does not utilize authentication and only contains public streaming methods

Parameters:

client_options (ClientOptions) – Optional API client configurations

upsert_item_embeddings(streaming_token, model_id, item_id, vector, catalog_id=None)

Upserts an embedding vector for an item ID for the given model ID.

Parameters:
  • streaming_token (str) – The streaming token for authenticating requests to the model.

  • model_id (str) – A unique string identifier for the model to upsert item embeddings to.

  • item_id (str) – The item id for which its embeddings will be upserted.

  • vector (list) – The embedding vector.

  • catalog_id (str) – The name of the catalog in the model to update.

delete_item_embeddings(streaming_token, model_id, item_ids, catalog_id=None)

Deletes KNN embeddings for a list of item IDs for a given model ID.

Parameters:
  • streaming_token (str) – The streaming token for authenticating requests to the model.

  • model_id (str) – A unique string identifier for the model from which to delete item embeddings.

  • item_ids (list) – A list of item IDs whose embeddings will be deleted.

  • catalog_id (str) – An optional name to specify which catalog in a model to update.

upsert_multiple_item_embeddings(streaming_token, model_id, upserts, catalog_id=None)

Upserts KNN embeddings for multiple item IDs for the given model ID.

Parameters:
  • streaming_token (str) – The streaming token for authenticating requests to the model.

  • model_id (str) – The unique string identifier of the model to upsert item embeddings to.

  • upserts (list) – A list of dictionaries of the form {‘itemId’: …, ‘vector’: […]} for each upsert.

  • catalog_id (str) – Name of the catalog in the model to update.
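
For example, using the upsert format documented above (the token, model ID, and vectors are placeholders):

   from abacusai import StreamingClient

   streaming_client = StreamingClient()
   streaming_client.upsert_multiple_item_embeddings(
       streaming_token="streaming_token",   # hypothetical
       model_id="model_id",                 # hypothetical
       upserts=[
           {"itemId": "item_1", "vector": [0.1, 0.2, 0.3]},
           {"itemId": "item_2", "vector": [0.4, 0.5, 0.6]},
       ],
   )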

append_data(feature_group_id, streaming_token, data)

Appends new data into the feature group for a given lookup key recordId.

Parameters:
  • feature_group_id (str) – Unique string identifier for the streaming feature group to record data to.

  • streaming_token (str) – The streaming token for authenticating requests.

  • data (dict) – The data to record as a JSON object.

append_multiple_data(feature_group_id, streaming_token, data)

Appends multiple rows of new data into the feature group.

Parameters:
  • feature_group_id (str) – Unique string identifier of the streaming feature group to record data to.

  • streaming_token (str) – Streaming token for authenticating requests.

  • data (list) – Data to record, as a list of JSON objects.
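
For example, streaming one record and then a small batch, reusing the streaming_client from the sketch above (IDs, token, and fields are placeholders):

   streaming_client.append_data(
       feature_group_id="fg_id",            # hypothetical
       streaming_token="streaming_token",   # hypothetical
       data={"record_id": "u123", "clicks": 4},
   )
   streaming_client.append_multiple_data(
       feature_group_id="fg_id",
       streaming_token="streaming_token",
       data=[
           {"record_id": "u124", "clicks": 1},
           {"record_id": "u125", "clicks": 7},
       ],
   )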

class abacusai.StreamingConnector(client, streamingConnectorId=None, service=None, name=None, createdAt=None, status=None, auth=None)

Bases: abacusai.return_class.AbstractApiClass

A connector to an external service

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • streamingConnectorId (str) – The unique ID for the connection.

  • service (str) – The service this connection connects to

  • name (str) – A user-friendly name for the service

  • createdAt (str) – When the connector was created

  • status (str) – The status of the streaming connector

  • auth (dict) – Non-secret connection information for this connector

streaming_connector_id
service
name
created_at
status
auth
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

verify()

Checks to see if Abacus.AI can access the streaming connector.

Parameters:

streaming_connector_id (str) – Unique string identifier for the streaming connector to be checked for Abacus.AI access.

rename(name)

Renames a Streaming Connector

Parameters:

name (str) – A new name for the streaming connector.

delete()

Delete a streaming connector.

Parameters:

streaming_connector_id (str) – The unique identifier for the streaming connector.

class abacusai.StreamingRowCount(client, count=None, startTsMs=None)

Bases: abacusai.return_class.AbstractApiClass

Returns the number of rows in a streaming feature group from the specified time

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • count (int) – The number of rows in the feature group

  • startTsMs (int) – The start time for the number of rows.

count
start_ts_ms
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.StreamingSampleCode(client, python=None, curl=None, console=None)

Bases: abacusai.return_class.AbstractApiClass

Sample code for adding to a streaming feature group with examples from different locations.

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • python (str) – The python code sample.

  • curl (str) – The curl code sample.

  • console (str) – The console code sample

python
curl
console
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.TemplateNodeDetails(client, notebookCode=None, workflowGraphNode={})

Bases: abacusai.return_class.AbstractApiClass

Details about WorkflowGraphNode object and notebook code for adding template nodes in workflow.

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • notebookCode (list) – The boilerplate code shown in the notebook for creating a workflow graph node from the corresponding template.

  • workflowGraphNode (WorkflowGraphNode) – The workflow graph node object corresponding to the template.

notebook_code
workflow_graph_node
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.TestPointPredictions(client, count=None, columns=None, data=None, metricsColumns=None, summarizedMetrics=None, errorDescription=None)

Bases: abacusai.return_class.AbstractApiClass

Test Point Predictions

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • count (int) – Count of total rows in the preview data for the SQL.

  • columns (list) – The returned columns

  • data (list) – A list of data rows, each represented as a list.

  • metricsColumns (list) – The columns that are the metrics.

  • summarizedMetrics (dict) – A map between the problem type metrics and the mean of the results matching the query

  • errorDescription (str) – Description of an error in case of failure.

count
columns
data
metrics_columns
summarized_metrics
error_description
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.ToneDetails(client, voiceId=None, name=None, gender=None, language=None, age=None, accent=None, useCase=None, description=None)

Bases: abacusai.return_class.AbstractApiClass

Tone details for audio

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • voiceId (str) – The voice id

  • name (str) – The name

  • gender (str) – The gender

  • language (str) – The language

  • age (str) – The age

  • accent (str) – The accent

  • useCase (str) – The use case

  • description (str) – The description

voice_id
name
gender
language
age
accent
use_case
description
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.TrainingConfigOptions(client, name=None, dataType=None, valueType=None, valueOptions=None, value=None, default=None, options=None, description=None, required=None, lastModelValue=None, needsRefresh=None)

Bases: abacusai.return_class.AbstractApiClass

Training options for a model

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • name (str) – The name of the parameter

  • dataType (str) – The type of input required for this option

  • valueType (str) – If the data_type is of type DICT_VALUES, this field specifies the expected value type of the values

  • valueOptions (list[str]) – The list of valid values for DICT_VALUES

  • value (optional[any]) – The value of this option

  • default (optional[any]) – The default value for this option

  • options (dict) – A dict of options for this parameter

  • description (str) – A description of the parameter

  • required (bool) – True if the parameter is required for training

  • lastModelValue (optional[str, int, float, bool]) – The last value used to train a model in this project

  • needsRefresh (bool) – True if training config needs to be fetched again when this config option is changed

name
data_type
value_type
value_options
value
default
options
description
required
last_model_value
needs_refresh
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.TwitterSearchResult(client, title=None, url=None, twitterName=None, twitterHandle=None, thumbnailUrl=None, thumbnailWidth=None, thumbnailHeight=None)

Bases: abacusai.return_class.AbstractApiClass

A single Twitter search result.

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • title (str) – The title of the tweet.

  • url (str) – The URL of the tweet.

  • twitterName (str) – The name of the twitter user.

  • twitterHandle (str) – The handle of the twitter user.

  • thumbnailUrl (str) – The URL of the thumbnail of the tweet.

  • thumbnailWidth (int) – The width of the thumbnail of the tweet.

  • thumbnailHeight (int) – The height of the thumbnail of the tweet.

title
url
twitter_name
twitter_handle
thumbnail_url
thumbnail_width
thumbnail_height
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.Upload(client, uploadId=None, datasetUploadId=None, status=None, datasetId=None, datasetVersion=None, modelId=None, modelVersion=None, batchPredictionId=None, parts=None, createdAt=None)

Bases: abacusai.return_class.AbstractApiClass

An Upload reference for uploading file parts

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • uploadId (str) – The unique ID generated when the upload process of the full large file in smaller parts is initiated.

  • datasetUploadId (str) – Same as upload_id. It is kept for backwards compatibility purposes.

  • status (str) – The current status of the upload.

  • datasetId (str) – A reference to the dataset this upload is adding data to.

  • datasetVersion (str) – A reference to the dataset version the upload is adding data to.

  • modelId (str) – A reference to the model the upload is creating a version for

  • modelVersion (str) – A reference to the model version the upload is creating.

  • batchPredictionId (str) – A reference to the batch prediction the upload is creating.

  • parts (list[dict]) – A list containing the order of the file parts that have been uploaded.

  • createdAt (str) – The timestamp at which the upload was created.

upload_id
dataset_upload_id
status
dataset_id
dataset_version
model_id
model_version
batch_prediction_id
parts
created_at
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

cancel()

Cancels an upload.

Parameters:

upload_id (str) – A unique string identifier for the upload.

part(part_number, part_data)

Uploads part of a large dataset file from your bucket to our system. Our system currently supports parts of up to 5GB and full files of up to 5TB. Note that each part must be at least 5MB in size, unless it is the last part in the sequence of parts for the full file.

Parameters:
  • part_number (int) – The 1-indexed number denoting the position of the file part in the sequence of parts for the full file.

  • part_data (io.TextIOBase) – The multipart/form-data for the current part of the full file.

Returns:

The object ‘UploadPart’ which encapsulates the hash and the etag for the part that got uploaded.

Return type:

UploadPart

mark_complete()

Marks an upload process as complete.

Parameters:

upload_id (str) – A unique string identifier for the upload process.

Returns:

The upload object associated with the process, containing details of the file.

Return type:

Upload

refresh()

Calls describe and refreshes the current object’s fields

Returns:

The current object

Return type:

Upload

describe()

Retrieves the current upload status (complete or inspecting) and the list of file parts uploaded for a specified dataset upload.

Parameters:

upload_id (str) – The unique ID associated with the file uploaded or being uploaded in parts.

Returns:

Details associated with the large dataset file uploaded in parts.

Return type:

Upload

upload_part(upload_args)

Uploads a file part.

Returns:

The object ‘UploadPart’ that encapsulates the hash and the etag for the part that got uploaded.

Return type:

UploadPart

upload_file(file, threads=10, chunksize=1024 * 1024 * 10, wait_timeout=600)

Uploads the file in the specified chunk size using the specified number of workers.

Parameters:
  • file (IOBase) – A BytesIO or StringIO object to upload to Abacus.AI

  • threads (int) – The max number of workers to use while uploading the file

  • chunksize (int) – The number of bytes to use for each chunk while uploading the file. Defaults to 10 MB

  • wait_timeout (int) – The max number of seconds to wait for the file parts to be joined on Abacus.AI. Defaults to 600.

Returns:

The upload file object.

Return type:

Upload

_yield_upload_part(file, chunksize)
wait_for_join(timeout=600)

A waiting call until the upload parts are joined.

Parameters:

timeout (int) – The waiting time given to the call to finish, if it doesn’t finish by the allocated time, the call is said to have timed out. Defaults to 600.

get_status()

Gets the status of the upload.

Returns:

A string describing the status of the upload (pending, complete, etc.).

Return type:

str
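
For example, given an Upload handle returned by a dataset-creation call (how the handle is obtained is outside this class's docs):

   # The file is split into 10 MB parts and sent with 4 worker threads.
   with open("data.csv", "rb") as f:
       upload = upload.upload_file(f, threads=4, chunksize=10 * 1024 * 1024)
   print(upload.get_status())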

class abacusai.UploadPart(client, etag=None, md5=None)

Bases: abacusai.return_class.AbstractApiClass

Unique identifiers for a part

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • etag (str) – A unique string for this part.

  • md5 (str) – The MD5 hash of this part.

etag
md5
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.UseCase(client, useCase=None, prettyName=None, description=None, problemType=None)

Bases: abacusai.return_class.AbstractApiClass

A Project Use Case

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • useCase (str) – The enum value for this use case

  • prettyName (str) – A user-friendly name

  • description (str) – A description for this use case

  • problemType (str) – Name for the underlying problem type

use_case
pretty_name
description
problem_type
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.UseCaseRequirements(client, datasetType=None, name=None, description=None, required=None, multi=None, allowedFeatureMappings=None, allowedNestedFeatureMappings=None)

Bases: abacusai.return_class.AbstractApiClass

Use Case Requirements

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • datasetType (str) – The project-specific enum value of the dataset type.

  • name (str) – The user-friendly name of the dataset type.

  • description (str) – The description of the dataset type.

  • required (bool) – True if the dataset type is required for this project.

  • multi (bool) – If true, multiple versions of the dataset type can be used for training.

  • allowedFeatureMappings (dict) – A collection of key-value pairs, with each key being a column mapping enum and each value a dictionary of the form { “description”: str, “allowed_feature_types”: feature_type_enum, “required”: bool } (see the sketch after this class entry).

  • allowedNestedFeatureMappings (dict) – A collection of key-value pairs in the same format as allowedFeatureMappings, applied to nested feature column mappings.

dataset_type
name
description
required
multi
allowed_feature_mappings
allowed_nested_feature_mappings
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict
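
The allowed_feature_mappings shape described above can be pictured as follows; the “TIMESTAMP”/“TARGET” keys and the feature types are hypothetical and depend on the project’s use case.

    # Hypothetical illustration of the allowed_feature_mappings format;
    # the enum keys and feature types here are invented for this sketch.
    allowed_feature_mappings = {
        "TIMESTAMP": {
            "description": "When the event occurred",
            "allowed_feature_types": "DATETIME",
            "required": True,
        },
        "TARGET": {
            "description": "The value the model should predict",
            "allowed_feature_types": "NUMERICAL",
            "required": True,
        },
    }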

class abacusai.User(client, name=None, email=None, createdAt=None, status=None, organizationGroups={})

Bases: abacusai.return_class.AbstractApiClass

An Abacus.AI User

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • name (str) – The User’s name.

  • email (str) – The User’s primary email address.

  • createdAt (str) – The date and time when the user joined Abacus.AI.

  • status (str) – ACTIVE when the user has accepted an invite to join the organization, else INVITED.

  • organizationGroups (OrganizationGroup) – List of Organization Groups this user belongs to.

name
email
created_at
status
organization_groups
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.UserException(client, type=None, value=None, traceback=None)

Bases: abacusai.return_class.AbstractApiClass

Exception information for errors in usercode.

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • type (str) – The type of exception

  • value (str) – The value of the exception

  • traceback (str) – The traceback of the exception

type
value
traceback
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.VideoGenSettings(client, prompt=None, negativePrompt=None, cfgScale=None, mode=None, aspectRatio=None, duration=None, loop=None, startFrame=None, endFrame=None, rewritePrompt=None)

Bases: abacusai.return_class.AbstractApiClass

Video generation settings

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • prompt (dict) – The prompt for the video.

  • negativePrompt (dict) – The negative prompt for the video.

  • cfgScale (dict) – The flexibility scale for video generation.

  • mode (dict) – The video generation mode (standard or professional).

  • aspectRatio (dict) – The aspect ratio of the video.

  • duration (dict) – The duration of the video, in seconds.

  • loop (dict) – Whether the video should loop.

  • startFrame (dict) – The start frame of the video.

  • endFrame (dict) – The end frame of the video.

  • rewritePrompt (dict) – Whether to rewrite the prompt.

prompt
negative_prompt
cfg_scale
mode
aspect_ratio
duration
loop
start_frame
end_frame
rewrite_prompt
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.VideoSearchResult(client, title=None, url=None, thumbnailUrl=None, motionThumbnailUrl=None, embedUrl=None)

Bases: abacusai.return_class.AbstractApiClass

A single video search result.

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • title (str) – The title of the video.

  • url (str) – The URL of the video.

  • thumbnailUrl (str) – The URL of the thumbnail of the video.

  • motionThumbnailUrl (str) – The URL of the motion thumbnail of the video.

  • embedUrl (str) – The URL of the embed of the video.

title
url
thumbnail_url
motion_thumbnail_url
embed_url
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.WebSearchResponse(client, searchResults={})

Bases: abacusai.return_class.AbstractApiClass

Result of running a web search with optional content fetching.

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • searchResults (WebSearchResult) – List of search results.

search_results
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.WebSearchResult(client, title=None, url=None, snippet=None, news=None, place=None, entity=None, content=None)

Bases: abacusai.return_class.AbstractApiClass

A single search result.

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • title (str) – The title of the search result.

  • url (str) – The URL of the search result.

  • snippet (str) – The snippet of the search result.

  • news (str) – The news search result (if any)

  • place (str) – The place search result (if any)

  • entity (str) – The entity search result (if any)

  • content (str) – The page content fetched from the URL.

title
url
snippet
news
place
entity
content
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict
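
Putting the two classes together, a minimal sketch for walking the results; obtaining the WebSearchResponse named response is assumed to happen via a separate client call.

    # `response` is assumed to be a WebSearchResponse from a prior call.
    for result in response.search_results:
        print(result.title, result.url)
        if result.content:  # populated only when content fetching was requested
            print(result.content[:200])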

class abacusai.Webhook(client, webhookId=None, deploymentId=None, endpoint=None, webhookEventType=None, payloadTemplate=None, createdAt=None)

Bases: abacusai.return_class.AbstractApiClass

An Abacus.AI Webhook attached to an endpoint and event trigger for a given object.

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • webhookId (str) – Unique identifier for this webhook.

  • deploymentId (str) – Identifier for the deployment this webhook is attached to.

  • endpoint (str) – The URI this webhook will send HTTP POST requests to.

  • webhookEventType (str) – The event that triggers the webhook action.

  • payloadTemplate (str) – Template for the JSON dictionary sent as the body of the POST request.

  • createdAt (str) – The date and time this webhook was created.

webhook_id
deployment_id
endpoint
webhook_event_type
payload_template
created_at
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

refresh()

Calls describe and refreshes the current object’s fields

Returns:

The current object

Return type:

Webhook

describe()

Describe the webhook with a given ID.

Parameters:

webhook_id (str) – Unique string identifier of the target webhook.

Returns:

The webhook with the given ID.

Return type:

Webhook

update(endpoint=None, webhook_event_type=None, payload_template=None)

Update the webhook

Parameters:
  • endpoint (str) – If provided, changes the webhook’s endpoint.

  • webhook_event_type (str) – If provided, changes the event type.

  • payload_template (dict) – If provided, changes the payload template.

delete()

Delete the webhook

Parameters:

webhook_id (str) – Unique identifier of the target webhook.
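
A minimal maintenance sketch using only the methods documented above; the endpoint URL and the payload template keys are placeholders, and webhook is assumed to be a Webhook handle fetched elsewhere.

    # `webhook` is assumed to be a Webhook handle from a prior API call.
    webhook.update(
        endpoint="https://example.com/hooks/abacus",       # placeholder URL
        payload_template={"event": "{{webhook_event_type}}"},  # hypothetical template
    )
    print(webhook.refresh().endpoint)  # refresh() re-describes and returns self
    webhook.delete()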

class abacusai.WorkflowNodeTemplate(client, workflowNodeTemplateId=None, name=None, functionName=None, sourceCode=None, description=None, packageRequirements=None, tags=None, additionalConfigs=None, inputs={}, outputs={}, templateConfigs={})

Bases: abacusai.return_class.AbstractApiClass

A workflow node template.

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • workflowNodeTemplateId (str) – The unique identifier of the workflow node template.

  • name (str) – The name of the workflow node template.

  • functionName (str) – The function name of the workflow node function.

  • sourceCode (str) – The source code of the function that the workflow node template will execute.

  • description (str) – A description of the workflow node template.

  • packageRequirements (list[str]) – A list of package requirements that the node source code may need.

  • tags (dict) – Tags to add to the workflow node template. Contains information on the intended usage of the template.

  • additionalConfigs (dict) – Additional configurations for the workflow node template.

  • inputs (WorkflowNodeTemplateInput) – A list of inputs that the workflow node template will use.

  • outputs (WorkflowNodeTemplateOutput) – A list of outputs that the workflow node template will give.

  • templateConfigs (WorkflowNodeTemplateConfig) – A list of template configs that are hydrated into the source code to produce the complete executable code.

workflow_node_template_id
name
function_name
source_code
description
package_requirements
tags
additional_configs
inputs
outputs
template_configs
deprecated_keys
__repr__()
to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict
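
A minimal inspection sketch, assuming template is a WorkflowNodeTemplate fetched elsewhere; it uses only the attributes listed above.

    # `template` is assumed to be a WorkflowNodeTemplate handle.
    print(template.name, template.function_name)
    print(template.package_requirements)  # packages the node source may need
    print(template.source_code)           # code the template configs hydrate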

abacusai.__version__ = '1.4.20'