abacusai.pipeline
Classes
A Pipeline For Steps.
Module Contents
- class abacusai.pipeline.Pipeline(client, pipelineName=None, pipelineId=None, createdAt=None, notebookId=None, cron=None, nextRunTime=None, isProd=None, warning=None, createdBy=None, steps={}, pipelineReferences={}, latestPipelineVersion={}, codeSource={}, pipelineVariableMappings={})
Bases:
abacusai.return_class.AbstractApiClass
A Pipeline For Steps.
- Parameters:
client (ApiClient) – An authenticated API Client instance
pipelineName (str) – The name of the pipeline.
pipelineId (str) – The unique identifier of the pipeline.
createdAt (str) – The date and time at which the pipeline was created.
notebookId (str) – The reference to the notebook this pipeline belongs to.
cron (str) – A cron-style string that describes when this refresh policy is to be executed, in UTC.
nextRunTime (str) – The next time this pipeline will be run.
isProd (bool) – Whether this pipeline is a production pipeline.
warning (str) – Warning message for possible errors that might occur if the pipeline is run.
createdBy (str) – The email of the user who created the pipeline.
steps (PipelineStep) – A list of the pipeline steps attached to the pipeline.
pipelineReferences (PipelineReference) – A list of references from the pipeline to other objects.
latestPipelineVersion (PipelineVersion) – The latest version of the pipeline.
codeSource (CodeSource) – Information on the source code.
pipelineVariableMappings (PythonFunctionArgument) – A description of the function variables passed into the pipeline.
- pipeline_name
- pipeline_id
- created_at
- notebook_id
- cron
- next_run_time
- is_prod
- warning
- created_by
- steps
- pipeline_references
- latest_pipeline_version
- code_source
- pipeline_variable_mappings
- deprecated_keys
- __repr__()
- to_dict()
Gets a dict representation of the parameters in this class.
- Returns:
The dict value representation of the class parameters.
- Return type:
dict
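One use of `to_dict()` is writing a pipeline's state to a log or file. The following is a minimal sketch that relies only on the `to_dict()` method documented above. `pipeline_to_json` is a hypothetical helper, not part of the library, and `default=str` is there on the assumption that some values (timestamps, nested objects) may not be JSON-native.

```python
import json


def pipeline_to_json(pipeline, indent=2):
    """Serialize a pipeline's parameters to a JSON string.

    `pipeline` is assumed to expose `to_dict()` as documented above.
    This helper is illustrative and not part of the abacusai package.
    """
    # default=str stringifies any value that json cannot encode natively.
    return json.dumps(pipeline.to_dict(), indent=indent, sort_keys=True, default=str)
```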
- refresh()
Calls describe and refreshes the current object’s fields.
- Returns:
The current object.
- Return type:
Pipeline
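Since `refresh()` re-reads the object's fields, it can drive a simple polling loop. The sketch below is one possible shape for such a loop. `wait_until`, `predicate`, and the `sleep` hook are hypothetical names introduced for illustration, and the timeout and interval values are arbitrary defaults.

```python
import time


def wait_until(pipeline, predicate, timeout=60.0, interval=5.0, sleep=time.sleep):
    """Refresh `pipeline` until `predicate(pipeline)` is true or `timeout` elapses.

    Returns True if the predicate was satisfied, False on timeout.
    `pipeline` is assumed to expose `refresh()` as documented above.
    """
    deadline = time.monotonic() + timeout
    while True:
        pipeline.refresh()  # re-read the latest field values
        if predicate(pipeline):
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval)


# Illustrative usage (requires a real Pipeline object):
#   wait_until(pipeline, lambda p: p.next_run_time is not None)
```

The `sleep` parameter exists only so the loop can be exercised in tests without real delays.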
- describe()
Describes a given pipeline.
- update(project_id=None, pipeline_variable_mappings=None, cron=None, is_prod=None)
Updates a pipeline for executing multiple steps.
- Parameters:
project_id (str) – A unique string identifier for the project.
pipeline_variable_mappings (List) – List of Python function arguments for the pipeline.
cron (str) – A cron-like string specifying the frequency of the scheduled pipeline runs.
is_prod (bool) – Whether the pipeline is a production pipeline or not.
- Returns:
An object that describes a Pipeline.
- Return type:
Pipeline
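A hedged sketch of how `update` might be used to put a pipeline on a schedule follows. `schedule_pipeline` is a hypothetical helper, the cron value is only an example, and the five-field check assumes the standard cron layout (minute, hour, day of month, month, day of week), which this reference does not itself confirm.

```python
def schedule_pipeline(pipeline, cron="0 2 * * *", is_prod=True):
    """Set a cron schedule on a pipeline and return the updated object.

    `pipeline` is assumed to expose `update` with the keyword arguments
    documented above. This helper is illustrative, not part of the library.
    """
    # ASSUMPTION: the service expects a standard five-field cron string.
    if len(cron.split()) != 5:
        raise ValueError(f"expected a 5-field cron string, got {cron!r}")
    return pipeline.update(cron=cron, is_prod=is_prod)
```

Validating locally keeps an obviously malformed schedule from reaching the API call at all.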
- rename(pipeline_name)
Renames a pipeline.
- list_versions(limit=200)
Lists the pipeline versions for a specified pipeline.
- Parameters:
limit (int) – The maximum number of pipeline versions to return.
- Returns:
A list of pipeline versions.
- Return type:
List[PipelineVersion]
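As an illustration of the `limit` parameter, the sketch below fetches a small number of versions and converts them to plain dicts. `latest_versions` is a hypothetical helper, and it assumes that each returned version object exposes `to_dict()` in the same way `Pipeline` does.

```python
def latest_versions(pipeline, count=5):
    """Return up to `count` pipeline versions as plain dicts.

    `pipeline` is assumed to expose `list_versions(limit=...)` as documented
    above. This helper is illustrative, not part of the library.
    """
    if count < 1:
        raise ValueError("count must be at least 1")
    versions = pipeline.list_versions(limit=count)
    # ASSUMPTION: version objects provide to_dict(), like Pipeline itself.
    return [version.to_dict() for version in versions]
```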
- run(pipeline_variable_mappings=None)
Runs a specified pipeline with the arguments provided.
- Parameters:
pipeline_variable_mappings (List) – List of Python function arguments for the pipeline.
- Returns:
The object describing the pipeline
- Return type:
- create_step(step_name, function_name=None, source_code=None, step_input_mappings=None, output_variable_mappings=None, step_dependencies=None, package_requirements=None, cpu_size=None, memory=None, timeout=None)
Creates a step in a given pipeline.
- Parameters:
step_name (str) – The name of the step.
function_name (str) – The name of the Python function.
source_code (str) – Contents of a valid Python source code file. The source code should contain the transform feature group functions. A list of allowed imports and system libraries for each language is specified in the user functions documentation section.
step_input_mappings (List) – List of Python function arguments.
output_variable_mappings (List) – List of Python function outputs.
step_dependencies (list) – List of step names this step depends on.
package_requirements (list) – List of package requirement strings. For example: ['numpy==1.2.3', 'pandas>=1.4.0'].
cpu_size (str) – Size of the CPU for the step function.
memory (int) – Memory (in GB) for the step function.
timeout (int) – Timeout for the step, in minutes. The default is 300 minutes.
- Returns:
Object describing the pipeline.
- Return type:
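Because `source_code` is passed as a string, it can be worth checking locally that it parses and actually defines `function_name` before creating the step. The following is a minimal sketch of that idea. `add_source_step` is a hypothetical wrapper, and the check covers only top-level `def` statements.

```python
import ast


def add_source_step(pipeline, step_name, function_name, source_code,
                    step_dependencies=None, package_requirements=None):
    """Validate `source_code` locally, then call `create_step`.

    `pipeline` is assumed to expose `create_step` with the keyword arguments
    documented above. This helper is illustrative, not part of the library.
    """
    tree = ast.parse(source_code)  # raises SyntaxError on invalid source
    defined = {node.name for node in tree.body if isinstance(node, ast.FunctionDef)}
    if function_name not in defined:
        raise ValueError(f"{function_name!r} is not defined at the top level of source_code")
    return pipeline.create_step(
        step_name=step_name,
        function_name=function_name,
        source_code=source_code,
        step_dependencies=list(step_dependencies or []),
        package_requirements=list(package_requirements or []),
    )
```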
- describe_step_by_name(step_name)
Describes a pipeline step by the step name.
- Parameters:
step_name (str) – The name of the step.
- Returns:
An object describing the pipeline step.
- Return type:
PipelineStep
- unset_refresh_schedule()
Deletes the refresh schedule for a given pipeline.
- pause_refresh_schedule()
Pauses the refresh schedule for a given pipeline.
- resume_refresh_schedule()
Resumes the refresh schedule for a given pipeline.
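The pause and resume calls pair naturally, for example around maintenance work that should not overlap with a scheduled run. One way to keep them paired is a context manager, sketched below. `schedule_paused` is a hypothetical helper that assumes only the two methods documented above.

```python
from contextlib import contextmanager


@contextmanager
def schedule_paused(pipeline):
    """Pause the pipeline's refresh schedule for the duration of a `with` block.

    The schedule is resumed even if the block raises. This helper is
    illustrative, not part of the library.
    """
    pipeline.pause_refresh_schedule()
    try:
        yield pipeline
    finally:
        pipeline.resume_refresh_schedule()


# Illustrative usage (requires a real Pipeline object):
#   with schedule_paused(pipeline):
#       pipeline.run()
```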
- create_step_from_function(step_name, function, step_input_mappings=None, output_variable_mappings=None, step_dependencies=None, package_requirements=None, cpu_size=None, memory=None)
Creates a step in the pipeline from a Python function.
- Parameters:
step_name (str) – The name of the step.
function (callable) – The Python function.
step_input_mappings (List[PythonFunctionArguments]) – List of Python function arguments.
output_variable_mappings (List[OutputVariableMapping]) – List of Python function outputs.
step_dependencies (List[str]) – List of step names this step depends on.
package_requirements (list) – List of package requirement strings. For example: ['numpy==1.2.3', 'pandas>=1.4.0'].
cpu_size (str) – Size of the CPU for the step function.
memory (int) – Memory (in GB) for the step function.
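As a rough sketch of `create_step_from_function`, the wrapper below names the step after the function it wraps. `add_function_step` and `clean_rows` are hypothetical names, and using the function name as the step name is a convention chosen for this example. Lambdas are rejected on the assumption that the library needs a named function whose source it can recover.

```python
def clean_rows(rows):
    """Example step body: drop empty rows."""
    return [row for row in rows if row]


def add_function_step(pipeline, function, step_dependencies=None,
                      package_requirements=None):
    """Create a pipeline step named after `function`.

    `pipeline` is assumed to expose `create_step_from_function` with the
    keyword arguments documented above. Illustrative only.
    """
    if not callable(function):
        raise TypeError("function must be callable")
    # ASSUMPTION: a lambda has no usable name or recoverable source.
    if function.__name__ == "<lambda>":
        raise ValueError("pass a named function, not a lambda")
    return pipeline.create_step_from_function(
        step_name=function.__name__,
        function=function,
        step_dependencies=list(step_dependencies or []),
        package_requirements=list(package_requirements or []),
    )
```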