abacusai.upload
Classes
An Upload reference for uploading file parts
Module Contents
- class abacusai.upload.Upload(client, uploadId=None, datasetUploadId=None, status=None, datasetId=None, datasetVersion=None, modelId=None, modelVersion=None, batchPredictionId=None, parts=None, createdAt=None)
Bases:
abacusai.return_class.AbstractApiClass
An Upload reference for uploading file parts
- Parameters:
client (ApiClient) – An authenticated API Client instance
uploadId (str) – The unique ID generated when the upload of a large file in smaller parts is initiated.
datasetUploadId (str) – Same as upload_id. Kept for backward compatibility.
status (str) – The current status of the upload.
datasetId (str) – A reference to the dataset this upload is adding data to.
datasetVersion (str) – A reference to the dataset version the upload is adding data to.
modelId (str) – A reference to the model the upload is creating a version for.
modelVersion (str) – A reference to the model version the upload is creating.
batchPredictionId (str) – A reference to the batch prediction the upload is creating.
parts (list[dict]) – A list containing the order of the file parts that have been uploaded.
createdAt (str) – The timestamp at which the upload was created.
- upload_id
- dataset_upload_id
- status
- dataset_id
- dataset_version
- model_id
- model_version
- batch_prediction_id
- parts
- created_at
- deprecated_keys
- __repr__()
- to_dict()
Get a dict representation of the parameters in this class
- Returns:
The dict value representation of the class parameters
- Return type:
dict
- cancel()
Cancels an upload.
- Parameters:
upload_id (str) – A unique string identifier for the upload.
- part(part_number, part_data)
Uploads one part of a large dataset file from your bucket to our system. Our system currently supports parts of up to 5 GB and full files of up to 5 TB. Each part must be at least 5 MB in size, unless it is the last part in the sequence of parts for the full file.
- Parameters:
part_number (int) – The 1-indexed number denoting the position of the file part in the sequence of parts for the full file.
part_data (io.TextIOBase) – The multipart/form-data for the current part of the full file.
- Returns:
The object ‘UploadPart’ which encapsulates the hash and the etag for the part that got uploaded.
- Return type:
UploadPart
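The size limits above can be checked before any part is sent. The following is a minimal sketch, not part of the abacusai package: the helper name plan_parts and the constants are hypothetical, and reading 5 MB, 5 GB and 5 TB as binary units (powers of 1024) is an assumption.

```python
# Hypothetical constants; binary units are an assumption, not stated by the API docs.
MIN_PART_SIZE = 5 * 1024 ** 2  # 5 MB minimum for every part except the last
MAX_PART_SIZE = 5 * 1024 ** 3  # 5 GB maximum per part
MAX_FILE_SIZE = 5 * 1024 ** 4  # 5 TB maximum for the full file


def plan_parts(file_size, part_size):
    """Return a list of (part_number, offset, length) tuples covering the file."""
    if not 0 < file_size <= MAX_FILE_SIZE:
        raise ValueError("file size must be between 1 byte and 5 TB")
    if not MIN_PART_SIZE <= part_size <= MAX_PART_SIZE:
        raise ValueError("part size must be between 5 MB and 5 GB")
    plan = []
    offset = 0
    part_number = 1  # part numbers are 1-indexed
    while offset < file_size:
        # Only the final part may be shorter than part_size.
        length = min(part_size, file_size - offset)
        plan.append((part_number, offset, length))
        offset += length
        part_number += 1
    return plan
```

Because every part except the last has exactly part_size bytes, validating part_size once is enough to satisfy the per-part minimum.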
- mark_complete()
Marks an upload process as complete.
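A manual upload combines part() and mark_complete(): send each part in order, then mark the upload as complete. The sketch below assumes only the two method signatures documented here. The helper upload_in_parts is hypothetical, and wrapping each chunk in io.BytesIO is an assumption about what part_data accepts.

```python
import io


def upload_in_parts(upload, stream, part_size):
    """Send `stream` through upload.part() chunk by chunk, then mark it complete.

    `upload` is any object exposing part(part_number, part_data) and
    mark_complete(), such as an Upload instance. Returns the number of parts sent.
    """
    part_number = 0
    while True:
        chunk = stream.read(part_size)
        if not chunk:
            break
        part_number += 1  # part numbers are 1-indexed
        # Assumption: part_data may be a file-like object holding the chunk.
        upload.part(part_number, io.BytesIO(chunk))
    upload.mark_complete()
    return part_number
```

The helper expects a binary stream, for example a file opened with open(path, "rb"). It sends parts sequentially; upload_file() is the documented route when parallel workers are wanted.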
- refresh()
Calls describe and refreshes the current object’s fields
- Returns:
The current object
- Return type:
Upload
- describe()
Retrieves the current upload status (complete or inspecting) and the list of file parts uploaded for a specified dataset upload.
- upload_part(upload_args)
Uploads a file part.
- Returns:
The object ‘UploadPart’ that encapsulates the hash and the etag for the part that got uploaded.
- Return type:
UploadPart
- upload_file(file, threads=10, chunksize=1024 * 1024 * 10, wait_timeout=600)
Uploads the file in the specified chunk size using the specified number of workers.
- Parameters:
file (IOBase) – A BytesIO or StringIO object to upload to Abacus.AI
threads (int) – The max number of workers to use while uploading the file
chunksize (int) – The number of bytes to use for each chunk while uploading the file. Defaults to 10 MB
wait_timeout (int) – The max number of seconds to wait for the file parts to be joined on Abacus.AI. Defaults to 600.
- Returns:
The upload file object.
- Return type:
Upload
- _yield_upload_part(file, chunksize)
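Splitting a file object into numbered chunks, as upload_file() does with its chunksize argument, can be sketched as a generator. This is an illustration only, not the library's implementation of _yield_upload_part; the name iter_parts is hypothetical.

```python
import io


def iter_parts(file, chunksize=1024 * 1024 * 10):
    """Yield (part_number, chunk) pairs read from a file-like object."""
    part_number = 0
    while True:
        # read() counts bytes for binary streams and characters for text streams.
        chunk = file.read(chunksize)
        if not chunk:
            return
        part_number += 1  # part numbers are 1-indexed
        yield part_number, chunk


if __name__ == "__main__":
    demo = io.BytesIO(b"abcdefghij")
    for number, chunk in iter_parts(demo, chunksize=4):
        print(number, chunk)
```

For a StringIO input the chunk length is measured in characters, so the encoded size of a part can exceed chunksize bytes when the text contains multi-byte characters.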