abacusai.feature_group_version ============================== .. py:module:: abacusai.feature_group_version Classes ------- .. autoapisummary:: abacusai.feature_group_version.FeatureGroupVersion Module Contents --------------- .. py:class:: FeatureGroupVersion(client, featureGroupVersion=None, featureGroupId=None, sql=None, sourceTables=None, sourceDatasetVersions=None, createdAt=None, status=None, error=None, deployable=None, cpuSize=None, memory=None, useOriginalCsvNames=None, pythonFunctionBindings=None, indexingConfigWarningMsg=None, materializationStartedAt=None, materializationCompletedAt=None, columns=None, templateBindings=None, features={}, pointInTimeGroups={}, codeSource={}, annotationConfig={}, indexingConfig={}) Bases: :py:obj:`abacusai.return_class.AbstractApiClass` A materialized version of a feature group :param client: An authenticated API Client instance :type client: ApiClient :param featureGroupVersion: The unique identifier for this materialized version of feature group. :type featureGroupVersion: str :param featureGroupId: The unique identifier of the feature group this version belongs to. :type featureGroupId: str :param sql: The sql definition creating this feature group. :type sql: str :param sourceTables: The source tables for this feature group. :type sourceTables: list[str] :param sourceDatasetVersions: The dataset version ids for this feature group version. :type sourceDatasetVersions: list[str] :param createdAt: The timestamp at which the feature group version was created. :type createdAt: str :param status: The current status of the feature group version. :type status: str :param error: Relevant error if the status is FAILED. :type error: str :param deployable: whether feature group is deployable or not. :type deployable: bool :param cpuSize: Cpu size specified for the python feature group. :type cpuSize: str :param memory: Memory in GB specified for the python feature group. :type memory: int :param useOriginalCsvNames: If true, the feature group will use the original column names in the source dataset. :type useOriginalCsvNames: bool :param pythonFunctionBindings: Config specifying variable names, types, and values to use when resolving a Python feature group. :type pythonFunctionBindings: list :param indexingConfigWarningMsg: The warning message related to indexing keys. :type indexingConfigWarningMsg: str :param materializationStartedAt: The timestamp at which the feature group materialization started. :type materializationStartedAt: str :param materializationCompletedAt: The timestamp at which the feature group materialization completed. :type materializationCompletedAt: str :param columns: List of resolved columns. :type columns: list[feature] :param templateBindings: Template variable bindings used for resolving the template. :type templateBindings: list :param features: List of features. :type features: Feature :param pointInTimeGroups: List of Point In Time Groups :type pointInTimeGroups: PointInTimeGroup :param codeSource: If a python feature group, information on the source code :type codeSource: CodeSource :param annotationConfig: The annotations config for the feature group. :type annotationConfig: AnnotationConfig :param indexingConfig: The indexing config for the feature group. :type indexingConfig: IndexingConfig .. py:attribute:: feature_group_version :value: None .. py:attribute:: feature_group_id :value: None .. py:attribute:: sql :value: None .. py:attribute:: source_tables :value: None .. py:attribute:: source_dataset_versions :value: None .. py:attribute:: created_at :value: None .. py:attribute:: status :value: None .. py:attribute:: error :value: None .. py:attribute:: deployable :value: None .. py:attribute:: cpu_size :value: None .. py:attribute:: memory :value: None .. py:attribute:: use_original_csv_names :value: None .. py:attribute:: python_function_bindings :value: None .. py:attribute:: indexing_config_warning_msg :value: None .. py:attribute:: materialization_started_at :value: None .. py:attribute:: materialization_completed_at :value: None .. py:attribute:: columns :value: None .. py:attribute:: template_bindings :value: None .. py:attribute:: features .. py:attribute:: point_in_time_groups .. py:attribute:: code_source .. py:attribute:: annotation_config .. py:attribute:: indexing_config .. py:attribute:: deprecated_keys .. py:method:: __repr__() .. py:method:: to_dict() Get a dict representation of the parameters in this class :returns: The dict value representation of the class parameters :rtype: dict .. py:method:: create_snapshot_feature_group(table_name) Creates a Snapshot Feature Group corresponding to a specific Feature Group version. :param table_name: Name for the newly created Snapshot Feature Group table. Can be up to 120 characters long and can only contain alphanumeric characters and underscores. :type table_name: str :returns: Feature Group corresponding to the newly created Snapshot. :rtype: FeatureGroup .. py:method:: export_to_file_connector(location, export_file_format, overwrite = False) Export Feature group to File Connector. :param location: Cloud file location to export to. :type location: str :param export_file_format: Enum string specifying the file format to export to. :type export_file_format: str :param overwrite: If true and a file exists at this location, this process will overwrite the file. :type overwrite: bool :returns: The FeatureGroupExport instance. :rtype: FeatureGroupExport .. py:method:: export_to_database_connector(database_connector_id, object_name, write_mode, database_feature_mapping, id_column = None, additional_id_columns = None) Export Feature group to Database Connector. :param database_connector_id: Unique string identifier for the Database Connector to export to. :type database_connector_id: str :param object_name: Name of the database object to write to. :type object_name: str :param write_mode: Enum string indicating whether to use INSERT or UPSERT. :type write_mode: str :param database_feature_mapping: Key/value pair JSON object of "database connector column" -> "feature name" pairs. :type database_feature_mapping: dict :param id_column: Required if write_mode is UPSERT. Indicates which database column should be used as the lookup key. :type id_column: str :param additional_id_columns: For database connectors which support it, additional ID columns to use as a complex key for upserting. :type additional_id_columns: list :returns: The FeatureGroupExport instance. :rtype: FeatureGroupExport .. py:method:: export_to_console(export_file_format) Export Feature group to console. :param export_file_format: File format to export to. :type export_file_format: str :returns: The FeatureGroupExport instance. :rtype: FeatureGroupExport .. py:method:: delete() Deletes a Feature Group Version. :param feature_group_version: String identifier for the feature group version to be removed. :type feature_group_version: str .. py:method:: get_materialization_logs(stdout = False, stderr = False) Returns logs for a materialized feature group version. :param stdout: Set to True to get info logs. :type stdout: bool :param stderr: Set to True to get error logs. :type stderr: bool :returns: A function logs object. :rtype: FunctionLogs .. py:method:: refresh() Calls describe and refreshes the current object's fields :returns: The current object :rtype: FeatureGroupVersion .. py:method:: describe() Describe a feature group version. :param feature_group_version: The unique identifier associated with the feature group version. :type feature_group_version: str :returns: The feature group version. :rtype: FeatureGroupVersion .. py:method:: get_metrics(selected_columns = None, include_charts = False, include_statistics = True) Get metrics for a specific feature group version. :param selected_columns: A list of columns to order first. :type selected_columns: List :param include_charts: A flag indicating whether charts should be included in the response. Default is false. :type include_charts: bool :param include_statistics: A flag indicating whether statistics should be included in the response. Default is true. :type include_statistics: bool :returns: The metrics for the specified feature group version. :rtype: DataMetrics .. py:method:: get_logs() Retrieves the feature group materialization logs. :param feature_group_version: The unique version ID of the feature group version. :type feature_group_version: str :returns: The logs for the specified feature group version. :rtype: FeatureGroupVersionLogs .. py:method:: wait_for_results(timeout=3600) A waiting call until feature group version is materialized :param timeout: The waiting time given to the call to finish, if it doesn't finish by the allocated time, the call is said to be timed out. :type timeout: int .. py:method:: wait_for_materialization(timeout=3600) A waiting call until feature group version is materialized. :param timeout: The waiting time given to the call to finish, if it doesn't finish by the allocated time, the call is said to be timed out. :type timeout: int .. py:method:: get_status() Gets the status of the feature group version. :returns: A string describing the status of a feature group version (pending, complete, etc.). :rtype: str .. py:method:: _download_avro_file(file_part, tmp_dir, part_index) .. py:method:: load_as_pandas(max_workers=10) Loads the feature group version into a pandas dataframe. :param max_workers: The number of threads. :type max_workers: int :returns: A pandas dataframe displaying the data in the feature group version. :rtype: DataFrame .. py:method:: load_as_pandas_documents(doc_id_column = 'doc_id', document_column = 'page_infos', max_workers=10) Loads a feature group with documents data into a pandas dataframe. :param doc_id_column: The name of the feature / column containing the document ID. :type doc_id_column: str :param document_column: The name of the feature / column which either contains the document data itself or page infos with path to remotely stored documents. This column will be replaced with the extracted document data. :type document_column: str :param max_workers: The number of threads. :type max_workers: int :returns: A pandas dataframe containing the extracted document data. :rtype: DataFrame