abacusai.feature_group ====================== .. py:module:: abacusai.feature_group Classes ------- .. autoapisummary:: abacusai.feature_group.FeatureGroup Module Contents --------------- .. py:class:: FeatureGroup(client, featureGroupId=None, modificationLock=None, name=None, featureGroupSourceType=None, tableName=None, sql=None, datasetId=None, functionSourceCode=None, functionName=None, sourceTables=None, createdAt=None, description=None, sqlError=None, latestVersionOutdated=None, referencedFeatureGroups=None, tags=None, primaryKey=None, updateTimestampKey=None, lookupKeys=None, streamingEnabled=None, incremental=None, mergeConfig=None, samplingConfig=None, cpuSize=None, memory=None, streamingReady=None, featureTags=None, moduleName=None, templateBindings=None, featureExpression=None, useOriginalCsvNames=None, pythonFunctionBindings=None, pythonFunctionName=None, useGpu=None, versionLimit=None, exportOnMaterialization=None, features={}, duplicateFeatures={}, pointInTimeGroups={}, annotationConfig={}, concatenationConfig={}, indexingConfig={}, codeSource={}, featureGroupTemplate={}, explanation={}, refreshSchedules={}, exportConnectorConfig={}, latestFeatureGroupVersion={}, operatorConfig={}) Bases: :py:obj:`abacusai.return_class.AbstractApiClass` A feature group. :param client: An authenticated API Client instance :type client: ApiClient :param featureGroupId: Unique identifier for this feature group. :type featureGroupId: str :param modificationLock: If feature group is locked against a change or not. :type modificationLock: bool :param name: :type name: str :param featureGroupSourceType: The source type of the feature group :type featureGroupSourceType: str :param tableName: Unique table name of this feature group. :type tableName: str :param sql: SQL definition creating this feature group. :type sql: str :param datasetId: Dataset ID the feature group is sourced from. 
:type datasetId: str :param functionSourceCode: Source definition creating this feature group. :type functionSourceCode: str :param functionName: Function name to execute from the source code. :type functionName: str :param sourceTables: Source tables for this feature group. :type sourceTables: list[str] :param createdAt: Timestamp at which the feature group was created. :type createdAt: str :param description: Description of the feature group. :type description: str :param sqlError: Error message with this feature group. :type sqlError: str :param latestVersionOutdated: Is latest materialized feature group version outdated. :type latestVersionOutdated: bool :param referencedFeatureGroups: Feature groups this feature group is used in. :type referencedFeatureGroups: list[str] :param tags: Tags added to this feature group. :type tags: list[str] :param primaryKey: Primary index feature. :type primaryKey: str :param updateTimestampKey: Primary timestamp feature. :type updateTimestampKey: str :param lookupKeys: Additional indexed features for this feature group. :type lookupKeys: list[str] :param streamingEnabled: If true, the feature group can have data streamed to it. :type streamingEnabled: bool :param incremental: If feature group corresponds to an incremental dataset. :type incremental: bool :param mergeConfig: Merge configuration settings for the feature group. :type mergeConfig: dict :param samplingConfig: Sampling configuration for the feature group. :type samplingConfig: dict :param cpuSize: CPU size specified for the Python feature group. :type cpuSize: str :param memory: Memory in GB specified for the Python feature group. :type memory: int :param streamingReady: If true, the feature group is ready to receive streaming data. :type streamingReady: bool :param featureTags: Tags for features in this feature group :type featureTags: dict :param moduleName: Path to the file with the feature group function. 
:type moduleName: str :param templateBindings: Config specifying variable names and values to use when resolving a feature group template. :type templateBindings: dict :param featureExpression: If the dataset feature group has custom features, the SQL select expression creating those features. :type featureExpression: str :param useOriginalCsvNames: If true, the feature group will use the original column names in the source dataset. :type useOriginalCsvNames: bool :param pythonFunctionBindings: Config specifying variable names, types, and values to use when resolving a Python feature group. :type pythonFunctionBindings: dict :param pythonFunctionName: Name of the Python function the feature group was built from. :type pythonFunctionName: str :param useGpu: Whether this feature group is using gpu :type useGpu: bool :param versionLimit: Version limit for the feature group. :type versionLimit: int :param exportOnMaterialization: Whether to export the feature group on materialization. :type exportOnMaterialization: bool :param features: List of resolved features. :type features: Feature :param duplicateFeatures: List of duplicate features. :type duplicateFeatures: Feature :param pointInTimeGroups: List of Point In Time Groups. :type pointInTimeGroups: PointInTimeGroup :param annotationConfig: Annotation config for this feature :type annotationConfig: AnnotationConfig :param latestFeatureGroupVersion: Latest feature group version. :type latestFeatureGroupVersion: FeatureGroupVersion :param concatenationConfig: Feature group ID whose data will be concatenated into this feature group. :type concatenationConfig: ConcatenationConfig :param indexingConfig: Indexing config for the feature group for feature store :type indexingConfig: IndexingConfig :param codeSource: If a Python feature group, information on the source code. :type codeSource: CodeSource :param featureGroupTemplate: FeatureGroupTemplate to use when this feature group is attached to a template. 
:type featureGroupTemplate: FeatureGroupTemplate :param explanation: Natural language explanation of the feature group :type explanation: NaturalLanguageExplanation :param refreshSchedules: List of schedules that determines when the next version of the feature group will be created. :type refreshSchedules: RefreshSchedule :param exportConnectorConfig: The export config (file connector or database connector information) for feature group exports. :type exportConnectorConfig: FeatureGroupRefreshExportConfig :param operatorConfig: Operator configuration settings for the feature group. :type operatorConfig: OperatorConfig .. py:attribute:: feature_group_id :value: None .. py:attribute:: modification_lock :value: None .. py:attribute:: name :value: None .. py:attribute:: feature_group_source_type :value: None .. py:attribute:: table_name :value: None .. py:attribute:: sql :value: None .. py:attribute:: dataset_id :value: None .. py:attribute:: function_source_code :value: None .. py:attribute:: function_name :value: None .. py:attribute:: source_tables :value: None .. py:attribute:: created_at :value: None .. py:attribute:: description :value: None .. py:attribute:: sql_error :value: None .. py:attribute:: latest_version_outdated :value: None .. py:attribute:: referenced_feature_groups :value: None .. py:attribute:: tags :value: None .. py:attribute:: primary_key :value: None .. py:attribute:: update_timestamp_key :value: None .. py:attribute:: lookup_keys :value: None .. py:attribute:: streaming_enabled :value: None .. py:attribute:: incremental :value: None .. py:attribute:: merge_config :value: None .. py:attribute:: sampling_config :value: None .. py:attribute:: cpu_size :value: None .. py:attribute:: memory :value: None .. py:attribute:: streaming_ready :value: None .. py:attribute:: feature_tags :value: None .. py:attribute:: module_name :value: None .. py:attribute:: template_bindings :value: None .. py:attribute:: feature_expression :value: None .. 
py:attribute:: use_original_csv_names :value: None .. py:attribute:: python_function_bindings :value: None .. py:attribute:: python_function_name :value: None .. py:attribute:: use_gpu :value: None .. py:attribute:: version_limit :value: None .. py:attribute:: export_on_materialization :value: None .. py:attribute:: features .. py:attribute:: duplicate_features .. py:attribute:: point_in_time_groups .. py:attribute:: annotation_config .. py:attribute:: concatenation_config .. py:attribute:: indexing_config .. py:attribute:: code_source .. py:attribute:: feature_group_template .. py:attribute:: explanation .. py:attribute:: refresh_schedules .. py:attribute:: export_connector_config .. py:attribute:: latest_feature_group_version .. py:attribute:: operator_config .. py:attribute:: deprecated_keys .. py:method:: __repr__() .. py:method:: to_dict() Get a dict representation of the parameters in this class :returns: The dict value representation of the class parameters :rtype: dict .. py:method:: add_to_project(project_id, feature_group_type = 'CUSTOM_TABLE') Adds a feature group to a project. :param project_id: The unique ID associated with the project. :type project_id: str :param feature_group_type: The feature group type of the feature group, based on the use case under which the feature group is being created. :type feature_group_type: str .. py:method:: set_project_config(project_id, project_config = None) Sets a feature group's project config :param project_id: Unique string identifier for the project. :type project_id: str :param project_config: Feature group's project configuration. :type project_config: ProjectFeatureGroupConfig .. py:method:: get_project_config(project_id) Gets a feature group's project config :param project_id: Unique string identifier for the project. :type project_id: str :returns: The feature group's project configuration. :rtype: ProjectConfig .. py:method:: remove_from_project(project_id) Removes a feature group from a project. 
:param project_id: The unique ID associated with the project. :type project_id: str .. py:method:: set_type(project_id, feature_group_type = 'CUSTOM_TABLE') Update the feature group type in a project. The feature group must already be added to the project. :param project_id: Unique identifier associated with the project. :type project_id: str :param feature_group_type: The feature group type to set the feature group as. :type feature_group_type: str .. py:method:: describe_annotation(feature_name = None, doc_id = None, feature_group_row_identifier = None) Get the latest annotation entry for a given feature group, feature, and document. :param feature_name: The name of the feature the annotation is on. :type feature_name: str :param doc_id: The ID of the primary document the annotation is on. At least one of the doc_id or feature_group_row_identifier must be provided in order to identify the correct annotation. :type doc_id: str :param feature_group_row_identifier: The key value of the feature group row the annotation is on (cast to string). Usually the feature group's primary / identifier key value. At least one of the doc_id or feature_group_row_identifier must be provided in order to identify the correct annotation. :type feature_group_row_identifier: str :returns: The latest annotation entry for the given feature group, feature, document, and/or annotation key value. :rtype: AnnotationEntry .. py:method:: verify_and_describe_annotation(feature_name = None, doc_id = None, feature_group_row_identifier = None) Get the latest annotation entry for a given feature group, feature, and document along with verification information. :param feature_name: The name of the feature the annotation is on. :type feature_name: str :param doc_id: The ID of the primary document the annotation is on. At least one of the doc_id or feature_group_row_identifier must be provided in order to identify the correct annotation. 
:type doc_id: str :param feature_group_row_identifier: The key value of the feature group row the annotation is on (cast to string). Usually the feature group's primary / identifier key value. At least one of the doc_id or feature_group_row_identifier must be provided in order to identify the correct annotation. :type feature_group_row_identifier: str :returns: The latest annotation entry for the given feature group, feature, document, and/or annotation key value. Includes the verification information. :rtype: AnnotationEntry .. py:method:: update_annotation_status(feature_name, status, doc_id = None, feature_group_row_identifier = None, save_metadata = False) Update the status of an annotation entry. :param feature_name: The name of the feature the annotation is on. :type feature_name: str :param status: The new status of the annotation. Must be one of the following: 'TODO', 'IN_PROGRESS', 'DONE'. :type status: str :param doc_id: The ID of the primary document the annotation is on. At least one of the doc_id or feature_group_row_identifier must be provided in order to identify the correct annotation. :type doc_id: str :param feature_group_row_identifier: The key value of the feature group row the annotation is on (cast to string). Usually the feature group's primary / identifier key value. At least one of the doc_id or feature_group_row_identifier must be provided in order to identify the correct annotation. :type feature_group_row_identifier: str :param save_metadata: If True, save the metadata for the annotation entry. :type save_metadata: bool :returns: The updated annotation entry. :rtype: AnnotationEntry .. py:method:: get_document_to_annotate(project_id, feature_name, feature_group_row_identifier = None, get_previous = False) Get an available document that needs to be annotated for an annotation feature group. :param project_id: The ID of the project that the annotation is associated with. 
:type project_id: str :param feature_name: The name of the feature the annotation is on. :type feature_name: str :param feature_group_row_identifier: The key value of the feature group row the annotation is on (cast to string). Usually the primary key value. If provided, fetch the immediate next (or previous) available document. :type feature_group_row_identifier: str :param get_previous: If True, get the previous document instead of the next document. Applicable if feature_group_row_identifier is provided. :type get_previous: bool :returns: The document to annotate. :rtype: AnnotationDocument .. py:method:: get_annotations_status(feature_name = None, check_for_materialization = False) Get the status of the annotations for a given feature group and feature. :param feature_name: The name of the feature the annotation is on. :type feature_name: str :param check_for_materialization: If True, check if the feature group needs to be materialized before using for annotations. :type check_for_materialization: bool :returns: The status of the annotations for the given feature group and feature. :rtype: AnnotationsStatus .. py:method:: import_annotation_labels(file, annotation_type) Imports annotation labels from a CSV file. All valid values in the file will be imported as labels (including the header row, if present). :param file: The file to import. Must be a CSV file. :type file: io.TextIOBase :param annotation_type: The type of the annotation. :type annotation_type: str :returns: The annotation config for the feature group. :rtype: AnnotationConfig .. py:method:: create_sampling(table_name, sampling_config, description = None) Creates a new Feature Group defined as a sample of rows from another Feature Group. For efficiency, sampling is approximate unless otherwise specified (e.g., the number of rows may vary slightly from what was requested). :param table_name: The unique name to be given to this sampling Feature Group. 
Can be up to 120 characters long and can only contain alphanumeric characters and underscores. :type table_name: str :param sampling_config: Dictionary defining the sampling method and its parameters. :type sampling_config: SamplingConfig :param description: A human-readable description of this Feature Group. :type description: str :returns: The created Feature Group. :rtype: FeatureGroup .. py:method:: set_sampling_config(sampling_config) Set a FeatureGroup’s sampling to the config values provided, so that the rows the FeatureGroup returns will be a sample of those it would otherwise have returned. :param sampling_config: A JSON string object specifying the sampling method and parameters specific to that sampling method. An empty sampling_config indicates no sampling. :type sampling_config: SamplingConfig :returns: The updated FeatureGroup. :rtype: FeatureGroup .. py:method:: set_merge_config(merge_config) Set a MergeFeatureGroup’s merge config to the values provided, so that the feature group only returns a bounded range of an incremental dataset. :param merge_config: JSON object string specifying the merge rule. An empty merge_config will default to only including the latest dataset version. :type merge_config: MergeConfig :returns: The updated FeatureGroup. :rtype: FeatureGroup .. py:method:: set_operator_config(operator_config) Set an OperatorFeatureGroup’s operator config to the values provided. :param operator_config: A dictionary object specifying the pre-defined operations. :type operator_config: OperatorConfig :returns: The updated FeatureGroup. :rtype: FeatureGroup .. py:method:: set_schema(schema) Creates a new schema and points the feature group to the new feature group schema ID. :param schema: JSON string containing an array of objects with 'name' and 'dataType' properties. :type schema: list .. py:method:: get_schema(project_id = None) Returns a schema for a given FeatureGroup in a project. 
:param project_id: The unique ID associated with the project. :type project_id: str :returns: A list of objects for each column in the specified feature group. :rtype: list[Feature] .. py:method:: create_feature(name, select_expression) Creates a new feature in a Feature Group from a SQL select statement. :param name: The name of the feature to add. :type name: str :param select_expression: SQL SELECT expression to create the feature. :type select_expression: str :returns: A Feature Group object with the newly added feature. :rtype: FeatureGroup .. py:method:: add_tag(tag) Adds a tag to the feature group :param tag: The tag to add to the feature group. :type tag: str .. py:method:: remove_tag(tag) Removes a tag from the specified feature group. :param tag: The tag to remove from the feature group. :type tag: str .. py:method:: add_annotatable_feature(name, annotation_type) Add an annotatable feature in a Feature Group :param name: The name of the feature to add. :type name: str :param annotation_type: The type of annotation to set. :type annotation_type: str :returns: The feature group after the feature has been set :rtype: FeatureGroup .. py:method:: set_feature_as_annotatable_feature(feature_name, annotation_type, feature_group_row_identifier_feature = None, doc_id_feature = None) Sets an existing feature as an annotatable feature (Feature that can be annotated). :param feature_name: The name of the feature to set as annotatable. :type feature_name: str :param annotation_type: The type of annotation label to add. :type annotation_type: str :param feature_group_row_identifier_feature: The key value of the feature group row the annotation is on (cast to string) and uniquely identifies the feature group row. At least one of the doc_id or key value must be provided so that the correct annotation can be identified. :type feature_group_row_identifier_feature: str :param doc_id_feature: The name of the document ID feature. 
:type doc_id_feature: str :returns: A feature group object with the newly added annotatable feature. :rtype: FeatureGroup .. py:method:: set_annotation_status_feature(feature_name) Sets a feature as the annotation status feature for a feature group. :param feature_name: The name of the feature to set as the annotation status feature. :type feature_name: str :returns: The updated feature group. :rtype: FeatureGroup .. py:method:: unset_feature_as_annotatable_feature(feature_name) Unsets a feature as annotatable. :param feature_name: The name of the feature to unset. :type feature_name: str :returns: The feature group after unsetting the feature. :rtype: FeatureGroup .. py:method:: add_annotation_label(label_name, annotation_type, label_definition = None) Adds an annotation label. :param label_name: The name of the label. :type label_name: str :param annotation_type: The type of the annotation to set. :type annotation_type: str :param label_definition: The definition of the label. :type label_definition: str :returns: The feature group after adding the annotation label. :rtype: FeatureGroup .. py:method:: remove_annotation_label(label_name) Removes an annotation label. :param label_name: The name of the label to remove. :type label_name: str :returns: The feature group after removing the annotation label. :rtype: FeatureGroup .. py:method:: add_feature_tag(feature, tag) Adds a tag on a feature. :param feature: The feature to set the tag on. :type feature: str :param tag: The tag to set on the feature. :type tag: str .. py:method:: remove_feature_tag(feature, tag) Removes a tag from a feature. :param feature: The feature to remove the tag from. :type feature: str :param tag: The tag to remove. :type tag: str .. py:method:: create_nested_feature(nested_feature_name, table_name, using_clause, where_clause = None, order_clause = None) Creates a new nested feature in a feature group from a SQL statement. :param nested_feature_name: The name of the feature. 
:type nested_feature_name: str :param table_name: The table name of the feature group to nest. Can be up to 120 characters long and can only contain alphanumeric characters and underscores. :type table_name: str :param using_clause: The SQL join column or logic to join the nested table with the parent. :type using_clause: str :param where_clause: A SQL WHERE statement to filter the nested rows. :type where_clause: str :param order_clause: A SQL clause to order the nested rows. :type order_clause: str :returns: A feature group object with the newly added nested feature. :rtype: FeatureGroup .. py:method:: update_nested_feature(nested_feature_name, table_name = None, using_clause = None, where_clause = None, order_clause = None, new_nested_feature_name = None) Updates a previously existing nested feature in a feature group. :param nested_feature_name: The name of the feature to be updated. :type nested_feature_name: str :param table_name: The name of the table. Can be up to 120 characters long and can only contain alphanumeric characters and underscores. :type table_name: str :param using_clause: The SQL join column or logic to join the nested table with the parent. :type using_clause: str :param where_clause: An SQL WHERE statement to filter the nested rows. :type where_clause: str :param order_clause: An SQL clause to order the nested rows. :type order_clause: str :param new_nested_feature_name: New name for the nested feature. :type new_nested_feature_name: str :returns: A feature group object with the updated nested feature. :rtype: FeatureGroup .. py:method:: delete_nested_feature(nested_feature_name) Delete a nested feature. :param nested_feature_name: The name of the feature to be deleted. :type nested_feature_name: str :returns: A feature group object without the specified nested feature. :rtype: FeatureGroup .. 
py:method:: create_point_in_time_feature(feature_name, history_table_name, aggregation_keys, timestamp_key, historical_timestamp_key, expression, lookback_window_seconds = None, lookback_window_lag_seconds = 0, lookback_count = None, lookback_until_position = 0) Creates a new point in time feature in a feature group using another historical feature group, window spec, and aggregate expression. We use the aggregation keys and either the lookbackWindowSeconds or the lookbackCount values to perform the window aggregation for every row in the current feature group. If the window is specified in seconds, then all rows in the history table which match the aggregation keys and with historicalTimeFeature greater than or equal to lookbackStartCount and less than the value of the current row's timeFeature are considered. An optional lookbackWindowLagSeconds (positive or negative) can be used to offset the current value of the timeFeature. If this value is negative, we will look at the future rows in the history table, so care must be taken to ensure that these rows are available in the online context when we are performing a lookup on this feature group. If the window is specified in counts, then we order the historical table rows aligning by time and consider rows from the window where the rank order is greater than or equal to lookbackCount and includes the row just prior to the current one. The lag is specified in terms of positions using lookbackUntilPosition. :param feature_name: The name of the feature to create. :type feature_name: str :param history_table_name: The table name of the history table. :type history_table_name: str :param aggregation_keys: List of keys to use for joining the historical table and performing the window aggregation. :type aggregation_keys: list :param timestamp_key: Name of feature which contains the timestamp value for the point in time feature. 
:type timestamp_key: str :param historical_timestamp_key: Name of feature which contains the historical timestamp. :type historical_timestamp_key: str :param expression: SQL aggregate expression which can convert a sequence of rows into a scalar value. :type expression: str :param lookback_window_seconds: If window is specified in terms of time, number of seconds in the past from the current time for start of the window. :type lookback_window_seconds: float :param lookback_window_lag_seconds: Optional lag to offset the closest point for the window. If it is positive, we delay the start of window. If it is negative, we are looking at the "future" rows in the history table. :type lookback_window_lag_seconds: float :param lookback_count: If window is specified in terms of count, the start position of the window (0 is the current row). :type lookback_count: int :param lookback_until_position: Optional lag to offset the closest point for the window. If it is positive, we delay the start of window by that many rows. If it is negative, we are looking at those many "future" rows in the history table. :type lookback_until_position: int :returns: A feature group object with the newly added point-in-time feature. :rtype: FeatureGroup .. py:method:: update_point_in_time_feature(feature_name, history_table_name = None, aggregation_keys = None, timestamp_key = None, historical_timestamp_key = None, expression = None, lookback_window_seconds = None, lookback_window_lag_seconds = None, lookback_count = None, lookback_until_position = None, new_feature_name = None) Updates an existing Point-in-Time (PiT) feature in a feature group. See `createPointInTimeFeature` for detailed semantics. :param feature_name: The name of the feature. :type feature_name: str :param history_table_name: The table name of the history table. If not specified, we use the current table to do a self join. 
:type history_table_name: str :param aggregation_keys: List of keys to use for joining the historical table and performing the window aggregation. :type aggregation_keys: list :param timestamp_key: Name of the feature which contains the timestamp value for the PiT feature. :type timestamp_key: str :param historical_timestamp_key: Name of the feature which contains the historical timestamp. :type historical_timestamp_key: str :param expression: SQL aggregate expression which can convert a sequence of rows into a scalar value. :type expression: str :param lookback_window_seconds: If the window is specified in terms of time, the number of seconds in the past from the current time for the start of the window. :type lookback_window_seconds: float :param lookback_window_lag_seconds: Optional lag to offset the closest point for the window. If it is positive, we delay the start of the window. If it is negative, we are looking at the "future" rows in the history table. :type lookback_window_lag_seconds: float :param lookback_count: If the window is specified in terms of count, the start position of the window (0 is the current row). :type lookback_count: int :param lookback_until_position: Optional lag to offset the closest point for the window. If it is positive, we delay the start of the window by that many rows. If it is negative, we are looking at those many "future" rows in the history table. :type lookback_until_position: int :param new_feature_name: New name for the PiT feature. :type new_feature_name: str :returns: A feature group object with the updated point-in-time feature. :rtype: FeatureGroup .. py:method:: create_point_in_time_group(group_name, window_key, aggregation_keys, history_table_name = None, history_window_key = None, history_aggregation_keys = None, lookback_window = None, lookback_window_lag = 0, lookback_count = None, lookback_until_position = 0) Create a Point-in-Time Group. :param group_name: The name of the point in time group. 
:type group_name: str :param window_key: Name of feature to use for ordering the rows on the source table. :type window_key: str :param aggregation_keys: List of keys to use on the source table for the window aggregation. :type aggregation_keys: list :param history_table_name: The table to use for aggregating. If not provided, the source table will be used. :type history_table_name: str :param history_window_key: Name of feature to use for ordering the rows on the history table. If not provided, the windowKey from the source table will be used. :type history_window_key: str :param history_aggregation_keys: List of keys to use for joining the historical table and performing the window aggregation. If not provided, the aggregationKeys from the source table will be used. Must be the same length and order as the source table's aggregationKeys. :type history_aggregation_keys: list :param lookback_window: Number of seconds in the past from the current time for the start of the window. If 0, the lookback will include all rows. :type lookback_window: float :param lookback_window_lag: Optional lag to offset the closest point for the window. If it is positive, the start of the window is delayed. If it is negative, "future" rows in the history table are used. :type lookback_window_lag: float :param lookback_count: If window is specified in terms of count, the start position of the window (0 is the current row). :type lookback_count: int :param lookback_until_position: Optional lag to offset the closest point for the window. If it is positive, the start of the window is delayed by that many rows. If it is negative, that many "future" rows in the history table are used. :type lookback_until_position: int :returns: The feature group after the point in time group has been created. :rtype: FeatureGroup .. 
py:method:: generate_point_in_time_features(group_name, columns, window_functions, prefix = None) Generates and adds PIT features given the selected columns to aggregate over, and the operations to include. :param group_name: Name of the point-in-time group. :type group_name: str :param columns: List of columns to generate point-in-time features for. :type columns: list :param window_functions: List of window functions to operate on. :type window_functions: list :param prefix: Prefix for generated features, defaults to group name :type prefix: str :returns: Feature group object with newly added point-in-time features. :rtype: FeatureGroup .. py:method:: update_point_in_time_group(group_name, window_key = None, aggregation_keys = None, history_table_name = None, history_window_key = None, history_aggregation_keys = None, lookback_window = None, lookback_window_lag = None, lookback_count = None, lookback_until_position = None) Update Point-in-Time Group :param group_name: The name of the point-in-time group. :type group_name: str :param window_key: Name of feature which contains the timestamp value for the point-in-time feature. :type window_key: str :param aggregation_keys: List of keys to use for joining the historical table and performing the window aggregation. :type aggregation_keys: list :param history_table_name: The table to use for aggregating, if not provided, the source table will be used. :type history_table_name: str :param history_window_key: Name of feature to use for ordering the rows on the history table. If not provided, the windowKey from the source table will be used. :type history_window_key: str :param history_aggregation_keys: List of keys to use for joining the historical table and performing the window aggregation. If not provided, the aggregationKeys from the source table will be used. Must be the same length and order as the source table's aggregationKeys. 
:type history_aggregation_keys: list :param lookback_window: Number of seconds in the past from the current time for the start of the window. :type lookback_window: float :param lookback_window_lag: Optional lag to offset the closest point for the window. If it is positive, the start of the window is delayed. If it is negative, future rows in the history table are used. :type lookback_window_lag: float :param lookback_count: If window is specified in terms of count, the start position of the window (0 is the current row). :type lookback_count: int :param lookback_until_position: Optional lag to offset the closest point for the window. If it is positive, the start of the window is delayed by that many rows. If it is negative, that many future rows in the history table are used. :type lookback_until_position: int :returns: The feature group after the update has been applied. :rtype: FeatureGroup .. py:method:: delete_point_in_time_group(group_name) Deletes a point-in-time group. :param group_name: The name of the point-in-time group. :type group_name: str :returns: The feature group after the point-in-time group has been deleted. :rtype: FeatureGroup .. py:method:: create_point_in_time_group_feature(group_name, name, expression) Creates a point-in-time group feature. :param group_name: The name of the point-in-time group. :type group_name: str :param name: The name of the feature to add to the point-in-time group. :type name: str :param expression: A SQL aggregate expression which can convert a sequence of rows into a scalar value. :type expression: str :returns: The feature group after the update has been applied. :rtype: FeatureGroup .. py:method:: update_point_in_time_group_feature(group_name, name, expression) Updates a feature's SQL expression in a point-in-time group. :param group_name: The name of the point-in-time group. :type group_name: str :param name: The name of the feature to update in the point-in-time group.
:type name: str :param expression: SQL aggregate expression which can convert a sequence of rows into a scalar value. :type expression: str :returns: The feature group after the update has been applied. :rtype: FeatureGroup .. py:method:: set_feature_type(feature, feature_type, project_id = None) Set the type of a feature in a feature group. Specify the feature group ID, feature name, and feature type, and the method will return the new column with the changes reflected. :param feature: The name of the feature. :type feature: str :param feature_type: The machine learning type of the data in the feature. :type feature_type: str :param project_id: Optional unique ID associated with the project. :type project_id: str :returns: The feature group after the data_type is applied. :rtype: Schema .. py:method:: concatenate_data(source_feature_group_id, merge_type = 'UNION', replace_until_timestamp = None, skip_materialize = False) Concatenates data from one Feature Group to another. Feature Groups can be merged if their schemas are compatible, they have the special `updateTimestampKey` column, and (if set) the `primaryKey` column. The second operand in the concatenate operation will be appended to the first operand (merge target). :param source_feature_group_id: The Feature Group to concatenate with the destination Feature Group. :type source_feature_group_id: str :param merge_type: `UNION` or `INTERSECTION`. :type merge_type: str :param replace_until_timestamp: The UNIX timestamp to specify the point until which we will replace data from the source Feature Group. :type replace_until_timestamp: int :param skip_materialize: If `True`, will not materialize the concatenated Feature Group. :type skip_materialize: bool .. py:method:: remove_concatenation_config() Removes the concatenation config on a destination feature group. :param feature_group_id: Unique identifier of the destination feature group to remove the concatenation configuration from. :type feature_group_id: str .. 
py:method:: refresh() Calls describe and refreshes the current object's fields :returns: The current object :rtype: FeatureGroup .. py:method:: describe() Describe a Feature Group. :param feature_group_id: A unique string identifier associated with the feature group. :type feature_group_id: str :returns: The feature group object. :rtype: FeatureGroup .. py:method:: set_indexing_config(primary_key = None, update_timestamp_key = None, lookup_keys = None) Sets various attributes of the feature group used for primary key, deployment lookups and streaming updates. :param primary_key: Name of the feature which defines the primary key of the feature group. :type primary_key: str :param update_timestamp_key: Name of the feature which defines the update timestamp of the feature group. Used in concatenation and primary key deduplication. :type update_timestamp_key: str :param lookup_keys: List of feature names which can be used in the lookup API to restrict the computation to a set of dataset rows. These feature names have to correspond to underlying dataset columns. :type lookup_keys: list .. py:method:: update(description = None) Modify an existing Feature Group. :param description: Description of the Feature Group. :type description: str :returns: Updated Feature Group object. :rtype: FeatureGroup .. py:method:: detach_from_template() Update a feature group to detach it from a template. :param feature_group_id: Unique string identifier associated with the feature group. :type feature_group_id: str :returns: The updated feature group. :rtype: FeatureGroup .. py:method:: update_template_bindings(template_bindings = None) Update the feature group template bindings for a template feature group. :param template_bindings: Values in these bindings override values set in the template. :type template_bindings: list :returns: Updated feature group. :rtype: FeatureGroup .. 
py:method:: update_python_function_bindings(python_function_bindings) Updates an existing Feature Group's Python function bindings from a user-provided Python function. If a list of feature groups is supplied within the Python function bindings, the materialized versions of those input feature groups are passed to the function as DataFrames (pandas in the case of Python). :param python_function_bindings: List of Python function arguments. :type python_function_bindings: List .. py:method:: update_python_function(python_function_name, python_function_bindings = None, cpu_size = None, memory = None, use_gpu = None, use_original_csv_names = None) Updates an existing Feature Group's Python function from a user-provided Python function. If a list of feature groups is supplied within the Python function bindings, the materialized versions of those input feature groups are passed to the function as DataFrames (pandas in the case of Python). :param python_function_name: The name of the Python function to be associated with the feature group. :type python_function_name: str :param python_function_bindings: List of Python function arguments. :type python_function_bindings: List :param cpu_size: Size of the CPU for the feature group Python function. :type cpu_size: CPUSize :param memory: Memory (in GB) for the feature group Python function. :type memory: MemorySize :param use_gpu: Whether the feature group needs a GPU. If not set, it defaults to CPU. :type use_gpu: bool :param use_original_csv_names: If enabled, it uses the original column names for input feature groups from CSV datasets. :type use_original_csv_names: bool .. py:method:: update_sql_definition(sql) Updates the SQL statement for a feature group. :param sql: The input SQL statement for the feature group. :type sql: str :returns: The updated feature group. :rtype: FeatureGroup ..
py:method:: update_dataset_feature_expression(feature_expression) Updates the SQL feature expression for a Dataset FeatureGroup's custom features :param feature_expression: The input SQL statement for the feature group. :type feature_expression: str :returns: The updated feature group. :rtype: FeatureGroup .. py:method:: update_version_limit(version_limit) Updates the version limit for the feature group. :param version_limit: The maximum number of versions permitted for the feature group. Once this limit is exceeded, the oldest versions will be purged in a First-In-First-Out (FIFO) order. :type version_limit: int :returns: The updated feature group. :rtype: FeatureGroup .. py:method:: update_feature(name, select_expression = None, new_name = None) Modifies an existing feature in a feature group. :param name: Name of the feature to be updated. :type name: str :param select_expression: SQL statement for modifying the feature. :type select_expression: str :param new_name: New name of the feature. :type new_name: str :returns: Updated feature group object. :rtype: FeatureGroup .. py:method:: list_exports() Lists all of the feature group exports for the feature group :param feature_group_id: Unique identifier of the feature group :type feature_group_id: str :returns: List of feature group exports :rtype: list[FeatureGroupExport] .. py:method:: set_modifier_lock(locked = True) Lock a feature group to prevent modification. :param locked: Whether to disable or enable feature group modification (True or False). :type locked: bool .. py:method:: list_modifiers() List the users who can modify a given feature group. :param feature_group_id: Unique string identifier of the feature group. :type feature_group_id: str :returns: Information about the modification lock status and groups/organizations added to the feature group. :rtype: ModificationLockInfo .. py:method:: add_user_to_modifiers(email) Adds a user to a feature group. 
:param email: The email address of the user to be added. :type email: str .. py:method:: add_organization_group_to_modifiers(organization_group_id) Add OrganizationGroup to a feature group modifiers list :param organization_group_id: Unique string identifier of the organization group. :type organization_group_id: str .. py:method:: remove_user_from_modifiers(email) Removes a user from a specified feature group. :param email: The email address of the user to be removed. :type email: str .. py:method:: remove_organization_group_from_modifiers(organization_group_id) Removes an OrganizationGroup from a feature group modifiers list :param organization_group_id: The unique ID associated with the organization group. :type organization_group_id: str .. py:method:: delete_feature(name) Removes a feature from the feature group. :param name: Name of the feature to be deleted. :type name: str :returns: Updated feature group object. :rtype: FeatureGroup .. py:method:: delete() Deletes a Feature Group. :param feature_group_id: Unique string identifier for the feature group to be removed. :type feature_group_id: str .. py:method:: create_version(variable_bindings = None) Creates a snapshot for a specified feature group. Triggers materialization of the feature group. The new version of the feature group is created after it has materialized. :param variable_bindings: Dictionary defining variable bindings that override parent feature group values. :type variable_bindings: dict :returns: A feature group version. :rtype: FeatureGroupVersion .. py:method:: list_versions(limit = 100, start_after_version = None) Retrieves a list of all feature group versions for the specified feature group. :param limit: The maximum length of the returned versions. :type limit: int :param start_after_version: Results will start after this version. :type start_after_version: str :returns: A list of feature group versions. :rtype: list[FeatureGroupVersion] .. 
py:method:: set_export_connector_config(feature_group_export_config = None) Sets the export config for the given feature group. :param feature_group_export_config: The export config to be set for the given feature group. :type feature_group_export_config: FeatureGroupExportConfig .. py:method:: set_export_on_materialization(enable) Enables or disables exporting feature group data to the export connector associated with the feature group. :param enable: If true, will enable exporting feature group to the connector. If false, will disable. :type enable: bool .. py:method:: create_template(name, template_sql, template_variables, description = None, template_bindings = None, should_attach_feature_group_to_template = False) Create a feature group template. :param name: User-friendly name for this feature group template. :type name: str :param template_sql: The template SQL that will be resolved by applying values from the template variables to generate SQL for a feature group. :type template_sql: str :param template_variables: The template variables for resolving the template. :type template_variables: list :param description: Description of this feature group template. :type description: str :param template_bindings: If the feature group will be attached to the newly created template, set these variable bindings on that feature group. :type template_bindings: list :param should_attach_feature_group_to_template: Set to `True` to convert the feature group to a template feature group and attach it to the newly created template. :type should_attach_feature_group_to_template: bool :returns: The created feature group template. :rtype: FeatureGroupTemplate .. py:method:: suggest_template_for() Suggest values for a feature group template, based on a feature group. :param feature_group_id: Unique identifier associated with the feature group to use for suggesting values to use in the template.
:type feature_group_id: str :returns: The suggested feature group template. :rtype: FeatureGroupTemplate .. py:method:: get_recent_streamed_data() Returns recently streamed data to a streaming feature group. :param feature_group_id: Unique string identifier associated with the feature group. :type feature_group_id: str .. py:method:: append_data(streaming_token, data) Appends new data into the feature group for a given lookup key recordId. :param streaming_token: The streaming token for authenticating requests. :type streaming_token: str :param data: The data to record as a JSON object. :type data: dict .. py:method:: append_multiple_data(streaming_token, data) Appends new data into the feature group for a given lookup key recordId. :param streaming_token: Streaming token for authenticating requests. :type streaming_token: str :param data: Data to record, as a list of JSON objects. :type data: list .. py:method:: upsert_data(data, streaming_token = None, blobs = None) Updates the data in the feature group for a given lookup key record ID if the record ID is found; otherwise, inserts new data into the feature group. :param data: The data to record, in JSON format. :type data: dict :param streaming_token: Optional streaming token for authenticating requests if upserting to a streaming feature group. :type streaming_token: str :param blobs: A dictionary of binary data to populate file fields in the data to upsert to the streaming feature group. :type blobs: None :returns: The feature group row that was upserted. :rtype: FeatureGroupRow .. py:method:: delete_data(primary_key) Deletes a row from the feature group given the primary key. :param primary_key: The primary key value for which to delete the feature group row. :type primary_key: str .. py:method:: get_data(primary_key = None, num_rows = None) Gets the feature group rows for online updatable feature groups. If primary_key is set, the row corresponding to primary_key is returned.
If num_rows is set, at most num_rows of the most recently updated rows are returned. :param primary_key: The primary key value for which to retrieve the feature group row (only for online feature groups). :type primary_key: str :param num_rows: Maximum number of rows to return from the feature group. :type num_rows: int :returns: A list of feature group rows. :rtype: list[FeatureGroupRow] .. py:method:: get_natural_language_explanation(feature_group_version = None, model_id = None) Returns the saved natural language explanation of an artifact with the given ID. The artifact can be a Feature Group, a Feature Group Version, or a Model. :param feature_group_version: A unique string identifier associated with the Feature Group Version. :type feature_group_version: str :param model_id: A unique string identifier associated with the Model. :type model_id: str :returns: The object containing natural language explanation(s) as field(s). :rtype: NaturalLanguageExplanation .. py:method:: generate_natural_language_explanation(feature_group_version = None, model_id = None) Generates a natural language explanation of an artifact with the given ID. The artifact can be a Feature Group, a Feature Group Version, or a Model. :param feature_group_version: A unique string identifier associated with the Feature Group Version. :type feature_group_version: str :param model_id: A unique string identifier associated with the Model. :type model_id: str :returns: The object containing natural language explanation(s) as field(s). :rtype: NaturalLanguageExplanation .. py:method:: wait_for_dataset(timeout = 7200) Waits until the feature group's dataset, if any, is ready for use. :param timeout: The time, in seconds, allowed for the call to finish. If it does not finish within this time, the call times out. Defaults to 7200 seconds. :type timeout: int ..
py:method:: wait_for_upload(timeout = 7200) Waits for a feature group created from a dataframe to be ready for materialization and version creation. :param timeout: The time, in seconds, allowed for the call to finish. If it does not finish within this time, the call times out. Defaults to 7200 seconds. :type timeout: int .. py:method:: wait_for_materialization(timeout = 7200) Waits until the feature group is materialized. :param timeout: The time, in seconds, allowed for the call to finish. If it does not finish within this time, the call times out. Defaults to 7200 seconds. :type timeout: int .. py:method:: wait_for_streaming_ready(timeout = 600) Waits for the feature group indexing config to be applied for streaming. :param timeout: The time, in seconds, allowed for the call to finish. If it does not finish within this time, the call times out. Defaults to 600 seconds. :type timeout: int .. py:method:: get_status(streaming_status = False) Gets the status of the feature group. :returns: A string describing the status of a feature group (pending, complete, etc.). :rtype: str .. py:method:: load_as_pandas() Loads the feature group into a pandas DataFrame. :returns: A pandas DataFrame with annotations and text_snippet columns. :rtype: DataFrame .. py:method:: load_as_pandas_documents(doc_id_column = 'doc_id', document_column = 'page_infos') Loads a feature group with documents data into a pandas DataFrame. :param doc_id_column: The name of the feature / column containing the document ID. :type doc_id_column: str :param document_column: The name of the feature / column which either contains the document data itself or page infos with path to remotely stored documents. This column will be replaced with the extracted document data. :type document_column: str :returns: A pandas DataFrame containing the extracted document data. :rtype: DataFrame ..
py:method:: describe_dataset() Displays the dataset attached to a feature group. :returns: A dataset object with all the relevant information about the dataset. :rtype: Dataset .. py:method:: materialize() Materializes the feature group's latest changes at the time of the API call. Materialization is skipped if nothing has changed since the current latest version. :returns: A feature group object with the latest changes materialized. :rtype: FeatureGroup