Registrar

Registers a dataset and its context (metadata) in the database.

This class provides methods to set up the registration system and to register tables and their context (metadata).

Attributes
  • db_path (str): Path to the database file for retrieving content summaries & context.

__init__

__init__(db_path: str)

setup

setup() -> str

Sets up the database system for registration.

Returns
  • str: A JSON string representing the result of the process (Response).
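
Because setup returns a JSON string rather than an object, callers will usually decode it before checking the outcome. The following is a minimal sketch, assuming the Response serializes to a JSON object with status_code and message fields; both field names are hypothetical, since the actual layout of Response is not given here.

```python
import json

# Stand-in for the string returned by Registrar.setup(). The field names
# "status_code" and "message" are assumptions made for illustration only.
raw_response = '{"status_code": 200, "message": "Database setup complete."}'


def parse_response(raw: str) -> dict:
    """Decode a JSON response string and make sure it is a JSON object."""
    parsed = json.loads(raw)
    if not isinstance(parsed, dict):
        raise ValueError("Expected a JSON object")
    return parsed


result = parse_response(raw_response)
print(result["message"])
```

The same decoding step should apply to the other public methods, which also return a JSON string.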

add_tables

add_tables(
    path: str,
    creator: str,
    source: str = "file",
    s3_region: str = None,
    s3_access_key: str = None,
    s3_secret_access_key: str = None,
    accept_duplicates: bool = False,
) -> str

Adds tables into the database.

Args
  • path (str): The path to a specific table file/folder (CSV or Parquet).
  • creator (str): The creator of the file.
  • source (str): The dataset source (either "file" or "s3").
  • s3_region (str): Amazon S3 region.
  • s3_access_key (str): Amazon S3 access key.
  • s3_secret_access_key (str): Amazon S3 secret access key.
  • accept_duplicates (bool): Option to accept duplicate tables or not.
Returns
  • str: A JSON string representing the result of the process (Response).
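
The arguments above can be checked before they reach add_tables. The helper below is a hypothetical sketch, not part of the class: it assumes that source accepts only "file" or "s3", and that all three S3 settings must be present when source is "s3". The documentation does not state the second rule.

```python
from typing import Optional


def build_add_tables_args(
    path: str,
    creator: str,
    source: str = "file",
    s3_region: Optional[str] = None,
    s3_access_key: Optional[str] = None,
    s3_secret_access_key: Optional[str] = None,
    accept_duplicates: bool = False,
) -> dict:
    """Hypothetical helper: validate arguments intended for add_tables."""
    if source not in ("file", "s3"):
        raise ValueError(f"source must be 'file' or 's3', got {source!r}")
    s3_settings = (s3_region, s3_access_key, s3_secret_access_key)
    # Assumption: an S3 source needs the region and both keys.
    if source == "s3" and any(value is None for value in s3_settings):
        raise ValueError("S3 sources need a region, access key, and secret access key")
    return {
        "path": path,
        "creator": creator,
        "source": source,
        "s3_region": s3_region,
        "s3_access_key": s3_access_key,
        "s3_secret_access_key": s3_secret_access_key,
        "accept_duplicates": accept_duplicates,
    }


# The path and creator values are placeholders.
args = build_add_tables_args("data/tables", "alice")
# With a real instance, the call would look like: registrar.add_tables(**args)
```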

add_metadata

add_metadata(metadata_path: str, table_id: str = '') -> str

Adds metadata into the database.

Args
  • metadata_path (str): The path to a specific metadata file/folder (TXT or CSV).
  • table_id (str): The ID of the table associated with the metadata.
Returns
  • str: A JSON string representing the result of the process (Response).
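
Since metadata_path may point to either a file or a folder, a caller may want to know which case applies before registering. A minimal sketch, assuming metadata files are recognized by a .txt or .csv extension; classify_metadata_path is a hypothetical helper and does not reflect how the class itself decides.

```python
import os
import tempfile

# Assumption: metadata files are recognized by these extensions.
METADATA_EXTENSIONS = (".txt", ".csv")


def classify_metadata_path(metadata_path: str) -> str:
    """Hypothetical helper: report whether a path is a metadata file or folder."""
    if os.path.isdir(metadata_path):
        return "folder"
    if os.path.isfile(metadata_path) and metadata_path.lower().endswith(METADATA_EXTENSIONS):
        return "file"
    raise ValueError(f"Not a metadata file or folder: {metadata_path}")


# Build a throwaway folder with one metadata file to exercise both branches.
with tempfile.TemporaryDirectory() as tmp_dir:
    file_path = os.path.join(tmp_dir, "sales_context.txt")
    with open(file_path, "w", encoding="utf-8") as handle:
        handle.write("Monthly sales figures per region.")
    folder_kind = classify_metadata_path(tmp_dir)
    file_kind = classify_metadata_path(file_path)

print(folder_kind, file_kind)
```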

__read_table_file

__read_table_file(
    path: str, creator: str, accept_duplicates: bool = False
) -> Response

Reads a table file (CSV or Parquet) and registers it in the database. If a table with the same ID already exists, the existing entry is updated.

Args
  • path (str): The path to a specific table file (CSV or Parquet).
  • creator (str): The creator of the file.
  • accept_duplicates (bool): Option to allow duplicate tables or not.
Returns
  • Response: A Response object of the process.
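
The insert-or-update behavior described above can be illustrated with an in-memory stand-in. This is only a sketch: the registry dictionary, the register_table function, and the use of the file name as the table ID are all assumptions, and the real method writes to the database and returns a Response.

```python
import csv
import io


def register_table(registry: dict, table_id: str, csv_text: str) -> str:
    """Insert a table into an in-memory registry, or update it if the ID exists."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    action = "updated" if table_id in registry else "inserted"
    registry[table_id] = rows
    return action


registry = {}
# Assumption: the table ID is the file name; the real ID scheme is not documented here.
first = register_table(registry, "sales.csv", "id,amount\n1,10\n")
second = register_table(registry, "sales.csv", "id,amount\n1,10\n2,25\n")
print(first, second, len(registry["sales.csv"]))  # inserted updated 2
```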

__read_table_folder

__read_table_folder(
    folder_path: str, creator: str, accept_duplicates: bool = False
) -> Response

Reads a folder and registers all of its tables to the database.

Args
  • folder_path (str): The path to a folder containing tables.
  • creator (str): The creator of the files.
  • accept_duplicates (bool): Option to allow duplicate tables or not.
Returns
  • Response: A Response object of the process.
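
Collecting the table files in a folder might look like the sketch below. It assumes a non-recursive scan that keeps only .csv and .parquet files; list_table_files is a hypothetical helper, and the actual traversal rules of the method are not documented here.

```python
import tempfile
from pathlib import Path
from typing import List

# Assumption: table files are recognized by these suffixes.
TABLE_SUFFIXES = {".csv", ".parquet"}


def list_table_files(folder_path: str) -> List[Path]:
    """Hypothetical helper: return the table files directly inside a folder."""
    folder = Path(folder_path)
    return sorted(
        entry
        for entry in folder.iterdir()
        if entry.is_file() and entry.suffix.lower() in TABLE_SUFFIXES
    )


# Create a throwaway folder with two table files and one unrelated file.
with tempfile.TemporaryDirectory() as tmp_dir:
    for name in ("a.csv", "b.parquet", "notes.txt"):
        (Path(tmp_dir) / name).write_text("placeholder", encoding="utf-8")
    found_names = [entry.name for entry in list_table_files(tmp_dir)]

print(found_names)  # ['a.csv', 'b.parquet']
```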

__read_metadata_file

__read_metadata_file(metadata_path: str, table_id: str) -> Response

Reads a metadata file (CSV or TXT) and registers it in the database.

Args
  • metadata_path (str): The path to a specific metadata file (CSV or TXT).
  • table_id (str): The associated table of the metadata.
Returns
  • Response: A Response object of the process.

__read_metadata_folder

__read_metadata_folder(metadata_path: str, table_id: str) -> Response

Reads a folder and registers all of its metadata to the database.

Args
  • metadata_path (str): The path to a folder containing metadata files.
  • table_id (str): The associated table of the metadata files.
Returns
  • Response: A Response object of the process.

__insert_metadata

__insert_metadata(metadata_content: str, table_id: str) -> Response

Inserts metadata associated with the table identified by table_id into the database.

Args
  • metadata_content (str): The content of the metadata.
  • table_id (str): The associated table of the metadata.
Returns
  • Response: A Response object of the process.
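
The methods whose names begin with two underscores are subject to Python's name mangling, so they are not reachable under their plain names from outside the class and are best treated as internal. A short illustration using a stand-in class (RegistrarStandIn is hypothetical and is not the real Registrar):

```python
class RegistrarStandIn:
    """Stand-in class used only to demonstrate name mangling."""

    def __insert_metadata(self, metadata_content: str, table_id: str) -> str:
        return f"{table_id}: {metadata_content}"

    def add_metadata(self, metadata_content: str, table_id: str) -> str:
        # Inside the class body, the double-underscore name resolves normally.
        return self.__insert_metadata(metadata_content, table_id)


stand_in = RegistrarStandIn()
public_result = stand_in.add_metadata("Monthly sales figures", "sales")

# Outside the class, only the mangled name exists on the instance.
has_plain_name = hasattr(stand_in, "__insert_metadata")
has_mangled_name = hasattr(stand_in, "_RegistrarStandIn__insert_metadata")
print(public_result, has_plain_name, has_mangled_name)
```

In practice this means add_tables and add_metadata are the entry points, and the double-underscore methods are reached through them.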