Registrar
Registers datasets and their context (metadata) in the database.
This class provides methods to set up the registration system and to register tables and context (metadata).
Attributes
- db_path (str): Path to the database file for retrieving content summaries & context.
__init__
__init__(db_path: str)
setup
setup() -> str
Sets up the database system for registration.
Returns
str: A JSON string representing the result of the process (Response).
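Because setup() returns a JSON string rather than an object, callers will usually decode it before inspecting the result. The following is a minimal sketch using only the standard library. The helper name decode_response and the sample payload are hypothetical: this section does not list the fields of Response, and the sketch assumes the string decodes to a JSON object.

```python
import json
from typing import Any, Dict


def decode_response(raw: str) -> Dict[str, Any]:
    """Decode a JSON string such as the one returned by Registrar.setup()."""
    decoded = json.loads(raw)
    # Assumption: a serialized Response is a JSON object, not a list or scalar.
    if not isinstance(decoded, dict):
        raise ValueError("expected a JSON object")
    return decoded


# Hypothetical payload for illustration; the real keys are not documented here.
sample = '{"status": "ok"}'
result = decode_response(sample)
```

The same helper would apply to the strings returned by add_tables and add_metadata, which are documented with the same return type.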
add_tables
add_tables(
path: str,
creator: str,
source: str = "file",
s3_region: str = None,
s3_access_key: str = None,
s3_secret_access_key: str = None,
accept_duplicates: bool = False,
) -> str
Adds tables into the database.
Args
- path (str): The path to a specific table file or folder (CSV or Parquet).
- creator (str): The creator of the file.
- source (str): The dataset source (either file or s3).
- s3_region (str): Amazon S3 region.
- s3_access_key (str): Amazon S3 access key.
- s3_secret_access_key (str): Amazon S3 secret access key.
- accept_duplicates (bool): Whether to accept duplicate tables.
Returns
str: A JSON string representing the result of the process (Response).
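The S3 parameters only matter when source is s3, so it can help to assemble the arguments in one place before calling add_tables. The sketch below is one possible way to do that, not part of the library: build_add_tables_kwargs is a hypothetical helper, and reading credentials from the conventional AWS environment variables is an assumption about the caller's setup.

```python
import os
from typing import Dict, Mapping, Optional


def build_add_tables_kwargs(
    path: str,
    creator: str,
    source: str = "file",
    accept_duplicates: bool = False,
    env: Optional[Mapping[str, str]] = None,
) -> Dict[str, object]:
    """Assemble keyword arguments that match the documented add_tables signature."""
    if source not in ("file", "s3"):
        raise ValueError(f"source must be 'file' or 's3', got {source!r}")

    kwargs: Dict[str, object] = {
        "path": path,
        "creator": creator,
        "source": source,
        "accept_duplicates": accept_duplicates,
    }
    if source == "s3":
        env = os.environ if env is None else env
        # Assumption: credentials live in the conventional AWS environment variables.
        kwargs["s3_region"] = env.get("AWS_DEFAULT_REGION")
        kwargs["s3_access_key"] = env.get("AWS_ACCESS_KEY_ID")
        kwargs["s3_secret_access_key"] = env.get("AWS_SECRET_ACCESS_KEY")
    return kwargs
```

With a Registrar instance named registrar, the result would be passed as registrar.add_tables(**kwargs).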
add_metadata
add_metadata(metadata_path: str, table_id: str = '') -> str
Adds metadata into the database.
Args
- metadata_path (str): The path to a specific metadata file or folder (TXT or CSV).
- table_id (str): The ID of the table associated with the metadata.
Returns
str: A JSON string representing the result of the process (Response).
__read_table_file
__read_table_file(
path: str, creator: str, accept_duplicates: bool = False
) -> Response
Reads a table file (CSV or Parquet) and registers it in the database. If a table with the same ID already exists, the existing entry is updated.
Args
- path (str): The path to a specific table file (CSV or Parquet).
- creator (str): The creator of the file.
- accept_duplicates (bool): Whether to allow duplicate tables.
Returns
Response: A Response object describing the result of the process.
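The double-underscore prefix on __read_table_file and the other private methods in this reference triggers Python's name mangling, so they are presumably meant to be reached through add_tables and add_metadata rather than called directly. The sketch below shows the mechanism on a stand-in class; Demo and its methods are hypothetical and are not the real Registrar.

```python
class Demo:
    """Stand-in class used only to illustrate name mangling."""

    def __helper(self) -> str:
        return "private"

    def run(self) -> str:
        # Inside the class body, self.__helper is rewritten to self._Demo__helper.
        return self.__helper()


demo = Demo()
public_result = demo.run()

# Outside the class, the attribute only exists under its mangled name.
has_plain_name = hasattr(demo, "__helper")
has_mangled_name = hasattr(demo, "_Demo__helper")
```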
__read_table_folder
__read_table_folder(
folder_path: str, creator: str, accept_duplicates: bool = False
) -> Response
Reads a folder and registers all of its tables to the database.
Args
- folder_path (str): The path to a folder containing tables.
- creator (str): The creator of the file.
- accept_duplicates (bool): Whether to allow duplicate tables.
Returns
Response: A Response object describing the result of the process.
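To preview which files a folder registration might pick up, the table files in a folder can be listed by extension. This is a rough sketch under stated assumptions, not the library's implementation: list_table_files is a hypothetical helper, it looks only at the top level of the folder, and it matches the two formats named in this reference by file suffix.

```python
from pathlib import Path
from typing import List

# Formats named in this reference; matching by suffix is an assumption.
TABLE_SUFFIXES = {".csv", ".parquet"}


def list_table_files(folder_path: str) -> List[Path]:
    """Return the CSV and Parquet files directly inside folder_path, sorted by name."""
    folder = Path(folder_path)
    if not folder.is_dir():
        raise NotADirectoryError(folder_path)
    return sorted(
        p for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() in TABLE_SUFFIXES
    )
```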
__read_metadata_file
__read_metadata_file(metadata_path: str, table_id: str) -> Response
Reads a metadata file (CSV or TXT) and registers it in the database.
Args
- metadata_path (str): The path to a specific metadata file (CSV or TXT).
- table_id (str): The ID of the table associated with the metadata.
Returns
Response: A Response object describing the result of the process.
__read_metadata_folder
__read_metadata_folder(metadata_path: str, table_id: str) -> Response
Reads a folder and registers all of its metadata to the database.
Args
- metadata_path (str): The path to a folder containing metadata files.
- table_id (str): The ID of the table associated with the metadata files.
Returns
Response: A Response object describing the result of the process.
__insert_metadata
__insert_metadata(metadata_content: str, table_id: str) -> Response
Inserts metadata associated with the table table_id into the database.
Args
- metadata_content (str): The content of the metadata.
- table_id (str): The ID of the table associated with the metadata.
Returns
Response: A Response object describing the result of the process.