citrine.resources.gemtables module
- class citrine.resources.gemtables.GemTable
Bases: Resource[Table]
A 2-dimensional projection of data.
GEM Tables are the basic unit used to flatten and manipulate data objects. While data objects can represent complex materials data, the format is NOT conducive to analysis and machine learning. GEM Tables, however, can be used to ‘flatten’ data objects into useful projections.
- access_control_dict() dict
Return an access control entity representation of this resource. Internal use only.
- classmethod build(data: dict) Self
Build an instance of this object from given data.
- dump() dict
Dump this instance.
- property config: TableConfig
Configuration used to build the table.
- property description: str
Description of the table (inherited from the config).
- download_url = None
URL pointing to the location of the GEM Table’s contents. This is an expiring download link and is not unique.
- Type:
str
- property name: str
Name of the table (inherited from the config).
- uid = None
Unique Citrine ID of this GEM Table.
- Type:
UUID
- version = None
Version number of the GEM Table. The first table built from a given config is version 1.
- Type:
int
- class citrine.resources.gemtables.GemTableCollection(*args, team_id: UUID | None = None, project_id: UUID | None = None, session: Session | None = None)
Bases: Collection[GemTable]
Represents the collection of all tables associated with a project.
- build_from_config(config: TableConfig | str | UUID, *, version: int | str | None = None, timeout: float = 900) GemTable
Build a table from a table config, waiting for the build job to complete.
- Parameters:
config – The persisted table config from which to build a table (or its ID).
version – The version of the table config; only necessary when config is a uid.
timeout – Amount of time to wait on build job (in seconds) before giving up. Defaults to 15 minutes. Note that this number has no effect on the build job itself, which can also time out server-side.
- Returns:
A new table built from the supplied config.
- Return type:
GemTable
- delete(uid: UUID | str)
Tables cannot be deleted at this time.
- get(uid: UUID | str, *, version: int | None = None) GemTable
Get a Table’s metadata. If no version is specified, get the most recent version.
- get_by_build_job(job: JobSubmissionResponse | UUID, *, timeout: float = 900) GemTable
Get a table by its build job, waiting for the job to complete if necessary.
- Parameters:
job – The job submission object or job ID for the table build.
timeout – Amount of time to wait on build job (in seconds) before giving up. Defaults to 15 minutes. Note that this number has no effect on the build job itself, which can also time out server-side.
- Returns:
The table built by the specified job.
- Return type:
GemTable
- initiate_build(config: TableConfig | str | UUID, *, version: str | UUID | None = None) JobSubmissionResponse
Initiate a table build with the provided config.
This method does not wait for job completion. If you do not need to build multiple tables in parallel, using build_from_config is preferable to using this method. Use get_by_build_job to wait for the result of this method.
- Parameters:
config – The persisted table config from which to build a table (or its ID).
version – The version of the table config; only necessary when config is a uid.
- Returns:
Information about the submitted job. Note that the format of this object may be unstable.
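The parallel pattern described for initiate_build can be sketched roughly as follows: submit every build first, then collect each result with get_by_build_job. The helper name build_tables_in_parallel is hypothetical, and tables is assumed to be a GemTableCollection (or any object exposing those two methods).

```python
from typing import Any, Iterable, List


def build_tables_in_parallel(
    tables: Any, configs: Iterable[Any], timeout: float = 900
) -> List[Any]:
    """Submit one build job per config, then wait for each result.

    `tables` is assumed to be a GemTableCollection and `configs` persisted
    table configs. This helper is illustrative, not part of the library.
    """
    # Submit all jobs up front so the builds can proceed in parallel.
    jobs = [tables.initiate_build(config) for config in configs]
    # Wait for each job in turn; the timeout applies to each wait separately.
    return [tables.get_by_build_job(job, timeout=timeout) for job in jobs]
```

Results come back in the same order as the configs that were submitted.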
- list(*, per_page: int = 100) Iterator[ResourceType]
Paginate over the elements of the collection.
Leaving page and per_page as default values will yield all elements in the collection, paginating over all available pages.
- Parameters:
per_page (int, optional) – Max number of results to return per page. Default is 100. This parameter is used when making requests to the backend service. If the page parameter is specified it limits the maximum number of elements in the response.
- Returns:
An iterator that can be used to loop over all the resources in this collection. Use list() to force evaluation of all results into an in-memory list.
- Return type:
Iterator[ResourceType]
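Because list returns a lazy iterator, one possible way to work with the results repeatedly is to materialize them once. The sketch below indexes tables by (uid, version); the helper name index_tables is hypothetical, and tables is assumed to be a GemTableCollection whose items expose uid and version.

```python
from typing import Any, Dict, Tuple


def index_tables(tables: Any, per_page: int = 100) -> Dict[Tuple[Any, Any], Any]:
    """Consume the iterator from `tables.list` into a dict.

    Keys are (uid, version) pairs, since uid alone does not distinguish
    versions of the same table. Illustrative helper, not a library function.
    """
    return {(t.uid, t.version): t for t in tables.list(per_page=per_page)}
```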
- list_by_config(table_config_uid: UUID, *, per_page: int = 100) Iterable[GemTable]
List the versions of a table associated with a given Table Config UID.
This is a paginated collection, similar to a .list() call.
- Parameters:
table_config_uid – The Table Config UID.
per_page – The number of items to fetch per-page.
- Returns:
An iterable of the versions of the Tables (as Table objects).
- list_versions(uid: UUID, *, per_page: int = 100) Iterable[GemTable]
List the versions of a table given a specific Table UID.
This is a paginated collection, similar to a .list() call.
- Parameters:
uid – The Table UID.
per_page – The number of items to fetch per-page.
- Returns:
An iterable of the versions of the Tables (as Table objects).
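As a small illustration of list_versions, the following sketch collects the version numbers available for one table. The helper name available_versions is hypothetical, and tables is assumed to be a GemTableCollection.

```python
from typing import Any, List


def available_versions(tables: Any, uid: Any, per_page: int = 100) -> List[int]:
    """Return the sorted version numbers of the table with the given UID.

    Relies only on `list_versions` and the `version` attribute of each
    returned table. Illustrative helper, not part of the library.
    """
    return sorted(t.version for t in tables.list_versions(uid, per_page=per_page))
```

A specific version from this list could then be fetched with get(uid, version=...).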
- read(*, table: GemTable | UUID | str, local_path: str)
Read the Table file from S3 to your local system.
If a Table object is not provided, retrieve it using the provided table and version ids.
- Parameters:
table – The table to read, given as a GemTable object or its ID.
local_path – The local path at which to save the file.
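A common sequence is to build a table and then save its contents locally. The sketch below chains build_from_config and read; the helper name build_and_download is hypothetical, and tables is assumed to be a GemTableCollection.

```python
from typing import Any


def build_and_download(
    tables: Any, config: Any, local_path: str, timeout: float = 900
) -> Any:
    """Build a table from `config`, then save its file to `local_path`.

    Illustrative helper, not part of the library. Both `table` and
    `local_path` are keyword-only arguments of `read`.
    """
    table = tables.build_from_config(config, timeout=timeout)
    tables.read(table=table, local_path=local_path)
    return table
```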
- read_to_memory(table: GemTable | UUID | str) str
Read the Table file from S3 into local memory.
If a Table object is not provided, retrieve it using the provided table and version ids.
- Parameters:
table – The table to read from S3, given as a GemTable object or its ID.
- Returns:
The contents of the file from S3, which is expected to be formatted as a CSV.
- Return type:
str
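Since read_to_memory returns the file contents as a string that is expected to be CSV, one way to work with it is to parse it with the standard library. The helper name table_rows is hypothetical, and tables is assumed to be a GemTableCollection.

```python
import csv
import io
from typing import Any, Dict, List


def table_rows(tables: Any, table: Any) -> List[Dict[str, str]]:
    """Read a table into memory and parse it as CSV.

    Returns one dict per data row, keyed by the header row. All values are
    strings; no type conversion is attempted. Illustrative helper only.
    """
    text = tables.read_to_memory(table)
    return list(csv.DictReader(io.StringIO(text)))
```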
- class citrine.resources.gemtables.GemTableVersionPaginator
Bases: Paginator[GemTable]
A Paginator for GEM Tables.
All versions of a single table share the same UID, so the pair (uid, version) must be used for comparisons.
- paginate(page_fetcher: Callable[[int | None, int], Tuple[Iterable[dict], str]], collection_builder: Callable[[Iterable[dict]], Iterable[ResourceType]], per_page: int = 100, search_params: dict | None = None, deduplicate: bool = True) Iterator[ResourceType]
A generic support class to paginate requests into an iterable of a built object.
Leaving page and per_page as default values will yield all elements in the collection, paginating over all available pages.
The page fetcher returns an Iterable of subsequent items on every invocation, returning an empty iterable when fetching is finished.
- Parameters:
page_fetcher (Callable[[Optional[int], int], Tuple[Iterable[dict], str]]) – Fetches the next page of elements
collection_builder (Callable[[Iterable[dict]], Iterable[ResourceType]]) – Builds each element in the collection into the appropriate resource
per_page (int, optional) – Max number of results to return per page. Default is 100. This parameter is used when making requests to the backend service. If the page parameter is specified it limits the maximum number of elements in the response.
search_params (dict, Optional) – A dictionary representing the request body to a page_fetcher function. The page_fetcher function should accept a keyword argument “search_params” if it passes a request body to the target endpoint. If no search_params are supplied, no search_params argument is passed to the page_fetcher function.
deduplicate (bool, optional) – Whether to deduplicate the yielded resources by their uid. The default is true.
- Returns:
Resources in this collection.
- Return type:
Iterator[ResourceType]
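The pagination loop and the (uid, version) comparison can be illustrated with a simplified sketch. This is not the library's implementation: it assumes pages are numbered from 1, that each item is a dict with hypothetical "uid" and "version" keys, and it omits collection_builder and search_params. The name paginate_sketch is made up for this example.

```python
from typing import Callable, Iterable, Iterator, Optional, Tuple


def paginate_sketch(
    page_fetcher: Callable[[Optional[int], int], Tuple[Iterable[dict], str]],
    per_page: int = 100,
    deduplicate: bool = True,
) -> Iterator[dict]:
    """Yield items page by page until the fetcher returns an empty page."""
    seen = set()
    page = 1  # assumption: the first page is numbered 1
    while True:
        # The second element of the tuple is not used in this sketch.
        items, _ = page_fetcher(page, per_page)
        items = list(items)
        if not items:
            return  # an empty page signals that fetching is finished
        for item in items:
            # Versions of one table share a uid, so compare on both fields.
            key = (item["uid"], item["version"])
            if deduplicate:
                if key in seen:
                    continue
                seen.add(key)
            yield item
        page += 1
```

Deduplicating on uid alone would collapse every version of a table into a single result, which is why the pair is used as the key here.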