Python API

Get and set runtime settings

terracotta.get_settings()

Returns the current set of global runtime settings.

Example

>>> import terracotta as tc
>>> tc.get_settings().DEBUG
False

Return type

TerracottaSettings

terracotta.update_settings(**new_config)

Update the global Terracotta runtime settings.

Parameters

new_config (Any) – Options to override. Must be valid Terracotta settings.

Example

>>> import terracotta as tc
>>> tc.get_settings().DEFAULT_TILE_SIZE
(256, 256)
>>> tc.update_settings(DEFAULT_TILE_SIZE=[512, 512])
>>> tc.get_settings().DEFAULT_TILE_SIZE
(512, 512)

Return type

None

Get a driver instance

terracotta.get_driver(url_or_path, provider=None)

Retrieve a Terracotta driver instance for the given path.

This function always returns the same instance for identical inputs.

Warning

Always retrieve Driver instances through this function instead of instantiating them directly to prevent caching issues.

Parameters
  • url_or_path (Union[str, Path]) – A path identifying the database to connect to. The expected format depends on the driver provider.

  • provider (Optional[str]) – Driver provider to use (one of sqlite, sqlite-remote, mysql; default: auto-detect).

Example

>>> import terracotta as tc
>>> tc.get_driver('tc.sqlite')
SQLiteDriver('/home/terracotta/tc.sqlite')
>>> tc.get_driver('mysql://root@localhost/tc')
MySQLDriver('mysql://root@localhost:3306/tc')
>>> # pass provider if path is given in a non-standard way
>>> tc.get_driver('root@localhost/tc', provider='mysql')
MySQLDriver('mysql://root@localhost:3306/tc')

Return type

Driver
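The guarantee that identical inputs yield the same instance can be pictured as memoization. The following is a minimal sketch of that idea using only the standard library; it is not Terracotta's actual implementation, and DummyDriver and get_cached_driver are hypothetical names used purely for illustration.

```python
from functools import lru_cache


class DummyDriver:
    """Hypothetical stand-in for a driver class (not part of Terracotta)."""

    def __init__(self, path):
        self.path = path


@lru_cache(maxsize=None)
def get_cached_driver(path):
    # lru_cache returns the stored object when called again with equal arguments
    return DummyDriver(path)


first = get_cached_driver("tc.sqlite")
second = get_cached_driver("tc.sqlite")
direct = DummyDriver("tc.sqlite")

print(first is second)  # True: both names refer to the one cached instance
print(first is direct)  # False: direct instantiation bypasses the cache
```

The second comparison shows why direct instantiation is discouraged: an object created outside the factory shares no cached state with the instance the factory hands out.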

SQLite driver

class terracotta.drivers.sqlite.SQLiteDriver(path)

An SQLite-backed raster driver.

Assumes raster data to be present in separate GDAL-readable files on disk or remotely. Stores metadata and paths to raster files in SQLite.

This is the simplest Terracotta driver, as it requires no additional infrastructure. The SQLite database is simply a file that can be stored together with the actual raster files.

Note

This driver requires the SQLite database to be physically present on the server. For remote SQLite databases hosted on S3, use RemoteSQLiteDriver.

The SQLite database consists of four tables:

  • terracotta: Metadata about the database itself.

  • keys: Contains two columns holding all available keys and their descriptions.

  • datasets: Maps key values to physical raster path.

  • metadata: Contains actual metadata as separate columns. Indexed via key values.
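To check which tables a database file contains, the standard-library sqlite3 module can query SQLite's built-in sqlite_master catalog. The sketch below builds an in-memory stand-in with the four table names listed above; the single placeholder column is an assumption made for illustration and does not reflect Terracotta's real schema.

```python
import sqlite3

# In-memory stand-in; with a real database, pass the file path instead.
conn = sqlite3.connect(":memory:")
for table in ("terracotta", "keys", "datasets", "metadata"):
    # Placeholder column only: the real column layout is not shown here.
    conn.execute(f'CREATE TABLE "{table}" (placeholder TEXT)')

# sqlite_master lists every schema object stored in the database
rows = conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
).fetchall()
table_names = [name for (name,) in rows]
conn.close()

print(table_names)  # ['datasets', 'keys', 'metadata', 'terracotta']
```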

This driver caches raster data, but not metadata.

Warning

This driver is not thread-safe. It is not possible to connect to the database outside the main thread.

__init__(path)

Initialize the SQLiteDriver.

This should not be called directly; use get_driver() instead.

Parameters

path (Union[str, Path]) – File path to target SQLite database (may or may not exist yet)

classmethod compute_metadata(cls, raster_path, *, extra_metadata=None, use_chunks=None, max_shape=None)

Read given raster file and compute metadata from it.

This handles most of the heavy lifting during raster ingestion. The returned metadata can be passed directly to insert().

Parameters
  • raster_path (str) – Path to GDAL-readable raster file

  • extra_metadata (Optional[Any]) – Any additional metadata to attach to the dataset. Will be JSON-serialized and returned verbatim by get_metadata().

  • use_chunks (Optional[bool]) – Whether to process the image in chunks (slower, but uses less memory). If not given, use chunks for large images only.

  • max_shape (Optional[Sequence[int]]) – Maximum number of pixels to use in each dimension when computing metadata. Setting this to a relatively small size such as (1024, 1024) results in much faster metadata computation for large images, at the expense of accuracy.

Return type

Dict[str, Any]

connect()

Context manager to connect to a given database and clean up on exit.

This allows you to pool interactions with the database to prevent possibly expensive reconnects, or to roll back several interactions if one of them fails.

Note

Make sure to call create() on a fresh database before using this method.

Example

>>> import terracotta as tc
>>> driver = tc.get_driver('tc.sqlite')
>>> with driver.connect():
...     for keys, dataset in datasets.items():
...         # connection will be kept open between insert operations
...         driver.insert(keys, dataset)

Return type

AbstractContextManager
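The roll-back behavior described above can be illustrated with a small context manager around a standard-library sqlite3 connection. This is a sketch of the general pattern only, not the driver's own code; the connect helper and the datasets table layout below are assumptions made for the example.

```python
import sqlite3
from contextlib import contextmanager


@contextmanager
def connect(conn):
    """Commit if the block succeeds, roll back if it raises."""
    try:
        yield conn
        conn.commit()
    except Exception:
        conn.rollback()
        raise


db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE datasets (name TEXT, path TEXT)")
db.commit()

try:
    with connect(db) as conn:
        conn.execute("INSERT INTO datasets VALUES ('a', 'a.tif')")
        raise RuntimeError("simulated failure after the first insert")
except RuntimeError:
    pass

row_count = db.execute("SELECT COUNT(*) FROM datasets").fetchone()[0]
print(row_count)  # 0: the insert was rolled back together with the failed block
```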

create(keys, key_descriptions=None)

Create and initialize database with empty tables.

This must be called before opening the first connection. Tables must not exist already.

Parameters
  • keys (Sequence[str]) – Key names to use throughout the Terracotta database.

  • key_descriptions (Optional[Mapping[str, str]]) – Optional (but recommended) full-text description for some keys, in the form of {key_name: description}.

Return type

None

property db_version

Terracotta version used to create the database

Return type

str

delete(keys)

Remove a dataset from the metadata database.

Parameters

keys (Union[Sequence[str], Mapping[str, str]]) – Keys of the dataset. Can either be given as a sequence of key values, or as a mapping {key_name: key_value}.

Return type

None
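Several methods accept keys either as a sequence of values or as a {key_name: key_value} mapping. The sketch below shows how the two forms can be reduced to the same tuple; normalize_keys is a hypothetical helper, and the key names type and band are assumptions chosen to match the example data used elsewhere on this page.

```python
from collections.abc import Mapping


def normalize_keys(keys, key_names):
    """Return key values as a tuple ordered like key_names (hypothetical helper)."""
    if isinstance(keys, Mapping):
        missing = [name for name in key_names if name not in keys]
        if missing:
            raise KeyError(f"missing keys: {missing}")
        # A mapping may list keys in any order; key_names fixes the order
        return tuple(keys[name] for name in key_names)
    if len(keys) != len(key_names):
        raise ValueError("wrong number of key values")
    # A sequence is assumed to be ordered like key_names already
    return tuple(keys)


key_names = ("type", "date", "band")
from_sequence = normalize_keys(["reflectance", "20180101", "B04"], key_names)
from_mapping = normalize_keys(
    {"band": "B04", "type": "reflectance", "date": "20180101"}, key_names
)
print(from_sequence == from_mapping)  # True
```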

get_datasets(where=None, page=0, limit=None)

Retrieve keys and file paths of datasets.

Parameters
  • where (Optional[Mapping[str, str]]) – Constraints on returned datasets in the form {key_name: allowed_key_value}. Returns all datasets if not given (default).

  • page (int) – Current page of results. Has no effect if limit is not given.

  • limit (Optional[int]) – If given, return at most this many datasets. Unlimited by default.

Return type

Dict[Tuple[str, …], str]

Returns

dict containing {(key_value1, key_value2, ...): raster_file_path}

Example

>>> import terracotta as tc
>>> driver = tc.get_driver('tc.sqlite')
>>> driver.get_datasets()
{
    ('reflectance', '20180101', 'B04'): 'reflectance_20180101_B04.tif',
    ('reflectance', '20180102', 'B04'): 'reflectance_20180102_B04.tif',
}
>>> driver.get_datasets({'date': '20180101'})
{('reflectance', '20180101', 'B04'): 'reflectance_20180101_B04.tif'}
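How page and limit interact can be sketched with plain slicing. The example assumes zero-based pages that each hold limit items, which the parameter descriptions suggest but do not state outright; paginate is a hypothetical helper and the file names are made up.

```python
def paginate(datasets, page=0, limit=None):
    """Slice a dict of datasets, assuming zero-based pages of `limit` items."""
    items = list(datasets.items())
    if limit is None:
        # Without a limit, the page argument has no effect
        return dict(items)
    start = page * limit
    return dict(items[start:start + limit])


# Five made-up datasets, one per day
datasets = {
    ("reflectance", f"201801{day:02d}", "B04"): f"reflectance_201801{day:02d}_B04.tif"
    for day in range(1, 6)
}

first_page = paginate(datasets, page=0, limit=2)
last_page = paginate(datasets, page=2, limit=2)
everything = paginate(datasets, page=3)

print(len(first_page), len(last_page), len(everything))  # 2 1 5
```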
get_keys()

Get all known keys and their full-text descriptions.

Return type

OrderedDict

Returns

An OrderedDict in the form {key_name: key_description}

get_metadata(keys)

Return all stored metadata for given keys.

Parameters

keys (Union[Sequence[str], Mapping[str, str]]) – Keys of the requested dataset. Can either be given as a sequence of key values, or as a mapping {key_name: key_value}.

Return type

Dict[str, Any]

Returns

A dict with the following keys:

  • range: global minimum and maximum value in dataset

  • bounds: physical bounds covered by dataset in latitude-longitude projection

  • convex_hull: GeoJSON shape specifying total data coverage in latitude-longitude projection

  • percentiles: array of pre-computed percentiles from 1% through 99%

  • mean: global mean

  • stdev: global standard deviation

  • metadata: any additional client-relevant metadata
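To make the statistical fields concrete, the sketch below computes values of the same shape for a small list of numbers using the standard-library statistics module. It only illustrates what each field means; the choice of population standard deviation and of the default quantile method are assumptions, and Terracotta's own computation on raster data may differ.

```python
import statistics


def summarize(values):
    """Compute summary statistics named like the metadata fields above."""
    return {
        "range": (min(values), max(values)),
        "mean": statistics.fmean(values),
        # Population standard deviation (an assumption for this sketch)
        "stdev": statistics.pstdev(values),
        # n=100 yields 99 cut points, i.e. the 1% through 99% percentiles
        "percentiles": statistics.quantiles(values, n=100),
    }


sample = [float(v) for v in range(1, 201)]
summary = summarize(sample)

print(summary["range"])             # (1.0, 200.0)
print(len(summary["percentiles"]))  # 99
```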

get_raster_tile(keys, *, tile_bounds=None, tile_size=None, preserve_values=False, asynchronous=False)

Load a raster tile with given keys and bounds.

Parameters
  • keys (Union[Sequence[str], Mapping[str, str]]) – Keys of the requested dataset. Can either be given as a sequence of key values, or as a mapping {key_name: key_value}.

  • tile_bounds (Optional[Sequence[float]]) – Physical bounds of the tile to read, in Web Mercator projection (EPSG:3857). Reads the whole dataset if not given.

  • tile_size (Optional[Sequence[int]]) – Shape of the output array to return. Must be two-dimensional. Defaults to DEFAULT_TILE_SIZE.

  • preserve_values (bool) – Whether to preserve exact numerical values (e.g. when reading categorical data). Sets all interpolation to nearest neighbor.

  • asynchronous (bool) – If True, the tile will be read asynchronously in a separate thread. The function then returns immediately with a Future that can be used to retrieve the result.

Return type

Any

Returns

Requested tile as MaskedArray of shape tile_size if asynchronous=False, otherwise a Future containing the result.
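Web Mercator bounds for a standard XYZ map tile can be derived with a few lines of arithmetic. The sketch below uses the common tiling scheme in which tile (0, 0) sits in the top-left corner; the helper name xyz_tile_bounds and the (xmin, ymin, xmax, ymax) ordering are assumptions, so check the ordering expected by tile_bounds before passing the result on.

```python
import math

# Half the side length of the Web Mercator (EPSG:3857) square, in meters
ORIGIN_SHIFT = math.pi * 6378137.0


def xyz_tile_bounds(x, y, z):
    """Return (xmin, ymin, xmax, ymax) of an XYZ tile in EPSG:3857."""
    tile_extent = 2 * ORIGIN_SHIFT / 2 ** z
    xmin = -ORIGIN_SHIFT + x * tile_extent
    ymax = ORIGIN_SHIFT - y * tile_extent  # XYZ rows count down from the top
    return (xmin, ymax - tile_extent, xmin + tile_extent, ymax)


world = xyz_tile_bounds(0, 0, 0)     # zoom 0: a single tile covers the whole square
quadrant = xyz_tile_bounds(1, 0, 1)  # zoom 1: the top-right quarter
print(world)
print(quadrant)
```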

insert(keys, filepath, *, metadata=None, skip_metadata=False, override_path=None)

Insert a raster file into the database.

Parameters
  • keys (Union[Sequence[str], Mapping[str, str]]) – Keys identifying the new dataset. Can either be given as a sequence of key values, or as a mapping {key_name: key_value}.

  • filepath (str) – Path to the GDAL-readable raster file.

  • metadata (Optional[Mapping[str, Any]]) – If not given (default), call compute_metadata() with default arguments to compute raster metadata. Otherwise, use the given values. This can be used to decouple metadata computation from insertion, or to use the optional arguments of compute_metadata().

  • skip_metadata (bool) – Do not compute any raster metadata (will be computed during the first request instead). Use sparingly; this option has a detrimental effect on the end user experience and might lead to surprising results. Has no effect if metadata is given.

  • override_path (Optional[str]) – Override the path to the raster file in the database. Use this option if you intend to copy the data somewhere else after insertion (e.g. when moving files to a cloud storage later on).

Return type

None
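Decoupling metadata computation from insertion, as described for the metadata parameter, could look like the following sketch. It relies only on the compute_metadata(), connect(), and insert() methods documented on this page; ingest_with_fast_metadata is a hypothetical helper name, and the driver is passed in so that the function works with any object offering those methods.

```python
def ingest_with_fast_metadata(driver, keys, raster_path, max_shape=(1024, 1024)):
    """Compute metadata on a size-limited raster, then insert it (hypothetical helper)."""
    # Limiting max_shape trades accuracy for speed on large rasters
    metadata = driver.compute_metadata(raster_path, max_shape=max_shape)
    with driver.connect():
        # Passing metadata explicitly skips the default computation in insert()
        driver.insert(keys, raster_path, metadata=metadata)
    return metadata


# Intended usage, assuming Terracotta is installed and create() has been called:
#     import terracotta as tc
#     driver = tc.get_driver('tc.sqlite')
#     ingest_with_fast_metadata(driver, ('reflectance', '20180101', 'B04'),
#                               'reflectance_20180101_B04.tif')
```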

property key_names

Names of all keys defined by the database

Return type

Tuple[str, …]

Remote SQLite driver

class terracotta.drivers.sqlite_remote.RemoteSQLiteDriver(remote_path)

An SQLite-backed raster driver, where the database file is stored remotely on S3.

Assumes raster data to be present in separate GDAL-readable files on disk or remotely. Stores metadata and paths to raster files in SQLite.

See also

SQLiteDriver for the local version of this driver.

The SQLite database is simply a file that can be stored together with the actual raster files on S3. Before handling the first request, this driver downloads a temporary copy of the remote database file. It is thus not suitable for large databases.

The local database copy will be updated in regular intervals defined by REMOTE_DB_CACHE_TTL.
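Refreshing a local copy after a time-to-live has elapsed is a generic caching pattern. The sketch below shows one way to express it with the standard library; TTLCache and its load callback are hypothetical, the 300-second value is arbitrary, and nothing here is taken from the driver's implementation.

```python
import time


class TTLCache:
    """Re-run `load` once the cached value is older than `ttl` seconds."""

    def __init__(self, load, ttl, clock=time.monotonic):
        self._load = load
        self._ttl = ttl
        self._clock = clock
        self._value = None
        self._loaded_at = None

    def get(self):
        now = self._clock()
        if self._loaded_at is None or now - self._loaded_at >= self._ttl:
            self._value = self._load()  # e.g. download the database file again
            self._loaded_at = now
        return self._value


downloads = []
fake_time = [0.0]  # injectable clock so the example does not need to sleep
cache = TTLCache(
    load=lambda: downloads.append(1) or len(downloads),
    ttl=300,
    clock=lambda: fake_time[0],
)

cache.get()            # first access triggers a load
cache.get()            # still fresh, served from the cache
fake_time[0] = 301.0   # pretend more than `ttl` seconds have passed
cache.get()            # stale, so the value is loaded again
print(len(downloads))  # 2
```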

Warning

This driver is read-only. Any attempt to use the create, insert, or delete methods will raise a NotImplementedError.

__init__(remote_path)

Initialize the RemoteSQLiteDriver.

This should not be called directly; use get_driver() instead.

Parameters

remote_path (str) – S3 URL in the form s3://bucket/key to remote SQLite database (has to exist).

classmethod compute_metadata(cls, raster_path, *, extra_metadata=None, use_chunks=None, max_shape=None)

Read given raster file and compute metadata from it.

This handles most of the heavy lifting during raster ingestion. The returned metadata can be passed directly to insert().

Parameters
  • raster_path (str) – Path to GDAL-readable raster file

  • extra_metadata (Optional[Any]) – Any additional metadata to attach to the dataset. Will be JSON-serialized and returned verbatim by get_metadata().

  • use_chunks (Optional[bool]) – Whether to process the image in chunks (slower, but uses less memory). If not given, use chunks for large images only.

  • max_shape (Optional[Sequence[int]]) – Maximum number of pixels to use in each dimension when computing metadata. Setting this to a relatively small size such as (1024, 1024) results in much faster metadata computation for large images, at the expense of accuracy.

Return type

Dict[str, Any]

connect()

Context manager to connect to a given database and clean up on exit.

This allows you to pool interactions with the database to prevent possibly expensive reconnects, or to roll back several interactions if one of them fails.

Note

Make sure to call create() on a fresh database before using this method.

Example

>>> import terracotta as tc
>>> driver = tc.get_driver('tc.sqlite')
>>> with driver.connect():
...     for keys, dataset in datasets.items():
...         # connection will be kept open between insert operations
...         driver.insert(keys, dataset)

Return type

AbstractContextManager

property db_version

Terracotta version used to create the database

Return type

str

get_datasets(where=None, page=0, limit=None)

Retrieve keys and file paths of datasets.

Parameters
  • where (Optional[Mapping[str, str]]) – Constraints on returned datasets in the form {key_name: allowed_key_value}. Returns all datasets if not given (default).

  • page (int) – Current page of results. Has no effect if limit is not given.

  • limit (Optional[int]) – If given, return at most this many datasets. Unlimited by default.

Return type

Dict[Tuple[str, …], str]

Returns

dict containing {(key_value1, key_value2, ...): raster_file_path}

Example

>>> import terracotta as tc
>>> driver = tc.get_driver('tc.sqlite')
>>> driver.get_datasets()
{
    ('reflectance', '20180101', 'B04'): 'reflectance_20180101_B04.tif',
    ('reflectance', '20180102', 'B04'): 'reflectance_20180102_B04.tif',
}
>>> driver.get_datasets({'date': '20180101'})
{('reflectance', '20180101', 'B04'): 'reflectance_20180101_B04.tif'}

get_keys()

Get all known keys and their full-text descriptions.

Return type

OrderedDict

Returns

An OrderedDict in the form {key_name: key_description}

get_metadata(keys)

Return all stored metadata for given keys.

Parameters

keys (Union[Sequence[str], Mapping[str, str]]) – Keys of the requested dataset. Can either be given as a sequence of key values, or as a mapping {key_name: key_value}.

Return type

Dict[str, Any]

Returns

A dict with the following keys:

  • range: global minimum and maximum value in dataset

  • bounds: physical bounds covered by dataset in latitude-longitude projection

  • convex_hull: GeoJSON shape specifying total data coverage in latitude-longitude projection

  • percentiles: array of pre-computed percentiles from 1% through 99%

  • mean: global mean

  • stdev: global standard deviation

  • metadata: any additional client-relevant metadata

get_raster_tile(keys, *, tile_bounds=None, tile_size=None, preserve_values=False, asynchronous=False)

Load a raster tile with given keys and bounds.

Parameters
  • keys (Union[Sequence[str], Mapping[str, str]]) – Keys of the requested dataset. Can either be given as a sequence of key values, or as a mapping {key_name: key_value}.

  • tile_bounds (Optional[Sequence[float]]) – Physical bounds of the tile to read, in Web Mercator projection (EPSG:3857). Reads the whole dataset if not given.

  • tile_size (Optional[Sequence[int]]) – Shape of the output array to return. Must be two-dimensional. Defaults to DEFAULT_TILE_SIZE.

  • preserve_values (bool) – Whether to preserve exact numerical values (e.g. when reading categorical data). Sets all interpolation to nearest neighbor.

  • asynchronous (bool) – If True, the tile will be read asynchronously in a separate thread. The function then returns immediately with a Future that can be used to retrieve the result.

Return type

Any

Returns

Requested tile as MaskedArray of shape tile_size if asynchronous=False, otherwise a Future containing the result.
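The asynchronous mode follows the familiar submit-then-collect pattern. The sketch below demonstrates that pattern with the standard-library concurrent.futures module, assuming the returned Future offers the same result() interface; read_tile is a hypothetical stand-in for a slow tile read and does not call Terracotta.

```python
from concurrent.futures import ThreadPoolExecutor


def read_tile(tile_id):
    """Hypothetical stand-in for a slow tile read."""
    return f"tile-{tile_id}"


with ThreadPoolExecutor(max_workers=2) as executor:
    # submit() returns immediately with a Future for each pending read
    futures = [executor.submit(read_tile, i) for i in range(3)]
    # result() blocks until the corresponding read has finished
    tiles = [future.result() for future in futures]

print(tiles)  # ['tile-0', 'tile-1', 'tile-2']
```

Requesting several tiles before collecting any result lets the reads overlap, which is the main reason to prefer the asynchronous form.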

property key_names

Names of all keys defined by the database

Return type

Tuple[str, …]

MySQL driver

class terracotta.drivers.mysql.MySQLDriver(mysql_path)

A MySQL-backed raster driver.

Assumes raster data to be present in separate GDAL-readable files on disk or remotely. Stores metadata and paths to raster files in MySQL.

Requires a running MySQL server.

The MySQL database consists of four tables:

  • terracotta: Metadata about the database itself.

  • key_names: Contains two columns holding all available keys and their descriptions.

  • datasets: Maps key values to physical raster path.

  • metadata: Contains actual metadata as separate columns. Indexed via key values.

This driver caches raster data and key names, but not metadata.

__init__(mysql_path)

Initialize the MySQLDriver.

This should not be called directly; use get_driver() instead.

Parameters

mysql_path (str) – URL to running MySQL server, in the form mysql://username:password@hostname/database
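The URL form above can be taken apart with the standard-library urllib.parse module. The sketch below is illustrative only: parse_mysql_url is a hypothetical helper rather than part of the driver, and falling back to port 3306 (the MySQL default) mirrors the get_driver() example output but is otherwise an assumption.

```python
from urllib.parse import urlsplit


def parse_mysql_url(url, default_port=3306):
    """Split a mysql:// URL into its parts (hypothetical helper)."""
    parts = urlsplit(url)
    if parts.scheme != "mysql":
        raise ValueError(f"unsupported scheme: {parts.scheme!r}")
    return {
        "user": parts.username,
        "password": parts.password,  # None if the URL carries no password
        "host": parts.hostname,
        "port": parts.port or default_port,
        "database": parts.path.lstrip("/"),
    }


parsed = parse_mysql_url("mysql://username:password@hostname/database")
print(parsed["host"], parsed["port"], parsed["database"])  # hostname 3306 database
```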

classmethod compute_metadata(cls, raster_path, *, extra_metadata=None, use_chunks=None, max_shape=None)

Read given raster file and compute metadata from it.

This handles most of the heavy lifting during raster ingestion. The returned metadata can be passed directly to insert().

Parameters
  • raster_path (str) – Path to GDAL-readable raster file

  • extra_metadata (Optional[Any]) – Any additional metadata to attach to the dataset. Will be JSON-serialized and returned verbatim by get_metadata().

  • use_chunks (Optional[bool]) – Whether to process the image in chunks (slower, but uses less memory). If not given, use chunks for large images only.

  • max_shape (Optional[Sequence[int]]) – Maximum number of pixels to use in each dimension when computing metadata. Setting this to a relatively small size such as (1024, 1024) results in much faster metadata computation for large images, at the expense of accuracy.

Return type

Dict[str, Any]

connect()

Context manager to connect to a given database and clean up on exit.

This allows you to pool interactions with the database to prevent possibly expensive reconnects, or to roll back several interactions if one of them fails.

Note

Make sure to call create() on a fresh database before using this method.

Example

>>> import terracotta as tc
>>> driver = tc.get_driver('tc.sqlite')
>>> with driver.connect():
...     for keys, dataset in datasets.items():
...         # connection will be kept open between insert operations
...         driver.insert(keys, dataset)

Return type

AbstractContextManager

create(keys, key_descriptions=None)

Create and initialize database with empty tables.

This must be called before opening the first connection. The MySQL database must not exist already.

Parameters
  • keys (Sequence[str]) – Key names to use throughout the Terracotta database.

  • key_descriptions (Optional[Mapping[str, str]]) – Optional (but recommended) full-text description for some keys, in the form of {key_name: description}.

Return type

None

property db_version

Terracotta version used to create the database

Return type

str

delete(keys)

Remove a dataset from the metadata database.

Parameters

keys (Union[Sequence[str], Mapping[str, str]]) – Keys of the dataset. Can either be given as a sequence of key values, or as a mapping {key_name: key_value}.

Return type

None

get_datasets(where=None, page=0, limit=None)

Retrieve keys and file paths of datasets.

Parameters
  • where (Optional[Mapping[str, str]]) – Constraints on returned datasets in the form {key_name: allowed_key_value}. Returns all datasets if not given (default).

  • page (int) – Current page of results. Has no effect if limit is not given.

  • limit (Optional[int]) – If given, return at most this many datasets. Unlimited by default.

Return type

Dict[Tuple[str, …], str]

Returns

dict containing {(key_value1, key_value2, ...): raster_file_path}

Example

>>> import terracotta as tc
>>> driver = tc.get_driver('tc.sqlite')
>>> driver.get_datasets()
{
    ('reflectance', '20180101', 'B04'): 'reflectance_20180101_B04.tif',
    ('reflectance', '20180102', 'B04'): 'reflectance_20180102_B04.tif',
}
>>> driver.get_datasets({'date': '20180101'})
{('reflectance', '20180101', 'B04'): 'reflectance_20180101_B04.tif'}

get_keys()

Get all known keys and their full-text descriptions.

Return type

OrderedDict

Returns

An OrderedDict in the form {key_name: key_description}

get_metadata(keys)

Return all stored metadata for given keys.

Parameters

keys (Union[Sequence[str], Mapping[str, str]]) – Keys of the requested dataset. Can either be given as a sequence of key values, or as a mapping {key_name: key_value}.

Return type

Dict[str, Any]

Returns

A dict with the following keys:

  • range: global minimum and maximum value in dataset

  • bounds: physical bounds covered by dataset in latitude-longitude projection

  • convex_hull: GeoJSON shape specifying total data coverage in latitude-longitude projection

  • percentiles: array of pre-computed percentiles from 1% through 99%

  • mean: global mean

  • stdev: global standard deviation

  • metadata: any additional client-relevant metadata

get_raster_tile(keys, *, tile_bounds=None, tile_size=None, preserve_values=False, asynchronous=False)

Load a raster tile with given keys and bounds.

Parameters
  • keys (Union[Sequence[str], Mapping[str, str]]) – Keys of the requested dataset. Can either be given as a sequence of key values, or as a mapping {key_name: key_value}.

  • tile_bounds (Optional[Sequence[float]]) – Physical bounds of the tile to read, in Web Mercator projection (EPSG:3857). Reads the whole dataset if not given.

  • tile_size (Optional[Sequence[int]]) – Shape of the output array to return. Must be two-dimensional. Defaults to DEFAULT_TILE_SIZE.

  • preserve_values (bool) – Whether to preserve exact numerical values (e.g. when reading categorical data). Sets all interpolation to nearest neighbor.

  • asynchronous (bool) – If True, the tile will be read asynchronously in a separate thread. The function then returns immediately with a Future that can be used to retrieve the result.

Return type

Any

Returns

Requested tile as MaskedArray of shape tile_size if asynchronous=False, otherwise a Future containing the result.
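Why preserve_values matters for categorical data can be seen in a one-dimensional toy example: nearest-neighbor resampling only ever returns values that exist in the input, whereas linear interpolation invents intermediate ones. Both resampling functions below are simplified illustrations written for this sketch and are not the routines Terracotta uses.

```python
def resample_nearest(values, new_len):
    """Pick the input sample closest to each output position."""
    n = len(values)
    return [values[min(n - 1, int((i + 0.5) * n / new_len))] for i in range(new_len)]


def resample_linear(values, new_len):
    """Blend the two neighboring input samples at each output position."""
    n = len(values)
    out = []
    for i in range(new_len):
        pos = i * (n - 1) / (new_len - 1)
        lo = int(pos)
        hi = min(lo + 1, n - 1)
        frac = pos - lo
        out.append(values[lo] * (1 - frac) + values[hi] * frac)
    return out


classes = [1, 1, 5, 5]  # e.g. land-cover class codes
nearest = resample_nearest(classes, 8)
linear = resample_linear(classes, 8)

print(nearest)  # [1, 1, 1, 1, 5, 5, 5, 5]
print(any(v not in (1, 5) for v in linear))  # True: blended values are not valid classes
```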

insert(keys, filepath, *, metadata=None, skip_metadata=False, override_path=None)

Insert a raster file into the database.

Parameters
  • keys (Union[Sequence[str], Mapping[str, str]]) – Keys identifying the new dataset. Can either be given as a sequence of key values, or as a mapping {key_name: key_value}.

  • filepath (str) – Path to the GDAL-readable raster file.

  • metadata (Optional[Mapping[str, Any]]) – If not given (default), call compute_metadata() with default arguments to compute raster metadata. Otherwise, use the given values. This can be used to decouple metadata computation from insertion, or to use the optional arguments of compute_metadata().

  • skip_metadata (bool) – Do not compute any raster metadata (will be computed during the first request instead). Use sparingly; this option has a detrimental effect on the end user experience and might lead to surprising results. Has no effect if metadata is given.

  • override_path (Optional[str]) – Override the path to the raster file in the database. Use this option if you intend to copy the data somewhere else after insertion (e.g. when moving files to a cloud storage later on).

Return type

None

property key_names

Names of all keys defined by the database

Return type

Tuple[str, …]