Get started =========== .. _installation: Installation ------------ On most systems, the easiest way to install Terracotta is `through the Conda package manager `__. Just install ``conda``, clone the repository, and execute the following command to create a new environment containing all dependencies and Terracotta: .. code-block:: bash $ conda env create -f environment.yml If you already have Python 3.8 (or above) installed, you can just run .. code-block:: bash $ pip install -e . in the root of the Terracotta repository instead. .. seealso:: If you are using Windows 10 and find yourself struggling with installing Terracotta, :doc:`check out our Windows 10 installation guide `! Usage in a nutshell ------------------- The simplest way to use Terracotta is to cycle through the following commands: 1. :doc:`terracotta optimize-rasters ` to pre-process your raster files; 2. :doc:`terracotta ingest ` to create a database; 3. :doc:`terracotta serve ` to spawn a server; and 4. :doc:`terracotta connect ` to connect to this server. The following sections guide you through these steps in more detail. Data exploration through Terracotta ----------------------------------- If you have some raster files lying around (e.g. in GeoTiff format), you can use Terracotta to serve them up. .. note:: Terracotta benefits greatly from the `cloud-optimized GeoTiff format `__. If your raster files are not cloud-optimized or you are unsure, you can preprocess them with :doc:`terracotta optimize-rasters `. Assume you are in a folder containing some files named with the pattern :file:`S2A__.tif`. You can start a Terracotta server via .. code-block:: bash $ terracotta serve -r {}_{date}_{band}.tif which will serve your data at ``http://localhost:5000``. Try the following URLs and see what happens: - `localhost:5000/keys `__ - `localhost:5000/datasets `__ - `localhost:5000/apidoc `__ Because it is cumbersome to explore a Terracotta instance by manually constructing URLs, we have built a tool that lets you inspect it interactively: .. code-block:: bash $ terracotta connect localhost:5000 If you did everything correctly, a new window should open in your browser that lets you explore the dataset. .. _ingestion: Creating a raster database -------------------------- For Terracotta to perform well, it is important that some metadata like the extent of your datasets or the range of its values is computed and ingested into a database. There are two ways to populate this metadata store: 1. Through the CLI ++++++++++++++++++ A simple but limited way to build a database is to use :doc:`terracotta ingest `. All you need to do is to point Terracotta to a folder of (cloud-optimized) GeoTiffs: .. code-block:: bash $ terracotta ingest \ /path/to/gtiffs/{sensor}_{name}_{date}_{band}.tif \ -o terracotta.sqlite This will create a new database with the keys ``sensor``, ``name``, ``date``, and ``band`` (in this order), and ingest all files matching the given pattern into it. For available options, see .. code-block:: bash $ terracotta ingest --help **Note:** The CLI ``ingest`` command relies on naming conventions to match files against the specified key patterns. The value that matches is restricted to alphanumerics (i.e. letters and numbers). Other characters (e.g. the ``_`` in the example above, but also ``-``, ``+``, etc) are considered separators between keys. So if your filename looks like ``sar_2019-06-24.tif`` then ``{sensor}_{date}.tif`` will not match. Alternatives include renaming the files (e.g. to ``sar_20190624.tif``), using an alternative pattern (e.g. ``{sensor}_{year}-{month}-{day}.tif``) or using the Python API instead of the CLI to perform the ingest. 2. Using the Python API +++++++++++++++++++++++ :ref:`Terracotta’s driver API ` gives you fine-grained control over ingestion and retrieval. Metadata can be computed at three different times: 1. Automatically during a call to ``driver.insert`` (fine for most applications); 2. Manually using ``driver.compute_metadata`` (in case you want to decouple computation and IO, or if you want to attach additional metadata); or 3. On demand when a dataset is requested for the first time (this is what we want to avoid through ingestion). A first ingestion script using the Python API ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ The following script first defines three variables that we need to work with a terracotta database: - the database filename - the metadata keys that we will use - the metadata key-values and paths to raster files that we want to insert Then it uses the Python API to connect to the database and insert the metadata. .. literalinclude:: _static/example-first-ingestion-script.py :language: python :caption: example-first-ingestion-script.py :download:`Download the script <_static/example-first-ingestion-script.py>` You could start adapting this script by changing the variable definitions to match the data you want to ingest. A next step could be to read the metadata from a CSV or parse it from the filenames of the raster data. A more advanced ingestion script using the Python API ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ The following script populates a database with raster files located in a local directory. It extracts the appropriate keys from the file name, ingests them into a database, and pushes the rasters and the resulting database into an S3 bucket. .. literalinclude:: _static/example-ingestion-script.py :language: python :caption: example-ingestion-script.py :download:`Download the script <_static/example-ingestion-script.py>` .. note:: The above scripts are just examples to show you some capabilities of the Terracotta Python API. More sophisticated solutions could e.g. attach additional metadata to database entries, process many rasters in parallel, or accept parameters from the command line. Serving data from a raster database ----------------------------------- After creating a database, you can use :doc:`terracotta serve ` to serve the rasters inserted into it: .. code-block:: bash $ terracotta serve -d /path/to/database.sqlite To explore the server, you can once again use :doc:`terracotta connect `: .. code-block:: bash $ terracotta connect localhost:5000 However, the server spawned by ``terracotta serve`` is indended for development and data exploration only. For sophisticated production deployments, :doc:`have a look at our tutorials `. If you are unsure which kind of deployment to choose, we recommend you to try out a :doc:`serverless deployment on AWS Lambda `, via the remote SQLite driver.