A serverless Terracotta deployment on AWS Lambda¶

Warning

While it is possible to run Terracotta entirely within the AWS free tier, deploying Terracotta on AWS will probably incur some charges to your account. Make sure to check the pricing policy of all relevant services for your specific region.

Environment setup¶

The easiest way to deploy Terracotta to AWS Lambda is by using Zappa. Zappa takes care of packaging Terracotta and its dependencies, and creates endpoints on AWS Lambda and API Gateway for us.

Optional: Set up a MySQL server on RDS¶

Setting up a dedicated MySQL server for your Terracotta database is slightly more cumbersome than relying on SQLite, but has some decisive advantages:

Removes the overhead of downloading the SQLite database.
The contents of the database are accessible from the outside, and ingesting additional data is more straightforward.
Multiple Terracotta instances can use the same database server.

To set up a MySQL server on AWS, just follow these steps:

Head over to RDS and create a new MySQL instance. You can use either one of the free-tier dedicated MySQL servers or the MySQL-compatible flavor of Amazon Aurora.

The default settings for RDS are unfortunately far from optimal for Terracotta. You should tweak them by creating a new “parameter group” and setting
```
wait_timeout = 1
max_connections = 16000
```
Don’t forget to apply the parameter group to your RDS instance.
By default, your Terracotta Lambda function will not have access to the RDS instance. To allow access, you will have to add it to the same security group and subnets as your RDS instance. You can achieve this by adding a section like this one to your zappa_settings.toml (see below):
```
[development.vpc_config]
SubnetIds = ["subnet-xxxxxxxx","subnet-yyyyyyyy", "subnet-zzzzzzzz"]
SecurityGroupIds = ["sg-xxxxxxxxxxxxxxxxx"]
```
You can find the correct subnet and security group IDs on the details page of your RDS instance.
Once the Lambda function is part of a VPC, it loses access to S3. To re-enable it, go to the VPC settings and create an endpoint for the VPC of the Lambda function, pointing to AWS S3 (e.g. com.amazonaws.eu-central-1.s3).
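
The parameter group and the S3 endpoint can also be created from a script instead of the AWS console. The following is a minimal sketch using boto3 and is not part of Terracotta. The group name `terracotta-mysql`, the parameter group family `mysql8.0`, and every ID passed in are assumptions that you need to adapt, and the sketch targets a plain RDS MySQL instance rather than Aurora.

```python
def parameter_group_settings():
    """Parameters from the text, in the shape modify_db_parameter_group expects."""
    values = {"wait_timeout": "1", "max_connections": "16000"}
    return [
        {"ParameterName": name, "ParameterValue": value, "ApplyMethod": "immediate"}
        for name, value in values.items()
    ]


def s3_endpoint_request(vpc_id, route_table_ids, region="eu-central-1"):
    """Keyword arguments for ec2.create_vpc_endpoint (gateway endpoint to S3)."""
    return {
        "VpcEndpointType": "Gateway",
        "VpcId": vpc_id,
        "ServiceName": f"com.amazonaws.{region}.s3",
        "RouteTableIds": list(route_table_ids),
    }


def apply(db_instance_id, vpc_id, route_table_ids, region="eu-central-1",
          group_name="terracotta-mysql", family="mysql8.0"):
    """Create the parameter group, attach it, and add the S3 endpoint."""
    import boto3  # imported lazily so the helpers above work without boto3

    rds = boto3.client("rds", region_name=region)
    rds.create_db_parameter_group(
        DBParameterGroupName=group_name,      # assumed name
        DBParameterGroupFamily=family,        # assumed family, check your engine
        Description="Connection settings for Terracotta",
    )
    rds.modify_db_parameter_group(
        DBParameterGroupName=group_name,
        Parameters=parameter_group_settings(),
    )
    rds.modify_db_instance(
        DBInstanceIdentifier=db_instance_id,
        DBParameterGroupName=group_name,
        ApplyImmediately=True,
    )

    ec2 = boto3.client("ec2", region_name=region)
    ec2.create_vpc_endpoint(**s3_endpoint_request(vpc_id, route_table_ids, region))
```

Calling `apply(...)` requires AWS credentials with the matching RDS and EC2 permissions. The instance may need a reboot before a newly attached parameter group takes effect.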

You are now ready to continue with the following step!

Populate data storage and database¶

The recommended way to ingest your optimized raster files into the database is through the Terracotta Python API. To initialize your database, just run something like

>>> import terracotta as tc

>>> # for sqlite
>>> driver = tc.get_driver('tc.sqlite')

>>> # for mysql
>>> driver = tc.get_driver('mysql://user:password@hostname/database')

>>> key_names = ('type', 'date', 'band')
>>> driver.create(key_names)

You can then ingest your raster files into the database:

>>> rasters = {
...     ('index', '20180101', 'ndvi'): 'S2_20180101_NDVI.tif',
...     ('reflectance', '20180101', 'B04'): 'S2_20180101_B04.tif',
... }
>>> for keys, raster_file in rasters.items():
...     driver.insert(keys, raster_file,
...                   override_path=f's3://tc-data/rasters/{raster_file}')
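
For more than a handful of files, the `rasters` mapping can be derived from the file names instead of being typed by hand. The sketch below assumes names of the form `<sensor>_<date>_<band>.tif`, as in the example above. The helpers `keys_from_filename` and `build_raster_mapping` are hypothetical, and the rule that `B<number>` bands are reflectances while everything else is an index is an assumption made only for illustration.

```python
from pathlib import Path


def keys_from_filename(filename):
    """Derive (type, date, band) from names like 'S2_20180101_NDVI.tif'."""
    _sensor, date, band = Path(filename).stem.split("_")
    # Assumption: spectral bands are named B<number>, everything else is an index.
    if band[0] == "B" and band[1:].isdigit():
        return ("reflectance", date, band)
    return ("index", date, band.lower())


def build_raster_mapping(filenames):
    """Map key tuples to file names, ready for the driver.insert loop."""
    return {keys_from_filename(name): name for name in filenames}


rasters = build_raster_mapping(["S2_20180101_NDVI.tif", "S2_20180101_B04.tif"])
```

The resulting dictionary has the same shape as the hand-written `rasters` above, so the ingestion loop can be reused unchanged.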

Verify that everything went well by executing

>>> driver.get_datasets()
{
    ('index', '20180101', 'ndvi'): 's3://tc-data/rasters/S2_20180101_NDVI.tif',
    ('reflectance', '20180101', 'B04'): 's3://tc-data/rasters/S2_20180101_B04.tif',
}
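
With many datasets, reading this output by eye gets tedious. A small check along the following lines can compare it against what you expected to ingest. This is only a sketch: `split_s3_path` and `check_datasets` are hypothetical helpers, and the prefix `s3://tc-data/rasters/` is taken from the example above.

```python
from urllib.parse import urlparse


def split_s3_path(path):
    """Split 's3://bucket/some/key' into ('bucket', 'some/key')."""
    parsed = urlparse(path)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an S3 path: {path!r}")
    return parsed.netloc, parsed.path.lstrip("/")


def check_datasets(datasets, expected_keys, prefix):
    """Return (missing keys, keys whose path lies outside the given prefix)."""
    missing = [keys for keys in expected_keys if keys not in datasets]
    misplaced = [keys for keys, path in datasets.items()
                 if not path.startswith(prefix)]
    return missing, misplaced
```

A possible call is `check_datasets(driver.get_datasets(), rasters, 's3://tc-data/rasters/')`; two empty lists mean that every expected key is registered and every path points into the intended location.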

Finally, make sure that your raster files end up in the place where Terracotta is looking for them (the paths returned by get_datasets()). For example, you can use the AWS CLI:

$ aws s3 sync /path/to/rasters s3://tc-data/rasters
$ aws s3 cp /path/to/tc.sqlite s3://tc-data/tc.sqlite # if using sqlite

To check that the uploaded data can actually be served, start a local Terracotta server and connect to it:

$ terracotta serve s3://tc-data/tc.sqlite
$ terracotta connect localhost:5000

Deploy via Zappa¶

The Terracotta repository contains a template with sensible default values for most Zappa settings:

zappa_settings.toml.in¶

[development]
app_function = "terracotta.server.app.app"
aws_region = "eu-central-1"
profile_name = "default"
project_name = "tc-test"
runtime = "python3.9"
s3_bucket = "zappa-teracotta-dev"
exclude = [
    "*.gz",
    "*.rar",
    "boto3*",
    "botocore*",
    "s3transfer*",
    "awscli*",
    ".mypy_cache",
    ".pytest_cache",
    ".eggs"
]
debug = true
log_level = "DEBUG"
xray_tracing = true
touch_path = "/keys"

timeout_seconds = 30
memory_size = 500

    [development.aws_environment_variables]
    TC_DRIVER_PATH = "s3://tc-testdata/terracotta.sqlite"
    TC_DRIVER_PROVIDER = "sqlite-remote"
    TC_REPROJECTION_METHOD = "linear"
    TC_RESAMPLING_METHOD = "average"
    TC_XRAY_PROFILE = "true"

    [development.callbacks]
    settings = "zappa_settings_callback.check_integrity"


[production]
extends = "development"

# WARNING: using a cache cluster incurs additional costs (not covered by AWS free tier)
cache_cluster_enabled = true
cache_cluster_size = 0.5
cache_cluster_ttl = 3600

debug = false
log_level = "WARNING"
xray_tracing = true

Copy or rename zappa_settings.toml.in to zappa_settings.toml and insert the correct path to your Terracotta database into the environment variables. To execute the deployment, run

$ source ~/envs/tc-deploy/bin/activate
$ zappa deploy development
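
If you deploy repeatedly, copying the template and inserting the database path can be scripted. The following sketch uses a plain regular expression rather than a TOML library; `set_toml_string` and `write_settings` are hypothetical helpers, and the path `s3://tc-data/tc.sqlite` is the one from the earlier upload example.

```python
import re
from pathlib import Path


def set_toml_string(text, name, value):
    """Replace the quoted value of every `name = "..."` line in text."""
    pattern = re.compile(
        rf'^(\s*{re.escape(name)}\s*=\s*)"[^"]*"', flags=re.MULTILINE
    )
    # A function replacement avoids backslash interpretation in the new value.
    new_text, count = pattern.subn(lambda m: f'{m.group(1)}"{value}"', text)
    if count == 0:
        raise KeyError(f"{name} not found in settings")
    return new_text


def write_settings(driver_path, template="zappa_settings.toml.in",
                   target="zappa_settings.toml"):
    """Copy the template to the target file with TC_DRIVER_PATH filled in."""
    text = Path(template).read_text()
    Path(target).write_text(set_toml_string(text, "TC_DRIVER_PATH", driver_path))
```

Running `write_settings("s3://tc-data/tc.sqlite")` in the directory that contains the template produces a `zappa_settings.toml` with the new path and leaves all other settings untouched.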

Congratulations, your Terracotta instance should now be reachable! You can verify the deployment via terracotta connect.
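
A quick scripted check is to request the `/keys` route, which the settings above also use as `touch_path`. The sketch below relies only on the Python standard library; `keys_url` and `fetch_keys` are hypothetical helpers, and the URL you pass in is whatever address your deployment is reachable at.

```python
import json
from urllib.request import urlopen


def keys_url(api_url):
    """Join the deployment URL and the /keys path without doubling slashes."""
    return api_url.rstrip("/") + "/keys"


def fetch_keys(api_url, timeout=30):
    """Return the decoded JSON body of GET <api_url>/keys."""
    with urlopen(keys_url(api_url), timeout=timeout) as response:
        return json.load(response)
```

If `fetch_keys(...)` returns without raising, the Lambda function is reachable and was able to open the database; the first request after a deployment can take noticeably longer than later ones.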
