AWS S3

You can use AWS’s Simple Storage Service (S3) as XTDB’s 'document store'.

Project Dependency

To use S3 within XTDB, you must first add the xtdb-s3 module as a project dependency:

  • deps.edn

com.xtdb/xtdb-s3 {:mvn/version "1.23.0"}

  • pom.xml

<dependency>
    <groupId>com.xtdb</groupId>
    <artifactId>xtdb-s3</artifactId>
    <version>1.23.0</version>
</dependency>

Using S3

Replace the implementation of the document store with xtdb.s3/->document-store:

  • JSON

{
  "xtdb/document-store": {
    "xtdb/module": "xtdb.s3/->document-store",
    "bucket": "your-bucket",
    ...
  }
}

  • Clojure

{:xtdb/document-store {:xtdb/module 'xtdb.s3/->document-store
                       :bucket "your-bucket"
                       ...}}

  • EDN

{:xtdb/document-store {:xtdb/module xtdb.s3/->document-store
                       :bucket "your-bucket"
                       ...}}

Parameters

  • configurator (S3Configurator)

  • bucket (string, required)

  • prefix (string): S3 key prefix

  • cache-size (int): size of in-memory document cache (number of entries, not bytes)
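
As a rough illustration of how the optional parameters above fit together, the sketch below sets a key prefix and a cache size alongside the required bucket. The bucket name, prefix, and cache size are placeholder values chosen for this example, not recommendations.

```clojure
;; Illustrative values only: "your-bucket", "xtdb-docs" and 10000 are placeholders.
{:xtdb/document-store {:xtdb/module 'xtdb.s3/->document-store
                       :bucket "your-bucket"
                       ;; keys for stored documents are written under this prefix
                       :prefix "xtdb-docs"
                       ;; number of cached documents (entries, not bytes)
                       :cache-size 10000}}
```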

Checkpoint store

S3 can be used as a query index checkpoint store.

XTDB does not garbage-collect checkpoints, so we recommend setting a lifecycle policy on your bucket to remove older checkpoints.

;; under :xtdb/index-store -> :kv-store -> :checkpointer
;; see the Checkpointing guide for other parameters
{:checkpointer {...
                :store {:xtdb/module 'xtdb.s3.checkpoint/->cp-store
                        :configurator ...
                        :bucket "..."
                        :prefix "..."}}}

Parameters

  • configurator (S3Configurator)

  • bucket (string, required)

  • prefix (string): S3 key prefix
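
To show where the checkpoint store sits within a whole node configuration, here is a minimal sketch. It assumes a RocksDB index store and uses the module names xtdb.rocksdb/->kv-store and xtdb.checkpoint/->checkpointer and the :approx-frequency parameter as we understand them from the Checkpointing guide; treat these, along with the directory, bucket, and prefix values, as assumptions to check against that guide.

```clojure
(require '[clojure.java.io :as io])
(import 'java.time.Duration)

;; ASSUMPTIONS: xtdb.rocksdb/->kv-store, xtdb.checkpoint/->checkpointer and
;; :approx-frequency are not described on this page; verify them in the
;; Checkpointing guide. All string values are placeholders.
{:xtdb/index-store
 {:kv-store
  {:xtdb/module 'xtdb.rocksdb/->kv-store
   :db-dir (io/file "data/index-store")
   :checkpointer
   {:xtdb/module 'xtdb.checkpoint/->checkpointer
    :approx-frequency (Duration/ofHours 6)
    ;; the S3 checkpoint store described in this section
    :store {:xtdb/module 'xtdb.s3.checkpoint/->cp-store
            :bucket "your-bucket"
            :prefix "checkpoints"}}}}}
```

Giving checkpoints their own prefix makes it easier to scope a bucket lifecycle rule to checkpoint objects only.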

Configuring S3 requests

This is currently only accessible from Clojure; we plan to expose it outside of Clojure soon.

While the above is sufficient to get xtdb-s3 working out of the box, S3 has many configuration options: how to get credentials, object properties, serialisation of the documents, etc. We expose these via the xtdb.s3.S3Configurator interface, and you can supply an instance through the configurator parameter in your node configuration.

Through this interface, you can supply an S3AsyncClient for xtdb-s3 to use, adapt the PutObjectRequest/GetObjectRequest as required, and choose the serialisation format. By default, we get credentials through the usual AWS credentials provider, and store documents using Nippy.

  • Clojure

;; requires (:import (xtdb.s3 S3Configurator)) in your namespace
{:xtdb/document-store {:xtdb/module 'xtdb.s3/->document-store
                       :configurator (fn [_]
                                       (reify S3Configurator
                                         ...))
                       ...}}
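
As one possible shape for such a configurator, the sketch below supplies a custom S3AsyncClient pinned to a region. The method name makeClient is an assumption about the S3Configurator interface and should be checked against the xtdb-s3 source; the namespace name, bucket, and region are placeholders. The client builder calls are from the AWS SDK for Java 2.x.

```clojure
(ns example.s3-config
  (:import (xtdb.s3 S3Configurator)
           (software.amazon.awssdk.regions Region)
           (software.amazon.awssdk.services.s3 S3AsyncClient)))

(def node-config
  {:xtdb/document-store
   {:xtdb/module 'xtdb.s3/->document-store
    :bucket "your-bucket"
    :configurator (fn [_]
                    (reify S3Configurator
                      ;; ASSUMPTION: `makeClient` is the interface method that
                      ;; returns the S3AsyncClient; verify the name in xtdb-s3.
                      (makeClient [_]
                        (-> (S3AsyncClient/builder)
                            (.region Region/EU_WEST_1)
                            (.build)))))}})
```

Any methods not overridden here would presumably keep the defaults described above (the usual AWS credentials provider and Nippy serialisation), provided the interface defines them as default methods.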