Checkpointing

XTDB nodes can save checkpoints of their query indices on a regular basis, so that new nodes can start to service queries faster.

XTDB nodes that join a cluster have to obtain a local set of query indices before they can service queries. These can be built by replaying the transaction log from the beginning, although this may be slow for clusters with a lot of history. Checkpointing allows the nodes in a cluster to share checkpoints into a central 'checkpoint store', so that nodes joining a cluster can retrieve a recent checkpoint of the query indices, rather than replaying the whole history.

The checkpoint store is a pluggable module - there are a number of officially supported implementations:

Java’s NIO FileSystem (below)
AWS’s S3
GCP’s Cloud Storage

XTDB nodes in a cluster don’t explicitly communicate regarding which one is responsible for creating a checkpoint - instead, they check at random intervals to see whether any other node has recently created a checkpoint, and create one if necessary. The desired frequency of checkpoints can be set using approx-frequency.

The default lifecycle of checkpoints is unbounded - XTDB will not attempt to clean up old checkpoints. Users will typically want to define a retention policy that best fits their requirements and create means of implementing that policy that is external to XTDB.

Setting up

You can enable checkpoints on your index-store by adding a :checkpointer dependency to the underlying KV store:

JSON
Clojure
EDN

{
  "xtdb/index-store": {
    "kv-store": {
      "xtdb/module": "xtdb.rocksdb/->kv-store",
      ...
      "checkpointer": {
        "xtdb/module": "xtdb.checkpoint/->checkpointer",
        "store": {
          "xtdb/module": "xtdb.checkpoint/->filesystem-checkpoint-store",
          "path": "/path/to/cp-store"
        },
        "approx-frequency": "PT6H"
      }
    }
  },
  ...
}

{:xtdb/index-store {:kv-store {:xtdb/module 'xtdb.rocksdb/->kv-store
                               ...
                               :checkpointer {:xtdb/module 'xtdb.checkpoint/->checkpointer
                                              :store {:xtdb/module 'xtdb.checkpoint/->filesystem-checkpoint-store
                                                      :path "/path/to/cp-store"}
                                              :approx-frequency (Duration/ofHours 6)}}}
 ...}

{:xtdb/index-store {:kv-store {:xtdb/module xtdb.rocksdb/->kv-store
                               ...
                               :checkpointer {:xtdb/module xtdb.checkpoint/->checkpointer
                                              :store {:xtdb/module xtdb.checkpoint/->filesystem-checkpoint-store
                                                      :path "/path/to/cp-store"}
                                              :approx-frequency "PT6H"}}}
 ...}

Checkpointer parameters

approx-frequency (required, Duration): approximate frequency for the cluster to save checkpoints
store: (required, CheckpointStore): see the individual store for more details.
checkpoint-dir (string/File/Path): temporary directory to store checkpoints in before they’re uploaded
keep-dir-between-checkpoints? (boolean, default true): whether to keep the temporary checkpoint directory between checkpoints
keep-dir-on-close? (boolean, default false): whether to keep the temporary checkpoint directory when the node shuts down

`FileSystem` Checkpoint Store parameters

path (required, string/File/Path/URI): path to store checkpoints.

Note about using `FileSystem` Checkpoint Store

Most Cloud providers offer some high-performance block storage facility (e.g. AWS EBS or Google Cloud Storage). These volumes usually can be mounted on compute platforms (e.g. EC2) or containers (e.g. K8S). Often the Cloud provider also offers tooling to quickly snapshot those volumes (see EBS snapshots). A possible checkpoint strategy for XTDB nodes running in a cloud environment with such a facility, would consist in:

having [:checkpointer :store :path] pointing to a filesystem mounted on such a volume
having the joining XTDB node [:checkpointer :store :path] point to a filesystem mounted an a snapshot from the main node (above).

Checkpointing

Setting up

Checkpointer parameters

FileSystem Checkpoint Store parameters

Note about using FileSystem Checkpoint Store

`FileSystem` Checkpoint Store parameters

Note about using `FileSystem` Checkpoint Store