One of the nice things about the Datomic architecture is that the index segments kept in storage are immutable. That enables them to be cached extensively. Currently that caching happens inside the peers, which keep segments they have needed thus far in the application process heap.
While this is great for process-local working sets, there is only so much a single machine can cache. So, we've added support for an optional second tier of distributed, shared cache, leveraging a memcached cluster. This tier of cache can be as large as you wish, and is shared between all the peers.
The entire use of memcached is automatic and integrated - just provide the endpoints of your memcached cluster in configuration. The peer protocols will automatically both look in it, and populate it on cache misses. Being based upon immutability, there are no cache coherence problems nor expiration policy woes.
The architecture incorporating memcached looks like this:
The benefits of this are many:
- You can get a shared cache of arbitrary size - many deployments will be able to fit their entire database in memcached if desired.
- If you are using a storage that is not otherwise distributed (e.g. unclustered PostgreSQL), the memcached tier can both almost entirely remove the read load on the single server and distribute it.
- Even when using a distributed storage like DynamoDB, the memcached tier can reduce your read provisioning and increase speed.
- Developers can set up a small memcached daemon locally so their DB will always feel 'hot' across process restarts.
- The memcached tier will enable hybrid strategies where the peers, transactors and memcached are all local but the storage is remote (e.g. DynamoDB).
Datomic is a good citizen in its use of memcached - it doesn't need to 'own' the cluster, and all of the Datomic keys incorporate UUIDs so they won't conflict with other application-level use of the same memcached cluster.
We hope you enjoy this feature, which is included in the Pro Edition at no extra charge.