Access Control in Datomic Cloud

In this article, we will look at Datomic access control, covering

  • authentication and authorization
  • network-level access
  • how the bastion works
  • some example deployments

Authentication and Authorization

Datomic integrates with AWS Identity and Access Management (IAM) for authentication and authorization, via a technique that we'll call S3 proxying. Here's how it works:

Every Datomic permission has a hierarchical name. For example, read-only access to database Jan is named access/dbs/db/Jan/read.

Permission names have a 1-1 correspondence with keys in the Datomic system S3 bucket.

The Datomic client signs requests using AWS's Signature Version 4. But instead of using your IAM credentials directly, the Datomic client uses your IAM credentials to retrieve a signing key from S3.

Thus, IAM read permissions of S3 paths act as proxies for Datomic permissions. As a result, you can use all of the ordinary IAM tools (roles, groups, users, policies, etc.) to authorize use of Datomic.
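Because permission names map 1-1 to S3 keys, granting a Datomic permission amounts to granting S3 read access on the corresponding path. Here is a sketch of an IAM policy statement for read-only access to database Jan; the bucket name is a placeholder, and the exact key layout under the permission name may differ in your system:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::my-datomic-system-bucket/access/dbs/db/Jan/read/*"
    }
  ]
}
```

Attach a statement like this to a role, group, or user, and that principal can retrieve the signing key for Jan reads and nothing else.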

After decades of experience racing to log into new servers to change the admin password, we think this "secure by default" approach is pretty cool. But that is not the end of the story: clients must also have network-level access to Datomic.

Network-Level Access

Datomic Cloud is designed to be accessed by applications running inside a VPC, and (unlike a public service) is never exposed to the Internet. You must make an explicit choice to grant access to Datomic. You could:

  • run an EC2 instance in the Datomic VPC and in the Datomic applications security group
  • peer another VPC with the Datomic VPC 
  • configure a VPN Connection

Each of these approaches has its place for application access, and I will say more about them in a future article. For easy access to Datomic from a developer's laptop, we offer the bastion.

How the Bastion Works

The bastion is a dedicated machine with exactly one job: enabling developer access to Datomic. When you turn the bastion on, you get a barebones AWS Linux instance that does nothing but forward SSH traffic to your Datomic system.

To connect through the bastion:
  1. run the Datomic socks proxy script on your local machine
  2. add a proxy port argument when creating a system client
  3. the Datomic client sees the proxy port argument and connects to the socks proxy 
  4. the socks proxy forwards encrypted SSH traffic to the bastion 
  5. the bastion forwards Datomic client protocol traffic to Datomic
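On the client side, the only difference between a direct connection and a bastion connection is the extra :proxy-port argument. A minimal sketch in Clojure using the cloud client API; the region, system name, and endpoint below are placeholders:

```clojure
(require '[datomic.client.api :as d])

;; Create a client that tunnels through the local socks proxy.
;; :region, :system, and :endpoint are placeholders for your system.
(def client
  (d/client {:server-type :cloud
             :region      "us-east-1"
             :system      "my-system"
             :endpoint    "http://entry.my-system.us-east-1.datomic.net:8182/"
             :proxy-port  8182}))  ; the port opened by the socks proxy script
```

Omit :proxy-port and the same map describes an in-VPC connection, so application code is otherwise unchanged.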

Access to the bastion is secured with the same IAM + S3 proxying technique described above for authentication. The bastion has an auto-generated, ephemeral private key that is stored in S3 and secured by IAM.

The bastion is dynamic: you can turn it on or off at any time. And client support means that the bastion is entirely transparent to your code, which differs only in the argument used to create the client.

A Concrete Example

As a concrete example, here is how the Datomic team configures access for some of our Datomic systems:

Our 'ci' system is dedicated to continuous integration, supporting several dozen Jenkins projects. The ci system contains no sensitive data, only canned and generated examples. The ci system runs the Solo Topology with the bastion enabled to allow access by automated tests.

Our 'devdata' system contains non-sensitive data used by the development team (think departmental and sample apps). The devdata system runs the Solo Topology with the bastion enabled to allow access by developers.

Our 'applications' system supports applications and contains real-world data. The applications system is reachable only by deployed application code, and needs to be highly available. So the applications system uses the Production Topology with the bastion disabled, and also uses fine-grained IAM permissions to limit applications to the databases they need.


Datomic is secure by default, integrating directly with AWS IAM and VPC capabilities. The bastion makes it easy for developers to get connected, so you can be up and transacting on a new system in minutes.

To learn more, check out the docs, or just dive in and get started.


High Availability in Datomic Cloud

Based on my rigorous polling, these are the top five questions people have about HA in Datomic Cloud:
  • How does it work?
  • What do I have to do?
  • Hey wait a minute! What about serializability?
  • Is AWS cool?
  • So what else have you got?

How Does It Work?

In the Production Topology, Datomic cluster nodes sit behind an Application Load Balancer (ALB). As any node can handle any request, there is no single point of failure. The starting cluster size of two ensures that nodes remain available in the face of a single node failure. By increasing the cluster size beyond two, you both enhance availability and increase the number of queries the system can handle. An Auto Scaling Group (ASG) monitors the nodes and automatically replaces any node that fails its health checks.

What Do I Have To Do?

Nothing. HA is automatic.

Hey Wait a Minute! What About Serializability?

We really like transactions:
Datomic is always transactional, fully serialized, and consistent in both the ACID and CAP senses. Don't waste your life writing code to compensate for partial failures and subtle concurrency bugs when you could be making your application better and shipping it faster.

So how does that square with shared-nothing cluster nodes? The answer is simple: the nodes use DynamoDB to serialize all writes per database.

At any point in time, a database has a preferred node for transactions. In normal operation, all transactions for a database flow through that node. If for any reason (e.g. a temporary network partition) the preferred node cannot be reached, any node can and will handle transactions. Consistency is ensured by conditional writes to DynamoDB. If a node becomes unreachable, Datomic will choose a new preferred node.

Note in particular that:
  1. This is not a master/follower system like Datomic On-Prem and many other databases – nobody is tracking mastership and there are no failover intervals.
  2. This should not be confused with parallel multi-writer systems such as Cassandra. Write availability is governed by the availability of DynamoDB conditional writes and strongly-consistent reads.
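Datomic's actual mechanism lives in DynamoDB, but the serializing effect of a conditional write can be modeled in a few lines of Clojure: an append succeeds only if the writer's basis is still current, so racing writers are forced into a single serial order. This is a conceptual sketch only, not Datomic's implementation:

```clojure
(import 'java.util.concurrent.atomic.AtomicLong)

;; Stand-in for the per-database log pointer that DynamoDB guards
;; with conditional writes. Conceptual sketch, not Datomic's code.
(def log-t (AtomicLong. 0))

(defn conditional-append
  "Try to append the next transaction after basis-t. Fails if
   another writer has already advanced the log."
  [basis-t]
  (.compareAndSet log-t basis-t (inc basis-t)))

;; Two writers racing from the same basis: exactly one wins.
(conditional-append 0) ;=> true
(conditional-append 0) ;=> false (stale basis: re-read and retry)
```

The loser does not corrupt anything; it simply observes the new basis and retries, which is why no mastership tracking or failover interval is needed.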

Is AWS Cool?

Very cool. Datomic Cloud showcases the benefit of designing for AWS rather than porting to AWS, and there is a lot going on behind zeroconf HA.

So What Else Have You Got?

Not all systems need HA. You can prototype a Datomic Cloud system with the (non-HA) Solo Topology for about $1/day.  The topology differences are entirely abstracted away from clients and applications, so you can easily upgrade to the Production Topology later.

For More Information:

Check out the docs, or just jump right in.

Datomic Cloud

Datomic on AWS: Easy, Integrated, and Powerful

We are excited to announce the release of Datomic Cloud, making Datomic more accessible than ever before.

Datomic Cloud is a new product intended for greenfield development on AWS. If you are not yet targeting the cloud, check out what customers are saying about the established line of Datomic On-Prem products (Datomic Pro and Enterprise).

Datomic Cloud is accessible through the latest release of the Datomic Client API.

We would love your feedback! Come join us on the new developer forum.

Datomic Cloud is available on the AWS Marketplace.

Datomic Pull :as

Datomic's Pull API provides a declarative way to make hierarchical and nested selections of information about entities.  The 0.9.5656 release enhances the Pull API with a new :as clause that provides control over the returned keys.

As an example, imagine that you want information about Led Zeppelin's tracks from the mbrainz dataset. The following pull pattern navigates to the artist's tracks, using limit to return a single track:

;; pull expression
'[[:track/_artists :limit 1]]

=> #:track{:_artists [#:db{:id 17592188757937}]}

The entity id 17592188757937 is not terribly interesting, so you can use a nested pull pattern to request the track name instead:

;; pull pattern
'[{[:track/_artists :limit 1] [:track/name]}]

=> #:track{:_artists [#:track{:name "Black Dog"}]}

That is better, but what if you want different key names? This can happen for reasons including:

  • you are targeting an environment that does not support symbolic names, so you need a string instead of a keyword key
  • you do not want to expose the direction of navigation (e.g. the underscore in :track/_artists)
  • your consumers are expecting a different name

The :as option lets you rename result keys to arbitrary values that you provide, and works at any level of nesting in a pull pattern. The pattern below uses :as twice to rename the two keys in the result:

;; pull expression
'[{[:track/_artists :limit 1 :as "Tracks"]
   [[:track/name :as "Name"]]}]

=> {"Tracks" [{"Name" "Black Dog"}]}

To try it out you can grab the latest release, review the Pull grammar, and work through these examples at the REPL.

