Datomic Cloud Monitoring and Ion Cast

The new datomic.ion.cast library lets your application produce monitoring data that are integrated with Datomic's support for AWS CloudWatch.

Datomic Cloud and AWS CloudWatch

AWS CloudWatch provides a powerful set of tools for monitoring a software system running on AWS:
  • Collect and track CloudWatch Metrics -- variables that measure the behavior of your system.
  • Configure CloudWatch Alarms to notify operations or take other automated steps when potential problems arise.
  • Monitor, store, and search CloudWatch Logs across all your AWS resources.
  • Create CloudWatch Dashboards that provide a single overview for monitoring your systems.
Datomic Cloud is fully integrated with all of these AWS monitoring tools. On the producing side, Datomic creates metrics and logs, and on the consuming side, Datomic organizes metrics in custom dashboards like this Production Dashboard:

Production Dashboard

datomic.ion.cast

With the introduction of Datomic Ions, your entire application can run on Datomic Cloud nodes. The datomic.ion.cast namespace lets Ion application code add your own monitoring data alongside the monitoring data already being produced by Datomic.  Cast supports four categories of monitoring data:
  1. An event is an ordinary occurrence that is of interest to an operator, such as start and stop events for a process or activity.
  2. An alert is an extraordinary occurrence that requires operator intervention, such as the failure of some important process.
  3. Dev is information of interest only to developers, e.g. fine-grained logging to troubleshoot a problem during development. Dev data can be much higher volume than events or alerts.
  4. A metric is a numeric value in a named time series, such as the latency for an operation.
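As a sketch, the four categories above map to four functions in the datomic.ion.cast namespace. The :job-id and :batch-size keys and the metric name are hypothetical examples; consult the Ions docs for the exact supported map keys.

```clojure
(ns my.app.monitoring
  (:require [datomic.ion.cast :as cast]))

;; 1. event: an ordinary occurrence of interest to an operator
(cast/event {:msg "ImportStarted" :job-id "nightly-import"})

;; 2. alert: an extraordinary occurrence requiring intervention
(cast/alert {:msg "ImportFailed"
             :ex  (ex-info "upstream unreachable" {})})

;; 3. dev: high-volume diagnostics of interest only to developers
(cast/dev {:msg "ImportBatch" :batch-size 1000})

;; 4. metric: a numeric value in a named time series
(cast/metric {:name :ImportMsec :value 125 :units :msec})
```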
To get started using ion.cast, you can
..

Datomic Ions: Your App on Datomic Cloud

Datomic Ions let you develop applications for the cloud by deploying your code to a running Datomic cluster. You can focus on your application logic, writing ordinary Clojure functions, while the ion tooling and infrastructure handle the deployment and execution details. You can leverage your code both inside Datomic transactions and queries, and from the world at large via built-in support for AWS Lambda.
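For example, an ion backing an AWS Lambda is just an ordinary Clojure function. This is a sketch: the namespace, function name, and ion-config entry shown in the comment are hypothetical, and per the Ions docs the argument is a map whose :input key holds the Lambda event payload as a JSON string.

```clojure
(ns my.app.ions
  (:require [clojure.data.json :as json]))

;; An ordinary Clojure function, exposed as an AWS Lambda entry point
;; via an ion-config.edn entry along the lines of:
;;   {:lambdas {:handle-event {:fn my.app.ions/handle-event}}}
(defn handle-event
  [{:keys [input context]}]
  ;; :input is the Lambda event payload as a JSON string
  (let [event (json/read-str input)]
    (str "handled event with " (count event) " keys")))
```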

With Datomic Ions you can:
  • Focus on your application
  • Leverage your Datomic Cloud cluster compute resources and data locality
  • Extend Datomic transactions and queries with your own logic
  • Connect to the broader AWS cloud via Lambda events
  • Service web consumers with API Gateway
  • Scale Datomic and your app together
  • Deliver on AWS with high agility

To learn more about ions, check out the Datomic Cloud Ions documentation.

For Datomic On-Prem, we have added classpath functions and auto-require support for transaction functions and query expressions.
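As a sketch of the On-Prem classpath-function support (the namespace, function, and attribute names here are hypothetical), a transaction function can be an ordinary function on the transactor's classpath, invoked in transaction data by its fully qualified symbol:

```clojure
(ns my.app.txfns)

;; An ordinary function on the transactor classpath.
;; First argument is the database value; returns transaction data.
(defn retract-feature
  [db eid feature]
  [[:db/retract eid :app/feature feature]])

;; Invoked in transaction data by fully qualified symbol:
;;   (d/transact conn [(list 'my.app.txfns/retract-feature eid :beta)])
```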

Datomic Cloud on the AWS Marketplace
..

Access Control in Datomic Cloud

In this article, we will look at Datomic access control, covering

  • authentication and authorization
  • network-level access
  • how the bastion works
  • some example deployments

Authentication and Authorization

Datomic integrates with AWS Identity and Access Management (IAM) for authentication and authorization, via a technique that we'll call S3 proxying. Here's how it works:

Every Datomic permission has a hierarchical name. For example, read-only access to database Jan is named access/dbs/db/Jan/read.

Permission names have a 1-1 correspondence with keys in the Datomic system S3 bucket.

The Datomic client signs requests using AWS's Signature Version 4. But instead of using your IAM credentials directly, the Datomic client uses your IAM credentials to retrieve a signing key from S3.

Thus, IAM read permissions of S3 paths act as proxies for Datomic permissions. As a result, you can use all of the ordinary IAM tools (roles, groups, users, policies, etc.) to authorize use of Datomic.
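For example, granting a user read-only access to database Jan reduces to an IAM policy allowing s3:GetObject on the corresponding key. This is a sketch: the bucket name is hypothetical, and the exact key layout under the permission name should be taken from your system's documentation.

```json
{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Action": "s3:GetObject",
    "Resource": "arn:aws:s3:::my-datomic-system-bucket/access/dbs/db/Jan/read/*"
  }]
}
```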

After decades of experience with racing to log into new servers to change the admin password, we think that this "secure by default" design is pretty cool. But that is not the end of the story: clients must also have network-level access to Datomic.

Network-Level Access

Datomic Cloud is designed to be accessed by applications running inside a VPC, and (unlike a service!) is never exposed to the Internet. You must make an explicit choice to access Datomic. You could:

  • run an EC2 instance in the Datomic VPC and in the Datomic applications security group
  • peer another VPC with the Datomic VPC 
  • configure a VPN Connection

Each of these approaches has its place for application access, and I will say more about them in a future article. For easy access to Datomic from a developer's laptop we offer the bastion.

How the Bastion Works

The bastion is a dedicated machine with one job only: enabling developer access to Datomic. When you turn the bastion on, you get a barebones AWS Linux instance that does exactly one thing: forward SSH traffic to your Datomic system.

To connect through the bastion:
  1. run the Datomic SOCKS proxy script on your local machine
  2. add a proxy port argument when creating a client for the system
  3. the Datomic client sees the proxy port argument and connects to the SOCKS proxy
  4. the SOCKS proxy forwards encrypted SSH traffic to the bastion
  5. the bastion forwards Datomic client protocol traffic to Datomic
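Concretely, the steps above look something like the sketch below. The system name, region, database name, and port are hypothetical, and the proxy script name and exact client arguments should be taken from your Datomic distribution and docs.

```clojure
;; 1. On your laptop, start the SOCKS proxy (shell):
;;      bash datomic-socks-proxy my-system
;;
;; 2-3. Create the client with a :proxy-port argument; the client
;;      sees it and routes traffic through the local SOCKS proxy.
(require '[datomic.client.api :as d])

(def client
  (d/client {:server-type :cloud
             :region      "us-east-1"
             :system      "my-system"
             :proxy-port  8182}))

;; 4-5. The proxy forwards SSH-encrypted traffic to the bastion,
;;      which forwards Datomic client protocol traffic to the system.
(def conn (d/connect client {:db-name "Jan"}))
```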


Access to the bastion is secured with the same IAM + S3 proxying technique described earlier for authentication and authorization. The bastion has an auto-generated, ephemeral private key that is stored in S3 and secured by IAM.

The bastion is dynamic, and you can turn it on or off at any time. And the client support means that the bastion is entirely transparent to your code, which differs only in the argument used to create the client.

A Concrete Example

As a concrete example, here is how the Datomic team configures access for some of our Datomic systems:

Our 'ci' system is dedicated to continuous integration, supporting several dozen Jenkins projects. The ci system contains no sensitive data, only canned and generated examples. The ci system runs the Solo Topology with the bastion enabled to allow access by automated tests.

Our 'devdata' system contains non-sensitive data used by the development team (think departmental and sample apps). The devdata system runs the Solo Topology with the bastion enabled to allow access by developers.

Our 'applications' system supports applications and contains real-world data. The applications system is reachable only by deployed application code, and needs to be highly available. So the applications system uses the Production Topology with the bastion disabled, and also uses fine-grained IAM permissions to limit applications to the databases they need.

Conclusion

Datomic is secure by default, integrating directly with AWS IAM and VPC capabilities. The bastion makes it easy for developers to get connected, so you can be up and transacting on a new system in minutes.

To learn more, check out the Datomic Cloud docs. Or just dive in and get started.

..

High Availability in Datomic Cloud

Based on my rigorous polling, these are the top five questions people have about HA in Datomic Cloud:
  • How does it work?
  • What do I have to do?
  • Hey wait a minute! What about serializability?
  • Is AWS cool?
  • So what else have you got?

How Does It Work?

In the Production Topology, Datomic cluster nodes sit behind an Application Load Balancer (ALB). As any node can handle any request, there is no single point of failure. The starting cluster size of two ensures that nodes remain available in the face of a single node failure. By increasing the cluster size beyond two, you both enhance availability and increase the number of queries the system can handle. An AutoScaling Group (ASG) monitors nodes and automatically replaces nodes that fail to health check.

What Do I Have To Do?

Nothing. HA is automatic.

Hey Wait a Minute! What About Serializability?

We really like transactions:
Datomic is always transactional, fully serialized, and consistent in both the ACID and CAP senses. Don't waste your life writing code to compensate for partial failures and subtle concurrency bugs when you could be making your application better and shipping it faster.
So how does that square with shared-nothing cluster nodes? The answer is simple: The nodes use DynamoDB to serialize all writes per database.

At any point in time a database has a preferred node for transactions. In normal operation all transactions for a database will flow to/through that node. If for any reason (e.g. a temporary network partition) the preferred node can't be reached, any node can and will handle transactions. Consistency is ensured by conditional writes to DynamoDB. If a node becomes unreachable, Datomic will choose a new preferred node.
Note in particular that:
  1. This is not a master/follower system like Datomic On-Prem and many other databases – nobody is tracking mastership and there are no failover intervals.
  2. This should not be confused with parallel multi-writer systems such as Cassandra. Write availability is governed by the availability of DynamoDB conditional writes and strongly-consistent reads.
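The conditional-write principle can be illustrated in miniature with Clojure's compare-and-set! (an analogy only, not Datomic's internals): a write succeeds only if the writer's view of the current log tip is still the actual tip, so concurrent writers serialize without any master election or failover interval.

```clojure
;; db-tip stands in for the per-database log pointer in DynamoDB
(def db-tip (atom 41))

(defn try-transact
  "Attempt to advance the log tip from expected to next, as a
  DynamoDB conditional put would. Returns true iff no other
  writer got there first."
  [expected next]
  (compare-and-set! db-tip expected next))

;; writer A reads the tip, then advances it
(let [t @db-tip]
  (try-transact t (inc t)))   ;=> true

;; writer B, still holding the old tip value, loses and must
;; re-read the tip and retry
(try-transact 41 99)          ;=> false
```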

Is AWS Cool?

Very cool. Datomic Cloud showcases the benefit of designing for AWS rather than porting to AWS, and there is a lot going on behind zeroconf HA.

So What Else Have You Got?

Not all systems need HA. You can prototype a Datomic Cloud system with the (non-HA) Solo Topology for about $1/day.  The topology differences are entirely abstracted away from clients and applications, so you can easily upgrade to the Production Topology later.

For More Information:

Check out the docs, or just jump right in.
..
