Wednesday, January 24, 2018

High Availability in Datomic Cloud

Based on my rigorous polling, these are the top five questions people have about HA in Datomic Cloud:
  • How does it work?
  • What do I have to do?
  • Hey wait a minute! What about serializability?
  • Is AWS cool?
  • So what else have you got?

How Does It Work?

In the Production Topology, Datomic cluster nodes sit behind an Application Load Balancer (ALB). As any node can handle any request, there is no single point of failure. The starting cluster size of two ensures that nodes remain available in the face of a single node failure. By increasing the cluster size beyond two, you both enhance availability and increase the number of queries the system can handle. An AutoScaling Group (ASG) monitors nodes and automatically replaces nodes that fail to health check.

What Do I Have To Do?

Nothing. HA is automatic.

Hey Wait a Minute! What About Serializability?

Datomic is always transactional, fully serialized, and consistent in both the ACID and CAP senses. Don't waste your life writing code to compensate for partial failures and subtle concurrency bugs when you could be making your application better and shipping it faster.                                             
So how does that square with shared-nothing cluster nodes? The answer is simple: The nodes use DynamoDB to serialize all writes per database.

At any point in time a database has a preferred node for transactions. In normal operation all transactions for a database will flow to/through that node. If for any reason (e.g. a temporary network partition) the preferred node can't be reached, any node can and will handle transactions. Consistency is ensured by conditional writes to DynamoDB. If a node becomes unreachable, Datomic will choose a new preferred node.
Note in particular that:
  1. This is not a master/follower system like Datomic On-Prem and many other databases – nobody is tracking mastership and there are no failover intervals.
  2. This should not be confused with parallel multi-writer systems such as Cassandra. Write availability is governed by the availability of DynamoDB conditional writes and strongly-consistent reads.
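The conditional-write idea can be illustrated with a sketch. This is not Datomic's actual implementation; it is a hypothetical compare-and-swap against DynamoDB using Cognitect's aws-api library, with a made-up table name and key layout:

```clojure
;; Illustrative sketch only -- NOT Datomic's internals. Assumes Cognitect's
;; aws-api on the classpath and AWS credentials in the environment.
;; Table name and attributes ("datomic-log", "db", "t") are hypothetical.
(require '[cognitect.aws.client.api :as aws])

(def ddb (aws/client {:api :dynamodb}))

(defn try-commit!
  "Attempt to advance the log pointer for db-name from expected-t to new-t.
   The :ConditionExpression makes the put atomic: it succeeds for exactly
   one writer, so all transactions for a database serialize through DynamoDB
   no matter which node submits them."
  [db-name expected-t new-t]
  (aws/invoke ddb
    {:op :PutItem
     :request {:TableName "datomic-log"
               :Item {"db" {:S db-name}
                      "t"  {:N (str new-t)}}
               :ConditionExpression "#t = :expected"
               :ExpressionAttributeNames {"#t" "t"}
               :ExpressionAttributeValues {":expected" {:N (str expected-t)}}}}))
```

Because the condition is evaluated atomically by DynamoDB, two nodes racing to commit the same transaction slot cannot both succeed; the loser simply retries against the new value.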

Is AWS Cool?

Very cool. Datomic Cloud showcases the benefit of designing for AWS vs. porting to AWS, and there is a lot going on behind zeroconf HA.

So What Else Have You Got?

Not all systems need HA. You can prototype a Datomic Cloud system with the (non-HA) Solo Topology for about $1/day.  The topology differences are entirely abstracted away from clients and applications, so you can easily upgrade to the Production Topology later.

For More Information:

Check out the docs, or just jump right in.

Wednesday, January 17, 2018

Datomic Cloud

Datomic on AWS: Easy, Integrated, and Powerful

We are excited to announce the release of Datomic Cloud, making Datomic more accessible than ever before.
Datomic Cloud is a new product intended for greenfield development on AWS. If you are not yet targeting the cloud, check out what customers are saying about the established line of Datomic On-Prem products (Datomic Pro and Enterprise).
Datomic Cloud is accessible through the latest release of the Datomic Client API. To learn more, check out the docs.
We would love your feedback! Come and join us on the new developer forum.

Datomic Cloud on the AWS Marketplace

Tuesday, December 5, 2017

Datomic Pull :as

Datomic's Pull API provides a declarative way to make hierarchical and nested selections of information about entities.  The 0.9.5656 release enhances the Pull API with a new :as clause that provides control over the returned keys.

As an example, imagine that you want information about Led Zeppelin's tracks from the mbrainz dataset. The following pull pattern navigates to the artist's tracks, using limit to return a single track:

;; pull expression
'[[:track/_artists :limit 1]]

=> #:track{:_artists
           [#:db{:id 17592188757937}]}

The entity id 17592188757937 is not terribly interesting, so you can use a nested pull pattern to request the track name instead:

;; pull pattern
'[{[:track/_artists :limit 1] [:track/name]}]

=> #:track{:_artists [#:track{:name "Black Dog"}]}

That is better, but what if you want different key names? This can happen for reasons including:

  • you are targeting an environment that does not support symbolic names, so you need a string instead of a keyword key
  • you do not want to expose the direction of navigation (e.g. the underscore in :track/_artists)
  • your consumers are expecting a different name
The :as option lets you rename result keys to arbitrary values that you provide, and works at any level of nesting in a pull pattern. The pattern below uses :as twice to rename the two keys in the result:

;; pull expression
'[{[:track/_artists :limit 1 :as "Tracks"]
   [[:track/name :as "Name"]]}]

=> {"Tracks" [{"Name" "Black Dog"}]}
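The patterns above plug directly into the pull API. A sketch of invoking the final pattern with the peer library, assuming an existing connection and a hypothetical led-zeppelin-eid bound to the artist's entity id:

```clojure
;; Assumes a peer connection to the mbrainz sample database (not shown)
;; and led-zeppelin-eid bound to the artist's entity id (hypothetical).
(require '[datomic.api :as d])

(def db (d/db conn))

(d/pull db
        '[{[:track/_artists :limit 1 :as "Tracks"]
           [[:track/name :as "Name"]]}]
        led-zeppelin-eid)
;; => {"Tracks" [{"Name" "Black Dog"}]}
```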

To try it out you can grab the latest release, review the Pull grammar, and work through these examples at the REPL.

Thursday, March 23, 2017

New Datomic Training Videos and Getting Started Documentation

We are excited to announce the release of a new set of Day of Datomic training videos!
Filmed at Clojure/Conj in Austin, TX in December of 2016, this series covers everything from the architecture and data model of Datomic to operation and scaling considerations.

The new training sessions provide a great foundation for developing a Datomic-based system. For those of you who have watched the original Day of Datomic videos, the series released today uses the new Datomic Client library for the examples and workshops, so if you haven't yet explored Datomic Clients, now is the perfect opportunity to do so!

If you ever want to refer back to the original Peer-based training videos, don't worry - they're all still available as well.

In addition to an updated Day of Datomic, we've released a fully re-organized and re-written Getting Started section in the Datomic Documentation. We have gathered and incorporated feedback from new and existing users and hope that the new Getting Started is a much more comprehensive and accessible introduction to Datomic.

We look forward to your thoughts and feedback. If you have any comments on the new training videos, the new getting started section, or any additional thoughts, please let us know!

Wednesday, January 25, 2017

The Ten Rules of Schema Growth

Data outlives code, and a valuable database supports many applications over time. These ten rules will help grow your database schema without breaking your applications.

1.  Prod is not like dev.

Production is not development. In production, one or more codebases depend on your data, and the rules below should be followed exactingly.

A dev environment can be much more relaxed.  Alone on your development machine experimenting with a new feature, you have no users to break.  You can soften the rules, so long as you harden them when transitioning to production.

2.  Grow your schema, and never break it.

The lack of common vocabulary makes it all too easy to automate the wrong practices. I will use the terms growth and breakage as defined in Rich Hickey's Spec-ulation talk.  In schema terms:

  • growth is providing more schema
  • breakage is removing schema, or changing the meaning of existing schema.

In contrast to these terms, many people use "migrations", "refactoring", or "evolution". These usages tend to focus on repeatability, convenience, and the needs of new programs, ignoring the distinction between growth and breakage. The problem here is obvious: Breakage is bad, so we don't want it to be more convenient!

Using precise language underscores the costs of breakage. Most migrations are easily categorized as growth or breakage by considering the rules below.  Growth migrations are suitable for production, and breakage migrations are, at best, a dev-only convenience. Keep them widely separate.

3. The database is the source of truth.

Schema growth needs to be reproducible from one environment to another.  Reproducibility supports the development and testing of new schema before putting it into production and also the reuse of schema in different databases. Schema growth also needs to be evident in the database itself, so that you can determine what the database has, what it needs, and when growth occurred.

For both of these reasons, the database is the proper source of truth for schema growth. When the database is the source of truth, reproducibility and auditability happen for free via the ordinary query and transaction capabilities of the database.  (If your database is not up to the tasks of queries and transactions, you have bigger problems beyond the scope of this article.)

Storing schema in a database is strictly more powerful than storing schema as text files in source control. The database is the actual home for schema, plus it provides validation, structure, query, transactions, and history. A source control system provides only history and is separate from the data itself.

Note that this does not mean "never put schema information in source control". Source control may be convenient for other reasons, e.g. it may be more readily accessible. You may redundantly store schema in source control, but remember that the database is definitive.

4.  Growing is adding.

As you acquire more information about your domain, grow your schema to match. You can grow a schema by adding new things, and only by adding new things, for example:

  • adding new attributes to an existing 'type'
  • adding new types
  • adding relationships between types
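In Datomic, for example, growth is nothing more than transacting new schema. A minimal sketch, with an illustrative attribute name:

```clojure
;; Growing the schema by adding a new attribute to the (notional) user "type".
;; Purely additive: existing attributes, data, and programs are untouched.
(def schema-growth
  [{:db/ident       :user/nickname        ; illustrative name
    :db/valueType   :db.type/string
    :db/cardinality :db.cardinality/one
    :db/doc         "A user's informal display name."}])

;; With a live connection:
;; (d/transact conn {:tx-data schema-growth})
```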

5.  Never remove a name.

Removing a named schema component at any level is a breaking change for programs that depend on that name. Never remove a name.

6.  Never reuse a name.

The meaning of a name is established when the name is first introduced. Reusing that name to mean something substantially different breaks programs that depend on that meaning. This can be even
worse than removing the name, as the breakage may not be as immediately obvious.

7.  Use aliases.

If you are familiar with database refactoring patterns, the advice in Rules Five and Six may seem stark. After all, one purpose of refactoring is to adopt better names as we discover them. How can we
do that if names can never be removed or changed in meaning?

The simple solution is to use more than one alias to refer to the same schema entity. Consider the following example:

  • In iteration 1, users of your system are identified by their email with an attribute named :user/id
  • In iteration 2, you discover that users sometimes have non-email identifiers, and that you want to store a user's email even when it is not being used as an identifier. In short, you wish that :user/id was named :user/primary-email.

No problem! Just create :user/primary-email as an alias for :user/id. Older programs can continue to use :user/id, and newer programs can use the now-preferred :user/primary-email.
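In Datomic, for instance, this can be sketched by asserting a new :db/ident on the existing attribute entity; per the Datomic documentation, the old ident continues to resolve, so both names refer to the same attribute:

```clojure
;; Alias :user/id as :user/primary-email. The keyword :user/id in :db/id
;; position resolves to the attribute's entity; asserting a new :db/ident
;; on it leaves the old ident resolvable, so both names keep working.
(def alias-tx
  [{:db/id    :user/id
    :db/ident :user/primary-email}])

;; (d/transact conn {:tx-data alias-tx})
```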

8.  Namespace all names.

Namespaces greatly reduce the cost of getting a name wrong, as the same local name can safely have different meanings in different namespaces.  Continuing the previous example, imagine that the local
name id is used to refer to a UUID in several namespaces, e.g. :inventory/id, :order/id, and so on. The fact that :user/id is not a UUID is inconsistent, and newer programs should not have to put up with this.

Namespaces let you improve the situation without breaking existing programs. You can introduce :user-v2/id, and new programs can ignore names in the user namespace. If you don't like v2, you can also pick a more semantic name for the new namespace.

9.  Annotate your schema.

Databases are good at storing data about your schema. Adding annotations to your schema can help both human readers and programs make sense of how the schema grew over time. For example:

  • you could annotate names that are not recommended for new programs with a :schema/deprecated flag, or you could get fancier still with :schema/deprecated-at or :schema/deprecated-because. Note that such deprecated names are still never removed (Rule Five).
  • you could provide :schema/see-also or :schema/see-instead pointers to more current conventions. 

In fact, all the database refactoring patterns that are typically implemented as breaking changes could be implemented non-destructively, with the refactoring details recorded as an annotation. For example, the breaking "split column" refactoring might instead be implemented as schema growth:

  • add N new columns
  • (optional) add a :schema/split-into attribute on the original column whose value is the new columns, and possibly even the recipe for the split
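A sketch of such annotations in Datomic. The :schema/* attributes are conventions you define yourself, not built-ins:

```clojure
;; One-time growth: define the annotation attributes themselves.
(def annotation-schema
  [{:db/ident       :schema/deprecated
    :db/valueType   :db.type/boolean
    :db/cardinality :db.cardinality/one}
   {:db/ident       :schema/see-instead
    :db/valueType   :db.type/ref
    :db/cardinality :db.cardinality/one}])

;; Annotate :user/id without removing it (Rule Five still holds).
;; Assumes :user/primary-email exists, as in the Rule Seven example.
(def annotate-tx
  [{:db/id              :user/id
    :schema/deprecated  true
    :schema/see-instead :user/primary-email}])
```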

10. Plan for accretion.

If a system is going to grow at all, then programs must not bake in limiting presumptions.  For example: If a schema states that :user/id is a string, then programs can rely on :user/id being a string and not occasionally an integer or a boolean.  But a program cannot assume that a user entity will be limited to the set of attributes previously seen, or that it understands the semantics of attributes that it has not seen before.

Are these rules specific to a particular database?

No. These rules apply to almost any SQL or NoSQL database.  The rules even apply to the so-called "schemaless" databases.  A better word for schemaless is "schema-implicit", i.e. the schema is implicit in your data and the database has no reified awareness of it.  With an implicit schema, all the rules still apply, except that the database is impotent to help you (no Rule 3).

In Context

Many of the resources on migrations, refactoring, and database evolution emphasize repeatability and the needs of new programs, without making the top-level distinctions of growth vs. breakage and prod vs. dev. As a result, these resources encourage breaking the rules in this article.

Happily, these resources can easily be recast in growth-only terms.  You can grow your schema without breaking your app. You can continuously deploy without continuously propagating breakage.  Here's what it looks like in Datomic.

Wednesday, December 14, 2016

Customer Feedback Portal

As part of our commitment to improving Datomic, a few weeks ago we enabled a new feature request and user feedback system, Receptive, where you can help us prioritize our efforts and help shape the future of Datomic.

To submit your feature request, follow the "Suggest Features" link in the top navigation of the dashboard. We have already connected your account to Receptive, so everything is set up and ready for you to go.

You can read more about using Receptive here.

-The Datomic Team

Monday, November 28, 2016

Datomic Update: Client API, Unlimited Peers, Enterprise Edition, and More

We are pleased to announce that the latest (0.9.5530) release of Datomic includes a set of new features and licensing changes to address needs identified by our customers:
  • In addition to the peer model, Datomic now includes a Client API suitable for smaller, short-lived processes, e.g. microservices.
  • The various tiers of the Datomic Pro license model have been simplified to a single license with no restriction on peer count.
  • We have introduced an Enterprise license tier for users who need customized pricing, support, or licensing terms.
  • Tempids and explicit partitions are now optional, simplifying code for the many programs that do not care about them.
  • Schema install and update are now implicit, and do not require explicit :db.install/attribute or :db.alter/attribute datoms.

The features described above are additive and opt-in, so take advantage of them as and when you please.

Each of these changes is described in more detail below.

Building On a Solid Foundation

Before talking about what is new, it is important to talk about what is unchanged. We built Datomic believing that the Rationale is a sound foundation for an information system, and experience has proven this out. We have not retracted a word of the rationale since day one, and are not doing so today. Datomic’s core ideas are unchanged:
  • getting time, process, and perception right
  • sound data model
  • ACID transactions
  • Datalog query
  • minimal schema
  • separate reads and writes
  • programming with data

Datomic has delivered these ideas with a discipline that minimizes breaking API change. As a result, Datomic users have been able to focus on their business problems without having to worry about changing semantics in their database.

Client API

Datomic’s peer library puts database query in your own application process. This provides several benefits, but at the price of a heavier dependency (both in code and in memory requirements) than a traditional client.

A smaller footprint is useful in environments that have operational limitations, or where processes are small or short-lived. The new Datomic client API addresses this need. Lightweight clients connect to Peer Servers, which are peers that run in a separate address space.

Existing peers are unchanged, and you can mix and match peer and client applications as you see fit within the same Datomic install. Clients and peers are described in detail in the new clients and peers section of the docs.

With today’s release, we are making available the alpha version of the open source Client library for Clojure. The Java library will be released shortly. We also have plans to both create more language libraries for Client and enable our customers to create their own. We are interested in your feedback on the Client API itself and the priority of our language reach efforts. As of today, we have enabled a customer feedback portal, accessible via the "Suggest Features" link in the top navigation of the dashboard, where you can help us prioritize our efforts in this (and many other) areas.

Unlimited Peers

Flexibility in Peer use has been the most often-requested update to Datomic. You are solving complex problems using cutting edge technologies and architectures. Your tools should allow you to design the system that best fits your needs.  Datomic’s new licensing model gives all users - Starter, Pro and Enterprise - the ability to design for and deploy as many Peer processes (and Clients!) as their systems require. Today’s release represents a massive upgrade to the potential of each (new and existing) Datomic installation.

Pro Starter License

The Pro Starter license provides a no-cost way to try Datomic. You get a perpetual license plus a year of software upgrades for free. Starting with this release, Pro Starter includes all the features of a Pro license, including
  • unlimited peers
  • clients
  • High Availability (HA)
  • integrated memcached

Enterprise Tier

Datomic has a number of enterprise customers already. They distinguish themselves by wanting
  • custom license terms
  • custom pricing for larger installations
  • custom support terms
  • custom development

If you match one or more of these criteria, contact us to discuss an Enterprise license.

Tempid and Partition Defaults

Datomic’s tempids provide a way to partition new entities, encoding a locality hint directly in transactions. This feature is powerful, but rarely used, and the API and data structure for tempids are an inconvenience for the majority of users, who do not need or want partition control.

Starting with the current release of Datomic:
  • tempids are optional
  • when you need a tempid to coordinate the relationship between two entities, you can use an ordinary string instead of a tempid structure, and that string can be meaningful to readers of your code
  • the existing tempid data structure and API continue to be supported unchanged. Use them if you want them.

Clients will support string tempids only.
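A sketch of a transaction that uses a string tempid to relate two new entities; the attribute names are illustrative:

```clojure
;; "new-user" is an ordinary string tempid: it links the order to the user
;; created in the same transaction, and reads meaningfully in code.
(def tx-data
  [{:db/id   "new-user"
    :user/id "jane@example.com"}
   {:order/owner "new-user"      ; resolves to the entity above
    :order/total 100}])
```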

Schema Install and Update

Previously, transactions that changed attribute schema had to include either :db.install/attribute (to create an attribute) or :db.alter/attribute (to change an existing attribute). The new release of Datomic infers the need for these datoms and adds them to your transaction automatically, reducing the verbosity of schema data.
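For example, an attribute definition like this is now sufficient on its own (attribute name illustrative):

```clojure
;; Before this release, installing an attribute also required asserting
;; :db.install/_attribute :db.part/db; Datomic now infers it.
(def new-attr
  [{:db/ident       :user/email
    :db/valueType   :db.type/string
    :db/cardinality :db.cardinality/one}])
```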


We are very excited about the additions and changes to Datomic. To celebrate, we will be offering a 20% discount on new Datomic purchases through the end of February 2017. We hope you take advantage of the new features and this discount opportunity and please feel free to reach out to us at anytime.