Thursday, December 3, 2015

Reified Transactions

Domain data often answers the question "what", e.g.
  • What products are in inventory?
  • What users have accounts on the system?
  • What tests have been run?
But many domains also include information about the provenance of data, e.g.
  • Who said this data was correct?
  • When should this data be published?
  • Where did this data come from?
  • Why was this correction necessary?
How should you model this "data about data"? Datomic stores (and lets you store) facts about transactions themselves. In other words, transactions are themselves reified as data in the system.

Tim Ewald gave a great talk about this at Datomic Conf. His talk and all the other Datomic Conf talks are now freely available online.

Tuesday, November 24, 2015

CQRS with Clojure, Kafka, and Datomic

Datomic's clean separation of reads and writes has inspired many comparisons with the CQRS (Command and Query Responsibility Segregation) architectural pattern.

One person who has traveled far down this road is Bobby Calderwood, a member of the Technology Fellows team at Capital One. Check out his recent Clojure/conj talk, where he demonstrates a complete CQRS implementation using Clojure microservices, Kafka, and Datomic.

Thursday, November 5, 2015

Yeller chooses Datomic

Yeller is an exception tracking service whose goal is to provide clear, actionable information to developers to take the pain out of examining application errors. They chose Clojure and Datomic as the platform for Yeller in order to make the development process more effective and more efficient.

You can read all about their reasons for choosing Clojure and Datomic on their blog. You can also watch Tom explain it all at EuroClojure 2015.

Monday, September 14, 2015

Nubank chooses Datomic

Brazil is the world's 5th largest country in both area and population, has the 2nd most airports (behind the United States), and one of the highest credit card interest rates in the world. The Brazilian banking industry is both heavily regulated and extremely concentrated. Enter Nubank, whose founders have created a banking alternative in one of the world’s fastest growing mobile markets.

Traditional software wasn’t going to get the job done for Nubank. From the very beginning, the goal was “differentiation through technology.” They needed to plan for growth, both in size and complexity, to meet constantly changing regulatory and business rules.

Nubank turned to Datomic to access multiple data stores simultaneously, process transactions, and easily query historical data. Engineers can perform complex analyses on real-time data while the performance of the production environment is preserved. Along the way, the Cognitect team helped steer Nubank’s engineers on the right path as they built Datomic into their business.

Read the full case study to see how Datomic and Clojure are helping Nubank change the banking industry.

Thursday, August 20, 2015

A Conversational Introduction to Datomic

My colleague Carin Meier has been writing up her conversations with Datomic, introducing the database via an imagined dialog between a new user and the database.  In a very short space she introduces

Carin puts all these ideas into context with examples that you can explore in minutes.  If you are new to Datomic, check it out.

Tuesday, July 7, 2015

Datomic Best Practices

We are happy to announce the addition of a Best Practices section to the Datomic Development Resources documentation. The Best Practices topic is a tool to help users quickly find preferred usage patterns when approaching common scenarios in Datomic.

The Datomic Best Practices documentation includes sections on:


Where appropriate, examples are included as are links to the full documentation for further reading.

See the Best Practices documentation for the full list. We expect to add more subtopics to the Best Practices documentation over time, so check back periodically.

We look forward to your feedback. If you have any comments on the new best practices topics, or any questions you think would be addressed by an additional topic, let us know!

Tuesday, January 13, 2015

More Power for Datomic Datalog: Negation, Disjunction, and Range Optimizations

Today's Datomic release includes a number of enhancements to Datomic's Datalog query language:
  • Negation, via the new not and not-join clauses
  • Disjunction (or) without using rules, via the new or and or-join clauses
  • Required rule bindings
  • Improved optimization of range predicates
Each is described below, and you can follow the examples from the mbrainz data set in Java or in Clojure

Negation

The not clause shrinks a result set, removing already-bound tuples whose variables match a set of clauses.  
For example, the following query finds the names of all artists who are not Canadian:
[:find ?name
 :where [?eid :artist/name ?name]
        (not [?eid :artist/country :country/CA])]
The not-join clause generalizes not, and allows you to also specify which variables inside the not-join will unify with the enclosing query.
The following query counts artists who did not release an album in 1970.  The not-join specifies that ?artist unifies with the containing query, and (implicitly) that ?release is local.
[:find (count ?artist) .
 :where [?artist :artist/name]
        (not-join [?artist]
          [?release :release/artists ?artist]
          [?release :release/year 1970])]
As elsewhere in Datomic Datalog, lists of clauses inside not or not-join are connected by an implicit and

Disjunction

Datomic Datalog has long supported disjunction (or) via rules.  Rules are a powerful abstraction mechanism, allowing you to create named pieces of logic to be reused across multiple queries.  Rules are also the path to recursion in query.  But sometimes you don't need all that, and you just want simple disjunction in situ.
The or clause binds variables that match at least one of a set of predicates.
As an example, consider the way mbrainz models release media.   There are four separate enums that all represent vinyl: vinylvinyl-7vinyl-10, andvinyl-12.  An or clause can group these all together, finding all the vinyl releases:
[:find (count ?medium) .
 :where (or [?medium :medium/format :medium.format/vinyl7]
            [?medium :medium/format :medium.format/vinyl10]
            [?medium :medium/format :medium.format/vinyl12]
            [?medium :medium/format :medium.format/vinyl])]
The and clause is available as an explicit form inside or clauses.  For example, the following query finds artists that are either groups or female individuals:
[:find (count ?artist) .
 :where (or [?artist :artist/type :artist.type/group]
            (and [?artist :artist/type :artist.type/person]
                 [?artist :artist/gender :artist.gender/female]))]
Note that and clauses are used only inside or – elsewhere in query, and is implicit wherever clauses appear sequentially.
The or-join clause is a generalization of or, just as not-join generalizes not.

All Together

Negation and disjunction can be nested, allowing direct expression of e.g. 'not this or that', e.g.
[:find ?name
 :where [?eid :artist/name ?name]
        (not (or [?eid :artist/country :country/US])
                 [?eid :artist/country :country/CA])]
which means "find all the artists, except those that are from the US or Canada."
Required Rule Bindings
Rules are relational, and do not dictate the order in which variables are bound.   Imagine a rule that finds tracks shorter than a certain maximum length for an artist.  You might write that rule as:
[(short-track ?a ?t ?len ?max)
 [?t :track/artists ?a]
 [?t :track/duration ?len]
 [(< ?len ?max)]]
This rule does not tell the caller anything about when the four variables might be bound, revealing little about its use.  You might try to
  • invoke the rule in a context where no variables are bound yet, finding all possible relationships
  • invoke the rule with all variables bound, as a predicate check
  • any combination of variable bindings between these two extremes 

Often, complete flexibility in binding is exactly what you want.  But in this case, the author of the rule has a specific intent:
  • ?max is a required input, as nothing inside the rule knows how to manufacture a ?max
  • ?a is an intended input (this rule is about artists)
Rules can specify required vars in a list as the first element after the rule name.  The rule head above can thus be rewritten as
(short-track [?a ?max] ?t ?len)
Now the ?a and ?max variables are required. Datomic will not invoke a rule until required variables are bound, and will throw an exception if they cannot be bound.
Optimization of Range Predicates
Datomic now better leverages indexes to lookup range predicates (=, <=, >=, <. and >) where one argument is bound at the start of the query.  For example, in the following query
[:find (count ?artist) .
 :in $ ?max-year
 :where [?artist :artist/startYear ?year]
        [(< ?year ?max-year)]]

Datomic will see that ?max-year is bound, and use the AVET index on :artist/startYear to consider only datoms guaranteed to match the predicate.  This can result in substantial speedups and reduced memory usage for queries that substantially limit results via range predicates.

Summary

Taken together, the new capabilities above add substantial expressive power and performance to Datomic Datalog.  Get the full details in the docs, and happy querying!