Keep Chocolate Love Atomic

10 July 2012

Datomic is a database of atomic facts, or datoms, that consist of entity, attribute, value, and transaction. For example, "I love chocolate (as of tx 1000)."

Of course, I am capable of loving many things, so the :loves attribute should be :cardinality/many. Here is an abbreviated history of my loves:

; some point in time...
[[:db/add stu :loves :chocolate]
[:db/add stu :loves :vanilla]]

; later...
[[:db/add stu :loves :octomore]
[:db/retract stu :loves :vanilla]]

The set of all things I currently love is derived information, and it can be calculated from the history of atomic facts. Based on the transactions above, I currently love :chocolate and :octomore.
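To make the derivation concrete, here is a minimal sketch in plain Python. It is not Datomic's API: the tuple layout, the `history` list, and the `current_values` function are hypothetical stand-ins, assuming only that facts are replayed in transaction order.

```python
# Hypothetical stand-in for a history of atomic facts; not Datomic's API.
# Each fact is (op, entity, attribute, value), listed in transaction order.
history = [
    ("add", "stu", ":loves", ":chocolate"),
    ("add", "stu", ":loves", ":vanilla"),
    ("add", "stu", ":loves", ":octomore"),
    ("retract", "stu", ":loves", ":vanilla"),
]


def current_values(history, entity, attribute):
    """Replay adds and retracts in order to derive the current value set."""
    values = set()
    for op, e, a, v in history:
        if (e, a) != (entity, attribute):
            continue  # fact is about some other entity or attribute
        if op == "add":
            values.add(v)
        elif op == "retract":
            values.discard(v)
    return values


print(sorted(current_values(history, "stu", ":loves")))
# prints [':chocolate', ':octomore']
```

The history is the source of truth; the set is recomputed from it and never stored as a fact of its own.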

Datomic automatically handles this derivation, as can be seen through the entity interface:

stu.get(:loves)
=> #{:chocolate :octomore}

Now, imagine creating a web interface with checkboxes for different things a person might love. You initially populate the interface with my current loves, pulled from the database. I interact with the system, and you get back a set of checkbox states.

At this point, you should submit adds and retracts only for the new facts I created -- not a set with an add or retract for every UI element. This is a subtle point. If I liked chocolate before, and I didn't uncheck chocolate, what is the harm in saying "Stu likes chocolate" again?

The biggest problem is that you are lying to the database. I didn't repeat my love of chocolate. What if the system also had a user interface more subtle than checkboxes, that allowed me to reiterate past preferences? You wouldn't be able to tell the difference.

An obvious warning sign is when you find yourself submitting derived information (the set of my likes) when you actually have the facts (what I just said) in hand. Ignoring facts and recording derived information is always perilous -- imagine managing a system that records birthdays and ages, but not birthdates.

A more subtle mistake is to abuse transactions to extract facts from derived information. You have a new derived set in hand, and the database knows how to calculate the previous derived set. Given those two things, you could write a transaction function that takes the two sets and backtracks to figure out what changed.

This approach has a variant of the dishonesty problem mentioned before, in that it provides no way for me to reiterate my love for chocolate. But the other problem with this approach may be even worse: It imposes coordination in the implementation, where no coordination was required by the domain.

Let's say that I choose, at some point in time, to start liking :cheesecake and :nachos. These are atomic choices, requiring no coordination with any historical record. If you send Datomic a set of all checkbox states, and ask it to discover :cheesecake and :nachos inside a transaction, you are manufacturing a coordination job that has no basis in reality. Unnecessary coordination is an enemy of scalability and reuse.

The root cause of confusion here is update-in-place thinking. The checkbox model exposes derived information (the current states) but not the facts (the choices the user made). Given the set of checkbox states, you should do the diff in the web tier as soon as you pull data out of the form. This still has the problem that there is no way to restate that you love chocolate, but now the scope of the problem is localized to its cause -- the checkbox model. You can fix the problem, or not (you often don't care, which is why checkboxes work the way they do). But at least you are not propagating the problem into the permanent record.
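A web-tier diff of this kind might look like the following Python sketch. The function name `checkbox_diff` and its parameters are hypothetical, and the output only mimics the shape of the transaction data shown earlier; it assumes the web tier still has the set it used to populate the form.

```python
def checkbox_diff(entity, attribute, previous, checked):
    """Turn a full set of checkbox states into adds and retracts for changes only.

    `previous` is the set the form was populated with; `checked` is the set of
    boxes that came back checked. Both names are hypothetical.
    """
    adds = [[":db/add", entity, attribute, v] for v in sorted(checked - previous)]
    retracts = [[":db/retract", entity, attribute, v] for v in sorted(previous - checked)]
    return adds + retracts


previous = {":chocolate", ":octomore"}
checked = {":chocolate", ":octomore", ":cheesecake", ":nachos"}

tx_data = checkbox_diff("stu", ":loves", previous, checked)
for op in tx_data:
    print(op)
# prints ['db/add'-style entries for :cheesecake and :nachos only;
# nothing is emitted for :chocolate or :octomore, which did not change.
```

Because the diff happens before the transaction is built, the database only ever receives the new facts, and no coordination with the historical record is needed inside the transaction.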

Datomic is built on an understanding that data is created by atomic addition, not by corruptive modification. When your input source has an update-in-place model (such as checkbox states), you should convert to atomic facts before creating a transaction.

Now go eat some chocolate.


Memcache Support

24 June 2012

Today we're happy to announce transparent, integrated support for memcached in Datomic Pro Edition.

One of the nice things about the Datomic architecture is that the index segments kept in storage are immutable. That enables them to be cached extensively. Currently that caching happens inside the peers, which keep segments they have needed thus far in the application process heap.

While this is great for process-local working sets, there is only so much a single machine can cache. So, we've added support for an optional second tier of distributed, shared cache, leveraging a memcached cluster. This tier of cache can be as large as you wish, and is shared between all the peers.

The entire use of memcached is automatic and integrated - just provide the endpoints of your memcached cluster in configuration. The peer protocols will automatically look in it, and populate it on cache misses. Because the cached segments are immutable, there are no cache coherence problems or expiration policy woes.
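The lookup order described above can be sketched in Python as a read-through cache with two tiers. This is an illustration only, not how Datomic is implemented: the class `TieredSegmentCache`, the key `"segment-1"`, and the plain dicts standing in for a memcached cluster and for storage are all assumptions.

```python
class TieredSegmentCache:
    """Peer-local cache in front of a shared cache in front of storage."""

    def __init__(self, shared_cache, storage):
        self.local = {}              # per-peer, in-process cache
        self.shared = shared_cache   # dict standing in for a memcached cluster
        self.storage = storage       # dict standing in for the storage service
        self.storage_reads = 0       # counts reads that reached storage

    def get(self, key):
        if key in self.local:
            return self.local[key]
        segment = self.shared.get(key)
        if segment is None:
            # Miss in both tiers: read storage, then populate the shared tier.
            segment = self.storage[key]
            self.storage_reads += 1
            self.shared[key] = segment
        # Segments are immutable, so a cached copy never needs invalidating.
        self.local[key] = segment
        return segment


storage = {"segment-1": b"index data"}
shared = {}

peer_a = TieredSegmentCache(shared, storage)
peer_b = TieredSegmentCache(shared, storage)

peer_a.get("segment-1")  # misses both caches, reads storage, fills both
peer_b.get("segment-1")  # misses its local cache, hits the shared cache
print(peer_a.storage_reads, peer_b.storage_reads)
# prints 1 0
```

The second peer never touches storage, which is the effect the shared tier is meant to have on read load.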

The architecture incorporating memcached looks like this:

The benefits of this are many:

You can get a shared cache of arbitrary size - many deployments will be able to fit their entire database in memcached if desired.
If you are using a storage that is not otherwise distributed (e.g. unclustered PostgreSQL), the memcached tier can both almost entirely remove the read load on the single server and distribute it.
Even when using a distributed storage like DynamoDB, the memcached tier can reduce your read provisioning and increase speed.
Developers can set up a small memcached daemon locally so their DB will always feel 'hot' across process restarts.
The memcached tier will enable hybrid strategies where the peers, transactors and memcached are all local but the storage is remote (e.g. DynamoDB).

Datomic is a good citizen in its use of memcached - it doesn't need to 'own' the cluster, and all of the Datomic keys incorporate UUIDs so they won't conflict with other application-level use of the same memcached cluster.

We hope you enjoy this feature, which is included in the Pro Edition at no extra charge.

Datomic Editions and Pricing

24 June 2012

Over the past few months we've gotten feedback and input regarding our pricing and licensing, and we've revamped them to make things simpler and clearer. The subscription pricing made people feel as if the offering was a service (it's not), and it raised misgivings about termination and the like, so we've dropped it.

Here's the new offering:

We've added Datomic Free Edition - it's free and redistributable
Datomic Pro is licensed software.
It is offered with a perpetual license.
Maintenance (updates and support) for the first 12 months is included.
Maintenance in subsequent years is ~50% of the license fee.
Pricing is yearly, and listed up front on the web site.
Pricing is based upon the number of processes (transactors + peers) using the software in production in your organization.
Development and testing usage doesn't count against your license limits

We've priced it such that the expenditure for maintenance roughly correlates to our older subscription pricing.

Evaluation has changed a bit as well. In all modes, Datomic Pro will require a license key. You can get a free 30-day eval key via the web site. There is no longer a 'runs for a week without a key' mode.

Note that Datomic is not just for cloud deployments - our SQL and other storage support lets you run it on-premise or in the cloud.

You can get more information on the editions and pricing here.

Datomic Free Edition

24 June 2012

We're happy to announce today the release of Datomic Free Edition. This edition is oriented around making Datomic easier to get, and use, for open source and smaller production deployments.

Datomic Free Edition is ... free!
The system supports transactor-local storage
The peer library includes a memory database and Datomic Datalog
The Free transactor and peers are freely redistributable
The transactor supports 2 simultaneous peers

Of particular note here is that Datomic Free Edition comes with a redistributable license, and does not require a personal/business-specific license from us. That means you can download Datomic Free, build e.g. an open source application with it, and ship/include Datomic Free binaries with your software. You can also put the Datomic Free bits into public repositories and package managers (as long as you retain the licenses and copyright notices).

There is a ton of capability included in the Free Edition, including the Datomic in-process memory database (great for testing), and the Datomic datalog engine, which works on both Datomic databases and in-memory collections. That's right, free datalog for everyone.

You can use Datomic Free Edition in production, and you can use it in commercial applications.

Datomic Free edition is completely API-compatible with what we are now calling Datomic Pro edition (the one that pays the bills). Datomic Pro adds the ability to use additional storages like SQL and DynamoDB, support for more peers, as well as high-availability mode for transactors and our new memcache support.

You can read about the editions here.

Get Datomic!

Welcome

17 June 2012

Welcome to the Datomic blog!
