The MusicBrainz dataset makes a great example database for learning, evaluating, or testing Datomic for several reasons:
- It deals with a domain with which nearly everyone is familiar
- It is of decent size: 60,438 labels; 664,226 artists; 1,035,592 album releases; and 13,233,625 recorded tracks
- It comprises a good number of entities, attributes, and relationships
- It is fun to play with, query, and explore
Schema
The mbrainz-sample schema is an adaptation of a subset of the full MusicBrainz schema. We left out some entities, combined others, and made a few simplifying assumptions. In particular:
- We omit any notion of Work
- We combine Track, Tracklist and Recording into simply "track"
- We rename Release group to "abstractRelease"
Abstract Release vs. Release vs. Medium
(Adapted from the MusicBrainz schema docs)
An "abstractRelease" is an abstract "album" entity (e.g. "The Wall" by Pink Floyd). A "release" is something you can buy in your music store (e.g. the 1984 US vinyl release of "The Wall" by Columbia, as opposed to the 2000 US CD release by Capitol Records).
Therefore, when you query for releases (e.g. by name), you may see what look like duplicates: one entity for each distinct release of the same album. To find just the "work of art" level album entity, query for abstractRelease.
The media are the physical components comprising a release (disks, CDs, tapes, cartridges, piano rolls). One medium will have several tracks, and the total tracks across all media represent the track list of the release.
Entities
For information about the individual entities and their attributes, please see the schema page in the wiki, or the EDN schema itself.
Getting Started
First get Datomic, and start up a transactor.
Getting the Data
Next download the mbrainz backup:
    # 2.8 GB, md5 4e7d254c77600e68e9dc71b1a2785c53
    wget http://s3.amazonaws.com/mbrainz/datomic-mbrainz-backup-20130611.tar
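Since the download is large, it may be worth checking it against the MD5 listed above before going further. The following is a minimal sketch, not part of the mbrainz-sample repo: `verify_md5` is a hypothetical helper name, and it assumes GNU coreutils `md5sum` is on the PATH (other systems may ship a differently named tool).

```shell
#!/bin/sh
# verify_md5 FILE EXPECTED
# Succeeds only if the MD5 digest of FILE equals EXPECTED.
# Hypothetical helper; assumes GNU coreutils `md5sum` is available.
verify_md5() {
    file=$1
    expected=$2
    # Reading from stdin makes md5sum print "<digest>  -"; keep only the digest.
    actual=$(md5sum < "$file" | cut -d ' ' -f 1)
    if [ "$actual" = "$expected" ]; then
        echo "OK: $file"
    else
        echo "MISMATCH: $file (got '$actual', expected '$expected')" >&2
        return 1
    fi
}
```

With the function defined in the current shell, the check for this download would be `verify_md5 datomic-mbrainz-backup-20130611.tar 4e7d254c77600e68e9dc71b1a2785c53`. A mismatch most likely means a truncated or corrupted download.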
Finally, restore the backup:
    # this takes a while
    tar -xvf datomic-mbrainz-backup-20130611.tar
    # takes a while, but prints progress -- ~150,000 segments in restore
    bin/datomic restore-db file:datomic-mbrainz-backup-20130611 datomic:free://localhost:4334/mbrainz
Getting the Code
Clone the git repo somewhere convenient:
    git clone email@example.com:Datomic/mbrainz-sample.git
    cd mbrainz-sample
Running the examples
From Java
Fire up your favorite IDE, and configure it to use both the included pom.xml and the following Java options when running:
From Clojure
Start up a Clojure REPL:
    # from the root of the mbrainz-sample repo
    lein repl
Then connect to the database and run the queries.