What is Camlistore?
20% project, personal itch to scratch
  o sick of building CMS-like systems
  o livejournal, photos, brackup, scanning cabinet, my websites, ...
  o hacking since Jun 2010, idle planning for ~year before that
storage system?
  o yeah, but not the whole story
"a way to store, sync, share, model and back up content"
...
Content-Addressable
at the bottom-most layer, everything is addressed by the digest of its bytes
Terminology: "Blob" -- 0 or more bytes. No extra metadata.
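A minimal sketch of addressing a blob by its digest. The "sha1-<hex>" form matches the references shown later in these slides; the function name `blobref` is only illustrative.

```python
import hashlib


def blobref(data: bytes) -> str:
    """Return the content address of a blob: 'sha1-' plus the hex SHA-1 of its bytes."""
    return "sha1-" + hashlib.sha1(data).hexdigest()


# A blob is 0 or more bytes, so even the empty blob has an address.
print(blobref(b""))  # sha1-da39a3ee5e6b4b0d3255bfef95601890afd80709
print(blobref(b"hello"))
```

The address depends only on the bytes, so the same content gets the same name on every machine.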
Content-Addressability properties
trivial caching / syncing: you have it or you don't
  o no "which version do you have?"
content deduplication
  o multiple users having same content
  o filesystem backup snapshots
incrementals are cheap
etc...
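The properties above can be sketched with a toy in-memory store. `BlobStore` and `sync` are hypothetical names for illustration, not Camlistore's actual API; the point is only that dedup and sync reduce to "do you have this digest or not?".

```python
import hashlib


class BlobStore:
    """Toy in-memory blob store keyed by content digest (illustrative only)."""

    def __init__(self):
        self.blobs = {}

    def put(self, data: bytes) -> str:
        ref = "sha1-" + hashlib.sha1(data).hexdigest()
        # Identical content maps to the same key, so it is stored once.
        self.blobs.setdefault(ref, data)
        return ref

    def has(self, ref: str) -> bool:
        return ref in self.blobs


def sync(src: BlobStore, dst: BlobStore) -> int:
    """Copy every blob that dst lacks; return how many were sent."""
    missing = [ref for ref in src.blobs if not dst.has(ref)]
    for ref in missing:
        dst.put(src.blobs[ref])
    return len(missing)
```

A second `sync` between the same two stores sends nothing, which is why incrementals are cheap.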
Multi-Layer
Unix school of thought:
  o small, well-defined composable tools
Camlistore has multiple layers / parts:
  o blob server: super dumb
  o schema: how one might represent data
  o search/indexer: make sense of dumbness, data
  o frontend: interact with world, sharing
Architecture
Filesystem backups
previous project: brackup
  o slice/dice/encrypt S3 backup, content-addressed, but only files C-A, not dirs
fossil/venti, git: recursive directories content-addressed
  o git: "tree objects"
camlistore: "schema blobs"
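A rough sketch of what "recursive directories content-addressed" buys you. The directory encoding here (a JSON object with an "entries" map) is made up for illustration and is not Camlistore's or git's real format; `store_tree` is a hypothetical helper.

```python
import hashlib
import json


def blobref(data: bytes) -> str:
    return "sha1-" + hashlib.sha1(data).hexdigest()


def store_tree(node, blobs: dict) -> str:
    """Store a file (bytes) or a directory (dict of name -> node); return its ref.

    A directory is stored as a blob listing the refs of its children, so its
    own ref changes only if something beneath it changes.
    """
    if isinstance(node, bytes):
        data = node
    else:
        entries = {name: store_tree(child, blobs) for name, child in node.items()}
        # Hypothetical directory encoding, sorted so the bytes are deterministic.
        data = json.dumps({"entries": entries}, sort_keys=True).encode("utf-8")
    ref = blobref(data)
    blobs[ref] = data
    return ref
```

Snapshotting a tree where one file changed adds only the new file blob plus the directory blobs on the path up to the root; untouched subtrees keep their refs.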
Schema Blobs
so if all blobs are just dumb blobs of bytes with no metadata, how do you store metadata?
as blobs themselves!
how to recognize it? same way you sniff a JPEG. magic.
start with a '{'? parse as JSON?
in-memory schema -> JSON object serialization with "camliVersion" key == "schema blob"
Schema Blob
Minimal "schema blob" is:
  { "camliVersion": 1, "camliType": "whatever" }
Whitespace doesn't matter. Just must be valid JSON in its entirety.
Use whatever JSON libraries you've got.
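The sniffing rule can be sketched in a few lines. `is_schema_blob` is a hypothetical helper, and requiring '{' as the very first byte is an assumption drawn from the "start with a '{'?" hint.

```python
import json


def is_schema_blob(data: bytes) -> bool:
    """Sniff a blob: first byte '{', valid JSON in its entirety, has "camliVersion"."""
    if not data.startswith(b"{"):
        return False
    try:
        obj = json.loads(data.decode("utf-8"))
    except (UnicodeDecodeError, ValueError):
        # Not UTF-8 or not valid JSON all the way through: just a dumb blob.
        return False
    return isinstance(obj, dict) and "camliVersion" in obj


print(is_schema_blob(b'{ "camliVersion": 1, "camliType": "whatever" }'))
print(is_schema_blob(b"\xff\xd8\xff\xe0 some JPEG bytes"))
```

Anything that fails the sniff is still a perfectly good blob; it just carries no schema.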
Terminology...
"object": a set of blobs representing a mutable object. you modify an object by adding a new mutation claim blob to the set.
"signed schema blob" or "claim": a schema blob that you JSON-sign. (OpenPGP)
  o aside: bootstrapping tools for this.
"permanode": a blob that's just a signed schema blob of a random number that serves as the anchor and reference point for the object. like a "permalink" on the web.
Permanode
$ camput --permanode
sha1-ea799271abfbf85d8e22e4577f15f704c8349026
$ camget sha1-ea799271abfbf85d8e22e4577f15f704c8349026
{"camliVersion": 1, "camliSigner": "sha1-c4da9d771661563a27704b91b67989e7ea1e50b8", "camliType": "permanode", "random": "oj)r}$Wa/[J|XQThNdhE" ,"camliSig":"iQEcBAABAgAGBQJNRxceAAoJEGjzeDN/6vt8ihIH/Aov7FRIq4dODAP WGDwqL1X9Ko2ZtSSO1lwHxCQVdCMquDtAdI3387fDlEG/ALoT/LhmtXQgYTt8Qq DxVduEK1or6/jqo3RMQ8tTgZ+rW2cj9f3Q/dg7el0Ngoq03hyYXdo3whxCH2x0jajSt4 RCcgdXN6XmLlOgD/LVQEJ303Du1OhCvKX1A40BIdwe1zxBc5zkLmoa8rClAlHdq wogxYFY4cwFm+jJM5YhSPemNrDe8W7KT6r0oA7SVfOan1NbIQUel65xwIZBD0ah CXBx6WXvfId6AdiahnbZiBup1fWSzxeeW7Y2/RQwv5IZ8UgfBqRHvnxcbNmScrzlp3 V3ZoY==BfKn"}
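To make "an object is a set of claims anchored at a permanode" concrete, here is a sketch that folds claims into the object's current state. The claim field names (`permaNode`, `claimType`, `attribute`, `value`, `claimDate`) and the `resolve` helper are assumptions for illustration; these slides don't spell out the claim format, and signature checking is skipped entirely.

```python
def resolve(permanode: str, claims: list) -> dict:
    """Replay the claims about one permanode in date order; return its attributes."""
    attrs = {}
    mine = [c for c in claims if c["permaNode"] == permanode]
    # Assumes claimDate is an ISO-8601 string in one format, so sorting the
    # strings sorts the claims chronologically.
    for c in sorted(mine, key=lambda c: c["claimDate"]):
        if c["claimType"] == "set-attribute":
            attrs[c["attribute"]] = c["value"]
        elif c["claimType"] == "del-attribute":
            attrs.pop(c["attribute"], None)
    return attrs
```

Nothing is ever overwritten: a "modification" is one more claim blob, and the current state is whatever the replay yields.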
Search / Indexer...
Search / Indexer
subscribes to blobs in real-time
  o or enumerates / mapreduces world on init
builds index of:
  o directed blob graph,
  o resolved attributes,
  o set memberships,
  o dates, tags, ...
  o whatever's needed
eventually consistent
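A toy version of the "directed blob graph" part of the index. The `Index` class and its `receive` method are hypothetical, not the real indexer's interface, and treating any string that looks like "sha1-<40 hex>" as an edge is a simplifying assumption.

```python
import json
import re
from collections import defaultdict

BLOBREF = re.compile(r"^sha1-[0-9a-f]{40}$")


def find_refs(value):
    """Yield every string inside a parsed JSON value that looks like a blobref."""
    if isinstance(value, str):
        if BLOBREF.match(value):
            yield value
    elif isinstance(value, dict):
        for v in value.values():
            yield from find_refs(v)
    elif isinstance(value, list):
        for v in value:
            yield from find_refs(v)


class Index:
    """Toy index: which blobs point at which (illustrative only)."""

    def __init__(self):
        self.out_edges = defaultdict(set)  # ref -> refs it mentions
        self.in_edges = defaultdict(set)   # ref -> refs that mention it

    def receive(self, ref: str, data: bytes) -> None:
        """Called once per blob, whether streamed live or enumerated on init."""
        try:
            obj = json.loads(data.decode("utf-8"))
        except (UnicodeDecodeError, ValueError):
            return  # not a schema blob; nothing to index here
        if not isinstance(obj, dict) or "camliVersion" not in obj:
            return
        for target in find_refs(obj):
            self.out_edges[ref].add(target)
            self.in_edges[target].add(ref)
```

Because the edges are sets, blobs can arrive in any order or more than once and the graph ends up the same, which fits "eventually consistent".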
demo: http://camlistore.org/docs/sharing
Questions?
http://camlistore.org/