DIY NoSQL part deux: interchangeable parts
After my last post on using Clojure’s STM as a quick-and-dirty in-memory datastore, I had an interesting discussion in the comments about the wisdom of implementing my example with a static global. Coincidentally, I attended a talk by Stuart Sierra at EuroClojure about this very thing, and started getting some ideas about how to make things better, which I want to share today.
Recall the core of our last example.
(def twits (atom []))
(defn clean-twit [twit] ...)
; ...
(defn get-twits [] @twits)
(defn get-twit [ii] (nth @twits ii))
(defn put-twit! [twit]
  (swap! twits #(take MAX-TWITS (conj % (clean-twit twit)))))
There are two problems with this simple code. The first is that our implementations aren’t pure functions, or even pure-ish; they reference that external global atom just sitting there in the twit namespace. The second is that, when we upgrade our code to use a “real” database, our implementations will have to be discarded. But we might still want to use an in-memory data store for testing or development, or even write a still-simpler mock.
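To make the first problem concrete, here is a small sketch of how the global leaks state between callers. MAX-TWITS and clean-twit are stand-ins I made up for this example, since the originals were elided above.

```clojure
;; Stand-ins for definitions elided from the post (assumptions).
(def MAX-TWITS 5)
(defn clean-twit [twit] (select-keys twit [:name :message]))

(def twits (atom []))
(defn get-twits [] @twits)
(defn put-twit! [twit]
  (swap! twits #(take MAX-TWITS (conj % (clean-twit twit)))))

;; One "test" writes a twit...
(put-twit! {:name "test-a" :message "hello"})

;; ...and any later caller sees it, because everyone shares `twits`.
(def seen-by-next-test (count (get-twits)))

;; The only isolation available is resetting the global by hand.
(reset! twits [])
```

Nothing in the function signatures hints at this shared state, which is what makes it easy to trip over.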
We can solve both of these issues with a little structure. The solution I’m presenting here is inspired by Stuart Sierra’s Component library (and excellent talk), but scaled back a bit for the sake of simplicity, since we don’t (at the moment) have any complex dependencies to manage.
Part 1: Refactoring our old API
First, we add some records: one representing our existing storage, AtomStore, and one representing the new hotness, some sort of database, which we’ll call DBStore. We won’t implement anything using DBStore yet, but we plan to eventually.
(defrecord AtomStore [store])
(defrecord DBStore [conn])
Note that neither record contains any implementation detail, just a definition of what key it must contain. When we instantiate these, we’ll need to provide them with what they need: AtomStore with an (atom {}), and DBStore with a connection spec of some sort, depending on the actual DB library in use.
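As a quick illustration, instantiating the two records might look something like the sketch below. The contents of the connection map are placeholders I invented; the real shape depends on whichever DB library you pick.

```clojure
(defrecord AtomStore [store])
(defrecord DBStore [conn])

;; defrecord generates a positional constructor (->Name)
;; and a map constructor (map->Name).
(def atom-store (->AtomStore (atom {})))

;; Hypothetical connection spec; the keys depend on your DB library.
(def db-store (map->DBStore {:conn {:dbtype "postgresql" :dbname "twits"}}))

;; Records behave like maps, so fields are read with keyword lookup.
(:store atom-store)   ; the atom itself
@(:store atom-store)  ; the (currently empty) map inside it
(:conn db-store)      ; the connection spec we passed in
```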
Our next problem is using different implementations of the twit-storing machinery, without exposing this detail to the user. We’ll do this using protocols, since we’ll only really need to dispatch on whether or not we’re using AtomStore or DBStore, but this whole thing could as easily be done with regular maps and multimethods. However, I find the self-documenting nature of a protocol definition convenient and comforting, so we’re doing it that way.
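For comparison, here is a rough sketch of the maps-and-multimethods route mentioned above. The :kind key, along with the MAX-TWITS and clean-twit stand-ins, are assumptions of mine for the sake of a runnable example.

```clojure
;; Stand-ins for helpers elided from the post (assumptions).
(def MAX-TWITS 5)
(defn clean-twit [twit] (select-keys twit [:name :message]))

;; Dispatch on a :kind key in a plain map instead of on a record type.
(defmulti get-twits (fn [store] (:kind store)))
(defmulti put-twit! (fn [store twit] (:kind store)))

(defmethod get-twits :atom [store]
  (:twits @(:store store)))

(defmethod put-twit! :atom [store twit]
  (swap! (:store store) update :twits
         #(take MAX-TWITS (conj (or % ()) (clean-twit twit)))))

;; The "store" is just a map carrying its kind and its state.
(def store {:kind :atom :store (atom {})})
(put-twit! store {:name "bob" :message "hi" :junk 1})
(get-twits store)
```

This works fine, but the set of operations a store must support is scattered across several defmulti forms rather than stated in one place, which is the documentation benefit a protocol gives us.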
First, we add a protocol formalizing our public API.
We use the existing function names put-twit!, get-twit, and get-twits:
(defprotocol TweetStore
  (put-twit! [this twit])
  (get-twit [this ii])
  (get-twits [this]))
Now, we’ll rewrite our existing functions as the implementations of put-twit!, get-twit and get-twits on the AtomStore. Another change: instead of using a vector in our twit atom, we use a single hash-map, with a vector under the keyword :twits.
(extend-protocol TweetStore
  AtomStore
  (get-twits [this] (get @(:store this) :twits))
  (get-twit [this ii] (nth (get-twits this) ii))
  (put-twit! [this twit]
    (swap! (:store this)
           #(assoc %
                   :twits (->> (conj (or (:twits %) []) ; Grab (:twits store) or []
                                     (clean-twit twit)) ; Add the cleaned twit
                               (take MAX-TWITS))))))    ; Drop any old twits
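To see what this buys us, here is a condensed, self-contained variant that can be pasted into a REPL. It is a sketch rather than the exact code above: MAX-TWITS and clean-twit are stand-ins I made up, and I've used update instead of assoc for brevity.

```clojure
;; Stand-ins for definitions elided from the post (assumptions).
(def MAX-TWITS 3)
(defn clean-twit [twit] (select-keys twit [:name :message]))

(defrecord AtomStore [store])

(defprotocol TweetStore
  (put-twit! [this twit])
  (get-twit [this ii])
  (get-twits [this]))

(extend-protocol TweetStore
  AtomStore
  (get-twits [this] (:twits @(:store this)))
  (get-twit [this ii] (nth (get-twits this) ii))
  (put-twit! [this twit]
    (swap! (:store this) update :twits
           #(take MAX-TWITS (conj (or % ()) (clean-twit twit))))))

;; Two stores, two atoms: nothing is shared through a global.
(def store-a (->AtomStore (atom {})))
(def store-b (->AtomStore (atom {})))

(doseq [n (range 5)]
  (put-twit! store-a {:name "ann" :message (str "twit " n)}))

(map :message (get-twits store-a)) ; newest first, capped at MAX-TWITS
(get-twits store-b)                ; untouched by the writes to store-a
```

Each test (or each REPL experiment) can now build its own throwaway store instead of resetting a shared one.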
Next, we’ll need to get a store object all the way down to those functions. One way to do this is to create a ring middleware that injects it into the request map, so let’s just do that:
(defn wrap-store [handler store]
  (fn [req] (handler (assoc req :store store))))
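Since a ring handler is just a function from request map to response map, the middleware can be exercised without starting a server. A minimal sketch, where echo-handler and the :fake-store value are invented purely for illustration:

```clojure
(defn wrap-store [handler store]
  (fn [req] (handler (assoc req :store store))))

;; A throwaway handler that just reports which store it was handed.
(defn echo-handler [req]
  {:status 200 :headers {} :body (str "store: " (:store req))})

;; Any value can stand in for the store here; a keyword will do.
(def app (wrap-store echo-handler :fake-store))

(app {:uri "/" :request-method :get})
;; => {:status 200, :headers {}, :body "store: :fake-store"}
```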
Our view functions will have to be changed to pass the store along:
(defn GET-index [{store :store :as request}]
  {:status 200
   :body (str "<html><body><h1>TOP TWITS</h1><ul>"
              (apply str (map twit-as-html (get-twits store)))
              "</ul>"
              "<form action=\".\" method=\"POST\">"
              "<input name=\"name\">"
              "<textarea name=\"message\"></textarea>"
              "<button>TWIT</button>"
              "</form>"
              "</body></html>")
   :headers {}})

(defn POST-index [{{name "name" message "message"} :params
                   store :store
                   :as request}]
  (put-twit! store {:name name :message message})
  (GET-index request))
And finally, our main method will have to use wrap-store with some instantiated store.
(defn -main []
  (let [port (Integer/parseInt (get (System/getenv) "PORT" "8080"))]
    (-> handler
        (wrap-params)
        (wrap-store (AtomStore. (atom {})))
        (run-jetty {:port port}))))
Part 2: Reaping the benefits
So what did we gain? Well, let’s see how implementing our DBStore goes. We’ll need to add implementations for TweetStore’s methods alongside the existing AtomStore implementation:
(extend-protocol TweetStore
  AtomStore
  (get-twits [this] (get @(:store this) :twits))
  (get-twit [this ii] (nth (get-twits this) ii))
  (put-twit! [this twit]
    (swap! (:store this)
           #(assoc %
                   :twits (->> (conj (or (:twits %) []) ; Grab (:twits store) or []
                                     (clean-twit twit)) ; Add the cleaned twit
                               (take MAX-TWITS)))))     ; Drop any old twits
  DBStore
  (get-twits [this]
    (query (:conn this)
           ["SELECT name, message, timestamp
             FROM twits
             ORDER BY timestamp DESC
             LIMIT 5"]))
  (get-twit [this ii]
    (first (query (:conn this)
                  ["SELECT name, message, timestamp
                    FROM twits
                    ORDER BY timestamp DESC
                    LIMIT 1 OFFSET ?" ii])))
  (put-twit! [this twit]
    (insert! (:conn this) :twits (clean-twit twit))))
Try to ignore my questionable implementation; the key point here is that we only had to add lines. We’d also have to import some stuff from clojure.java.jdbc, which I skipped over, but overall the DBStore implementation exists peacefully alongside the AtomStore one, and none of the consumers of the TweetStore API have to worry about which is which; they just pass along the datastore that they were given.
The only other change: we’ll need to adjust -main to use DBStore instead of AtomStore:
(defn -main []
  (let [port (Integer/parseInt (get (System/getenv) "PORT" "8080"))]
    (-> handler
        (wrap-params)
        (wrap-store (DBStore. {:url "INSERT DB CONNECTION PARAMETERS HERE"}))
        (run-jetty {:port port}))))
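If you’d rather not edit -main every time you switch, one option is a tiny factory that picks the backend from configuration. This is only a sketch: make-store and the :db-url key are names I made up, and the {:url ...} map just mirrors the placeholder above.

```clojure
(defrecord AtomStore [store])
(defrecord DBStore [conn])

(defn make-store
  "Pick a backend from a config map. :db-url is a hypothetical key;
  when it is absent we fall back to the in-memory store."
  [{:keys [db-url]}]
  (if db-url
    (->DBStore {:url db-url})
    (->AtomStore (atom {}))))

;; Development and tests: no database configured.
(make-store {})

;; Production: pass in whatever connection string your setup uses.
(make-store {:db-url "INSERT DB CONNECTION PARAMETERS HERE"})
```

The result can be handed to wrap-store exactly as before, since the handlers never look at which record they received.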
And with that, we have a swappable SQL implementation of our data store. You can see how easy it would be to add more; although one might start questioning the sense of writing multiple datastore implementations all at once, it must be comforting to imagine that, should you make a wrong decision, you can easily change storage backends without having to change existing code.
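The “still-simpler mock” I mentioned earlier falls out of the same pattern. Here is a sketch of a read-only store for view tests; FixedStore and twit-count-page are hypothetical names, and twit-count-page is only a stand-in for a real view like GET-index.

```clojure
(defprotocol TweetStore
  (put-twit! [this twit])
  (get-twit [this ii])
  (get-twits [this]))

;; A read-only store backed by a fixed vector: handy for view tests.
(defrecord FixedStore [twits]
  TweetStore
  (get-twits [this] twits)
  (get-twit [this ii] (nth twits ii))
  (put-twit! [this twit] this)) ; writes are silently ignored

;; Hypothetical view helper standing in for the post's GET-index.
(defn twit-count-page [{store :store}]
  {:status 200
   :headers {}
   :body (str (count (get-twits store)) " twits")})

(def fake (->FixedStore [{:name "ann" :message "hello"}]))

;; The view is exercised with a hand-built request map: no server, no DB.
(twit-count-page {:store fake})
```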
This is a very common approach in Java, of course; write an interface for the datastore, and different implementations. One major difference is that, using protocols, we can separate different parts of the program into different chunks (e.g. protocols for UserStore, PostStore, CommentStore for a blog), without having to explicitly compose them into some super-interface. And, of course, we get to do it in Clojure, which I think requires no explaining.
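Here is roughly what that looks like for the blog example, as a sketch. The method names, the BlogAtomStore record, and the sample data are all invented for illustration.

```clojure
;; Two small, independent protocols; no super-interface ties them together.
(defprotocol UserStore
  (get-user [this id]))

(defprotocol PostStore
  (get-posts [this user-id]))

;; One record can satisfy both, each protocol listed separately.
(defrecord BlogAtomStore [store]
  UserStore
  (get-user [this id] (get-in @store [:users id]))
  PostStore
  (get-posts [this user-id]
    (filter #(= user-id (:author %)) (:posts @store))))

(def blog
  (->BlogAtomStore
   (atom {:users {1 {:name "ann"}}
          :posts [{:author 1 :title "Hello"}
                  {:author 2 :title "Other"}]})))

(get-user blog 1)
(map :title (get-posts blog 1))

;; Code that only needs users can check for (or assume) just that protocol.
(satisfies? UserStore blog)
```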
There are many other problems that can benefit from this pattern; the example from Stuart’s talk involved an EmailService, a DBService, and a CustomerService which depends on both the db and the email (a dependency problem which the Component library aims to solve).
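To give a feel for the shape of that dependency problem, here is a hand-wired sketch in plain Clojure. It deliberately does not use the Component API; the method names, the FakeEmail and MapDB records, and notify-customer! are all my own inventions.

```clojure
(defprotocol EmailService
  (send-email! [this to body]))

(defprotocol DBService
  (find-customer [this id]))

;; In-memory stand-ins for the two leaf services.
(defrecord FakeEmail [outbox]
  EmailService
  (send-email! [this to body]
    (swap! outbox conj {:to to :body body})))

(defrecord MapDB [customers]
  DBService
  (find-customer [this id] (get customers id)))

;; CustomerService holds its dependencies as plain fields.
(defrecord CustomerService [db email])

(defn notify-customer! [{:keys [db email]} id message]
  (when-let [customer (find-customer db id)]
    (send-email! email (:email customer) message)))

;; Wiring by hand: build the leaves first, then whatever depends on them.
(def system
  (let [db    (->MapDB {42 {:email "ann@example.com"}})
        email (->FakeEmail (atom []))]
    {:db db :email email :customers (->CustomerService db email)}))

(notify-customer! (:customers system) 42 "Welcome!")
@(:outbox (:email system))
```

With three services the ordering is easy to manage by hand; it is when the graph grows that a library to manage it starts to earn its keep.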
I encourage you to think of places where you’re perhaps defining things like static globals, and consider whether some restructuring now could save you a lot of trouble later.