A Clojure / Datomic Web App Tutorial: Part 1
Setup and Data API
Welcome! Let's assume the author is friendly and welcoming, encouraging you to move along and grow as a software developer and that we're here to help. None of that's true, but it seems like the kind of things that should go into a paragraph at the beginning of a tutorial. On to the work!
Motivation
There are several tutorials in play about building web applications in Clojure. If you already know what you're doing, this may be a remedial exercise. Likewise, there are a few tutorials around getting up and running with Datomic.
My problem with the vast majority is that they seem to be written for people who don't need a tutorial, and by and large all have the stench of "read the code and you'll understand," quickly coupled with "and if you don't, you're not smart enough to use this technology anyway." A git repo isn't a tutorial, and neither are API docs. Those are a way of telling your users to "fuck off, you're busy" while handing them just enough that you can claim that you pandered to the community.
So there's this. I hope you enjoy it.
Audience
This series targets those who are familiar enough with Clojure that we won't need to go through syntax details, focusing instead on the nitty gritty of getting a web-facing application backed by Datomic designed and shipped.
If you are new to programming, Clojure or both, I give Kyle Kingsbury's 'Clojure from the ground up' series two enthusiastic thumbs up. I'd give it more praise if I had more thumbs.
If you'd like a readable overview of what Datomic is and does, Daniel Higginbotham's 'Datomic for Five-Year-Olds' is a good (if a little dated) start. You might follow up with datomic.com and docs.datomic.com, or you can just keep reading.
Goals
The litany of starting applications for learning a new web development platform is long. Blogs are popular, todo lists have made a big splash. Lately, it's the url shortener. We'll go with that one, with the following goals:
- explain how to develop web applications w/ clojure and datomic
- shorten urls (targets) into a shorter form (called a 'slug')
- generate a slug if none is specified
- store the shortened urls in Datomic
- collect data on traffic (ip address, etc.)
- accept new urls in a nice UI
- display traffic data in a nice UI
- authenticate users
- authenticate registered applications with an API key
- be fast
- not get murdered
This is an ambitious list. It has enough complexity to make it useful rather than just demonstrating toy code. We'll also be evolving it organically together—instead of dumping a complete thing on you and explaining how we did it, we'll walk through each bit, what decisions were made, and then move on to the next thing.
As a for instance, we'll start with a compojure app backed by a data structure in memory. We'll move on to store the data in datomic, then write a new data structure for tracking information. After that, we'll build a neat UI. Etcetera.
Assumptions
We have some assumptions built in: First, that you're either running a unix-like OS, or are prepared to deal with the overhead of trying to do this on Windows without my help. Second, you should have lein and Java installed. If you don't, go here. Third, this isn't your first rodeo, and you have at least a rudimentary grasp of how the sausage gets made–we're not launching rockets here, but we also won't be stopping to explain what an if does.
Thanks
I'd like to extend some thanks to Bobby Calderwood [1], Alex Miller [2], and Lake Denman [3] for helping out with this series.
Part 1: Hello!
First, a confession: I started this project six months ago in a hotel room in Durham, NC to get my head around Compojure and a few other things. Then I put it down for a while, and some of the original libraries I used got stale. I've updated them for the purpose of this post, but you might notice some discrepancy between the 'early code' and the final product. Work with me, here.
Let's start by saying that building everything from scratch makes my head hurt. Let's keep it simple and get something working that we can build on with a minimum of fuss.
Lein, and the project.clj file
I begin this little trip using @technomancy's lein-heroku template:
lein new heroku crisco
My intention was to deploy a first version to Heroku (which I did) with something basic to build on. Our lein template gives us a good headstart and a project structure to play with, beginning with the project.clj file:
(defproject crisco "1.0.0-SNAPSHOT"
  :description "a project for url shortening"
  :url "http://github.com/bvandgrift/crisco-compojure"
  :dependencies [[org.clojure/clojure "1.6.0"]
                 [compojure "1.1.8"]
                 [ring/ring-jetty-adapter "1.2.2"]
                 [ring/ring-devel "1.2.2"]
                 [environ "0.5.0"]]
  :min-lein-version "2.0.0"
  :plugins [[environ/environ.lein "0.2.1"]
            [lein-ring "0.8.0"]]
  :hooks [environ.leiningen.hooks]
  :uberjar-name "crisco-standalone.jar"
  :profiles {:production {:env {:production true}}}
  :ring {:handler crisco.web/app})
The project.clj file tells lein everything it needs to know to load dependencies, compile, jar, and eventually deploy your project. The leiningen docs do a fair job of letting you know what's what, but let's talk about what's in front of us.
This is, naturally, a clojure file that defines a project (defproject). Based on the contents of the map-like configuration structure, you can tell lein to do different things.
The :dependencies keyword contains a vector of vectors, each of which contains a dependency we'll require. Those you see listed here were generated by lein-heroku when I started up, as were a few others I've yanked because we won't need them. We require clojure of course, and then:
- ring is a popular framework for building web apps in clojure. It abstracts the dirty HTTP details, much like Rack does in ruby.
- compojure is a lightweight routing abstraction that sits on top of ring. It is similar in some respects to Sinatra.
- environ is a library for incorporating a cascade of configuration files. We won't be using that up front, but I'm leaving it in because odds are good we'll need it before long. It's included with heroku's loadout since heroku wants us to store everything in environment variables.
There's some housekeeping with :min-lein-version and :uberjar-name.
The :plugins keyword indicates which lein plugins we'll be taking advantage of. You see environ in there again, but also lein-ring. Lein-ring will give us a few tools while developing that we'll find useful. In particular, it lets us run a web server that keeps current with changes to our files.
We can set up :profiles with different runtime configurations; that'll be important when we're running in a production environment.
Finally, :ring contains the lein-ring plugin configuration. Right now, we just need to let it know what handler it will be using to handle incoming requests. {:handler crisco.web/app} is what the template handed us, so let's work with that.
web.clj: Let's Begin
Our template created this file for us. It has more than we need at the moment, but let's look at the moving parts:
(ns crisco.web
  (:require [compojure.core :refer [defroutes GET PUT POST DELETE ANY]]
            [compojure.handler :refer [site]]
            [compojure.route :as route]
            [clojure.java.io :as io]
            [ring.middleware.stacktrace :as trace]
            [ring.middleware.params :as p]
            [ring.adapter.jetty :as jetty]
            [environ.core :refer [env]]))
We define our namespace with ns crisco.web, and require a few libraries to help us along. I'm only displaying the libraries we strictly need.
- compojure.core/handler/route are the elements from compojure we'll be using to keep our routing reasonable.
- clojure.java.io makes an appearance in case we want to read any files. Spoiler alert: we do.
- ring.middleware.stacktrace/params allow us reasonable stack traces when something explodes, and the ability to cleanly handle http params, respectively.
- ring.adapter.jetty interacts with jetty. We'll need this when we want to deploy somewhere or run a standalone server.
- environ.core gives us access to our environment.
Next up are the routes we'll be using. This next section is primarily what compojure brings to the party. Without compojure to handle the routing for ring—well, let's just say this is much nicer.
(defroutes app
  (GET "/" []
    {:status 200
     :headers {"Content-Type" "text/plain"}
     :body (pr-str ["Hello" :from 'Heroku])})
  (ANY "*" []
    (route/not-found (slurp (io/resource "404.html")))))
So with "/" we get 'Hello from Heroku' (printed as the plain-text vector ["Hello" :from Heroku]), and anything else is a 404. You can see that the ANY route reads in from a resource file, which gives us a leg up on creating a real index page. Let's replace the GET "/" route:
(GET "/" []
  {:status 200
   ;; we're serving an HTML file now, so the content type changes too
   :headers {"Content-Type" "text/html"}
   :body (slurp (io/resource "index.html"))})
This will read into the response body the contents of the index.html file in the resources/ directory. io/resource looks the file up on the classpath and returns its URL, and slurp reads everything at that URL into a string. We could've also done:
(GET "/" []
  {:status 200
   :headers {"Content-Type" "text/html"}
   :body (-> "index.html"
             io/resource
             slurp)})
That might be more idiomatic, but it's a short enough list that I don't think we lose anything by leaving it inline.
What compojure is doing behind the scenes here is creating handler functions that review the incoming requests and dispatch them appropriately. It will also make our params easy to integrate, once we have a few.
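To make "review the incoming requests and dispatch them" concrete, here's a rough sketch of doing the same job by hand. This is not compojure's actual implementation; manual-app, hello-handler, and not-found-handler are names I made up for illustration. The only thing it assumes is what ring gives us: a request is a map, and a handler is a function that returns a response map.

```clojure
;; Hypothetical hand-rolled routing, for illustration only.
;; A ring handler is a function: request map in, response map out.
(defn hello-handler [request]
  {:status 200
   :headers {"Content-Type" "text/plain"}
   :body "Hello"})

(defn not-found-handler [request]
  {:status 404
   :headers {"Content-Type" "text/plain"}
   :body "Not found"})

(defn manual-app [request]
  ;; Dispatch on the two things a route matches: the method and the path.
  (let [{:keys [request-method uri]} request]
    (if (and (= request-method :get) (= uri "/"))
      (hello-handler request)
      (not-found-handler request))))

;; Handlers are plain functions, so we can call them with a bare map:
(:status (manual-app {:request-method :get :uri "/"}))     ; 200
(:status (manual-app {:request-method :get :uri "/nope"})) ; 404
```

Every new route would mean another branch in that if, which is the tedium defroutes saves us from.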
The next few lines set up our request handler:
(defn wrap-error-page [handler]
  (fn [req]
    (try (handler req)
         (catch Exception e
           {:status 500
            :headers {"Content-Type" "text/html"}
            :body (slurp (io/resource "500.html"))}))))

(defn wrap-app [app]
  (-> app
      ((if (env :production)
         wrap-error-page
         trace/wrap-stacktrace))))

(defn -main [& [port]]
  (let [port (Integer. (or port (env :port) 5000))]
    (jetty/run-jetty (wrap-app #'app) {:port port :join? false})))
The helper function wrap-error-page does what you think: it catches any exceptions thrown by our handlers.
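If the wrapper pattern is new to you, here's a stripped-down sketch you can paste into a REPL without any of the project's dependencies. The names exploding-handler, wrap-catch-all, and safe-handler are hypothetical, and the plain-text body stands in for the 500.html resource. The shape is the point: a wrapper takes a handler and returns a new handler.

```clojure
;; Hypothetical stand-ins; no ring or compojure required.
(defn exploding-handler [request]
  (throw (Exception. "boom")))

(defn wrap-catch-all [handler]
  ;; Takes a handler, returns a new handler with a try/catch around it.
  (fn [request]
    (try
      (handler request)
      (catch Exception e
        {:status 500
         :headers {"Content-Type" "text/plain"}
         :body "Something went wrong"}))))

(def safe-handler (wrap-catch-all exploding-handler))

;; The exception never escapes; the caller just sees a 500 response.
(:status (safe-handler {:request-method :get :uri "/"})) ; 500
```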
Next up, wrap-app [app] applies all the ring wrappers an app wants to use to the 'app' handler, created by defroutes. Remember that we used a heroku template for this, and heroku will run using jetty, as defined in the -main function. This is great if we're running that way, but since we won't be bootstrapping that way while developing, we'll need another way to get everything tied in. It's not incredibly important at the moment, but you should be aware of it.
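Finally, -main. The '-' prefix is the naming convention for functions that back Java methods, and -main in particular is the entry point (obviously): it's the function lein run calls, and the one that becomes the static main method if the namespace is compiled to a class. As such, it can be run as a standalone application by the JVM. This runs jetty on the port we specify (or 5000), and sets up the handler function that compojure created with defroutes to respond to all requests.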
All in all, this file doesn't do much that's related to the application itself—it's mostly focused on setting up the structure the application can run in. Now that we're at the bottom of it, let's fire it up. From your project directory:
lein ring server-headless
The first time you run this (or lein deps), the project's dependencies are downloaded, and then your service starts on http://localhost:3000. Assuming you remembered to actually create an index.html file in your resources/ directory, you should be able to view it in the browser.
We get the lein ring server-headless task from the lein-ring plugin that we set up in our project.clj file. That's the shortest distance to getting to work. Another alternative would be lein run or lein trampoline run, but they require adding :main crisco.web to the end of your project.clj file.
Are you following along? Want to check in and see how you're doing? You can find the code at this point tagged with 'clean-slate' in the crisco-compojure github repo.
Data API, Version 1
They grow up so fast. Okay, first, we want to be able to relate a slug, that is, a shortened url, to a url target. Our goal is for http://localhost:3000/gh to hit http://github.com. The best way to do this in the proper HTTP world is to issue a status of 301, along with a "Location" header set with the new location. Something like:
HTTP/1.1 301 Moved Permanently
Date: Wed, 28 May 2014 10:46:01 GMT
Location: http://github.com
Content-Length: 0
Server: Jetty(7.6.8.v20121106)
Easy enough; we'll add that route to our defroutes in web.clj:
(GET "/gh" []
  {:status 301
   :headers {"Location" "http://github.com"}})
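Since we'll be building this same response map more than once, here's a small sketch of pulling it into a function. The name redirect-to is my own invention, not something from the template or from ring, and the empty body is an assumption to match the Content-Length: 0 in the response above.

```clojure
;; Hypothetical helper: build a permanent-redirect response for any url.
(defn redirect-to [url]
  {:status 301
   :headers {"Location" url}
   :body ""})

(redirect-to "http://github.com")
;; a map with :status 301 and a "Location" header of "http://github.com"
```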
Github isn't the only shortened url we want, however. We'd like to be able to start the server with a few, then add to them.
Entities: Data
Let's consider our data options: each slug maps uniquely to a target url. As such, a map {} would work nicely. If we wanted to keep track of the number of times a particular slug has been visited (listed in our goals from above), then we can't just use a simple map, we need a nested data structure:
(def urls (atom {:gh {:target "http://github.com" :redirects 2}
                 :gg {:target "http://google.com" :redirects 1}}))
While we don't expect to be running any heavily-loaded operations in development, we should still do the right thing and prepare our data for concurrent access. For that, we'll use an atom.
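If you want to see what the atom buys us, here's a quick experiment to run in a REPL. The names demo-urls and bump-redirects! are hypothetical and live outside the app. A hundred threads each bump the same counter, and because swap! retries on conflict, none of the increments are lost.

```clojure
;; Hypothetical demo, separate from the app's own urls atom.
(def demo-urls (atom {:gh {:target "http://github.com" :redirects 0}}))

(defn bump-redirects! [slug]
  ;; swap! applies update-in atomically, retrying if another thread got in first.
  (swap! demo-urls update-in [slug :redirects] inc))

;; Start 100 threads that each increment once, then wait for all of them.
(let [threads (doall (repeatedly 100 #(Thread. ^Runnable (fn [] (bump-redirects! :gh)))))]
  (doseq [^Thread t threads] (.start t))
  (doseq [^Thread t threads] (.join t)))

(get-in @demo-urls [:gh :redirects]) ; 100
```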
We could go the extra mile and define a record for our slugs, but that might be overkill, and we'd still need a hash for easy access:
(defrecord Slug [slug target redirects])
(def urls (atom {:gh (->Slug :gh "http://github.com" 2)
                 :gg (->Slug :gg "http://google.com" 1)}))
Let's stick with a map.
Transformations: Functions
Now, what functions will be operating on our list of urls?
- given a slug and a target, if the slug isn't already used, then create an entry in our urls list with 0 redirects. otherwise, bail.
- given a slug, retrieve its target and update its redirect count
- there is no third thing
Easy enough! To store the slug in our list, store-slug! [slug target] will do. The bang (!) at the end of the function name indicates we'll be making a lasting change to the application state. For the redirect, request-redirect! [slug] seems right. Again, we're changing our application state, so the bang is recommended.
What we've described so far actually has nothing to do with web anything, which seems to beg for its own namespace. Let's call it crisco.data, since we know we'll be adding a persistence mechanism later. Here's what it might look like:
(ns crisco.data)

;; slug keyword -> {:target url, :redirects count}
(def ^:private urls
  (atom {:gh {:target "http://github.com" :redirects 0}
         :gg {:target "http://google.com" :redirects 0}}))

(defn- get-target [slug]
  (get-in @urls [(keyword slug) :target]))

;; Returns the updated map, or nil if the slug is already taken.
(defn store-slug! [slug target]
  (if-not (get-target slug)
    (swap! urls #(assoc %1 (keyword slug) {:target target :redirects 0}))))

;; Bumps the redirect count and returns the target url, or nil if the slug is unknown.
(defn request-redirect! [slug]
  (when-let [target (get-target slug)]
    (swap! urls #(update-in %1 [(keyword slug) :redirects] (fnil inc 0)))
    target))
I've added a private function (denoted by the '-' after defn, obviously) called get-target, since we are using that functionality twice. I've also made urls private, since we should only be interacting with it via our data API.
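One wrinkle worth knowing about: store-slug! checks for the slug and then swaps as two separate steps, so in principle two simultaneous requests for the same slug could both pass the check. That won't bite us in development, but here's a sketch of a variant that does the check inside the swap!. The name store-slug-atomically! is hypothetical, and it takes the atom as an argument so you can try it in a REPL without touching crisco.data.

```clojure
;; Hypothetical variant: the "is it taken?" check happens inside swap!.
(defn store-slug-atomically! [urls-atom slug target]
  (let [k     (keyword slug)
        entry {:target target :redirects 0}
        after (swap! urls-atom
                     (fn [m] (if (contains? m k) m (assoc m k entry))))]
    ;; If the entry under k is the very object we just built, we won the slot.
    (when (identical? entry (get after k))
      after)))

(def demo (atom {:gh {:target "http://github.com" :redirects 0}}))

(store-slug-atomically! demo "gh" "http://example.com") ; nil, :gh is taken
(store-slug-atomically! demo "me" "http://example.com") ; the updated map
```

Like store-slug!, it returns the updated map on success and nil otherwise, so it could slot into the same if in a route.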
We can test things out using lein repl:
crisco.web=> (require '[crisco.data :as data])
nil
crisco.web=> (data/request-redirect! "gh")
"http://github.com"
crisco.web=> (data/store-slug! "me" "http://ben.vandgrift.com")
{:gh {:redirects 1, :target "http://github.com"},
:gg {:redirects 0, :target "http://google.com"},
:me {:target "http://ben.vandgrift.com", :redirects 0}}
crisco.web=> (data/request-redirect! "me")
"http://ben.vandgrift.com"
Using the Data API from web.clj
Let's tie this in to some web functionality. In web.clj, we'll add our data api to the required list:
(ns crisco.web
  (:require ;; ...
            [crisco.data :as data]))
Next we'll drop two new routes into our routes list:
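(GET "/:slug" [slug]
  ;; an unknown slug returns nil, which lets the request fall through to the 404 route
  (when-let [target (data/request-redirect! slug)]
    {:status 301
     :headers {"Location" target}}))
(POST "/shorten/:slug" [slug target]
  (if (data/store-slug! slug target)
    {:status 200}
    {:status 409}))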
In order to properly parse params when running lein ring, we need to wrap our app handler in wrap-params. First, we change defroutes app to defroutes routes, then define app as the wrapped handler:
(def app (-> routes
             p/wrap-params))
And that's it. Run lein ring server-headless and off you go. We don't have a UI yet, so we'll be using curl to post new slugs:
curl -i -d target=http://ben.vandgrift.com http://localhost:3000/shorten/bv
Looks like everything's working a-ok. Following along?
That's it for Part 1. In Part 2 (forthcoming) we'll talk persistence, and add Datomic into the mix. Right now, though, some Q/A.
Q&A
If you have questions, tweet/email/etc.
"Why are you writing another tutorial about building web apps in clojure?"
People keep asking me how, so either the existing tutorials are hard to find, or they're not answering the questions in a way the audience understands.
"Why Datomic and not another database/data store/kv store…?"
Two reasons: First, learning Datomic became a requirement of my job function. Second, it turns out it's good at quite a few things, one of which is handling the kind of data this app will be generating.
"Why are you using ABC library instead of XYZ library?"
The technologies chosen at particular stages of development are those that appeal to me, enjoy widespread use, and are well-documented enough to help you out if (when) I forget to explain things.
"do you have something against capital letters?"
I studied (among other things) poetry in college, and went through an e e cummings phase. My default writing style doesn't include capitalization. For you, dear reader, I'm making an effort.
"Is your code idiomatic?"
Probably not, but I find some of the quirks of Clojure's idiomacy irritating and hard to read.
"Do you speak for your employer?"
Don't be silly, this is the Internet. If it was their opinion, it would be on their website.
Footnotes
- Alex Miller (puredanger, puredanger, tech.puredanger.com)