A Clojure / Datomic Web App Tutorial: Part 1

Setup and Data API

Welcome! Let's assume the author is friendly and welcoming, encouraging you to move along and grow as a software developer and that we're here to help. None of that's true, but it seems like the kind of things that should go into a paragraph at the beginning of a tutorial. On to the work!

Motivation

There are several tutorials in play about building web applications in Clojure. If you already know what you're doing, this may be a remedial exercise. Likewise, there are a few tutorials around getting up and running with Datomic.

My problem with the vast majority is that they seem to be written for people who don't need a tutorial, and by and large all have the stench of "read the code and you'll understand", quickly coupled with "and if you don't, you're not smart enough to use this technology anyway". A git repo isn't a tutorial, and neither are API docs. Those are a way of telling your users to "fuck off, you're busy" while handing them just enough that you can claim that you pandered to the community.

So there's this. I hope you enjoy it.

Audience

This series targets those who are familiar enough with Clojure that we won't need to go through syntax details, focusing instead on the nitty gritty of getting a web-facing application backed by Datomic designed and shipped.

If you are new to programming, Clojure or both, I give Kyle Kingsbury's 'Clojure from the ground up' series two enthusiastic thumbs up. I'd give it more praise if I had more thumbs.

If you'd like a readable overview of what Datomic is and does, Daniel Higginbotham's 'Datomic for Five-Year-Olds' is a good (if a little dated) start. You might follow up with datomic.com and docs.datomic.com, or you can just keep reading.

Goals

The litany of starting applications for learning a new web development platform is long. Blogs are popular, todo lists have made a big splash. Lately, it's the url shortener. We'll go with that one, with the following goals:

explain how to develop web applications w/ clojure and datomic
shorten urls (targets) into a shorter form (called a 'slug')
generate a slug if none is specified
store the shortened urls in Datomic
collect data on traffic (ip address, etc.)
accept new urls in a nice UI
display traffic data in a nice UI
authenticate users
authenticate registered applications with an API key
be fast
not get murdered

This is an ambitious list. It has enough complexity to make it useful rather than just demonstrating toy code. We'll also be evolving it organically together—instead of dumping a complete thing on you and explaining how we did it, we'll walk through each bit, what decisions were made, and then move on to the next thing.

As a for instance, we'll start with a data structure in memory, served by routes written with compojure. We'll move on to store the data in datomic, then write a new data structure for tracking information. After that, we'll build a neat UI. Etcetera.

Assumptions

We have some assumptions built in. First, that you're either running a unix-like OS, or are prepared to deal with the overhead of trying to do this on Windows without my help. Second, you should have lein and Java installed. If you don't, install them before going any further. Third, this isn't your first rodeo, and you have at least a rudimentary grasp of how the sausage gets made. We're not launching rockets here, but we also won't be stopping to explain what an if does.

Thanks

I'd like to extend some thanks to Bobby Calderwood¹, Alex Miller², and Lake Denman³ for helping out with this series.

Part 1: Hello!

First, a confession: I started this project six months ago in a hotel room in Durham, NC to get my head around Compojure and a few other things. Then I put it down for a while, and some of the original libraries I used got stale. I've updated them for the purpose of this post, but you might notice some discrepancy between the 'early code' and the final product. Work with me, here.

Building everything from scratch makes my head hurt, so let's start simply: get something working that we can build on, with a minimum of fuss.

Lein, and the `project.clj` file

I began this little trip using @technomancy's lein-heroku template:

lein new heroku crisco

My intention was to deploy a first version to Heroku (which I did) with something basic to build on. Our lein template gives us a good headstart and a project structure to play with, beginning with the project.clj file:

(defproject crisco "1.0.0-SNAPSHOT"
  :description "a project for url shortening"
  :url "http://github.com/bvandgrift/crisco-compojure"
  :dependencies [[org.clojure/clojure "1.6.0"]
                 [compojure "1.1.8"]
                 [ring/ring-jetty-adapter "1.2.2"]
                 [ring/ring-devel "1.2.2"]
                 [environ "0.5.0"]]
  :min-lein-version "2.0.0"
  :plugins [[environ/environ.lein "0.2.1"]
            [lein-ring "0.8.0"]]
  :hooks [environ.leiningen.hooks]
  :uberjar-name "crisco-standalone.jar"
  :profiles {:production {:env {:production true}}}
  :ring {:handler crisco.web/app})

The project.clj file tells lein everything it needs to know to load dependencies, compile, jar, and eventually deploy your project. The leiningen docs do a fair job of letting you know what's what, but let's talk about what's in front of us.

This is, naturally, a clojure file that defines a project (defproject). Based on the contents of the map-like configuration structure, you can tell lein to do different things.

The :dependencies keyword contains a vector of vectors, each of which contains a dependency we'll require. Those you see listed here were generated by lein-heroku when I started up, as were a few others I've yanked because we won't need them. We require clojure of course, and then:

ring is a popular framework for building web apps in clojure. It abstracts the dirty HTTP details, much like Rack does in ruby.
compojure is a lightweight routing abstraction that sits on top of ring. It is similar in some respects to Sinatra.
environ is a library for incorporating a cascade of configuration files. We won't be using that up front, but I'm leaving it in because odds are good we'll need it before long. It's included with heroku's loadout since heroku wants us to store everything in environment variables.

There's some housekeeping with :min-lein-version and :uberjar-name.

The :plugins keyword indicates which lein plugins we'll be taking advantage of. You see environ in there again, but also lein-ring. Lein-ring will give us a few tools while developing that we'll find useful. In particular, it lets us run a web server that keeps current with changes to our files.

We can set up :profiles with different runtime configurations; that'll be important when we're running in a production environment.

Finally, :ring contains the lein-ring plugin configuration. Right now, we just need to let it know what handler it will be using to handle incoming requests. {:handler crisco.web/app} is what the template handed us, so let's work with that.

web.clj: Let's Begin

Our template created this file for us. It has more than we need at the moment, but let's look at the moving parts:

(ns crisco.web
  (:require [compojure.core :refer [defroutes GET PUT POST DELETE ANY]]
            [compojure.handler :refer [site]]
            [compojure.route :as route]
            [clojure.java.io :as io]
            [ring.middleware.stacktrace :as trace]
            [ring.middleware.params :as p]
            [ring.adapter.jetty :as jetty]
            [environ.core :refer [env]]))

We define our namespace with ns crisco.web, and require a few libraries to help us along. I'm only displaying the libraries we strictly need.

compojure.core/handler/route are the elements from compojure we'll be using to keep our routing reasonable.
clojure.java.io makes an appearance in case we want to read any files. Spoiler alert: we do.
ring.middleware.stacktrace/params allow us reasonable stack traces when something explodes, and the ability to cleanly handle http params, respectively.
ring.adapter.jetty interacts with jetty. We'll need this when we want to deploy somewhere or run a standalone server.
environ.core gives us access to our environment.

Next up are the routes we'll be using. This next section is primarily what compojure brings to the party. Without compojure to handle the routing for ring—well, let's just say this is much nicer.

(defroutes app
  (GET "/" []
       {:status 200
        :headers {"Content-Type" "text/plain"}
        :body (pr-str ["Hello" :from 'Heroku])})
  (ANY "*" []
       (route/not-found (slurp (io/resource "404.html")))))

So with "/" we get a plain-text hello from Heroku (the printed vector ["Hello" :from Heroku]), and anything else is a 404. You can see that the ANY route reads in from a resource file, which gives us a leg up on creating a real index page. Let's replace the GET "/" route:

(GET "/" []
     {:status 200
      ;; we are serving an HTML file now, so say so
      :headers {"Content-Type" "text/html"}
      :body (slurp (io/resource "index.html"))})

This will read into the response body the contents of the index.html file in the resources/ directory. io/resource looks the file up on the classpath (which includes resources/) and returns a URL for it, and slurp reads everything at that URL into a string. We could've also done:

(GET "/" []
     {:status 200
      :headers {"Content-Type" "text/html"}
      :body (-> "index.html"
                io/resource
                slurp)})

That might be more idiomatic, but it's a short enough list that I don't think we lose anything by leaving it inline.
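
One thing worth knowing about the io/resource and slurp pairing: io/resource returns nil when the file isn't on the classpath, and slurp throws when handed nil. Here's a small sketch of a guard for that. The name slurp-resource is mine, not something from the template, and the example leans on clojure/core.clj being on the classpath because Clojure itself ships it.

```clojure
(require '[clojure.java.io :as io])

(defn slurp-resource
  "Hypothetical helper: contents of a classpath resource, or nil if missing."
  [path]
  (when-let [url (io/resource path)]
    (slurp url)))

;; io/resource hands back a URL, not a stream or a string
(instance? java.net.URL (io/resource "clojure/core.clj"))

;; a missing resource gives nil instead of an exception
(slurp-resource "no-such-file.html")
```

Whether you'd rather have the nil or the exception is a judgment call; while developing, the exception is arguably the louder and more useful of the two.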

What compojure is doing behind the scenes here is creating handler functions that review the incoming requests and dispatch them appropriately. It will also make our params easy to integrate, once we have a few.
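
To take some of the magic out of that: a ring handler is just a function from a request map to a response map. What follows is a hand-rolled stand-in for the dispatch compojure generates, not compojure's actual output. The names hello-handler, not-found-handler, and dispatch are made up for the illustration.

```clojure
(defn hello-handler [request]
  {:status 200
   :headers {"Content-Type" "text/plain"}
   :body "hello"})

(defn not-found-handler [request]
  {:status 404
   :headers {"Content-Type" "text/plain"}
   :body "not found"})

(defn dispatch
  "Picks a handler by looking at the request method and path."
  [request]
  (if (= [:get "/"] [(:request-method request) (:uri request)])
    (hello-handler request)
    (not-found-handler request)))

;; requests are plain maps, so we can call the handler directly
(:status (dispatch {:request-method :get :uri "/"}))     ; 200
(:status (dispatch {:request-method :get :uri "/nope"})) ; 404
```

Two routes by hand is tolerable. Twenty, with path parameters, is where defroutes earns its keep.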

The next few lines set up our request handler:

(defn wrap-error-page [handler]
  (fn [req]
    (try (handler req)
         (catch Exception e
           {:status 500
            :headers {"Content-Type" "text/html"}
            :body (slurp (io/resource "500.html"))}))))

(defn wrap-app [app]
  (-> app
      ((if (env :production)
           wrap-error-page
           trace/wrap-stacktrace))))

(defn -main [& [port]]
  (let [port (Integer. (or port (env :port) 5000))]
    (jetty/run-jetty (wrap-app #'app) {:port port :join? false})))

The helper function wrap-error-page does what you think: it catches any exceptions thrown by our handlers, and responds with a 500 status and the contents of 500.html instead.

Next up, wrap-app [app] applies all the ring wrappers an app wants to use to the 'app' handler, created by defroutes. Remember that we used a heroku template for this, and heroku will run using jetty, as defined in the -main function. This is great if we're running that way, but since we won't be bootstrapping that way while developing, we'll need another way to get everything tied in. It's not incredibly important at the moment, but you should be aware of it.
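
If the wrapper idea is new to you, here's the pattern in isolation: a function that takes a handler and returns a new handler. This is a sketch, not the template's code. wrap-fallback is a hypothetical cousin of wrap-error-page with an inline body so it runs without a 500.html, and the two little handlers exist only to exercise it.

```clojure
(defn wrap-fallback
  "Hypothetical middleware: turns any exception into a plain 500 response."
  [handler]
  (fn [req]
    (try (handler req)
         (catch Exception e
           {:status 500
            :headers {"Content-Type" "text/plain"}
            :body "something broke"}))))

(defn ok-handler [req]
  {:status 200 :headers {} :body "fine"})

(defn exploding-handler [req]
  (throw (ex-info "boom" {})))

;; a healthy handler passes through untouched
(:status ((wrap-fallback ok-handler) {}))        ; 200

;; a throwing handler is caught and replaced with the fallback response
(:status ((wrap-fallback exploding-handler) {})) ; 500
```

The odd-looking double parens in wrap-app are the same pattern: the if picks one of two wrappers, and the threading macro then calls whichever one it picked with app as the argument.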

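Finally, -main. The '-' prefix is the convention gen-class uses to find the functions that implement a class's methods, so when the namespace is compiled with :gen-class, -main becomes the class's static main method and the JVM can run it as a standalone application. It's also the function lein run looks for. This runs jetty on the port we specify (or 5000), and sets up the handler function that compojure created with defroutes to respond to all requests.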

All in all, this file doesn't do much that's related to the application itself—it's mostly focused on setting up the structure the application can run in. Now that we're at the bottom of it, let's fire it up. From your project directory:

lein ring server-headless

The first time you run this (or lein deps), the project's dependencies are downloaded, and then your service starts on http://localhost:3000. Assuming you remembered to actually create an index.html file in your resources/ directory, you should be able to view it in the browser.

We get the lein ring server-headless task from the lein-ring plugin that we set up in our project.clj file. That's the shortest distance to getting to work. An alternative would be lein run or lein trampoline run, but they require adding :main crisco.web to the end of your project.clj file (or passing -m crisco.web on the command line).
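
If you do want the lein run route, here's a sketch of where :main would sit in the project.clj shown earlier. Everything except the last key is the existing file, elided with a comment rather than repeated.

```clojure
(defproject crisco "1.0.0-SNAPSHOT"
  ;; ... :description, :dependencies, :plugins, etc. unchanged ...
  :ring {:handler crisco.web/app}
  ;; tells lein run which namespace holds -main
  :main crisco.web)
```

Keep in mind that -main starts jetty on port 5000 by default, not the 3000 that lein-ring uses.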

Are you following along? Want to check in and see how you're doing? You can find the code at this point tagged with 'clean-slate' in the crisco-compojure github repo.

Data API, Version 1

They grow up so fast. Okay, first, we want to be able to relate a slug, that is, a shortened url, to a url target. Our goal is for http://localhost:3000/gh to hit http://github.com. The best way to do this in the proper HTTP world is to issue a status of 301, along with a "Location" header set with the new location. Something like:

HTTP/1.1 301 Moved Permanently
Date: Wed, 28 May 2014 10:46:01 GMT
Location: http://github.com
Content-Length: 0
Server: Jetty(7.6.8.v20121106)

Easy enough; we'll add that route to our defroutes in web.clj:

(GET "/gh" []
     {:status 301
      :headers {"Location" "http://github.com"}})
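
Since we're about to build a lot of these, the response map can be pulled into a tiny function. This is only a sketch: moved-permanently is a name I'm making up here, and the rest of this post keeps writing the map inline.

```clojure
(defn moved-permanently
  "Hypothetical helper: a ring response map for a 301 redirect to url."
  [url]
  {:status 301
   :headers {"Location" url}
   :body ""})

(moved-permanently "http://github.com")
;; => {:status 301, :headers {"Location" "http://github.com"}, :body ""}
```

The empty :body lines up with the Content-Length: 0 in the raw response shown earlier.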

Github isn't the only shortened url we want, however. We'd like to be able to start the server with a few, then add to them.

Entities: Data

Let's consider our data options: each slug maps uniquely to a target url. As such, a map {} would work nicely. If we wanted to keep track of the number of times a particular slug has been visited (listed in our goals from above), then we can't just use a simple map, we need a nested data structure:

(def urls (atom {:gh {:target "http://github.com" :redirects 2}
                 :gg {:target "http://google.com" :redirects 1}}))

While we don't expect to be running any heavily-loaded operations in development, we should still do the right thing and prepare our data for concurrent access. For that, we'll use an atom.
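
If you'd like to see what the atom buys us, here's a sketch that hammers a counter from a hundred threads. The names hits and record-hit! are invented for this demonstration and aren't part of the app.

```clojure
(def hits (atom {:gh {:target "http://github.com" :redirects 0}}))

(defn record-hit!
  "Atomically bumps the redirect count for slug."
  [slug]
  (swap! hits update-in [slug :redirects] inc))

;; start 100 threads that each record one hit, then wait for all of them
(let [workers (doall (repeatedly 100 #(Thread. ^Runnable (fn [] (record-hit! :gh)))))]
  (doseq [^Thread t workers] (.start t))
  (doseq [^Thread t workers] (.join t)))

(get-in @hits [:gh :redirects]) ; 100
```

swap! retries the update whenever another thread got in first, so no increment is lost. That's the property we're after once real traffic shows up.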

We could go the extra mile and define a record for our slugs, but that might be overkill, and we'd still need a hash for easy access:

(defrecord Slug [slug target redirects])
(def urls (atom {:gh (->Slug :gh "http://github.com" 2)
                 :gg (->Slug :gg "http://google.com" 1)}))

Let's stick with a map.

Transformations: Functions

Now, what functions will be operating on our list of urls?

given a slug and a target, if the slug isn't already used, then create an entry in our urls list with 0 redirects. otherwise, bail.
given a slug, retrieve its target and update its redirect count
there is no third thing

Easy enough! To store the slug in our list, store-slug! [slug target] will do. The bang (!) at the end of the function name indicates we'll be making a lasting change to the application state. For the redirect, request-redirect! [slug] seems right. Again, we're changing our application state, so the bang is recommended.

What we've described so far actually has nothing to do with web anything, which seems to beg for its own namespace. Let's call it crisco.data, since we know we'll be adding a persistence mechanism later. Here's what it might look like:

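(ns crisco.data)

;; slug keyword -> {:target url, :redirects count}; private to this namespace
(def ^:private urls
  (atom {:gh {:target "http://github.com" :redirects 0}
         :gg {:target "http://google.com" :redirects 0}}))

;; returns the target url for slug, or nil if the slug is unknown
(defn- get-target [slug]
  (get-in @urls [(keyword slug) :target]))

;; adds slug -> target with 0 redirects; returns nil if the slug is taken
(defn store-slug! [slug target]
  (when-not (get-target slug)
    (swap! urls assoc (keyword slug) {:target target :redirects 0})))

;; bumps the redirect count and returns the target, or nil if unknown
(defn request-redirect! [slug]
  (when-let [target (get-target slug)]
    (swap! urls update-in [(keyword slug) :redirects] (fnil inc 0))
    target))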

I've added a private function (denoted by the '-' after defn, obviously) called get-target, since we are using that functionality twice. I've also made urls private, since we should only be interacting with it via our data API.
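
One caveat for the concurrency-minded: store-slug! checks for the slug and then swaps as two separate steps, so two requests racing for the same slug could both pass the check. It won't bite us in development, but here's a sketch of a variant that does the check inside the swap. store-slug-once! and demo-urls are hypothetical names, it takes the atom as an argument so it can run on its own, and it returns true or false rather than the map.

```clojure
(defn store-slug-once!
  "Hypothetical variant: stores slug -> target only if the slug is free.
  Returns true if our entry was stored, false if the slug was taken."
  [urls slug target]
  (let [k       (keyword slug)
        entry   {:target target :redirects 0}
        ;; the check and the assoc happen inside one atomic swap
        updated (swap! urls (fn [m] (if (contains? m k) m (assoc m k entry))))]
    ;; our entry landed only if the stored value is the very object we built
    (identical? entry (get updated k))))

(def demo-urls (atom {:gh {:target "http://github.com" :redirects 0}}))

(store-slug-once! demo-urls "me" "http://example.com") ; true
(store-slug-once! demo-urls "gh" "http://example.org") ; false
(get-in @demo-urls [:gh :target])                      ; "http://github.com"
```

The trade-off is a slightly fussier function in exchange for a guarantee that the first writer wins.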

We can test things out using lein repl:

crisco.web=> (require '[crisco.data :as data])
nil
crisco.web=> (data/request-redirect! "gh")
"http://github.com"
crisco.web=> (data/store-slug! "me" "http://ben.vandgrift.com")
{:gh {:redirects 1, :target "http://github.com"},
 :gg {:redirects 0, :target "http://google.com"},
 :me {:target "http://ben.vandgrift.com", :redirects 0}}
crisco.web=> (data/request-redirect! "me")
"http://ben.vandgrift.com"

Using the Data API from web.clj

Let's tie this in to some web functionality. In web.clj, we'll add our data api to the required list:

(ns crisco.web
  (:require ;; ...
            [crisco.data :as data]))

Next we'll drop two new routes into our routes list:

;; place these before the ANY "*" route; a nil result falls through to it
(GET "/:slug" [slug]
     (when-let [target (data/request-redirect! slug)]
       {:status 301
        :headers {"Location" target}}))
(POST "/shorten/:slug" [slug target]
      (if (data/store-slug! slug target)
        {:status 200}
        {:status 409}))

In order to properly parse params when running lein ring, we need to wrap our app handler in wrap-params. First, we change defroutes app to defroutes routes, then define app as the wrapped handler, so that crisco.web/app still names what lein-ring and -main expect:

(def app (-> routes
             p/wrap-params))
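
To make the wrapping concrete: a middleware of this kind adds a :params map to the request before the handler sees it. The sketch below is a deliberately simplified stand-in, not ring's implementation. parse-query, wrap-simple-params, and echo are hypothetical names, and it only looks at the query string, whereas the real wrap-params also reads form-encoded POST bodies.

```clojure
(require '[clojure.string :as str])
(import 'java.net.URLDecoder)

(defn parse-query
  "Parses a query string such as a=1&b=2 into a map with string keys."
  [qs]
  (if (str/blank? qs)
    {}
    (into {}
          (for [pair (str/split qs #"&")
                :let [[k v] (str/split pair #"=" 2)]]
            [(URLDecoder/decode k "UTF-8")
             (URLDecoder/decode (or v "") "UTF-8")]))))

(defn wrap-simple-params
  "Hypothetical, simplified cousin of wrap-params: query string only."
  [handler]
  (fn [req]
    (handler (assoc req :params (parse-query (:query-string req))))))

;; a handler that just echoes back the params it was given
(def echo (wrap-simple-params (fn [req] (:params req))))

(echo {:query-string "target=http%3A%2F%2Fgithub.com"})
;; => {"target" "http://github.com"}

;; the threading form is just a function call in disguise
(macroexpand-1 '(-> routes p/wrap-params))
;; => (p/wrap-params routes)
```

With a single wrapper the threading macro is overkill, but it leaves room to stack more wrappers later without nesting parens.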

And that's it. Run lein ring server-headless and off you go. We don't have a UI yet, so we'll be using curl to post new slugs:

curl -i -d target=http://ben.vandgrift.com http://localhost:3000/shorten/bv

Looks like everything's working a-ok. Following along?

That's it for Part 1. In Part 2 (forthcoming) we'll talk persistence, and add Datomic into the mix. Right now, though, some Q/A.

Q&A

If you have questions, tweet/email/etc.

"Why are you writing another tutorial about building web apps in clojure?"

People keep asking me how, so either the existing tutorials are hard to find, or they're not answering the questions in a way the audience understands.

"Why Datomic and not another database/data store/kv store…?"

Two reasons. First, learning Datomic became a requirement of my job. Second, it turns out it's good at quite a few things, one of which is handling the kind of data this app will be generating.

"Why are you using ABC library instead of XYZ library?"

The technologies chosen at each stage of development are the ones that appeal to me, enjoy widespread use, and are well-documented enough to help you out if (when) I forget to explain things.

"do you have something against capital letters?"

I studied (among other things) poetry in college, and went through an e e cummings phase. My default writing style doesn't include capitalization. For you, dear reader, I'm making an effort.

"Is your code idiomatic?"

Probably not, but I find some of the quirks of Clojure's idiomacy irritating and hard to read.

"Do you speak for your employer?"

Don't be silly, this is the Internet. If it was their opinion, it would be on their website.

Footnotes

¹ Bobby Calderwood (bobby)
² Alex Miller (puredanger, puredanger, tech.puredanger.com)
³ Lake Denman (ldenman, l4ke)

written: Apr 24 2014