Batman’s Secret Cache

The real Batman has caches of weapons stashed across Gotham City, but the Batman.js JavaScript framework could use a cache of its own: a views cache. I love that Batman views are all stored in separate files (a la Rails) during development; it makes organizing a large project very easy. But when it comes time to deploy to production, stock Batman.js will happily fetch each view via AJAX the first time it is needed, when all those individual views should really be bundled up and served in one file, for obvious performance reasons.
To accomplish this, Ryan Funduk and I recently came up with the following approach. First, we leverage the Rails asset pipeline to create a JSON file that contains all of our app's views.
/app/assets/javascripts/all_views.json.erb
<%=
  prefix = "#{Rails.root}/app/assets/javascripts/views"
  paths = Dir.glob("#{prefix}/**/*").select{|f| File.file?(f) && (f =~ /\.(html|erb)$/i) }
  paths.inject({}) do |all_views, f|
    viewname = f.sub( /^#{prefix}/, '' ).sub( /\..*$/i, '' )
    view = File.read(f)
    view = ERB.new(view).result if f =~ /\.erb$/i
    all_views[viewname] = view
    all_views
  end.to_json
%>
Now, if you go to http://localhost:3000/assets/all_views.json you should see all your views in one large JSON object. Note that your Batman views will be processed through ERB, so if you need the server to do some processing, go right ahead. (Just remember that the ERB code runs only once, when the views are compiled, not each time they are requested.)
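To see what the template above computes without booting Rails, here is a minimal plain-Ruby sketch of the same bundling logic. The `bundle_views` name and its directory argument are assumptions made for illustration; only the standard library is used.

```ruby
require 'erb'
require 'json'

# Hypothetical helper (not part of Rails or Batman.js): collects every
# .html / .erb file under +prefix+ into a { view_name => markup } hash.
def bundle_views(prefix)
  paths = Dir.glob("#{prefix}/**/*").select do |f|
    File.file?(f) && f =~ /\.(html|erb)\z/i
  end

  paths.sort.each_with_object({}) do |f, all_views|
    # "<prefix>/orders/show.html.erb" becomes "/orders/show"
    view_name = f.sub(/\A#{Regexp.escape(prefix)}/, '').sub(/\..*\z/, '')
    view = File.read(f)
    # ERB is evaluated once, at bundle time, not per request
    view = ERB.new(view).result if f =~ /\.erb\z/i
    all_views[view_name] = view
  end
end
```

Calling `bundle_views("app/assets/javascripts/views").to_json` should produce the same kind of JSON object the asset pipeline serves, keyed by view path.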
Next, tell the Rails asset pipeline to precompile this file, like so:
/config/environments/production.rb
config.assets.precompile += %w(all_views.json)
Next, we write a Batman helper that will request the all_views.json file, and create Batman views for each of the individual views. That looks like this:
/app/assets/javascripts/helpers/views_preloader.js.coffee.erb
MyApp.preloadViews = () ->
  new Batman.Request
    url: '<%= asset_path("all_views.json") %>'
    type: 'json'
    error: (response) -> throw new Error("Could not load views")
    success: (all_views) =>
      for view of all_views
        Batman.View.store.set(view, all_views[view])
Note that we run this file through ERB first, so that we get the proper path (including the digest, if applicable) to the all_views file. The last line pre-populates the view store with each of the views. (You will need to be on the master Batman.js branch, as the View store is a new addition since the 0.8 release.)
The last thing we need to do is kick off the preloader, and a good place to do that is like so:
/app/assets/javascripts/application.js.coffee
window.MyApp = class MyApp extends Batman.App
  @on 'run', ->
    MyApp.preloadViews()
So, putting it all together, when your Rails app is deployed and the assets are precompiled, the asset pipeline will write out an all_views-1234.json file to the public/assets directory. On the client side of things, when your Batman app starts, it calls the preloader, which loads the all_views-1234.json file and creates Batman views for them all in one fell swoop. The really cool thing about this setup is that the Rails asset pipeline takes care of caching the views for you. Winning!
- John Lynch
@johnrlynch
Torquebox really does go to 11!
After working with Torquebox for the last year, I couldn't imagine writing a Rails app without it. I came for the deployment story, but I stayed because of all the fantastic tools you get for free. From diagnostic tools on the JVM, to message queueing, background tasks, the awesome Infinispan cache, etc. Do yourself a favor and check it out.
But, when doing development, I like to make sure the app will run outside of Torquebox as well, with a simple 'rails s', so here are a few tricks to make that happen.
When setting Torquebox-specific features, like for example setting the Rails cache store to Infinispan, do this:
config.cache_store = :torque_box_store if defined?(TorqueBox)
That way, when you run outside of Torquebox, Rails will use the default file cache and all will be well.
Another great Torquebox feature is the ability to mark any method as backgroundable, and the JVM will transparently spin off that method in a new thread. You accomplish that with this code:
class Battlestar
  include Backgroundable
  always_background :fire

  def fire
    ...
  end
end
This of course will cause problems when running without Torquebox, so if you add this shim into your initializers, the method will just run in the foreground like normal.
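# Note the capital B: the constant is TorqueBox, matching the checks above
unless defined?(TorqueBox)
  module Backgroundable
    extend ActiveSupport::Concern

    module ClassMethods
      # No-op outside TorqueBox: the method stays synchronous
      def always_background(method)
      end
    end
  end
end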
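To check the shim's behavior without Rails or TorqueBox installed, here is a minimal sketch of the same idea in plain Ruby. It swaps `ActiveSupport::Concern` for Ruby's built-in `included` hook, and accepting several method names at once is an assumption on my part; the point is only that `always_background` becomes a no-op and the method runs synchronously.

```ruby
# Assumption: TorqueBox is not loaded, so we define a stand-in module.
unless defined?(TorqueBox)
  module Backgroundable
    # Plain-Ruby replacement for ActiveSupport::Concern
    def self.included(base)
      base.extend(ClassMethods)
    end

    module ClassMethods
      # No-op: the named methods are left untouched
      def always_background(*methods)
      end
    end
  end
end

class Battlestar
  include Backgroundable
  always_background :fire

  def fire
    "firing in the foreground"
  end
end
```

With the shim in place, `Battlestar.new.fire` simply returns its value to the caller like any other method call.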
Another technique I use, especially for accessing the Torquebox message queues, is to route everything through your own class as a central point of control, like this:
class QueueManager
  include TorqueBox::Injectors if defined?(TorqueBox)

  def publish_weapon_event(event)
    @weapon_queue ||= inject("/queues/myapp/weapon_events") if defined?(TorqueBox)
    if @weapon_queue
      @weapon_queue.publish(event)
    else
      processor = WeaponEventProcessor.new
      processor.on_message(event)
    end
  end
end
So, when running inside Torquebox, the message gets sent to the queue, which will at some point in the future execute the event processor, but when running under WEBrick, it calls the event processor directly.
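The two code paths can be exercised without TorqueBox by handing the manager a queue object directly. This is only a sketch: `FakeQueue`, the recording `WeaponEventProcessor`, and the constructor argument are all hypothetical stand-ins, whereas the class above looks its queue up through TorqueBox injection.

```ruby
# Hypothetical stand-in for the real event processor; it just records events
class WeaponEventProcessor
  def self.handled
    @handled ||= []
  end

  def on_message(event)
    self.class.handled << event
  end
end

# Hypothetical stand-in for a TorqueBox queue; only #publish is assumed
class FakeQueue
  attr_reader :published

  def initialize
    @published = []
  end

  def publish(event)
    @published << event
  end
end

class QueueManager
  # Passing the queue in is an assumption made for this sketch
  def initialize(weapon_queue = nil)
    @weapon_queue = weapon_queue
  end

  def publish_weapon_event(event)
    if @weapon_queue
      @weapon_queue.publish(event)                # queued path
    else
      WeaponEventProcessor.new.on_message(event)  # direct, synchronous path
    end
  end
end
```

Callers only ever see `publish_weapon_event`, so nothing else in the app needs to know which environment it is running in.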
Render Rails3 Views Outside of your Controllers
If you ever had the need to break out of the MVC box and render a view template outside of your controller, the old-style Rails2 version went something like this:
body = ActionView::Base.new(Rails::Configuration.new.view_path).render(:file => "/orders/receipt.html.erb",:layout => false,:locals=>{:order=>order})
In Rails3, the same thing can be accomplished with this incantation:
av = ActionView::Base.new()
av.view_paths = ActionController::Base.view_paths
av.extend ApplicationHelper # or any other helpers your template may need
body = av.render(:template => "orders/receipt.html.erb", :locals => {:order => order})
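If you just want to see the "template plus locals" idea in isolation, the standard library's ERB can do something similar without Rails. To be clear, this is not ActionView: there are no view paths, layouts, or helpers. The template text and the `render_receipt` name are made up for illustration.

```ruby
require 'erb'

# Hypothetical template text; in a Rails app this would live in
# app/views/orders/receipt.html.erb
RECEIPT_TEMPLATE = "<p>Order <%= order[:id] %> for <%= order[:customer] %></p>"

# Renders the template with +order+ exposed as a local variable
def render_receipt(order)
  ERB.new(RECEIPT_TEMPLATE).result_with_hash(order: order)
end
```

For example, `render_receipt({ id: 42, customer: "Starbuck" })` fills both placeholders from the hash.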
Transparently Cache Network Calls in Titanium
Typically, when you are building a mobile app with Titanium, you are grabbing data from the web via RSS/Atom/etc or various APIs, and displaying it to the user. And of course, when developing for mobile clients, you can never take the network for granted. So a common pattern we use is to always cache the results of network requests locally, so that in the event of a network glitch or even "airplane mode", we still have some (stale) data to show to the user. We have packaged this functionality up into a CoffeeScript snippet here.
Basically, it is a wrapper around the standard Titanium.Network.HTTPClient API, except that we cache the results of each request in a local SQLite database table, and use that for subsequent requests until the cached record expires. The cache key is a hash of the full URL (plus any data parameters for POSTs).
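The cache key and expiry check can be sketched in a few lines. This is written in Ruby rather than CoffeeScript so it can be run on its own, and it makes assumptions: SHA-1 as the hash function and sorted `key=value` pairs for POST data are my choices, since the description above does not say how the snippet builds its hash.

```ruby
require 'digest/sha1'

# Assumption: SHA-1 over the URL plus sorted POST parameters
def cache_key(url, data = nil)
  payload = url.dup
  if data
    # Sort parameters so the same data always yields the same key
    payload << data.sort_by { |k, _| k.to_s }.map { |k, v| "#{k}=#{v}" }.join('&')
  end
  Digest::SHA1.hexdigest(payload)
end

# A record counts as expired once it is older than cache_seconds
def expired?(cached_at, cache_seconds, now = Time.now)
  now - cached_at > cache_seconds
end
```

Including the POST data in the key keeps two posts to the same URL with different parameters from sharing a cache entry.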
For an example (written in CoffeeScript), we can create a new instance of the class to connect to the CNN RSS news feeds:
Ti.include("HTTPClientWithCache.js")

cnn = new root.HTTPClientWithCache({
  baseURL: "http://rss.cnn.com/rss",
  retryCount: 2,
  cacheSeconds: 60,
  onload: (response) ->
    Ti.API.debug("Response Data: #{response.responseText}")
    Ti.API.debug("Is this cached data?: #{response.cached}")
    Ti.API.debug("Cached at: #{response.cached_at}")
})
Each time we want to make the request, we say:
cnn.get({url: "/cnn_topstories.rss"})
The response will come either from the cache, if it has not expired, or from cnn.com. Because we set the retryCount property to 2, if the first attempt to fetch the data over the network fails (meaning any HTTP status code > 400), it will automatically try a second time. If that also fails, it will respond with the most recent version available in the cache, or null if nothing exists in the cache. The response object also has a cached property, which is true for anything served out of the cache, and a cached_at property holding the timestamp at which the record was cached.
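The lookup order described above (fresh cache, then network with retries, then stale cache) can be modeled as follows. This is a Ruby sketch, not the CoffeeScript class itself: `fetch_with_fallback`, the `fetcher` callable, and the Hash-based cache are hypothetical, and treating status 400 and above as a failure is an assumption.

```ruby
# +fetcher+ is any callable returning [status, body];
# +cache+ is a Hash of url => { body:, cached_at: }.
def fetch_with_fallback(url, fetcher:, cache:, retry_count: 2, cache_seconds: 60, now: Time.now)
  entry = cache[url]
  if entry && now - entry[:cached_at] <= cache_seconds
    return { body: entry[:body], cached: true, cached_at: entry[:cached_at] }
  end

  retry_count.times do
    status, body = fetcher.call(url)
    next if status >= 400 # assumption: 400 and above counts as a failure
    cache[url] = { body: body, cached_at: now }
    return { body: body, cached: false, cached_at: now }
  end

  # Every attempt failed: serve stale data if there is any, otherwise nil
  entry && { body: entry[:body], cached: true, cached_at: entry[:cached_at] }
end
```

Note that an expired entry is not thrown away before the network call, which is what makes the stale fallback possible.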
You can use the same object to make calls to other URLs as well, like so:
cnn.get({url: "/cnn_us.rss"})
If you need to pass parameters (like for pagination) you can say:
cnn.get({url: "/cnn_us.rss?page=1"})
To POST data to a URL, do this:
cnn.post({url: "/story/19912/edit", data: {param1: "value1", param2: "value2"}})
To manually prune the cache, you can call the prune_cache method with a number of seconds, and anything older than that will be deleted from the cache. For example, to remove anything older than 1 day (86,400 seconds) you would say this:
cnn.prune_cache(86400)
To completely clear the cache of every single entry, you can do this:
cnn.prune_cache(0)
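The pruning rule is simple enough to model with a Hash standing in for the SQLite table. This Ruby sketch is an assumption about the behavior, not the snippet's actual SQL; in particular, using "at least this old" as the comparison is my choice so that a value of 0 removes every entry, as described above.

```ruby
# cache is a Hash of key => { body:, cached_at: }
def prune_cache(cache, seconds, now = Time.now)
  # With seconds == 0 every entry qualifies, which empties the cache
  cache.delete_if { |_key, entry| now - entry[:cached_at] >= seconds }
end
```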
The code is on GitHub as a gist here. For pointers on how to build Titanium apps using the incredible CoffeeScript instead of Javascript, see our previous blog post here.
Building iPhone Apps using Titanium and CoffeeScript
We have been using Titanium to build mobile apps for some time now. It is truly amazing how much more productive we are as opposed to when we use POOC (Plain Old Obj-C). Granted, Titanium is not a great fit for some applications, but for your standard network and data-heavy line of business applications, it rocks.
But, Titanium does not give you a lot of guidance on how you should write your app. They provide all the pieces you need, but folks are still figuring out the best way to structure things. So, we thought we would show you our particular technique for putting together a Titanium mobile app.
The first thing we did was ditch JavaScript. To be honest, something about it just makes our eyes bleed. Fortunately, jashkenas has come to our rescue with a Ruby-like language that compiles down to JavaScript, called CoffeeScript. CoffeeScript makes JavaScript a joy to use, and because it "compiles" to plain ole' JavaScript, you can use it anywhere, including Titanium. (Basically, you run the command "coffee -w -c *.coffee" in whatever directory your files are in, and it continually watches for changes and immediately compiles each *.coffee file into a *.js file.) So to be clear, you only ever reference the .js files in your Titanium project; Titanium doesn't know anything about CoffeeScript or how to parse or use it.
One interesting thing CoffeeScript does is compile each CoffeeScript file into its own namespace inside an anonymous JavaScript function. This is great, because it isolates each file and prevents you from polluting the global namespace. For example, this CoffeeScript code:
show_flag = true
alert "Hello World!" if show_flag
gets translated into this Javascript code:
(function() {
  var show_flag;
  show_flag = true;
  if (show_flag) {
    alert("Hello World!");
  }
})();
Another thing CoffeeScript enables is an easier-on-the-eyes way to create "classes", for example (from the CoffeeScript docs):
class Animal
  constructor: (@name) ->

  move: (meters) ->
    alert @name + " moved " + meters + "m."

class Snake extends Animal
  move: ->
    alert "Slithering..."
    super 5

class Horse extends Animal
  move: ->
    alert "Galloping..."
    super 45

sam = new Snake "Sammy the Python"
tom = new Horse "Tommy the Palomino"

sam.move()
tom.move()
(If you really want to see what that gets translated to, put on some shades and click here.)
So, when using CoffeeScript in Titanium apps, we can use these two features to help us structure things in a way that keeps each window isolated in its own namespace, yet allows us easy access to global objects as well. To start, we set up our app.js file, which Titanium runs on app startup. We keep this file as a plain Javascript file, and define a variable called 'root', to which we will attach anything that we want access to globally. We also include each window's CoffeeScript file (well, technically we include the Javascript file that the CoffeeScript file was translated into). Finally, we include a 'main' file, which sets up the main tab group and kicks everything off.
//app.js
var root = {};
Ti.include('js/generic_window.js');
Ti.include('js/main.js');
In the generic_window.coffee file, we create a class definition for a generic window, and actually create a Titanium window in the constructor. We then attach the class definition to the root object so we can get at it from anywhere.
#generic_window.coffee
class GenericWindow
  constructor: (theTitle, theText) ->
    @win = Ti.UI.createWindow({title:theTitle,backgroundColor:'#fff'})
    label = Titanium.UI.createLabel({
      color: '#999',
      text: theText,
      font: {
        fontSize: 20,
        fontFamily: 'Helvetica Neue'
      },
      textAlign: 'center',
      width: 'auto'
    })
    @win.add(label);

root.GenericWindow = GenericWindow
Now the main file sets up the tabgroup, and actually instantiates some windows from the above generic window class...
#main.coffee
tabGroup = Titanium.UI.createTabGroup({ barColor:'#336699'})
Titanium.UI.setBackgroundColor('#000')

# Attach the window instance to root so we can get to it from anywhere
root.Win1 = new root.GenericWindow('Win1','I am Window 1')
tab1 = Titanium.UI.createTab({
  icon:'KS_nav_views.png',
  title:'Win1',
  window: root.Win1.win
})
tabGroup.addTab(tab1)

# Attach the window instance to root so we can get to it from anywhere
root.Win2 = new root.GenericWindow('Win2','I am Window 2')
tab2 = Titanium.UI.createTab({
  icon:'KS_nav_views.png',
  title:'Win2',
  window: root.Win2.win
})
tabGroup.addTab(tab2)

tabGroup.open({transition:Titanium.UI.iPhone.AnimationStyle.FLIP_FROM_LEFT})
Now, let's add a custom property to our generic window class, so add this line to the constructor of generic_window.coffee:
@custom1 = "Default Value"
This creates a property and gives it a default value. To access it, you could say "root.Win1.custom1". Now let's add an event handler that displays an alert box containing the value of @custom1:
#generic_window.coffee
class GenericWindow
  constructor: (theTitle, theText) ->
    # Save off the context of this object to a local var 'self'
    self = this
    @custom1 = "Default Value"
    @win = Ti.UI.createWindow({title:theTitle,backgroundColor:'#fff'})
    label = Titanium.UI.createLabel({
      color: '#999',
      text: theText,
      font: {
        fontSize: 20,
        fontFamily: 'Helvetica Neue'
      },
      textAlign: 'center',
      width: 'auto'
    })
    @win.add(label);
    @win.addEventListener('click', () ->
      alert(self.custom1)
    )

root.GenericWindow = GenericWindow
If you would like to see how it all hangs together, clone the GitHub repo here and try it out. The bottom line is that before CoffeeScript, our Titanium apps tended to resemble spaghetti, but now we are able to encapsulate functionality and just generally make things look much cleaner. Sure, you don't need CoffeeScript to do this, but it sure makes it a lot more fun!
Tee with Sinatra
OK, so I wanted to take all of the JSON data that we were stuffing into our Riak cluster, and send a copy of it to our ElasticSearch cluster as well, so that we could, you know, actually find the data later. We could have done this by modifying one of the Riak client libraries, but then any data that got uploaded through a different client would be missed. So as an experiment, we turned to our new favorite tool, Sinatra, and hacked up a Rack proxy app, that will intercept the incoming HTTP requests, send them on to Riak and also send a copy to the ElasticSearch cluster. We used Typhoeus as the HTTP client to do this, so that we could concurrently execute the 2 requests in the interests of speed.
Here is the proof-of-concept:
require 'rubygems'
require 'sinatra'
require 'typhoeus'

OPTIONS = {}
OPTIONS[:riak_host] = "localhost"
OPTIONS[:riak_port] = "8098"
OPTIONS[:es_host] = "localhost"
OPTIONS[:es_port] = "9200"
OPTIONS[:riak_timeout] = 5000 # milliseconds
OPTIONS[:es_timeout] = 5000 # milliseconds

class Rack::Proxy
  def initialize(app)
    @app = app
    @hydra = Typhoeus::Hydra.new
  end

  def call(env)
    req = Rack::Request.new(env)
    # We need to use it twice, so read in the stream. This is an obvious problem with large bodies, so beware.
    req_body = req.body.read if req.body

    riak_url = "http://#{OPTIONS[:riak_host]}:#{OPTIONS[:riak_port]}#{req.fullpath}"
    opts = {:timeout => OPTIONS[:riak_timeout]}
    opts.merge!(:method => req.request_method.downcase.to_sym)
    opts.merge!(:headers => {"Content-type" => req.content_type}) if req.content_type
    opts.merge!(:body => req_body) if req_body && req_body.length > 0
    riak_req = Typhoeus::Request.new(riak_url, opts)
    riak_response = {}
    riak_req.on_complete do |response|
      riak_response[:code] = response.code
      riak_response[:headers] = response.headers_hash
      riak_response[:body] = response.body
    end
    @hydra.queue riak_req

    # If we are putting or posting JSON, send a copy to the ElasticSearch index named "riak"
    if (req.put? || req.post?) && req.content_type == "application/json"
      req.path =~ %r{^/riak/([^/]+)/([^/]+)}
      bucket, key = $1, $2
      es_url = "http://#{OPTIONS[:es_host]}:#{OPTIONS[:es_port]}/riak/#{bucket}/#{key}"
      opts = {:timeout => OPTIONS[:es_timeout]}
      opts.merge!(:method => req.request_method.downcase.to_sym)
      opts.merge!(:body => req_body) if req_body && req_body.length > 0
      es_req = Typhoeus::Request.new(es_url, opts)
      es_response = {}
      es_req.on_complete do |response|
        es_response[:code] = response.code
        es_response[:headers] = response.headers_hash
        es_response[:body] = response.body
      end
      @hydra.queue es_req
    end

    # Concurrently executes both HTTP requests, blocks until they both finish
    @hydra.run

    # If we wrote to ES add a custom header
    riak_response[:headers].merge!("X-ElasticSearch-ResCode" => es_response[:code].to_s) if es_response && es_response[:code]
    # Typhoeus can add nil headers, lets get rid of them
    riak_response[:headers].delete_if {|k,v| v == nil}

    # Return original Riak response to client
    [riak_response[:code], riak_response[:headers], riak_response[:body]]
  end
end

use Rack::Proxy
(Gist here)
Execute the script, and it will listen on port 4567, so point your Riak client of choice there and start PUTing data, which will be seamlessly replicated into the ElasticSearch cluster. If we were really going to use this in anger, there is a lot of work yet to be done, but as a skeleton of how to use Sinatra (Rack, really) to quickly whip up custom proxies and tee HTTP requests, I thought it might be useful.
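The decision of whether to tee a request, and where to send the copy, can be pulled out into a small pure function that is easy to test without Sinatra or Typhoeus. This is a sketch under assumptions: `es_url_for` and the two constants are hypothetical names, and returning nil when the path has no bucket and key is a small guard that the proof-of-concept above does not have.

```ruby
ES_HOST = "localhost" # assumed values, mirroring OPTIONS in the proxy
ES_PORT = "9200"

# Returns the ElasticSearch URL for a Riak object path, or nil when the
# request should not be copied (wrong verb, content type, or path).
def es_url_for(method, content_type, path)
  return nil unless %w[PUT POST].include?(method)
  return nil unless content_type == "application/json"

  match = path.match(%r{\A/riak/([^/]+)/([^/]+)})
  return nil unless match

  bucket, key = match.captures
  "http://#{ES_HOST}:#{ES_PORT}/riak/#{bucket}/#{key}"
end
```

For example, a JSON PUT to /riak/users/42 maps to the same bucket and key under the "riak" index, while a GET maps to nothing.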
Light a FUSE under your Riak cluster
As an experiment, we hacked together a FUSE driver in Ruby that lets you mount your Riak cluster as a file system and browse around. We're not sure how useful it really is, but it was fun to do nonetheless.
If your Riak keys look like /foo/bar/logo.png, you will be able to ls and cd around the (simulated) directory structure, and cat files (keys).
Things get even more interesting if you hook this up with jsawk. If your keys contain JSON values, you can then do things like this:
cat /mnt/riak/users/* | jsawk 'if (this.city != "Paris") return null'
The current implementation does a list keys function when you access a bucket, which of course is slow. We put all the keys into a tree structure and cache that for performance. Another fun project would be to rewrite it using Riak's Links to simulate the directory structure, and then run interactive map/reduce jobs to navigate the hierarchy. That will have to wait for another time...
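The key-tree idea can be sketched independently of FUSE and Riak. This is not the driver's actual code: `build_tree` and `list_dir` are hypothetical names, and a nested Hash with nil leaves is just one way to represent the simulated directories.

```ruby
# Builds a nested Hash from slash-separated keys; leaves (files) map to nil
def build_tree(keys)
  keys.each_with_object({}) do |key, tree|
    parts = key.split('/').reject(&:empty?)
    next if parts.empty?
    leaf = parts.pop
    dir = parts.inject(tree) { |node, part| node[part] ||= {} }
    dir[leaf] = nil
  end
end

# Lists the entries of a simulated directory, like `ls`;
# returns nil for files and for paths that do not exist
def list_dir(tree, path)
  node = path.split('/').reject(&:empty?).inject(tree) { |n, part| n && n[part] }
  node.is_a?(Hash) ? node.keys.sort : nil
end
```

Building the tree once per bucket and caching it means each `ls` or `cd` is a Hash lookup instead of another list-keys call.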
Anyway, check it out yourself at GitHub. Just keep in mind that it is a toy and should not be pointed at a cluster you care about.
Hydra attacks Riak cluster
The great thing about querying a Riak cluster is that it's all just HTTP, so the huge universe of tools available for working with HTTP resources is at your disposal. For example, let's say we want to quickly request 20 keys from a Riak cluster. We could use curl or curb and request them one at a time, or we could use the multi-headed hydra that is Typhoeus to get them concurrently! Let's see this monster in action with a little Ruby code:
require 'rubygems'
require 'typhoeus'
require 'yajl'
require 'yaml' # needed for YAML::load below

HOST = "riak.cluster.com"
PORT = "8098"

# bucket is the name of a Riak bucket
# keys is an array of keys to get
# returns array of response hashes with the keys, :key :code :headers :body
def multi_get(bucket, keys)
  hydra = Typhoeus::Hydra.new
  @responses = []
  keys.each do |key|
    url = "http://#{HOST}:#{PORT}/riak/#{bucket}/#{key}"
    request = Typhoeus::Request.new(url)
    # When the request completes, this block gets run
    request.on_complete do |response|
      result = {}
      result[:key] = key
      result[:code] = response.code
      result[:headers] = response.headers
      result[:body] = response.body
      @responses << result
    end
    # queue up the request to run later
    hydra.queue request
  end
  # This is a blocking call that executes all queued requests concurrently,
  # and returns when all requests have completed
  hydra.run
  @responses
end

# OK, lets try this baby out now,
# first, set up the list of keys we want to get
keys = YAML::load(<<EOT)
---
- 1244idS1NaricUO2RtXJrjcfzr8
- 12ktMhh8KZOCYzMIRTLlqf5JeGA
- 129Izkjd6Fh2i1zqxCE2acT6iju
- 129AomedyZIa3gjudjCxpke5kU9
- 12BEfyiKWPqwZqiqcKGmizVN34i
- 12EZagKQHnakkIChE3ruLUu4TrA
- 12RBVtZV0EwQyTixXFLwHHqwLuK
EOT

# Now let the Hydra out!
rs = multi_get("my_bucket", keys)

# Or, if you are storing JSON objects, get the results as JSON using the speedy Yajl gem...
rs = multi_get("my_bucket", keys).map{|r| Yajl::Parser.parse(r[:body])}
As you can see, it's pretty darn easy to leverage all the great work by the Typhoeus folks to speed up your queries.
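If you want to play with the fan-out shape without installing Typhoeus or having a cluster handy, the same pattern can be approximated with plain Ruby threads. To be clear, this swaps in a different technique (one thread per key rather than Typhoeus's Hydra), and `multi_get_threads` and the `fetcher` callable are hypothetical names for this sketch.

```ruby
# +fetcher+ is any callable taking a URL path and returning [code, body];
# swap in a real HTTP client as needed.
def multi_get_threads(bucket, keys, fetcher)
  threads = keys.map do |key|
    Thread.new do
      code, body = fetcher.call("/riak/#{bucket}/#{key}")
      { key: key, code: code, body: body }
    end
  end
  # Thread#value waits for each thread and returns its block's result,
  # so results come back in the same order as +keys+
  threads.map(&:value)
end
```

One difference from the Hydra version: there the responses are appended as each request completes, so their order is not guaranteed to match the key list.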
Map/Reduce job to select specific keys
Riak will politely tell you about all the keys in a specific bucket; all you need to do is ask, like this:
curl http://localhost:8098/riak/my_bucket
But what if you have a million keys? You can tell Riak to stream the keys to you, but what if you only want certain keys, such as all the keys that start with foo? In that case, MapReduce is your friend. In Ruby, it looks like this:
results = Riak::MapReduce.new(client)
  .add("my_bucket")
  .map("function(value,keyData,arg) {
    var re = new RegExp(arg);
    return value.key.match(re) ? [value.key] : [];
  }", :keep => true, :arg => "^foo").run
You can pass in any regular expression in the :arg parameter. Since keys in Riak have to be unique, you will never get duplicates and don't need a reduce phase.
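To make the shape of the map phase concrete, here is a plain-Ruby model of what the JavaScript function does for each object. It runs locally on a list of keys rather than on a Riak cluster, and `map_phase` and `select_keys` are hypothetical names used only for this sketch.

```ruby
# Called once per object: returns a one-element list for a matching key
# and an empty list otherwise, mirroring the JavaScript map function
def map_phase(key, arg)
  key.match?(Regexp.new(arg)) ? [key] : []
end

# The per-object lists are combined into one flat result
def select_keys(keys, arg)
  keys.flat_map { |key| map_phase(key, arg) }
end
```

Returning an empty list for non-matching keys is what filters them out: they contribute nothing to the combined result.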
Update: Note that this code is pretty slow to execute on a bucket with many keys, so it is best used in background jobs, not for interactive queries. For example, on a single-node, small EC2 instance, with 10,000 JSON objects (3K each in size) in a bucket, running the above map reduce code takes 60 seconds.
To see how much of that time is spent marshaling the JSON objects, we removed the JSON body of each object and left only the key, and then ran the code again, which took 30 seconds, still not even in the right ballpark for interactive use. Of course, YMMV.
Riak is for Ops, but Ops don’t build Apps
You can't go anywhere on the interwebs without bumping into a NoSQL post somewhere. At least in the Rails community, MongoDB seems to be gaining the most mindshare. Mongo has the whole NoSQL thing going for it, as well as the two most important things from a developer's perspective -- a full-featured ORM or two (MongoMapper and dm-adapter-mongo) and the ability to (easily) index and query your data. So even though there is no SQL, you can still do things like this, which feels very comforting and familiar to a Rails developer:
Person.find_by_email("starbuck@galactica.mil")
However, MongoDB was designed as a single-node database, and achieves scalability in the same way a MySQL db would, by using things like masters, slaves, and shards. So while Mongo buys you the flexibility of a schema-less data store, you are still stuck with the same old scaling problems of the SQL databases. Why not just use a SQL db as a schema-less store in the first place? (a la Friendly).
Riak is a data store built from the ground up for scaling. The scaling story can be summed up in three words: "Add a node." That's it. No "shards", "masters", "slaves", etc. It's an incredibly compelling story. The problem is that as much as the ops folks love it, ops folks don't build apps. Developers do. And in its current form, Riak makes app developers work harder to build their app, in exchange for a much easier time scaling and maintaining the app down the road. But developers don't care, because they don't usually have to worry about the scaling issue; it's someone else's job. So it's easy for them to ignore the long-term benefits and go with something familiar and easier to get into, like Mongo.
Basho seems like a great company with a refreshing attitude towards making money from open source software, and I really want to see them succeed. But they need to move fast to gain mindshare in the developer community, and that means investing in the tools that make developers' lives easier. It seems that they have taken the first step and hired Sean Cribbs, the developer of the awesome Ripple gem, which is a Rails-compatible Object Mapper for Riak. I hope they continue to focus on developer tools and make Riak the first choice when building a new Rails app.
(We are doing our small part for the community by hacking on Briak, which is a data browser for Riak clusters based on Sinatra.)