
We have cloud-hosted (Rackspace Cloud) Ruby and Java apps that will interact as follows:

  1. The Ruby app sends a request to the Java app. The request consists of a map structure containing strings, integers, other maps, and lists (analogous to JSON).
  2. The Java app analyzes the data and sends a reply to the Ruby app.

We are interested in evaluating both message formats (JSON, Protocol Buffers, Thrift, etc.) and message transmission channels/techniques (sockets, message queues, RPC, REST, SOAP, etc.).

Our criteria:

  1. Short round-trip time.
  2. Low round-trip-time standard deviation. (We understand that garbage collection pauses and network usage spikes can affect this value).
  3. High availability.
  4. Scalability (we may want to have multiple instances of Ruby and Java app exchanging point-to-point messages in the future).
  5. Ease of debugging and profiling.
  6. Good documentation and community support.
  7. Bonus points for Clojure support.
  8. Good dynamic language support.
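
Criteria 1 and 2 can be checked empirically for any candidate format/transport pair. The following is only a minimal sketch, assuming the round trip can be wrapped in a `Runnable`; the class name `RttStats` and its methods are hypothetical names chosen for illustration, and the empty lambda in `main` stands in for a real request to the Java app.

```java
public class RttStats {
    /** Arithmetic mean of the samples. */
    public static double mean(double[] samples) {
        double sum = 0.0;
        for (double s : samples) {
            sum += s;
        }
        return sum / samples.length;
    }

    /** Sample standard deviation (n - 1 in the denominator). */
    public static double stdDev(double[] samples) {
        double m = mean(samples);
        double squares = 0.0;
        for (double s : samples) {
            squares += (s - m) * (s - m);
        }
        return Math.sqrt(squares / (samples.length - 1));
    }

    /** Times `runs` calls of `roundTrip`; returns each duration in milliseconds. */
    public static double[] measure(Runnable roundTrip, int runs) {
        double[] durations = new double[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            roundTrip.run();
            durations[i] = (System.nanoTime() - start) / 1_000_000.0;
        }
        return durations;
    }

    public static void main(String[] args) {
        // Placeholder: replace the empty lambda with a real request/reply call.
        double[] rtts = measure(() -> { }, 1000);
        System.out.printf("mean=%.3f ms, stddev=%.3f ms%n", mean(rtts), stdDev(rtts));
    }
}
```

Running the same harness against each candidate, with warm-up runs discarded, would give comparable numbers; GC pauses and network spikes will still show up as outliers in the standard deviation.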

What combination of message format and transmission method would you recommend? Why?

I've gathered here some materials we have already collected for review:

jkndrkn
  • Do you really want reliability (from the title)? In the context of the class of messaging you're talking about, it means that messages never get lost (and possibly also that they get delivered in the order they were sent) which is *very* expensive. Of course, reliability here refers to being resistant against even things like a Backhoe Attack (i.e., physical destruction of the network or power infrastructure). I mostly prefer to have timely delivery and make the apps resistant to failures, because that's much easier… – Donal Fellows Dec 29 '10 at 23:28
  • Hi, we want reasonably good reliability and don't care about in-order delivery. Our system can tolerate occasional failures, though keeping the failure rate pretty low is important. – jkndrkn Dec 30 '10 at 16:51

4 Answers


We have decided to go with BSON over RabbitMQ.

We like BSON's support for heterogeneous collections and the lack of the need to specify the format of messages up-front. We don't mind that it has poor space usage characteristics and likely poorer serialization performance than other message formats since the messaging portion of our application is not anticipated to be the bottleneck. It doesn't look like a nice Clojure interface has been written to let you directly manipulate BSON objects, but hopefully that won't be an issue. I will revise this entry if we decide that BSON won't work out for us.

We chose RabbitMQ mainly because we already have experience with it and are using it in a system that demands high throughput and availability.

If messaging does become a bottleneck, we will look first to BERT (we rejected it because it currently does not appear to have Java support), then to MessagePack (rejected because it appears that there isn't a large community of Java developers using it), then to Avro (rejected because it requires you to define your message format up-front), then Protocol Buffers (rejected because of the extra code generation step and lack of heterogeneous collections) and then Thrift (rejected for the reasons mentioned for Protocol Buffers).

We may want to go with a plain RPC scheme rather than using a message queue since our messaging style is essentially synchronous point-to-point.
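
If a plain point-to-point channel over a raw socket is used instead of a queue, the serialized message (BSON or otherwise) needs some framing so the receiver knows where each message ends. One common option is a length prefix. The sketch below is an illustration under assumptions, not what we deployed: the class `Framing` and its methods are hypothetical, and it works on in-memory byte arrays rather than a real socket.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class Framing {
    /** Prepends a 4-byte big-endian length to the payload. */
    public static byte[] frame(byte[] payload) {
        return ByteBuffer.allocate(4 + payload.length)
                .putInt(payload.length)
                .put(payload)
                .array();
    }

    /**
     * Splits a byte sequence made of back-to-back frames into payloads.
     * Throws BufferUnderflowException if the last frame is truncated.
     */
    public static List<byte[]> unframe(byte[] wire) {
        ByteBuffer buffer = ByteBuffer.wrap(wire);
        List<byte[]> payloads = new ArrayList<>();
        while (buffer.remaining() >= 4) {
            int length = buffer.getInt();
            byte[] payload = new byte[length];
            buffer.get(payload);
            payloads.add(payload);
        }
        return payloads;
    }

    public static void main(String[] args) {
        // The payload here is JSON text purely for readability; any bytes work.
        byte[] request = "{\"foo\":\"bar\"}".getBytes(StandardCharsets.UTF_8);
        byte[] wire = frame(request);
        for (byte[] payload : unframe(wire)) {
            System.out.println(new String(payload, StandardCharsets.UTF_8));
        }
    }
}
```

On a real socket the reader would first read the 4-byte header and then read exactly that many bytes (for example with `DataInputStream.readInt` and `readFully`). The Ruby side would need to use the same big-endian prefix.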

Thanks for your input everyone!

Update: Here are the project.clj and core.clj that show how to convert Clojure maps to BSON and back:

;;;; project.clj

(defproject bson-demo "0.0.1"
  :description "BSON Demo"
  :dependencies [[org.clojure/clojure "1.2.0"]
                 [org.clojure/clojure-contrib "1.2.0"]
                 [org.mongodb/mongo-java-driver "2.1"]]
  :dev-dependencies [[swank-clojure "1.3.0-SNAPSHOT"]]
  :main core)

;;;; core.clj
(ns core
  (:gen-class)
  (:import [org.bson BasicBSONObject BSONEncoder BSONDecoder]))

(defonce *encoder* (BSONEncoder.))

(defonce *decoder* (BSONDecoder.))

;; XXX Does not accept keyword keys. Convert any clojure.lang.Keyword keys in the map to java.lang.String first.
(defn map-to-bson [m]
  (->> m (BasicBSONObject.) (.encode *encoder*)))

;; b is the byte array produced by map-to-bson, so it is hinted as bytes.
(defn bson-to-map [^bytes b]
  (->> (.readObject *decoder* b) (.toMap) (into {})))

(defn -main []
  (let [m {"foo" "bar"}]
    (prn (bson-to-map (map-to-bson m)))))
jkndrkn

I can't speak from personal experience, but I know that FlightCaster is using JSON messaging to link their back-end Clojure analytics engine to a front-end Rails app, and it seems to be working for them. Here's the article (the relevant part appears near the end):

Clojure and Rails - the Secret Sauce Behind FlightCaster

Hope this helps. --Mike

Mike Hickman
  • Hi Mike. Nice article. Turns out my teammates are already very familiar with that use-case ^_^ I'm not sure if the Ruby/Clojure interop described there is part of a critical speed-sensitive path. – jkndrkn Dec 20 '10 at 18:47

I have no experience in this regard. I will post this possibly-helpful guess anyway.

  • ZeroMQ offers point-to-point messaging across various network topologies. Messages consist of arbitrary binary values, so you will just need a binary serialization format for your structured messages.

  • BSON, Protocol Buffers, and BERT offer serialization of arbitrary data structures (numbers, strings, sequential arrays, associative arrays) into binary values.

GitHub invented BERT for fast RPC; BSON was invented by MongoDB (or 10gen) for the same reason; and Protocol Buffers likewise by Google.

yfeldblum
  • Thanks, we already employ RabbitMQ in the system and may consider simply using that instead of an RPC solution for messaging. So far, it looks like we are going to go with either Apache Thrift, Protocol Buffers, or maybe MessagePack for serialization. Apache Avro is another possibility. – jkndrkn Dec 30 '10 at 16:39
  • If you already have a message bus like RabbitMQ deployed, then it could make sense to re-use it when you need messaging. Otherwise, ZeroMQ might make sense because it might be simpler to deploy: it is a library for direct messaging that you use from within the components of your application, and does not require deploying any separate infrastructure. I'm adding this comment in case anyone else out there has the same question, but does not have RabbitMQ deployed. – yfeldblum Dec 30 '10 at 16:49

I believe Protocol Buffers would be a lot faster and more efficient than JSON (last time I checked it was around 40 times faster; I didn't try it with Ruby though, so your mileage may vary).

mpenet
  • Edited: that was Protocol Buffers vs. JSON in my case, but it wasn't using Jackson back then (I think I used jsonlib), and the protobuf Java lib must have evolved as well since then. – mpenet Dec 18 '10 at 15:32
  • Network RTT will usually dominate and payload size is what really matters. Gzipped JSON is comparable in size to protocol buffers, so I think either is fine. – Kevin Jul 21 '12 at 06:05
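
The payload-size point in the last comment is easy to check for your own messages. Below is a minimal sketch using only the JDK's `java.util.zip`; the class `GzipSize` and the generated sample payload are hypothetical, and it measures only raw versus gzipped JSON, not Protocol Buffers.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;

public class GzipSize {
    /** Compresses the given bytes with gzip and returns the result. */
    public static byte[] gzip(byte[] raw) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The gzip stream is closed here, so the trailer has been written.
        return out.toByteArray();
    }

    /** Builds a made-up JSON array of `records` similar objects. */
    public static String sampleJson(int records) {
        StringBuilder sb = new StringBuilder("[");
        for (int i = 0; i < records; i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append("{\"id\":").append(i)
              .append(",\"name\":\"item\",\"tags\":[\"a\",\"b\"]}");
        }
        return sb.append(']').toString();
    }

    public static void main(String[] args) {
        byte[] raw = sampleJson(200).getBytes(StandardCharsets.UTF_8);
        byte[] packed = gzip(raw);
        System.out.println("raw=" + raw.length + " bytes, gzipped=" + packed.length + " bytes");
    }
}
```

Repetitive payloads like this one compress well. Very small messages can come out larger after gzip because of the fixed header and trailer, so it is worth measuring with representative requests.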