296

What are the biggest pros and cons of Apache Thrift vs Google's Protocol Buffers?

Jonas
  • 4
    As a side note, Marc Gravell maintains a library for working with Google's protobuf called protobuf-net, and it's at http://code.google.com/p/protobuf-net/ – RCIX Sep 27 '09 at 01:08
  • 6
    This question and some of the following answers are about 6 years old. Probably a lot has changed since then. – AlikElzin-kilaka Jan 07 '15 at 15:59

15 Answers

164

They both offer many of the same features; however, there are some differences:

  • Thrift supports 'exceptions'
  • Protocol Buffers have much better documentation/examples
  • Thrift has a builtin Set type
  • Protocol Buffers allow "extensions" - you can extend an external proto to add extra fields, while still allowing external code to operate on the values. There is no way to do this in Thrift (see the sketch below for what this looks like in practice)
  • I find Protocol Buffers much easier to read

Basically, they are fairly equivalent (with Protocol Buffers slightly more efficient from what I have read).
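As a rough illustration of the "extensions" point, here is a hedged sketch of how an extension might be accessed from Python, assuming a proto2 schema with an `extend` block has been compiled into a hypothetical module `base_pb2` (all names are illustrative):

```python
# Sketch only: assumes a proto2 schema along the lines of
#
#   message BaseMessage { extensions 100 to 199; }
#   extend BaseMessage { optional int32 extra_field = 100; }
#
# compiled by protoc into a hypothetical module base_pb2.
import base_pb2

msg = base_pb2.BaseMessage()
# Extensions live in the Extensions map rather than as normal attributes,
# so code that only knows BaseMessage keeps working untouched.
msg.Extensions[base_pb2.extra_field] = 42

data = msg.SerializeToString()        # the extension travels on the wire
parsed = base_pb2.BaseMessage()
parsed.ParseFromString(data)
print(parsed.Extensions[base_pb2.extra_field])   # -> 42
```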

Thomas
hazzen
85

Another important difference is the languages supported by default.

  • Protocol Buffers: Java, Android Java, C++, Python, Ruby, C#, Go, Objective-C, Node.js
  • Thrift: Java, C++, Python, Ruby, C#, Go, Objective-C, JavaScript, Node.js, Erlang, PHP, Perl, Haskell, Smalltalk, OCaml, Delphi, D, Haxe

Both could be extended to other platforms, but these are the language bindings available out of the box.

adu
Mike Gray
    protobuf has excellent Ruby support: https://github.com/macks/ruby-protobuf and http://code.google.com/p/ruby-protobuf/. I'm using protobuf from C# (3.5) and Ruby, with C# serializing the data and, when required, Ruby deserializing it and working on the task. – Bryan Bailliache Feb 24 '11 at 14:04
  • 6
    http://code.google.com/p/protobuf/wiki/ThirdPartyAddOns lists PHP, Ruby, Erlang, Perl, Haskell, C#, OCaml plus ActionScript, Common Lisp, Go, Lua, Matlab, Visual Basic, Scala. Though these are all third-party implementations. – Igor Gatis Dec 28 '11 at 23:23
  • you could directly use protobuf C++ files in objective c (for both iOS and OS X) [check this qn](http://stackoverflow.com/a/15246414/1383704) – Tushar Koul Mar 14 '13 at 10:48
  • I see https://code.google.com/p/protobuf-net/ is often mentioned as a protobuf port for C#, but that's not completely true. One important feature of Protobuf and Thrift is the external structure definition, so the same definition can be used by different languages. protobuf-net doesn't support this feature because it embeds the structure definition inside C# code. – Andriy Tylychko Jun 13 '13 at 19:36
  • @AndyT: That's debatable - it depends on whether it's an advantage that the structure definition is EXTERNAL to all of the languages you want to support. With protobuf-net you define your data structure in C# and generate the .proto file from that, which can then be used to create the support in the other languages. I consider this an advantage, as I'm very C#-centric and am in the process of integrating Android/Java with a large existing .Net application. So I want to continue to treat my C# classes as the definitive structure definitions. – RenniePet Jul 26 '13 at 14:34
  • Today, in 2015, Thrift now supports 20+ languages OOTB. Still growing. – JensG Jan 27 '15 at 09:11
76

RPC is another key difference. Thrift generates code to implement RPC clients and servers, whereas Protocol Buffers seems designed mostly as a data-interchange format alone.
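
To illustrate that difference, here is a hedged sketch of what a Thrift RPC client looks like in Python; the `Calculator` service and its `add` method are assumed names generated from a .thrift file, while the transport and protocol classes are the standard ones shipped with the Thrift Python library:

```python
# Sketch of a Thrift RPC client; Calculator and add() are hypothetical
# names generated by the thrift compiler from a service definition.
from thrift.transport import TSocket, TTransport
from thrift.protocol import TBinaryProtocol
from calculator import Calculator   # assumed name of the generated package

transport = TTransport.TBufferedTransport(TSocket.TSocket('localhost', 9090))
protocol = TBinaryProtocol.TBinaryProtocol(transport)
client = Calculator.Client(protocol)   # the Client stub is generated for you

transport.open()
print(client.add(2, 3))                # plain method call, RPC under the hood
transport.close()
```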

saidimu apale
  • 12
    That's not true. Protocol buffers define an RPC service api and there are some libraries available to implement the message passing. – Stephen May 08 '10 at 02:54
  • 7
    I didn't say Protobuf does not have RPC defined, just that it doesn't seem to have been designed for that, at least not the external release everyone has access to. Read this Google engineer's comment [here](http://steve.vinoski.net/blog/2008/07/13/protocol-buffers-leaky-rpc/#comment-1093) – saidimu apale May 08 '10 at 03:40
  • 9
    More importantly, Thrift has RPC support built in. Protobuf currently relies on third-party libraries, meaning less eyes, less testing, less reliable code. – Alec Thomas May 17 '10 at 02:32
  • 2
    For me, that's a good point in ProtoBuf's favor. If you only need serialization, you don't add useless code. And if in the future you need to send it over RPC, no problem, it will work. I use Netty for the network, and Protobuf integrates perfectly, so no problems, no extra testing, and performance is maximized. – Kikiwa Apr 05 '13 at 13:49
  • 19
    Protobufs were, in fact, designed with RPC in mind. Google just open-sourced that component fairly recently – grpc.io – andybons Apr 07 '15 at 19:41
  • 2
    also note that google released gRPC which is a fully-featured RPC system for protobufs. http://www.grpc.io/ – Travis Kaufman May 29 '16 at 21:55
  • gRPC came about 10 years after protobuf, so no, it wasn't designed together with it. – trilogy Mar 25 '19 at 17:08
58
  • Protobuf serialized objects are about 30% smaller than Thrift.
  • Most actions you may want to do with protobuf objects (create, serialize, deserialize) are much slower than with Thrift unless you turn on option optimize_for = SPEED (see the timing sketch below).
  • Thrift has richer data structures (Map, Set)
  • Protobuf API looks cleaner, though the generated classes are all packed as inner classes which is not so nice.
  • Thrift enums are not real Java Enums, i.e. they are just ints. Protobuf has real Java enums.

For a closer look at the differences, check out the source code diffs at this open source project.
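
If you want to verify the speed claims against your own schemas, a rough micro-benchmark along these lines is easy to put together; `point_pb2` and `point_ttypes` stand in for hypothetical modules generated from equivalent .proto and .thrift definitions:

```python
# Rough timing sketch; point_pb2 / point_ttypes are hypothetical modules
# generated from equivalent .proto and .thrift definitions of a Point struct.
import timeit
from thrift.TSerialization import serialize as thrift_serialize
from thrift.protocol import TBinaryProtocol
import point_pb2
import point_ttypes

pb_msg = point_pb2.Point(x=1, y=2)
th_msg = point_ttypes.Point(x=1, y=2)
factory = TBinaryProtocol.TBinaryProtocolFactory()

print("protobuf:", timeit.timeit(pb_msg.SerializeToString, number=100000))
print("thrift:  ", timeit.timeit(
    lambda: thrift_serialize(th_msg, factory), number=100000))
```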

C R
eishay
  • Here's the re-test with optimizing on: http://eishay.blogspot.com/2008/11/protobuf-with-option-optimize-for-speed.html – scobi Feb 03 '09 at 02:20
  • 1
    Quick suggestion: it'd be neat if there was another non-binary format (xml or json?) used as the baseline. There haven't been good tests that show general trends -- the assumption is that PB and Thrift are more efficient, but whether and by how much is mostly an open question. – StaxMan Mar 11 '09 at 01:02
  • 4
    0.02 seconds?! I don't have that kind of time spare – Chris S Sep 07 '09 at 17:01
  • 1
    Now that Thrift has multiple protocols (including a TCompactProtocol), I think that the first bullet doesn't apply anymore. – Janus Troelsen Feb 20 '12 at 16:47
  • 14
    The optimize for speed option is now the default for protocol buffers (http://code.google.com/apis/protocolbuffers/docs/proto.html) – Willem Mar 24 '12 at 10:24
  • 6
    Do we still get 30% smaller objects with "optimize_for=speed" set? Or is that compromised? – Prashant Sharma Sep 23 '13 at 09:01
  • @scrapcodes Yes, we still do. optimize_for=speed only affects what methods are generated (specific [de]serialise methods are created instead of falling back to the reflection-based ones), not the data members of the objects nor the binary serialisation format. BTW optimize_for=speed is default – Jim Oldfield Apr 01 '15 at 10:29
  • 1
    Since v2.0 maps have been supported by protocol buffers. https://developers.google.com/protocol-buffers/docs/proto#maps – Mehrdad Rohani Jan 02 '18 at 21:49
58

As I've said in the "Thrift vs Protocol buffers" topic:

Referring to the Thrift vs Protobuf vs JSON comparison:

Additionally, there are plenty of interesting additional tools available for those solutions, which might tip the decision. Here are examples for Protobuf: Protobuf-wireshark, protobufeditor.

Vivek Viswanathan
Grzegorz Wierzowiecki
  • 11
    Now this is a full circle. You've posted the exact same answer to three (similar) questions, always linking back to one or the other. I feel like I'm playing Zelda and missed a sign. – ChrisR Jan 12 '15 at 16:47
  • @ChrisR heh, I can't recall how that happened. Although there were a couple of similar questions, maybe I should make a tree-like structure instead of a cycle. One day... It's a very old question and I'm replying from my phone. Anyhow, thanks for the catch! – Grzegorz Wierzowiecki Jan 12 '15 at 21:21
  • 8
    "Thrift comes with a good tutorial" - how funny. Its the most incomplete tutorial I have ever seen. As soon as you want to do something beside TSimpleServer you get stuck there – Marian Klühspies Mar 11 '15 at 00:44
  • Thrift too has Wireshark plugin: https://github.com/andrewcox/wireshark-with-thrift-plugin – CCoder Jul 06 '15 at 06:10
8

I was able to get better performance with a text-based protocol compared to protobuf in Python. However, it has no type checking or other fancy UTF-8 conversion, etc., which protobuf offers.

So, if serialization/deserialization is all you need, then you can probably use something else.

http://dhruvbird.blogspot.com/2010/05/protocol-buffers-vs-http.html

dhruvbird
8

Protocol Buffers seems to have a more compact representation, but that's only an impression I get from reading the Thrift whitepaper. In their own words:

We decided against some extreme storage optimizations (i.e. packing small integers into ASCII or using a 7-bit continuation format) for the sake of simplicity and clarity in the code. These alterations can easily be made if and when we encounter a performance-critical use case that demands them.

Also, it may just be my impression, but Protocol Buffers seems to have some thicker abstractions around struct versioning. Thrift does have some versioning support, but it takes a bit of effort to make it happen.
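
To see what "packing small integers" buys, here is a small sketch of the base-128 varint layout that Protocol Buffers uses on the wire (the tag/value scheme is described in the protobuf encoding guide); it reproduces the well-known example of field 1 set to 150 taking only three bytes:

```python
# Sketch of protobuf's base-128 varint wire encoding for an integer field.
def encode_varint(value: int) -> bytes:
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)   # more bytes follow: set continuation bit
        else:
            out.append(byte)
            return bytes(out)

def encode_int_field(field_number: int, value: int) -> bytes:
    tag = (field_number << 3) | 0     # wire type 0 = varint
    return encode_varint(tag) + encode_varint(value)

print(encode_int_field(1, 150).hex())  # '089601' -> 3 bytes on the wire
# A value under 128 needs only 2 bytes: one for the tag, one for the value.
```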

Daniel Spiewak
  • 1
    Why does the fact that Thrift admits to not being as compact as possible lead you to believe Protocol Buffers are? – Michael Mior Oct 14 '11 at 13:44
  • 1
    Protocol buffers do use variable-length integer coding, both for values and for field identifiers. So the very common case of sending an int field with a small value will be two bytes, not an int16 or int32. – poolie Oct 13 '12 at 02:37
  • "Protocol buffers do use variable length integer coding" -- so does TCompactProtocol – JensG Sep 25 '19 at 17:27
7

One obvious thing not yet mentioned, which can be both a pro and a con (and is the same for both), is that they are binary protocols. This allows for a more compact representation and possibly better performance (pros), but reduced readability (or rather, debuggability), a con.

Also, both have a bit less tool support than standard formats like XML (and maybe even JSON).

(EDIT) Here's an interesting comparison that tackles both size and performance differences, and includes numbers for some other formats (XML, JSON) as well.
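
One partial mitigation on the protobuf side is that any message can be dumped to a human-readable text form as long as you have the schema; a hedged Python sketch, with `person_pb2` as a hypothetical generated module:

```python
# Sketch: turning an opaque binary protobuf message into readable text.
# person_pb2 is a hypothetical module generated from a .proto file.
from google.protobuf import text_format
import person_pb2

msg = person_pb2.Person(name="Ada", id=1)
wire_bytes = msg.SerializeToString()        # compact, but not human-readable
print(text_format.MessageToString(msg))     # readable, but requires the schema
```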

StaxMan
  • 3
    It's trivial to output a protocol buffer to a text representation that's much more human-readable than XML: my_proto.DebugString(). For an example, see https://code.google.com/apis/protocolbuffers/docs/overview.html – SuperElectric Oct 07 '11 at 22:46
  • Of course, ditto for all binary formats -- but that does not make them readable as is (debug on the wire). Worse, for protobuf, you really need the schema def to know field names. – StaxMan Oct 08 '11 at 05:00
  • Thrift supports different, even user-defined, protocols. You can use binary, compact, json or something you invented just last week. – JensG Sep 25 '19 at 17:25
6

And according to the wiki the Thrift runtime doesn't run on Windows.

  • 5
    I run Thrift on Windows successfully. Use windows fork at https://github.com/aubonbeurre/thrift – Sergey Podobry Oct 03 '11 at 12:35
  • 20
    The official mainline branch now has Windows support as well. – Janus Troelsen Feb 20 '12 at 16:48
  • 5
    @dalle -- Alex P added Boost thread support in Thrift. It is now the default threading for Windows. *NIX defaults to pthreads. And to confirm Janus T, Thrift now fully supports Windows. – pmont Jun 05 '12 at 22:11
  • 21
    **This is outdated information.** Thrift has run perfectly on Windows for a looong time now. – JensG Jan 27 '15 at 09:09
6

ProtocolBuffers is FASTER.
There is a nice benchmark here:
http://code.google.com/p/thrift-protobuf-compare/wiki/Benchmarking

You might also want to look into Avro, as Avro is even faster.
Microsoft has a package here:
http://www.nuget.org/packages/Microsoft.Hadoop.Avro

By the way, the fastest I've ever seen is Cap'nProto;
a C# implementation can be found in Marc Gravell's GitHub repository.

Sevle
Stefan Steiger
4

I think most of these points have missed the basic fact that Thrift is an RPC framework, which happens to have the ability to serialize data using a variety of methods (binary, JSON, etc.).

Protocol Buffers is designed purely for serialization; it's not a framework like Thrift.
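
For completeness, gRPC is the project that supplies the RPC layer on top of Protocol Buffers; a minimal, hedged Python client sketch, with `greeter_pb2` / `greeter_pb2_grpc` standing in for hypothetical generated modules:

```python
# Sketch of a gRPC client call; greeter_pb2 and greeter_pb2_grpc are
# hypothetical modules generated by protoc with the gRPC Python plugin.
import grpc
import greeter_pb2
import greeter_pb2_grpc

channel = grpc.insecure_channel('localhost:50051')
stub = greeter_pb2_grpc.GreeterStub(channel)      # generated client stub
reply = stub.SayHello(greeter_pb2.HelloRequest(name='world'))
print(reply.message)
```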

Babra Cunningham
  • 3
    What do you mean by RPC framework and how is that different from protobuf's [gRPC](http://www.grpc.io/)? – marcelocra May 18 '17 at 19:46
  • gRPC is not packaged together with protobuf; it was developed about 10 years later. Thrift comes packaged with a full RPC framework; they were made together. – trilogy Sep 04 '18 at 20:13
3

For one, protobuf isn't a full RPC implementation. It requires something like gRPC to go with it.

gRPC is very slow compared to Thrift:

http://szelei.me/rpc-benchmark-part1/

trilogy
2

I think the basic data structures are different:

  1. Protocol Buffers uses variable-length integers (varints), i.e. variable-length numeric encoding, turning a fixed-length number into a variable-length one to save space.
  2. Thrift offers different types of serialization formats (called "protocols"). In fact, Thrift has two different JSON encodings and no fewer than three different binary encoding methods.

In conclusion, these two libraries are completely different. Thrift is like a one-stop shop, giving you the entire integrated RPC framework and many options (supporting cross-language use), while Protocol Buffers is more inclined to "just do one thing and do it well".

dyy.alex
0

There are some excellent points here and I'm going to add another one in case someone's path crosses here.

Thrift gives you the option to choose between the thrift-binary and thrift-compact (de)serializers. Thrift-binary has excellent performance but a bigger packet size, while thrift-compact gives you good compression but needs more processing power. This is handy because you can always switch between the two modes as easily as changing a line of code (heck, even make it configurable). So if you are not sure how much your application should be optimized for packet size or for processing power, thrift can be an interesting choice.
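
As an illustration of how small that switch is, here is a hedged Python sketch using the protocol factories that ship with the Thrift library; the generated struct `point_ttypes.Point` is an assumed name:

```python
# Sketch: switching the Thrift (de)serializer is a one-line change of factory.
from thrift.TSerialization import serialize, deserialize
from thrift.protocol import TBinaryProtocol, TCompactProtocol
import point_ttypes   # hypothetical module generated from a .thrift file

msg = point_ttypes.Point(x=1, y=2)

# Pick one factory; the rest of the code stays exactly the same.
factory = TBinaryProtocol.TBinaryProtocolFactory()      # bigger, fast to encode
# factory = TCompactProtocol.TCompactProtocolFactory()  # smaller, more CPU work

data = serialize(msg, factory)
roundtrip = deserialize(point_ttypes.Point(), data, factory)
```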

PS: See this excellent benchmark project by thekvs which compares many serializers including thrift-binary, thrift-compact, and protobuf: https://github.com/thekvs/cpp-serializers

PS: There is another serializer named YAS which gives this option too, but it is schema-less; see the link above.

Sinapse
0

It's also important to note that not all supported languages perform consistently with Thrift or Protobuf. At this point it's a matter of the module's implementation in addition to the underlying serialization. Take care to check benchmarks for whatever language you plan to use.

JSON