Thrift vs. Protocol Buffers

Google recently released its Protocol Buffers as open source. About a year ago, Facebook released a similar product called Thrift. I’ve been comparing them; here’s what I’ve found:

Thrift Protocol Buffers
Backers Facebook, Apache (accepted for incubation) Google
Bindings C++, Java, Python, PHP, XSD, Ruby, C#, Perl, Objective C, Erlang, Smalltalk, OCaml, and Haskell C++, Java, Python
(Perl, Ruby, and C# under discussion)
Output Formats Binary, JSON Binary
Primitive Types bool
byte
16/32/64-bit integers

double
string
byte sequence
map<t1,t2>
list<t>
set<t>

bool

32/64-bit integers
float
double
string
byte sequence

“repeated” properties act like lists

Enumerations Yes Yes
Constants Yes No
Composite Type struct message
Exception Type Yes No
Documentation So-so Good
License Apache BSD-style
Compiler Language C++ C++
RPC Interfaces Yes Yes
RPC Implementation Yes No
Composite Type Extensions No Yes

Overall, I think Thrift wins on features and Protocol Buffers win on
documentation. Implementation-wise, they’re quite similar. Both use
integer tags to identify fields, so you can add and remove fields
without breaking existing code. Protocol Buffers support
variable-width encoding of integers, which saves a few bytes. (Thrift
has an experimental output format with variable-width ints.)

The major difference is that Thrift provides a full client/server RPC
implementation, whereas Protocol Buffers only generate stubs to use in
your own RPC system.

Update July 12, 2008: I haven’t tested for speed, but from a cursory examination it seems that, at the binary level, Thrift and Protocol Buffers are very similar. I think Thrift will develop a more coherent community now that it’s under Apache incubation. It just moved to a new web site and mailing list, and the issue tracker is active.

31 thoughts on “Thrift vs. Protocol Buffers

  1. Density

    Thrift was written by engineers at Facebook who’d worked at Google and missed using Protocol Buffers, hence the expanded feature set.

  2. Olivier

    Helpful comparison !

    I was trying to get more information on Protocol Buffers, and Wikipedia linked me to Thrift.

    I found Protocol Buffers pretty well documented, while on the other hand Thrift looks like an alpha project from the documentation point of view…It is a pity because Thrift seems to have quite a few interesting features. Considering that Thrift has been published like one year ago, I wonder if there is really a community backing it up and any enthusiasm around it ..

    It is quite an important point, because if you start using any of these libraries to communicate between different servers/services, you will probably have to stick to it for a few years …

    In addition, it would be interesting to see if these libraries could be used to store hierarchical information in a database and see what would be the performances/limitations compared with storing XML or JSON. They should work better for a basic load/save strategy but what if you just want a subset of the data, do you still need to read the whole message ? The same applied if you want to redirect a message depending on one particular field without reading the whole message …

  3. Michael Chermside

    I have not compared them in detail myself, but if you believe Google’s description of Protocol Buffers, it’s primary feature is one you have not included in your consideration. They claim that their binary format is (1) especially small and (2) especially quick to serialize and deserialize. I am not particularly familiar with Thrift, but I never had the impression that these particular optimizations were of great importance to that project. The Google version is optimized for the case where you are dealing with truly massive amounts of data (the RARE case where efficiency trumps readability).

    – Michael Chermside

  4. RogerWan

    mattrepl: The Facebook employee in question was an intern at Google while in college ;)

    That’s not to say Thrift is a copy of protocol buffers–but given the similarities it’s likely it was inspired by it.

  5. Mark Slee

    A few quick comments:

    1/ Data size and serialization performance are definitely of great importance to Thrift. Huge data sets are definitely one case where this matters, but don’t forget about high-throughput low-latency services (at Facebook, like Google, every millisecond counts). Thrift is much quicker than typical XML/RESTful service implementations, even with relatively small data sizes. This is one of the primary use cases. When you’re dealing with millions of users and thousands of servers, efficiency really starts to matter.

    2/ We do have to admit Thrift isn’t yet as fully-featured in the documentation department as it could (and probably should) be, but we do have an active and enthusiastic community. Thrift is currently being used and contributed to by Powerset, Rapleaf, iMeem, AmieStreet, the reCaptcha project, as well as a number of independent developers.

    3/ Both Thrift and Protocol Buffers are great candidates for serializing data into databases — both are more compact and quicker to read/write than XML/JSON. Another common persistent use case is the storage of replayable logfiles.

    4/ Last point, though Thrift currently only has implementations for binary/JSON, it’s designed so that the encoding format is extensible. Thrift could easily support XML or human-readable ASCII — so the trade-off of efficiency vs. readability is left up to the developer.

  6. Pingback: Thrift vs. Protocol Buffers - Digital Digressions by Stuart Sierra « The other side of the firewall

  7. software user

    > We do have to admit Thrift isn’t yet as fully-featured in the documentation department as it could

    Please don’t release code without decent documentation. If you are making effort then do it right. Code of this complexity needs good doc or it’s just a complete waste of time and totally frustrating.

  8. Pingback: links for 2008-07-14 « Breyten’s Dev Blog

  9. Pingback: Thrift | Programmer’s Paradox

  10. Pingback: links for 2008-07-23 « Brent Sordyl’s Blog

  11. Pingback: Paolo blog: Ramblings on Web2.0, Trust, Reputation, Recommender Systems, Social Software, Free Software, ICT4D and much more » Blog Archive » The consequences of opensourcing Facebook code

  12. Pingback: Google Protocol Buffers | Approaching the Dewpoint

  13. David Marcus

    I would offer that the Thrift community is lacking an official site where a newcomer (to Thrift) can download a running Thrift compiler for their system.

    I very much want to try Thrift. I’ve been looking for a ready-to-run binary of the compiler and C# generators (on Windows/XP) but cannot find them anywhere. The ‘make’ scripts (as documented) are pretty useless in the Windows world.

    I dare say that the barrier to entry is having to build the compiler on one’s own machine before anything else can be done. I bet most people give up and go elsewhere.

    -David

  14. Michael Greene

    David: Unfortunately, since Thrift has not made an Apache release yet and is in incubation at Apache, the project members are unable to provide official binaries that get distributed. If you had said something on the mailing list, I could have sent you binaries if you’re having difficulty getting up and running. Otherwise, http://wiki.apache.org/thrift/ThriftInstallationWin32 does a decent job of describing how to get going, but could probably be improved.

  15. Jacob Grydholt

    The license information is swapped in the table. Thrift is Apache while protobuffers are BSD.

  16. Pingback: Thrift vs Protocol Bufffers vs JSON | MirthLab

  17. Pingback: Thrift and Protocol Buffers performance in Java – Tim[后端技术]

  18. Pingback: Installing Thrift on Snow Leopard

  19. Pingback: Google Protocol Buffers and other data interchange formats « John's Blog

  20. Jim

    It looks like a great library, but not much good if your dev environment is Visual Studio. Probably you’d need something more Posix and cross-platform (perhaps MinGW). The library relies heavily on Posix threads, which are not available in win32 unless you’re prepared to incorporate a GPL or LGPL library, and I’m not. The protocol buffer library is cross-platform, but then again it doesn’t include a transport mechanism and is therefore less powerful.

  21. Pingback: Facebook / Apache Thrift « My Blog

Comments are closed.