Thrift vs. Protocol Buffers
Posted by: Stuart in Programming, tags: Protocol Buffers, Thrift

Google recently released its Protocol Buffers as open source. About a year ago, Facebook released a similar product called Thrift. I’ve been comparing them; here’s what I’ve found:
| | Thrift | Protocol Buffers |
|---|---|---|
| Backers | Facebook, Apache (accepted for incubation) | Google |
| Bindings | C++, Java, Python, PHP, XSD, Ruby, C#, Perl, Objective C, Erlang, Smalltalk, OCaml, and Haskell | C++, Java, Python (Perl, Ruby, and C# under discussion) |
| Output Formats | Binary, JSON | Binary |
| Primitive Types | bool, byte, 16/32/64-bit integers, double, string, byte sequence, map<t1,t2>, list<t>, set<t> | bool, 32/64-bit integers, float, double, string, byte sequence; “repeated” properties act like lists |
| Enumerations | Yes | Yes |
| Constants | Yes | No |
| Composite Type | struct | message |
| Exception Type | Yes | No |
| Documentation | So-so | Good |
| License | BSD-style | Apache |
| Compiler Language | C++ | C++ |
| RPC Interfaces | Yes | Yes |
| RPC Implementation | Yes | No |
| Composite Type Extensions | No | Yes |
Overall, I think Thrift wins on features and Protocol Buffers win on
documentation. Implementation-wise, they’re quite similar. Both use
integer tags to identify fields, so you can add and remove fields
without breaking existing code. Protocol Buffers support
variable-width encoding of integers, which saves a few bytes. (Thrift
has an experimental output format with variable-width ints.)
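To make the two ideas above concrete, here is a minimal Python sketch of a base-128 variable-width integer (the general scheme Protocol Buffers calls a varint) together with a toy record format keyed by integer tags. The record layout and the function names `encode_record` and `decode_record` are assumptions made for illustration; this is not the actual wire format of either Thrift or Protocol Buffers, and it only handles non-negative integer fields.

```python
def encode_varint(n: int) -> bytes:
    """Encode a non-negative integer, 7 bits per byte, low bits first."""
    if n < 0:
        raise ValueError("this sketch handles non-negative integers only")
    out = bytearray()
    while True:
        low7 = n & 0x7F
        n >>= 7
        if n:
            out.append(low7 | 0x80)  # high bit set: more bytes follow
        else:
            out.append(low7)         # high bit clear: last byte
            return bytes(out)


def decode_varint(buf: bytes, pos: int = 0) -> tuple[int, int]:
    """Decode one varint starting at pos; return (value, next_position)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def encode_record(fields: dict[int, int]) -> bytes:
    """Toy format (an assumption): a sequence of (tag, value) varint pairs."""
    out = bytearray()
    for tag, value in sorted(fields.items()):
        out += encode_varint(tag)
        out += encode_varint(value)
    return bytes(out)


def decode_record(buf: bytes, known_tags: set[int]) -> dict[int, int]:
    """Decode a toy record, silently skipping tags this reader does not know."""
    pos = 0
    fields: dict[int, int] = {}
    while pos < len(buf):
        tag, pos = decode_varint(buf, pos)
        value, pos = decode_varint(buf, pos)
        if tag in known_tags:
            fields[tag] = value
    return fields


# A newer writer adds field 3; an older reader that only knows tags 1 and 2
# still decodes the message and simply ignores the extra field.
message = encode_record({1: 150, 2: 7, 3: 99})
old_reader_view = decode_record(message, known_tags={1, 2})
```

Small values take a single byte instead of a fixed four or eight, which is where the space saving comes from. Because every field is identified by its tag rather than its position, a reader can skip tags it does not recognize, and a missing tag just leaves the field absent.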
The major difference is that Thrift provides a full client/server RPC
implementation, whereas Protocol Buffers only generate stubs to use in
your own RPC system.
Update July 12, 2008: I haven’t tested for speed, but from a cursory examination it seems that, at the binary level, Thrift and Protocol Buffers are very similar. I think Thrift will develop a more coherent community now that it’s under Apache incubation. It just moved to a new web site and mailing list, and the issue tracker is active.

Thrift was written by engineers at Facebook who’d worked at Google and missed using Protocol Buffers, hence the expanded feature set.
Helpful comparison!
I was trying to get more information on Protocol Buffers, and Wikipedia linked me to Thrift.
I found Protocol Buffers pretty well documented, while Thrift looks like an alpha project from the documentation point of view. It is a pity, because Thrift seems to have quite a few interesting features. Considering that Thrift was published about a year ago, I wonder whether there is really a community backing it and any enthusiasm around it.
It is quite an important point, because if you start using either of these libraries to communicate between different servers/services, you will probably have to stick with it for a few years.
In addition, it would be interesting to see whether these libraries could be used to store hierarchical information in a database, and how their performance and limitations compare with storing XML or JSON. They should work better for a basic load/save strategy, but what if you just want a subset of the data? Do you still need to read the whole message? The same applies if you want to redirect a message depending on one particular field without reading the whole message.
I have not compared them in detail myself, but if you believe Google’s description of Protocol Buffers, its primary feature is one you have not included in your consideration. They claim that their binary format is (1) especially small and (2) especially quick to serialize and deserialize. I am not particularly familiar with Thrift, but I never had the impression that these particular optimizations were of great importance to that project. The Google version is optimized for the case where you are dealing with truly massive amounts of data (the RARE case where efficiency trumps readability).
– Michael Chermside
Nice and short summary.
I like that a lot, thanks for the effort!
How do they compare in speed?
Density: What makes you say that Thrift is directly a Protocol Buffers clone? According to the Thrift whitepaper (see the acknowledgments section), it is the successor to a project called Pillar, written by a Facebook employee while in college.
Link to paper:
http://developers.facebook.com/thrift/thrift-20070401.pdf
Mark Slee interned at Google, and the paper specifically cites Protocol Buffers after Pillar.
http://www.cricketschirping.com/weblog/?p=1210
mattrepl: The Facebook employee in question was an intern at Google while in college ;)
That’s not to say Thrift is a copy of Protocol Buffers, but given the similarities, it was likely inspired by them.
There are currently two Haskell implementations of Protocol Buffers. One of them can be found here: http://hackage.haskell.org/cgi-bin/hackage-scripts/package/protocol-buffers-0.0.5. The other is under development.
A few quick comments:
1/ Data size and serialization performance are definitely of great importance to Thrift. Huge data sets are definitely one case where this matters, but don’t forget about high-throughput low-latency services (at Facebook, like Google, every millisecond counts). Thrift is much quicker than typical XML/RESTful service implementations, even with relatively small data sizes. This is one of the primary use cases. When you’re dealing with millions of users and thousands of servers, efficiency really starts to matter.
2/ We do have to admit Thrift isn’t yet as fully-featured in the documentation department as it could (and probably should) be, but we do have an active and enthusiastic community. Thrift is currently being used and contributed to by Powerset, Rapleaf, iMeem, AmieStreet, the reCaptcha project, as well as a number of independent developers.
3/ Both Thrift and Protocol Buffers are great candidates for serializing data into databases — both are more compact and quicker to read/write than XML/JSON. Another common persistent use case is the storage of replayable logfiles.
4/ Last point, though Thrift currently only has implementations for binary/JSON, it’s designed so that the encoding format is extensible. Thrift could easily support XML or human-readable ASCII — so the trade-off of efficiency vs. readability is left up to the developer.
[...] July 12, 2008 at 10:52 pm · Filed under Programming [From Thrift vs. Protocol Buffers - Digital Digressions by Stuart Sierra] [...]
> We do have to admit Thrift isn’t yet as fully-featured in the documentation department as it could
Please don’t release code without decent documentation. If you are making the effort, then do it right. Code of this complexity needs good documentation, or it’s just a complete waste of time and totally frustrating.
[...] Until then, Stuart has provided a nice overview of Thrift’s features by comparing it to Google’s recently released Protocol Buffers. If you’re interested in either, read Stuart’s article. [...]
[...] recently became an Apache project. Nice stuff. Stuart Sierra has a nice comparison on his blog, Thrift vs. Protocol Buffers. Another worthy contender but not a big enough advantage to stop the internal momentum protobuf [...]